Artificial Intelligence

What is an Activation Function?

An activation function is a mathematical function applied to the weighted sum of inputs (plus bias) that each neuron in a neural network computes. It determines how strongly the neuron is activated for a given input. Activation functions introduce non-linearity into the network, allowing it to model complex patterns and interactions in the data.
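
As a sketch of what this means for a single neuron, the computation can be written in two steps. The symbols are notation chosen here for illustration and do not come from the original text: $x_i$ are the inputs, $w_i$ the weights, $b$ the bias, $z$ the pre-activation, and $g$ the activation function.

```latex
z = \sum_{i} w_i x_i + b, \qquad a = g(z)
```

The activation function $g$ is the only non-linear step; everything before it is a weighted sum.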

Why Do We Need Activation Functions?

  1. Introducing Non-linearity:
    • Real-world data is often non-linear, and to model such complex relationships, non-linear activation functions are essential. Without them, a neural network would collapse into a single linear transformation of its inputs, equivalent to a linear model, no matter how many layers it has.
  2. Enabling Deep Learning:
    • Activation functions are what make stacking multiple layers worthwhile. Because each layer applies a non-linearity, successive layers can learn different levels of abstraction instead of merging into one linear map.
  3. Controlling Neuron Output:
    • Some activation functions bound the output of neurons, keeping it within a certain range (e.g., between 0 and 1 for sigmoid). Others, such as ReLU, bound it on one side only.
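
The first point can be checked numerically. The following is a minimal sketch, assuming a two-layer network with small hand-picked weights (all values and variable names are illustrative, not from the original text). Without an activation, the two layers reduce to one linear layer; with ReLU in between, they do not.

```python
import numpy as np

# Hand-picked illustrative values (assumptions, chosen so the result is easy to verify)
X = np.array([[1.0, -2.0],
              [3.0, 1.0]])
W1 = np.array([[1.0, -1.0],
               [0.5, 2.0]])
b1 = np.array([[0.0, 1.0]])
W2 = np.array([[1.0],
               [-1.0]])
b2 = np.array([[0.5]])

# Two layers with no activation function in between
two_linear = (X @ W1 + b1) @ W2 + b2

# The same mapping written as a single linear layer
W = W1 @ W2
b = b1 @ W2 + b2
one_linear = X @ W + b
print(np.allclose(two_linear, one_linear))  # True

# Inserting ReLU between the layers breaks the equivalence
with_relu = np.maximum(0, X @ W1 + b1) @ W2 + b2
print(np.allclose(with_relu, one_linear))  # False
```

Only the first input row changes under ReLU, because it is the only one that produces a negative pre-activation in the hidden layer.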

Common Activation Functions

  1. Sigmoid Function:
    • Formula: $\sigma(x) = \frac{1}{1 + e^{-x}}$
    • Range: (0, 1)
    • Used in: Output layers for binary classification problems.
    • Pros: Smooth gradient, output values bound between 0 and 1.
    • Cons: Vanishing gradient problem, outputs not zero-centered.
  2. Hyperbolic Tangent (Tanh) Function:
    • Formula: $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
    • Range: (-1, 1)
    • Used in: Hidden layers.
    • Pros: Zero-centered output, stronger gradients than sigmoid.
    • Cons: Vanishing gradient problem.
  3. ReLU (Rectified Linear Unit):
    • Formula: $\text{ReLU}(x) = \max(0, x)$
    • Range: [0, ∞)
    • Used in: Hidden layers.
    • Pros: Computationally efficient, mitigates vanishing gradient problem, sparsity (many neurons are deactivated).
    • Cons: Can cause dead neurons (dying ReLU problem). A neuron whose input is negative for every example outputs zero, receives zero gradient, and stops learning.
  4. Leaky ReLU:
    • Formula: $\text{Leaky ReLU}(x) = \max(0.01x, x)$
    • Range: (-∞, ∞)
    • Used in: Hidden layers.
    • Pros: Addresses dying ReLU problem by allowing a small gradient when the unit is not active.
    • Cons: The negative slope (0.01 here) is an extra value to choose, and the small slope might not always be beneficial.
  5. Softmax Function:
    • Formula: $\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$
    • Range: (0, 1), with all outputs summing to 1.
    • Used in: Output layers for multi-class classification problems.
    • Pros: Converts logits to probabilities, useful for multi-class classification.
    • Cons: Can be computationally expensive for a large number of classes.
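
Several of the pros and cons in the list above can be seen directly in code. The sketch below is one possible NumPy version of Leaky ReLU and softmax, plus the derivatives of sigmoid and tanh. The function names, the `alpha` parameter, and the max-subtraction step in `softmax` are assumptions added here for illustration. The subtraction is a common trick to avoid overflow and does not change the result.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Mirrors max(alpha * x, x); assumes 0 < alpha < 1
    return np.maximum(alpha * x, x)

def softmax(x):
    # Subtracting the row maximum leaves the result unchanged but avoids overflow
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s)
    s = 1 / (1 + np.exp(-x))
    return s * (1 - s)

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1 - np.tanh(x) ** 2

print(leaky_relu(np.array([-2.0, 3.0])))        # negative input is scaled by 0.01, not zeroed
print(softmax(np.array([1000.0, 1001.0])))      # large logits, still finite; entries sum to 1

# Gradients at the origin: tanh is steeper than sigmoid
print(sigmoid_grad(0.0), tanh_grad(0.0))        # 0.25 1.0

# Gradients far from the origin: both are close to zero (vanishing gradient)
print(sigmoid_grad(10.0), tanh_grad(10.0))      # roughly 4.5e-05 and 8e-09
```

The last two lines illustrate why sigmoid and tanh are listed with the vanishing gradient problem: once the input is large in magnitude, almost no gradient flows back through the unit.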

Example of Using Activation Functions

Here’s an example of the forward pass of a simple two-layer neural network, with ReLU in the hidden layer and sigmoid in the output layer:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

def initialize_parameters(input_size, hidden_size, output_size):
    W1 = np.random.randn(input_size, hidden_size) * np.sqrt(1 / input_size)
    b1 = np.zeros((1, hidden_size))
    W2 = np.random.randn(hidden_size, output_size) * np.sqrt(1 / hidden_size)
    b2 = np.zeros((1, output_size))
    return W1, b1, W2, b2

def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1
    A1 = relu(Z1)  # Apply ReLU activation function
    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)  # Apply Sigmoid activation function
    return Z1, A1, Z2, A2

# Define the neural network structure
input_size = 3  # Number of input features
hidden_size = 4  # Number of neurons in the hidden layer
output_size = 1  # Number of output neurons

# Initialize parameters
W1, b1, W2, b2 = initialize_parameters(input_size, hidden_size, output_size)

# Input data (example)
X = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])

# Forward propagation
Z1, A1, Z2, A2 = forward_propagation(X, W1, b1, W2, b2)

# Print the outputs
print("Z1:", Z1)
print("A1:", A1)
print("Z2:", Z2)
print("A2:", A2)
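
One caveat about the sigmoid helper in the example above: for large negative inputs, `np.exp(-x)` overflows and NumPy emits a warning, even though the final result is still correct. A possible workaround, sketched here under the hypothetical name `stable_sigmoid`, is to only ever exponentiate a non-positive number.

```python
import numpy as np

def stable_sigmoid(x):
    x = np.asarray(x, dtype=float)
    # exp(-|x|) always lies in (0, 1], so it cannot overflow
    e = np.exp(-np.abs(x))
    # For x >= 0: 1 / (1 + e^-x); for x < 0: e^x / (1 + e^x). Both equal sigmoid(x).
    return np.where(x >= 0, 1 / (1 + e), e / (1 + e))

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))  # values 0, 0.5 and 1
```

For moderate inputs this returns the same values as the simpler formula, so it can be swapped in without changing the rest of the example.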

Summary

Activation functions are vital in neural networks for introducing non-linearity, which allows the network to model complex patterns. They shape the output of neurons and make stacking multiple layers worthwhile, which is what makes deep learning possible. The choice depends on where the function sits in the network: ReLU and its variants are common in hidden layers, sigmoid in the output layer for binary classification, and softmax in the output layer for multi-class classification.
