Artificial Intelligence

What is an Activation Function?

An activation function is a mathematical function applied to each neuron's weighted input sum (its pre-activation) to produce the neuron's output. It determines how strongly, if at all, the neuron is activated by its input. Activation functions introduce non-linearity into the network, allowing it to model complex patterns and interactions in the data.

Why Do We Need Activation Functions?

  1. Introducing Non-linearity:
    • Real-world data is often non-linear, so modeling it requires non-linear activation functions. Without them, every layer is a linear (affine) map, and a stack of such layers collapses into a single linear model, no matter how many layers it has.
  2. Enabling Deep Learning:
    • Because each layer applies a non-linearity, stacking layers adds representational power instead of collapsing into one linear map. This is what lets successive layers learn different levels of abstraction.
  3. Controlling Neuron Output:
    • Some activation functions keep the output of a neuron within a fixed range (e.g., between 0 and 1 for sigmoid, or between -1 and 1 for tanh). Others, such as ReLU, are bounded on one side only.
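
The collapse described in point 1 can be checked numerically. The snippet below is a minimal sketch, assuming NumPy and randomly drawn weights; the shapes, the seed, and the omission of bias terms are illustrative choices and not part of the article's network.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

X = rng.normal(size=(5, 3))     # 5 samples, 3 features (illustrative shapes)
W1 = rng.normal(size=(3, 4))    # first layer weights
W2 = rng.normal(size=(4, 2))    # second layer weights

# Two stacked layers with no activation in between...
two_linear_layers = (X @ W1) @ W2
# ...equal one linear layer whose weight matrix is W1 @ W2.
one_linear_layer = X @ (W1 @ W2)
collapses = np.allclose(two_linear_layers, one_linear_layer)

# With ReLU between the layers, the equivalence no longer holds.
with_relu = np.maximum(0, X @ W1) @ W2
relu_differs = not np.allclose(with_relu, one_linear_layer)

print("Linear stack collapses to one layer:", collapses)
print("ReLU stack differs from that layer:", relu_differs)
```

Biases are left out to keep the algebra short; adding them turns each layer into an affine map, and a stack of affine maps still collapses into a single affine map.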

Common Activation Functions

  1. Sigmoid Function:
    • Formula: $\sigma(x) = \frac{1}{1 + e^{-x}}$
    • Range: (0, 1)
    • Used in: Output layers for binary classification problems.
    • Pros: Smooth gradient, output values bound between 0 and 1.
    • Cons: Saturates for large |x|, where the gradient approaches zero (vanishing gradient problem); outputs are not zero-centered.
  2. Hyperbolic Tangent (Tanh) Function:
    • Formula: $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$
    • Range: (-1, 1)
    • Used in: Hidden layers.
    • Pros: Zero-centered output, stronger gradients than sigmoid.
    • Cons: Still saturates for large |x|, so the vanishing gradient problem remains.
  3. ReLU (Rectified Linear Unit):
    • Formula: $\text{ReLU}(x) = \max(0, x)$
    • Range: [0, ∞)
    • Used in: Hidden layers.
    • Pros: Computationally efficient, mitigates vanishing gradient problem, sparsity (many neurons are deactivated).
    • Cons: A neuron whose input is negative for every training example outputs zero and receives zero gradient, so its weights stop updating (dying ReLU problem).
  4. Leaky ReLU:
    • Formula: $\text{Leaky ReLU}(x) = \max(0.01x, x)$
    • Range: (-∞, ∞)
    • Used in: Hidden layers.
    • Pros: Addresses dying ReLU problem by allowing a small gradient when the unit is not active.
    • Cons: The negative-side slope (0.01 here) is an extra value to choose, and it does not always improve results over plain ReLU.
  5. Softmax Function:
    • Formula: $\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$
    • Range: (0, 1), with all outputs summing to 1.
    • Used in: Output layers for multi-class classification problems.
    • Pros: Converts logits to probabilities, useful for multi-class classification.
    • Cons: Can be computationally expensive for a large number of classes, because the normalizing sum runs over every class.
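
The functions listed above can be sketched in NumPy as follows, together with the derivatives that explain the gradient remarks in the pros and cons. This is a minimal illustration, not a reference implementation: the helper names, the `alpha` parameter, and the sample inputs are assumptions, and the max-subtraction in `softmax` is a common numerical-stability trick that the formula itself does not require.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def leaky_relu(x, alpha=0.01):
    # alpha is the slope for negative inputs (0.01 matches the formula above).
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    # Subtracting the row maximum leaves the result unchanged
    # but keeps np.exp from overflowing on large logits.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)            # peaks at 0.25 when x = 0

def tanh_grad(x):
    return 1 - np.tanh(x) ** 2    # peaks at 1 when x = 0

def relu_grad(x):
    return (x > 0).astype(float)  # 1 for positive inputs, 0 otherwise

x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print("sigmoid'(x):  ", sigmoid_grad(x))
print("tanh'(x):     ", tanh_grad(x))
print("relu'(x):     ", relu_grad(x))
print("leaky_relu(x):", leaky_relu(x))

logits = np.array([[2.0, 1.0, 0.1]])  # illustrative scores for 3 classes
probs = softmax(logits)
print("softmax:", probs, "sum:", probs.sum())
```

At the extreme inputs the sigmoid and tanh derivatives are close to zero, which is the saturation behind the vanishing gradient problem, while the ReLU derivative stays at 1 for every positive input.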

Example of Using Activation Functions

Here’s an example of a simple neural network with ReLU and sigmoid activation functions:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

def initialize_parameters(input_size, hidden_size, output_size):
    W1 = np.random.randn(input_size, hidden_size) * np.sqrt(1 / input_size)
    b1 = np.zeros((1, hidden_size))
    W2 = np.random.randn(hidden_size, output_size) * np.sqrt(1 / hidden_size)
    b2 = np.zeros((1, output_size))
    return W1, b1, W2, b2

def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1
    A1 = relu(Z1)  # Apply ReLU activation function
    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)  # Apply Sigmoid activation function
    return Z1, A1, Z2, A2

# Define the neural network structure
input_size = 3  # Number of input features
hidden_size = 4  # Number of neurons in the hidden layer
output_size = 1  # Number of output neurons

# Initialize parameters
W1, b1, W2, b2 = initialize_parameters(input_size, hidden_size, output_size)

# Input data (example)
X = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])

# Forward propagation
Z1, A1, Z2, A2 = forward_propagation(X, W1, b1, W2, b2)

# Print the outputs
print("Z1:", Z1)
print("A1:", A1)
print("Z2:", Z2)
print("A2:", A2)
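
The same forward pass can use other activations from the list. The sketch below swaps in tanh for the hidden layer and softmax for the output layer, as one might for a multi-class problem. The function name `forward_multiclass`, the choice of three classes, and the seed are assumptions made for illustration; the input array and the weight scaling are reused from the example above.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax; subtracting the row maximum avoids overflow.
    z = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=1, keepdims=True)

def forward_multiclass(X, W1, b1, W2, b2):
    A1 = np.tanh(X @ W1 + b1)   # hidden layer: zero-centered outputs in (-1, 1)
    A2 = softmax(A1 @ W2 + b2)  # output layer: one probability per class
    return A1, A2

rng = np.random.default_rng(42)  # fixed seed, chosen only for reproducibility
n_features, n_hidden, n_classes = 3, 4, 3  # n_classes = 3 is an assumption

W1 = rng.normal(size=(n_features, n_hidden)) * np.sqrt(1 / n_features)
b1 = np.zeros((1, n_hidden))
W2 = rng.normal(size=(n_hidden, n_classes)) * np.sqrt(1 / n_hidden)
b2 = np.zeros((1, n_classes))

X = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])

A1, A2 = forward_multiclass(X, W1, b1, W2, b2)
print("Class probabilities:\n", A2)
print("Row sums:", A2.sum(axis=1))
```

Each row of the output holds one probability per class and sums to 1, in contrast to the single sigmoid output of the binary example.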

Summary

Activation functions are vital in neural networks for introducing non-linearity, which allows the network to model complex patterns. They control the output of neurons and enable the stacking of multiple layers, making deep learning possible. Different activation functions are used in various parts of the network depending on the specific requirements and characteristics of the data.

Nishant Munjal
