What are Biases in Neural Networks?

Biases are additional parameters in neural networks that are added to the weighted sum of inputs to a neuron before applying the activation function. They help the model to fit the data better by providing an additional degree of freedom.

Role of Biases

Shifting the Activation Function: A bias shifts the neuron's activation curve to the left or right along the input axis, which can be crucial for learning complex patterns. Without a bias, the weighted sum is always zero when all inputs are zero, so the neuron's response is pinned to the activation's value at the origin (0.5 for a sigmoid), which limits the flexibility of the model.
Controlling Neuron Activation: Biases set how easily a neuron fires (activates), so the network can still produce a useful, learnable output when all input features are zero.
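
To make the shift concrete, here is a minimal sketch using a single sigmoid neuron with one input. The weight w and the bias values are arbitrary numbers chosen for illustration, not taken from any trained model.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def neuron(x, w, b):
    # Weighted input plus bias, then the activation
    return sigmoid(w * x + b)

w = 2.0       # illustrative weight (assumption)
x_zero = 0.0  # an all-zero input

# With a zero bias, a zero input always gives sigmoid(0) = 0.5
out_no_bias = neuron(x_zero, w, b=0.0)

# A positive bias pushes the neuron toward firing, a negative one away from it
out_pos_bias = neuron(x_zero, w, b=3.0)
out_neg_bias = neuron(x_zero, w, b=-3.0)

# The output crosses 0.5 where w * x + b = 0, i.e. at x = -b / w.
# For b = 3.0 the curve is shifted left, to x = -1.5.
crossing_point = -3.0 / w

print(out_no_bias, out_pos_bias, out_neg_bias, crossing_point)
```

Changing only b moves the point where the neuron switches from mostly off to mostly on, without touching the weight.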

How to Define Biases

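Biases are typically initialized to zero or small random values. Like weights, they are learned during training: backpropagation computes the gradient of the loss with respect to each bias, and the optimizer (for example, gradient descent) uses that gradient to update it.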
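
As a rough illustration of how a bias is learned, the sketch below runs plain gradient descent on a single sigmoid neuron with a binary cross-entropy loss. The toy data, the learning rate, and the number of steps are all assumptions chosen for the example; the point is only that the bias gets its own gradient and its own update, just like the weights.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Toy data (assumption): every sample has label 1, including the all-zero input
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
y = np.array([[1.0], [1.0], [1.0]])

W = np.zeros((2, 1))
b = np.zeros((1, 1))
learning_rate = 0.5  # illustrative value

for _ in range(200):
    A = sigmoid(X @ W + b)               # forward pass
    dZ = A - y                           # gradient of the loss w.r.t. the pre-activation
    dW = X.T @ dZ / X.shape[0]           # gradient for the weights
    db = dZ.mean(axis=0, keepdims=True)  # gradient for the bias: mean over samples
    W -= learning_rate * dW
    b -= learning_rate * db

# For an all-zero input the weights contribute nothing, so only b matters
pred_zero_input = sigmoid(np.zeros((1, 2)) @ W + b)
print("learned bias:", b)
print("prediction for all-zero input:", pred_zero_input)
```

Because the first sample is all zeros, the weights cannot help with it; the only way the neuron can move its prediction toward 1 for that sample is by increasing the bias.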

Example of Defining Biases in Code

Let’s go through the steps of initializing biases, using a neural network with one hidden layer.

Step-by-Step Example

1. Import Libraries:

import numpy as np

2. Define Initialization Function:

def initialize_parameters(input_size, hidden_size, output_size):
    # Xavier-style initialization: scale random weights by sqrt(1 / fan_in)
    W1 = np.random.randn(input_size, hidden_size) * np.sqrt(1 / input_size)
    b1 = np.zeros((1, hidden_size))  # Biases for hidden layer
    W2 = np.random.randn(hidden_size, output_size) * np.sqrt(1 / hidden_size)
    b2 = np.zeros((1, output_size))  # Biases for output layer
    return W1, b1, W2, b2

3. Specify Network Dimensions:

input_size = 3  # Number of input features
hidden_size = 4  # Number of neurons in the hidden layer
output_size = 1  # Number of output neurons

4. Initialize Weights and Biases:

W1, b1, W2, b2 = initialize_parameters(input_size, hidden_size, output_size)

Explanation

b1: Biases for the hidden layer.
- Shape: (1, hidden_size) which is (1, 4).
- Initialized to zeros.
b2: Biases for the output layer.
- Shape: (1, output_size) which is (1, 1).
- Initialized to zeros.
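
The (1, hidden_size) shape matters because NumPy broadcasting stretches the single bias row across every example in the batch. Here is a small check of that behavior, assuming a batch of 5 examples and made-up bias values:

```python
import numpy as np

batch_size, input_size, hidden_size = 5, 3, 4  # batch_size is an assumption

X = np.ones((batch_size, input_size))
W1 = np.zeros((input_size, hidden_size))  # zero weights isolate the bias term
b1 = np.array([[0.1, 0.2, 0.3, 0.4]])     # illustrative values, shape (1, 4)

# b1 is broadcast from shape (1, 4) to (5, 4): every row receives the same biases
Z1 = np.dot(X, W1) + b1

print(Z1.shape)  # (5, 4)
print(Z1)
```

Each hidden neuron has one bias, shared by all examples, which is why the bias needs only one row regardless of the batch size.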

Biases in the Context of Forward Propagation

During forward propagation, biases are added to the weighted sum of inputs before applying the activation function. Here’s how it looks in code:

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1  # Add bias b1 to the weighted sum
    A1 = sigmoid(Z1)  # Apply activation function
    Z2 = np.dot(A1, W2) + b2  # Add bias b2 to the weighted sum
    A2 = sigmoid(Z2)  # Apply activation function
    return Z1, A1, Z2, A2

Putting It All Together

Here’s a complete code snippet including forward propagation with biases:

import numpy as np

def initialize_parameters(input_size, hidden_size, output_size):
    # Xavier-style initialization: scale random weights by sqrt(1 / fan_in)
    W1 = np.random.randn(input_size, hidden_size) * np.sqrt(1 / input_size)
    b1 = np.zeros((1, hidden_size))  # Biases for hidden layer
    W2 = np.random.randn(hidden_size, output_size) * np.sqrt(1 / hidden_size)
    b2 = np.zeros((1, output_size))  # Biases for output layer
    return W1, b1, W2, b2

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1  # Add bias b1 to the weighted sum
    A1 = sigmoid(Z1)  # Apply activation function
    Z2 = np.dot(A1, W2) + b2  # Add bias b2 to the weighted sum
    A2 = sigmoid(Z2)  # Apply activation function
    return Z1, A1, Z2, A2

# Define the neural network structure
input_size = 3  # Number of input features
hidden_size = 4  # Number of neurons in the hidden layer
output_size = 1  # Number of output neurons

# Initialize parameters
W1, b1, W2, b2 = initialize_parameters(input_size, hidden_size, output_size)

# Input data (example)
X = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])

# Forward propagation
Z1, A1, Z2, A2 = forward_propagation(X, W1, b1, W2, b2)

# Print the outputs
print("Z1:", Z1)
print("A1:", A1)
print("Z2:", Z2)
print("A2:", A2)

Summary

Biases are additional parameters that allow the activation function to be shifted, providing more flexibility to the model.
Biases are typically initialized to zero or small random values.
They are added to the weighted sum of inputs before applying the activation function during forward propagation.
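Sensible initialization of biases, as with weights, helps the network train efficiently; zeros are a common, safe default for biases because the randomly initialized weights already break the symmetry between neurons.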
