What are Biases in Neural Networks?
Biases are additional parameters in neural networks that are added to the weighted sum of inputs to a neuron before applying the activation function. They help the model to fit the data better by providing an additional degree of freedom.
Role of Biases
- Shifting the Activation Function: A bias shifts the activation function left or right along the input axis, which can be crucial for learning complex patterns. Without a bias, a neuron's weighted sum is always zero when its inputs are zero, so its output there is pinned to a fixed value (0.5 for a sigmoid) and its decision boundary must pass through the origin, which limits the flexibility of the model.
- Controlling Neuron Activation: Biases set the threshold at which a neuron fires (activates), so a neuron can still produce a learnable, non-fixed output when all input features are zero.
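To make the shifting effect concrete, here is a minimal sketch of a single sigmoid neuron evaluated with and without a bias. The neuron helper, the weight w = 1.0, and the bias value 2.0 are assumptions chosen for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def neuron(x, w, b):
    """Single sigmoid neuron (hypothetical helper): activation of w * x + b."""
    return sigmoid(w * x + b)

x = np.array([-2.0, 0.0, 2.0])
w = 1.0  # illustrative weight

no_bias = neuron(x, w, b=0.0)    # midpoint (output 0.5) sits at x = 0
with_bias = neuron(x, w, b=2.0)  # midpoint moves to x = -b / w = -2

print("no bias:  ", no_bias)
print("with bias:", with_bias)
```

With b = 0 the neuron outputs exactly 0.5 at x = 0 regardless of the weight. With b = 2 the same curve is shifted two units to the left, so the 0.5 point lands at x = -2 and the output at x = 0 is above 0.5.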
How to Define Biases
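Biases are typically initialized to zero or small random values. Zero is a safe default because the randomly initialized weights already break the symmetry between neurons. Like weights, biases are learned during training: backpropagation computes their gradients, and the optimizer uses those gradients to update them.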
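As a rough sketch of what "learned through backpropagation" means for a bias, the snippet below trains a single sigmoid layer with plain gradient descent. The data, the all-ones targets, the binary cross-entropy loss, and the learning rate are all assumptions made for this illustration; none of them come from the example network in this article.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical toy data: four examples, two features, every target equal to 1
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.ones((4, 1))

W = rng.standard_normal((2, 1)) * 0.1  # small random weights
b = np.zeros((1, 1))                   # bias starts at zero
learning_rate = 0.5                    # assumed value

for _ in range(100):
    A = sigmoid(X @ W + b)                # forward pass
    dZ = (A - y) / X.shape[0]             # gradient of mean binary cross-entropy w.r.t. Z
    dW = X.T @ dZ                         # gradient for the weights
    db = dZ.sum(axis=0, keepdims=True)    # bias gradient: sum over the batch
    W -= learning_rate * dW
    b -= learning_rate * db

print("learned bias:", b)
```

The bias gradient is the per-example gradient summed over the batch, because the same bias is added to every example. Here the first input row is all zeros, so the bias is the only parameter that can push that example's output toward its target.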
Example of Defining Biases in Code
Let’s go through the steps of initializing biases, using a neural network with one hidden layer.
Step-by-Step Example
1. Import Libraries:
import numpy as np
2. Define Initialization Function:
def initialize_parameters(input_size, hidden_size, output_size):
    # Xavier-style initialization for weights: scale by sqrt(1 / fan_in)
    W1 = np.random.randn(input_size, hidden_size) * np.sqrt(1 / input_size)
    b1 = np.zeros((1, hidden_size))  # Biases for hidden layer
    W2 = np.random.randn(hidden_size, output_size) * np.sqrt(1 / hidden_size)
    b2 = np.zeros((1, output_size))  # Biases for output layer
    return W1, b1, W2, b2
3. Specify Network Dimensions:
input_size = 3 # Number of input features
hidden_size = 4 # Number of neurons in the hidden layer
output_size = 1 # Number of output neurons
4. Initialize Weights and Biases:
W1, b1, W2, b2 = initialize_parameters(input_size, hidden_size, output_size)
Explanation
- b1: Biases for the hidden layer.
  - Shape: (1, hidden_size), which is (1, 4).
  - Initialized to zeros.
- b2: Biases for the output layer.
  - Shape: (1, output_size), which is (1, 1).
  - Initialized to zeros.
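The leading dimension of 1 in these shapes matters because NumPy broadcasts the bias row across every example in a batch. The sketch below illustrates this with a stand-in for the weighted sum; the array of ones and the bias values are hypothetical and chosen only to make the broadcast visible.

```python
import numpy as np

batch_size, hidden_size = 4, 4

# Stand-in for np.dot(X, W1): one row per example, one column per hidden neuron
weighted_sum = np.ones((batch_size, hidden_size))

# Hypothetical non-zero biases with shape (1, hidden_size)
b1 = np.array([[0.0, 0.1, 0.2, 0.3]])

# Broadcasting adds the single bias row to every row of the batch
Z1 = weighted_sum + b1

print(Z1.shape)
print(Z1)
```

Each column of Z1 receives the bias of its own neuron, and every row (example) receives the same set of biases.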
Biases in the Context of Forward Propagation
During forward propagation, biases are added to the weighted sum of inputs before applying the activation function. Here’s how it looks in code:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1   # Add bias b1 to the weighted sum
    A1 = sigmoid(Z1)          # Apply activation function
    Z2 = np.dot(A1, W2) + b2  # Add bias b2 to the weighted sum
    A2 = sigmoid(Z2)          # Apply activation function
    return Z1, A1, Z2, A2
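In matrix form, the forward pass above corresponds to the equations below. The bracketed superscripts are notation introduced here for the layer index and do not appear in the code; X holds one example per row and σ denotes the sigmoid.

```latex
\begin{aligned}
Z^{[1]} &= X W^{[1]} + b^{[1]}, & A^{[1]} &= \sigma\!\left(Z^{[1]}\right), \\
Z^{[2]} &= A^{[1]} W^{[2]} + b^{[2]}, & A^{[2]} &= \sigma\!\left(Z^{[2]}\right), \\
\sigma(z) &= \frac{1}{1 + e^{-z}}.
\end{aligned}
```

Each bias is added after the matrix product and before the activation, matching the order of operations in forward_propagation.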
Putting It All Together
Here’s a complete code snippet including forward propagation with biases:
import numpy as np

def initialize_parameters(input_size, hidden_size, output_size):
    # Xavier-style initialization for weights: scale by sqrt(1 / fan_in)
    W1 = np.random.randn(input_size, hidden_size) * np.sqrt(1 / input_size)
    b1 = np.zeros((1, hidden_size))  # Biases for hidden layer
    W2 = np.random.randn(hidden_size, output_size) * np.sqrt(1 / hidden_size)
    b2 = np.zeros((1, output_size))  # Biases for output layer
    return W1, b1, W2, b2

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_propagation(X, W1, b1, W2, b2):
    Z1 = np.dot(X, W1) + b1   # Add bias b1 to the weighted sum
    A1 = sigmoid(Z1)          # Apply activation function
    Z2 = np.dot(A1, W2) + b2  # Add bias b2 to the weighted sum
    A2 = sigmoid(Z2)          # Apply activation function
    return Z1, A1, Z2, A2

# Define the neural network structure
input_size = 3 # Number of input features
hidden_size = 4 # Number of neurons in the hidden layer
output_size = 1 # Number of output neurons
# Initialize parameters
W1, b1, W2, b2 = initialize_parameters(input_size, hidden_size, output_size)
# Input data (example)
X = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
# Forward propagation
Z1, A1, Z2, A2 = forward_propagation(X, W1, b1, W2, b2)
# Print the outputs
print("Z1:", Z1)
print("A1:", A1)
print("Z2:", Z2)
print("A2:", A2)
Summary
- Biases are additional parameters that allow the activation function to be shifted, providing more flexibility to the model.
- Biases are typically initialized to zero or small random values.
- They are added to the weighted sum of inputs before applying the activation function during forward propagation.
- As with weights, sensible initialization of biases helps the network train efficiently and effectively.