Neural Networks: From Biological to Artificial

Understanding how artificial neurons mimic their biological counterparts

Biological Neuron
Diagram: dendrites → cell body → axon

How it works:

Dendrites receive signals from other neurons. The cell body integrates these signals. If the combined signal exceeds a threshold, the neuron fires an action potential through the axon to other neurons.

Artificial Neuron (Perceptron)
Diagram: inputs x₁ and x₂ are multiplied by weights w₁ and w₂, summed (Σ) together with the bias b, and passed through the activation function f to produce the output y

How it works:

Inputs (x₁, x₂, ...) are weighted (w₁, w₂, ...) and summed. A bias (b) is added. The result passes through an activation function f(·) to produce the output y.
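
As a concrete sketch, the same forward pass can be written in a few lines of Python. The function and variable names here are our own, and sigmoid is assumed as the activation, matching the worked example further down:

    import math

    def sigmoid(z):
        # Squashes any real number into the range (0, 1)
        return 1.0 / (1.0 + math.exp(-z))

    def neuron_output(inputs, weights, bias, activation=sigmoid):
        # Weighted sum of the inputs plus the bias: z = x1*w1 + x2*w2 + ... + b
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        # The activation function maps z to the neuron's output y
        return activation(z)

For example, neuron_output([0.5, 0.3], [1.0, 0.8], -0.2) reproduces the numbers used in the worked calculation below.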

Key Similarities

Integration: Both biological and artificial neurons combine multiple inputs into a single output.

Threshold: Both have a threshold mechanism - biological neurons fire when the membrane potential exceeds a threshold, and artificial neurons activate when the weighted sum plus bias crosses the activation function's threshold (with a step activation, the neuron fires when the weighted sum exceeds -b); see the sketch after this list.

Non-linearity: Both introduce non-linearity - biological through action potential generation, artificial through activation functions.
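
To make the threshold analogy concrete, here is a minimal sketch (names are ours) of a neuron with a step activation: it "fires" (outputs 1) exactly when the weighted sum plus bias is positive, i.e. when the weighted sum crosses the threshold -b:

    def step_neuron(inputs, weights, bias):
        # Weighted sum plus bias, loosely analogous to the membrane potential
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        # Fire (1) only if the potential crosses the threshold, otherwise stay silent (0)
        return 1 if z > 0 else 0

    print(step_neuron([0.5, 0.3], [1.0, 0.8], -0.2))  # 1: 0.54 > 0, the neuron fires
    print(step_neuron([0.1, 0.1], [1.0, 0.8], -0.2))  # 0: -0.02 < 0, it stays silent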

Worked Calculation
Activation function: Sigmoid
Step 1: Calculate Weighted Sum
z = (x₁ × w₁) + (x₂ × w₂) + b
z = (0.5 × 1.0) + (0.3 × 0.8) + (-0.2) = 0.54
Each input is multiplied by its corresponding weight, then all products are summed together with the bias term.
Step 2: Apply Activation Function
y = σ(z)
y = σ(0.54) ≈ 0.632
The weighted sum is passed through the sigmoid activation function to produce the final output. This introduces non-linearity and bounds the output between 0 and 1.
Final Output
0.632
This is the neuron's prediction or classification result
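
Both steps can be checked with a short Python snippet (a sketch; variable names are our own):

    import math

    x = [0.5, 0.3]   # inputs x1, x2
    w = [1.0, 0.8]   # weights w1, w2
    b = -0.2         # bias

    # Step 1: weighted sum
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    print(round(z, 2))   # 0.54

    # Step 2: sigmoid activation
    y = 1.0 / (1.0 + math.exp(-z))
    print(round(y, 3))   # 0.632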

Why Do We Need Activation Functions?

Activation functions are crucial components of artificial neurons. They serve multiple important purposes:

1. Introduce Non-linearity: Without activation functions, a neural network would just be a series of linear transformations, which can be collapsed into a single matrix multiplication. Activation functions allow networks to learn complex, non-linear patterns (see the sketch after this list).

2. Control Output Range: Different activation functions bound the output to specific ranges (e.g., sigmoid outputs between 0 and 1, tanh between -1 and 1), which helps with numerical stability and interpretation.

3. Enable Learning: The gradients of activation functions are essential for backpropagation, the algorithm that allows neural networks to learn from data.
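
The first point can be checked directly: stacking linear layers without an activation in between collapses into a single linear map. A minimal NumPy sketch (matrix shapes and random values are arbitrary illustrations):

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(4, 3))   # first "layer"
    W2 = rng.normal(size=(2, 4))   # second "layer"
    x = rng.normal(size=3)         # input vector

    # Two linear layers with no activation in between...
    two_layers = W2 @ (W1 @ x)
    # ...are exactly one linear layer with the combined matrix W2 @ W1
    one_layer = (W2 @ W1) @ x
    print(np.allclose(two_layers, one_layer))   # True

    # Inserting a ReLU between the layers breaks this collapse
    with_relu = W2 @ np.maximum(0.0, W1 @ x)
    print(np.allclose(with_relu, one_layer))    # False (in general)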

In the calculation above, we used the sigmoid function. Below, you can explore other common activation functions and their properties.

Common Activation Functions
Sigmoid
σ(x) = 1 / (1 + e^(-x))
Output range: (0, 1). Smooth gradient, but can cause vanishing gradients. Good for binary classification.
ReLU (Rectified Linear Unit)
ReLU(x) = max(0, x)
Output range: [0, ∞). Most popular in deep learning. Solves vanishing gradient problem, but can cause "dying ReLU" issue.
Tanh (Hyperbolic Tangent)
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Output range: (-1, 1). Zero-centered, which often makes it preferable to sigmoid in hidden layers.
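
For reference, here are the three functions above written in plain Python (a sketch; in practice math.tanh would replace the explicit formula):

    import math

    def sigmoid(x):
        # Output in (0, 1); its gradient sigmoid(x) * (1 - sigmoid(x)) shrinks for large |x|
        return 1.0 / (1.0 + math.exp(-x))

    def relu(x):
        # Output in [0, inf); gradient is 0 for negative inputs ("dying ReLU")
        return max(0.0, x)

    def tanh(x):
        # Output in (-1, 1); zero-centered
        return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

    for z in (-2.0, 0.0, 2.0):
        print(z, round(sigmoid(z), 3), round(relu(z), 3), round(tanh(z), 3))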