Neural Networks: From Biological to Artificial

Understanding how artificial neurons mimic their biological counterparts

Biological Neuron
Diagram: dendrites → cell body → axon

How it works:

Dendrites receive signals from other neurons. The cell body integrates these signals. If the combined signal exceeds a threshold, the neuron fires an action potential through the axon to other neurons.

Artificial Neuron (Perceptron)
Diagram: inputs x₁ and x₂ are multiplied by weights w₁ and w₂, summed (Σ) together with the bias b, and passed through the activation function f to produce the output y

How it works:

Inputs (x₁, x₂, ...) are weighted (w₁, w₂, ...) and summed. A bias (b) is added. The result passes through an activation function f(·) to produce the output y.
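
As a concrete sketch, the same forward pass can be written in a few lines of Python. The function and variable names here are our own, and sigmoid is assumed as the activation, matching the worked example further down:

    import math

    def sigmoid(z):
        # Squashes any real number into the range (0, 1)
        return 1.0 / (1.0 + math.exp(-z))

    def neuron_output(inputs, weights, bias, activation=sigmoid):
        # Weighted sum of the inputs plus the bias: z = x1*w1 + x2*w2 + ... + b
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        # The activation function maps z to the neuron's output y
        return activation(z)

For example, neuron_output([0.5, 0.3], [1.0, 0.8], -0.2) reproduces the numbers used in the worked calculation below.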

Key Similarities

Integration: Both biological and artificial neurons combine multiple inputs into a single output.

Threshold: Both have a threshold mechanism - biological neurons fire when the membrane potential exceeds a threshold, and artificial neurons activate when the weighted sum plus bias crosses the activation function's threshold (with a step activation, the neuron fires when the weighted sum exceeds -b); see the sketch after this list.

Non-linearity: Both introduce non-linearity - biological through action potential generation, artificial through activation functions.
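
To make the threshold analogy concrete, here is a minimal sketch (names are ours) of a neuron with a step activation: it "fires" (outputs 1) exactly when the weighted sum plus bias is positive, i.e. when the weighted sum crosses the threshold -b:

    def step_neuron(inputs, weights, bias):
        # Weighted sum plus bias, loosely analogous to the membrane potential
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        # Fire (1) only if the potential crosses the threshold, otherwise stay silent (0)
        return 1 if z > 0 else 0

    print(step_neuron([0.5, 0.3], [1.0, 0.8], -0.2))  # 1: 0.54 > 0, the neuron fires
    print(step_neuron([0.1, 0.1], [1.0, 0.8], -0.2))  # 0: -0.02 < 0, it stays silent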

Worked Calculation
Activation function: Sigmoid
Step 1: Calculate Weighted Sum
z = (x₁ × w₁) + (x₂ × w₂) + b
z = (0.5 × 1.0) + (0.3 × 0.8) + (-0.2) = 0.54
Each input is multiplied by its corresponding weight, then all products are summed together with the bias term.
Step 2: Apply Activation Function
y = σ(z)
y = σ(0.54) ≈ 0.632
The weighted sum is passed through the sigmoid activation function to produce the final output. This introduces non-linearity and bounds the output between 0 and 1.
Final Output
0.632
This is the neuron's prediction or classification result
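
Both steps can be checked with a short Python snippet (a sketch; variable names are our own):

    import math

    x = [0.5, 0.3]   # inputs x1, x2
    w = [1.0, 0.8]   # weights w1, w2
    b = -0.2         # bias

    # Step 1: weighted sum
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    print(round(z, 2))   # 0.54

    # Step 2: sigmoid activation
    y = 1.0 / (1.0 + math.exp(-z))
    print(round(y, 3))   # 0.632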

Why Do We Need Activation Functions?

Activation functions are crucial components of artificial neurons. They serve multiple important purposes:

1. Introduce Non-linearity: Without activation functions, a neural network would just be a series of linear transformations, which can be collapsed into a single matrix multiplication. Activation functions allow networks to learn complex, non-linear patterns (see the sketch after this list).

2. Control Output Range: Different activation functions bound the output to specific ranges (e.g., sigmoid outputs between 0 and 1, tanh between -1 and 1), which helps with numerical stability and interpretation.

3. Enable Learning: The gradients of activation functions are essential for backpropagation, the algorithm that allows neural networks to learn from data.
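
The first point can be checked directly: stacking linear layers without an activation in between collapses into a single linear map. A minimal NumPy sketch (matrix shapes and random values are arbitrary illustrations):

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(4, 3))   # first "layer"
    W2 = rng.normal(size=(2, 4))   # second "layer"
    x = rng.normal(size=3)         # input vector

    # Two linear layers with no activation in between...
    two_layers = W2 @ (W1 @ x)
    # ...are exactly one linear layer with the combined matrix W2 @ W1
    one_layer = (W2 @ W1) @ x
    print(np.allclose(two_layers, one_layer))   # True

    # Inserting a ReLU between the layers breaks this collapse
    with_relu = W2 @ np.maximum(0.0, W1 @ x)
    print(np.allclose(with_relu, one_layer))    # False (in general)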

In the calculation above, we used the sigmoid function. Below, you can explore other common activation functions and their properties.

Common Activation Functions
Sigmoid
σ(x) = 1 / (1 + e^(-x))
Output range: (0, 1). Smooth gradient, but can cause vanishing gradients. Good for binary classification.
ReLU (Rectified Linear Unit)
ReLU(x) = max(0, x)
Output range: [0, ∞). Most popular in deep learning. Solves vanishing gradient problem, but can cause "dying ReLU" issue.
Tanh (Hyperbolic Tangent)
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Output range: (-1, 1). Zero-centered, which often makes it preferable to sigmoid in hidden layers.
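
For reference, here are the three functions above written in plain Python (a sketch; in practice math.tanh would replace the explicit formula):

    import math

    def sigmoid(x):
        # Output in (0, 1); its gradient sigmoid(x) * (1 - sigmoid(x)) shrinks for large |x|
        return 1.0 / (1.0 + math.exp(-x))

    def relu(x):
        # Output in [0, inf); gradient is 0 for negative inputs ("dying ReLU")
        return max(0.0, x)

    def tanh(x):
        # Output in (-1, 1); zero-centered
        return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

    for z in (-2.0, 0.0, 2.0):
        print(z, round(sigmoid(z), 3), round(relu(z), 3), round(tanh(z), 3))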