The Mystery Behind Activation Functions

  • Writer: subrata sarkar
  • Aug 12
  • 1 min read

The activation function is a core concept in neural networks—it's what gives artificial neurons the ability to learn complex patterns and make decisions beyond simple linear relationships.

What Is an Activation Function?

An activation function determines whether a neuron should "fire" or not, based on the input it receives. It transforms the weighted sum of inputs into an output signal that gets passed to the next layer.

Think of it as the decision-making gate of a neuron.
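To make the "gate" idea concrete, here is a minimal sketch of a single neuron in Python with NumPy, using a sigmoid as the gate. The input, weight, and bias values are illustrative only:

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then an activation 'gate'."""
    z = np.dot(inputs, weights) + bias   # weighted sum of inputs
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid gate: squashes z into (0, 1)

# Illustrative values, not from any real model
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.7, -0.2])
print(neuron(x, w, bias=0.1))  # near 1 means "fire", near 0 means "stay quiet"
```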

Why Is It Important?

Without activation functions, a neural network would just be a linear regression model: no matter how many linear layers you stack, their composition is still a single linear transformation. Activation functions introduce non-linearity, allowing networks to learn and represent the complex functions behind tasks such as image recognition, language understanding, and financial forecasting.
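You can verify the collapse-to-linear claim directly. This quick NumPy check (random weights, purely illustrative) shows that two stacked linear layers are exactly equivalent to one:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(3, 2))
x = rng.normal(size=4)

# Two linear layers with no activation in between...
two_layers = (x @ W1) @ W2
# ...equal a single linear layer whose weights are W1 @ W2.
one_layer = x @ (W1 @ W2)

print(np.allclose(two_layers, one_layer))  # True: depth adds nothing without non-linearity
```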

Common Types of Activation Functions

| Activation Function | Formula | Characteristics | Use Cases |
| --- | --- | --- | --- |
| Sigmoid | $\sigma(x) = \frac{1}{1 + e^{-x}}$ | Smooth, outputs between 0 and 1 | Binary classification |
| Tanh | $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ | Outputs between -1 and 1 | Hidden layers in early models |
| ReLU (Rectified Linear Unit) | $f(x) = \max(0, x)$ | Fast, sparse activation | Most common in deep networks |
| Leaky ReLU | $f(x) = \max(0.01x, x)$ | Fixes ReLU's "dying neuron" issue | Deep learning |
| Softmax | $\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}$ | Converts outputs to probabilities | Multi-class classification |
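Every function in the table is only a few lines of NumPy. The sketch below is one straightforward implementation; the max-subtraction in softmax is a standard numerical-stability trick, not something the formulas above require:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.maximum(alpha * x, x)  # small slope alpha keeps negative inputs alive

def softmax(x):
    e = np.exp(x - np.max(x))        # subtract max for numerical stability
    return e / e.sum()

x = np.array([-2.0, 0.0, 3.0])
for f in (sigmoid, tanh, relu, leaky_relu, softmax):
    print(f.__name__, f(x))
```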

Where It Fits in the Network

  1. Input Layer: No activation function—just raw data.

  2. Hidden Layers: Use ReLU, Tanh, etc., to extract features.

  3. Output Layer: Depends on the task (see the sketch after this list):

    • Sigmoid for binary classification

    • Softmax for multi-class

    • Linear for regression
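Putting the three layer roles together, here is a minimal forward pass for a hypothetical 3-class classifier: raw inputs with no activation, one ReLU hidden layer, and a softmax output. The layer sizes and random weights are illustrative only:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - np.max(z))  # stability shift before exponentiating
    return e / e.sum()

rng = np.random.default_rng(1)
x = rng.normal(size=5)                        # input layer: raw features, no activation
W1, b1 = rng.normal(size=(5, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

h = relu(x @ W1 + b1)                         # hidden layer: ReLU extracts features
y = softmax(h @ W2 + b2)                      # output layer: softmax for 3-class probabilities
print(y, y.sum())                             # class probabilities summing to 1
```

For binary classification you would swap the softmax output for a single sigmoid unit, and for regression you would leave the output linear, matching the list above.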
