Activations API
Activation functions introduce non-linearity into the network, allowing it to learn complex patterns.
Base Activation
mpneuralnetwork.activations.Activation
Bases: Layer
Base class for activation functions.
Activations are treated as layers in this framework. They apply a non-linear transformation element-wise to the input.
Attributes:
| Name | Type | Description |
|---|---|---|
| `activation` | `Callable` | The function to apply during the forward pass. |
| `activation_prime` | `Callable` | The derivative of the function for the backward pass. |
Source code in src/mpneuralnetwork/activations.py
params (property)
__init__(activation, activation_prime)
Initializes the activation layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `activation` | `Callable[[ArrayType], ArrayType]` | The activation function. | required |
| `activation_prime` | `Callable[[ArrayType], ArrayType]` | The derivative of the activation function. | required |
backward(output_gradient_batch)
Computes the gradient of the activation function.
Applies the chain rule: grad = output_gradient * activation'(input).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `output_gradient_batch` | `ArrayType` | Gradient flowing from the next layer. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `ArrayType` | `ArrayType` | Gradient with respect to the input. |
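The chain rule above can be sketched in NumPy. This is a minimal illustration, not the library's source: the class name `SketchActivation` is hypothetical, and it assumes that `forward` caches its input so that `backward` can evaluate the derivative at that input.

```python
import numpy as np


class SketchActivation:
    """Hypothetical stand-in for an element-wise activation layer."""

    def __init__(self, activation, activation_prime):
        self.activation = activation
        self.activation_prime = activation_prime
        self.input = None  # assumption: cached by forward for use in backward

    def forward(self, input_batch, training=True):
        self.input = input_batch
        return self.activation(input_batch)

    def backward(self, output_gradient_batch):
        # Chain rule: grad = output_gradient * activation'(input)
        return output_gradient_batch * self.activation_prime(self.input)


# Usage: wrap tanh and its derivative, then run one forward/backward pass.
layer = SketchActivation(np.tanh, lambda x: 1.0 - np.tanh(x) ** 2)
x = np.array([[-1.0, 0.0, 2.0]])
y = layer.forward(x)
grad = layer.backward(np.ones_like(x))
```

Because the transformation is element-wise, both `y` and `grad` have the same shape as `x`.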
forward(input_batch, training=True)
Applies the activation function to the input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_batch` | `ArrayType` | Input data of any shape. | required |
| `training` | `bool` | Whether in training mode. Defaults to True. | `True` |
Returns:
| Name | Type | Description |
|---|---|---|
| `ArrayType` | `ArrayType` | Activated output with the same shape as the input. |
Hidden Layers
These activations are typically used in intermediate layers.
mpneuralnetwork.activations.ReLU
Bases: Activation
Rectified Linear Unit activation function.
Formula
f(x) = max(0, x)
Range: [0, inf). Computationally efficient and mitigates the vanishing gradient problem. Most common activation for hidden layers in deep networks.
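As a minimal NumPy sketch of the formula above (the function names `relu` and `relu_prime` are illustrative and not taken from the library):

```python
import numpy as np


def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0.0, x)


def relu_prime(x):
    # Derivative is 1 for x > 0 and 0 otherwise.
    # The value at exactly x == 0 is a convention; 0 is assumed here.
    return (x > 0).astype(x.dtype)


x = np.array([-2.0, 0.0, 3.0])
relu_out = relu(x)
relu_grad = relu_prime(x)
```

The gradient is exactly 1 wherever the unit is active, which is why ReLU does not shrink gradients the way saturating functions do.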
mpneuralnetwork.activations.PReLU
Bases: Activation
Parametric Rectified Linear Unit.
Formula
f(x) = x if x > 0
f(x) = alpha * x if x <= 0
Where alpha is a learnable parameter updated during training.
Allows the network to learn the negative slope, avoiding "dying ReLU" problems.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `alpha` | `float` | Initial value for the negative slope. Defaults to 0.01. | `0.01` |
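The following sketch illustrates the PReLU formula and the two gradients it needs: one with respect to the input and one with respect to `alpha`. It assumes a single scalar `alpha` shared by all elements; the function names are hypothetical, and how the library stores and updates `alpha` is not shown in this page.

```python
import numpy as np


def prelu(x, alpha=0.01):
    # f(x) = x for x > 0, alpha * x otherwise
    return np.where(x > 0, x, alpha * x)


def prelu_grads(x, output_gradient, alpha=0.01):
    # Gradient w.r.t. the input: slope 1 on the positive side, alpha elsewhere
    grad_input = output_gradient * np.where(x > 0, 1.0, alpha)
    # Gradient w.r.t. alpha: df/dalpha is x where x <= 0 and 0 elsewhere,
    # summed over all elements because alpha is shared (assumption)
    grad_alpha = np.sum(output_gradient * np.where(x > 0, 0.0, x))
    return grad_input, grad_alpha


x = np.array([-2.0, 1.0, -4.0])
out = prelu(x, alpha=0.1)
grad_input, grad_alpha = prelu_grads(x, np.ones_like(x), alpha=0.1)
```

Since the negative side keeps a non-zero slope, units with negative inputs still receive a gradient and can recover during training.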
mpneuralnetwork.activations.Swish
Bases: Activation
Swish activation function.
Formula
f(x) = x * sigmoid(x)
Range: approximately [-0.278, inf). Proposed by researchers at Google Brain. A smooth, non-monotonic function that often outperforms ReLU in deep networks.
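A short NumPy sketch of Swish and its derivative, which also locates the dip below zero numerically. The function names are illustrative and not part of the documented API.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def swish(x):
    # f(x) = x * sigmoid(x)
    return x * sigmoid(x)


def swish_prime(x):
    # Product rule: f'(x) = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s + x * s * (1.0 - s)


# Evaluate on a fine grid to find the lowest point of the curve.
x = np.linspace(-6.0, 6.0, 2001)
minimum = swish(x).min()
argmin = x[np.argmin(swish(x))]
```

The derivative changes sign at the minimum, which is what makes the function non-monotonic for negative inputs.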
mpneuralnetwork.activations.Tanh
Bases: Activation
Hyperbolic Tangent activation function.
Formula
f(x) = tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
Range: (-1, 1). Zero-centered, making it often preferable to Sigmoid for hidden layers.
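The sketch below checks the explicit exponential formula against `np.tanh` and shows the derivative used in the backward pass. The helper names are illustrative only.

```python
import numpy as np


def tanh(x):
    # np.tanh is preferred in practice: the explicit exp formula overflows for large |x|
    return np.tanh(x)


def tanh_prime(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2


x = np.array([-2.0, 0.0, 2.0])
# Explicit formula from the definition, safe here because |x| is small
explicit = (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))
outputs = tanh(x)
slopes = tanh_prime(x)
```

The outputs are symmetric around zero (`tanh(-x) == -tanh(x)`), which is the zero-centered property mentioned above.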
mpneuralnetwork.activations.Sigmoid
Bases: Activation
Sigmoid activation function.
Formula
f(x) = 1 / (1 + exp(-x))
Range: (0, 1). Used for binary classification (output layer) or gating mechanisms (like in LSTMs). Can suffer from vanishing gradients in deep networks.
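A minimal sketch of the sigmoid and its derivative follows. It uses a numerically stable formulation, which is an assumption of this example rather than a statement about how the library computes it; the function names are illustrative.

```python
import numpy as np


def sigmoid(x):
    # Stable form: only exponentiates non-positive numbers, so it never overflows
    x = np.asarray(x, dtype=float)
    e = np.exp(-np.abs(x))
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))


def sigmoid_prime(x):
    # f'(x) = f(x) * (1 - f(x)); peaks at 0.25 when x == 0
    s = sigmoid(x)
    return s * (1.0 - s)


x = np.array([-1000.0, 0.0, 1000.0])
probs = sigmoid(x)
grads = sigmoid_prime(x)
```

For large positive or negative inputs the output saturates and the derivative collapses toward zero, which is the source of the vanishing-gradient issue noted above.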
Output Layers
These activations are typically used in the final layer to produce probability distributions.
mpneuralnetwork.activations.Softmax
Bases: Layer
Softmax activation function.
Formula
f(x)_i = exp(x_i / T) / sum(exp(x_j / T))
Typically used in the output layer for multi-class classification. Converts a vector of K real numbers into a probability distribution of K possible outcomes. The temperature parameter T is used to scale the logits before computing the softmax.
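To illustrate the effect of the temperature, here is a small NumPy sketch of the formula. The row-wise max subtraction is a common stabilization trick assumed for this example, and the placement of `epsilon` follows the parameter description (added to the denominator); neither is confirmed as the library's exact implementation.

```python
import numpy as np


def softmax(logits, temperature=1.0, epsilon=1e-8):
    # Scale the logits by the temperature T before exponentiating
    scaled = logits / temperature
    # Assumption: shift by the row-wise max so np.exp never overflows
    scaled = scaled - np.max(scaled, axis=-1, keepdims=True)
    exps = np.exp(scaled)
    # epsilon in the denominator guards against division by zero
    return exps / (np.sum(exps, axis=-1, keepdims=True) + epsilon)


logits = np.array([[1.0, 2.0, 3.0]])
p_default = softmax(logits)                 # T = 1
p_sharp = softmax(logits, temperature=0.5)  # T < 1: more peaked
p_flat = softmax(logits, temperature=5.0)   # T > 1: closer to uniform
```

Lower temperatures concentrate probability on the largest logit, while higher temperatures spread it more evenly across classes.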
__init__(temperature=1.0, epsilon=1e-08)
Initializes the Softmax layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `temperature` | `float` | Temperature parameter. Defaults to 1.0. | `1.0` |
| `epsilon` | `float` | Small float added to denominator to avoid dividing by zero. Defaults to 1e-8. | `1e-08` |
backward(output_gradient_batch)
Computes gradient for Softmax.
Note: This is rarely used directly if using CategoricalCrossEntropy loss,
as the framework optimizes the combined gradient calculation for numerical stability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `output_gradient_batch` | `ArrayType` | Gradient from next layer. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `ArrayType` | `ArrayType` | Gradient w.r.t input. |
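The note about the combined gradient can be made concrete with a sketch. It assumes a temperature of 1, one-hot targets, and a per-sample cross-entropy without batch averaging; the function names are hypothetical and the framework's actual `CategoricalCrossEntropy` handling may differ.

```python
import numpy as np


def softmax(logits):
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


def softmax_backward(probs, output_gradient_batch):
    # Jacobian-vector product per row: s * (g - sum(g * s)), for T = 1
    dot = np.sum(output_gradient_batch * probs, axis=-1, keepdims=True)
    return probs * (output_gradient_batch - dot)


logits = np.array([[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]])
targets = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # one-hot labels
probs = softmax(logits)

# Gradient of the per-sample cross-entropy -sum(y * log(p)) w.r.t. the probabilities
ce_grad = -targets / probs
# Chaining it through softmax collapses to the shortcut p - y
chained = softmax_backward(probs, ce_grad)
shortcut = probs - targets
```

The shortcut avoids dividing by small probabilities, which is one reason a combined softmax and cross-entropy gradient is numerically more stable than chaining the two separately.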
forward(input_batch, training=True)
Applies Softmax function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_batch` | `ArrayType` | Input logits of shape (batch_size, num_classes). | required |
| `training` | `bool` | Unused. Defaults to True. | `True` |
Returns:
| Name | Type | Description |
|---|---|---|
| `ArrayType` | `ArrayType` | Probabilities of shape (batch_size, num_classes). |
mpneuralnetwork.activations.Sigmoid
Bases: Activation
Sigmoid activation function.
Formula
f(x) = 1 / (1 + exp(-x))
Range: (0, 1). Used for binary classification (output layer) or gating mechanisms (like in LSTMs). Can suffer from vanishing gradients in deep networks.