When programming deep learning models with TensorFlow, two commonly used activation functions are relu (rectified linear unit) and leaky_relu (leaky rectified linear unit). These functions add non-linearity to a model and transform the input data into more useful representations.
nn.relu() is a function that keeps the positive values of the input tensor and sets the negative values to zero. It can be defined mathematically as:
relu(x) = max(0, x)
Let's see an example of how to use the nn.relu() function in TensorFlow:
import tensorflow as tf

# Input tensor containing negative, zero, and positive values
x = tf.constant([-2, -1, 0, 1, 2, 3], dtype=tf.float32)
# ReLU keeps positive values and replaces negative values with 0
y = tf.nn.relu(x)
print(y) # output: [0. 0. 0. 1. 2. 3.]
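Since relu(x) = max(0, x), the same result can be reproduced directly with tf.maximum. The snippet below is just a minimal sketch of that equivalence, reusing the same example values:
import tensorflow as tf

# Manual ReLU from the formula max(0, x); should match tf.nn.relu(x)
x = tf.constant([-2, -1, 0, 1, 2, 3], dtype=tf.float32)
y_manual = tf.maximum(0.0, x)
print(y_manual) # output: [0. 0. 0. 1. 2. 3.]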
nn.leaky_relu() is similar to nn.relu(), but instead of setting negative values to zero, it applies a small negative slope to them. This helps to avoid the "dying ReLU" problem, which can occur when the gradient of relu becomes zero and the affected weights stop updating during training.
The function can be defined as:
leaky_relu(x) = max(α*x, x)
where α is the slope for negative values (typically between 0.01 and 0.3, depending on the problem domain).
Let's see an example of how to use the nn.leaky_relu() function in TensorFlow:
import tensorflow as tf

# Same input tensor as before
x = tf.constant([-2, -1, 0, 1, 2, 3], dtype=tf.float32)
# Leaky ReLU keeps positive values and multiplies negative values by alpha
y = tf.nn.leaky_relu(x, alpha=0.1)
print(y) # output: [-0.2 -0.1  0.  1.  2.  3.]
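As with relu, the formula leaky_relu(x) = max(α*x, x) can be evaluated directly with tf.maximum (this works because 0 < α < 1). The following is only an illustrative sketch using the same values:
import tensorflow as tf

# Leaky ReLU from the formula max(alpha * x, x), with alpha = 0.1
alpha = 0.1
x = tf.constant([-2, -1, 0, 1, 2, 3], dtype=tf.float32)
y_manual = tf.maximum(alpha * x, x)
print(y_manual) # output: [-0.2 -0.1  0.  1.  2.  3.]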
nn.relu() and nn.leaky_relu() are two popular activation functions used in deep learning with TensorFlow. nn.relu() sets negative values to zero, while nn.leaky_relu() applies a small negative slope to them. Choose the activation function that best matches your problem based on these properties.
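In practice, these activations are usually attached to layers rather than applied to raw tensors. The sketch below shows one possible way to do this with tf.keras; the layer sizes, input shape, and alpha value are arbitrary choices for illustration, and the alpha argument name follows the TF 2.x Keras API:
import tensorflow as tf

# Hypothetical model: one Dense layer with a ReLU activation, and one Dense
# layer followed by a LeakyReLU layer with a small negative slope.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),           # 10 input features (arbitrary)
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32),
    tf.keras.layers.LeakyReLU(alpha=0.1),  # same slope as the example above
    tf.keras.layers.Dense(1),
])
model.summary()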