
Derivatives of activation functions

April 7, 2019 • Deep Learning

During the backward pass of gradient descent, we inevitably need the derivatives (i.e. gradients) of the activation functions.

First, the derivative of the sigmoid function (writing $a = g(z)$ for the activation output):

$$ g(z)=\frac{1}{1+e^{-z}} \\ g'(z)=\frac{d}{dz}g(z)=g(z)\left(1-g(z)\right)=a(1-a) $$
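A minimal NumPy sketch of this derivative, reusing the forward output $a$ (the names `sigmoid` and `sigmoid_derivative` are illustrative, not from any particular framework):

```python
import numpy as np

def sigmoid(z):
    """Forward pass: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    """Backward pass: g'(z) = a * (1 - a), where a = g(z)."""
    a = sigmoid(z)
    return a * (1.0 - a)

print(sigmoid_derivative(0.0))  # 0.25, the largest value the sigmoid derivative can take
```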

The derivative of the tanh function:

$$ g(z) = \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \\ \frac{d}{dz}g(z) = 1 - \left(g(z)\right)^{2}=1-a^2 $$
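The same pattern works for tanh, again expressing the gradient in terms of the cached activation $a$ (an illustrative sketch, assuming NumPy):

```python
import numpy as np

def tanh_derivative(z):
    """g'(z) = 1 - a^2, where a = tanh(z) is cached from the forward pass."""
    a = np.tanh(z)
    return 1.0 - a ** 2

print(tanh_derivative(0.0))  # 1.0, the maximum of the tanh derivative
```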

The derivative of the ReLU function:

$$ g(z)=\max(0,z) \\ g'(z)= \begin{cases} 0 & \text{if } z < 0 \\ 1 & \text{if } z > 0 \\ \text{undefined} & \text{if } z = 0 \end{cases} $$
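In code the undefined point at $z = 0$ is usually folded into one of the branches; the sketch below (assuming NumPy, with the illustrative name `relu_derivative`) assigns it a gradient of 0:

```python
import numpy as np

def relu_derivative(z):
    """g'(z): 1 where z > 0, 0 where z < 0; the z = 0 case is assigned 0 here."""
    return np.where(z > 0, 1.0, 0.0)

print(relu_derivative(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 1.]
```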

For the Leaky ReLU function:

$$ g(z)=\max(0.01z,z) \\ g'(z)= \begin{cases} 0.01 & \text{if } z < 0 \\ 1 & \text{if } z > 0 \\ \text{undefined} & \text{if } z = 0 \end{cases} $$
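Likewise for Leaky ReLU, with the negative-side slope exposed as a parameter (a sketch assuming NumPy; the default `alpha=0.01` matches the 0.01 used above, and the $z = 0$ case falls into the `alpha` branch):

```python
import numpy as np

def leaky_relu_derivative(z, alpha=0.01):
    """g'(z): 1 where z > 0, alpha where z <= 0."""
    return np.where(z > 0, 1.0, alpha)

print(leaky_relu_derivative(np.array([-2.0, 0.0, 3.0])))  # alpha for the non-positive entries, 1 for the positive one
```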
