To introduce non-linearity, a hyperbolic tangent or sigmoid (S-shaped) function is commonly used as the activation function. This non-linearity is deliberately analogous to the behaviour of biological neurons, and it is responsible for the versatile information-processing properties of such networks.
One reason for the sigmoid function's popularity in neural networks [Wik07b] is that it satisfies the differential equation $\frac{dy}{dt} = y(1 - y)$. The right-hand side is a low-order polynomial. Furthermore, the polynomial has the factors $y$ and $1 - y$, both of which are simple to compute. Given the value $y$ of the sigmoid at a particular $t$, the derivative of the sigmoid function at that $t$ can be obtained by multiplying the two factors together, without evaluating the exponential again. This relationship simplifies implementations of artificial neural networks built from artificial neurons.
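The derivative shortcut described above can be illustrated with a minimal Python sketch. It assumes the standard logistic form of the sigmoid, 1 / (1 + e^(-t)); the function names `sigmoid` and `sigmoid_derivative_from_output` are hypothetical and chosen only for this illustration. The sketch compares the product of the two factors against a central finite-difference estimate of the slope.

```python
import math


def sigmoid(t: float) -> float:
    """Logistic sigmoid, assumed here to be 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def sigmoid_derivative_from_output(y: float) -> float:
    """Derivative of the sigmoid expressed through its own output y.

    Uses the identity dy/dt = y * (1 - y), so no exponential is
    evaluated a second time.
    """
    return y * (1.0 - y)


# Evaluate the neuron's output once, then reuse it for the derivative.
t = 0.5
y = sigmoid(t)
analytic = sigmoid_derivative_from_output(y)

# Central finite difference as an independent check of the slope at t.
h = 1e-6
numeric = (sigmoid(t + h) - sigmoid(t - h)) / (2.0 * h)

print(f"y = {y:.6f}")
print(f"derivative via y * (1 - y): {analytic:.6f}")
print(f"derivative via finite difference: {numeric:.6f}")
```

In a training procedure that needs the slope of the activation function, the output y of each neuron is typically already available, so the derivative costs only one subtraction and one multiplication.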
Erik de Bruijn 2007-10-19