We motivated the need for activation functions in our comprehensive article on multilayer perceptrons. In that article, we used the ReLU activation function as an example of how to incorporate nonlinearity into the activations of the hidden layers of a deep feedforward network.
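As a quick reminder of that example, ReLU can be sketched in a few lines of NumPy (a minimal illustration, not the implementation from the earlier article):

```python
import numpy as np

def relu(x):
    # ReLU passes positive inputs through unchanged and zeroes out
    # negative inputs, which is what makes the mapping nonlinear.
    return np.maximum(0.0, x)

# Applied elementwise to a hidden-layer pre-activation vector:
z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))  # negative entries become 0.0; positive entries are unchanged
```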
In this article, we survey several other activation functions and comment on the scenarios in which each is appropriate.
We focus specifically on the activation functions commonly used in deep neural networks. For a more comprehensive listing of activation functions, refer to the corresponding Wikipedia article.