What exactly is the “softmax and the multinomial logistic loss” in the context of machine learning?
The softmax function is a generalization of the logistic function that lets us compute meaningful class probabilities in multi-class settings (multinomial logistic regression). In softmax, we compute the probability that a particular sample (with net input z) belongs to the ith class using a normalization term in the denominator that is the sum of the exponentiated net inputs of all M classes:

$$P(y = i \mid \mathbf{z}) = \frac{e^{z_i}}{\sum_{j=1}^{M} e^{z_j}}$$
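In contrast, the logistic function maps a single net input z to the probability of the positive class in a binary setting:

$$P(y = 1 \mid z) = \frac{1}{1 + e^{-z}}$$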
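And for completeness, we define the net input as

$$z = \mathbf{w}^T \mathbf{x} = \sum_{l} w_l x_l$$

where the weight coefficients of your model are stored as the vector “w” and “x” is the feature vector of your sample. In the multi-class case, each class j has its own weight vector, so that there is one net input z_j per class.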
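The definitions above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation. The weight matrix `W` (one row of weights per class), the feature vector `x`, and all numeric values are made up for the example. Subtracting the largest net input before exponentiating is a common numerical-stability step that is not part of the definition and does not change the result.

```python
import numpy as np


def net_input(W, x):
    """Net inputs z_j = w_j^T x, one per class.

    Assumed layout: W holds one weight vector per class as its rows.
    """
    return W @ x


def softmax(z):
    """Class probabilities e^{z_i} / sum_j e^{z_j} for a vector of net inputs."""
    z = np.asarray(z, dtype=float)
    # Shifting by max(z) avoids overflow in exp; the ratio is unchanged.
    exp_z = np.exp(z - np.max(z))
    return exp_z / exp_z.sum()


def logistic(z):
    """Logistic (sigmoid) function 1 / (1 + e^{-z}) for a single net input."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical example: M = 3 classes, 3 features (values chosen arbitrarily).
W = np.array([[0.1, 0.5, -0.3],
              [0.0, -0.2, 0.8],
              [-0.4, 0.3, 0.1]])
x = np.array([1.0, 2.0, 0.5])

z = net_input(W, x)   # one net input per class
probs = softmax(z)    # non-negative values that sum to 1

print("net inputs:   ", z)
print("probabilities:", probs)
```

With only two classes and the second net input fixed at 0, the softmax probability of the first class is e^z / (e^z + 1) = 1 / (1 + e^{-z}), which is exactly the logistic function. That is the sense in which softmax generalizes it; `softmax([z, 0.0])[0]` and `logistic(z)` should agree up to floating-point error.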