As a statistics professor who teaches machine learning classes, this is one of the questions students ask me most frequently. There are, of course, many ways to slice and dice it, but if I had to boil it down to a few key points, I would highlight the following:

  • In statistical modeling, we usually use parametric approaches (think of linear or logistic regression as the simplest examples; we specify the number of parameters upfront), whereas in machine learning, we often use nonparametric approaches, meaning we don't pre-specify the model's structure (e.g., K-nearest neighbors, decision trees, kernel SVMs); see the code sketch after this list
  • In most statistical models, we assume that the effects of the features (predictors, covariates) are additive unless we explicitly include interaction terms
  • With statistical models, we usually care a great deal about uncertainty estimates (confidence intervals, hypothesis tests, etc.)
  • With machine learning models, we typically don't rely on strong assumptions such as the absence of multicollinearity or normally distributed residuals
  • ML models usually achieve better absolute predictive performance than statistical models (although they often don't offer the same level of interpretability)
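
To make the first and third points concrete, here is a minimal sketch contrasting the two mindsets. It uses statsmodels for a parametric logistic regression (which reports coefficient-level standard errors and confidence intervals) and scikit-learn for a nonparametric K-nearest neighbor classifier. The synthetic dataset, random seed, and n_neighbors=5 setting are purely illustrative choices, not part of the discussion above:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.neighbors import KNeighborsClassifier

# Illustrative synthetic data: 2 features, binary outcome.
rng = np.random.default_rng(seed=1)
X = rng.normal(size=(200, 2))
signal = 1.0 * X[:, 0] + 0.5 * X[:, 1]
y = (signal + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Statistical modeling: a parametric logistic regression.
# The model form is fixed upfront (one coefficient per feature
# plus an intercept), and the fit comes with standard errors,
# p-values, and confidence intervals for each coefficient.
logreg = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(logreg.summary())    # coefficient table with 95% CIs
print(logreg.conf_int())   # confidence intervals directly

# Machine learning: a nonparametric K-nearest neighbor classifier.
# No functional form is specified upfront, and there are no
# coefficient-level uncertainty estimates; the emphasis is on
# predictive performance.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.score(X, y))     # training accuracy
```

Note how the statistical model's output centers on the coefficient table with its uncertainty estimates, while the KNN classifier exposes only predictions and an accuracy score.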

Two articles on this topic that I would recommend are:

If you like this content and are looking for similar, more polished Q&As, check out my new book, Machine Learning Q and AI.