Feature Extraction

  • Tips and Tricks for Encoding Categorical Features in Classification Tasks [IPython nb]
  • About Feature Scaling: Standardization and Min-Max-Scaling (Normalization) [IPython nb]

Dimensionality Reduction

  • Principal Component Analysis (PCA) [IPython nb]
  • The effect of scaling and mean centering of variables prior to a PCA [PDF]
  • PCA based on the covariance vs. correlation matrix [IPython nb]
  • Linear Discriminant Analysis (LDA) [IPython nb]
  • Kernel tricks and nonlinear dimensionality reduction via PCA [IPython nb]

Representing Text

  • Tf-idf Walkthrough for scikit-learn [IPython nb]

Supervised Learning Algorithms

Neural Networks

  • Activation Function Cheatsheet [IPython nb]
  • Artificial Neurons and Single-Layer Neural Networks - How Machine Learning Algorithms Work Part 1 [IPython nb]

Decision Trees

  • Cheatsheet for Decision Tree Classification [IPython nb]

Ensemble Methods

  • Implementing a Weighted Majority Rule Ensemble Classifier in scikit-learn [IPython nb]

Logistic Regression

  • Out-of-core Learning and Model Persistence using scikit-learn [IPython nb]
  • Logistic Regression Implementation [IPython nb]

Naive Bayes

  • Naive Bayes and Text Classification I - Introduction and Theory [View PDF]

Unsupervised Learning Algorithms

  • Complete-Linkage Clustering and Heatmaps in Python [IPython nb]


  • An Overview of General Performance Metrics of Binary Classifier Systems [PDF]
  • A Basic Pipeline and Grid Search Setup [IPython nb]
  • An Extended Cross-Validation Example [IPython nb]


Parametric Parameter Estimation

  • Introduction to the Maximum Likelihood Estimate (MLE) [IPython nb]
  • How to calculate Maximum Likelihood Estimates (MLE) for different distributions [IPython nb]

Non-Parametric Parameter Estimation

  • Kernel density estimation via the Parzen-window technique [IPython nb]

Collecting Data

  • Reading handwritten digits from MNIST into NumPy arrays [IPython nb]
  • Download Your Twitter Timeline and Turn into a Word Cloud Using Python [IPython nb]
  • Collecting Fantasy Soccer Data with Python and Beautiful Soup [IPython nb]
  • Open-source datasets [Markdown]

Data Visualization

  • Exploratory Analysis of the Star Wars API [IPython nb]
  • Matplotlib examples – Exploratory data analysis of the Iris dataset [IPython nb]
  • Artificial Intelligence publications per country [IPython nb]


  • Free Machine Learning eBooks [Markdown]
  • Terms in data science defined in less than 50 words [Markdown]
  • Useful libraries for data science in Python [Markdown]
  • A matrix cheatsheet for Python, R, Julia, and MATLAB [HTML]