Machine Learning Essentials

In this course, participants learn the essentials of machine learning.

We start with an introduction to machine learning and its applications. We then discuss data preprocessing and feature engineering, both essential steps in building high-performing machine learning models. Next come the basic concepts of regression and classification, followed by how to measure the performance of predictive analytics techniques. We then zoom in on association rules, sequence rules and clustering, and elaborate on advanced machine learning techniques such as neural networks, support vector machines and ensemble models. We also review Bayesian networks as probabilistic white-box machine learning models. The next section reviews variable selection, after which we extensively discuss machine learning model interpretation and deployment. The course concludes by highlighting some machine learning pitfalls. The course provides a sound mix of theoretical and technical insights and practical implementation details, illustrated by several real-life case studies and examples. The course also features code examples in both R and Python. Throughout the course, the instructors also draw extensively on their research and industry experience.

The course features more than 8 hours of video lectures, numerous multiple-choice questions, and various references to background literature. A certificate signed by the instructors is provided upon successful completion.

Price

Part of our course revenue goes towards funding organizations involved in protecting and cleaning our oceans.

Requirements

Before subscribing to this course, you should have a basic understanding of descriptive statistics (e.g., mean, median, standard deviation, histograms, scatter plots) and inference (e.g., confidence intervals, hypothesis testing). Previous R and Python experience is helpful but not necessary.

Course Outline

Introduction
- Instructor team
- Our Machine Learning Publications
- Software
- R/Python tutorials
- Data sets
- Disclaimer
Introduction to Machine Learning
- Machine Learning
- Machine Learning Examples
- Machine Learning Process Model
- Types of Machine Learning
- Quiz
Data Preprocessing
- Motivation
- Types of data
- Types of variables
- Denormalizing data
- Sampling
  - Sampling in R
  - Sampling in Python
- Visual data exploration
  - Visual data exploration in R
  - Visual data exploration in Python
- Descriptive statistics
- Missing values
  - Missing values in R
  - Missing values in Python
- Outliers
  - Outliers in R
  - Outliers in Python
- Categorization
  - Categorization in R
  - Categorization in Python
- WOE and IV
  - WOE and IV in R
  - WOE and IV in Python
- Quiz
Feature Engineering
- Feature Engineering Defined
- RFM features
- Trend features
- Logarithmic transformation
- Power transformation
- Box-Cox transformation
  - Box-Cox transformation in R
  - Box-Cox transformation in Python
- Yeo-Johnson transformation
- Performance Optimization
  - Performance Optimization with Yeo-Johnson transformation in R
  - Performance Optimization with Yeo-Johnson transformation in Python
- Principal Component Analysis
- t-SNE
- Quiz
Regression
- Linear Regression
  - Linear Regression in R
  - Linear Regression in Python
- High Dimensional Data
- Ridge Regression
  - Ridge Regression in R
  - Ridge Regression in Python
- LASSO Regression
  - LASSO Regression in R
  - LASSO Regression in Python
- Elastic Net
  - Elastic Net in R
  - Elastic Net in Python
- Principal Component Regression
- Partial Least Squares (PLS) regression
- Generalized Linear Models (GLMs)
- Generalized Additive Models (GAMs)
Classification
- Linear Regression
- Logistic Regression
  - Logistic Regression in R
  - Logistic Regression in Python
- Nomograms
  - Nomograms in R
- Decision trees
  - Decision trees in R
  - Decision trees in Python
- K-nearest neighbor
  - K-nearest neighbor in R
  - K-nearest neighbor in Python
- Multiclass classification
- One versus One coding
- One versus All coding
- Multiclass decision trees
Measuring the performance of predictive analytics techniques
- Performance measurement
- Split sample method
- Cross-validation
- Single sample method
- Performance measures for classification
- Confusion matrix (classification accuracy, classification error, sensitivity, specificity)
- ROC curve and area under ROC curve
  - ROC curve in R
  - ROC curve in Python
- CAP curve and Accuracy Ratio
- Lift curve
- Kolmogorov-Smirnov distance
- Mahalanobis distance
- Performance measures for regression
- Quiz
Association and Sequence Rules
- Association Rules
- Support and Confidence
- Association rule mining
  - Association rule mining in R
  - Association rule mining in Python
- Lift
- Association rule extensions
- Post-Processing Association Rules
- Association rules applications
- Sequence rules
- Quiz
Clustering Techniques
- Hierarchical clustering
  - Hierarchical clustering in R
  - Hierarchical clustering in Python
- K-means clustering
  - K-means clustering in R
  - K-means clustering in Python
- DBSCAN
  - DBSCAN in R
  - DBSCAN in Python
- Evaluating clustering solutions
- Quiz
Neural Networks
- Neural Networks
  - Neural Networks in R
  - Neural Networks in Python
- Deep Learning Neural Networks
- Opening Neural Network Black Box
- Variable Selection
- Rule Extraction
- Decompositional Rule Extraction
- Pedagogical Rule Extraction
- Quality of Extracted Rule Set
- Rule Extraction Example
- Two-Stage Model
- Self-Organizing Maps
  - SOMs in R
- Self-Organizing Maps Example
- Self-Organizing Maps Evaluated
- Quiz
Support Vector Machines (SVMs)
- Problems with neural networks
- Linear programming
- Linearly separable case
- Linearly non-separable case
- Nonlinear SVM classifier
  - RBF SVM in R
  - RBF SVM in Python
- Kernel functions
- Neural Network Interpretation of SVM classifier
- Tuning the hyperparameters
  - Tuning the hyperparameters of an RBF SVM in R
  - Tuning the hyperparameters of an RBF SVM in Python
- Benchmarking study
- SVMs for regression
- One-class SVMs
  - One-class SVM in R
  - One-class SVM in Python
- Extensions to SVMs
- Opening the SVM black box
- Quiz
Ensemble Methods
- Ensemble methods
- Bootstrapping
- Bagging
  - Bagging in R
  - Bagging in Python
- Boosting
  - Adaboost in R
  - Adaboost in Python
- Random Forests
  - Random Forests in R
  - Random Forests in Python
- XGBoost
  - XGBoost in R
  - XGBoost in Python
- Quiz
Bayesian Networks
- Bayesian Networks
- Example Bayesian Network Classifier
- Naive Bayes Classifier
  - Naive Bayes classifier in R
  - Naive Bayes classifier in Python
- Tree Augmented Naive Bayes Classifiers
- Bayesian networks examples
- Quiz
Variable Selection
- Variable selection
- Filter methods (gain, Cramer’s V, Fisher score)
  - Cramer’s V in R
  - Cramer’s V in Python
  - Information Value in R
  - Information Value in Python
- Forward/Backward/Stepwise regression
  - Forward/Backward/Stepwise in R
- BART: Backward Regression Trimming
  - BART variable selection in R
- Criteria for variable selection
- Quiz
Model interpretation
- Model interpretation
- Feature Importance
- Permutation-based feature importance
- Partial dependence plots
  - Partial dependence plots in Python
- Individual conditional expectation (ICE) plots
  - ICE plots in Python
- Visual analytics
- Decision tables
- LIME
  - LIME in Python
- Shapley value
  - Shapley value in Python
Model deployment
- Model deployment
- Model governance
- Model ethics
- Model documentation
- Model backtesting
- Model benchmarking
- Model stress testing
- Privacy and Security
- Quiz
Machine Learning Pitfalls
- Sample bias
- Model risk
- Deep everything
- Leader versus follower
- Complexity versus trust
- Statistical myopia
- Profit Driven Machine Learning
- Quiz
