Course 5: Actuarial Data Science

Course 5 provides an introduction to actuarial data science techniques with an emphasis on machine learning.

We cover supervised learning, in which the learning target is known and is either continuous (regression) or discrete (classification), as well as unsupervised learning, in which no target is specified. The focus is on techniques and examples that are useful for the actuarial profession, and for non-life insurance in particular.

Topics

The course covers the following topics:

  • Introduction to data science and machine learning: basic concepts, including overfitting, bias-variance tradeoff, (cross-)validation techniques, parameter tuning;
  • Techniques to automatically deal with a large feature space (forward stepwise, Ridge regression, Lasso) starting from the familiar (generalized) linear regression model;
  • Classification and regression trees;
  • Ensemble learning, such as bagging, boosting and random forests;
  • Neural networks and deep learning, as well as Combined Actuarial Neural Networks;
  • Unsupervised learning techniques, such as dimension reduction (PCA, autoencoders), clustering and anomaly detection;
  • Actuarial applications, such as pricing, reserving, mortality modelling and insurance fraud detection;
  • Responsible Data Science, regulation and recent developments, such as fairness and explainability.

The meetings combine theoretical and practical parts. In the practical parts, data preparation, model training and validation procedures are applied to realistic cases.

Teacher

Prof. dr. Katrien Antonio (Universiteit van Amsterdam)