Summary

The course teaches practical skills for the analysis of data, the core subject of Data Science, together with the theoretical foundations behind these skills. Thanks to advances in machine learning, elaborate dependencies can be learned from data. Bayesian data analysis builds on machine learning and the Bayesian approach to probability to perform inference in complex probabilistic models. At the center of Bayesian data analysis lies the concept of a generative probabilistic model, which describes the process through which the data is, or could be, generated. Inference is then performed on the model given the data, which allows us to make predictions both about future, yet unseen data and about unobservable phenomena that affect the data. Uncertainty is naturally modeled within the framework of the Bayesian approach.
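As a small illustration of these ideas, the sketch below writes down a generative model for coin flips and performs exact inference on it. It is not course material: the Beta-Bernoulli model, the function names, the prior Beta(1, 1), and the value `true_theta = 0.7` are all assumptions chosen because the posterior has a closed form and needs only the Python standard library.

```python
import random


def simulate_coin_flips(theta, n, rng):
    """Generative model: draw n observations y_i ~ Bernoulli(theta)."""
    return [1 if rng.random() < theta else 0 for _ in range(n)]


def beta_posterior(a, b, data):
    """Conjugate update: a Beta(a, b) prior on theta plus Bernoulli data
    gives a Beta(a + heads, b + tails) posterior."""
    heads = sum(data)
    tails = len(data) - heads
    return a + heads, b + tails


def beta_mean(a, b):
    """Mean of Beta(a, b). For this model it is also the posterior
    predictive probability that the next observation equals 1."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of Beta(a, b), a simple summary of remaining uncertainty."""
    return a * b / ((a + b) ** 2 * (a + b + 1))


rng = random.Random(0)   # fixed seed so the run is reproducible
true_theta = 0.7         # hypothetical "unobservable" quantity
data = simulate_coin_flips(true_theta, 50, rng)

# Inference: condition the model on the simulated data.
post_a, post_b = beta_posterior(1.0, 1.0, data)

print("posterior mean of theta:", beta_mean(post_a, post_b))
print("posterior variance of theta:", beta_variance(post_a, post_b))
print("P(next flip = 1 | data):", beta_mean(post_a, post_b))
```

Here `theta` plays the role of the unobservable phenomenon, the posterior mean doubles as a prediction for future data, and the posterior variance quantifies the uncertainty. Models without a closed-form posterior call for approximate inference algorithms instead of this conjugate shortcut.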

During the course we will learn to specify probabilistic generative models for a number of important classes of data science problems, such as generalised linear models, hierarchical models, and mixture models, and to perform inference on these models using modern tools and inference algorithms. We will also explore model checking, comparison, and selection. The homework will help develop hands-on skills in Bayesian data modelling and analysis.

How we learn

We meet weekly for 3 hours on Zoom (Wed 16:00-19:00). The first two hours are mostly theory. In the last hour we solve practical problems together, using a Jupyter notebook or a Unix/X11 terminal.

We will have 4 homework assignments, each combining theoretical and programming exercises. Homework assignments should be done in pairs.

We use Slack for announcements, questions, and discussions.

Lectures

Homework

Homework should be submitted in pairs via Moodle. You may submit either a Jupyter (.ipynb) or a Pluto (.jl) notebook. If your solution requires external files (data or images), put the files online and load them via their URLs.
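One possible way to load an external data file by URL from a Python notebook is sketched below. The helper name `load_csv_rows` and the example address are hypothetical, and the sketch assumes the file is a UTF-8 encoded CSV with a header row; only standard-library calls are used.

```python
import csv
import io
from urllib.request import urlopen


def load_csv_rows(url):
    """Download a CSV file from `url` and return its rows as dictionaries
    keyed by the column names in the header row."""
    with urlopen(url) as response:
        text = response.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical usage; replace the address with wherever the file is hosted:
# rows = load_csv_rows("https://example.org/homework1/data.csv")
```

Loading by URL keeps the notebook self-contained, so it can be run by someone else without any accompanying files.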