Applied Bayesian Data Analysis
Introduction

(also known as data science)

David Tolpin, david.tolpin@gmail.com

Course topics

  • Expressing prior knowledge as statistical models.
  • Learning based on prior knowledge (model) and observations (data).
  • Analysing models and inference results.
  • Presenting analyses and conclusions.

Three pillars

  • Bayesian statistics
  • Generative machine learning
  • Probabilistic programming

Bayesian statistics

Statistics

The science of fiddling with data.

  • Collection.
  • Organization.
  • Analysis.
  • Interpretation.
  • Presentation.

Different from data science!

Bayesian

Thomas Bayes - Look up on Wikipedia.

  • Probability measures belief (vs. frequency).
  • Prior beliefs.
  • Belief updating.
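
Belief updating is just Bayes' theorem; a one-line statement, with $\theta$ for the unknowns and $y$ for the data (notation assumed here):

    $p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} \propto p(y \mid \theta)\, p(\theta)$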

Generative Machine Learning

What does machine learning do?

It should really be called machine training.
  • Takes a bunch of code - a model.
  • Feeds old data to the model - training.
  • Passes new data through the model - inference.

Discriminative vs. Generative

  • Discriminative:

      Model(question, parameters) → answer

    run forward for inference.
  • Generative:

      Model(answer, parameters) → question

    run backward for inference.
  • Both: adjust parameters until answers are good enough for questions.
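
A toy sketch of the two directions in Python (an assumed scalar setup: the "answer" is an unknown mean, the "question" is a noisy observation; names and numbers are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    def generative_model(mu, sigma=1.0):
        # answer (mu) + parameters (sigma) -> question (a simulated observation)
        return rng.normal(mu, sigma)

    def discriminative_model(y, w=0.9):
        # question (y) + parameters (w) -> answer (a direct estimate of mu)
        return w * y

    y = generative_model(mu=2.0)        # run the generative model forward to simulate data
    mu_hat = discriminative_model(y)    # the discriminative model maps data straight to an answer
    print(y, mu_hat)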

Running Backward

  • Running a model (program) backward exactly is impossible in general!
  • But it can be approximated.
  • Approximation methods: message passing, Monte Carlo, variational inference (a Monte Carlo sketch follows).
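
One way to approximate running backward is plain Monte Carlo by forward simulation (a rejection-style sketch; the prior, noise level, and tolerance below are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    y_observed = 2.3                                  # the question we actually saw

    candidates = rng.normal(0.0, 5.0, size=100_000)   # prior guesses for the answer
    simulated = rng.normal(candidates, 1.0)           # run the model forward on every guess
    accepted = candidates[np.abs(simulated - y_observed) < 0.1]

    # The accepted guesses approximate the posterior over the answer.
    print(accepted.mean(), accepted.std())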

Probabilistic Programming

Software X.0

  • Software 1.0 - Python, C++, etc.
  • Software 2.0 - deep learning models.
  • Software 3.0 - programs that learn:
        probabilistic programs.

Probabilistic programming:

  • models = programs
  • inference = execution

A probabilistic program is a program.

  • Language? - new or existing.
  • Semantics? - what a program means.
  • Translation and execution? - how to run.
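
For concreteness, a minimal probabilistic program written in an existing language (plain Python here; illustrative only, not tied to any particular probabilistic programming system):

    import numpy as np

    def program(rng):
        mu = rng.normal(0.0, 10.0)    # random choice: prior over an unknown mean
        y = rng.normal(mu, 1.0)       # random choice: observation model
        return mu, y

    # Executing the program once draws one sample from its joint distribution.
    print(program(np.random.default_rng(2)))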

A probabilistic program is a distribution.

  • Point estimates - maximum-likelihood/maximum a posteriori.
  • Summary statistics - mean, variance, skew, ...
  • Approximation - Monte Carlo, variational, ...
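
A sketch of the distribution view: given (approximate) posterior samples (faked below with a normal draw, purely for illustration), the same program can be summarized by point estimates and summary statistics:

    import numpy as np

    # Stand-in for posterior samples produced by some inference method.
    samples = np.random.default_rng(3).normal(2.0, 0.5, size=10_000)

    print(samples.mean())                        # a point estimate
    print(samples.var())                         # a summary statistic
    print(np.percentile(samples, [2.5, 97.5]))   # a 95% credible interval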

A probabilistic program is a model.

  • Simulation - exploring the system.
  • Optimization - best parameters.
  • Control - achieving goals.

You need probabilistic programming when you:

  • have small data (Uber)
  • exploit laws of nature or society
  • look for novelties

Venues:

  • PROBPROG - http://probprog.cc/
  • Machine learning - AISTATS, NeurIPS, ICML, ...
  • Artificial Intelligence - AAAI, IJCAI, ECAI, ...
  • Languages - POPL, PLDI, IFL, ...
  • Software - SPLASH, FSE, ICSE, ...

Industry:

Facebook (HackPPL), Google (Edward), Uber (Pyro), CRA (Figaro), Invrea (Anglican), PUB+ (Infergo)

Application: Facebook Prophet

https://facebook.github.io/prophet/

Advanced topics

  • Automatic differentiation
  • Langevin dynamics Monte Carlo
  • Variational inference
  • Inference at scale

Automatic differentiation

  • Numeric differentiation: $\frac {df} {dx} \approx \frac {f (x + \Delta x) - f(x)} {\Delta x}$
  • Symbolic differentiation: $\frac {d(uv)} {dx} = u\frac{dv} {dx} + v \frac{du} {dx}$
  • Algorithmic differentiation
    • Operator overloading.
    • Source code transformation.
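
A minimal sketch of algorithmic differentiation by operator overloading (forward-mode dual numbers; illustrative only, not any particular AD library):

    class Dual:
        """A value paired with its derivative with respect to the input."""
        def __init__(self, value, deriv=0.0):
            self.value = value
            self.deriv = deriv

        def __add__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.value + other.value, self.deriv + other.deriv)
        __radd__ = __add__

        def __mul__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            # product rule: d(uv)/dx = u dv/dx + v du/dx
            return Dual(self.value * other.value,
                        self.value * other.deriv + self.deriv * other.value)
        __rmul__ = __mul__

    def f(x):
        return x * x + 3 * x           # ordinary code, runs unchanged on Dual numbers

    out = f(Dual(2.0, 1.0))            # seed dx/dx = 1
    print(out.value, out.deriv)        # 10.0 and f'(2) = 7.0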

Langevin Dynamics Monte Carlo

  • Monte Carlo methods sample from the posterior distribution.
  • Smarter sampling involves computing the gradient and estimating where to jump next.
  • The inspiration comes from Hamiltonian/Langevin mechanics.
  • Stan's success is due to NUTS, a variant of Hamiltonian Monte Carlo.
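
A sketch of the idea behind gradient-guided sampling (the unadjusted Langevin update; the target and step size are illustrative, and a real sampler such as Hamiltonian Monte Carlo adds a momentum variable and an accept/reject step):

    import numpy as np

    rng = np.random.default_rng(4)

    def grad_log_p(theta):
        return -(theta - 3.0)          # gradient of log N(3, 1), up to a constant

    eps, theta, samples = 0.1, 0.0, []
    for _ in range(5000):
        theta = theta + 0.5 * eps * grad_log_p(theta) + np.sqrt(eps) * rng.normal()
        samples.append(theta)

    print(np.mean(samples), np.std(samples))   # roughly 3 and 1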

Variational inference

  • The posterior distribution must be approximated.
  • Markov chain Monte Carlo approximates the posterior with samples.
  • Variational inference approximates the posterior with distributions from a known parametric family.
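
The usual variational objective is the evidence lower bound (ELBO), maximized over an approximating family $q$ (notation assumed: $y$ for the data, $\theta$ for the latent parameters):

    $\mathrm{ELBO}(q) = \mathbb{E}_{q(\theta)}\left[\log p(y, \theta) - \log q(\theta)\right] = \log p(y) - \mathrm{KL}\left(q(\theta) \,\|\, p(\theta \mid y)\right) \le \log p(y)$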

Inference at scale

Example: Gaussian mixture model

  • We get a sample of N (e.g., 100) real values.
  • We know this sample comes from a mixture of two Gaussian distributions.
  • What are the means and variances of these two distributions?
  • Implementations in Stan, Infergo, and Turing.jl.
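
A language-agnostic sketch of the model's likelihood in plain Python (not one of the course implementations in Stan, Infergo, or Turing.jl; the data below are simulated for illustration):

    import numpy as np

    def normal_logpdf(y, mu, sigma):
        return -0.5 * np.log(2 * np.pi * sigma ** 2) - 0.5 * ((y - mu) / sigma) ** 2

    def mixture_log_likelihood(y, mu, sigma, w=0.5):
        # Two-component mixture: w * N(mu[0], sigma[0]) + (1 - w) * N(mu[1], sigma[1]).
        comp1 = np.log(w) + normal_logpdf(y, mu[0], sigma[0])
        comp2 = np.log(1 - w) + normal_logpdf(y, mu[1], sigma[1])
        return np.logaddexp(comp1, comp2).sum()

    # Simulated data: N = 100 values from a known two-component mixture.
    rng = np.random.default_rng(5)
    z = rng.random(100) < 0.5
    y = np.where(z, rng.normal(-2.0, 1.0, 100), rng.normal(3.0, 0.5, 100))

    # Inference maximizes or samples this quantity over mu, sigma (and w).
    print(mixture_log_likelihood(y, mu=[-2.0, 3.0], sigma=[1.0, 0.5]))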

Resources

Course web site

David Tolpin
http://www.cs.bgu.ac.il/~tolpin/
tolpin@cs.bgu.ac.il
312/37