Masterclass in Bayesian Statistics (1854)
Dates: 22-26 October 2018
Place: CIRM (Marseille Luminy, France)




Most scientific fields now face the issue of "big data", i.e., the influx of massive datasets, potentially with multiple and complex structures. To deal with this data deluge, the Bayesian approach seems particularly promising, as it allows, through the specification of a prior distribution on the unknown system, structure to be added to high-dimensional problems, either by exploiting prior expertise on the observed phenomenon or by using generic modelling tools such as Gaussian processes. As a concrete example, consider brain imaging for tumor detection: the dimension of the problem is the number of voxels (i.e., unitary elements of a three-dimensional image, typically on the order of a million), and a prior distribution makes it possible to impose that neighboring voxels are similar with high probability, reflecting the structure of gray matter. However, Bayesian approaches are still relatively rarely used in very large problems, because the basic algorithms for computing Bayes estimators (especially Markov chain Monte Carlo (MCMC) methods) may prove too costly in computing time and memory. It is therefore often necessary, when implementing a Bayesian approach in a non-trivial problem, to turn to more advanced methods, either computational (e.g., an implementation on a parallel architecture) or mathematical (e.g., convergence of approximate methods, use of continuous-time processes).
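To make the computational bottleneck concrete, here is a minimal random-walk Metropolis sketch on a hypothetical toy model (a Normal likelihood with a Normal prior, all values illustrative, not from the course material). Note that every iteration requires a full pass over the data in the log-posterior, which is precisely what becomes prohibitive on massive datasets:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy problem: Normal(theta, 1) likelihood with a
# Normal(0, 10) prior on the mean theta; data simulated with true mean 3.
data = rng.normal(3.0, 1.0, size=50)

def log_post(theta):
    # log prior + log likelihood (up to an additive constant);
    # this full-data sum is the per-iteration cost MCMC pays.
    return -theta**2 / (2 * 10**2) - 0.5 * np.sum((data - theta) ** 2)

def metropolis(n_iter, step=0.5):
    """Plain random-walk Metropolis targeting the posterior of theta."""
    theta, lp = 0.0, log_post(0.0)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = theta + step * rng.normal()      # symmetric proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

chain = metropolis(5000)
# After burn-in, the chain mean approximates the posterior mean,
# which here is close to the sample mean of the data.
```

The advanced methods taught in the masterclass (subsampling, parallel and asynchronous schemes, continuous-time processes) aim at reducing or amortizing exactly this per-iteration full-data cost.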
More precisely, this masterclass aims at introducing novel and state-of-the-art algorithmic and inferential tools, from advanced algorithms (Approximate Bayesian computation (ABC), synthetic likelihood, indirect inference, noisy and consensus Monte Carlo, Langevin diffusion subsampling, Hamiltonian Monte Carlo, sequential and asynchronous methods) to inference techniques for large data sets (synthetic likelihood, indirect and non-parametric inference, pseudolikelihood, variational approaches, automatic selection of summaries).
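Of the algorithms listed, ABC is perhaps the simplest to sketch. The following is an illustrative rejection-ABC sampler on a hypothetical toy model (all names and values are assumptions for the example, not course material): draw from the prior, simulate data, and keep draws whose summary statistic falls close to the observed one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: infer the mean theta of a Normal(theta, 1)
# model from the sample mean of the observed data.
observed = rng.normal(2.0, 1.0, size=100)
s_obs = observed.mean()          # summary statistic of the observed data

def abc_rejection(n_draws, eps):
    """Rejection ABC: accept prior draws whose simulated summary
    statistic lies within eps of the observed summary."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.normal(0.0, 5.0)              # draw from the prior
        sim = rng.normal(theta, 1.0, size=100)    # simulate a dataset
        if abs(sim.mean() - s_obs) < eps:         # compare summaries
            accepted.append(theta)
    return np.array(accepted)

posterior = abc_rejection(20000, eps=0.1)
# The accepted draws approximate the posterior: their mean concentrates
# near the true mean (2.0) used to generate the observed data.
```

The choice of summary statistics and of the tolerance eps governs the quality of the approximation, which is why automatic selection of summaries appears among the masterclass topics.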



Model assessment, selection and averaging (pdf)  – VIDEO – 
Prior and posterior predictive checking (pdf)
Dynamic Hamiltonian Monte Carlo in Stan (pdf)
Generic MCMC convergence diagnostics  (pdf)



Variational Approximations and How to Improve Them (pdf)

Sequential Monte Carlo smoother for partially observed diffusion processes

Bayesian learning at scale with approximate models (pdf)

Asymptotic Genealogies of Sequential Monte Carlo Algorithms (pdf)

Maximum likelihood inference for large & sparse hidden random graphs (pdf)

Cognitive models of memory processes in sentence comprehension: A case study using Bayesian hierarchical modeling (pdf)

Introduction to data assimilation (pdf)

Computational statistics for biological models (pdf)

Bayesian Inference Without Tears (pdf)

Scalable Importance Tempering and Bayesian Variable Selection (pdf)