Bayesian Statistics

A New Website Is Born: Estimite.com

Forecasting the 2021 Norwegian Election using R and Stan

Posted on January 20, 2021 | 2 minutes (221 words)

Bayesian modeling has proven its usefulness for poll aggregation and election forecasting – for instance through FiveThirtyEight. However, in Norway, media coverage of public opinion trends still tends to focus on a single poll at a time, or a simple average of polls at best. I was convinced it would be possible to do better, and after about three times as many long days and nights as I thought it would take, the result is finally live – both in Norwegian and English – at Estimite. [Read More]

Applied data science, Bayesian statistics

How Efficient is Stan Compared to JAGS?

Conjugacy, pooling, centering, and posterior correlations

Posted on January 2, 2019 | 12 minutes (2479 words)

For a good while JAGS was your best bet if you wanted to do MCMC on a distribution of your own choosing. Then Stan came along, potentially replacing JAGS as the black-box sampler of choice for many Bayesians. But how do they compare in terms of performance? The obvious answer is: It depends. In fact, the question is nearly impossible to answer properly, as any comparison will be conditional on the data, model specifications, test criteria, and more. [Read More]

R, Bayesian statistics, Stan

Bayesian Hierarchical Modeling

Comparing partially pooled and unpooled models in R

Posted on August 8, 2018 | 12 minutes (2407 words)

I used to think so-called multilevel models were a little boring. I was interested in causal inference, and the people using these models did not seem to have better causal identification strategies than those running plain old regressions. I have gradually come to change my mind on these models, although it is not because I think they solve challenges of causal identification. It is rather because I think a large share of our data can be thought of as hierarchical, and that proper modeling helps us make the most of such data. [Read More]

R Bayesian statistics

An Introduction to Markov Chain Monte Carlo Sampling

Writing and diagnosing a Metropolis sampler in R

Posted on July 23, 2018 | 10 minutes (2065 words)

It is usually not too difficult to define priors and specify a likelihood function, which means we can calculate the unnormalized posterior for any combination of relevant parameter values. However, that is still insufficient to give us marginal posterior distributions for the parameters of interest. The grid method that was used in the previous post is not feasible for situations with a large number of parameters, and conjugate models with analytical solutions are mainly relevant for a subset of suitable problems. [Read More]

R Bayesian statistics
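
To make the teaser above a little more concrete, here is a minimal sketch of a random-walk Metropolis sampler. It is written in Python rather than the R used in the post, and everything in it is an assumption made for illustration: the toy data, the normal likelihood with known standard deviation, the wide normal prior on the mean, and the function names `log_unnorm_posterior` and `metropolis`. It is not the code from the post itself.

```python
import math
import random


def log_unnorm_posterior(mu, data, prior_sd=10.0, sigma=1.0):
    """Log of prior times likelihood, up to an additive constant.

    Assumed model (illustrative only): y ~ Normal(mu, sigma) with known
    sigma, and mu ~ Normal(0, prior_sd).
    """
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - mu) / sigma) ** 2 for y in data)
    return log_prior + log_lik


def metropolis(data, n_iter=20000, proposal_sd=1.0, start=0.0, seed=1):
    """Random-walk Metropolis sampler for the single parameter mu."""
    rng = random.Random(seed)
    current = start
    current_lp = log_unnorm_posterior(current, data)
    draws = []
    accepted = 0
    for _ in range(n_iter):
        # Propose a new value by adding symmetric normal noise.
        proposal = current + rng.gauss(0.0, proposal_sd)
        proposal_lp = log_unnorm_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        draws.append(current)
    return draws, accepted / n_iter


# Toy data, invented for this example.
data = [1.2, 0.8, 1.9, 1.4, 0.7]
draws, accept_rate = metropolis(data)

# Discard the first 2000 draws as burn-in before summarizing.
kept = draws[2000:]
posterior_mean = sum(kept) / len(kept)
```

Because this toy model is conjugate, the sampled mean can be checked against the analytical posterior mean, which is the sum of the data divided by (n + sigma²/prior_sd²), roughly 1.198 here. The sampler only ever needs the unnormalized posterior, which is the point made in the teaser.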

The Basics of Bayesian Inference

Evaluating continuous distributions over a grid in R

Posted on July 22, 2018 | 11 minutes (2257 words)

The goal of data analysis is typically to learn (more) about some unknown features of the world, and Bayesian inference offers a consistent framework for doing so. This framework is particularly useful when we have noisy, limited, or hierarchical data – or very complicated models. You may be aware of Bayes’ theorem, which states that the posterior is proportional to the likelihood times the prior. But what does that mean? This post offers a very basic introduction to key concepts in Bayesian statistics, with illustrations in R. [Read More]

R Bayesian statistics
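
As a rough illustration of "the posterior is proportional to the likelihood times the prior", here is a small grid-approximation sketch. It is written in Python rather than the R used in the post, and the setup is an assumption chosen for simplicity: a binomial likelihood for an unknown proportion, a flat prior, and made-up counts (6 successes in 9 trials). The function name `grid_posterior` is hypothetical.

```python
def grid_posterior(successes, trials, grid_size=1001):
    """Approximate the posterior of a binomial proportion on a grid.

    Assumes a flat prior on [0, 1]; both the prior and the data passed in
    below are illustrative choices, not taken from the post.
    """
    # Evenly spaced candidate values for the unknown proportion theta.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior: every candidate value is equally plausible beforehand.
    prior = [1.0] * grid_size
    # Binomial likelihood, dropping the constant binomial coefficient.
    likelihood = [
        theta ** successes * (1.0 - theta) ** (trials - successes)
        for theta in grid
    ]
    # Unnormalized posterior: likelihood times prior at each grid point.
    unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
    # Normalize so the grid probabilities sum to one.
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


grid, posterior = grid_posterior(successes=6, trials=9)

# Summaries computed directly from the grid.
posterior_mean = sum(theta * p for theta, p in zip(grid, posterior))
posterior_mode = grid[posterior.index(max(posterior))]
```

With a flat prior this posterior is a Beta(7, 4) distribution, so the grid mean should land close to 7/11 and the mode close to 2/3. The approach works well for one or two parameters, but the number of grid points grows exponentially with the number of parameters.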