P.Mean: Running JAGS from R, a simple example (created 2013-09-04).


I was bemoaning the problems with BUGS yesterday, so today I investigated using JAGS instead. This is a stand-alone program, like BUGS, and also like BUGS it has an interface within R. I want to run it from inside R so I can compare different models and run a few simple simulations. The first step, as with BUGS, was to run a simple beta-binomial model. This model is trivial and does not need BUGS or JAGS or any other fancy package. It is just a quick way to test things.

I've run WinBUGS, OpenBUGS, and JAGS before, and I am leaning towards JAGS more these days. There have been no new developments in WinBUGS since 2008. Even development for OpenBUGS seems to be slowing down. Also, OpenBUGS is written in Component Pascal, which may limit its portability.

JAGS is written in C++, and its error messages are a bit less cryptic than either BUGS program. Also, the R interface to JAGS, rjags, is available in CRAN rather than shunted aside to CRANextras. I'm going to keep my options open, of course, but for the short term, JAGS looks like a pretty good choice.

By the way, all of these programs are open source, so I have to tip my hat to those programmers who have labored to create them.

Here is the model statement for JAGS.

model {
  p ~ dbeta(a,b)
  x ~ dbin(p,n)
}

It needs to be stored as a text file. I stored it under the name jags.beta.binomial.txt. I used the uppercase keyword MODEL in BUGS, but it looks like you need lowercase model in JAGS.
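If you prefer to keep everything in one script, one convenient approach is to write the model file out from within R; this is just a sketch, with the file name matching the one used below.

```r
# Write the JAGS model specification to a text file from within R,
# so the whole analysis lives in a single script.
model.string <- "
model {
  p ~ dbeta(a,b)
  x ~ dbin(p,n)
}
"
writeLines(model.string, "jags.beta.binomial.txt")
```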

Here's the R program.

library("rjags")
dat.beta.binomial <- list(a=1,b=1,x=11,n=25)
mod.beta.binomial <- jags.model("jags.beta.binomial.txt",data=dat.beta.binomial)
update(mod.beta.binomial,1000)
out.beta.binomial <- coda.samples(mod.beta.binomial,variable.names="p",n.iter=1000)

out.beta.binomial
attributes(out.beta.binomial[[1]])
summary(out.beta.binomial)
plot(out.beta.binomial)

Let's take this apart step by step.

library("rjags")

This loads the user-contributed package rjags.

dat.beta.binomial <- list(a=1,b=1,x=11,n=25)

The data need to be stored in a list, which gets a bit messier for more complex examples. The variables a and b represent the parameters of the beta prior; a=1 and b=1 corresponds to a uniform prior. The variable x represents the number of successes out of n total trials.
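To give a sense of how the list gets messier, here is a hypothetical sketch of the data list for a model with a vector of observations; the names y, N, mu0, and tau0 are illustrative and not part of this example.

```r
# Hypothetical data list for a model with vector-valued data: the
# vector itself and its length typically both go into the list.
y <- c(2.1, 3.4, 2.8, 3.0)
dat.normal <- list(y=y, N=length(y), mu0=0, tau0=0.001)
```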

mod.beta.binomial <- jags.model("jags.beta.binomial.txt",data=dat.beta.binomial)

You have to tell JAGS where to find the model and the data. It produces the following output.

Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 5

Initializing model

Next you run a "burn-in" period.

update(mod.beta.binomial,1000)

This is probably excessive in an example as simple as this, but it is normally a good idea to let the model run a thousand times or sometimes even ten thousand times before you evaluate any statistics. This allows the Markov chain Monte Carlo algorithm to converge properly. Ensuring convergence is actually one of the very tricky parts of these Bayesian models.

As this function runs, a counter notes your progress. It is instantaneous for this example, but for larger examples, it tells you whether you have time for a coffee break or a lunch break.

out.beta.binomial <- coda.samples(mod.beta.binomial,variable.names="p",n.iter=1000)

This creates the output that you want to evaluate.

out.beta.binomial

If you print the output from coda.samples, you will see that it is a list (the [[1]] below tells you it is a list).

[[1]]
Markov Chain Monte Carlo (MCMC) output:
Start = 1001
End = 2000
Thinning interval = 1
                p
   [1,] 0.6947935
   [2,] 0.3117113
   [3,] 0.3878973
.
.
.
 [998,] 0.3222727
 [999,] 0.4297565
[1000,] 0.5060621

attr(,"class")
[1] "mcmc.list"

The length of the list is the number of chains; in our case, the number of chains is 1. The object as a whole has class mcmc.list. The individual elements of the list are objects of class mcmc, and to explore them in detail, you need to use the attributes function.

attributes(out.beta.binomial[[1]])
$dim
[1] 1000 1

$dimnames
$dimnames[[1]]
NULL

$dimnames[[2]]
[1] "p"


$mcpar
[1] 1001 2000 1

$class
[1] "mcmc"

This object includes a matrix. The number of columns of this matrix is the number of nodes you are monitoring, and the number of rows is the number of iterations. The mcmc object also includes information about the beginning and ending iteration numbers and the amount of thinning. In our example, the iterations start at 1001 (because we had a burn-in period of 1,000 samples first) and end at 2000. The thinning parameter equals 1 because we did not request any thinning. The matrix has only one column because we are only monitoring the node for p, and it has names for the columns but not for the rows.
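As a sketch of that structure, you can build an mcmc object directly with the coda package (which rjags loads) and see the same attributes; here the draws come straight from the known beta(12,15) posterior rather than from JAGS.

```r
library(coda)
# Mimic the structure that coda.samples returns: a 1000 x 1 matrix of
# draws with start/end iteration numbers and a thinning interval of 1.
set.seed(42)
draws <- matrix(rbeta(1000, 12, 15), ncol=1, dimnames=list(NULL, "p"))
m <- mcmc(draws, start=1001, end=2000, thin=1)
attr(m, "mcpar")  # 1001 2000 1
```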

There are special summary and plot functions for objects of class mcmc.

> summary(out.beta.binomial)

Iterations = 1001:2000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 1000

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

          Mean             SD       Naive SE Time-series SE
      0.443718       0.094281       0.002981       0.002981

2. Quantiles for each variable:

  2.5%    25%    50%    75%  97.5%
0.2672 0.3783 0.4439 0.5031 0.6345

This example is one where JAGS or any other program is not really needed, because there is a simple closed form solution. I ran this example just because I wanted to start with something so easy that even I couldn't screw it up. If you are interested in checking, though, the closed form solution for the posterior distribution is beta(12,15). The mean is

12 / (12+15) = 0.444

and the standard deviation is

sqrt(12*15 / ((12+15)^2 * (12+15+1))) = 0.0939

which matches the simulated mean and standard deviation pretty well. Remember that JAGS uses a simulation approach to estimate the posterior distribution, so it will not match the closed form solution perfectly. You can see, however, that the use of 1,000 samples does fairly well, as either standard error for the mean is very small. The percentiles also match up well.

> qbeta(c(0.025,0.25,0.5,0.75,0.975),12,15)
[1] 0.2658712 0.3789053 0.4430528 0.5085033 0.6308196
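The whole closed-form check can also be done in R. With a beta(a,b) prior and x successes in n trials, the conjugate posterior is beta(a+x, b+n-x):

```r
# Conjugate beta-binomial update: prior beta(a,b), data x successes in n.
a <- 1; b <- 1; x <- 11; n <- 25
post.a <- a + x       # 12
post.b <- b + n - x   # 15
post.mean <- post.a / (post.a + post.b)
post.sd   <- sqrt(post.a * post.b /
                  ((post.a + post.b)^2 * (post.a + post.b + 1)))
round(post.mean, 3)   # 0.444
round(post.sd, 4)     # 0.0939
```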

The plot function produces the following

[Plot of mcmc object: trace plot on the left, estimated posterior density on the right]

The left plot looks like white noise which is some level of reassurance that the chain has converged on the true solution. There is actually much more that you should do if you are worried about convergence, but that's a topic for a different web page. The right plot shows the estimated posterior density, which looks remarkably like a beta distribution.
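One common next step, sketched here under the assumption that JAGS and rjags are set up as above, is to run several chains and compare them with gelman.diag from the coda package.

```r
# Sketch only: re-fit the same model with three chains so that formal
# convergence diagnostics can compare the chains to one another.
library(rjags)
mod.3chains <- jags.model("jags.beta.binomial.txt",
                          data=dat.beta.binomial, n.chains=3)
update(mod.3chains, 1000)                      # burn-in, as before
out.3chains <- coda.samples(mod.3chains,
                            variable.names="p", n.iter=1000)
gelman.diag(out.3chains)  # scale reduction factors near 1 suggest convergence
```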

This example just scratches the surface of what you can and should do, but it helped me get started, and if you are wrestling with R and JAGS, I hope it helps you as well.

Creative Commons License This page was written by Steve Simon and is licensed under the Creative Commons Attribution 3.0 United States License. Need more information? I have a page with general help resources. You can also browse for pages similar to this one at Bayesian Statistics.