P.Mean: Transforming the parameter also transforms the prior distribution (created 2010-11-25)

All my work on Bayesian models recently has forced me to remember some of my mathematical statistics that I had not touched since college. Here's another example of this. Suppose you have a prior distribution on a parameter θ and you want to find the comparable prior for a transformation φ=u(θ).

Let's assume, for simplicity, that the function u is monotone (either strictly increasing for all values of θ or strictly decreasing for all values of θ). The results for non-monotone functions are not that much more complex, but the calculations are a bit more tedious. Let's also assume that θ is a single value and not a vector. The result is taken directly from Hogg and Craig, Section 4.3, with a minor change in notation.

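Let θ be a random variable with probability density function f(θ). Define φ=u(θ). Then the probability density function for φ is

$$g(\varphi) = f\big(w(\varphi)\big)\,\left|w'(\varphi)\right|$$

where w( ) = u^-1( ) is the inverse function and w'( ) is its derivative. The absolute value covers the case where u is strictly decreasing.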
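
As a quick sanity check on the change-of-variable formula, here is a minimal Python sketch. The specific choices are illustrative assumptions and are not from Hogg and Craig: θ is uniform on (0,1) and φ = -log(θ), so the formula should reproduce the exponential density exp(-φ). The helper name `transformed_pdf` is hypothetical.

```python
import math

def transformed_pdf(f, w, dw):
    """Return g(phi) = f(w(phi)) * |w'(phi)| for a monotone transformation."""
    def g(phi):
        return f(w(phi)) * abs(dw(phi))
    return g

# Illustrative assumption: theta ~ Uniform(0, 1) and phi = u(theta) = -log(theta).
def f(theta):
    return 1.0 if 0.0 < theta < 1.0 else 0.0

def w(phi):
    # Inverse of u(theta) = -log(theta).
    return math.exp(-phi)

def dw(phi):
    # Derivative of w; negative because u is strictly decreasing.
    return -math.exp(-phi)

g = transformed_pdf(f, w, dw)

# Compare g with the Exponential(1) density at a few points.
for phi in (0.5, 1.0, 2.0):
    print(phi, g(phi), math.exp(-phi))
```

The two printed columns should agree for positive φ, and g should be zero for negative φ because w(φ) then falls outside (0,1).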

Here is an example that I am working out for a class on Bayesian models taught by Peter Congdon at statistics.com.

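Suppose that π has a beta prior distribution. The pdf for the beta distribution is

$$f(\pi) \propto \pi^{\alpha-1}(1-\pi)^{\beta-1}$$

In this example, I am excluding the normalizing constant Γ(α+β)/(Γ(α)Γ(β)), which involves factorials when α and β are integers, to simplify the algebra.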

You may wish to consider a prior for the transformed parameter

$$\eta = \log\left(\frac{\pi}{1-\pi}\right)$$

because simulations involving η do not have range restrictions. In contrast, π is restricted to the interval [0,1], so any values appearing outside that interval need special handling. This can occur with the Metropolis algorithm, which has a jump distribution that may, from time to time, produce a value of π outside the legal range.
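
To make the range problem concrete, here is a small Python sketch. It assumes the log-odds transformation η = log(π/(1-π)) and a normal random-walk jump distribution. The starting value 0.95, the step size 0.2, and the seed are hypothetical choices for illustration only.

```python
import math
import random

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

random.seed(1)          # hypothetical seed, only for reproducibility
pi_current = 0.95       # hypothetical current value near the boundary
step = 0.2              # hypothetical jump standard deviation
n = 10000

# Random-walk proposals made directly on the pi scale.
pi_proposals = [pi_current + random.gauss(0.0, step) for _ in range(n)]
illegal_pi = sum(1 for p in pi_proposals if not 0.0 <= p <= 1.0)

# Random-walk proposals made on the eta scale, then mapped back to pi.
eta_current = logit(pi_current)
eta_proposals = [eta_current + random.gauss(0.0, step) for _ in range(n)]
illegal_eta = sum(1 for e in eta_proposals if not 0.0 < inv_logit(e) < 1.0)

print("illegal proposals on the pi scale: ", illegal_pi)
print("illegal proposals on the eta scale:", illegal_eta)
```

Proposals on the π scale regularly overshoot 1 when the current value sits near the boundary, while every proposal on the η scale maps back to a legal value of π.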

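The inverse function is

$$\pi = w(\eta) = \frac{e^{\eta}}{1+e^{\eta}}$$

and the derivative is

$$w'(\eta) = \frac{e^{\eta}}{(1+e^{\eta})^{2}}$$

So the pdf for η, assuming a beta distribution for π, is

$$g(\eta) \propto \left(\frac{e^{\eta}}{1+e^{\eta}}\right)^{\alpha-1}\left(\frac{1}{1+e^{\eta}}\right)^{\beta-1}\frac{e^{\eta}}{(1+e^{\eta})^{2}}$$

which simplifies to

$$g(\eta) \propto \frac{e^{\alpha\eta}}{(1+e^{\eta})^{\alpha+\beta}}$$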

This looks something like the generalized logistic distribution, but I can't quite figure out the exact relationship here. I'm also having some trouble with the constants in front of the equation that make the probability density function integrate to 1 across the entire range.
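
One way to probe the constant numerically is sketched below. It assumes the log-odds transformation η = log(π/(1-π)), for which the unnormalized density of η works out to exp(αη)/(1+exp(η))^(α+β). Because a monotone change of variable preserves total probability, the integral of this kernel over the whole real line should equal the beta function B(α,β) = Γ(α)Γ(β)/Γ(α+β), so the constant would be the same one used for the beta distribution. The values α=2 and β=5, the integration limits, and the grid size are arbitrary illustrative choices.

```python
import math

def log_kernel(eta, a, b):
    """Log of exp(a*eta) / (1 + exp(eta))**(a + b), computed stably."""
    # log(1 + exp(eta)) written so that exp() never overflows.
    softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
    return a * eta - (a + b) * softplus

def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def integrate_kernel(a, b, lo=-40.0, hi=40.0, n=80001):
    """Trapezoid-rule integral of the kernel over [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        eta = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(log_kernel(eta, a, b))
    return total * h

a, b = 2.0, 5.0   # arbitrary illustrative values
numeric = integrate_kernel(a, b)
exact = math.exp(log_beta(a, b))
print("numerical integral of the kernel:", numeric)
print("beta function B(a, b):           ", exact)
```

If the two numbers agree, dividing the kernel by B(α,β) gives a density for η that integrates to 1.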

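For the purposes of the simulation, though, naming the distribution and pinning down the constants are not important, because the Metropolis algorithm only needs the density up to a constant of proportionality. It turns out that, written in terms of π = e^η/(1+e^η), this bears a rough resemblance to a beta distribution since

$$g(\eta) \propto \pi^{\alpha}(1-\pi)^{\beta} = \pi^{(\alpha+1)-1}(1-\pi)^{(\beta+1)-1}$$

The difference between this and the original beta distribution is that you have added 1 to both α and β. Pretty cool!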
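
Here is a rough Python sketch of how this could be used in a random-walk Metropolis sampler that works on the η scale. It is only an illustration under stated assumptions, not the code from the class: it assumes η = log(π/(1-π)), samples from the prior alone (no likelihood), and uses hypothetical values for α, β, the step size, the seed, and the burn-in. The log density used is α·η - (α+β)·log(1+exp(η)), which equals α·log(π) + β·log(1-π).

```python
import math
import random

def log_prior_eta(eta, a, b):
    """Unnormalized log density of eta when pi ~ Beta(a, b)."""
    softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
    return a * eta - (a + b) * softplus

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def metropolis_eta(a, b, n_draws, step=1.0, seed=0):
    """Random-walk Metropolis on the eta scale (prior only, no data)."""
    rng = random.Random(seed)
    eta = 0.0
    draws = []
    for _ in range(n_draws):
        proposal = eta + rng.gauss(0.0, step)
        log_ratio = log_prior_eta(proposal, a, b) - log_prior_eta(eta, a, b)
        # Accept with probability min(1, ratio); no range check is needed.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            eta = proposal
        draws.append(eta)
    return draws

a, b = 2.0, 5.0                      # hypothetical prior parameters
draws = metropolis_eta(a, b, 20000)  # hypothetical chain length
pis = [inv_logit(e) for e in draws[1000:]]   # drop a hypothetical burn-in
print("mean of back-transformed draws:", sum(pis) / len(pis))
print("beta prior mean a / (a + b):   ", a / (a + b))
```

Back-transforming the η draws should give values of π whose mean is close to α/(α+β), the mean of the original beta prior, which is a simple check that the extra factor π(1-π) has been accounted for correctly.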

Robert V. Hogg, Allen Craig (1994). Introduction to Mathematical Statistics, Fifth Edition. Prentice Hall. ISBN: 0023557222