Markov chain Monte Carlo



Markov chain Monte Carlo (MCMC) is a powerful technique that combines the concepts of Markov chains and Monte Carlo methods. It provides a way to sample from complex, high-dimensional probability distributions that are otherwise intractable or difficult to analyze mathematically. MCMC methods are particularly useful in Bayesian inference, a framework for updating beliefs or making predictions based on prior knowledge and observed data.

 

MCMC methods work by constructing a Markov chain: a sequence of random variables in which the distribution of each variable depends only on the previous one. The aim is to generate samples from a target probability distribution, often a posterior distribution in Bayesian inference. The key idea is to construct a Markov chain whose stationary distribution equals the target distribution; by iteratively generating samples from this chain, we obtain a representative set of samples from the target.
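To make the idea of a stationary distribution concrete, here is a minimal NumPy sketch (an illustrative example, not from the original post; the transition matrix is made up). For a small transition matrix P, the long-run frequency of visits to each state matches the distribution π satisfying π = πP:

```python
import numpy as np

# A 2-state Markov chain with transition matrix P (hypothetical values).
# P[i, j] = probability of moving from state i to state j.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# The stationary distribution pi satisfies pi = pi @ P,
# i.e. it is the left eigenvector of P with eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()
print("stationary distribution:", pi)  # approx [0.833, 0.167]

# Simulate the chain and check the long-run state frequencies.
rng = np.random.default_rng(0)
state, counts = 0, np.zeros(2)
for _ in range(100_000):
    state = rng.choice(2, p=P[state])
    counts[state] += 1
print("empirical frequencies:", counts / counts.sum())
```

MCMC turns this observation around: instead of analyzing a given chain, it builds a chain whose stationary distribution is the distribution we want to sample from.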

(Figure: Illustration of the Markov chain Monte Carlo method, from ResearchGate: https://www.researchgate.net/figure/Illustration-of-Markov-Chain-Monte-Carlo-method_fig1_334001505)

The main advantage of MCMC methods is their ability to explore and sample from probability distributions that have no closed-form solution. This makes MCMC particularly valuable when direct sampling or analytical approaches are not feasible.

 

Metropolis-Hastings algorithm

1) Start from an arbitrary initial point (it does not need to be a sample from the target distribution).

2) Propose a new sample by randomly perturbing the current sample.

3) Evaluate the acceptance probability of the proposed sample based on the ratio of the target density values (and, for asymmetric proposals, the proposal densities).

4) Accept the proposed sample with that probability; otherwise reject it and keep the current sample.

5) Repeat steps 2 to 4 for a large number of iterations to obtain a sequence of samples.

 

The Metropolis-Hastings algorithm leverages the Markov chain property to explore the target distribution iteratively. By generating proposals and accepting or rejecting them according to the acceptance probability, the algorithm samples efficiently from high-dimensional and complicated distributions. Over a sufficient number of iterations, the distribution of the samples produced by the algorithm converges to the target distribution, as sketched below.
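As a concrete illustration of steps 1-5, here is a minimal random-walk Metropolis-Hastings sketch in Python. The function name, the Gaussian proposal, and the bimodal example target are assumptions made for illustration, not from the post; log densities are used so the ratio stays numerically stable:

```python
import numpy as np

def metropolis_hastings(log_target, n_samples, initial, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal.

    log_target: function returning the log density of the target
                (an unnormalized density is fine).
    """
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    current = initial
    current_log_p = log_target(current)
    for i in range(n_samples):
        # Step 2: propose by perturbing the current sample.
        proposal = current + step_size * rng.standard_normal()
        proposal_log_p = log_target(proposal)
        # Steps 3-4: accept with probability min(1, p(theta') / p(theta)).
        # The Gaussian proposal is symmetric, so its densities cancel.
        if np.log(rng.uniform()) < proposal_log_p - current_log_p:
            current, current_log_p = proposal, proposal_log_p
        samples[i] = current  # on rejection, the current sample is repeated
    return samples

# Example target: an unnormalized, bimodal mixture of two Gaussians.
log_target = lambda x: np.logaddexp(-0.5 * (x + 2) ** 2, -0.5 * (x - 2) ** 2)
samples = metropolis_hastings(log_target, n_samples=50_000, initial=0.0)
print(samples.mean(), samples.std())  # roughly 0 and sqrt(5) for this mixture
```

Note that the sampler only ever needs the target density up to a normalizing constant, which is exactly why MCMC works for intractable distributions.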

 

In Bayesian inference, MCMC methods explore the posterior distribution of model parameters given observed data. By sampling from the posterior, one can estimate the uncertainty in parameter values, assess model fit, and make predictions, while incorporating prior knowledge and quantifying uncertainty.
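As a small worked example of this (hypothetical data, not from the post): sampling the posterior of a coin's head probability under a Beta(2, 2) prior, where the exact posterior is known and can be used to sanity-check the sampler. This sketch reuses the metropolis_hastings function defined above:

```python
import numpy as np
from scipy.stats import beta

# Hypothetical data: 7 heads out of 10 coin flips.
heads, n = 7, 10

def log_posterior(theta):
    # log Beta(2, 2) prior + log Bernoulli likelihood, up to a constant.
    if not 0 < theta < 1:
        return -np.inf  # zero posterior density outside (0, 1)
    return (2 - 1 + heads) * np.log(theta) + (2 - 1 + n - heads) * np.log(1 - theta)

samples = metropolis_hastings(log_posterior, n_samples=50_000,
                              initial=0.5, step_size=0.2)
burned = samples[5_000:]  # discard burn-in before the chain has mixed
print("MCMC posterior mean: ", burned.mean())
print("exact posterior mean:", beta(2 + heads, 2 + n - heads).mean())  # 9/14
```

The two means should agree closely; in real problems the exact answer is unavailable, and the MCMC estimate is all we have.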

 

In the Random-Walk Metropolis-Hastings (RW-MH) algorithm, a proposed move from a point θ to a point θ' is accepted with probability min(1, q(θ | θ')π(θ') / (q(θ' | θ)π(θ))), where q is the proposal density and π is the target density.

Proposal distribution: the distribution used to generate candidate points in the Markov chain. In RW-MH the proposal follows a random walk, where the new point is obtained by perturbing the current point according to a chosen distribution.

- q(θ' | θ): the density of proposing θ' given θ. It represents the likelihood of transitioning from the current point θ to the proposed point θ'.
- q(θ | θ'): the density of proposing θ given θ'. It represents the likelihood of the reverse transition from the proposed point θ' back to the current point θ.

Target distribution: the distribution from which we want to sample. Often, it is the posterior distribution in Bayesian inference.

- π(θ): the density of the target distribution at the current point θ. It indicates the relative probability of observing θ under the target distribution.
- π(θ'): the density of the target distribution at the proposed point θ'. It indicates the relative probability of observing θ' under the target distribution.

Acceptance probability: the probability of accepting a proposed move from θ to θ', given by min(1, q(θ | θ')π(θ') / (q(θ' | θ)π(θ))). It ensures that the Markov chain transitions between points with probability proportional to the target distribution.

min(1, x) denotes taking the minimum of 1 and x, so the acceptance probability is capped at one: if the ratio of target and proposal densities exceeds 1, the proposed state θ' is always accepted; otherwise it is accepted with probability equal to that ratio.
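The general acceptance rule, including the case of an asymmetric proposal, can be written directly from the formula above. The following sketch is illustrative, with hypothetical target_pdf and proposal_pdf functions supplied by the caller:

```python
import numpy as np

rng = np.random.default_rng(1)

def accept(theta, theta_prime, target_pdf, proposal_pdf):
    """One accept/reject decision of the Metropolis-Hastings rule.

    target_pdf(x):         target density pi(x), possibly unnormalized.
    proposal_pdf(x, given): proposal density q(x | given) of proposing x
                            when the chain is at `given`.
    """
    ratio = (proposal_pdf(theta, theta_prime) * target_pdf(theta_prime)) / \
            (proposal_pdf(theta_prime, theta) * target_pdf(theta))
    # min(1, ratio): ratios above 1 always accept; otherwise accept
    # with probability equal to the ratio.
    return rng.uniform() < min(1.0, ratio)
```

For a symmetric proposal (as in RW-MH with a Gaussian), the two proposal densities cancel and the rule reduces to min(1, π(θ')/π(θ)).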

 

This acceptance probability determines whether the proposed state is added to the Markov chain or rejected, in which case the current state is retained.
