Samplers

Quick overview

Samplers contains sampling algorithms to solve inverse problems. It focuses mainly on the Hamiltonian Monte Carlo (HMC) algorithm and its variants. Currently the following samplers are implemented:

  • the "standard" HMC algorithm with optional constraints as described in [4] and [5], and
  • the No U-Turn (NUTS) algorithm as described in [8].
  • the Extended Metropolis algorithm as described in [9].

See the User guide section for an explanation of the basics of Samplers and some minimal usage examples.

Theoretical background

Probabilistic approach to inverse problems

In the probabilistic approach, an inverse problem essentially represents an indirect measurement where the knowledge about the observed data and model parameters is completely expressed in terms of probabilities [1]. Within this formalism, the general solution to the inverse problem is a probability density function (PDF), i.e., the posterior PDF (see [1] and [2] for a detailed explanation). Typically, the posterior PDF cannot be computed analytically, therefore sampling methods come into play. The result is thus not a single "optimal" model, but a collection of plausible models which may feature significant differences while generally being able to explain the observed data and remaining compatible with the prior information. This reflects the nature of the inverse problem: if different scenarios are plausible from the point of view of the observed data, then the probabilistic solution aims to reflect that.

The posterior PDF is constructed by combining two separate pieces of information: 1) the prior knowledge on the model parameters, expressed by the PDF $\rho(\mathbf{m})$, where $\mathbf{m}$ represents the model parameters, and 2) the information provided by the experiment, described by the likelihood $L(\mathbf{m})$. The posterior distribution, under certain fairly wide assumptions, is then given by [2] [3]:

\[\begin{align} \sigma(\mathbf{m}) = k \, \rho(\mathbf{m}) \, L(\mathbf{m}). \end{align}\]
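
For instance, in the common special case of Gaussian prior information and Gaussian data errors with forward function $g(\mathbf{m})$ (a standard setting, see [3]), the posterior takes the form

\[\begin{align} \sigma(\mathbf{m}) \propto \exp \left[ -\frac{1}{2} \left( g(\mathbf{m})-\mathbf{d} \right)^{\mathrm{T}} \mathbf{C}_{\mathrm{D}}^{-1} \left( g(\mathbf{m})-\mathbf{d} \right) - \frac{1}{2} \left( \mathbf{m}-\mathbf{m}_{\mathrm{prior}} \right)^{\mathrm{T}} \mathbf{C}_{\mathrm{M}}^{-1} \left( \mathbf{m}-\mathbf{m}_{\mathrm{prior}} \right) \right] , \end{align}\]

where $\mathbf{d}$ are the observed data, $\mathbf{m}_{\mathrm{prior}}$ is the prior model, and $\mathbf{C}_{\mathrm{D}}$ and $\mathbf{C}_{\mathrm{M}}$ are the data and prior covariance matrices. This is also the setting used in the examples below.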

Since $\sigma(\mathbf{m})$ is a PDF, extracting features of interest from it requires evaluating the relevant integrals. For example, calculating the expected model given the data requires evaluating the following integral

\[\begin{equation} \mathbb{E} \left[ \mathbf{m} \right] = \int_{M} \mathbf{m} \, \sigma(\mathbf{m}) \, \mathrm{d}\mathbf{m}, \end{equation}\]

where $M$ represents the whole model space.

Finally, to calculate the expectation of some arbitrary function $\phi(\mathbf{m})$ of $\mathbf{m}$, we can use the following relationship once we have obtained a collection of samples from the posterior distribution:

\[\begin{equation} \int_{\mathrm{M}} \phi(\mathbf{m}) \sigma(\mathbf{m}) \, \mathrm{d} \mathbf{m} \approx \frac{1}{N} \sum_{i=1}^N \phi(\mathbf{m}_{(i)}) \end{equation}\]

where $N$ is the number of available samples and $\mathbf{m}_{(i)}$ denotes the $i$-th sample.
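
In practice, once a sampler has produced a set of posterior samples, such integrals reduce to simple averages over the samples. A minimal sketch in Julia, assuming a hypothetical $n \times N$ sample matrix mods with one model per column (as in the mods field of the output structures shown later):

using Statistics

# mods: n × N matrix of posterior samples, one model per column (assumed given)
post_mean = vec(mean(mods, dims=2))   # E[m] ≈ (1/N) Σᵢ m₍ᵢ₎

# expectation of an arbitrary function ϕ(m), here e.g. the Euclidean norm of m
ϕ(m) = sqrt(sum(abs2, m))
Eϕ = mean(ϕ(mods[:,i]) for i in 1:size(mods,2))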

Basic HMC algorithm

HMC constructs a Markov chain over an $n$-dimensional probability density function $\sigma(\mathbf{m})$ using classical Hamiltonian mechanics. The algorithm regards the current state $\mathbf{m}$ of the Markov chain as the location of a physical particle in an $n$-dimensional space $\mathcal{M}$ (i.e., model or parameter space). It moves under the influence of a potential energy, $U$, which is defined as

\[\begin{align} U(\mathbf{m})=-\ln{ \left( \sigma(\mathbf{m}) \right) }. \end{align}\]

To complete the physical system, the state of the Markov chain needs to be artificially augmented with momentum variables $\mathbf{p}$ and a generalized mass for every dimension pair. The collection of resulting masses is contained in a positive definite mass matrix $\mathbf{M}$ of dimension $n \times n$. The momenta and the mass matrix define the kinetic energy of the particle as

\[\begin{align} K(\mathbf{p})=\frac{1}{2} \mathbf{p}^T \mathbf{M}^{-1} \mathbf{p}. \end{align}\]

In the HMC algorithm, the momenta $\mathbf{p}$ are drawn randomly from a multivariate Gaussian with covariance matrix $\mathbf{M}$ (the mass matrix). The sum of the location-dependent potential and momentum-dependent kinetic energy constitute the total energy, or Hamiltonian, of the system

\[\begin{align} H(\mathbf{m},\mathbf{p})=U(\mathbf{m})+K(\mathbf{p}). \end{align}\]

The Hamiltonian dynamics are governed by the following equations,

\[\begin{align} \frac{\partial\mathbf{m}}{\partial\tau} = \frac{\partial H}{\partial\mathbf{p}},\quad \frac{\partial\mathbf{p}}{\partial\tau} = - \frac{\partial H}{\partial\mathbf{m}} \, , \end{align}\]

which determine the position of the particle as a function of time $\tau$. This time $\tau$ is artificial, just like the mass matrix; it has no connection to the actual physics of the inverse problem at hand.

We can simplify Hamilton's equations using the fact that kinetic and potential energy depend only on momentum and location, respectively, to obtain

\[\begin{align} \frac{\partial\mathbf{m}}{\partial\tau} = \mathbf{M}^{-1} \mathbf{p}, \quad \frac{\partial\mathbf{p}}{\partial\tau} = - \frac{\partial U}{\partial\mathbf{m}} \, . \end{align}\]

Evolving $\mathbf{m}$ over time $\tau$ generates another possible state of the system with new position $\mathbf{\tilde{m}}$, momentum $\mathbf{\tilde{p}}$, potential energy $\tilde{U}$, and kinetic energy $\tilde{K}$. Due to the conservation of energy, the Hamiltonian is equal in both states, i.e., $U+K = \tilde{U} + \tilde{K}$. Successively drawing random momenta and evolving the system generates a distribution of the possible states of the system. Thereby, HMC samples the joint momentum and model space, referred to as phase space. As we are not interested in the momentum component of phase space, we marginalize over the momenta by simply dropping them. This results in samples drawn from $\sigma(\mathbf{m})$.

If one could solve Hamilton's equations exactly, every proposed state (after burn-in) would be a valid sample of $\sigma(\mathbf{m})$. Since Hamilton's equations for non-linear forward models cannot be solved analytically, the system must be integrated numerically. Suitable integrators are symplectic, meaning that time reversibility, phase space partitioning and volume preservation are satisfied [4][5]. We employ the leapfrog method as described in [4]. However, the Hamiltonian is generally not preserved exactly when explicit time-stepping schemes are used (e.g., [6]). Therefore, the time evolution generates samples not exactly proportional to the original distribution. A Metropolis-Hastings correction step is therefore applied at the end of numerical integration.

In summary, at each iteration, a new sample is generated starting from the current model $\mathbf{m}$ in the following way:

  1. Draw momenta $\mathbf{p}$ from the Gaussian with mean $\mathbf{0}$ and covariance matrix $\mathbf{M}$;
  2. Compute the Hamiltonian $H$ of model $\mathbf{m}$ with momenta $\mathbf{p}$;
  3. Propagate $\mathbf{m}$ and $\mathbf{p}$ for some time $\tau$ to $\tilde{\mathbf{m}}$ and $\tilde{\mathbf{p}}$, using the discretized version of Hamilton's equations and a suitable numerical integrator;
  4. Compute the Hamiltonian $\tilde{H}$ of model $\tilde{\mathbf{m}}$ with momenta $\mathbf{\tilde{p}}$;
  5. Accept the proposed move $\mathbf{m} \rightarrow \tilde{\mathbf{m}}$ with probability $p_\text{accept} = \min \left( 1, \exp ( H-\tilde{H} ) \right)\,.$
  6. If accepted, use (and count) $\tilde{\mathbf{m}}$ as the new state. Otherwise, keep (and count) the previous state. Then return to point 1.

The mass matrix $\mathbf{M}$ is one of the important tuning parameters of the HMC algorithm; details on its meaning and suggestions for tuning can be found in [5][7]. Moreover, employing the discrete leapfrog integrator implies that there are two additional parameters that need to be tuned, namely the time step $\epsilon$ and the number of iterations $L$ [4].
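
To make steps 1 to 6 concrete, the following is a minimal, self-contained sketch of one HMC iteration with a leapfrog integrator. It is only an illustration of the algorithm above, not the package's internal implementation; calcU and gradU stand for user-supplied functions returning $U(\mathbf{m})$ and its gradient.

using LinearAlgebra

# One HMC iteration with leapfrog integration (illustrative sketch).
# calcU(m): potential energy U(m); gradU(m): its gradient;
# M: mass matrix; ϵ: leapfrog step size; L: number of leapfrog steps.
function hmc_step(m, calcU, gradU, M, ϵ, L)
    invM = inv(M)
    p = cholesky(M).L * randn(length(m))        # 1. draw p ~ N(0, M)
    H = calcU(m) + 0.5 * dot(p, invM * p)       # 2. Hamiltonian of current state
    mt, pt = copy(m), copy(p)
    pt -= 0.5 * ϵ * gradU(mt)                   # 3. leapfrog: initial half momentum step
    for i in 1:L
        mt += ϵ * (invM * pt)                   #    full position step
        pt -= (i < L ? ϵ : 0.5 * ϵ) * gradU(mt) #    momentum step (half step at the end)
    end
    Ht = calcU(mt) + 0.5 * dot(pt, invM * pt)   # 4. Hamiltonian of proposed state
    if rand() < exp(H - Ht)                     # 5. Metropolis acceptance
        return mt, true                         # 6. accepted: new state
    else
        return m, false                         #    rejected: keep current state
    end
end

For instance, with the GaussProb functor defined in Example 1 below, one could call hmc_step(mstart, m->likelihood(m,:nlogpdf), m->likelihood(m,:gradnlogpdf), Matrix(1.0I,2,2), 0.25, 10).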

User guide

Monte Carlo simulations

All kinds of simulations are started using the function runMC. The arguments passed to runMC determine which algorithm will be used, the number of iterations, the output files, etc. The signature of the function is:

runMC(
    pars::Dict,
    MCpar::AbstractMCParams,
    mstart::Vector{Float64};
    likelihood,
    prior
)

The argument pars is a dictionary containing a set of generic parameters for the simulation, such as the maximum number of iterations, the name of the simulation, the directory used to write the output, etc. See the documentation of runMC for more details. The argument MCpar is a Julia structure which depends on the algorithm intended to be employed. Its type, in fact, determines which algorithm will be used to perform the sampling. As such, the parameters that need to be specified for each algorithm are different, and so are the fields of the MCpar structure. Currently the following AbstractMCParams structures (types) are defined:

  • HMCParams for the "standard" HMC algorithm with optional constraints as described in [4] and [5];
  • NUTSParams for the No U-Turn (NUTS) algorithm as described in [8];
  • ExtMetropParams for the Extended Metropolis algorithm as described in [9].

See their documentation for details. The argument mstart is a vector containing the starting model for the simulation. The model has to be in the form of a 1D vector for the computations to work, i.e., physically 3D models need to be "unrolled" to 1D. The last two arguments, likelihood and prior, are related to the two PDFs composing the posterior PDF as seen above, i.e., the prior PDF $\rho(\mathbf{m})$ and the likelihood $L(\mathbf{m})$. However, likelihood and prior refer in practice to the negative logarithms of those, namely $-\ln(\rho(\mathbf{m}))$ and $-\ln(L(\mathbf{m}))$.

Once the above parameters have been set, the simulation can be started by issuing, for instance, something like

runMC(pars,MCpar,mstart,likelihood=mylikelihood,prior=myprior)

Output

The output of a simulation is written to two files:

  1. <name of the simulation>_inp.jld2, and
  2. <name of the simulation>_outp.h5.

The first file contains all the input parameters of the simulation which completely specify the problem under study. These include the prior and likelihood models, the generic simulation parameters, etc. This file is saved in the .jld2 format (using the package JLD2), an HDF5-compatible format capable of saving Julia structures and retrieving them later on. The second file contains mainly the results of the simulation, namely the saved models, some statistical info, etc. This file is saved in the .h5 format (using the package HDF5).
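
If needed, the saved input parameters can also be inspected directly with JLD2. A minimal sketch, assuming a simulation named "gauss1" with output written to "results" (file names follow the conventions above):

using JLD2

# open the input-parameter file read-only and list the stored objects
jldopen(joinpath("results", "gauss1_inp.jld2"), "r") do fin
    @show keys(fin)
end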

The full output can be read and saved in an appropriate structure by using the function readMCoutput. The signature of this function is

readMCoutput(
    datadir::String,
    simname::String;
    istart,
    iend
) 

The argument datadir specifies the directory in which the output is saved and simname the name of the simulation. The optional parameters istart and iend specify which subset of the saved models to retrieve.
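
For example, to discard an initial burn-in phase one could load only a later subset of the saved models (index values are illustrative):

# read only the saved models from index 200 to 1000, e.g., skipping burn-in
out = readMCoutput("results", "gauss1", istart=200, iend=1000)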

Interrupting and continuing simulations

This package allows you to interrupt the simulation at any time by simply killing the Julia process. This should not corrupt the output files, even in case of a crash of the program. The output files can then be used to continue (or re-start) the simulation from the point where it was interrupted by using the function contrunMC. This function needs as its first argument a dictionary containing the generic simulation parameters (see above). It can either continue using the exact same parameters as before or, when a new instance of AbstractMCParams of the appropriate type (e.g., an HMCParams struct) is additionally passed, continue the previous simulation with new algorithm-specific parameters.
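
A minimal sketch of both continuation modes (parameter values are illustrative; imM and LchoM as in Example 1 below):

pars = Dict("simname" => "gauss1", "maxiter" => 10000,
            "outdir" => "results", "saveevery" => 5, "infoevery" => 500)

# continue with the same algorithm-specific parameters as the original run:
contrunMC(pars)

# ...or continue with new algorithm-specific parameters, e.g., a different ϵ:
# contrunMC(pars, HMCParams(invmassM=imM, LcholmassM=LchoM,
#                           constrained=false, ϵ=0.1, L=10))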

The output files are written using the Single Writer Multiple Reader (SWMR) capabilities of the HDF5 file format, so, while the simulation is running, it is easy to launch a different Julia instance and, from there, plot or analyze the results so far. SWMR guarantees that only the initial process is able to write to the file, while allowing others to read the data.
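
For instance, from a second Julia process one can open the output file of a running simulation read-only (a sketch, assuming the file-naming convention above):

using HDF5

# open the output of the running simulation "gauss1" read-only in SWMR mode
fid = h5open(joinpath("results", "gauss1_outp.h5"), "r"; swmr=true)
@show keys(fid)   # datasets saved so far
close(fid)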

Finally, in case the output file containing the saved models gets corrupted because of wrong HDF5 "status flags", the function clear_stflags_hdf5 provides a way to clear them and then continue the simulation using contrunMC.

Example 1: Sampling a 2D Gaussian with the "standard" HMC algorithm

In the following we show a simple example of how to construct a custom problem and run Samplers to sample the target distribution. In this case we aim at sampling a simple 2D Gaussian probability density function (PDF), which corresponds to having a Gaussian likelihood function and a forward model which is simply the identity matrix. In other words, the forward model is given by

\[\mathbf{G} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \, .\]

while the misfit function (negative logarithm of the PDF) is given by

\[U(\mathbf{m}) = \frac{1}{2} \, (\mathbf{G} \mathbf{m}-\mathbf{d})^{\mathrm{T}} \mathbf{C}_{\mathrm{D}}^{-1} (\mathbf{G} \mathbf{m}-\mathbf{d}) \, ,\]

where $\mathbf{m}$ are the model parameters (the unknowns) and $\mathbf{d}$ the vector of observed (or measured) data, i.e., in this example, the mean of the Gaussian PDF.

First we load some needed packages

using LinearAlgebra
using InverseAlgos.Samplers  # sampling algos
using CairoMakie  # for plotting

Then the problem to solve is defined by using a custom type (struct) named GaussProb:

struct GaussProb
    mean::Vector
    invcov::Matrix

    function GaussProb(mean,invcov)
        @assert isposdef(invcov)
        @assert issymmetric(invcov)
        return new(mean,invcov)
    end
end

The above defines a structure (type) which contains two fields: mean, representing the mean of the Gaussian PDF, and invcov, representing the inverse of the covariance matrix (i.e., the precision matrix). These two pieces of information completely specify our problem.

We now need to provide functions to compute the value of the negative logarithm of the probability density function, i.e., $U(\mathbf{m})$, and its gradient with respect to the model parameters $\mathbf{m}$. To do so, we make the struct callable (sometimes called a "functor") with a function that computes either the negative natural logarithm of the PDF or its gradient for an input vector x, depending on the value of the argument kind:

function (gp::GaussProb)(x,kind)
    if kind==:nlogpdf
        # negative logarithm of the pdf
        dif = x .- gp.mean
        pd = 1/2 * dot(dif, (gp.invcov * dif))
        return pd
    elseif kind==:gradnlogpdf
        # gradient of neg. log. of the pdf: C⁻¹ (x - mean)
        dif = x .- gp.mean
        gra = gp.invcov * dif
        return gra
    else
        error("Wrong argument 'kind'. It can be either ':nlogpdf' or ':gradnlogpdf'.")
    end
end

An important point here is that, in order to use the above function with Samplers, the signature of the function must be (x::Vector{<:Real},kind::Symbol), where x is a vector and kind is a Symbol. Moreover, the only possible values for kind are :nlogpdf, which evaluates the negative logarithm of the likelihood functional, or :gradnlogpdf, which computes its gradient.

Computing the value of the negative logarithm of the PDF and its gradient is all that is needed for setting up an HMC inversion (except for additional tuning parameters). Computing the negative logarithm of the PDF essentially corresponds to evaluating the misfit functional.

We now define the values for the mean (representing the observed data $\mathbf{d}$) and the inverse covariance $\mathbf{C}_{\mathrm{D}}^{-1}$, and instantiate our likelihood model:

mea = [0.2,-0.3] # the mean value
invcov = inv([1.5 0.2; 0.2 2.3]) # the inverse covariance matrix
likelihood = GaussProb(mea,invcov) # instantiate a GaussProb struct
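
As a quick sanity check, the functor can now be called directly with the two allowed values of kind (illustrative input):

x = [0.0, 0.0]
@show likelihood(x, :nlogpdf)      # value of the negative log-PDF at x
@show likelihood(x, :gradnlogpdf)  # its gradient at x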

Next we set up the HMC simulation (sampling) by first defining the starting model and the parameters needed for HMC. We define the starting model as

mstart = [5.0,-3.0]

and the inverse of the mass matrix as a diagonal one (the identity here), from which we also compute the lower triangular Cholesky factor of the mass matrix

imM = Diagonal(1.0*I,length(mstart))
LchoM = cholesky(inv(imM)).L
@show imM
@show LchoM
imM = [1.0 0.0; 0.0 1.0]
LchoM = [1.0 0.0; 0.0 1.0]

Now we need to set the parameters relative to the chosen HMC algorithm, in this case the plain version of HMC. We need to define whether the model parameters will be constrained or not (false in our case), as well as the values for ϵ (type \epsilon-TAB in Julia) and L, i.e., the step length and the number of iterations for the leap-frog integration:

constrained = false
ϵ = 0.25
L = 10

We have thus set all the needed parameters for a "standard" HMC simulation, so we can instantiate the HMCParams structure

hmcpars = HMCParams(invmassM=imM,
                    LcholmassM=LchoM,
                    constrained=constrained,
                    ϵ=ϵ,
                    L=L)

The type of the above structure also defines the algorithm that will be used for sampling, i.e., HMCParams implies the use of the classic HMC algorithm, while, e.g., NUTSParams implies the use of the NUTS algorithm.

Finally, we can set in a dictionary the general parameters of the simulation, i.e., the simulation name, the maximum number of iterations, the output directory, etc.:

pars = Dict()
pars["simname"] = "gauss1" # simulation name
pars["maxiter"] = 5000     # maximum number of iterations
pars["outdir"] = "results" # output directory
pars["saveevery"] = 5      # save every 2 models to disk, e.g., thinning the chain
pars["infoevery"] = 500    # print info every 10 iterations

We can now start the simulation using the generic function runMC:

runMC(pars,hmcpars,mstart,likelihood=likelihood)

 ====================================
      Hamiltonian Monte Carlo
 ====================================
 Running simulation gauss1
 Initial Ucur: 10.134
 Iter.#: 1  Acc.: 100.00%  Ucur: 5.345  ELA: 0h:0m:1s  ETA: 1h:50m:46s    
 Iter.#: 500  Acc.: 84.40%  Ucur: 0.465  ELA: 0h:0m:1s  ETA: 0h:0m:16s    
 Iter.#: 1000  Acc.: 84.90%  Ucur: 0.811  ELA: 0h:0m:2s  ETA: 0h:0m:8s    
 Iter.#: 1500  Acc.: 84.47%  Ucur: 1.241  ELA: 0h:0m:2s  ETA: 0h:0m:5s    
 Iter.#: 2000  Acc.: 84.45%  Ucur: 1.125  ELA: 0h:0m:3s  ETA: 0h:0m:4s    
 Iter.#: 2500  Acc.: 83.76%  Ucur: 0.034  ELA: 0h:0m:4s  ETA: 0h:0m:4s    
 Iter.#: 3000  Acc.: 83.60%  Ucur: 0.938  ELA: 0h:0m:5s  ETA: 0h:0m:3s    
 Iter.#: 3500  Acc.: 83.31%  Ucur: 0.340  ELA: 0h:0m:6s  ETA: 0h:0m:2s    
 Iter.#: 4000  Acc.: 83.35%  Ucur: 0.683  ELA: 0h:0m:8s  ETA: 0h:0m:2s    
 Iter.#: 4500  Acc.: 83.49%  Ucur: 0.079  ELA: 0h:0m:10s  ETA: 0h:0m:1s    
 Iter.#: 5000  Acc.: 83.28%  Ucur: 0.364  ELA: 0h:0m:12s  ETA: 0h:0m:0s
  End of main loop. Saved 1001 models.
 ====================================

We can load the full results of the simulation (using readMCoutput), which will store the outputs in a structure of type Samplers.HMCOutput. See the documentation of Samplers.HMCOutput for an explanation of the various fields.

outhmc = readMCoutput("results","gauss1")
InverseAlgos.Samplers.HMCOutput("results", "gauss1", [5.0, -3.0], "730ff26a-0e20-11f0-1a1b-01ecb98ad875", InverseAlgos.Samplers.NLogPostPDF(Reconstruct@GaussProb(Any[[0.2, -0.3], [0.6744868035190615 -0.05865102639296188; -0.05865102639296188 0.43988269794721413]]), nothing, Union{Missing, Float64}[0.0, missing]), InverseAlgos.Samplers.HMCParams(false, "", [1.0 0.0; 0.0 1.0], [1.0 0.0; 0.0 1.0], false, nothing, nothing, 0.25, 10, 1, :plainHMC), nothing, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  4991, 4992, 4993, 4994, 4995, 4996, 4997, 4998, 4999, 5000], [1, 5, 10, 15, 20, 25, 30, 35, 40, 45  …  4955, 4960, 4965, 4970, 4975, 4980, 4985, 4990, 4995, 5000], [-3.7174484729766846 0.025722434744238853 … -0.05706125125288963 -0.4503899812698364; 0.1988266557455063 0.22655130922794342 … -2.324765920639038 -1.3942525386810303], [1, 2, 2, 3, 4, 5, 6, 7, 8, 9  …  4158, 4159, 4160, 4160, 4161, 4162, 4163, 4163, 4163, 4164], [5.344812007994469, 1.1311128524887022, 1.1311128524887022, 1.3167988737281693, 0.07660527727493868, 0.4461035735648223, 0.48829264231052116, 0.10457153258180271, 0.5311289105196042, 0.27487820502682886  …  1.251870350015011, 3.719260515486866, 0.01557263869910431, 0.01557263869910431, 0.8934465291085592, 0.48651423456233317, 0.4021576856709354, 0.4021576856709354, 0.4021576856709354, 0.3642702406990698], [5.344812007994469, 1.1311128524887022, 1.1311128524887022, 1.3167988737281693, 0.07660527727493868, 0.4461035735648223, 0.48829264231052116, 0.10457153258180271, 0.5311289105196042, 0.27487820502682886  …  1.251870350015011, 3.719260515486866, 0.01557263869910431, 0.01557263869910431, 0.8934465291085592, 0.48651423456233317, 0.4021576856709354, 0.4021576856709354, 0.4021576856709354, 0.3642702406990698], nothing, [5.344812007994469, 0.07660527727493868, 0.27487820502682886, 0.21132945289767227, 1.052359026347749, 0.7025422893770027, 0.8199339505191914, 0.2298674796158129, 0.8448409563182128, 0.303743207987968  …  4.498101299672123, 1.5590137669426076, 0.4694640078194733, 1.1374278584561672, 3.7276056035498435, 0.22279367717422124, 0.6866116796421211, 0.8720156471503543, 0.8934465291085592, 0.3642702406990698], nothing)

Now we can plot some quantities of interest. As a very simple example, here we plot the values of the samples $\mathbf{m}$ as $(x,y)$ coordinates and the potential energy (misfit) as a function of iterations.

userprob = outhmc.userprob

mods = outhmc.mods
iter = outhmc.iter
ucur = outhmc.ucur

fig = Figure()

ax1 = Axis(fig[1:2,1],xlabel="x",ylabel="y")
scatter!(ax1,mods[1,:],mods[2,:])
scatter!(ax1,mea[1],mea[2],color="red")

ax2 = Axis(fig[3,1],xlabel="iteration",ylabel="Pot. energy U")
lines!(ax2,iter,ucur)

fig
[Figure: samples of the 2D Gaussian scattered in the (x,y) plane with the target mean in red (top), and potential energy U versus iteration (bottom).]

Example 2: Sampling a 2D Gaussian with the NUTS algorithm

Firstly, we repeat the same initial steps as above to set up the problem.

using LinearAlgebra
using InverseAlgos.Samplers  # sampling algos
using CairoMakie  # for plotting

struct GaussProb
    mean::Vector
    invcov::Matrix

    function GaussProb(mean,invcov)
        @assert isposdef(invcov)
        @assert issymmetric(invcov)
        return new(mean,invcov)
    end
end

function (gp::GaussProb)(x,kind)
    if kind==:nlogpdf
        # negative logarithm of the pdf
        dif = x .- gp.mean
        pd = 1/2 * dot(dif, (gp.invcov * dif))
        return pd
    elseif kind==:gradnlogpdf
        # gradient of neg. log. of the pdf: C⁻¹ (x - mean)
        dif = x .- gp.mean
        gra = gp.invcov * dif
        return gra
    else
        error("Wrong argument 'kind'. It can be either ':nlogpdf' or ':gradnlogpdf'.")
    end
end

We define the observed data (i.e., the mean value in this case) and the inverse of the covariance matrix, instantiate the GaussProb, and define the starting model.

mea = [0.2,-0.3] # the mean value
invcov = inv([1.5 0.2; 0.2 2.3]) # the inverse covariance matrix
likelihood = GaussProb(mea,invcov) # instantiate a GaussProb struct

mstart = [0.0,-3.0]

This time we intend to use the NUTS algorithm, so we start by defining the required parameters for the struct NUTSParams as follows

imM = Diagonal(1.0*I,length(mstart))  # inverse of the mass matrix
LchoM = cholesky(inv(imM)).L          # lower triangular matrix from Cholesky decomposition of the mass matrix
constrained = false                   # constrained problem?
δ = 0.65                              # target acceptance rate for dual averaging
Δmax = 1000.0                         # maximum change in the Hamiltonian value
niteradapt = 500                      # number of iterations for adaptation
ϵmax = 0.25                           # maximum value for ϵ (type \epsilon TAB in Julia)
maxtreeheight = 4                     # maximum height of the balanced tree for NUTS

See the documentation for NUTSParams for more details. Now we instantiate a NUTSParams structure:

hmcpars = NUTSParams(invmassM=imM,
                     LcholmassM=LchoM,
                     constrained=constrained,
                     δ=δ,
                     Δmax=Δmax,
                     niteradapt=niteradapt,
                     ϵmax=ϵmax,
                     maxtreeheight=maxtreeheight )
InverseAlgos.Samplers.NUTSParams(false, "", [1.0 0.0; 0.0 1.0], [1.0 0.0; 0.0 1.0], false, nothing, nothing, 0.65, 1000.0, 500, 0.25, 1, :NUTS, :velocity, 4, InverseAlgos.Samplers.AvEpsFromHMCIter(20))

Now we set in a dictionary the general parameters of the simulation

pars = Dict()
pars["simname"] = "gauss2" # simulation name
pars["maxiter"] = 5000     # maximum number of iterations
pars["outdir"] = "results" # output directory
pars["saveevery"] = 5      # save every 2 models to disk, e.g., thinning the chain
pars["infoevery"] = 500    # print info every 10 iterations

and then run the simulation using NUTS

runMC(pars,hmcpars,mstart,likelihood=likelihood)

 ====================================
      NUTS (dynamic HMC)
 ====================================
 Running simulation gauss2
 Initial 'reasonable' ϵ: 0.125
 Initial Ucur: 1.585
 Iter.#: 1  Acc.: 100.00%  Ucur: 1.643  ELA: 0h:0m:0s  ETA: 0h:48m:7s    
 Iter.#: 500  Acc.: 100.00%  Ucur: 0.293  ELA: 0h:0m:0s  ETA: 0h:0m:6s    
 Iter.#: 1000  Acc.: 99.90%  Ucur: 0.987  ELA: 0h:0m:1s  ETA: 0h:0m:4s    
 Iter.#: 1500  Acc.: 99.73%  Ucur: 1.202  ELA: 0h:0m:1s  ETA: 0h:0m:4s    
 Iter.#: 2000  Acc.: 99.75%  Ucur: 1.016  ELA: 0h:0m:2s  ETA: 0h:0m:3s    
 Iter.#: 2500  Acc.: 99.72%  Ucur: 0.823  ELA: 0h:0m:2s  ETA: 0h:0m:2s    
 Iter.#: 3000  Acc.: 99.67%  Ucur: 0.002  ELA: 0h:0m:3s  ETA: 0h:0m:2s    
 Iter.#: 3500  Acc.: 99.69%  Ucur: 1.021  ELA: 0h:0m:4s  ETA: 0h:0m:1s    
 Iter.#: 4000  Acc.: 99.72%  Ucur: 0.942  ELA: 0h:0m:5s  ETA: 0h:0m:1s    
 Iter.#: 4500  Acc.: 99.73%  Ucur: 0.712  ELA: 0h:0m:7s  ETA: 0h:0m:0s    
 Iter.#: 5000  Acc.: 99.76%  Ucur: 0.610  ELA: 0h:0m:9s  ETA: 0h:0m:0s
  End of main loop. Saved 1001 models.
 ====================================

Load the full results using readMCoutput

outhmc = readMCoutput("results","gauss2")
InverseAlgos.Samplers.HMCOutput("results", "gauss2", [0.0, -3.0], "8058c4f6-0e20-11f0-3d5b-2f5cce3609d1", InverseAlgos.Samplers.NLogPostPDF(Reconstruct@GaussProb(Any[[0.2, -0.3], [0.6744868035190615 -0.05865102639296188; -0.05865102639296188 0.43988269794721413]]), nothing, Union{Missing, Float64}[0.0, missing]), InverseAlgos.Samplers.NUTSParams(false, "", [1.0 0.0; 0.0 1.0], [1.0 0.0; 0.0 1.0], false, nothing, nothing, 0.65, 1000.0, 500, 0.25, 1, :NUTS, :velocity, 4, InverseAlgos.Samplers.AvEpsFromHMCIter(20)), nothing, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  4991, 4992, 4993, 4994, 4995, 4996, 4997, 4998, 4999, 5000], [1, 5, 10, 15, 20, 25, 30, 35, 40, 45  …  4955, 4960, 4965, 4970, 4975, 4980, 4985, 4990, 4995, 5000], [-0.1044495701789856 -1.764655590057373 … 1.406533122062683 -1.1449445486068726; -3.0483505725860596 0.5314836502075195 … 0.40199315547943115 -0.662078320980072], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  4979, 4980, 4981, 4982, 4983, 4984, 4985, 4986, 4987, 4988], [1.6434953900843223, 0.7012685286157666, 0.7779729684158949, 1.3289841553625046, 1.549587011302618, 1.0548051449225435, 0.18713679810924397, 0.19402380658466514, 0.09633713916116789, 0.8985725779725054  …  2.5108994438767533, 4.124833089087697, 3.545310988127975, 4.040799714817977, 0.549642413815707, 0.46584474271507065, 0.8576116177260584, 0.2552081779580121, 0.2807796971915875, 0.6103042942981712], [1.6434953900843223, 0.7012685286157666, 0.7779729684158949, 1.3289841553625046, 1.549587011302618, 1.0548051449225435, 0.18713679810924397, 0.19402380658466514, 0.09633713916116789, 0.8985725779725054  …  2.5108994438767533, 4.124833089087697, 3.545310988127975, 4.040799714817977, 0.549642413815707, 0.46584474271507065, 0.8576116177260584, 0.2552081779580121, 0.2807796971915875, 0.6103042942981712], nothing, [1.6434953900843223, 1.549587011302618, 0.8985725779725054, 1.4827853449625936, 0.011729589443318002, 0.28072392733654367, 0.09710854413313694, 0.2473368018545581, 0.34884191212782456, 0.4710426851037893  …  0.014478257628629565, 0.8141934667872358, 0.42728010032587227, 0.29201248888307707, 0.708538126565455, 2.7833661331142645, 2.4291567115458017, 0.8037153084158187, 0.549642413815707, 0.6103042942981712], Dict{Any, Any}("epsilonbar" => [0.1271875, 0.12941328125, 0.131678013671875, 0.13398237891113282, 0.13632707054207766, 0.13871279427656402, 0.1411402681764039, 0.14361022286949096, 0.14612340176970706, 0.14868056130067694  …  0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25], "treeheight" => [2, 4, 4, 3, 2, 4, 3, 4, 4, 4  …  3, 3, 1, 4, 2, 1, 3, 2, 3, 2], "lfsteps" => [8, 32, 32, 16, 8, 32, 16, 32, 32, 32  …  16, 16, 4, 32, 8, 4, 16, 8, 16, 8], "epsilon" => [0.1271875, 0.12941328125, 0.131678013671875, 0.13398237891113282, 0.13632707054207766, 0.13871279427656402, 0.1411402681764039, 0.14361022286949096, 0.14612340176970706, 0.14868056130067694  …  0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25], "accepted_at_lfstep" => [8, 16, 32, 16, 8, 16, 8, 16, 16, 16  …  16, 16, 4, 16, 8, 4, 16, 4, 16, 4]))

and plot some results

userprob = outhmc.userprob

mods = outhmc.mods
iter = outhmc.iter
ucur = outhmc.ucur

fig = Figure()

ax1 = Axis(fig[1:2,1],xlabel="x",ylabel="y")
scatter!(ax1,mods[1,:],mods[2,:])
scatter!(ax1,mea[1],mea[2],color="red")

ax2 = Axis(fig[3,1],xlabel="iteration",ylabel="Pot. energy U")
lines!(ax2,iter,ucur)

fig
[Figure: NUTS samples scattered in the (x,y) plane with the target mean in red (top), and potential energy U versus iteration (bottom).]

Example 3: Sampling a 2D Gaussian with the Extended Metropolis algorithm

In this example we aim at sampling a 2D Gaussian with the Extended Metropolis (E-M) algorithm, this time also employing a prior model. First we load the needed packages

using LinearAlgebra
using InverseAlgos.Samplers  # sampling algos
using CairoMakie  # for plotting

We then set up the problem by creating two structures, one for the prior (GaussPrior) and one for the likelihood (GaussLikelihood). The algorithm requires a function to draw samples from the prior, since the prior is used to create the new proposed state. We implement that in the struct GaussPrior as shown below

struct GaussPrior
    mea::Vector
    Lchol::AbstractMatrix

    function GaussPrior(mea,Lchol)
        # Lchol is the lower triangular Cholesky factor of the prior covariance
        @assert istril(Lchol)
        return new(mea,Lchol)
    end
end

function (gp::GaussPrior)(x,kind)
   if kind==:drawsample
       # draw sample
       zdev = randn(size(vec(x)))
       sa = gp.Lchol * zdev + gp.mea
       return sa
   else
       error("Wrong argument 'kind'")
   end
end

Next we define the likelihood with the struct GaussLikelihood. In this case we only need a function to compute the value of the negative logarithm of the likelihood PDF; no derivatives are required by this algorithm.

struct GaussLikelihood
    mean::Vector
    invcov::Matrix

    function GaussLikelihood(mean,invcov)
        @assert isposdef(invcov)
        @assert issymmetric(invcov)
        return new(mean,invcov)
    end
end

function (gp::GaussLikelihood)(x,kind)
   if kind==:nlogpdf
	   # negative logarithm of the pdf
       dif = x .- gp.mean
       pd = 1/2 * dot(dif, (gp.invcov * dif))
       return pd
   else
       error("Wrong argument 'kind'")
   end
end

The prior and likelihood can now be instantiated once the means and covariance matrices are set.

mea1 = [-0.1,-0.5]
Cm = [2.5 0.0; 0.0 3.0]
Lchol = cholesky(Cm).L
prior = GaussPrior(mea1,Lchol) # instantiate a GaussPrior struct

mea2 = [0.2,-0.3] # the mean value
invcov = inv([1.5 0.2; 0.2 2.3]) # the inverse covariance matrix
likelihood = GaussLikelihood(mea2,invcov) # instantiate a GaussLikelihood struct
Main.GaussLikelihood([0.2, -0.3], [0.6744868035190615 -0.05865102639296188; -0.05865102639296188 0.43988269794721413])
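
As a quick check, a sample from the prior can be drawn directly through the functor (the input vector is only used to determine the dimension here):

m_draw = prior(zeros(2), :drawsample)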

The starting model is set to

mstart = [5.0,-3.0]
2-element Vector{Float64}:
  5.0
 -3.0

Now, the structure ExtMetropParams for the E-M algorithm needs to be instantiated. In this case, this structure requires no parameters to be specified.

mcpars = ExtMetropParams()
InverseAlgos.Samplers.ExtMetropParams(:ExtMetrop)

Finally, the generic parameters for the simulation are specified

pars = Dict()
pars["simname"] = "metrop1" # simulation name
pars["maxiter"] = 5000      # maximum number of iterations
pars["outdir"] = "results"  # output directory
pars["saveevery"] = 5       # save every 2 models to disk, e.g., thinning the chain
pars["infoevery"] = 500     # print info every 10 iterations

At this point, we can run the simulation

runMC(pars,mcpars,mstart,likelihood=likelihood,prior=prior)

 ====================================
     Extended Metropolis algo
 ====================================
 Running simulation metrop1
 Initial -log(likelihood): 10.134
 Iter.#: 1  Acc.: 100.00%  Ucur: 5.781  ELA: 0h:0m:0s  ETA: 0h:2m:2s    
 Iter.#: 500  Acc.: 58.40%  Ucur: 0.723  ELA: 0h:0m:0s  ETA: 0h:0m:0s    
 Iter.#: 1000  Acc.: 58.90%  Ucur: 1.379  ELA: 0h:0m:0s  ETA: 0h:0m:1s    
 Iter.#: 1500  Acc.: 57.53%  Ucur: 0.512  ELA: 0h:0m:0s  ETA: 0h:0m:1s    
 Iter.#: 2000  Acc.: 56.80%  Ucur: 0.008  ELA: 0h:0m:1s  ETA: 0h:0m:2s    
 Iter.#: 2500  Acc.: 57.64%  Ucur: 0.090  ELA: 0h:0m:2s  ETA: 0h:0m:2s    
 Iter.#: 3000  Acc.: 57.10%  Ucur: 0.201  ELA: 0h:0m:3s  ETA: 0h:0m:2s    
 Iter.#: 3500  Acc.: 56.91%  Ucur: 1.295  ELA: 0h:0m:5s  ETA: 0h:0m:2s    
 Iter.#: 4000  Acc.: 56.88%  Ucur: 1.169  ELA: 0h:0m:6s  ETA: 0h:0m:1s    
 Iter.#: 4500  Acc.: 56.49%  Ucur: 0.236  ELA: 0h:0m:8s  ETA: 0h:0m:0s    
 Iter.#: 5000  Acc.: 56.38%  Ucur: 0.209  ELA: 0h:0m:10s  ETA: 0h:0m:0s
  End of main loop. Saved 1001 models.
 ====================================

then collect the output

outmc = readMCoutput("results","metrop1")
InverseAlgos.Samplers.ExtMetropOutput("results", "metrop1", [5.0, -3.0], "8736d43e-0e20-11f0-3e4c-591ae7cd7ccf", InverseAlgos.Samplers.NLogPostPDF(Reconstruct@GaussLikelihood(Any[[0.2, -0.3], [0.6744868035190615 -0.05865102639296188; -0.05865102639296188 0.43988269794721413]]), Reconstruct@GaussPrior(Any[[-0.1, -0.5], [1.5811388300841898 0.0; 0.0 1.7320508075688772]]), [0.0, 0.0]), InverseAlgos.Samplers.ExtMetropParams(:ExtMetrop), nothing, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  4991, 4992, 4993, 4994, 4995, 4996, 4997, 4998, 4999, 5000], [1, 5, 10, 15, 20, 25, 30, 35, 40, 45  …  4955, 4960, 4965, 4970, 4975, 4980, 4985, 4990, 4995, 5000], [2.1365914344787598 -1.1169803142547607 … 1.3471001386642456 -0.4386489689350128; 4.496926784515381 0.19289202988147736 … -0.5728082656860352 -0.9600436687469482], [1, 2, 3, 4, 5, 6, 6, 7, 8, 9  …  2814, 2815, 2815, 2816, 2817, 2817, 2818, 2819, 2819, 2819], [5.780905587436436, 0.22149493297657263, 1.4491273891142447, 1.0059542343428132, 0.6764326590282903, 0.3007004137205439, 0.3007004137205439, 0.5956480114624766, 0.8829510109211548, 0.9417911279108577  …  2.3099212791429373, 1.1994139517656408, 1.1994139517656408, 0.49150988817499386, 0.4784810871154519, 0.4784810871154519, 0.5132085513317033, 0.2086479003402959, 0.2086479003402959, 0.2086479003402959], [5.780905587436436, 0.6764326590282903, 0.9417911279108577, 0.9542614770064086, 0.06191632578107316, 0.002379973891263248, 0.018836665251017466, 0.09121208226695005, 0.4501526543365034, 0.3834695987272667  …  0.3794756217020035, 1.0914692916248547, 0.3390164823915637, 2.174055837152103, 0.5910541052613437, 0.7351791474412568, 0.3700465534605132, 0.7852837484309798, 0.4784810871154519, 0.2086479003402959])

And finally plot some results

userprob = outmc.userprob

mods = outmc.mods
iter = outmc.iter
nllk = outmc.nllk

fig = Figure()

ax1 = Axis(fig[1:2,1],xlabel="x",ylabel="y")
scatter!(ax1,mods[1,:],mods[2,:])

ax2 = Axis(fig[3,1],xlabel="iteration",ylabel="neg. log-likelihood")
lines!(ax2,iter,nllk)

fig
[Figure: Extended Metropolis samples scattered in the (x,y) plane (top), and negative log-likelihood versus iteration (bottom).]

Public API

InverseAlgos.Samplers.runMC (Function)
runMC(
    pars::Dict,
    MCpar::InverseAlgos.Samplers.AbstractMCParams,
    mstart::Vector{Float64};
    likelihood,
    prior
)

Start a Monte Carlo (MC) simulation.

Arguments

  • pars: a dictionary of parameters for the MC run, namely

    • pars["maxiter"]: maximum number of iterations
    • pars["saveevery"]: every how many HMC iterations save models to file
    • pars["simname"]: name of the simulation, used also for the name of the output files
    • pars["outdir"]: name of output directory, where to save results of the simulation
    • pars["infoevery"]: every how many HMC iterations to print info
    • pars["savecalcdata"]: whether to save calculated data or not (default is false)
    • pars["stdout"]: where to print the info messages. Can be
      • nothing (default) to print to standard output in the terminal
      • "file" to print to a file
      • "oneline" to print to standard output updating only a single line (useful in Jupyter notebooks).
  • MCpar: an AbstractMCParams, containing all MC-related parameters, such as the mass matrix, the constraints, and the time step and number of iterations for the Hamiltonian dynamics. See, e.g., HMCParams, NUTSParams, or ExtMetropParams.

  • mstart: the starting model, a vector (Vector{Float64})

  • likelihood: the user-defined struct/type representing the pdf for the likelihood functional

  • prior: the user-defined struct/type representing the pdf of the prior model

InverseAlgos.Samplers.contrunMC (Function)
contrunMC(pars::Dict)
contrunMC(
    pars::Dict,
    MCpar::Union{Nothing, InverseAlgos.Samplers.AbstractMCParams}
)

Continue an old MC simulation run, either with the given pars only, using the old MCParams, or providing both pars and a new MCParams, i.e., new simulation input parameters. In the first case, the input MCParams are read directly from the .jld2 file.

Arguments

  • pars: a dictionary of parameters for the MC run, namely

    • pars["maxiter"] maximum number of iterations
    • pars["saveevery"] every how many HMC iterations save models to file
    • pars["simname"] name of the simulation, used also for the name of the output files
    • pars["outdir"] name of output directory, where to save results of the simulation
    • pars["infoevery"] every how many MC iterations to print info
    • pars["savecalcdata"] whether to save calculated data or not (default is false)
  • MCpar: an AbstractMCParams, containing all MC-related parameters for the type of simulation requested. For instance, when using the Hamiltonian Monte Carlo algorithm, it includes the mass matrix, the constraints, and the time step and number of iterations for the Hamiltonian dynamics. See, e.g., HMCParams, NUTSParams, or ExtMetropParams.

InverseAlgos.Samplers.HMCParams (Type)
struct HMCParams <: InverseAlgos.Samplers.AbstractHMCParams

Structure containing the parameters for the plain HMC algorithm.

Fields

  • parallelKinEn::Bool: perform parallel calculations of kinetic energy by first reading the mass matrix, etc. from an HDF5 file

  • massMfile::String: mass matrix file name

  • invmassM::Union{AbstractMatrix{Float64}, Vector{Distributed.Future}}: Inverse of the mass matrix

  • LcholmassM::Union{LinearAlgebra.LowerTriangular{Float64, Matrix{Float64}}, LinearAlgebra.Diagonal{Float64, V} where V<:AbstractVector{Float64}, Vector{Distributed.Future}}: Lower triangular matrix from Cholesky decomposition of mass matrix

  • constrained::Bool: is this a constrained HMC simulation?

  • mlowerconstr::Union{Nothing, Vector{Float64}}: lower constraints

  • mupperconstr::Union{Nothing, Vector{Float64}}: upper constraints

  • ϵ::Float64: time step for the leap-frog algorithm (Hamiltonian dynamics)

  • L::Int64: number of iterations for the leap-frog algorithm (Hamiltonian dynamics)

  • numwks::Int64: number of workers (cores) for parallel calculations

  • algo::Symbol: which algorithm

InverseAlgos.Samplers.NUTSParams (Type)
struct NUTSParams <: InverseAlgos.Samplers.AbstractHMCParams

Structure containing the parameters for the NUTS algorithm.

Fields

  • parallelKinEn::Bool: perform parallel calculations of kinetic energy by first reading the mass matrix, etc. from an HDF5 file

  • massMfile::String: Mass matrix file name

  • invmassM::Union{AbstractMatrix{Float64}, Vector{Distributed.Future}}: Inverse of the mass matrix

  • LcholmassM::Union{LinearAlgebra.LowerTriangular{Float64, Matrix{Float64}}, LinearAlgebra.Diagonal{Float64, V} where V<:AbstractVector{Float64}, Vector{Distributed.Future}}: Lower triangular matrix from Cholesky decomposition of mass matrix

  • constrained::Bool: Is this a constrained HMC simulation?

  • mlowerconstr::Union{Nothing, Vector{Float64}}: Lower constraints

  • mupperconstr::Union{Nothing, Vector{Float64}}: Upper constraints

  • δ::Float64: Requested average acceptance rate

  • Δmax::Float64: Maximum acceptable error for the drift of the Hamiltonian

  • niteradapt::Int64: Number of iterations for the adaptation of ϵ

  • ϵmax::Float64: Maximum value allowed for ϵ

  • numwks::Int64: Number of workers (cores) for parallel calculations

  • algo::Symbol: Which algorithm

  • whichleapfrog::Symbol: Which leap-frog algorithm to use, either :velocity or :position

  • maxtreeheight::Int64: Maximum height of the balanced tree [j>=0], defaults to 4 (=31 leapfrog iterations). Number of leap-frog iterations = 2 * 2^j -1.

  • adaptepslf::InverseAlgos.Samplers.AdaptEpsLF: Specifies the way to perform the adaptation of the step-size ϵ for the leap-frog integration. Can be either of type

          - AvEpsFromTree, which requires no parameters (e.g., `adaptepslf=AvEpsFromTree()`) and implements the dual averaging scheme of NUTS, or
          - AvEpsFromHMCIter, which requires the number of iterations to average on as a parameter (e.g., `adaptepslf=AvEpsFromHMCIter(50)`) and implements a custom scheme based on the iterations of the chain.
    
    Defaults to AvEpsFromHMCIter(20)

InverseAlgos.Samplers.ExtMetropParams (Type)
struct ExtMetropParams <: InverseAlgos.Samplers.AbstractMeHaParams

Structure containing the parameters for the Extended Metropolis algorithm.

Fields

  • algo::Symbol: Which algorithm

InverseAlgos.Samplers.clear_stflags_hdf5 (Function)
clear_stflags_hdf5(flname::String)

An unsafe but useful way to clear the status flags in the HDF5 superblock in case the HDF5 file was not closed properly. That can happen when a simulation is suddenly interrupted before reaching the maximum number of iterations.

InverseAlgos.Samplers.readMCoutput (Function)
readMCoutput(
    datadir::String,
    simname::String;
    istart,
    iend
) -> Union{InverseAlgos.Samplers.ExtMetropOutput, InverseAlgos.Samplers.HMCOutput}

Read the full output of a Monte Carlo simulation and return it in an appropriate structure (depending on the algorithm used).


Reference, functions not exported

InverseAlgos.Samplers.ExtMetropOutputType
struct ExtMetropOutput

Structure containing all the output saved from an Extended Metropolis simulation.

Fields

  • datadir::String: Directory containing the output files

  • simname::String: Name of the simulation (i.e., pars["simname"])

  • startingmodel::Vector{Float64}: Starting model

  • simulid::String: Unique identifier of the simulation

  • userprob::InverseAlgos.Samplers.NLogPostPDF: Struct containing all the parameters of the problem

  • mcparams::InverseAlgos.Samplers.ExtMetropParams: Struct containing all the parameters for the MC simulation

  • refmod::Union{Nothing, Vector{Float64}}: If running a synthetic test, this can hold the 'target' model.

  • iter::Vector{Int64}: Array of iteration numbers

  • iter_m::Vector{Int64}: Array of iteration numbers corresponding to each saved model

  • mods::Matrix{Float64}: Saved models

  • nacc::Vector{Int64}: Number of accepted models at each iteration

  • nllk::Vector{Float64}: Negative log-likelihood

  • nllk_m::Vector{Float64}: Negative log-likelihood corresponding to each saved model

InverseAlgos.Samplers.HMCOutput (Type)
struct HMCOutput

Structure containing all the output saved from an HMC simulation.

Fields

  • datadir::String: Directory containing the output files

  • simname::String: Name of the simulation (i.e., pars["simname"])

  • startingmodel::Vector{Float64}: Starting model

  • simulid::String: Unique identifier of the simulation

  • userprob::InverseAlgos.Samplers.NLogPostPDF: Struct containing all the parameters of the problem

  • hmcparams::InverseAlgos.Samplers.AbstractHMCParams: Struct containing all the parameters for the MC simulation

  • refmod::Union{Nothing, Vector{Float64}}: If running a synthetic test, this can hold the 'target' model.

  • iter::Vector{Int64}: Array of iteration numbers

  • iter_m::Vector{Int64}: Array of iteration numbers corresponding to each saved model

  • mods::Matrix{Float64}: Saved models

  • nacc::Vector{Int64}: Number of accepted models at each iteration

  • ucur::Vector{Float64}: Potential energy (= misfit value)

  • likelihoodval::Union{Nothing, Vector{Float64}, Vector{Union{Missing, Float64}}}: Likelihood value

  • priorval::Union{Nothing, Vector{Float64}, Vector{Union{Missing, Float64}}}: Prior value

  • ucur_m::Vector{Float64}: Potential energy (= misfit value) corresponding to each saved model

  • stats::Union{Nothing, Dict}: Statistics about the simulation. They depend on the algorithm used for the simulation.

InverseAlgos.Samplers.NLogPostPDF (Type)
struct NLogPostPDF

Structure representing the negative logarithm of the posterior pdf. It is split into the two fields 'likelihood' and 'prior'.

The NLogPostPDF struct is also made callable for computing the negative logarithm of the posterior pdf and its gradient with respect to the model parameters.

Fields

  • likelihood::Any: likelihood functional

  • prior::Any: prior model

  • curvals::Union{Vector{Union{Missing, Float64}}, Vector{Float64}}: last value of likelihood and prior

InverseAlgos.Samplers.NUTSstats (Type)
mutable struct NUTSstats

Structure containing the statistics for some parameters.

Fields

  • ϵ::Float64

  • treeheight::Int64

  • ϵbar::Float64

  • lfsteps::Int64

  • accatlfstep::Int64

InverseAlgos.Samplers.calcK (Method)
calcK(
    HMCpar::InverseAlgos.Samplers.AbstractHMCParams,
    p::Vector{Float64},
    grptask::Matrix{Int64}
) -> Float64

Calculate the value of the kinetic energy.

InverseAlgos.Samplers.calcU (Method)
calcU(
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    m::Vector{Float64}
) -> Float64

Calculate the value of the potential energy (-log(PPD)).

InverseAlgos.Samplers.draw_p (Method)
draw_p(
    HMCpar::InverseAlgos.Samplers.AbstractHMCParams,
    npts::Int64,
    grptask::Matrix{Int64}
) -> Vector{Float64}

Draw values for momentum from its Gaussian distribution.

InverseAlgos.Samplers.gradK (Method)
gradK(
    HMCpar::InverseAlgos.Samplers.AbstractHMCParams,
    p::Vector{Float64},
    grptask::Matrix{Int64}
) -> Vector{Float64}

Compute the gradient of the kinetic energy.

InverseAlgos.Samplers.gradU (Method)
gradU(
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    m::Vector{Float64}
) -> Vector{Float64}

Compute the gradient of the potential energy.

InverseAlgos.Samplers.initsavestuff! (Method)
initsavestuff!(
    pars::Dict,
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    MCpar::InverseAlgos.Samplers.AbstractMCParams,
    mstart::Vector{Float64}
) -> Union{Tuple{HDF5.File, Any, Base.RefValue{Int64}, Any, Any}, Tuple{HDF5.File, Any, Base.RefValue{Int64}, Any, Any, Any}}

Save the initial input parameters and model to .jld2 and HDF5 files, or read values from a previous simulation. Uses the SWMR mode of HDF5.

InverseAlgos.Samplers.oneiterNUTS! (Method)
oneiterNUTS!(
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    HMCpar::InverseAlgos.Samplers.NUTSParams,
    NUTSextrapar::InverseAlgos.Samplers.NUTSExtraParams,
    mcur::Vector{Float64},
    grptask::Matrix{Int64},
    curiter::Integer,
    nustats::InverseAlgos.Samplers.NUTSstats,
    naccvec::Vector{<:Integer}
) -> Tuple{Vector{Float64}, Bool, Float64, Union{Vector{Union{Missing, Float64}}, Vector{Float64}}}

A single iteration of the NUTS algorithm, where a given number of leap-frog iterations are performed.

InverseAlgos.Samplers.oneiterNUTS_polyg! (Method)
oneiterNUTS_polyg!(
    usermodel::InverseAlgos.Samplers.NLogPostPDF,
    HMCpar::InverseAlgos.Samplers.NUTSParams,
    NUTSextrapar::InverseAlgos.Samplers.NUTSExtraParams,
    mcur::Vector{Float64},
    grptask::Matrix{Int64},
    curiter::Integer,
    nustats::InverseAlgos.Samplers.NUTSstats,
    naccvec::Vector{<:Integer}
)

A single iteration of the NUTS algorithm specific for potential fields problems with polygonal parameterization, where a given number of leap-frog iterations are performed.

InverseAlgos.Samplers.oneiterextmetrop (Method)
oneiterextmetrop(
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    mcur::Vector{Float64},
    nllkcur::Float64
) -> Tuple{Any, Bool, Any}

A single iteration of the Extended Metropolis algorithm.

InverseAlgos.Samplers.oneiterhmc! (Method)
oneiterhmc!(
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    HMCpar::InverseAlgos.Samplers.HMCParams,
    mcur::Vector{Float64},
    Ucur::Float64,
    grptask::Matrix{Int64},
    llkprivals::Vector{Union{Missing, Float64}};
    curiter
)

A single iteration of the HMC algorithm, where a given number of leap-frog iterations are performed.

InverseAlgos.Samplers.oneiterhmc_polyg (Method)
oneiterhmc_polyg(
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    HMCpar::InverseAlgos.Samplers.HMCParams,
    mcur::Vector{Float64},
    Ucur::Float64,
    grptask::Matrix{Int64}
) -> Tuple{Vector{Float64}, Bool, Float64, Union{Vector{Union{Missing, Float64}}, Vector{Float64}}}

A single iteration of the HMC algorithm, where a given number of leap-frog iterations are performed. (This version: forward problem-related checks during HMC leapfrog iterations.)

InverseAlgos.Samplers.paraloadmassM! (Method)
paraloadmassM!(
    HMCpar::InverseAlgos.Samplers.AbstractHMCParams
) -> Matrix{Int64}

Load the mass matrix (lower triangular Cholesky factor) and its inverse from file, in parallel.

InverseAlgos.Samplers.paramatvec (Method)
paramatvec(
    Amat::Vector{Distributed.Future},
    p::Vector{Float64},
    grptask::Matrix{Int64}
) -> Any

Calculate a matrix-vector product in parallel.

InverseAlgos.Samplers.readExtMetropoutputH5 (Method)
readExtMetropoutputH5(
    datadir::String,
    simname::String;
    istart,
    iend
) -> InverseAlgos.Samplers.ExtMetropOutput

Read the full output of an Extended Metropolis simulation and return it in an ExtMetropOutput structure.

InverseAlgos.Samplers.readHMCoutputH5 (Method)
readHMCoutputH5(
    datadir::String,
    simname::String;
    istart,
    iend
) -> InverseAlgos.Samplers.HMCOutput

Read the full output of an HMC simulation and return it in an HMCOutput structure.

InverseAlgos.Samplers.runsimulNUTS (Method)
runsimulNUTS(
    pars::Dict,
    HMCpar::InverseAlgos.Samplers.NUTSParams,
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    mstart::Vector{Float64}
)

The actual function starting a NUTS simulation. See runMC for arguments explanation.

InverseAlgos.Samplers.runsimulextmetrop (Method)
runsimulextmetrop(
    pars::Dict,
    MHpar::InverseAlgos.Samplers.ExtMetropParams,
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    mstart::Vector{Float64}
)

The actual function starting an Extended Metropolis simulation. See runMC for arguments explanation.

InverseAlgos.Samplers.runsimulhmc (Method)
runsimulhmc(
    pars::Dict,
    HMCpar::InverseAlgos.Samplers.HMCParams,
    nlogppd::InverseAlgos.Samplers.NLogPostPDF,
    mstart::Vector{Float64}
)

The actual function starting an HMC simulation. See runMC for arguments explanation.

InverseAlgos.Samplers.savestuff! (Method)
savestuff!(
    fid_h5::HDF5.File,
    ntobesaved::Int64,
    pars::Dict,
    nsaved::Base.RefValue{Int64},
    it::Int64,
    m::Vector{Float64},
    nacc::Int64,
    ucurval::Float64;
    llkprival,
    algo,
    nustats,
    fwdmod
)

Save models and parameters in a HDF5 file while the HMC is running. Uses the SWMR mode of HDF5.

  • [1] Tarantola, A., & Valette, B. (1982). Inverse problems = quest for information. J. Geophys., 50, 159–170.
  • [2] Mosegaard, K., & Tarantola, A. (1995). Monte Carlo sampling of solutions to inverse problems. Journal of Geophysical Research: Solid Earth, 100(B7), 12431–12447. doi: 10.1029/94JB03097
  • [3] Tarantola, A. (2005). Inverse Problem Theory and Methods for Model Parameter Estimation. Society for Industrial and Applied Mathematics. doi: 10.1137/1.9780898717921
  • [4] Neal, R. (2012). MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo. doi: 10.1201/b10905-6
  • [5] Fichtner, A., Zunino, A., & Gebraad, L. (2019). Hamiltonian Monte Carlo solution of tomographic inverse problems. Geophysical Journal International, 216(2), 1344–1363. doi: 10.1093/gji/ggy496
  • [6] Simo, J. C., Tarnow, N., & Wong, K. K. (1992). Exact energy-momentum conserving algorithms and symplectic schemes for nonlinear dynamics. Computer Methods in Applied Mechanics and Engineering, 100(1), 63–116. doi: 10.1016/0045-7825(92)90115-Z
  • [7] Fichtner, A., Zunino, A., Gebraad, L., & Boehm, C. (2021). Autotuning Hamiltonian Monte Carlo for efficient generalized nullspace exploration. Geophysical Journal International, 227(2), 941–968. doi: 10.1093/gji/ggab270
  • [8] Hoffman, M. D., & Gelman, A. (2014). The No-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. The Journal of Machine Learning Research, 15(1), 1593–1623.
  • [9] Mosegaard, K., & Tarantola, A. (1995). Monte Carlo sampling of solutions to inverse problems. Journal of Geophysical Research: Solid Earth, 100(B7), 12431–12447. doi: 10.1029/94JB03097