Almost Distributions

7 October 2021 by Remco Bouckaert

For complex analyses with a large number of priors, it can be hard to find a starting state that satisfies all constraints. When one or more of the priors returns negative infinity at the start of the MCMC run, BEAST will not start.

The AlmostDistributions package contains a number of distributions that behave mostly like standard distributions but are more lenient towards the constraints: when a value is outside the range of support, they return a large penalty value instead of negative infinity. This can sometimes help a run take off. Over time, the MCMC will move parameters or tree clades into range.

AlmostUniform

The AlmostUniform distribution is a parametric distribution that can be used as an alternative to the Uniform distribution in parameter priors or for node age calibrations. For values between the lower and upper bound, the AlmostUniform distribution behaves the same as the Uniform distribution. For values outside that range, where the Uniform distribution returns negative infinity for the log-density, the AlmostUniform distribution returns a finite but very low log-density. This penalty is chosen to encourage the value to move towards the range between the lower and upper bound when using MCMC, and is calculated as

-penalty * (x-centre) * (x-centre)

where penalty is a user-defined penalty value (default 10000), centre is the middle of the interval defined by the lower and upper bound, and x is the value for which the log-density is calculated. Initially, the value of x may be outside the range, but during the MCMC it will move towards the desired bounds. Once x is between the lower and upper bound, it is practically impossible for it to escape from the desired range (though in theory it can still happen).
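To illustrate why a chain that starts out of range drifts into it and then stays there, here is a minimal sketch of a random-walk Metropolis sampler on the penalised log-density above. It is not the package's code: the function names, the proposal window, the starting value and the bounds [0, 1] are all assumptions chosen for the example.

```python
import math
import random


def quadratic_penalty_log_density(x, lower=0.0, upper=1.0, penalty=10000.0):
    """Sketch of the log-density described above, for finite bounds only."""
    if lower <= x <= upper:
        return -math.log(upper - lower)  # same as the Uniform distribution
    centre = (lower + upper) / 2.0
    return -penalty * (x - centre) * (x - centre)


def run_chain(x, steps=1000, window=0.5, seed=1):
    """Random-walk Metropolis sampler (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    log_p = quadratic_penalty_log_density(x)
    for _ in range(steps):
        proposal = x + rng.uniform(-window, window)
        log_p_new = quadratic_penalty_log_density(proposal)
        log_ratio = log_p_new - log_p
        # Uphill moves are always accepted; downhill moves are accepted
        # with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x, log_p = proposal, log_p_new
    return x


final_x = run_chain(5.0)  # start well outside [0, 1]
```

With a Uniform prior the starting value 5.0 would have log-density negative infinity. Here it has the finite value -10000 * 4.5 * 4.5, so proposals that move closer to the centre are accepted. Once the chain is inside [0, 1], any proposal outside it lowers the log-density by at least 2500, so its acceptance probability is effectively zero.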

• The AlmostUniform distribution can handle bounds that are infinite, but in that case the centre will be defined as the bound that is finite.

• The cumulativeProbability and inverseCumulativeProbability methods return the same values as for the Uniform distribution.
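The handling of infinite bounds can be sketched as follows. This is an illustration rather than the package's implementation: the function names are hypothetical, and the log-density of 0 for an in-range value under an infinitely wide (improper) uniform is an assumption made here.

```python
import math


def penalty_centre(lower, upper):
    """Point that out-of-range values are pulled towards."""
    if math.isinf(lower):
        return upper  # only the upper bound is finite
    if math.isinf(upper):
        return lower  # only the lower bound is finite
    return (lower + upper) / 2.0


def almost_uniform_log_density(x, lower, upper, penalty=10000.0):
    if lower <= x <= upper:
        width = upper - lower
        # Assumption: an improper uniform contributes log-density 0 in range.
        return 0.0 if math.isinf(width) else -math.log(width)
    # If both bounds are infinite, no finite x ever reaches this branch.
    centre = penalty_centre(lower, upper)
    return -penalty * (x - centre) * (x - centre)
```

For example, with lower = 0 and an infinite upper bound, x = -2 is penalised relative to the finite bound 0, giving -10000 * 2 * 2.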

AlmostNormal

The AlmostNormal distribution is a parametric distribution that behaves like the Normal distribution when x is in the 95% HPD interval of the normal distribution, taken here as within two standard deviations of the mean. For values of x more than two standard deviations from the mean, the logdensity method returns a penalty similar to that of the AlmostUniform distribution, but with the mean of the normal distribution as centre.

• The cumulativeProbability method returns 0 for x more than two standard deviations below the mean, and 1 for x more than two standard deviations above the mean. In between, it returns the cumulative probability of the normal, rescaled so that it is 0 at mean-2*sigma and 1 at mean+2*sigma.

• The inverseCumulativeProbability method only returns values between mean-2*sigma and mean+2*sigma.
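The three methods described above can be sketched like this. The function names are hypothetical, and the sketch assumes that the in-range log-density is the plain normal log-density, that the penalty is quadratic with a default of 10000 as for AlmostUniform, and that the inverse is the exact inverse of the rescaled cumulative probability. The package itself may differ in these details.

```python
import math
from statistics import NormalDist


def almost_normal_log_density(x, mean, sigma, penalty=10000.0):
    if abs(x - mean) <= 2.0 * sigma:
        z = (x - mean) / sigma
        return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    # Out of range: quadratic penalty centred on the mean.
    return -penalty * (x - mean) * (x - mean)


def almost_normal_cdf(x, mean, sigma):
    normal = NormalDist(mean, sigma)
    lo, hi = mean - 2.0 * sigma, mean + 2.0 * sigma
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    p_lo, p_hi = normal.cdf(lo), normal.cdf(hi)
    # Rescale so the result is 0 at mean-2*sigma and 1 at mean+2*sigma.
    return (normal.cdf(x) - p_lo) / (p_hi - p_lo)


def almost_normal_inverse_cdf(p, mean, sigma):
    normal = NormalDist(mean, sigma)
    p_lo = normal.cdf(mean - 2.0 * sigma)
    p_hi = normal.cdf(mean + 2.0 * sigma)
    # Map p in [0, 1] back onto the untruncated probability scale.
    return normal.inv_cdf(p_lo + p * (p_hi - p_lo))
```

Because the input to inv_cdf always lies between the probabilities at mean-2*sigma and mean+2*sigma, the returned value cannot fall outside that interval, apart from floating-point rounding at the endpoints.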

AlmostLogNormalDistributionModel

The AlmostLogNormalDistributionModel is a parametric distribution that behaves like the LogNormalDistributionModel, but only in the 0.025 to 0.975 probability range of the log normal with the same parameters. Outside that range, it returns a finite but very low log density, calculated as

-penalty * (lower - x)

when x is less than the lower bound of that range, and

-penalty * (x - upper)

when x is higher than the upper bound.
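A minimal sketch of this piecewise log-density is given below. It assumes that lower and upper are the 0.025 and 0.975 quantiles of the log normal, that the parameters m and s are the mean and standard deviation in log space, and that the penalty defaults to 10000. None of these details, nor the function name, is confirmed by the text above.

```python
import math
from statistics import NormalDist


def almost_lognormal_log_density(x, m, s, penalty=10000.0):
    """m and s are the mean and standard deviation of log(x) (an assumption)."""
    std_normal = NormalDist()
    # Quantiles of the log normal: exponentiate the normal quantiles.
    lower = math.exp(m + s * std_normal.inv_cdf(0.025))
    upper = math.exp(m + s * std_normal.inv_cdf(0.975))
    if x < lower:
        return -penalty * (lower - x)
    if x > upper:
        return -penalty * (x - upper)
    z = (math.log(x) - m) / s
    return -0.5 * z * z - math.log(x * s) - 0.5 * math.log(2.0 * math.pi)
```

Since the lower quantile is always positive, zero and negative values of x fall in the first penalty branch and receive a finite log density, whereas a plain log normal would return negative infinity.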

AlmostMRCAPrior

The AlmostMRCAPrior can be used as an alternative to the MRCAPrior, which allows you to make clades monophyletic. Unlike the MRCAPrior, the AlmostMRCAPrior does not return -Infinity when the clade is not monophyletic, but produces a large penalty value instead. This can be handy when there are many tree constraints and finding a valid starting tree is hard. A previous post on constraining trees already explains how to use the AlmostMRCAPrior.
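The difference between the strict and the lenient behaviour can be illustrated with a toy example. Everything here is hypothetical: trees are represented as nested tuples of taxon names, and the flat penalty of 10000 is an assumption made for the sketch, since how the package actually computes its penalty is not described above.

```python
def leaf_sets(tree):
    """Return the leaf set under every node; the last entry is the root's."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    sets = []
    leaves = frozenset()
    for child in tree:
        child_sets = leaf_sets(child)
        sets.extend(child_sets)
        leaves |= child_sets[-1]  # last entry is the child's own leaf set
    sets.append(leaves)
    return sets


def is_monophyletic(tree, clade):
    """A clade is monophyletic if some node has exactly these leaves."""
    return frozenset(clade) in leaf_sets(tree)


def strict_log_prior(tree, clade):
    return 0.0 if is_monophyletic(tree, clade) else float("-inf")


def almost_log_prior(tree, clade, penalty=10000.0):
    # Hypothetical flat penalty; the package's actual penalty may differ.
    return 0.0 if is_monophyletic(tree, clade) else -penalty


tree = ((("A", "B"), "C"), "D")
```

On this tree the clade {A, B} is monophyletic and both priors return 0. The clade {A, C} is not: the strict prior returns negative infinity, while the lenient one returns a finite penalty that still lets a run start from this tree.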

AlmostMultiMRCAPrior

The AlmostMultiMRCAPrior extends the MultiMRCAPriors, and can handle multiple monophyletic constraints. When one or more AlmostMRCAPriors are used instead of MRCAPriors, the associated clades may violate their monophyly constraints and the prior still returns a finite (though very low) log probability.

Reference

Remco Bouckaert. AlmostDistributions package for BEAST 2. 2021. DOI: 10.5281/zenodo.5348449