Adding calibrations by hand

5 April 2018 by Remco Bouckaert

Adding a time calibration is easiest in BEAUti, but some templates (such as the one for SNAPP) do not allow this. You can still add a calibration by hand by editing the XML in a text editor: add a prior on the most recent common ancestor (MRCA) of a set of taxa, that is, an MRCAPrior element.

1. Define an MRCAPrior

Add the following XML fragment just after the opening tag of the distribution element with id="prior", so that the new prior is nested inside it:
<distribution id="root.prior" spec='beast.math.distributions.MRCAPrior' tree='@Tree.snps' >
</distribution>
Replace the id of the tree, here Tree.snps, with the id of the tree that the calibration applies to. You can find the tree in the state element. Leave the @-sign in place; it is a shortcut for the more verbose <tree idref="Tree.snps"/>.
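
For comparison, the same fragment written without the @ shortcut might look as follows. This is only a sketch of the equivalent verbose form; Tree.snps is the example id used above and should be replaced with your own tree id.

```xml
<distribution id="root.prior" spec='beast.math.distributions.MRCAPrior'>
	<!-- verbose equivalent of the attribute tree='@Tree.snps' -->
	<tree idref="Tree.snps"/>
</distribution>
```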

2. Specify taxon set

The easiest calibration is one on the root, which can be done by adding a taxonset element that covers all taxa in the alignment to the MRCAPrior:
<distribution id="root.prior" spec='beast.math.distributions.MRCAPrior' tree='@Tree.snps' >
	<taxonset id="root.taxa" spec="beast.evolution.alignment.TaxonSet" alignment="@snap.snps"/>
</distribution>

You need to change the name of the alignment, here snap.snps. Replace it with the id of your alignment, which you can probably find at the top of the XML file if you used BEAUti to set up the XML. Again, leave the @-sign in place.

Instead of specifying the set of all taxa, which has the root as its MRCA, you can put a calibration on the MRCA of a subset of taxa. For this, you need to specify the taxon set by listing individual taxon names. For example, for a set of 4 taxa (TaxonA, TaxonB, TaxonC and TaxonD) we can use something like

<distribution id="ABCD.prior" spec='beast.math.distributions.MRCAPrior' tree='@Tree.snps'>
    <taxonset id="Clade-ABCD" spec="beast.evolution.alignment.TaxonSet">
    	<!-- TaxonA and TaxonB not defined anywhere else -->
        <taxon id="TaxonA" spec="beast.evolution.alignment.Taxon"/>
        <taxon id="TaxonB" spec="beast.evolution.alignment.Taxon"/>
        <!-- TaxonC and TaxonD already defined elsewhere -->
        <taxon idref="TaxonC"/> 
        <taxon idref="TaxonD"/>
    </taxonset>
</distribution>
Note the difference between TaxonA and TaxonC: TaxonA is not specified anywhere else in the XML, so it is declared here with an id, while TaxonC is assumed to be specified at some other point in the XML (for example, in the taxon set of another MRCAPrior), so it is referenced with an idref.

3. Add a distribution to the MRCAPrior

To specify an age distribution, you need to add a distr element to the MRCAPrior. For instance, adding a normal distribution with mean 4 and standard deviation 0.5 would look something like this:

<distribution id="root.prior" spec='beast.math.distributions.MRCAPrior' tree='@Tree.snps' >
	<taxonset id="root.taxa" spec="beast.evolution.alignment.TaxonSet" alignment="@snap.snps"/>
	<distr spec="beast.math.distributions.Normal" mean="4.0" sigma="0.5"/>
</distribution>

A few other distributions you can choose from are

<distr spec="beast.math.distributions.Uniform" lower="0.5" upper="2.0"/>
<distr spec="beast.math.distributions.LogNormalDistributionModel" meanInRealSpace="true" M="2.0" S="0.5"/>
<distr spec="beast.math.distributions.OneOnX"/>
<distr spec="beast.math.distributions.Exponential" mean="2.0"/>
<distr spec="beast.math.distributions.Gamma" alpha="0.001" beta="1000"/>
<distr spec="beast.math.distributions.InverseGamma" alpha="0.001" beta="1000"/>
All distributions also have an offset attribute, which shifts the distribution by the given value. For example,
<distr spec="beast.math.distributions.Normal" offset="2.0" mean="4.0" sigma="0.5"/>
shifts the distribution by 2, giving a normal distribution with a mean of 6 and a standard deviation of 0.5.
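
The offset can be handy for a calibration with a hard minimum age. As a sketch with purely illustrative values: an exponential with mean 1 shifted by 2 puts no probability on ages below 2 and has a mean age of 3.

```xml
<!-- illustrative values only: minimum age 2.0, mean age 2.0 + 1.0 = 3.0 -->
<distr spec="beast.math.distributions.Exponential" offset="2.0" mean="1.0"/>
```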

4. Set MRCAPrior attributes (optional)

The MRCAPrior has a few flags that can be set, namely

  • monophyletic: whether the taxon set is monophyletic (forms a clade without other taxa) or not. Default is false.
  • tipsonly: flag to indicate tip dates are to be used instead of the MRCA node. If set to true, the prior is applied to the height of all tips in the taxonset and the monophyletic flag is ignored. Default is false.
  • useOriginate: use the parent of the clade instead of the clade itself. Cannot be combined with tipsonly, or used on the root. Default is false.
Since the set of all taxa is always monophyletic, the monophyletic flag makes no difference for a root prior, but the ABCD.prior above can be forced to be monophyletic using:
<distribution id="ABCD.prior" monophyletic="true" spec='beast.math.distributions.MRCAPrior' tree='@Tree.snps'>
    <taxonset id="Clade-ABCD" spec="beast.evolution.alignment.TaxonSet">
    	<!-- TaxonA and TaxonB not defined anywhere else -->
        <taxon id="TaxonA" spec="beast.evolution.alignment.Taxon"/>
        <taxon id="TaxonB" spec="beast.evolution.alignment.Taxon"/>
        <!-- TaxonC and TaxonD already defined elsewhere -->
        <taxon idref="TaxonC"/> 
        <taxon idref="TaxonD"/>
    </taxonset>
</distribution>
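
Putting steps 2 to 4 together, a complete calibration on the ABCD clade could look something like the sketch below. The Normal distribution and its values are only placeholders taken from the root example; pick a distribution that fits your own calibration.

```xml
<distribution id="ABCD.prior" monophyletic="true" spec='beast.math.distributions.MRCAPrior' tree='@Tree.snps'>
    <taxonset id="Clade-ABCD" spec="beast.evolution.alignment.TaxonSet">
        <!-- TaxonA and TaxonB not defined anywhere else -->
        <taxon id="TaxonA" spec="beast.evolution.alignment.Taxon"/>
        <taxon id="TaxonB" spec="beast.evolution.alignment.Taxon"/>
        <!-- TaxonC and TaxonD already defined elsewhere -->
        <taxon idref="TaxonC"/>
        <taxon idref="TaxonD"/>
    </taxonset>
    <!-- placeholder age distribution: normal with mean 4 and standard deviation 0.5 -->
    <distr spec="beast.math.distributions.Normal" mean="4.0" sigma="0.5"/>
</distribution>
```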

5. Adding operators (optional)

When adding a time calibration, make sure the clock rate is estimated. If the clock rate is not estimated, BEAST estimates branch lengths in units of expected number of substitutions per site. Adding a time calibration means the units must have a time component, e.g. expected number of substitutions per site per million years.

If you do not already have a clock rate estimated, you need an additional scale operator for the mean clock rate as well as a prior on that clock rate.
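
As a rough sketch of what these additions might look like: the ids clockRate, clockRate.prior and clockRate.scaler are hypothetical, the exponential prior and the operator settings are illustrative choices rather than recommendations, and the spec names assume the BEAST 2 package layout used elsewhere in this post. Each fragment goes in a different part of the XML, as the comments indicate.

```xml
<!-- in the state element: the clock rate as an estimated parameter (hypothetical id) -->
<parameter id="clockRate" spec="beast.core.parameter.RealParameter" name="stateNode" lower="0.0">1.0</parameter>

<!-- inside the distribution with id="prior": a prior on the clock rate (illustrative choice) -->
<distribution id="clockRate.prior" spec="beast.math.distributions.Prior" x="@clockRate">
	<distr spec="beast.math.distributions.Exponential" mean="1.0"/>
</distribution>

<!-- next to the other operator elements: a scale operator on the clock rate -->
<operator id="clockRate.scaler" spec="beast.evolution.operators.ScaleOperator" parameter="@clockRate" scaleFactor="0.75" weight="3.0"/>
```

The clock model used by the likelihood has to refer to the same parameter (here @clockRate), otherwise the operator changes a value that has no effect on the analysis. If a clock rate parameter is already defined elsewhere in the XML, reuse its id instead of declaring a new one.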