Every self-respecting BEAST developer has encountered the problem of wanting to create a lot of XML files, for example for a simulation study, where using BEAUti just takes too long. It seems there are as many ways to generate XML as there are developers. Here is a non-exhaustive list, in no particular order, of things people have done in the past to generate XML.
Roll your own Java
There are a few classes in BEAST that can be useful for generating XML, in particular `beast.app.seqgen.SequenceSimulator` to simulate sequences on a tree, `beast.app.seqgen.MergeDataWith` to merge the generated alignment with an XML file, and `beast.util.XMLProducer`.
Use BEAST XML
You can use XML to generate new XML in BEAST, as explained here: simulation studies with BEAST 2. BEAST v2.5 (available soon) will have a command line interface to replace variables in the XML, allowing, say,
<run chainLength="$(chainLength)" ...
to be replaced by a number from the command line when running
beast -D "chainLength=1000000" file.xml
so that it is interpreted as
<run chainLength="1000000" ...
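To make the substitution concrete, here is a minimal Python sketch that performs the same kind of `$(name)` replacement outside BEAST, which can be handy for previewing what a template will turn into. It is an illustration only and not BEAST's own parser: the function name `fill_template` and the one-line template are hypothetical, and the sketch handles nothing beyond plain `$(name)` placeholders.

```python
import re

# Matches placeholders of the form $(name). This mirrors the syntax shown
# above; it is an illustration, not the parser BEAST uses.
PLACEHOLDER = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def fill_template(template, values):
    """Return template with every $(name) replaced by str(values[name])."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than leave a placeholder in the XML.
            raise KeyError("no value supplied for $(%s)" % name)
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical one-line template, reduced to the attribute discussed above.
template = '<run chainLength="$(chainLength)">'

# One filled-in variant per chain length.
variants = {
    n: fill_template(template, {"chainLength": n})
    for n in (1000000, 5000000)
}
```

Raising an error for a missing value is a design choice of this sketch: in a simulation study with many generated files, a silently unfilled placeholder is harder to track down than an early failure.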
Furthermore, since BEAST v2.6.3, you can store these values in an external file in JSON format. A JSON dictionary quite naturally defines name/value pairs, and allows values to span multiple lines. For example:
```
{
"sequences":"
```
where `...` means many more of the same. This allows the same analysis to be used with multiple data sets (or multiple analyses with the same data set), which can be handy for well-calibrated simulation studies or for situations where the data set evolves rapidly. It is particularly useful when you want to run the same analysis on multiple alignments.
The resulting XML, where user-defined parameters are replaced by the information from the JSON file, is by default written to a file with the same name as the input XML file, but with `.out` added before `.xml` (so input `beast.xml` becomes output `beast.out.xml`). The output file can be specified using the `-DFout` option, e.g.
```
beast -DF definitions.json -DFout result.xml beast.xml
```
If no output is desired, you can output to `/dev/null` using `-DFout /dev/null` on OS X and Linux, or `-DFout NUL` on Windows.
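For a simulation study, the JSON files and the matching command lines can be generated by a script. The following Python sketch writes one definitions file per replicate and builds, but does not run, the corresponding `beast -DF ... -DFout ...` command. The replicate names, the `chainLength` values and the helper names `write_definitions` and `beast_command` are assumptions made for illustration; only the `-DF` and `-DFout` options and the file name `beast.xml` come from the text above.

```python
import json
import tempfile
from pathlib import Path


def write_definitions(out_dir, name, definitions):
    """Write one JSON dictionary of name/value pairs; return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / (name + ".json")
    path.write_text(json.dumps(definitions, indent=2))
    return path


def beast_command(definitions_path, template_xml, output_xml):
    """Build the command line as a list of arguments, without running it."""
    return ["beast", "-DF", str(definitions_path),
            "-DFout", str(output_xml), str(template_xml)]


# Hypothetical replicates; the names and values are placeholders.
replicates = {
    "rep1": {"chainLength": "1000000"},
    "rep2": {"chainLength": "2000000"},
}

# A temporary directory keeps the sketch from cluttering the working directory.
out_dir = tempfile.mkdtemp(prefix="beast_defs_")

commands = []
for name, definitions in replicates.items():
    json_path = write_definitions(out_dir, name, definitions)
    commands.append(beast_command(json_path, "beast.xml", name + ".xml"))
```

Each entry in `commands` can then be passed to `subprocess.run(cmd, check=True)`, assuming the `beast` launcher is on the `PATH`.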
BEASTGen
BEASTGen by Andrew Rambaut is part of BEAST 1, but it is a generic XML producer from templates, so it can produce BEAST 2 XML as well. It has functionality to recognise alignment files, so you can refer to NEXUS and FASTA files to include in the XML.
BEASTmasteR
BEASTmasteR by Nick Matzke is based on R. Its purpose is to convert NEXUS data file(s) (DNA, amino acids, discrete morphological characters, and/or continuous traits), plus an Excel settings file, into BEAST 2 XML format.
MiXeL
MiXeL by Joseph Heled is based on Python and uses XML template files that can be mixed and matched.
BEASTling
BEASTling by Luke Maurits et al. aims specifically at generating BEAST XML for linguistic analyses.
babette
babette by Richel Bilderbeek and Rampal Etienne stands for "BEAUti 2, BEAST2 and Tracer for R" and aims to aid the workflow of performing an analysis through scripting in R.
BEAST2XML
BEAST2XML by Terry Jones is a Python class and command-line script to generate BEAST 2 XML files.
BEASTShell
BEASTShell is based on BeanShell and is integrated with BEAST. This post explains how to use it to generate XML.
Beasy
Beasy is a Java-based attempt by myself to use the power of BEAUti templates. Unlike most of the above approaches, no custom scripts are required to use newly added packages, provided they come with a BEAUti template. It is still in development...
If you know any other ways to generate XML, please let me know.
References
Matzke, Nicholas J. (2015). "BEASTmasteR: R tools for automated conversion of NEXUS data to BEAST2 XML format, for fossil tip-dating and other uses."
Bilderbeek, Richel J.C., and Rampal S. Etienne. "babette: BEAUti 2, BEAST2 and Tracer for R." doi: https://doi.org/10.1101/271866
Maurits, Luke, Robert Forkel, Gereon A. Kaiping, and Quentin D. Atkinson. "BEASTling: A software tool for linguistic phylogenetics using BEAST 2." PloS one 12, no. 8 (2017): e0180908.