
What is new in v2.4.5

Replicability support

The biggest addition is that it is now possible to run an XML file with BEAST that uses
exactly the package versions that were used to create the XML in BEAUti. This means that
changes in the XML due to changes in package versions are no longer a problem, and you can
run an analysis with exactly those package versions of the original analysis.

The way it works is that BEAUti adds a “required” attribute on the beast-element in the
XML containing packages and their versions used to set up the analysis. Of course, you can
edit the XML by hand and change version numbers and add packages if you like. BEAST now has
a “-strictversions” flag, so when you start BEAST with that option it only loads packages and
versions as specified in the “required” attribute.
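
As a rough illustration of what reading the “required” attribute could look like, here is a minimal Python sketch. The attribute layout used here (entries of the form "Name vX.Y.Z" separated by colons) is an assumption for the sake of the example, as are the function name and the sample XML; check an XML file saved by your own BEAUti for the actual layout.

```python
import xml.etree.ElementTree as ET


def required_packages(xml_text):
    """Return {package name: version} from the root element's 'required' attribute.

    Assumes entries look like 'Name v1.2.3' and are separated by ':'.
    """
    root = ET.fromstring(xml_text)
    packages = {}
    for entry in root.get("required", "").split(":"):
        entry = entry.strip()
        if not entry:
            continue
        name, _, version = entry.rpartition(" ")
        if not name:  # entry without a version part
            name, version = version, ""
        packages[name] = version.lstrip("v")
    return packages


# Hypothetical example document, not taken from a real analysis.
example = '<beast version="2.4" required="BEAST v2.4.5:SA v1.1.4"></beast>'
print(required_packages(example))  # {'BEAST': '2.4.5', 'SA': '1.1.4'}
```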

Of course, the versions of these packages must be installed for BEAST to be able to load
them. Therefore, the package manager in BEAUti now allows specifying specific versions of
the package to install, and multiple package versions can be installed side by side. By
default, the latest version of the package that is installed will be loaded, unless the
“-strictversions” flag is set. The addonmanager utility has a -version flag for specifying
the package version to install, if you prefer installing packages from the command line.
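
The selection rule described above (latest installed version by default, exact version in strict mode) can be sketched in a few lines of Python. This is only an illustration of the rule, not the package manager's code; the function names and the purely numeric dotted version format are assumptions.

```python
def version_key(version):
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def choose_version(installed, required=None):
    """Pick the version to load from the installed versions of one package.

    With required=None the latest installed version wins; otherwise the
    exact required version must be installed (the strict behaviour).
    """
    if required is not None:
        if required not in installed:
            raise LookupError("required version %s is not installed" % required)
        return required
    return max(installed, key=version_key)


installed = ["1.9.0", "1.10.2", "1.2.7"]  # hypothetical side-by-side installs
print(choose_version(installed))           # 1.10.2
print(choose_version(installed, "1.2.7"))  # 1.2.7
```

Comparing versions as tuples of integers avoids the string-comparison trap where "1.9.0" would sort after "1.10.2".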

BEAUti

Previously, it was possible to edit priors, but these editing actions could interfere with
cloning substitution and clock models. A

When importing FASTA files, previously a single character other than A, C, G or T meant that
the alignment was classified as amino acid, even if it was a nucleotide alignment. This
version counts the number of non-A,C,G,T characters and makes a better guess based on that
number relative to the total number of characters in the alignment. Furthermore, a dialog pops up
where you can change that guess if it was incorrect. If many alignments of the same datatype
are loaded at the same time, you can choose to mark all as the same type so you don’t have
to close the dialog for each of these alignments.
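
The counting idea can be sketched as follows in Python. The threshold of 20% and the decision to skip gap and missing-data characters are assumptions made for this example; the values BEAUti actually uses are not stated here.

```python
def guess_datatype(sequences, max_other_fraction=0.2):
    """Guess 'nucleotide' or 'aminoacid' from the share of non-ACGT characters.

    max_other_fraction is a hypothetical cut-off, and ignoring '-' and '?'
    is an assumption of this sketch.
    """
    nucleotides = set("ACGT")
    ignored = set("-?")
    total = 0
    other = 0
    for sequence in sequences:
        for char in sequence.upper():
            if char in ignored:
                continue
            total += 1
            if char not in nucleotides:
                other += 1
    if total == 0:
        return "nucleotide"
    return "nucleotide" if other / total <= max_other_fraction else "aminoacid"


# A few ambiguity codes (N, R) no longer flip the guess to amino acid.
print(guess_datatype(["ACGTNACGTACGT", "ACGTRACGTACGA"]))  # nucleotide
print(guess_datatype(["MKVLAAGIVGLLLAQ"]))                 # aminoacid
```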

MRCAPriors imported through NEXUS files (using a calibrate entry) were not logged in the
trace log, but now they are.

BEAST

There are a few tree parser fixes.

StarBeastStartState now takes the bounds of the parameters it sets into account, so if you
specify bounds on the birth rate or any of the population sizes the initial state will not
violate them. Previously, these bounds were ignored, which could prevent the analysis from
starting.

Operator schedules can now be nested. This means that you can specify a portion of the
operator weights to

Improved error reporting (as usual).

Package manager

Added -version flag to specify exactly which package version to install.

TreeAnnotator

Now calculates 2D HPD intervals by default (for phylogeography analyses). Spread3 requires these
uncertainty intervals to be available, but they were not calculated by default, which confused
several users.

Added -nohpd2D flag to suppress 2D HPD interval calculation, since any 2-dimensional continuous
trait that is logged on the tree now gets a 2D HPD interval calculated by default. However,
if the interval is not contiguous, TreeAnnotator produces a warning message, which may not
be appropriate for anything but geographical regions. This flag suppresses these messages and
reduces calculation time.

Added -noSA flag to suppress tree set being seen as that of a sampled ancestor analysis.
It can happen that a tree set contains a branch of length zero, which is interpreted by
TreeAnnotator as a sampled ancestor tree. Setting this flag prevents this interpretation.

How to open Tracer and FigTree in macOS Sierra

There are two known difficulties when using Tracer v1.6.0 and FigTree v1.4.2 with macOS Sierra:

  • their reliance on a legacy version of Java (6), and
  • Sierra’s stronger Gatekeeper security.

The first issue can be solved by downloading and installing Java for OS X 2015-001. In my tests, this did not change the default Java version in Sierra. You can check the default version after the installation using the command below:

java -version

Because macOS Sierra introduces a stronger Gatekeeper, applications that do not come from the App Store or from identified developers are difficult to launch after download. The error message below pops up when Tracer v1.6.0 is opened for the first time.

To open the application, find the Gatekeeper options in Apple menu > System Preferences > Security & Privacy > General tab, and click the button Open Anyway.

Go back to open the Tracer v1.6.0 application, and choose Open in the dialog box to confirm the process.

Once Tracer or FigTree has been opened successfully this way, you can open it directly from then on.

Alternatively, removing com.apple.quarantine from the Tracer application will permanently solve the problem:

xattr -d -r com.apple.quarantine /Applications/Tracer\ v1.6.0.app

Best wishes,
Walter

What is new in v2.4.4

Smoothed out some issues with importing NEXUS files in BEAUti. A NEXUS file can contain information about calibrations on clades and tip ages, which is more convenient when there are many calibrations or dated tips. In the previous release, BEAUti displayed every distribution as ‘Uniform’, even when other distributions were specified in the NEXUS file, and the distribution could not be changed in BEAUti; this is fixed now. In some circumstances, when tip dates were specified, a major problem prevented BEAUti from setting up connections, which manifested itself in missing priors and other components. This is also fixed in this release.

A TreeAnnotator fix was made so user-defined trees can be annotated instead of using an MCC tree based on the tree file.

Allow smaller log files by logging fewer significant digits of metadata. The TreeWithMetaDataLogger has a “dp” flag that can be used to specify the number of decimal places to use writing branch lengths, rates and now also real-valued metadata. When logging large trees this can reduce the size of the file considerably.
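
To give a feel for the effect of limiting decimal places, here is a small Python sketch. It only mimics the idea of a “dp” setting on plain numbers; the function name and the sample values are made up for illustration and this is not the logger's code.

```python
def format_value(value, dp=None):
    """Format a number with a fixed number of decimal places, or in full if dp is None."""
    if dp is None:
        return repr(value)
    return "{:.{}f}".format(value, dp)


# Hypothetical branch lengths as they might appear in a tree log.
branch_lengths = [0.0123456789012, 1.98765432109876, 0.000456789123456]

full = ",".join(format_value(x) for x in branch_lengths)
short = ",".join(format_value(x, dp=4) for x in branch_lengths)

print(full)   # 0.0123456789012,1.98765432109876,0.000456789123456
print(short)  # 0.0123,1.9877,0.0005
```

Note that very short branches lose most of their information at low precision (0.000456789123456 becomes 0.0005 here), so the number of decimal places is a trade-off between file size and accuracy.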

Fix for a problem that prevented starting any BEAST application on macOS Sierra. The OSX version was released on a computer that did not run Sierra, and it appears that this prevented any BEAST application from opening under the new security settings in Sierra, though it did not prevent them from running on older versions of OSX.

What is new in v2.4.3

BEAUti

If you want to sample tip dates, you can create an MRCA prior in the priors panel (by clicking the little ‘+’ button at the bottom of the screen). Once you have specified the set of taxa and an age distribution, click the ‘tips only’ checkbox, and a tip sampling operator will be added automatically.

Multi-monophyletic constraints through Newick

BEAUti now allows packages to add package-specific priors: when you click the ‘+’ button at the bottom of the priors tab and a package provides a new prior (such as BEASTLabs, which can add a multi-monophyletic prior), a dialog pops up showing a list to choose from. By default, an MRCA prior is added if no package provides anything else. The multi-monophyletic prior from BEASTLabs allows you to specify a large number of monophyletic constraints through a tree in Newick format.
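
To show why a single Newick tree is a compact way to write many constraints, the Python sketch below lists the taxon set under every internal node of a Newick string; each such set is one monophyly constraint. This is a simplified reader written for illustration (no quoted labels or comments) and is not the BEASTLabs implementation.

```python
def clades_from_newick(newick):
    """Return the taxon set below each internal node of a simple Newick string."""
    stack = []           # one list of taxa per currently open parenthesis
    clades = []
    token = ""
    after_close = False  # text right after ')' is an internal label, not a taxon
    for char in newick.strip():
        if char in "(),;":
            name = token.split(":")[0].strip()  # drop any branch length
            if name and not after_close and stack:
                stack[-1].append(name)
            token = ""
            if char == "(":
                stack.append([])
                after_close = False
            elif char == ")":
                taxa = stack.pop()
                clades.append(frozenset(taxa))
                if stack:
                    stack[-1].extend(taxa)  # taxa also belong to the parent clade
                after_close = True
            else:
                after_close = False
        else:
            token += char
    return clades


# Hypothetical constraint tree: two constrained groups plus an unconstrained taxon F.
for clade in clades_from_newick("((A,B),(C,D,E),F);"):
    print(sorted(clade))
```

The last set returned is the root, which contains all taxa and is therefore trivially monophyletic.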

Microsatellite support

Another way packages can now extend BEAUti is by catering for package-specific file formats. For example, the BEASTvntr package reads in alignments from a comma-separated file format and interprets them as numbers of tandem repeats. The BEASTvntr package thereby provides microsatellite support.

Misc

The Gamma distribution now allows multiple parameterisations: shape/scale, shape/rate, shape/mean, and one parameter, but defaults to shape/scale as in previous versions.
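
The first three parameterisations are related by the standard identities rate = 1/scale and mean = shape × scale. The Python sketch below converts each of them to shape/scale; the mode strings and function name are labels chosen for this example, and the one-parameter mode is left out because its definition is not given here.

```python
def to_shape_scale(mode, alpha, beta):
    """Convert a Gamma parameterisation to (shape, scale).

    alpha is always the shape; beta is the scale, rate or mean depending on mode.
    """
    if mode == "shape/scale":
        return alpha, beta
    if mode == "shape/rate":
        return alpha, 1.0 / beta   # scale is the inverse of the rate
    if mode == "shape/mean":
        return alpha, beta / alpha  # mean = shape * scale
    raise ValueError("unknown parameterisation: %s" % mode)


print(to_shape_scale("shape/scale", 2.0, 4.0))  # (2.0, 4.0)
print(to_shape_scale("shape/rate", 2.0, 4.0))   # (2.0, 0.25)
print(to_shape_scale("shape/mean", 2.0, 3.0))   # (2.0, 1.5)
```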

When saving a file to XML from BEAUti, all packages used in the XML are now encoded in XML, so when starting BEAST on a different computer, it can provide better error reporting of missing packages.

Better looking on high-res screens.

BEAST

Allow multiple citation annotations per class.

Allow trait sets with unspecified dates instead of failing when not all taxa had a date specified.

Allow multiple arguments to Sum so you can add values from various sources.

Improved error reporting, as usual.

Package Manager

The GUI version of the package manager now has links to documentation. Clicking the link of a particular package opens a web browser that should bring you to a page with package-specific information.

Some work has been done to make the layout of the package manager look better.

Misc

TreeAnnotator fix for phylogeography in low-mem mode — previously, any meta data in array format such as location information was ignored.

LogCombiner suppresses duplicate ‘=’ in tree output.

What is new in v2.4.2

BEAUti has a menu — View/Zoom In and View/Zoom out — which causes everything to scale up or down respectively. Once a particular zoom level is set in BEAUti, all other applications with a graphical user interface, like BEAST, TreeAnnotator, LogCombiner, etc. scale up to the same level. Also, by default scaling is such that they should look acceptable on high resolution screens.

Both BEAUti and BEAST have some improved error reporting.

One annoying bug was that BEAST closed its console window on XML parsing errors, making it impossible to read what was wrong with the XML file. This bug is solved now.

LogCombiner used to read in all log and tree files before writing them to the combined files. The new implementation processes input files line by line and writes them directly to the combined log, so it requires much less memory than before.
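
The difference between the two approaches can be illustrated with a small Python sketch that copies trace logs one line at a time, so memory use does not grow with file size. It is an illustration of the streaming idea only, not LogCombiner's code: the function name is made up, burn-in is given as a number of rows for simplicity, and sample numbers are not renumbered.

```python
import io


def combine_logs(inputs, output, burnin_rows=0):
    """Stream several tab-separated logs into one, line by line.

    Writes the header of the first log once, skips '#' comment lines and the
    first burnin_rows data rows of every log. Returns the number of rows written.
    """
    header_written = False
    rows_written = 0
    for handle in inputs:
        seen_header = False
        data_rows = 0
        for line in handle:  # only one line is held in memory at a time
            if line.startswith("#") or not line.strip():
                continue
            if not seen_header:
                seen_header = True
                if not header_written:
                    output.write(line)
                    header_written = True
                continue
            data_rows += 1
            if data_rows > burnin_rows:
                output.write(line)
                rows_written += 1
    return rows_written


# Two tiny hypothetical logs held in memory instead of real files.
run1 = io.StringIO("Sample\tposterior\n0\t-10.0\n1000\t-9.5\n2000\t-9.1\n")
run2 = io.StringIO("Sample\tposterior\n0\t-11.0\n1000\t-9.7\n2000\t-9.0\n")
combined = io.StringIO()
written = combine_logs([run1, run2], combined, burnin_rows=1)
print(written)  # 4
```

With real files, the `inputs` would be open file handles and `output` a file opened for writing.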

DensiTree updated to version 2.2.5, which supports export of DensiTree in SVG vector format.

What is new in v2.4.1

BEAUti

BEAUti now allows imports of calibrations from NEXUS files, so you can specify tip dates, distributions on tip dates, monophyletic constraints and clade calibrations in a NEXUS file. This is especially handy when there are a large number of calibrations or when a large number of clades need to be defined.

BEAUti now has a “File/Launch Apps” menu to start applications provided by packages, such as the GUI for doing a Path Sampling analysis (as the AppStore does).

In Windows and Linux, the *BEAST template went missing the second time BEAUti was started due to a bug in the way packages are handled. This is fixed now.

Streamlined upgrades of BEAST, so upgrading BEAST is now as simple as upgrading any package. When upgrading BEAST, BEAUti exits and when restarting it downloads the latest version, which may take a little time.

BEAST

On OSX, a common problem was that a CUDA driver was installed to support BEAGLE, but that there is no hardware that is CUDA enabled. The result was a crash of BEAST without an error message, which made it hard to find out what went wrong. In this version a test is done for this condition, and if it exists, instructions are provided on how to uninstall CUDA drivers, which should fix the problem.

The CLI script for BEAST should have less trouble loading the BEAGLE library in Linux and OSX.

Two operators have improved operator tuning resulting in slightly better performance (higher ESSs) in most cases.

There are some improvements in reporting error conditions, which should help diagnose problems.

LogAnalyser

A bug crept into v2.4.0 causing LogAnalyser not to show progress on loading and processing the log file when started from CLI, which is fixed now.

BEAST 1 vs 2 performance benchmarking

March 2016 by Remco Bouckaert, Tim Vaughan, Walter Xie, and Alexei Drummond

Recently, a few users reported problems with BEAST 2 performance, concluding it was worse than BEAST 1. This puzzled us, because BEAST 1 and 2 share the same core algorithms, and both spend most of their time doing phylogenetic likelihood calculations, which are optimised using BEAGLE, a library shared by both programs. In fact, we recently changed the way that BEAST 2 handles proportion invariant categories, saving some phylogenetic likelihood calculations, so in theory it should be faster when using a proportion of invariant sites in the model. So, we became curious whether there are real performance differences between BEAST 1 and 2 and decided to do a benchmark. We expected them to perform roughly the same on GTR and GTR+G analyses, and BEAST 2 to do better on GTR+I and GTR+G+I analyses.

The picture below summarises the speed of BEAST 2 relative to BEAST 1 using 1, 2 and 4 thread(s) on the 3 different operating systems. As you can see, the performance is very similar for GTR and GTR+G, with BEAST 2 being perhaps slightly faster (although this could be due to debugging that BEAST 1 performs at the start of the chain):

 

What we did

Analyses

BEAST can do many kinds of analyses, but for the purpose of this benchmark, we want to see whether the TreeLikelihood calculations, which typically dominate the computational time of MCMC runs, are comparable. To see the impact of the way BEAST 2 handles proportion invariant, we want to have an analysis with and without a proportion invariant category. And since many analyses use gamma rate heterogeneity with and without proportion invariants, we end up with four variants:

  • GTR
  • GTR + 4 gamma categories
  • GTR + proportion invariant
  • GTR + 4 gamma categories + proportion invariant

To keep things otherwise simple, we use a Yule tree prior, a strict clock and start with a random tree. To be practical, we set up the analysis in BEAUti 1 and 2, just importing an alignment, choosing the site model, setting the tree prior in BEAST 1 (BEAST 2 uses Yule by default) and save to file. As it turns out, the analyses produced that way are almost the same, but there are some small differences in the operator settings. Due to auto-optimisation, they will eventually become almost the same, but to make the two analyses as equal as possible we edited the XML so that they have the same operator weights and tuning values. Also, the population size used to generate the random starting tree differed so these were made the same as well.

The MCMC runs were run for 1 million steps in order to make them long enough that the slightly different ways in which extra likelihood calculations are done at the start for debugging purposes have little effect on the outcome. Also, with longer runs JIT compiler differences are eliminated. We took care to run the different programs under the same circumstances, on a computer not doing any other jobs at the time.

This whole process was automated to deal with the various data sets we wanted to test.

Threading

The way to set up threads in BEAST 2 is a bit cumbersome (v2.4.0 improves things a lot), so perhaps the reported performance differences are due to different threading configurations. Therefore, we want to see what the impact of threading is. That led us to 3 variants:

  • 1 thread BEAGLE SSE
  • 2 thread BEAGLE SSE
  • 4 thread BEAGLE SSE

For BEAST 1, we used the flags -overwrite -beagle_instances. For BEAST 2 we used -overwrite -threads for the SSE runs. For all cases, we verified that both programs use the same settings of BEAGLE as reported at the start of the run.

Data sets

To get an impression of the impact of different data, we randomly selected a number of data sets of various sizes from treebase.org. We also used the data sets from the BEAST 1 examples benchmark directory, giving a total of 15 data sets.

dataset        taxa  sites  patterns
M1044            50   1133       493
M1366            41   1137       769
M1510            36   1812      1020
M1748            67    955       336
M1749            74   2253      1673
M1809            59   1824      1037
M336             27   1949       934
M3475            50    378       256
M501             29   2520      1253
M520             67   1098       534
M755             64   1008       407
M767             71   1082       446
benchmark1     1441     98       593
benchmark2       62  10869      5565
old_benchmark    17   1485       138

Versions

To have a fair comparison, we used the latest versions currently available, v1.8.3 and v2.4.0.

Results


The images below show the run time for 1, 2 and 4 thread(s) in Linux, where 1.8.3(t0) represents BEAST 1.8.3 run with a single thread and no threading pool.

  • 1 thread:
  • 2 threads:
  • 4 threads:

With increasing number of threads, the difference in run time in seconds decreases, but BEAST 2 is almost always slightly faster than BEAST 1 in these comparisons. However, it turned out that the data sets are too small for four threads to be of much use — the four threaded runs tended to be slower than for two threads, which is optimal for most of these datasets for both BEAST versions. This may also be a function of the hardware used.

Cursory checks of ESSs for BEAST 1 and 2 in Tracer did not show any substantial difference, which is not surprising since the same mixture of operators was used. Also, parameter estimates tended to agree between some randomly selected analyses.

To make sure that differences are not OS dependent, we ran the analyses on Windows 7, OS X and Linux, but did not find any substantial differences between the operating systems.

Conclusions

To our surprise, we found that BEAST 2 is slightly faster than BEAST 1. This is not what we expected since both programs perform the same analysis using the same BEAGLE library. Although we did our best to compare apples with apples, it is possible we overlooked something, so let us know if you find anything that can explain the differences in performance.

If you want to replicate these runs, you can find them in the benchmark repository on https://github.com/CompEvol/benchmark, which includes the data, some instructions and scripts to run them.

What is new in v2.4.0 and its packages

BEAST improved performance

BEAST is up to 2x faster when using proportion invariant sites and BEAGLE. When using proportion invariant in combination with gamma rate heterogeneity, it still is faster than before.

BEAST always had a “beagle_instance” command line flag that was essentially ignored. This is now replaced by a flag that actually works, named “instances” since it works with both Java and BEAGLE tree likelihoods.

By default, the treelikelihood is now threaded for analyses using the Standard template in BEAUti. The number of threads used per treelikelihood is determined by the “instances” flag, but can be overridden using the “threads” attribute of ThreadedTreeLikelihood (which was migrated from the BEASTLabs package).

Further, there are a few minor performance improvements, including faster MRCAPrior handling.

A bug in StarBeastStartState was fixed so it works with calibrations when a prior other than CalibratedYule is used.

BEAUti

The parametric distributions in the priors panel now show the mean as well as the median of the distribution.

There is better taxon management, preventing numbers from being added to taxon names.

The layout of the tip dates panel was improved to deal with changing window sizes.

A bug in *BEAST clock cloning is fixed.

Allow setting branch length as substitution option on tree logger, which was previously not possible.

Improved JSON export of BEAST analyses (just use json as extension to the file when saving) and using a library with a more sensible license.

Package manager

The package manager has been changed so it can read all package information (including that of older versions) from a single package file. A bigger change is that BEAST is now treated as a separate package: when you start any of the BEAST applications, it loads the beast.jar file from the user package directory, and if it is not already there, will put a copy in that place. This makes it much easier to upgrade BEAST: just select BEAST in the package list and click the install/update button.

The GUI of the package manager is improved, among other things, showing by colour whether a package can be installed.

For developers

The biggest change with this release is really for developers, as outlined in a separate post here.

Packages

Due to some API changes, all packages have been re-released. Some packages have not been updated yet, but will be soon. New packages expected soon that have not been available before include startbeast2 and correlated characters.

What will change in v2.4.0 for developers

3 February 2016 by Remco Bouckaert

The most significant upcoming changes are:

  • Annotated constructor support, so instead of using Input and initAndValidate you can use constructors and put most of the info that now goes into an Input in a @Param annotation. See, for example, AnnotatedRunnableTestClass and JSONTest.
  • Better JSON support for BEAST specifications, using a non-evil JSON library.
  • Removal of Exceptions in favour of classes that derive from Exception. This means that many methods that previously were throwing Exceptions, are throwing more specialised Exceptions, or nothing at all (if only RuntimeExceptions are thrown).
  • Cleaned up code, better conforming to Java 8 constructions and naming conventions. Also, attempt to remove the term ‘Plugin’ and replace with BEAST object where appropriate, since the term plugin is not used any more.

Code changes

This is a (still evolving) list of changes for package developers containing possible changes required to make packages compatible with BEAST v2.4.0. Mostly, there are minor method signature changes, and some member variables name changes, with the exception of Exceptions.

Exceptions

However, the biggest change is that throws Exception on initAndValidate will be removed. initAndValidate is supposed to check validity of values of inputs, and initialise. If for some reason this fails, the most appropriate exception to throw is IllegalArgumentException or RuntimeException.

Note that an overriding method can always throw fewer exceptions than the method it overrides, so you can change your code to work with both v2.3 and v2.4 by just removing or specialising the exception that is thrown.

Signature changes

The signature of BeautiDoc.deepCopyPlugin changes: it requires an extra argument to tell which partition to copy from.

Access changes

A number of package-private members and methods are now protected to allow access from different packages, including BeautiAlignmentProvider.getAlignments().

Most inputs are now final, so cannot be re-assigned.

Name changes

SubtreeSlide.fSize is now SubtreeSlide.size
InputEditor.m_plugin is now InputEditor.beastObject
BeautiConfig.inlinePlugin, collapsedPlugins, suppressPlugins are now inlineBEASTObject, collapsedBEASTObjects, suppressBEASTObjects

Deprecated

BEASTObject.outputs is now private. Use BEASTObject.getOutputs() to access the set of outputs.

What is new in v2.3.2 and its packages

The main reason for this release is to get the path corrected so the Standard and StarBeast templates are visible under the templates menu. In v2.3.1 they got lost due to a new way of handling search paths. But there are many other reasons to upgrade to this release, as pointed out below.

BEAUti

A fix of import of traits from file when partitions are split into say codon positions.

A fix for cloning of scripts with partition information.

Set up weights correctly of FixedMeanRate operator when ascertainment correction is applied. Previously, ascertainment correction columns were included in the weights.

Allows ParameterInputEditor to edit Parameter inputs.

Ensure when focus is on an editable field in Taxon set dialog the last value entered is captured when switching tabs.

BEAST

A “-validate” command line option was added for parsing XML files without running them. This can be useful for just testing whether an XML file is correct, without having to stop the MCMC run and delete log files that are being created.

The MRCAPrior is now much more efficient. This gives performance improvements when there is little data and many MRCAPriors.

The way of generating random trees has been robustified.

More robust storing of the state file on Windows.

LogCombiner

Ensured that in the GUI version of LogCombiner burn-in editing finishes properly. The burn-in was previously ignored if the burn-in field was edited and the focus was left on the edit field when pressing the run button.

LogAnalyser

LogAnalyser now has a one-line-per-file mode, so you can analyse multiple files and, instead of having all info printed as blocks, output all results for a single log file on a single line. This is handy when importing in R for further post-processing.

A CLI script added in the bin directory for ease of launch.

Error messages

More sensible error messages in many classes, for instance TreeParser, RPNCalculator, NodeReheight.

DensiTree is updated to version 2.2.4.

Packages

New releases of the following packages were created since the release v2.3.1:
* BACTER,
* GEO_SPHERE,
* STACEY,
* bModelTest,
* SNAPP,
* BASTA,
* RBS,
* MultiTypeTree and
* MASTER.