BFD* : Bayes Factor Delimitation (*with genomic data)
BFD* is a method to determine which species assignment of individual lineages is most plausible.
What you need
- BEAST 2 version 2.1 or later (see download BEAST).
- model-selection package (for BEAST versions before 2.1.2, use the BEASTii package instead)
- SNAPP plugin version 1.1.4 or later (1.1.3 has a bug that sometimes halts the analysis)
The easiest way to install or uninstall add-ons (plugins) is to start BEAUti and select the menu File/Manage Add-ons. A window pops up where you can select the plugins to (un)install.
Setting up the Base Scenario
1. You can use BEAUti to set up a SNAPP analysis in which the individual sequences are assigned to species according to the base scenario, against which you wish to test alternative scenarios.
The rough guide to SNAPP has details on setting up an analysis and explains the impact of the various settings.
2. Once the base scenario is set up in an XML file, the XML file needs to be edited so that a path-sampling analysis is set up. The page on Path Sampling has details on how to set up such an analysis.
Note that older versions of SNAPP (<v1.1.10) use a non-standard MCMC element that needs to be replaced; see the comments at the end of the Path Sampling page.
3. Run the path-sampling analysis with BEAST. When the run finishes, the path sample analyser reports the marginal likelihood estimate for this scenario.
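If you prefer to script step 2 rather than edit the XML by hand, the following Python sketch shows one way to do it. It assumes the layout used in the example on the Path Sampling page: the original run element is renamed to mcmc and nested inside a new run element with spec beast.inference.PathSampler. The attribute names, the step script, and all values (number of steps, chain lengths, rootdir) are assumptions and placeholders, so check them against the Path Sampling page for your installed version before using the result.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: attribute names follow the example on the BEAST 2 Path Sampling
# page; all values are placeholders and should be tuned for your analysis.
PATH_SAMPLER_ATTRS = {
    "spec": "beast.inference.PathSampler",
    "nrOfSteps": "24",
    "chainLength": "100000",
    "preBurnin": "50000",
    "burnInPercentage": "50",
    "alpha": "0.3",
    "rootdir": "/tmp/bfd_base",  # hypothetical output directory
    "deleteOldLogs": "true",
}

# ASSUMPTION: script run for every step, copied from the same example.
STEP_SCRIPT = (
    "\ncd $(dir)\n"
    "java -cp $(java.class.path) beast.app.beastapp.BeastMain "
    "$(resume/overwrite) -java -seed $(seed) beast.xml\n"
)


def wrap_in_path_sampler(xml_text: str) -> str:
    """Nest the top-level <run> element, renamed to <mcmc>, in a PathSampler <run>."""
    root = ET.fromstring(xml_text)
    old_run = root.find("run")
    if old_run is None:
        raise ValueError("no top-level <run> element found")

    position = list(root).index(old_run)
    root.remove(old_run)
    old_run.tag = "mcmc"  # the original analysis becomes the inner MCMC

    sampler = ET.Element("run", PATH_SAMPLER_ATTRS)
    sampler.text = STEP_SCRIPT
    sampler.append(old_run)
    root.insert(position, sampler)
    return ET.tostring(root, encoding="unicode")
```

The function takes the contents of the BEAUti-generated XML file as a string and returns the edited XML, which you can write to a new file. Note that ElementTree drops XML comments and the XML declaration, and that this sketch does not replace the non-standard MCMC element used by older SNAPP versions.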
Setting up an Alternative Scenario
1. You might want to test a scenario in which
- species are clumped, e.g., all lineages from species A are assigned to species B,
- a species is split, e.g., some lineages from species A are assigned to a new species A', or
- lineages are reassigned, e.g., some lineages are relabelled from species A to species B.
Load the original XML analysis (not the path sampling analysis!) in BEAUti, and use the taxonset panel to assign lineages to species.
2. Edit the XML to turn it into a path-sampling analysis, as for the base scenario.
3. Run the path-sampling analysis.
4. Calculate the difference between the marginal likelihood estimates of the base scenario and the alternative scenario. Because the estimates are on a log scale, this difference is the log Bayes factor between the two scenarios.
If the marginal likelihood of the base scenario is larger than that of the alternative scenario, the evidence favours the base scenario.
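The comparison in step 4 can be sketched in a few lines of Python. The scenario names and marginal likelihood values below are made up for illustration; substitute the log marginal likelihood estimates reported by your own path-sampling runs.

```python
def log_bayes_factor(log_ml_base: float, log_ml_alt: float) -> float:
    """Log Bayes factor of the base scenario over the alternative scenario.

    Both arguments are log marginal likelihood estimates, so the log Bayes
    factor is simply their difference.
    """
    return log_ml_base - log_ml_alt


def favoured_scenario(log_ml_base: float, log_ml_alt: float) -> str:
    """Name the scenario with the larger marginal likelihood."""
    log_bf = log_bayes_factor(log_ml_base, log_ml_alt)
    if log_bf > 0:
        return "base"
    if log_bf < 0:
        return "alternative"
    return "neither"


# Hypothetical log marginal likelihood estimates, for illustration only.
base_estimate = -4521.5
alternative_estimate = -4530.25

print("log Bayes factor:", log_bayes_factor(base_estimate, alternative_estimate))
print("favoured scenario:", favoured_scenario(base_estimate, alternative_estimate))
```

A positive value means the base scenario has the larger marginal likelihood, and a negative value means the alternative scenario does. With several alternative scenarios, compute the difference for each one against the same base scenario.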
Adam D. Leaché, Matthew K. Fujita, Vladimir N. Minin, and Remco R. Bouckaert. 2014. Species Delimitation using Genome-Wide SNP Data. Systematic Biology, doi:10.1093/sysbio/syu018.