SNAPP


Introduction

SNAPP (SNP and AFLP Package for Phylogenetic analysis) is a package for inferring species trees and species demographics from independent (unlinked) biallelic markers such as well-spaced SNPs. It implements a full coalescent model, but uses a novel algorithm to integrate over all possible gene trees, rather than sampling them explicitly.

Like BEAST, a SNAPP analysis is controlled using a specially formatted XML file. This file can be created from scratch, or by using the graphical user interface in BEAUti. SNAPP can be run directly from inside BEAUti, or you can take the XML file and use it to run SNAPP on any computer or server.

Like BEAST and MrBayes, SNAPP does not output a single tree. Rather, it uses Markov chain Monte Carlo (MCMC) to generate multiple trees (and corresponding parameter values), each of which is a sample from the posterior distribution of species trees and parameters. The trees and parameter values are output as two files. As with all MCMC analyses, care must be taken when interpreting these outputs. For example, it is necessary to check convergence and be prepared to consider multiple plausible species trees, instead of just one. The file containing parameter values can be analysed using the Tracer software. As with all MCMC software, it is advisable to run multiple analyses, possibly tweaking proposal parameters, to validate an analysis.


SNAPP is built on top of BEAST 2. More information on BEAST 2 can be obtained at the BEAST 2 web site.

Download

Documentation

A rough guide to SNAPP

Binary install programs

Mac

Linux

Windows

Source code

Installing SNAPP

SNAPP requires a Java Virtual Machine, version 1.6 or later. Many systems will already have one installed. The latest versions of Java can be downloaded from here.

If in doubt, type "java -version" to see which version of Java is installed (or whether it is installed at all).
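The check can also be scripted. The following Python sketch parses the text printed by "java -version" and compares it with the 1.6 minimum stated above. The function names are hypothetical and not part of SNAPP, and the sketch assumes the usual `version "x.y.z"` wording in the output.

```python
import re


def parse_java_major(version_output):
    """Return the Java major version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = int(match.group(2)) if match.group(2) else 0
    # Legacy numbering: "1.6.0_45" is Java 6. Modern numbering: "11.0.2" is Java 11.
    return second if first == 1 else first


def is_supported(version_output, minimum=6):
    """True if the reported Java version is at least `minimum` (6 means 1.6)."""
    major = parse_java_major(version_output)
    return major is not None and major >= minimum
```

Note that "java -version" writes to standard error, so when calling it through `subprocess.run(["java", "-version"], capture_output=True, text=True)` the text to parse is in the `stderr` attribute of the result.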

Mac OS X will already have a suitable version of Java installed.

The SNAPP package contains the following directories:

Directory  Contents

doc/       Documentation of SNAPP
examples/  Some NEXUS and XML files
lib/       Java & native libraries used by SNAPP
bin/       Scripts for the corresponding OS
templates/ Templates used to initialise BEAUti

Converting Sequences

A program called "BEAUti" will import data in NEXUS format, allow you to select various models and options and generate an XML file ready for use in SNAPP.

To run BEAUti simply double-click the "BEAUti.exe" file in the SNAPP folder. If this doesn't work then you may not have Java installed correctly. Try opening an MS-DOS window and typing:

java -cp lib/SNAPP.jar beast.app.beauti.Beauti

Running SNAPP

To run SNAPP simply double-click the "SNAPP.exe" file in the SNAPP folder. You will be asked to select a SNAPP XML input file.

Alternatively open a Command window and type:

java -jar lib/SNAPP.jar input.xml

Here "input.xml" is the name of a SNAPP XML file. This file can either be written from scratch in a text editor or generated by BEAUti from a NEXUS file.

For documentation on creating and tuning the input files look at the documentation and tutorials on-line at:

Help -

     SNAPP-FAQ

Tutorials -

Usage: snapp [-window] [-options] [-working] [-seed] [-prefix ] [-overwrite] [-resume] [-errors ] [-threads ] [-help] []

-window Provide a console window

-options Display an options dialog

-working Change working directory to input file's directory

-seed Specify a random number generator seed

-prefix Specify a prefix for all output log filenames

-overwrite Allow overwriting of log files

-resume Allow appending of log files

-errors Specify maximum number of numerical errors before stopping

-threads The number of computational threads to use (default auto)

-help Print this information and stop

Example: snapp test.xml

Example: snapp -window test.xml

Example: snapp -help

For example:

java -jar lib/SNAPP.jar -seed 123456 -overwrite input.xml 
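Since it is advisable to run several independent analyses, a command like the one above can be generated once per seed. The following Python sketch builds the argument lists using only flags from the usage listing. It assumes SNAPP.jar sits in lib/ as in the example; the function name and the "runN-" prefixes are hypothetical choices made for illustration.

```python
def build_commands(xml_file, seeds, jar="lib/SNAPP.jar"):
    """Build one `java -jar` argument list per random seed."""
    commands = []
    for index, seed in enumerate(seeds, start=1):
        commands.append([
            "java", "-jar", jar,
            "-seed", str(seed),
            # Hypothetical prefix so that runs do not write to the same log files.
            "-prefix", f"run{index}-",
            xml_file,
        ])
    return commands
```

Each list can then be passed to `subprocess.run(command, check=True)` to launch the runs one after another.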

Analysing Results

DensiTree is provided with SNAPP to visually analyse tree sets. It can be used to draw trees in which branch widths indicate population sizes.

Furthermore, TreeSetAnalyser is provided with SNAPP to analyse tree sets. It takes as input a tree set produced by SNAPP and, if a reference tree is provided (e.g. the original tree used to simulate the data), compares the tree set with it. It outputs:

  1. Size of the 95% credible set.
  2. Probabilities of the top 20 (?) trees
  3. Whether or not the true tree is in the credible set.

Also, if a tree is provided, the population sizes for the branches of that tree are printed. A typical workflow is to run TreeSetAnalyser once without a tree argument to select a topology, and then to run it again with that tree as argument to see the population sizes.
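To make the notion of a 95% credible set concrete, the following Python sketch computes one from a list of sampled topologies. It illustrates the general idea only and is not TreeSetAnalyser's implementation. It assumes each sampled tree has already been reduced to a canonical topology string, and the function name is hypothetical.

```python
from collections import Counter


def credible_set(topologies, level=0.95):
    """Return (topology, probability) pairs, most frequent first, whose
    cumulative posterior probability first reaches `level`."""
    counts = Counter(topologies)
    total = len(topologies)
    selected = []
    cumulative = 0.0
    for topology, count in counts.most_common():
        if cumulative >= level:
            break  # the set already covers the requested probability mass
        probability = count / total
        selected.append((topology, probability))
        cumulative += probability
    return selected
```

The size of the credible set is the length of the returned list, and a reference tree is in the credible set if its topology string appears among the returned pairs.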

Tracer is a powerful graphical program for analysing MCMC log files (it can also analyse output from MrBayes and other MCMC programs). It is available from the Tracer web site:

Tracer

Alternatively, LogCombiner and TreeAnnotator, both distributed with BEAST, can be used. LogCombiner can combine log or tree files from multiple runs of SNAPP into a single combined results file (after removing appropriate burn-ins). TreeAnnotator can summarize a sample of trees from SNAPP using a single target tree, annotating it with posterior probabilities, HPD node heights and rates. This tree can then be viewed in a program called 'FigTree', which is available from:

FigTree

Support & Links

SNAPP is an extremely complex program and as such will inevitably have bugs. Please email us to discuss any problems:

David Bryant

Remco Bouckaert

Some discussion can be found on the BEAST mailing list.

The source code is distributed under the GNU Lesser General Public License: code

Bugs and planned features

Unfortunately, we have absolutely no funding for this project at the moment, but are implementing improvements and bug fixes as time permits. This is an open source project, and we strongly encourage you to get involved if there are some features you'd like to see added.

Below is a (partial) list of work planned.

  1. Allow inclusion of constant characters in BEAUti.
  2. More extensive corrections for acquisition biases.
  3. Inclusion of sequence error corrections.


SNAPP-FAQ

Can I use SNAPP with diploid data?

Yes, under the assumption that sites or markers are unlinked. As usual, a SNP that is homozygous for allele 1 is coded as '2', a SNP that is homozygous for allele 0 is coded as '0', and a heterozygous SNP is coded as '1'. Note that diploid individuals are effectively treated as two haploid individuals, with heterozygous alleles allocated at random.
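The coding rule amounts to counting the copies of allele 1 in each genotype. A minimal Python sketch, assuming genotypes are given as pairs of 0/1 alleles; the function names and the input representation are hypothetical, and missing data is not handled.

```python
def encode_genotype(allele_a, allele_b):
    """Code a diploid genotype as the number of copies of allele 1: '0', '1' or '2'."""
    if allele_a not in (0, 1) or allele_b not in (0, 1):
        raise ValueError("alleles must be 0 or 1")
    return str(allele_a + allele_b)


def encode_individual(genotypes):
    """Code a sequence of (allele, allele) pairs as one string of site codes."""
    return "".join(encode_genotype(a, b) for a, b in genotypes)
```

For example, an individual with genotypes 0/0, 0/1 and 1/1 at three sites is coded as the string 012.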

How do I make SNAPP treat the data as dominant AFLP data?

Either:

  1. In BEAUti, select the 'Dominant' check box in the 'Mutation Model' tab pane, or
  2. In the XML file, look for the tag determining the likelihood distribution within the tag determining the posterior distribution:
<distribution id="posterior" spec="util.CompoundDistribution">
  <distribution idref="prior"/>
  <distribution id="likelihood" spec="util.CompoundDistribution">
    <distribution data="..." dominant="true" ... spec="snap.likelihood.SnAPTreeLikelihood" ...>
      <siteModel ...>
        ...
      </siteModel>
    </distribution>
  </distribution>
</distribution>

Insert the attribute dominant="true" as shown above.
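The XML edit can also be scripted. The following Python sketch uses the standard-library xml.etree.ElementTree module to set the dominant attribute on every distribution element whose spec is snap.likelihood.SnAPTreeLikelihood, as in the fragment above. The function name is hypothetical and the sketch is not part of SNAPP.

```python
import xml.etree.ElementTree as ET

SNAPP_LIKELIHOOD = "snap.likelihood.SnAPTreeLikelihood"


def set_dominant(root, value=True):
    """Set `dominant` on every SNAPP tree-likelihood element under `root`.

    Returns the number of elements that were updated.
    """
    changed = 0
    for element in root.iter("distribution"):
        if element.get("spec") == SNAPP_LIKELIHOOD:
            element.set("dominant", "true" if value else "false")
            changed += 1
    return changed
```

To apply it to a file, parse the file with `ET.parse`, call `set_dominant` on `tree.getroot()`, and save the result with `tree.write`. Writing to a new file name keeps the original intact, which is worthwhile because ElementTree discards XML comments by default when parsing.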

How can I do species delimitation with SNP/AFLP data?

See Bayes Factor Delimitation (*with genomic data).


Acknowledgements

Thanks for supplying code or assisting with the creation or testing of SNAPP 2 development team.
