FreeBayes

Revision as of 14:01, 21 February 2017 by Botto

Description

From the FreeBayes Home Page: FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.

FreeBayes is haplotype-based, in the sense that it calls variants based on the literal sequences of reads aligned to a particular target, not their precise alignment. This model is a straightforward generalization of previous ones (e.g. PolyBayes, samtools, GATK) which detect or report variants based on alignments. This method avoids one of the core problems with alignment-based variant detection: that identical sequences may have multiple possible alignments.

FreeBayes uses short-read alignments (BAM files with Phred+33 encoded quality scores, now standard) for any number of individuals from a population and a reference genome (in FASTA format) to determine the most-likely combination of genotypes for the population at each position in the reference. It reports positions which it finds putatively polymorphic in variant call file (VCF) format. It can also use an input set of variants (VCF) as a source of prior information, and a copy number variant map (BED) to define non-uniform ploidy variation across the samples under analysis.

Version

  • 1.0.1

Authorized Users

  • CIRCE account holders
  • RRA account holders
  • SC account holders

Platforms

  • CIRCE cluster
  • RRA cluster
  • SC cluster

Modules

FreeBayes requires the following module file to run:

  • apps/freebayes/1.0.1

Running FreeBayes on CIRCE/SC

The FreeBayes user guide is essential to understanding the application and making the most of it. The guide and this page should help you get started with your analyses. Please refer to the Documentation section for a link to the guide.

  • Note on CIRCE: Make sure to run your jobs from your $WORK directory!
  • Note: Scripts are provided as examples only. Your SLURM executables, tools, and options may vary from the example below. For help on submitting jobs to the queue, see our SLURM User’s Guide.

Interactive Mode

Use the following commands to open an SRUN interactive session, load the FreeBayes module, and run the FreeBayes binary:

[user@login0 ~]$ srun --time=48:00:00 --nodes=1 --ntasks-per-node=1 --pty /bin/bash
[user@wh-520-4-1 ~]$ module load apps/freebayes/1.0.1
[user@wh-520-4-1 ~]$ freebayes -f ref.fa aln.bam >var.vcf
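Once the run finishes, a quick sanity check is to count the variant records in the output. The sketch below assumes an uncompressed VCF, in which header lines start with # and every other non-empty line is one record. The function name count_variants is illustrative and is not part of FreeBayes.

```shell
# Count variant records in an uncompressed VCF file.
# Header lines begin with '#'; every other non-empty line is one record.
# count_variants is a hypothetical helper, not a FreeBayes tool.
count_variants() {
    awk '!/^#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}
```

For example, "count_variants var.vcf" prints the number of records FreeBayes reported.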

Batch Job submission

Jobs that would take more than 20 minutes to run on a standard PC should be submitted to the scheduling environment on the CIRCE/SC cluster as batch jobs.

If, for example, you wish to call variants assuming a diploid sample, you would set up a submit script to use FreeBayes like this:

#!/bin/bash
#
#SBATCH --job-name=freebayes-test
#SBATCH --time=48:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --output=output.%j.freebayes-test

#### SLURM 1 processor FreeBayes test to run for 48 hours.

# Load the FreeBayes module:
module load apps/freebayes/1.0.1
# Start FreeBayes
freebayes -f ref.fa aln.bam >var.vcf
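Because a job like this may wait in the queue before it starts, it can help to confirm that the inputs exist before submitting. The following is a minimal sketch, assuming the file names ref.fa and aln.bam from the example above. The helper name check_inputs is hypothetical and is not provided by FreeBayes or SLURM.

```shell
# Pre-flight check: confirm each input exists and is non-empty before
# submitting. check_inputs is a hypothetical helper for illustration.
check_inputs() {
    local status=0
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

It could be used as "check_inputs ref.fa aln.bam && sbatch ./freebayes-test.sh", so that the job is only submitted when both files are present.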

If you find freebayes to be slow, you should probably run it in parallel, either with the script below, which runs on a single host, or by generating a series of scripts, one per region, and running them on the cluster.

#!/bin/bash
#
#SBATCH --job-name=freebayes-test
#SBATCH --time=48:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --output=output.%j.freebayes-test

#### SLURM 16 processor FreeBayes test to run for 48 hours.

# Load the FreeBayes and vcflib modules:
module load apps/freebayes/1.0.1
module load apps/vcflib/1.0.0

# Start FreeBayes
freebayes-parallel <(fasta_generate_regions.py ref.fa.fai 100000) $SLURM_NTASKS_PER_NODE \
    -f ref.fa aln.bam >var.vcf
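The first argument to freebayes-parallel is a list of regions, one per line, which the script above generates from the FASTA index with fasta_generate_regions.py. To make the chunking concrete, the following awk sketch produces a comparable list from a .fai file (tab-separated, with the sequence name in column 1 and its length in column 2). The chrom:start-end output with 0-based starts is an assumption about the format the real script emits, and the function name list_regions is illustrative, so compare it against the real script's output before relying on it.

```shell
# Split each sequence in a FASTA index (.fai) into fixed-size regions.
# Column 1 of a .fai file is the sequence name, column 2 its length.
# list_regions is a hypothetical stand-in for fasta_generate_regions.py;
# the chrom:start-end format with 0-based starts is an assumption.
list_regions() {
    awk -v size="$2" 'BEGIN { FS = "\t" } {
        for (start = 0; start < $2; start += size) {
            end = start + size
            if (end > $2) end = $2
            print $1 ":" start "-" end
        }
    }' "$1"
}
```

With a region size of 100000, as in the script above, a 250000 bp sequence named chr1 yields chr1:0-100000, chr1:100000-200000, and chr1:200000-250000. Smaller regions give the 16 tasks more evenly sized pieces of work at the cost of more process startups.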

 
Next, you can change to your job’s directory, and run the sbatch command to submit the job:

[user@login0 ~]$ cd my/jobdir
[user@login0 jobdir]$ sbatch ./freebayes-test.sh
  • You can view the status of your job with the “squeue -u <username>” command
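To follow a single job rather than everything under your username, you can capture the job ID that sbatch prints on submission. The sketch below assumes sbatch's default confirmation line, "Submitted batch job <id>". The helper name job_id_from_sbatch is hypothetical.

```shell
# Extract the numeric job ID from sbatch's confirmation line, which by
# default has the form "Submitted batch job <id>".
# job_id_from_sbatch is a hypothetical helper name.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}
```

For example, jobid=$(sbatch ./freebayes-test.sh | job_id_from_sbatch) stores the ID, and "squeue -j $jobid" then shows only that job.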

Documentation

Home Page, User Guides, and Manuals

More Job Information

See the following for more detailed job submission information:

Reporting Bugs

Report bugs with FreeBayes to the IT Help Desk: rc-help@usf.edu