Difference between revisions of "MpiBLAST"

m (Botto moved page MPI BLAST to MpiBLAST)
(No difference)

Revision as of 19:23, 29 June 2016

Description

mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. By efficiently utilizing distributed computational resources through database fragmentation, query segmentation, intelligent scheduling, and parallel I/O, mpiBLAST improves NCBI BLAST performance by several orders of magnitude while scaling to hundreds of processors. mpiBLAST is also portable across many different platforms and operating systems. Lastly, a renewed focus and consolidation of the many codebases has positioned mpiBLAST to continue to be of high utility to the bioinformatics community.

Version

  • 1.6.0

Authorized Users

  • CIRCE account holders
  • RRA account holders
  • SC account holders

Platforms

  • CIRCE cluster
  • RRA cluster
  • SC cluster

Modules

MpiBLAST requires the following module file to run:

  • apps/mpiblast/1.6.0

Creating and Submitting a Job

Create a .ncbirc file in your home directory. It should be formatted as follows (user location of /work directory will vary):

[mpiBLAST]
  Shared=/work/j/joeuser/blast
  Local=/tmp/joeuser/blast

‘Shared’ is a location that all of the nodes doing the computation will have access to and ‘Local’ should be a faster, local drive.

Prepare a database for use

Research Computing mirrors several of the available databases. They can be found in /opt/apps/ncbi-6.1/blast/fasta. If one of those will serve your needs, simply copy it from that location to your Shared location, and decompress it.

[joeuser@host ~]$ module add apps/mpiblast/1.6.0
[joeuser@host ~]$ cp /opt/apps/ncbi-6.1/blast/fasta/drosoph.nt.gz /work/j/joeuser/blast
[joeuser@host ~]$ cd /work/j/joeuser/blast
[joeuser@host blast]$ gunzip drosoph.nt.gz

Then, you need to format the database for use. Below is an example way to do so. It assumes you want to run a job searching against the drosoph.nt database, on 4 nodes.

[joeuser@host blast]$ mpiformatdb --nfrags=4 -i drosoph.nt -pF --quiet

Submit the Job

Below is a sample job script. It assumes that you want a hard runtime of 1 hour and that you have set up the database in 4 fragments. It also assumes that your value being searched for is in a file called query.in, in the current working directory. You’ll need to request two more processors than the number of fragments for the two coordinating processes that are created during execution.

#!/bin/sh
#
#SBATCH --job-name=mpiblast-test
#SBATCH --time=01:00:00
#SBATCH --ntasks=6

#### Slurm 6 processor mpiBLAST test to run for 1 hour.

module purge
module load apps/mpiblast/1.6.0

mpirun mpiblast -d drosoph.nt -i query.in -p blastn -o results.txt 

 
Next, you can change to your job’s directory, and run the sbatch command to submit the job:

[user@login0 ~]$ cd my/jobdir
[user@login0 jobdir]$ sbatch ./mpiblast-test.sh

Example query.in file for drosoph.nt:

>Search sequence
AGTACGTAAAGAAAATCTTTTTTTGGCGACATCATTGTATTTGGTAGTATACGATTTCCAGCATCCACAACTATTTCCT

Documentation

Home Page, User Guides, and Manuals

More Job Information

See the following for more detailed job submission information:

Reporting Bugs

Report bugs with MpiBLAST to the IT Help Desk: rc-help@usf.edu