BLAST

Revision as of 21:56, 20 February 2017 by Botto (talk | contribs)

Description

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.

Note About this Software

mpiBLAST is also available, and we recommend it for its high-performance benefits. However, we will still support the serial version for special cases.

Version

  • 2.2.22

Authorized Users

  • CIRCE account holders
  • RRA account holders
  • SC account holders

Platforms

  • CIRCE cluster
  • RRA cluster
  • SC cluster

Modules

BLAST requires the following module file to run:

  • apps/blast/2.2.22
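After loading the module, it can help to confirm that the executables you plan to call are actually on your PATH. The following is a minimal sketch rather than part of the official instructions: the helper name missing_tools is hypothetical, and the tool names in the usage note assume that the module provides the BLAST+ programs.

```shell
# missing_tools TOOL...: print the name of every TOOL that is not on PATH.
# (Hypothetical helper; it prints nothing when every tool is found.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, after “module add apps/blast/2.2.22”, running “missing_tools blastn blastp makeblastdb” should print nothing if all three programs are available.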

Running BLAST on CIRCE/SC

The BLAST user guide is essential to understanding the application and making the most of it. The guide and this page should help you get started with your searches. Please refer to the Documentation section for a link to the guide.

  • Note on CIRCE: Make sure to run your jobs from your $WORK directory!
  • Note: Scripts are provided as examples only. Your SLURM executables, tools, and options may vary from the example below. For help on submitting jobs to the queue, see our SLURM User’s Guide.

Creating and Submitting a Job

Create a .ncbirc file in your home directory. It should be formatted as follows (the locations of your /work and /home directories will vary):

[NCBI]
DATA=/work/j/joeuser/blast/data

[BLAST]
BLASTDB=/work/j/joeuser/blast/db
  • Description of the above directories:
    • The ‘DATA’ variable gives the location of the NCBI Data directory containing BLOSUM and PAM scoring matrices, among other things. The scoring matrix files are necessary for any type of protein BLAST search and should be accessible by all cluster nodes.
    • The ‘BLASTDB’ variable tells standard NCBI blastall where to find BLAST databases.
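The file and the two directories it names can also be created from the shell. This is a minimal sketch under stated assumptions: write_ncbirc is a hypothetical helper, it overwrites any existing file at the target path, and the directory layout simply mirrors the example above.

```shell
# write_ncbirc BLAST_ROOT NCBIRC_PATH
# Create BLAST_ROOT/data and BLAST_ROOT/db, then write a matching config
# file to NCBIRC_PATH. (Hypothetical helper; it overwrites NCBIRC_PATH.)
write_ncbirc() {
    blast_root=$1
    ncbirc_path=$2
    mkdir -p "$blast_root/data" "$blast_root/db" || return 1
    cat > "$ncbirc_path" <<EOF
[NCBI]
DATA=$blast_root/data

[BLAST]
BLASTDB=$blast_root/db
EOF
}
```

With the example paths used on this page, the call would be “write_ncbirc /work/j/joeuser/blast $HOME/.ncbirc”.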

Prepare a database for use

Research Computing mirrors several of the available databases. They can be found in /opt/apps/ncbi-6.1/blast/fasta. If one of those will serve your needs, simply copy it from that location to your BLASTDB location, and decompress it.

[joeuser@host ~]$ module add apps/blast/2.2.22
[joeuser@host ~]$ cp /opt/apps/ncbi-6.1/blast/fasta/drosoph.nt.gz /work/j/joeuser/blast/db
[joeuser@host ~]$ cd /work/j/joeuser/blast/db
[joeuser@host db]$ gunzip drosoph.nt.gz
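The mirrored files are plain FASTA, and a BLAST search normally needs an indexed database built from that FASTA file. The sketch below bundles the copy, decompress, and indexing steps. It assumes the module provides the BLAST+ makeblastdb program (legacy installations use formatdb instead) and that the file holds nucleotide sequences, as the .nt suffix suggests. The helper name prepare_db is hypothetical.

```shell
# prepare_db SRC_GZ DB_DIR
# Copy a gzip-compressed FASTA file into DB_DIR, decompress it, and index
# it as a nucleotide BLAST database. (Hypothetical helper.)
prepare_db() {
    src_gz=$1
    db_dir=$2
    name=$(basename "$src_gz" .gz)
    mkdir -p "$db_dir" || return 1
    cp "$src_gz" "$db_dir/" || return 1
    gunzip -f "$db_dir/$name.gz" || return 1
    # Assumption: nucleotide data, hence -dbtype nucl (use prot for proteins).
    (cd "$db_dir" && makeblastdb -in "$name" -dbtype nucl)
}
```

With the example paths used on this page, the call would be “prepare_db /opt/apps/ncbi-6.1/blast/fasta/drosoph.nt.gz /work/j/joeuser/blast/db”.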

Submit the Job

If, for example, you have a BLAST input file named query.in, you would set up a submit script like this:

  • The script below (for testing, name it “blast-test.sh”) can be copied into your job directory (the folder with your input files) and modified so that you can submit batch processes to the queue. For help on submitting jobs to the queue, see our SLURM User’s Guide. Scripts are provided as examples only. Your SLURM executables, tools, and options will vary.
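#!/bin/bash
#
#SBATCH --comment=blast-test
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --job-name=blast-test
#SBATCH --output=output.%j.blast-test
#SBATCH --time=01:00:00

#### SLURM 4 processor BLAST test to run for 1 hour.

# Load the BLAST environment.
module add apps/blast/2.2.22

# drosoph.nt is a nucleotide database, so it is searched with blastn here
# (blastp expects a protein database). This assumes query.in holds
# nucleotide sequences. The 4 threads match --ntasks above.
blastn -db drosoph.nt -query query.in -out results.txt -num_threads 4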

 
Next, you can change to your job’s directory, and run the sbatch command to submit the job:

[user@login0 ~]$ cd my/jobdir
[user@login0 jobdir]$ sbatch ./blast-test.sh
  • You can view the status of your job with the “squeue -u <username>” command
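If you want to follow one specific job rather than all of your jobs, you can capture the job ID at submission time. The sketch below relies on the --parsable option of sbatch, which prints the job ID, optionally followed by a semicolon and the cluster name. The helper name submit_job is hypothetical.

```shell
# submit_job SCRIPT: submit SCRIPT with sbatch and print only the job ID.
# (Hypothetical helper.)
submit_job() {
    script=$1
    # --parsable prints "jobid" or "jobid;cluster"
    out=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix.
    echo "${out%%;*}"
}
```

For example, “job_id=$(submit_job ./blast-test.sh)” followed by “squeue -j $job_id” shows only that job. With the example script above, its output is written to output.<jobid>.blast-test in the job directory.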

Documentation

Home Page, User Guides, and Manuals

More Job Information

See the following for more detailed job submission information:

Reporting Bugs

Report bugs with BLAST to the IT Help Desk: rc-help@usf.edu