Difference between revisions of "ESPRIT"
Line 5: | Line 5: | ||
{{AppStandardHeader|20090715|esprit}} | {{AppStandardHeader|20090715|esprit}} | ||
{{ | {{AppRunningOnAll}} | ||
=== Creating and Submitting a Job === | === Creating and Submitting a Job === |
Latest revision as of 21:55, 20 February 2017
Description
ESPRIT is a computational algorithm that can be used to estimate microbial diversity based on 16S rRNA shotgun sequences. It consists of four modules: (1) removes low-quality reads using various criteria, (2) computes pairwise distances of reads, (3) groups reads into OTUs at different dissimilarity levels, and (4) performs statistical inference to estimate species richness.
Version
- 20090715
Authorized Users
CIRCE
account holdersRRA
account holdersSC
account holders
Platforms
CIRCE
clusterRRA
clusterSC
cluster
Modules
ESPRIT requires the following module file to run:
apps/esprit/20090715
- See Modules for more information.
Running ESPRIT on CIRCE/SC
The ESPRIT user guide is essential to understanding the application and making the most of it. The guide and this page should help you to get started with your simulations. Please refer to the Documentation section for a link to the guide.
- Note on CIRCE: Make sure to run your jobs from your $WORK directory!
- Note: Scripts are provided as examples only. Your SLURM executables, tools, and options may vary from the example below. For help on submitting jobs to the queue, see our SLURM User’s Guide.
Creating and Submitting a Job
Job Considerations
In order to submit a ESPRIT job, you will need to have two things:
- an input FASTA file that is pre-trimmed
- the number of pieces the FASTA file should be split into for parallelization
The number of pieces you choose determines how many tasks will be submitted to the cluster. The kmer-distance calculation will require ((num_pieces SELECT 2) + num_pieces) tasks, and the Needleman-Wunch calculation will require (num_pieces * num_pieces) tasks.
For a FASTA file split into 10 pieces, for example, you will have 55 and 100 tasks, respectively.
It is recommended that you do not split the file into more than 10 pieces, since the grid will have to dispatch a lot of tasks and it will take longer for them to finish since not all of them will be dispatched at the same time.
It is also recommended that you run your job from /work to ensure integrity of results.
Submit the Job
- Ensure that the ESPRIT module is loaded in the persistent environment:
[user@login1 ~]$ module load apps/esprit/20090715
- Create a directory for the ESPRIT run:
[user@login1 ~]$ mkdir $WORK/esprit-job1
- Copy your FASTA file into the directory:
[user@login1 ~]$ cp data/test.fas $WORK/esprit-job1
- Finally, change into the job directory and run esprit_cc with the proper options. Press ENTER again to confirm the parameters:
[user@login1 esprit-job1]$ esprit_cc test.fas 5 Specified input to be handled in 5 parts. The number of processes for the KMER distance and Needleman-Wunsh calculations are 15 and 25, respectively. Press Enter to continue. Otherwise, press Ctrl+c to exit. 17666 Seqs Match Primer 17666 Seqs Valid Len 8234 Seqs After Process 0.07 secs in Purging Strings. Calculating KMER distances.. Your job-array 77519.1-15:1 ("kmerdist.26893") has been submitted Waiting on job kmerdist.26893 to complete. Please wait.. Counting Total Records.... 497836 Records Found, Splitting... Writing kmer.dist_24 Calculating Needleman-Wunch Distances: Your job-array 77521.1-25:1 ("needledist.26893") has been submitted Waiting on job needledist.26893 to complete. Please wait.. Your job 77523 ("hcluster.26893") has been submitted Waiting on job hcluster.26893 to complete. Please wait.. Dist 0.01 Seqs 17666 Clusters 8028 Dist 0.02 Seqs 17666 Clusters 6363 Dist 0.03 Seqs 17666 Clusters 6108 Dist 0.04 Seqs 17666 Clusters 5035 Dist 0.05 Seqs 17666 Clusters 4335 Dist 0.06 Seqs 17666 Clusters 3961 Dist 0.07 Seqs 17666 Clusters 3479 Dist 0.08 Seqs 17666 Clusters 3107 Dist 0.09 Seqs 17666 Clusters 2816 Dist 0.10 Seqs 17666 Clusters 2536 Dist 0.11 Seqs 17666 Clusters 2312 Dist 0.12 Seqs 17666 Clusters 2147 Dist 0.13 Seqs 17666 Clusters 2053 Dist 0.14 Seqs 17666 Clusters 1994 Dist 0.15 Seqs 17666 Clusters 1942 Dist 0.16 Seqs 17666 Clusters 1917 Dist 0.17 Seqs 17666 Clusters 1896 Dist 0.18 Seqs 17666 Clusters 1892 Dist 0.19 Seqs 17666 Clusters 1891 Dist 0.20 Seqs 17666 Clusters 1890 4.85 secs in Statistical Analysis.
Documentation
Home Page, User Guides, and Manuals
- ESPRIT User's Guide
- /opt/apps/esprit/20090715/esprit_user_guide.pdf
Reporting Bugs
Report bugs with ESPRIT to the IT Help Desk: rc-help@usf.edu