From the VCFtools Home Page: VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
This toolset can be used to perform the following operations on VCF files:
- Filter out specific variants
- Compare files
- Summarize variants
- Convert to different file types
- Validate and merge files
- Create intersections and subsets of variants
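A few of these operations can be sketched on a toy file. The example below is a minimal illustration, not a recommended workflow: the VCF records, sample names (S1, S2), and output prefixes are made up, and the VCFtools calls run only if the `vcftools` binary is already on your PATH (for example, after loading the module).

```shell
# Build a tiny, hypothetical VCF in a scratch directory (illustrative data only).
tmpdir=$(mktemp -d)
vcf="$tmpdir/demo.vcf"
{
    printf '##fileformat=VCFv4.2\n'
    printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
    printf '1\t1200000\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
    printf '1\t1500000\t.\tAT\tA\t50\tPASS\t.\tGT\t0/0\t0/1\n'
    printf '1\t2500000\t.\tC\tT\t50\tPASS\t.\tGT\t0/1\t0/0\n'
} > "$vcf"

if command -v vcftools >/dev/null 2>&1; then
    # Summarize: per-site allele frequencies, written to demo.frq
    vcftools --vcf "$vcf" --freq --out "$tmpdir/demo"
    # Filter: drop indels and write the remaining sites to demo_snps.recode.vcf
    vcftools --vcf "$vcf" --remove-indels --recode --out "$tmpdir/demo_snps"
    # Convert: write PLINK files demo_plink.ped and demo_plink.map
    vcftools --vcf "$vcf" --plink --out "$tmpdir/demo_plink"
else
    echo "vcftools not found on PATH; only the demo VCF was created" >&2
fi
```

Each call also writes a `.log` file next to its output, which records the filters applied and the number of sites kept.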
VCFtools consists of two parts: a Perl module and a binary executable. The Perl module is a general Perl API for manipulating VCF files, whereas the binary executable provides general analysis routines.
VCFtools requires the following module file to run:
- apps/vcftools/0.1.14
- See Modules for more information.
Running VCFtools on CIRCE/SC
The VCFtools user guide is essential to understanding the application and making the most of it. The guide and this page should help you get started with your analyses. Please refer to the Documentation section for a link to the guide.
- Note on CIRCE: Make sure to run your jobs from your $WORK directory!
- Note: Scripts are provided as examples only. Your SLURM executables, tools, and options may vary from the example below. For help on submitting jobs to the queue, see our SLURM User’s Guide.
Next, use the following commands to open an SRUN Interactive Session, load the module for VCFtools, and execute the VCFtools binary:
[user@login0 ~]$ srun --time=48:00:00 --nodes=1 --ntasks-per-node=1 --pty /bin/bash
[user@wh-520-4-1 ~]$ module load apps/vcftools/0.1.14
[user@wh-520-4-1 ~]$ vcftools --vcf input_data.vcf --chr 1 --from-bp 1000000 --to-bp 2000000
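Run with only filtering options, VCFtools reports how many sites passed the filters (on screen and in a log file) but does not write the surviving variants to a new file. The sketch below adds the `--recode` output option so the filtered region is saved. The output prefix `chr1_1Mb_2Mb` and the helper function `run_filter` are hypothetical names chosen for this example; `input_data.vcf` is the same placeholder used above.

```shell
# Placeholder names: adjust to your own data.
input="input_data.vcf"
out_prefix="chr1_1Mb_2Mb"

# Keep chromosome 1, positions 1,000,000 to 2,000,000, and write the result.
# --recode            write the filtered sites as a new VCF
# --recode-INFO-all   carry the INFO fields over to the new file
# --out               prefix for the output files (defaults to "out" if omitted)
run_filter() {
    vcftools --vcf "$1" --chr 1 --from-bp 1000000 --to-bp 2000000 \
        --recode --recode-INFO-all --out "$2"
}

if command -v vcftools >/dev/null 2>&1 && [ -f "$input" ]; then
    run_filter "$input" "$out_prefix"
else
    echo "vcftools or $input not found; load the module and check the path" >&2
fi
```

On success this should leave `chr1_1Mb_2Mb.recode.vcf` and `chr1_1Mb_2Mb.log` in the current directory.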
Batch Job submission
On the CIRCE/SC cluster, any job that would take more than 20 minutes to run on a standard PC must be submitted to the scheduling environment as a batch job.
If, for example, you wish to filter out variants from a VCF file, you would set up a submit script that uses vcftools like this:
#!/bin/bash
#
#SBATCH --job-name=vcftools-test
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --output=output.%j.vcftools-test

#### Slurm 1 processor vcftools test to run for 1 hour.

# Load the vcftools module:
module load apps/vcftools/0.1.14

# Start vcftools
vcftools --vcf input_data.vcf --chr 1 --from-bp 1000000 --to-bp 2000000
Next, you can change to your job’s directory, and run the sbatch command to submit the job:
[user@login0 ~]$ cd my/jobdir
[user@login0 jobdir]$ sbatch ./vcftools-test.sh
- You can view the status of your job with the “squeue -u <username>” command
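To follow one specific job rather than everything you have queued, you can capture the job ID that sbatch prints at submission time. The sketch below assumes sbatch's usual "Submitted batch job <id>" message and the script name `vcftools-test.sh` used above; the helper `extract_job_id` is a hypothetical name introduced for this example.

```shell
# Pull the numeric job ID out of sbatch's "Submitted batch job <id>" line.
extract_job_id() {
    awk '/^Submitted batch job/ {print $4}'
}

if command -v sbatch >/dev/null 2>&1 && [ -f ./vcftools-test.sh ]; then
    job_id=$(sbatch ./vcftools-test.sh | extract_job_id)
    echo "Submitted job $job_id"
    # Show the queue entry for this job only.
    squeue -j "$job_id"
    # Once the job has run, its log is written to output.<jobid>.vcftools-test,
    # as set by the "#SBATCH --output=output.%j.vcftools-test" line.
else
    echo "sbatch or ./vcftools-test.sh not found; nothing submitted" >&2
fi
```

When the job no longer appears in the squeue listing, it has finished (or failed), and the output file is the first place to look.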
Home Page, User Guides, and Manuals
- VCFtools Home Page:
- VCFtools Documentation:
More Job Information
See the following for more detailed job submission information:
Report bugs with VCFtools to the IT Help Desk: firstname.lastname@example.org