Difference between revisions of "GDC-client"

Revision as of 14:05, 29 January 2019

Description

From the Genomic Data Commons (GDC) Home Page: The GDC Data Transfer Tool, a command-line driven application, provides an optimized method of transferring data to and from the GDC and enables resumption of interrupted transfers.

Version

  • 1.4.0

Authorized Users

  • CIRCE account holders
  • RRA account holders
  • SC account holders

Platforms

  • CIRCE cluster
  • RRA cluster
  • SC cluster

Running GDC-client on CIRCE/SC

The GDC-client user guide is essential to understanding the application and making the most of it. The guide and this page should help you get started with your data transfers. Please refer to the Documentation section for a link to the guide.

  • Note on CIRCE: Make sure to run your jobs from your $WORK directory!
  • Note: Scripts are provided as examples only. Your SLURM executables, tools, and options may vary from the example below. For help on submitting jobs to the queue, see our SLURM User’s Guide.

Interactive Mode

Use the following commands to open an SRUN Interactive Session, load the module for the GDC Data Transfer Tool, and execute the gdc-client binary:

[user@itn0 ~]$ srun --time=24:00:00 --nodes=1 --ntasks-per-node=1 --pty /bin/bash
[user@wh-520-4-1 ~]$ module load apps/gdc-client/1.4.0
[user@wh-520-4-1 ~]$ gdc-client download -m gdc_manifest_6746fe840d924cf623b4634b5ec6c630bd4c06b5.txt
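After a download like the one above completes, it can be useful to double-check the files against the manifest. The following is a minimal sketch, not part of gdc-client itself: `verify_manifest` is a hypothetical helper, and it assumes the manifest is a tab-separated file whose first three columns are `id`, `filename`, and `md5`, and that each file was saved as `<download_dir>/<id>/<filename>`. Adjust it if your manifest or directory layout differs.

```shell
# Hypothetical helper (not part of gdc-client): compare downloaded files
# with the md5 column of a manifest.
# Assumptions: tab-separated manifest with a header row and the columns
# id, filename, md5 first; files stored as <download_dir>/<id>/<filename>.
verify_manifest() {
    local manifest download_dir tab failures header id filename md5 rest path actual
    manifest="$1"
    download_dir="${2:-.}"
    tab=$(printf '\t')
    failures=0

    {
        # Discard the header row.
        read -r header || true

        # Read one manifest row at a time (also handles a final row
        # that lacks a trailing newline).
        while IFS="$tab" read -r id filename md5 rest || [ -n "$id" ]; do
            [ -n "$id" ] || continue   # skip blank lines
            path="$download_dir/$id/$filename"
            if [ ! -f "$path" ]; then
                echo "MISSING  $path"
                failures=$((failures + 1))
                continue
            fi
            actual=$(md5sum "$path" | cut -d' ' -f1)
            if [ "$actual" = "$md5" ]; then
                echo "OK       $path"
            else
                echo "MISMATCH $path"
                failures=$((failures + 1))
            fi
        done
    } < "$manifest"

    # Return non-zero if any file was missing or did not match.
    [ "$failures" -eq 0 ]
}
```

For example, `verify_manifest gdc_manifest_6746fe840d924cf623b4634b5ec6c630bd4c06b5.txt .` would check files downloaded into the current directory and return a non-zero exit status if any file is missing or mismatched.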


Batch Job submission

Jobs that would take more than 20 minutes to run on a standard PC must be submitted to the scheduling environment on the CIRCE/SC cluster as batch jobs.

If, for example, you want to download data using a manifest file, you would set up a submit script to use gdc-client like the following.

#!/bin/bash
#
#SBATCH --job-name=gdc-client-test
#SBATCH --time=24:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --output=output.%j.gdc-client-test

#### SLURM 1 processor gdc-client example to run for 24 hours.

# Load the gdc-client module:
module load apps/gdc-client/1.4.0

# Start gdc-client download
gdc-client download -m gdc_manifest_6746fe840d924cf623b4634b5ec6c630bd4c06b5.txt
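
Before submitting a script like the one above, it may help to know how much data the manifest describes, since that can inform the `--time` request. The sketch below is only an illustration: `manifest_summary` is a hypothetical helper, and it assumes the manifest is tab-separated with a header row that names a `size` column in bytes.

```shell
# Hypothetical helper: print the number of files and the total size
# listed in a manifest.
# Assumption: tab-separated file whose header row contains a "size"
# column holding file sizes in bytes.
manifest_summary() {
    awk -F '\t' '
        NR == 1 {
            # Locate the size column by name in the header row.
            for (i = 1; i <= NF; i++) if ($i == "size") col = i
            if (!col) { print "no size column found" > "/dev/stderr"; exit 1 }
            next
        }
        NF { files++; bytes += $col }   # count non-empty rows, sum sizes
        END {
            if (col) printf "%d files, %.2f GiB\n", files, bytes / (1024 ^ 3)
        }
    ' "$1"
}
```

Running `manifest_summary gdc_manifest_6746fe840d924cf623b4634b5ec6c630bd4c06b5.txt` would print a single summary line such as the file count and total size in GiB.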

 
Next, be sure that you are in the correct job directory, and run the sbatch command to submit the job:

[user@itn0 ~]$ cd my/jobdir
[user@itn0 jobdir]$ sbatch ./gdc-client-test.sh
  • You can view the status of your job with the “squeue -u <username>” command
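
If you submit several jobs, it can be convenient to capture the job ID at submission time and query only that job. The following is a small sketch under the assumption that sbatch prints its usual confirmation line, "Submitted batch job <id>"; `job_id_from_sbatch` is a hypothetical helper name.

```shell
# Hypothetical helper: read sbatch output on stdin and print the numeric
# job ID from the line "Submitted batch job <id>".
job_id_from_sbatch() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Example use (commented out so the snippet can be sourced safely):
# job_id=$(sbatch ./gdc-client-test.sh | job_id_from_sbatch)
# squeue -j "$job_id"
```

Querying by job ID with `squeue -j` narrows the listing to the one job, whereas `squeue -u <username>` shows all of your jobs.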

Documentation

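Home Page, User Guides, and Manuals

  • GDC Data Transfer Tool Documentation:
      • https://gdc.cancer.gov/access-data/gdc-data-transfer-tool
      • https://docs.gdc.cancer.gov/Data_Transfer_Tool/Users_Guide/Data_Download_and_Upload/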

More Job Information

See the following for more detailed job submission information:

Reporting Bugs

Report bugs with GDC-client to the IT Help Desk: rc-help@usf.edu