Data Transfer & Storage Policies

Molecular Biology Core Data Transfer

Current state-of-the-art sequencing platforms like the Illumina NovaSeq X Plus can generate large amounts of data quickly and at historically low prices. As a result, transferring and storing these data presents a significant challenge. To make data transfer as painless as possible, we try to support a number of different options, including local share folders, cloud resources, and platforms such as DNAnexus.


Data Storage Policies

Historically, the MBCF has tried to maintain local storage of all the data we generate "forever". Moore's Law made this possible throughout our history of running Sanger sequencing, microarrays, and even the Illumina MiSeq and NextSeq. However, the NovaSeq 6000 and NovaSeq X Plus have made storing data "forever" cost-prohibitive.

Current Policy

  • Raw data will be stored for no more than 5 years.
  • Within 3 months after initial transfer, data will be archived.
  • Retrieval from archive will take time (1-2 weeks) and incur additional fees ($1/GB).
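To get a feel for what the retrieval fee means in practice, the arithmetic can be sketched in shell. This is only an illustration of the flat $1/GB rate listed above: the function name retrieval_fee_usd and the RATE_PER_GB override are hypothetical, sizes are whole gigabytes, and the actual invoice may be calculated differently.

```shell
# Estimate the archive retrieval fee in USD for a data set of a given size.
# Assumes the flat $1/GB rate from the policy above; RATE_PER_GB is a
# hypothetical override in case the rate changes. Whole-number GB only.
retrieval_fee_usd() {
  local size_gb="$1"
  local rate="${RATE_PER_GB:-1}"
  echo $(( size_gb * rate ))
}

retrieval_fee_usd 1500   # prints 1500 (a 1500 GB data set costs about $1,500)
```

At this rate, retrieving even a single large run from the archive can cost far more than keeping your own copy, which is why downloading your data promptly matters.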

Download and store your FASTQ files.


Dana-Farber Internal Data Transfer

RCSM Share Folders: This is the fastest and most efficient way to transfer internal Dana-Farber data. The shares require an RCSM storage space. This is the future of Dana-Farber storage; if yours is not set up yet, open a ticket with research computing today!

Legacy rc-stor Share Folders: A fast and efficient way to transfer internal Dana-Farber data, but space is limited. Please consider migrating to the RCSM storage system (see above) or using DNAnexus (see below).

DNAnexus: A cloud-based storage and analysis platform (primarily for external data transfer).


Data Transfer (Internal & External)

Note: We are in the process of shutting down our FTP data transfer service in order to provide more efficient data transfer with increased capacity. Data previously transferred via FTP will now be transferred using the DNAnexus cloud platform or an alternative such as AWS S3 or Google Cloud Storage buckets.

DNAnexus:

  1. Register DNAnexus account: Dana-Farber DNAnexus SSO. You must use an institutional email address (for example, one ending in .edu or .org); personal addresses such as Gmail are not accepted. Dana-Farber users must use their MGB ID.

  2. Get access to your project: Let the MBCF know that you have set up your account and provide your username/email so that we can give you access to your project.

  3. Download Data (command line tools): While it is possible to download data using the browser, we strongly recommend that you use the command line tools for large data files (FASTQ, BAM, etc.).

Command line options to download data:

  1. Install dx-toolkit for local command line access: https://documentation.dnanexus.com/downloads
  2. Select Project: $ dx select [Project ID]
  3. View folders: $ dx ls
  4. Download Data: $ dx download -r [Folder]

For more information, please refer to the DNAnexus download manual.
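The command line steps above can be strung together into a small shell helper. This is a minimal sketch, not an MBCF-provided script: the function name fetch_run_folder, the DX_BIN variable, and the project ID and folder names in the usage comment are hypothetical placeholders. It assumes dx-toolkit is installed and that you have already authenticated with dx login.

```shell
# DX_BIN is a hypothetical override so the helper can be dry-run with another
# command (for example DX_BIN=echo); by default it calls the real dx client.
DX_BIN="${DX_BIN:-dx}"

# fetch_run_folder PROJECT_ID FOLDER [DEST_DIR]
# Selects the project, lists the folder, then downloads it recursively
# into DEST_DIR (default: the current directory).
fetch_run_folder() {
  local project_id="$1" folder="$2" dest="${3:-.}"
  mkdir -p "$dest" &&
  "$DX_BIN" select "$project_id" &&
  "$DX_BIN" ls "$folder" &&
  "$DX_BIN" download -r "$folder" -o "$dest"
}

# Example (placeholders, not real identifiers):
#   dx login
#   fetch_run_folder project-xxxx /fastq ./fastq_download
```

Chaining the steps with && stops the helper at the first failure, so a mistyped project ID or folder name does not lead to a partial download in the wrong place.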


Other Data Transfer Methods

Already have your own cloud storage in place? Excellent! Give us permission and we can post data directly to your bucket.

AWS: Please email the MBCF group to coordinate. We will need you to provide the Access Key and Secret Key. 

Google Bucket: Please email the MBCF group to coordinate. We can provide an email address to which you can grant permissions.
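For the Google option, granting access usually comes down to one IAM binding on your bucket. The sketch below only prints the gcloud command so that you can review it before running it. The function name grant_bucket_write_cmd, the bucket name, and the address are hypothetical, and the role roles/storage.objectCreator is an assumption; confirm with the MBCF which address, principal type, and role they actually need.

```shell
# grant_bucket_write_cmd BUCKET MEMBER
# Prints (does not run) a gcloud command that would let MEMBER create
# objects in BUCKET. MEMBER must include its principal type, for example
# user:someone@example.org or serviceAccount:name@example.org.
# The role is an assumed choice for write-only delivery.
grant_bucket_write_cmd() {
  local bucket="$1" member="$2"
  printf 'gcloud storage buckets add-iam-policy-binding gs://%s --member=%s --role=roles/storage.objectCreator\n' \
    "$bucket" "$member"
}

# Example with placeholder values:
#   grant_bucket_write_cmd my-lab-bucket user:core-address@example.org
```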