Genomic Researchers Rely on Globus for Large Data Transfers
University of Michigan
The UM Advanced Genomics Core is one of several core labs within the Biomedical Research Core Facilities of Michigan Medicine’s Office of Research. It offers shared resources to the University and focuses on DNA and RNA sequencing, genotyping, and spatial transcriptomics.
The Biomedical Research Core Facility, like all facilities dealing with human genome sequencing, is experiencing explosive data growth. Every two days it delivers more data than it did in its first 20 years of existence. This growth is due to the dramatic reduction in cost and the availability of new scientific instruments: today’s genome sequencing instruments generate up to 6TB of raw data per run, and another 6.5TB of data can then be generated through demultiplexing, alignment, and post-processing work.
The Advanced Genomics Core team employs both “push” and “pull” methods to handle data transfers with Globus. The sequencers are set up with a remote mount so they can write directly to the network. The Advanced Genomics team then does post-processing work such as demultiplexing and alignment and, in some cases, some primary analysis work for the researcher. The data is then moved to medium-term storage, where storage space has been set up and the researcher has granted write access to the data management team so that data can be pushed to the researcher’s storage. For heavy users, the facility pushes the data out to the researcher, who no longer needs to worry about any of the details of moving the data. Instead, the researcher can simply publish the data, do additional analysis, and finally archive the data. In other cases, the team uses more of a “pull” method, where Globus is set up to email the researcher when new data is available. The researcher logs into Globus and pulls down the data; a couple of weeks later, the facility uses Globus to move the data to a long-term tape archive.
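The push and pull paths described above can be pictured as a simple routing decision. The sketch below is only an illustration in plain Python: it does not call Globus, and the names `Researcher`, `plan_delivery`, and `ARCHIVE_DELAY` are hypothetical. It also assumes that “a couple of weeks” means 14 days and that a push requires both heavy usage and granted write access, neither of which the facility states exactly.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "a couple of weeks" is modelled as 14 days.
ARCHIVE_DELAY = timedelta(days=14)


@dataclass
class Researcher:
    """Hypothetical record describing how a researcher receives data."""
    name: str
    heavy_user: bool = False
    granted_write_access: bool = False


def plan_delivery(researcher: Researcher, run_finished: date):
    """Return (mode, actions) for one processed sequencing run.

    mode is "push" when the facility can write straight to the
    researcher's storage, otherwise "pull".
    """
    if researcher.heavy_user and researcher.granted_write_access:
        # Push: the facility transfers the data; the researcher does nothing.
        return "push", [f"transfer run data to storage owned by {researcher.name}"]

    # Pull: notify the researcher, then archive after the assumed delay.
    archive_on = run_finished + ARCHIVE_DELAY
    return "pull", [
        f"email {researcher.name} that new data is available",
        f"move data to long-term tape archive on {archive_on.isoformat()}",
    ]


if __name__ == "__main__":
    finished = date(2024, 1, 1)
    for person in (Researcher("lab_a", True, True), Researcher("lab_b")):
        mode, actions = plan_delivery(person, finished)
        print(person.name, mode, actions)
```

In a real deployment, each action string would be replaced by an actual transfer or notification; the sketch shows only how the two delivery modes differ from the researcher’s point of view.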
Quotes
“We have hundreds of researchers who have no idea they are ‘big data’ researchers because of Globus. Your services are so much better than FTP, email or mailing out hard drives. There is no way our campus would be able to handle this amount of data without Globus.”
- Software Development Lead, University of Michigan