Genomic Sciences Lab uses Globus to streamline research data sharing
When David (Andy) Baltzegar reached out to NC State’s Research Facilitation Service (RFS) two years ago, his team at the Genomic Sciences Laboratory (GSL) was looking for a better way to send data to researchers.
The GSL generates hundreds of terabytes of research data each year through its next-generation DNA sequencing services. The lab serves about 600 students, faculty and staff, as well as external customers like the U.S. Department of Agriculture. A large portion of its sequencing work contributes to crop improvements such as better drought resistance and pest tolerance.
“More and more people are using us, and the data is bigger than we can handle ourselves on a server,” said Baltzegar, the GSL’s director.
The lab worked with the RFS to add Globus, a secure data transfer and sharing service managed by the Office of Information Technology (OIT), as a convenient data retrieval option for customers.
Baltzegar had already addressed several pain points the year before with the help of the RFS. The service is a collaboration among OIT, the Office of Research and Innovation and the NC State University Libraries, and it serves as a single point of contact for researchers to learn about the resources available to them.
At the time, the GSL’s process for managing its sequencing results was cumbersome. The team had to move the data from its local server to the High Performance Computing (HPC) cluster for processing, then back to the local server for the customer to download and move to their own storage location. Beyond needing a simpler process, the GSL lacked the storage capacity to support the amount of data generated by the advanced sequencing technology.
“We looked at some cloud options. They were extremely expensive and we would have to pass that pricing to the user,” said Baltzegar.
The RFS team updated the GSL’s Lab Management order form to improve order processing and allow customers to retrieve their data directly from NC State Research Storage. This eliminated the need for the GSL to move data back to the local server, where it was taking up scarce storage space.
With this foundation in place, adding Globus as a retrieval option further enhanced the GSL’s services. RFS-built workflows made the Globus piece seamless for both the lab and its customers.
“It puts everything in one order and one form that’s captured, and then on the backend the scripts work with what we do and we can just plug and play,” said Baltzegar.
Because the Hazel HPC cluster and Research Storage are already mounted in Globus, NC State researchers can easily use the tool to move their sequencing data to their OIT-managed data storage. Individuals, including external customers, can add endpoints to transfer the data to other locations.
The customer simply inputs their Globus ID — for NC State researchers, that’s their Unity ID — into the order form. Once the GSL makes the data available, the customer chooses the storage destination and clicks a button to start the transfer, which happens behind the scenes. The customer and the GSL both receive a notification when the transfer is complete.
“Researchers across the university are drowning in data, struggling to transfer massive datasets and share them with others. More and more often, the answer is simply, ‘Just use Globus,’” said Andy Kurth, OIT Research Storage specialist and RFS coordinator. “Its convenience, speed and versatility have been invaluable in helping solve many of today’s complex data management challenges.”
While most of NC State’s current Globus users are researchers exchanging data with external collaborators, the tool has broader uses.
“Globus is not limited to massive research datasets,” said Derek Ballard, IT manager in OIT Shared Services. “Globus can handle any large data transfer regardless of the types of data.”