SC25
- St. Louis, MO
Stop by the Globus booth #4324 while at SC25. This year we are also hosting several members of the community, who will discuss how they are using Globus to advance their research.
Check back for more details on the schedule.
Here are some other sessions that you will want to join while at the conference:
Sunday, November 16, 2025
Building Scalable Agentic Systems for Science: Concepts, Architectures, and Hands-On with Academy
8:30 a.m. - 12 p.m., Room 126
Tutorial
Agentic systems, in which autonomous agents collaborate to solve complex problems, are emerging as a transformative methodology in AI. However, adapting agentic architectures to scientific cyberinfrastructure — spanning HPC systems, experimental facilities, and federated data repositories — introduces new technical challenges.
In this half-day tutorial, we introduce participants to the design, deployment, and management of scalable agentic systems for scientific discovery. We will present Academy, a Python-based middleware platform built to support agentic workflows across heterogeneous research environments. Participants will learn core agentic system concepts, including asynchronous execution models, stateful agent orchestration, and dynamic resource management. We will explore the design of real-world agentic applications and discuss common patterns for integrating with widely used scientific tools and infrastructure. A guided hands-on session will then help attendees build and launch their own agentic systems. This tutorial is designed for researchers, developers, and cyberinfrastructure professionals interested in advancing AI-driven science with next-generation autonomous systems.
Presenters: Ian Foster, Kyle Chard, J. Gregory Pauloski, Alok Kamatar
HUST 25: 12th International Workshop on HPC User Support Tools
Workshop
2:00 p.m. - 5:30 p.m., Room 276
Supercomputing centers exist to drive scientific discovery by supporting researchers in computational science fields. To make users more productive in the complex HPC environment, HPC centers employ user support teams. These teams serve many roles, from setting up accounts, to consulting on math libraries and code optimization, to managing HPC software stacks. Often, support teams struggle to adequately support scientists. HPC environments are extremely complex, and combined with the complexity of multi-user installations, exotic hardware, and maintaining research software, supporting HPC users can be extremely demanding. With the twelfth HPC User Support Tools (HUST) workshop, we continue to provide a necessary forum for system administrators, user support team members, tool developers, policy makers, and end users. We provide a forum to discuss support issues and we provide a publication venue for current support developments. Scope includes best practices, user support tools, and ideas to streamline user support at supercomputing centers.
Session Chairs: Elsa J. Gonsiorowski, Lawrence Livermore National Laboratory (LLNL); Lev Gorenstein, Globus, University of Chicago; Chris Bording, University of Western Australia; Aaron P. Jezghani, Georgia Institute of Technology; Ilya Zhukov, Jülich Supercomputing Centre (JSC)
Exploring Distributed Vector Databases Performance on HPC Platforms: A Study with Qdrant
Workshop
2:00 p.m. - 2:20 p.m., Room 241
Vector databases have rapidly grown in popularity, enabling efficient similarity search over data such as text, images, and video. They now play a central role in modern AI workflows, aiding large language models (LLMs) by grounding model outputs in external scientific literature through retrieval-augmented generation (RAG). Despite their importance, little is known about vector databases’ performance characteristics on high-performance computing (HPC) systems that drive large-scale science. This work presents an empirical study of distributed vector database performance on SUPERCOMPUTER. We construct a realistic biological-text workload from BV-BRC and generate embeddings from the peS2o corpus using Qwen3-Embedding-4B. We select Qdrant to evaluate insertion, index construction, and query latency with up to 32 workers. Drawing on practical lessons from our study, this work takes a first step toward characterizing vector database performance on HPC platforms to guide future research and optimization.
Seth Ockerman (University of Wisconsin-Madison, sockerman@anl.gov); Amal Gueroudji (Argonne National Laboratory); Song Young Oh (University of Chicago); Robert Underwood, Nickolas Chia, and Kyle Chard (Argonne National Laboratory); Robert Ross (Argonne National Laboratory (ANL)); and Shivaram Venkataraman (University of Wisconsin-Madison)
Monday, November 17, 2025
Enabling Scalable and Sustainable Research for Data-Intensive Science
Invited Talk
9:04 a.m. - 10 a.m., Room 265
Scientific research increasingly depends on the movement, management, and analysis of massive data volumes. Globus, a widely used research IT platform, addresses these needs by providing secure, reliable, and high-performance capabilities for data management, computation, and workflows across global research cyberinfrastructure. Serving more than 700,000 researchers and applications across 60,000 active data collections in over 80 countries, Globus has become a critical enabler of data-intensive science. In this talk, I will highlight two ways in which Globus supports innovation and sustainability in research computing. First, I will describe a framework built on Globus that integrates error-bounded lossy compression into data transfers, using machine learning-based quality estimation and optimized transfer strategies to achieve performance improvements while maintaining user-specified quality. Second, I will discuss how Globus itself provides a model for sustainable research software, via hybrid cloud and “freemium” subscription approaches that balance accessibility with long-term viability.
Presenter: Kyle Chard
RESILIO: A Scalable and Composable Architecture for Tomographic Reconstruction Workflows
Workshop
9:06 a.m. - 9:24 a.m., Room 264
Tomographic reconstruction (TR) aims to reconstruct a 3D object from 2D projections. It is an important technique across domains such as medical imaging and materials science, where high-resolution volumetric data is essential for decision-making. With advanced facilities such as the upgraded APS enabling unprecedented data acquisition rates, TR pipelines struggle to handle large data volumes while maintaining low latency, fault tolerance, and scalability. Traditional, tightly coupled, batch-oriented workflows are increasingly inadequate in such high-performance contexts. In response, we propose RESILIO, a composable, high-performance TR framework built atop the Mochi ecosystem that uses persistent streaming and fully leverages HPC platforms. Our design enables scalable and elastic execution across heterogeneous environments. We contribute a reimagined TR architecture, its implementation using Mochi, and an empirical evaluation showing up to 3490× reduction in the per-event overhead compared to the original implementation, and up to 3268× improvement in throughput with performance-tuned configurations using Mofka.
Authors/Presenters: Amal Gueroudji, Matthieu Dorier, Philip Carns, Parth Patel, Tekin Bicer, Robert Latham, Robert Ross, Ian Foster, Kyle Chard
Tuesday, November 18, 2025
State of the Practice
Addressing Reproducibility Challenges in HPC with Continuous Integration
Paper
1:30 p.m. - 2:58 p.m., Room 275
The high performance computing (HPC) community has adopted incentive structures to motivate reproducible research, with major conferences awarding badges to papers that meet reproducibility requirements. Yet, many papers do not meet such requirements. The uniqueness of HPC infrastructure and software, coupled with strict access requirements, may limit opportunities for reproducibility. In the absence of resource access, we believe that regular documented testing, through continuous integration (CI), coupled with complete provenance information, can be used as a substitute. Here, we argue that better HPC-compliant CI solutions will improve reproducibility of applications. We present a survey of reproducibility initiatives and describe the barriers to reproducibility in HPC. To address existing limitations, we present a GitHub Action, CORRECT, that enables secure execution of tests on remote HPC resources. We evaluate CORRECT’s usability across three different types of HPC applications, demonstrating the effectiveness of using CORRECT for automating and documenting reproducibility evaluations.
Authors: Valerie Hayot-Sasson, Nathaniel Hudson, Andre Bauer, Ian Foster, Maxime Gonthier, Kyle Chard
Session: Containerization and Software Development
XaaS Containers: Performance-Portable Representation with Source and IR Containers
Paper
3:30 p.m. - 3:52 p.m., Room 275
HPC systems and cloud data centers are converging, and containers are becoming the default software deployment method. While containers simplify software management, they face significant performance challenges: they must sacrifice hardware-specific optimizations to achieve portability. Although HPC containers can use runtime hooks to access optimized libraries and devices, they are limited by ABI compatibility and cannot reverse the effects of early-stage compilation decisions. XaaS containers proposed a vision of performance-portable containers, and we present a practical realization with Source and Intermediate Representation (IR) containers. We delay performance-critical decisions until the target system specification is known. We analyze specialization mechanisms in HPC software and propose a new LLM-assisted method for their automatic discovery. By examining the compilation pipeline, we develop a methodology to build containers optimized for target architectures at deployment time. Our prototype demonstrates that new XaaS containers combine the convenience of containerization with the performance benefits of system-specialized builds.
Workflows Community: Bridging Intelligent Workflows with Quantum and HPC for Scientific Discovery
BoF
5:15 p.m. - 6:45 p.m., Room 274
This BoF will convene the workflows community to discuss emerging directions in scientific workflow execution, including agentic workflows, integration of high-performance and quantum computing workflows, and coordinated allocation and scheduling across experimental and computing facilities. A central focus will be on ensuring end-to-end resource availability when workflows depend on limited instrument time and distributed infrastructure. The session will also address the need for infrastructure and policy reforms to support intelligent, cross-facility execution. Through interactive discussions, participants will explore collaborative strategies to enable resilient, scalable, and adaptive workflows that meet the evolving demands of scientific discovery.
Session Leaders: Rafael Ferreira da Silva, Daniela Cassol, Frederic Suter, Kyle Chard, Ian Foster, Deborah Bard, Florina Ciorba, Shantenu Jha, Marco Verdicchio
Wednesday, November 19, 2025
Core Hours and Carbon Credits: Incentivizing Sustainability in HPC
10:52 a.m. - 11:14 a.m., Rooms 263-264
Efforts to reduce the environmental impact of HPC often focus on resource providers, but choices made by users (e.g., concerning where to run) can be equally consequential. Here we present evidence that new accounting methods that charge users for energy used can incentivize significantly more efficient behavior. We first survey 300 HPC users and find that fewer than 30% are aware of their energy consumption, and that energy efficiency is a low-priority concern. We then propose two new multi-resource accounting methods that charge for computations based on their energy consumption or carbon footprint, respectively. Finally, we conduct both simulation studies and a user study to evaluate the impact of these two methods on user behavior. We find that while only providing users feedback on their energy use had no impact on their behavior, associating energy with cost incentivized users to select more efficient resources and to use 40% less energy.
Authors: Alok Kamatar, Maxime Gonthier, Valerie Hayot-Sasson, Andre Bauer, Marcin Copik, Raul Castro Fernandez, Torsten Hoefler, Kyle Chard, Ian Foster
Thursday, November 20, 2025
CSx4HPC: Computational Storage for High-Performance Computing
BoF
12:15 p.m. - 1:15 p.m., Room 126
Exponentially growing data volumes present fundamental challenges to manage and access large quantities of data. With a new generation of more flexible hardware and software, computational storage re-emerges as a promising technology to reduce network contention and improve performance for key applications. As industry is converging on a first set of standards, it is up to the HPC community as well as developers and scientists from the different domains to find the use cases and tools necessary. This BoF strives to connect the stakeholders from application, to middleware and hardware developers to explore the potential for HPC and scientific computing.
Session Leaders: Jakob Luettgau, Michael Kuhn, Kira Duwe, Gary Grider, Garth Gibson, Jean-Thomas Aequaviva, Kyle Chard, Nick Brown
What To Support When You’re Compressing: The State of Practice, Gaps, and Opportunities for Scientific Data Compression
Paper
1:30 p.m. - 1:52 p.m., Rooms 261-262, 265-266
Over the last nearly 20 years, lossy compression has become an essential aspect of HPC applications’ data pipelines, allowing them to overcome limitations in storage capacity and bandwidth and, in some cases, increase computational throughput and capacity. However, with the adoption of lossy compression comes the requirement to assess and control the impact lossy compression has on scientific outcomes.
In this work, we take a major step forward in describing the state of practice and characterizing workloads. We examine applications’ needs and compressors’ capabilities across nine different supercomputing application domains. We present 25 takeaways that provide best practices for applications, operational impacts for facilities achieving compressed data, and gaps in application needs not addressed by production compressors that point towards opportunities for future compression research.
We hope to see you in St. Louis!
The Globus Team