📝 2025 Annual Report

Sam Foreman · Sep 22, 2025 · 3 min read · drafts


Draft annual report covering scientific accomplishments, publications, presentations, and goals at ALCF.


Goals for Next Year (2026)
Goals from Last Year (2024)
Contributions to ALCF
Publications
Presentations
Posts
Organizational Efforts
Mentoring
Scientific / Technical Accomplishments
References

Goals for Next Year (2026)

Build out generic training services for science teams
Continue to push on resilient / fault-tolerant training techniques

Goals from Last Year (2024)

Continue to contribute to division(/lab)-wide efforts
Continue to work with application teams to efficiently scale on ALCF systems
[WIP] Publish retrospective on initial pre-training of AuroraGPT

Contributions to ALCF

AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions
- ACM Gordon Bell Prize Finalist (co-author)
- Contributed to model development, performance analysis, and scaling studies
MProt-DPO: Breaking the ExaFLOPS Barrier for Multimodal Protein Design Workflows with Direct Preference Optimization
- Finalist for the 2024 ACM Gordon Bell Prize (first-author)
AuroraGPT
- Co-lead of the Models and Training team with Venkat Vishwanath
- Ongoing writeup of pre-training efforts
- Successfully pre-trained:
  - AuroraGPT-7B on 2T tokens
  - AuroraGPT-2B on 4T tokens (ongoing)
Catalyst for:
- Arvind Ramanathan’s INCITE Project (FoundEpidem)
- Zheng Zhang’s ALCC Project
- Rao Kotamarthi’s ALCC Project
Member of Software Committee
Intro to HPC Undergraduate Bootcamp:
- Project lead for Intro to {AI, HPC} for Science

Publications

AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions (Hatanpää et al. 2025)
Aurora: Architecting Argonne’s First Exascale Supercomputer for Accelerated Scientific Discovery (Allen et al. 2025)
HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights (Gokdemir et al. 2025)
Automated Tuning for HMC Mass Ratios (Torsiello et al. 2025)
MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow (Yan et al. 2025)
MProt-DPO: Breaking the ExaFLOPS Barrier for Multimodal Protein Design Workflows with Direct Preference Optimization (Dharuman et al. 2024)

Presentations

Posts

Organizational Efforts

Organizer for:
Served as reviewer for:
- HiPC 2025
- SPIGM @ NeurIPS
- ML4PS Workshop @ NeurIPS’24
- AI4Science Workshop @ NeurIPS’24
- GenBio Workshop @ NeurIPS’24
- AI4Science Workshop @ ICML’24

Mentoring

Khalid Hossain: Supported Khalid’s successful transition from postdoc to staff
Joseph Frimpong: Postdoc in Center for Nanoscale Materials
Hung Nguyen: Graduate student @ UIUC

Scientific / Technical Accomplishments

References

Allen, Benjamin S., James Anchell, Victor Anisimov, et al. 2025. Aurora: Architecting Argonne’s First Exascale Supercomputer for Accelerated Scientific Discovery. https://arxiv.org/abs/2509.08207.

Dharuman, Gautham, Kyle Hippe, Alexander Brace, et al. 2024. “MProt-DPO: Breaking the ExaFLOPS Barrier for Multimodal Protein Design Workflows with Direct Preference Optimization.” Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (Atlanta, GA, USA), SC ’24. https://doi.org/10.1109/SC41406.2024.00013.

Gokdemir, Ozan, Carlo Siebenschuh, Alexander Brace, et al. 2025. HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights. https://arxiv.org/abs/2505.04846.

Hatanpää, Väinö, Eugene Ku, Jason Stock, et al. 2025. AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions. https://arxiv.org/abs/2509.13523.

Torsiello, J., G. T. Fleming, S. Foreman, X.-Y. Jin, and J. C. Osborn. 2025. “Automated Tuning for HMC Mass Ratios.” In PoS. Argonne, ALCF; Argonne National Laboratory (ANL), Argonne, IL (United States); Temple U.; Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States). https://doi.org/10.22323/1.466.0052.

Yan, Xiaoli, Nathaniel Hudson, Hyun Park, et al. 2025. MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow. https://arxiv.org/abs/2501.10651.
