
🚧 Frameworks Issue with numpy > 2

Documenting a breaking issue where upgrading numpy beyond version 2 breaks TensorFlow in the ALCF frameworks module.

Sam Foreman 2025-05-03

Something I just learned

The TensorFlow that is included in the new frameworks module (aurora_nre_models_frameworks-2025.0.0) was built with numpy==1.26.4 (< 2).

Unfortunately, if you then (for whatever reason) try to install or upgrade a package[1] that has numpy in its dependencies, e.g.:

python3 -m pip install --upgrade transformers

This will pull in numpy > 2, effectively breaking the frameworks module.

In particular, any application that uses intel/extension_for_pytorch:

import intel_extension_for_pytorch as ipex

Will crash with:

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.2.5 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
which is obviously frustrating for an application that is only using PyTorch.
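A quick way to tell whether an environment has already crossed into numpy 2.x is to look at the installed version's major number. The snippet below is a minimal sketch using only the standard library; the helper names (`major_version`, `numpy_is_2x`) are made up for illustration and are not part of numpy or the frameworks module.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.2.5'."""
    return int(version.split(".")[0])


def numpy_is_2x() -> Optional[bool]:
    """True if the installed numpy is >= 2, False if < 2, None if numpy is missing."""
    try:
        return major_version(metadata.version("numpy")) >= 2
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("numpy >= 2 installed:", numpy_is_2x())
```

Running this before and after a `pip install --upgrade ...` should show whether the upgrade pulled in a new major version of numpy.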

Following through the stack trace, I see that the error is actually coming from huggingface/transformers/image_transforms.py#L47.

Digging around a bit more, I found that transformers checks an environment variable that lets you bypass the entire import tensorflow as tf logic:

USE_TORCH=1

which not only prevents things from crashing with numpy > 2, but is also noticeably quicker.
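If exporting the variable in the shell is inconvenient, the same thing can be done from Python, as long as it happens before transformers (or anything that imports it) is first imported. This is a minimal sketch under that assumption; `force_torch_backend` is a hypothetical helper name, and as far as I can tell the variable is only read when transformers is imported, so setting it later has no effect.

```python
import os


def force_torch_backend() -> str:
    """Set USE_TORCH=1 for this process and return the value now in effect."""
    # Must run before the first `import transformers` (direct or indirect).
    os.environ["USE_TORCH"] = "1"
    return os.environ["USE_TORCH"]


if __name__ == "__main__":
    print("USE_TORCH =", force_torch_backend())
    # Import the application only after the variable is set, e.g.:
    # import ezpz
```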

Reinstall Modules built with numpy < 2

In addition to tensorflow, it seems that { jax, jaxlib, ml-dtypes, opt-einsum, scipy } were all built with numpy < 2, and so need to be reinstalled after upgrading numpy.

To do so:

python3 -m pip install --upgrade numpy jax jaxlib ml-dtypes opt-einsum scipy transformers

✅ Now, we’re able to successfully import ezpz with numpy 2.2.5:

#[🐍 aurora_nre_models_frameworks-2025.0.0](👻 aurora_nre_models_frameworks-2025.0.0)
#[05/03/25 @ 12:12:29][x4515c7s4b0n0][/f/d/f/p/s/ezpz][🌱 update-utils][📦🤷✓] [⏱️ 25s]
; USE_TORCH=1 python3 -c 'import numpy as np; print(np.__version__) ; import ezpz '
2.2.5
[W503 12:12:34.201673197 OperatorEntry.cpp:155] Warning: Warning only once for all operators,  other operators may also be overridden.
  Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: aten::_cummax_helper(Tensor self, Tensor(a!) values, Tensor(b!) indices, int dim) -> ()
    registered at /build/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
  dispatch key: XPU
  previous kernel: registered at /build/pytorch/build/aten/src/ATen/RegisterCPU.cpp:30476
       new kernel: registered at /build/intel-pytorch-extension/build/Release/csrc/gpu/csrc/aten/generated/ATen/RegisterXPU.cpp:2971 (function operator())
[2025-05-03 12:13:18][W][utils/_logger:68:ezpz] Unable to import deepspeed. Please install it to use DeepSpeed features.
took: 0h:00m:48s
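After reinstalling, it can be useful to confirm which versions actually ended up in the environment. The following is a rough sketch using `importlib.metadata` from the standard library; the package list mirrors the install command above, and `installed_versions` is a hypothetical helper. Note that it only reports installed versions; it cannot tell you which numpy a given wheel was compiled against.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Same packages as in the reinstall command above.
PACKAGES = ["numpy", "jax", "jaxlib", "ml-dtypes", "opt-einsum", "scipy", "transformers"]


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:>14}: {version or 'not installed'}")
```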

Profiles and Timing Comparisons

Below we present timing profiles obtained in an environment set up the usual way, i.e.:

module load frameworks
python3 -m venv --system-site-packages "venvs/$(basename "${CONDA_PREFIX}")"
source "venvs/$(basename "${CONDA_PREFIX}")/bin/activate"
python3 -m pip install -e "git+https://github.com/saforem2/ezpz"

DEFAULT BEHAVIOR

Bash Profile (hyperfine)
$ hyperfine --max-runs=10 --shell=zsh --show-output 'ezpz_init() { $(which python3) -c "import ezpz" }; ezpz_init'
# ...[clipped]...
Time (mean ± σ):     10.754 s ±  0.122 s    [User: 14.574 s, System: 12.795 s]
Range (min … max):   10.467 s … 10.910 s    10 runs
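If hyperfine is not available, a rough equivalent can be put together with the standard library by timing a fresh interpreter for each run. This is only a sketch, not what produced the numbers above: `time_import` is a hypothetical helper, it reports wall-clock time only, and it does no warmup or outlier handling.

```python
import os
import statistics
import subprocess
import sys
import time


def time_import(module: str, runs: int = 3, extra_env=None):
    """Time `python -c "import <module>"` in a fresh process; return (mean, min, max) seconds."""
    env = dict(os.environ, **(extra_env or {}))
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", f"import {module}"], env=env, check=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), min(samples), max(samples)


if __name__ == "__main__":
    # `json` is used here so the example runs anywhere; swap in "ezpz" on Aurora.
    mean, lo, hi = time_import("json")
    print(f"mean {mean:.3f} s (min {lo:.3f} s, max {hi:.3f} s)")
```

To compare the two configurations, one could call `time_import("ezpz")` and `time_import("ezpz", extra_env={"USE_TORCH": "1"})` in the same environment.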

Python Profile (pyinstrument)

$ pyinstrument -c 'import ezpz'
# ...[clipped]...
12.021 <module>  ezpz/__init__.py:1
├─ 8.624 <module>  ezpz/dist.py:1
│  └─ 8.542 <module>  intel_extension_for_pytorch/__init__.py:1
│        [206 frames hidden]  intel_extension_for_pytorch, transfor...
│           0.247 <module>  torch/utils/_sympy/functions.py:1
│           └─ 0.241 <module>  sympy/__init__.py:1
│              └─ 0.126 <module>  sympy/polys/__init__.py:1

WITH USE_TORCH=1 and numpy==2.2.5

Below we present the profiles and timing measurements obtained:

  1. After upgrading to numpy==2.2.5

  2. Skipping the import tensorflow as tf logic in transformers by specifying USE_TORCH=1, explicitly.

Bash Profile (hyperfine)

$ hyperfine --max-runs=10 --shell=zsh --show-output 'ezpz_init() { USE_TORCH=1 $(which python3) -c "import ezpz" }; ezpz_init'
# ...[clipped]...
Time (mean ± σ):      7.491 s ±  0.162 s    [User: 12.130 s, System: 11.940 s]
Range (min … max):    7.311 s …  7.883 s    10 runs

Python Profile (pyinstrument)

$ USE_TORCH=1 pyinstrument -c 'import ezpz'
# ...[clipped]...
8.478 <module>  ezpz/__init__.py:1
├─ 5.109 <module>  ezpz/dist.py:1
│  └─ 5.016 <module>  intel_extension_for_pytorch/__init__.py:1
│        [174 frames hidden]  intel_extension_for_pytorch, transfor...
│           0.249 <module>  torch/utils/_sympy/functions.py:1
│           └─ 0.241 <module>  sympy/__init__.py:1
│              └─ 0.124 <module>  sympy/polys/__init__.py:1
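To put the two sets of measurements side by side, the arithmetic can be spelled out in a few lines. The numbers below are copied from the hyperfine means and the pyinstrument totals reported in this post; `summarize` is just an illustrative helper. With these inputs, skipping the TensorFlow import saves roughly 3.3 s (about 30%) in the hyperfine runs.

```python
# (before, after) times in seconds, copied from the profiles in this post:
# default behavior vs. USE_TORCH=1 with numpy==2.2.5.
timings = {
    "hyperfine (mean)": (10.754, 7.491),
    "pyinstrument (total)": (12.021, 8.478),
}


def summarize(before: float, after: float):
    """Return (seconds saved, fraction of time saved, speedup factor)."""
    saved = before - after
    return saved, saved / before, before / after


if __name__ == "__main__":
    for tool, (before, after) in timings.items():
        saved, frac, speedup = summarize(before, after)
        print(f"{tool}: saved {saved:.3f} s ({frac:.1%}), speedup {speedup:.2f}x")
```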

Stack Trace from numpy > 2 issue

#[🐍 aurora_nre_models_frameworks-2025.0.0](👻 aurora_nre_models_frameworks-2025.0.0)
#[05/02/25 @ 15:46:43][x4005c2s6b0n0][/f/d/f/p/s/t/2/ezpz][🌱 update-utils][✓] [⏱️ 19s]
; ezpz-test --profile --tp 2 --pp 4
[W502 16:00:18.739960487 OperatorEntry.cpp:155] Warning: Warning only once for all operators,  other operators may also be overridden.
  Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: aten::_cummax_helper(Tensor self, Tensor(a!) values, Tensor(b!) indices, int dim) -> ()
    registered at /build/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
  dispatch key: XPU
  previous kernel: registered at /build/pytorch/build/aten/src/ATen/RegisterCPU.cpp:30476
       new kernel: registered at /build/intel-pytorch-extension/build/Release/csrc/gpu/csrc/aten/generated/ATen/RegisterXPU.cpp:2971 (function operator())

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.2.5 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/lus/flare/projects/datascience/foremans/projects/saforem2/tmp/2025-05-02-150121/ezpz/venvs/aurora_nre_models_frameworks-2025.0.0/bin/ezpz-test", line 6, in <module>
    from ezpz.test import main
  File "/lus/flare/projects/datascience/foremans/projects/saforem2/tmp/2025-05-02-150121/ezpz/src/ezpz/__init__.py", line 102, in <module>
    from ezpz.dist import (
  File "/lus/flare/projects/datascience/foremans/projects/saforem2/tmp/2025-05-02-150121/ezpz/src/ezpz/dist.py", line 42, in <module>
    import intel_extension_for_pytorch as ipex  # type:ignore[missingTypeStubs]
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/__init__.py", line 128, in <module>
    from . import xpu
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/xpu/__init__.py", line 20, in <module>
    from .utils import *
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/xpu/utils.py", line 7, in <module>
    from .. import frontend
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py", line 9, in <module>
    from .nn import utils
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/nn/__init__.py", line 6, in <module>
    from .modules import FrozenBatchNorm2d
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/nn/modules/__init__.py", line 11, in <module>
    from ...cpu.nn.linear_fuse_eltwise import IPEXLinearEltwise
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/cpu/nn/linear_fuse_eltwise.py", line 3, in <module>
    from intel_extension_for_pytorch.nn.utils._weight_prepack import (
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/nn/utils/__init__.py", line 1, in <module>
    from intel_extension_for_pytorch.nn.utils import _weight_prepack
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/nn/utils/_weight_prepack.py", line 8, in <module>
    from intel_extension_for_pytorch.cpu.tpp.utils.blocked_layout import (
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/cpu/tpp/__init__.py", line 2, in <module>
    from . import fused_bert
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/intel_extension_for_pytorch/cpu/tpp/fused_bert.py", line 16, in <module>
    from transformers.modeling_utils import apply_chunking_to_forward
  File "/lus/flare/projects/datascience/foremans/projects/saforem2/tmp/2025-05-02-150121/ezpz/venvs/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/transformers/modeling_utils.py", line 69, in <module>
    from .loss.loss_utils import LOSS_MAPPING
  File "/lus/flare/projects/datascience/foremans/projects/saforem2/tmp/2025-05-02-150121/ezpz/venvs/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/transformers/loss/loss_utils.py", line 21, in <module>
    from .loss_deformable_detr import DeformableDetrForObjectDetectionLoss, DeformableDetrForSegmentationLoss
  File "/lus/flare/projects/datascience/foremans/projects/saforem2/tmp/2025-05-02-150121/ezpz/venvs/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/transformers/loss/loss_deformable_detr.py", line 4, in <module>
    from ..image_transforms import center_to_corners_format
  File "/lus/flare/projects/datascience/foremans/projects/saforem2/tmp/2025-05-02-150121/ezpz/venvs/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/transformers/image_transforms.py", line 47, in <module>
    import tensorflow as tf
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/__init__.py", line 48, in <module>
    from tensorflow._api.v2 import __internal__
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/_api/v2/__internal__/__init__.py", line 8, in <module>
    from tensorflow._api.v2.__internal__ import autograph
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/_api/v2/__internal__/autograph/__init__.py", line 8, in <module>
    from tensorflow.python.autograph.core.ag_ctx import control_status_ctx # line: 34
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/python/autograph/core/ag_ctx.py", line 21, in <module>
    from tensorflow.python.autograph.utils import ag_logging
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/python/autograph/utils/__init__.py", line 17, in <module>
    from tensorflow.python.autograph.utils.context_managers import control_dependency_on_returns
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/python/autograph/utils/context_managers.py", line 19, in <module>
    from tensorflow.python.framework import ops
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 41, in <module>
    from tensorflow.python import pywrap_tfe
  File "/opt/aurora/24.347.0/frameworks/aurora_nre_models_frameworks-2025.0.0/lib/python3.10/site-packages/tensorflow/python/pywrap_tfe.py", line 25, in <module>
    from tensorflow.python._pywrap_tfe import *
AttributeError: _ARRAY_API not found
AttributeError: 'MessageFactory' object has no attribute 'GetPrototype'

# ...[clipped]...

Footnotes

  1. This is a lot of packages, including: { torch, jax, tensorflow, scipy, jaxlib, numpy, ml-dtypes, opt-einsum, … }.
