MILAB-6382: expose host NVIDIA driver via runtime envs #91
Code Review
This pull request adds NVIDIA-related environment variables to the configuration of the python-3.12.10-rapids package. A critical issue was raised regarding the use of ${PATH} in the static JSON configuration, as it may not be expanded by the runtime executor and could break executable lookups.
| "NVIDIA_VISIBLE_DEVICES=all", | ||
| "NVIDIA_DRIVER_CAPABILITIES=compute,utility", | ||
| "LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/nvidia/lib", | ||
| "PATH=/usr/local/nvidia/bin:${PATH}" |
A ${PATH} reference in a static JSON configuration file like package.json will likely not be expanded by the platform's runtime executor, leaving a literal ${PATH} string in the environment variable. This would break standard executable lookups because the original PATH directories would be lost. If the platform runner does not support shell-style variable expansion for envVars, consider whether this prepending is necessary, or whether the platform provides an alternative mechanism to append or prepend to the PATH variable.
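To make the concern concrete, here is a minimal Python sketch of the two behaviours a runtime could have. It is not the Platforma executor: `apply_literal`, `apply_expanded`, and `BASE_ENV` are hypothetical names used only for illustration, and the base PATH value is an assumed example.

```python
from string import Template

# The entry under review, exactly as it appears in package.json.
ENTRY = "PATH=/usr/local/nvidia/bin:${PATH}"

# Assumed example of the environment the process would otherwise inherit.
BASE_ENV = {"PATH": "/usr/local/bin:/usr/bin:/bin"}


def apply_literal(entry, base_env):
    """Hypothetical runtime that stores the value verbatim (setenv-style)."""
    env = dict(base_env)
    key, _, value = entry.partition("=")
    env[key] = value
    return env


def apply_expanded(entry, base_env):
    """Hypothetical runtime that expands ${VAR} against the inherited env."""
    env = dict(base_env)
    key, _, value = entry.partition("=")
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    env[key] = Template(value).safe_substitute(base_env)
    return env


literal = apply_literal(ENTRY, BASE_ENV)["PATH"]
expanded = apply_expanded(ENTRY, BASE_ENV)["PATH"]

print(literal.split(":"))   # second entry is the text '${PATH}', not a directory
print(expanded.split(":"))  # nvidia/bin followed by the inherited directories
```

In the literal case the search path contains only `/usr/local/nvidia/bin` plus a non-existent directory named `${PATH}`, which is the failure mode described above.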
| "NVIDIA_VISIBLE_DEVICES=all", | ||
| "NVIDIA_DRIVER_CAPABILITIES=compute,utility", | ||
| "LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/nvidia/lib", | ||
| "PATH=/usr/local/nvidia/bin:${PATH}" |
${PATH} expansion not guaranteed in JSON envVars
Every other envVars entry in this repo is a plain KEY=VALUE literal with no variable references. If the Platforma runtime sets these strings directly (e.g. via setenv) rather than passing them through a shell, PATH will be set to the literal string /usr/local/nvidia/bin:${PATH} — with the unexpanded text ${PATH} as the suffix. That would break resolution of every binary (including python itself) that lives outside /usr/local/nvidia/bin, making the entire run environment non-functional. Please confirm that the runtime performs ${...} interpolation before applying envVars, or replace this entry with a fully resolved static path.
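One way to get the requested confirmation is to run a small check inside the activated run environment and look for placeholders that survived unexpanded. The sketch below is an illustration only: `unexpanded_vars` is a hypothetical helper, and it assumes the check can be executed as a Python script in the environment under test.

```python
import os
import re

# Matches shell-style ${NAME} placeholders left in a value.
PLACEHOLDER = re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}")


def unexpanded_vars(env):
    """Return {name: [placeholders]} for values that still contain ${...}."""
    found = {}
    for name, value in env.items():
        hits = PLACEHOLDER.findall(value)
        if hits:
            found[name] = hits
    return found


if __name__ == "__main__":
    problems = unexpanded_vars(os.environ)
    if problems:
        for name, hits in sorted(problems.items()):
            print(f"{name} contains unexpanded {', '.join(hits)}")
    else:
        print("no unexpanded ${...} placeholders found")
```

If `PATH` shows up in the report, the runtime is applying the entry as a literal string and the entry needs a fully resolved value instead.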
| "LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/nvidia/lib", | ||
| "PATH=/usr/local/nvidia/bin:${PATH}" |
NVIDIA runtime vars applied to all platforms including macOS and Windows
NVIDIA_VISIBLE_DEVICES, NVIDIA_DRIVER_CAPABILITIES, and the /usr/local/nvidia/... paths are NVIDIA Container Runtime constructs that are only meaningful in Linux containers with GPU passthrough. The roots in this package include macosx-x64, macosx-aarch64, and windows-x64. On those platforms the paths won't exist and LD_LIBRARY_PATH will point to directories that are absent, potentially interfering with any loader that respects LD_LIBRARY_PATH on macOS (DYLD_LIBRARY_PATH is the macOS equivalent). If RAPIDS GPU features are explicitly unsupported on non-Linux platforms, consider conditionally applying these vars only for Linux targets (if the runtime supports platform-specific env vars), or document that the NVIDIA vars are no-ops on non-Linux platforms.
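The conditional approach could look roughly like the following Python sketch. It assumes the environment is assembled per root name; `env_vars_for` is a hypothetical helper, and the `linux-x64` root name is an assumption modelled on the `macosx-x64`, `macosx-aarch64`, and `windows-x64` names mentioned above. Whether the Platforma runtime supports per-platform `envVars` is not established in this thread.

```python
# Entry that already existed in the package and applies everywhere.
COMMON_ENV = ["RPY2_CFFI_MODE=ABI"]

# Entries added by this PR; only meaningful in Linux GPU containers.
LINUX_GPU_ENV = [
    "NVIDIA_VISIBLE_DEVICES=all",
    "NVIDIA_DRIVER_CAPABILITIES=compute,utility",
    "LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/nvidia/lib",
    "PATH=/usr/local/nvidia/bin:${PATH}",
]


def env_vars_for(root):
    """Return the envVars list for a root name such as 'linux-x64'."""
    entries = list(COMMON_ENV)
    if root.startswith("linux"):
        entries += LINUX_GPU_ENV
    return entries


for root in ("linux-x64", "macosx-aarch64", "windows-x64"):
    print(root, env_vars_for(root))
```

With this split, the macOS and Windows roots keep only the entry they already had, so no NVIDIA-specific paths leak into environments where they cannot exist.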
| "RPY2_CFFI_MODE=ABI", | ||
| "NVIDIA_VISIBLE_DEVICES=all", | ||
| "NVIDIA_DRIVER_CAPABILITIES=compute,utility", | ||
| "LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/nvidia/lib", |
LD_LIBRARY_PATH silently overwrites any pre-existing value
Unlike PATH, which at least attempts to incorporate the existing value via ${PATH}, LD_LIBRARY_PATH is set to a fixed string. If the host container (or any previously applied runenv layer) already exports LD_LIBRARY_PATH entries (e.g. CUDA toolkit, cuDNN), those entries will be silently dropped, potentially causing runtime linker failures for libraries that depend on them. Using the same ${LD_LIBRARY_PATH} pattern keeps existing paths intact — assuming the runtime supports interpolation.
Suggested change:
- "LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/nvidia/lib",
+ "LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/nvidia/lib:${LD_LIBRARY_PATH}",
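One caveat with the plain `${LD_LIBRARY_PATH}` suffix, assuming shell-like expansion: if the variable is unset or empty, the result ends in a trailing `:`, and the Linux dynamic linker treats a zero-length entry as the current working directory. The Python sketch below shows a prepend that avoids this; `prepend_paths` is a hypothetical helper, not something the runtime is known to provide.

```python
def prepend_paths(new_dirs, existing, sep=":"):
    """Put new_dirs in front of an existing path list.

    Empty entries are dropped so the result never gains a stray separator,
    and directories already present in new_dirs are not repeated.
    """
    parts = list(new_dirs)
    if existing:
        parts += [p for p in existing.split(sep) if p and p not in parts]
    return sep.join(parts)


NVIDIA_LIB_DIRS = ["/usr/local/nvidia/lib64", "/usr/local/nvidia/lib"]

# Variable unset: no trailing separator is produced.
print(prepend_paths(NVIDIA_LIB_DIRS, None))

# Variable already set by an earlier layer: its entries are preserved.
print(prepend_paths(NVIDIA_LIB_DIRS, "/opt/cuda/lib64"))
```

A runtime that computes the value this way keeps earlier CUDA or cuDNN entries intact without depending on string interpolation inside the JSON.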
Greptile Summary
This PR adds four NVIDIA GPU environment variables to the python-3.12.10-rapids run environment so that Python processes can discover and use the host NVIDIA driver when running inside a GPU-enabled container.
- NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=compute,utility: NVIDIA Container Runtime control variables that expose all GPU devices and mount CUDA compute + utility capabilities into the container.
- LD_LIBRARY_PATH and PATH: Prepend /usr/local/nvidia/lib64, /usr/local/nvidia/lib, and /usr/local/nvidia/bin so the linker and shell can find NVIDIA shared libraries and utilities. The PATH entry uses ${PATH} variable interpolation, a pattern not previously used in any envVars field in this codebase, whose runtime support is unconfirmed. LD_LIBRARY_PATH does not reference its existing value and will overwrite any previously set paths.

Key terms touched by this PR:
- envVars: KEY=VALUE strings in the block-software artifact spec; applied as environment variables when the run environment is activated by the Platforma runtime. Extended from 1 entry (RPY2_CFFI_MODE) to 5 entries; introduces variable interpolation (${PATH}) for the first time in this codebase.
- NVIDIA_VISIBLE_DEVICES: set to all; exposes every available GPU to the Python process.
- NVIDIA_DRIVER_CAPABILITIES: set to compute,utility; enables CUDA compute and nvidia-smi-style utility access.
- LD_LIBRARY_PATH: set to /usr/local/nvidia/lib64:/usr/local/nvidia/lib; overwrites any pre-existing value.
- PATH: prepends /usr/local/nvidia/bin using ${PATH} interpolation, a pattern unverified for this JSON-based config format.

Confidence Score: 3/5
The change is small but introduces a ${PATH} interpolation pattern that has never appeared in any other envVars entry in this codebase; if the Platforma runtime sets environment variables as literal strings rather than through a shell, the Python run environment's PATH will be broken on every execution.
The ${PATH} interpolation in PATH=/usr/local/nvidia/bin:${PATH} is the first of its kind across all envVars definitions in the repository. If the runtime simply calls setenv/putenv with the raw string, every binary lookup outside /usr/local/nvidia/bin will fail. This needs explicit confirmation before the package ships. Secondary concerns are the unconditional overwrite of LD_LIBRARY_PATH and the NVIDIA-only paths being applied to macOS and Windows roots.
python-3.12.10-rapids/package.json: the PATH env var entry with ${PATH} interpolation and the LD_LIBRARY_PATH overwrite need verification before merging.

Important Files Changed
${PATH} interpolation syntax is untested in this JSON context and could break the run environment if the runtime doesn't expand it.

Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[package.json block-software envVars] --> B{Runtime applies envVars}
B -->|Literal string copy| C["PATH = /usr/local/nvidia/bin:\${PATH}\n⚠️ \${PATH} not expanded"]
B -->|Literal string copy| D["LD_LIBRARY_PATH = /usr/local/nvidia/lib64:/usr/local/nvidia/lib\n⚠️ Pre-existing value overwritten"]
B -->|Simple assignment| E["NVIDIA_VISIBLE_DEVICES = all"]
B -->|Simple assignment| F["NVIDIA_DRIVER_CAPABILITIES = compute,utility"]
C -->|If interpolation unsupported| G["❌ PATH broken: only nvidia/bin visible"]
C -->|If interpolation supported| H["✅ nvidia/bin prepended to original PATH"]
D --> I["NVIDIA libs findable by linker\n⚠️ Other LD paths dropped"]
E --> J["Container Runtime exposes all GPUs"]
F --> K["CUDA compute + utility APIs mounted"]
Reviews (1): Last reviewed commit: "MILAB-6382: expose host NVIDIA driver vi..."