[BUG]: cublasmp dependencies are not reflected in supported_nvidia_libs.py #1116

@rwgk

Discovered by chance while testing on a workstation that did not have the CUDA driver installed:

  • libcublasmp.so.0 is the only supported lib that requires libcuda.so.1, which led to a test_load_nvidia_dynamic_lib.py failure when the driver was not installed.
  • To double-check, I ran ldd on it, which made it obvious that we are missing all of its NVIDIA library dependencies: cublas, cublasLt, nvshmem_host, nccl.

The dependency on libcuda.so.1 is unusual; I'm not sure whether we want to do anything about it. The other dependencies, however, should be added to supported_nvidia_libs.py.
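One possible way to handle the libcuda.so.1 case in tests is to probe whether the driver library can be loaded at all before exercising libs that need it. The following is only a sketch, not what the test suite currently does: the helper name `can_load` is hypothetical, and it relies solely on the standard `ctypes.CDLL` behavior of raising `OSError` when a shared library cannot be found or loaded.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can load `soname`, else False.

    Hypothetical helper for illustration; not part of cuda.pathfinder.
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library (or one of its own dependencies) is missing.
        return False
    return True


if __name__ == "__main__":
    # On a machine without the CUDA driver this is expected to print False.
    print("libcuda.so.1 loadable:", can_load("libcuda.so.1"))
```

A test could use such a check to skip (rather than fail) the libcublasmp case when the driver is absent, if that turns out to be the desired behavior.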

mgx-c2g2-pvt-66.cl1u1.colossus.nvidia.com:/wrk/forked/cuda-python/cuda_pathfinder $ ldd /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/libcublasmp.so.0
        linux-vdso.so.1 (0x0000fb3f18b6a000)
        libcuda.so.1 => /lib/aarch64-linux-gnu/libcuda.so.1 (0x0000fb3f11a00000)
        libcublas.so.12 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../cublas/lib/libcublas.so.12 (0x0000fb3f0b600000)
        libcublasLt.so.12 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../cublas/lib/libcublasLt.so.12 (0x0000fb3edb800000)
        libnvshmem_host.so.3 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../nvshmem/lib/libnvshmem_host.so.3 (0x0000fb3ed2000000)
        libnccl.so.2 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../nccl/lib/libnccl.so.2 (0x0000fb3ebb800000)
        librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000fb3f18af0000)
        libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000fb3f18ac0000)
        libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000fb3f18a90000)
        libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000fb3ebb400000)
        libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000fb3f17750000)
        libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000fb3f18a50000)
        libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000fb3ebb240000)
        /lib/ld-linux-aarch64.so.1 (0x0000fb3f18b2d000)
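
For reference, the non-system dependencies can be pulled out of ldd output like the one above with a few lines of Python. This is a minimal sketch: the function name `nvidia_deps_from_ldd` and the `_SYSTEM_LIBS` filter are assumptions made for illustration (the filter only covers the system libraries that appear in this output), and nothing here reflects how supported_nvidia_libs.py is actually structured.

```python
import re

# Matches lines such as "libcublas.so.12 => /path/to/libcublas.so.12 (0x...)"
# and captures the library name without the ".so[.N...]" suffix.
_LDD_LINE = re.compile(r"^\s*(lib\S+?)\.so(?:\.\d+)*\s+=>\s+\S+")

# System libraries to ignore (assumed list, limited to those seen in this issue).
_SYSTEM_LIBS = {"librt", "libpthread", "libdl", "libstdc++", "libm", "libgcc_s", "libc"}


def nvidia_deps_from_ldd(ldd_output: str) -> list[str]:
    """Return non-system library names (without the "lib" prefix), in ldd order."""
    deps = []
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match is None:
            continue  # e.g. linux-vdso or the dynamic loader line
        name = match.group(1)
        if name in _SYSTEM_LIBS:
            continue
        deps.append(name.removeprefix("lib"))
    return deps
```

Applied to the output above, this yields cuda, cublas, cublasLt, nvshmem_host, and nccl; everything except cuda corresponds to the libs listed in the description as missing.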

Labels

bug (Something isn't working), cuda.pathfinder (Everything related to the cuda.pathfinder module)