Running Unsloth with all perks on DGX Spark (bartusiak.ai)

🤖 AI Summary
NVIDIA DGX Spark, a compact devbox built around the GB10 (Grace Blackwell) superchip with 128 GB of unified memory, can be a strong local platform for fine‑tuning large models with Unsloth, but only if CUDA, PyTorch, and the low‑level libraries are aligned carefully. The recommended approach is to base your work on NVIDIA's nvidia/pytorch Docker image, which ships a custom, optimized PyTorch plus Triton and torchvision, and then create a uv virtual environment that uses system site‑packages so the image's optimized PyTorch is preserved. This avoids the performance regressions that come from installing an incompatible PyTorch build.

Key technical steps:
- Create the environment with uv venv --system-site-packages, and add override-dependencies entries in pyproject.toml to block torch, triton, and torchvision (e.g. "torch; python_version < '0'") so uv won't reinstall them.
- Set or unset TORCH_CUDA_ARCH_LIST, since the image's broad default list can cause parse and compile issues; export TORCH_CUDA_ARCH_LIST=12.0 is recommended for DGX Spark even though the chip reports compute capability 12.1 under CUDA 13.0.
- Build xformers from a compatible source branch (via no-build-isolation-package for xformers) until upstream support for compute capability 12.1 lands.
- Recreate the venv after upgrading the base image to avoid breaking native packages.

With these fixes and an ipykernel pointing at the uv venv, you can run Jupyter Lab and fine‑tune LLMs locally while taking advantage of DGX Spark's large unified memory.
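The pyproject.toml side of the setup might look like the fragment below: override-dependencies uses an impossible environment marker ("python_version < '0'") so uv treats those requirements as never applicable and leaves the container's builds alone, and no-build-isolation-package lets xformers compile against the system PyTorch. This is a sketch of the configuration the summary describes, not a complete project file.

```toml
[tool.uv]
# Never (re)install these -- keep the container's optimized builds.
# The impossible marker makes each requirement never apply:
override-dependencies = [
    "torch; python_version < '0'",
    "triton; python_version < '0'",
    "torchvision; python_version < '0'",
]
# Build xformers without build isolation so it compiles against
# the system-site-packages PyTorch rather than an isolated copy:
no-build-isolation-package = ["xformers"]
```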
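The environment-setup commands described above can be sketched as follows; this assumes you are already inside NVIDIA's PyTorch container, and the kernel name is an arbitrary placeholder.

```shell
# Create a uv venv that reuses the container's optimized
# torch / triton / torchvision instead of reinstalling them:
uv venv --system-site-packages
source .venv/bin/activate

# Replace the image's broad arch list, which can cause
# parse/compile issues; 12.0 is the value recommended for DGX Spark:
export TORCH_CUDA_ARCH_LIST=12.0

# Register the venv as a Jupyter kernel for Jupyter Lab
# (the kernel name "unsloth-spark" is just an example):
python -m ipykernel install --user --name unsloth-spark
```

Note that the exported TORCH_CUDA_ARCH_LIST only affects the current shell; persist it (e.g. in the container's entrypoint) if builds happen elsewhere.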