Rapid-MLX version
rapid-mlx 0.6.1
Hardware
MacBook Pro M5 Max, 48 GB
macOS version
26.4.1
Python version
3.13.13
Model
qwen3.6-35b
Full serve command
rapid-mlx serve qwen3.6-35b
What happened?
The latest version doesn't run on my M5 Max. `rapid-mlx serve` fails with the RuntimeError shown in the logs below.
Steps to reproduce
- git clone ...
- uv sync
- rapid-mlx serve qwen3.6-35b
Error logs / output
Traceback (most recent call last):
  File "/Users/mlaprise/repos/Rapid-MLX/vllm_mlx/scheduler.py", line 2343, in step
    raw_next = self.batch_generator.next()
  File "/Users/mlaprise/repos/Rapid-MLX/.venv/lib/python3.13/site-packages/mlx_lm/generate.py", line 1329, in next
    return self._next()
           ~~~~~~~~~~^^
  File "/Users/mlaprise/repos/Rapid-MLX/vllm_mlx/scheduler.py", line 564, in _chunked_next
    new_batch = self._process_prompts(batch_prompts)
  File "/Users/mlaprise/repos/Rapid-MLX/vllm_mlx/scheduler.py", line 236, in _patched_process_prompts
    batch = _orig_process_prompts(prompts)
  File "/Users/mlaprise/repos/Rapid-MLX/.venv/lib/python3.13/site-packages/mlx_lm/generate.py", line 1107, in _process_prompts
    mx.eval([c.state for c in prompt_cache])
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: There is no Stream(gpu, 0) in current thread.
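Possible reading of the error (unverified against the Rapid-MLX or mlx-lm source): the message suggests that the GPU stream is tied to the thread that created it, and that `mx.eval` was then called from a different thread. Assuming that is the cause, the general pattern for avoiding it is to confine all of the work to one dedicated thread. The sketch below illustrates that pattern using only the Python standard library. `SingleThreadRunner`, `init_state`, and `use_state` are hypothetical names made up for illustration; none of this is MLX or Rapid-MLX API, and `threading.local()` only stands in for whatever per-thread state the real stream uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SingleThreadRunner:
    """Hypothetical helper: runs every submitted callable on one dedicated thread."""

    def __init__(self):
        # max_workers=1 means the pool never creates a second worker, so
        # thread-local state set up by one call is visible to later calls.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def run(self, fn, *args, **kwargs):
        # Block until the worker thread has finished and return its result.
        return self._executor.submit(fn, *args, **kwargs).result()

    def close(self):
        self._executor.shutdown(wait=True)


# Stand-in for per-thread state such as a default device stream (assumption).
_local = threading.local()


def init_state():
    _local.stream = "stream-0"
    return threading.get_ident()


def use_state():
    # Raises AttributeError on any thread that never ran init_state().
    return _local.stream, threading.get_ident()


runner = SingleThreadRunner()
init_thread = runner.run(init_state)
stream, use_thread = runner.run(use_state)
runner.close()
```

If the scheduler's `step` loop and the model/cache setup run on different threads, routing both through a runner like this would be one way to test the hypothesis.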