RuntimeError: There is no Stream(gpu, 0) in current thread.

### Rapid-MLX version

rapid-mlx 0.6.1

### Hardware

MacBook Pro M5 Max, 48 GB

### macOS version

26.4.1

### Python version

3.13.13

### Model

qwen3.6-35b

### Full serve command

rapid-mlx serve qwen3.6-35b

### What happened?

The latest version doesn't run on my M5 Max: `rapid-mlx serve` fails with the RuntimeError below while processing prompts.

### Steps to reproduce

1. git clone ...
2. uv sync
3. rapid-mlx serve qwen3.6-35b

### Error logs / output

```shell
Traceback (most recent call last):
  File "/Users/mlaprise/repos/Rapid-MLX/vllm_mlx/scheduler.py", line 2343, in step
    raw_next = self.batch_generator.next()
  File "/Users/mlaprise/repos/Rapid-MLX/.venv/lib/python3.13/site-packages/mlx_lm/generate.py", line 1329, in next
    return self._next()
           ~~~~~~~~~~^^
  File "/Users/mlaprise/repos/Rapid-MLX/vllm_mlx/scheduler.py", line 564, in _chunked_next
    new_batch = self._process_prompts(batch_prompts)
  File "/Users/mlaprise/repos/Rapid-MLX/vllm_mlx/scheduler.py", line 236, in _patched_process_prompts
    batch = _orig_process_prompts(prompts)
  File "/Users/mlaprise/repos/Rapid-MLX/.venv/lib/python3.13/site-packages/mlx_lm/generate.py", line 1107, in _process_prompts
    mx.eval([c.state for c in prompt_cache])
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: There is no Stream(gpu, 0) in current thread.
```
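One possible reading of the error message (an assumption, not confirmed against the MLX or Rapid-MLX source) is that the GPU stream is registered per thread, and `mx.eval` is being called from a thread other than the one where the stream was set up. The sketch below reproduces that class of failure with plain `threading.local`. The names `create_stream`, `evaluate`, and `run_in_worker` are hypothetical stand-ins for illustration and are not MLX APIs.

```python
import threading

# Hypothetical per-thread stream registry; NOT MLX's actual implementation.
_local = threading.local()


def create_stream(name):
    """Register a stream for the calling thread only."""
    if not hasattr(_local, "streams"):
        _local.streams = set()
    _local.streams.add(name)


def evaluate(name):
    """Raise the same kind of error if the calling thread has no such stream."""
    if name not in getattr(_local, "streams", set()):
        raise RuntimeError(f"There is no {name} in current thread.")
    return "ok"


def run_in_worker(fn, *args):
    """Run fn in a new thread; return ('ok', result) or ('error', message)."""
    outcome = {}

    def target():
        try:
            outcome["value"] = ("ok", fn(*args))
        except RuntimeError as exc:
            outcome["value"] = ("error", str(exc))

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return outcome["value"]


create_stream("Stream(gpu, 0)")            # registered on the main thread
main_result = evaluate("Stream(gpu, 0)")   # same thread: succeeds
worker_result = run_in_worker(evaluate, "Stream(gpu, 0)")  # other thread: fails

print(main_result)
print(worker_result)
```

If that reading is right, the stream would need to be created, or the evaluation performed, on the thread that runs the scheduler's `step` loop. I have not verified where that should happen.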

RuntimeError: There is no Stream(gpu, 0) in current thread. #160
