The OpenAI API spec supports `logprobs: true` in chat completion requests, returning token log-probabilities in the response.
Currently Rapid-MLX accepts the parameter silently but doesn't return logprobs data: the response `choices[].logprobs` field is absent.
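For reference, a minimal repro against a locally running server; the base URL, API key, and model id are placeholders:

```python
# Repro sketch: the parameter is accepted, but the field comes back empty.
# base_url, api_key, and model below are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

resp = client.chat.completions.create(
    model="local-model",  # placeholder model id
    messages=[{"role": "user", "content": "Say hello"}],
    logprobs=True,
)

# Per the OpenAI spec this should hold per-token entries;
# with Rapid-MLX today it is None.
print(resp.choices[0].logprobs)
```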
What needs to happen:
- When `logprobs: true` is set, capture per-token log-probabilities from mlx-lm's `generate_step()` (see the sketch after this list)
- Return them in the `choices[].logprobs` field per the OpenAI spec (shaping sketch below)
- Add a test case asserting the field is populated when `logprobs: true` is set
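A minimal sketch of the capture step, assuming `generate_step()` yields `(token, logprobs)` pairs with `logprobs` being the log-softmax over the vocabulary (true of recent mlx-lm releases; the import path and yield shape vary by version, so verify against whatever Rapid-MLX pins):

```python
# Sketch only: collect sampled token ids and their log-probabilities.
# Assumes generate_step yields (token, logprobs) pairs, where `logprobs`
# is the log-softmax over the full vocabulary.
import mlx.core as mx
from mlx_lm.utils import generate_step

def generate_with_logprobs(model, tokenizer, prompt: str, max_tokens: int = 256):
    prompt_ids = mx.array(tokenizer.encode(prompt))
    token_ids, token_logprobs = [], []
    for (token, logprobs), _ in zip(generate_step(prompt_ids, model), range(max_tokens)):
        # token may be a Python int or an mx scalar depending on version
        token = token.item() if hasattr(token, "item") else int(token)
        if token == tokenizer.eos_token_id:
            break
        token_ids.append(token)
        # logprobs[token] is the log-probability of the sampled token
        token_logprobs.append(logprobs[token].item())
    return token_ids, token_logprobs
```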
Relevant files: `vllm_mlx/server.py` (chat completion endpoint)
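For the second item, a sketch of mapping the captured values onto the spec's structure; the helper name and tokenizer interface are hypothetical, and `top_logprobs` is left empty because the loop above only captures the sampled token:

```python
# Hypothetical helper: map captured (token id, logprob) pairs onto the
# OpenAI chat-completions shape, where choices[].logprobs.content is a
# list of {token, logprob, bytes, top_logprobs} entries.
def build_logprobs_field(tokenizer, token_ids, token_logprobs):
    content = []
    for tok, lp in zip(token_ids, token_logprobs):
        text = tokenizer.decode([tok])
        content.append({
            "token": text,
            "logprob": lp,
            "bytes": list(text.encode("utf-8")),
            "top_logprobs": [],  # would need top-k capture in the generate loop
        })
    return {"content": content}
```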