feat(observability): add configurable grafana.prometheusDatasourceName to agent chart #1982
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a configurable Prometheus server URL to the observability agent Helm chart and injects it into the agent’s system message so the agent can reliably query the correct Prometheus endpoint.
Changes:
- Introduces prometheus.url in chart values as an optional configuration.
- Conditionally renders a “Prometheus Configuration” section in the agent system message when the URL is set.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| helm/agents/observability/values.yaml | Adds a new prometheus.url value (default empty) with inline documentation. |
| helm/agents/observability/templates/agent.yaml | Conditionally injects the configured Prometheus URL into the agent’s system prompt. |
f8d444d to
17a001d
Compare
I'm not sure that the URL will do much in practice here. The observability agent reaches Prometheus exclusively through the Grafana tool server. I think the underlying issue is the agent not knowing which datasource to query, which would mean that the fix would be to inject the Grafana datasource name or UID rather than a raw URL. Can you clarify a bit more about how the URL is used? Maybe I'm missing something here.
thx for catching that. yeah, I've updated the code to use grafana.prometheusDatasourceName instead. when set, the agent is told which grafana datasource to use for query_prometheus and related tools. this keeps the original goal (persistence across restarts) while giving the agent something it can actually act on.
I have updated the PR title and the PR description.
dug into the grafana/mcp-grafana source to be more precise here. query_prometheus (and all related tools like list_prometheus_metric_names, list_prometheus_label_names, etc.) strictly require a datasourceUid, not a name, so grafana.prometheusDatasourceName is not directly usable in tool calls. the agent would need to call get_datasource_by_name first to resolve the name to a uid, then pass that uid to the prometheus tools.
2 options from here:
- nr1: change to grafana.prometheusDatasourceUid so the agent can use the value directly without a lookup. downside: users need to know their datasource uid upfront.
- nr2: keep prometheusDatasourceName and update the system message to explicitly tell the agent to resolve it via get_datasource_by_name before calling any prometheus tools. more user-friendly to configure, 1 extra tool call at runtime.
which one is best, @iplay88keys?
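For illustration, option nr2 could look roughly like the following fragment of the agent template. This is only a sketch: the systemMessage key, the section heading, and the instruction wording are assumptions, not the contents of the chart's actual templates/agent.yaml. The tool names are the ones mentioned in the comment above.

```yaml
# templates/agent.yaml (fragment; key name and wording are hypothetical)
systemMessage: |
  You are an observability agent.
  {{- if .Values.grafana.prometheusDatasourceName }}

  ## Prometheus Configuration
  The Grafana datasource for Prometheus queries is named
  "{{ .Values.grafana.prometheusDatasourceName }}".
  Before calling query_prometheus or related tools, resolve this name to a
  datasource UID with get_datasource_by_name, then pass that UID as
  datasourceUid.
  {{- end }}
```

When the value is empty, the whole block is skipped and the rendered system message is unchanged.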
a674092 to
f8aba17
Compare
I would probably lean toward the second one, though it looks like it was changed to […]. It should really only need to be run once at the beginning of the context window, and then the agent could re-use that in subsequent calls. It also allows the agent to be generic and know how to switch data sources. Another option is to have the agent's system message say to list datasources and, if there's more than one, raise that as a question to the user if it's not clear from the message.
939e2d5 to
3ec71ca
Compare
2c1ba10 to
9079422
Compare
Adds an optional prometheus.url value. When set, the URL is injected into the agent system message so the agent knows which endpoint to use. Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
…asourceName The observability agent queries Prometheus exclusively through the Grafana MCP tool server. Tools like query_prometheus take a Grafana datasourceUid, not a raw Prometheus endpoint, so injecting a URL into the system message provided no actionable value. Replace prometheus.url with grafana.prometheusDatasourceName. When set, the agent is told which Grafana datasource to use for all Prometheus queries, matching how the tools actually work. Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
…by_name with get_datasource Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
9079422 to
0370e7e
Compare
If we switch to using the grafana docker images for this, it looks like we could pin to a tagged version and not have to worry about the tool names going out of sync: https://hub.docker.com/r/grafana/mcp-grafana/tags. It might be worth a follow-up or at least an issue to track it, but it's not a big deal at this point.
oki, opened a follow-up issue to track it: #2040
Have you had a chance to test this out in your env? Can you provide some screenshots or a testing strategy showing that it works as expected?
Sorry, I meant using the agent to show this fixes the issue raised.
sure (in my otel env: https://github.com/mesutoezdil/myOTel)
Looking at your otel env repo, shouldn't it have found the […]? Based on the screenshots, it seems that the agent tried to do a tool call with the […]
the list_datasources result confirms that this Grafana has 3 datasources: Prometheus (uid: webstore-metrics), Jaeger, and OpenSearch. no VictoriaMetrics. the agent queried http_server_active_requests successfully and got real data back. the myOTel repo dashboards reference a VictoriaMetrics uid, but that is a separate env, unrelated to this test.
Ok, would you say this is working as expected, then? It seems that the issues could be related to the LLM itself. For your last example, at least, I would have expected the model to know from the prior context that […]
…essage Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
Cool, I think that's a lot better. The only remaining question is whether we should have helm tests around the configuration. I'm not set on it being a requirement, just putting it out there.
what would you expect it to test, the system message content or the rendered template?
Mostly was curious if there was much of a benefit to it. It seems that none of the agents currently have tests around them and this is pretty minor, so I don't think it'll actually be necessary.
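If a test were added later, a rendered-template check might be sketched as below. It assumes the helm-unittest plugin, a hypothetical test file name, a made-up datasource name (my-prom-ds), and an assumed path to the system message in the rendered resource. None of these come from the chart itself.

```yaml
# tests/agent_test.yaml (hypothetical file; assumes the helm-unittest plugin)
suite: observability agent datasource name
templates:
  - templates/agent.yaml
tests:
  - it: mentions the datasource name when it is set
    set:
      # Made-up name, chosen so it cannot match existing prompt text.
      grafana.prometheusDatasourceName: my-prom-ds
    asserts:
      - matchRegex:
          # Assumed location of the system message; adjust to the real resource.
          path: spec.systemMessage
          pattern: my-prom-ds
```

This only checks that the value reaches the rendered manifest. It says nothing about whether the model follows the instruction at runtime.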
Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
a97929f to
b659f97
Compare








Closes #1891
Adds a grafana.prometheusDatasourceName field to the observability agent chart values.
The observability agent reaches Prometheus exclusively through the Grafana MCP tool server.
Tools like query_prometheus take a Grafana datasource UID, not a raw endpoint, so the configured name has to be resolved to a UID before it can be used in tool calls.
When grafana.prometheusDatasourceName is set, it is injected into the agent system message so the agent knows which Grafana datasource to use for Prometheus queries across restarts.
When left empty (default), nothing changes.
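As a rough illustration of the configuration described above, the chart default and a user override might look like this. The comments and the override file name are assumptions. The datasource name "Prometheus" is the one reported by list_datasources in the test environment discussed in this PR.

```yaml
# values.yaml (fragment; comment wording is illustrative)
grafana:
  # Name of the Grafana datasource the agent should use for Prometheus queries.
  # Empty (default) leaves the agent system message unchanged.
  prometheusDatasourceName: ""
---
# my-values.yaml (hypothetical user override)
grafana:
  prometheusDatasourceName: "Prometheus"
```

The override can be passed with helm's -f flag, or the single value can be set with --set grafana.prometheusDatasourceName=Prometheus.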