fix(eval): handle failed inference results without invocations by he-yufeng · Pull Request #5878 · google/adk-python

he-yufeng · 2026-05-28T10:45:27Z

What changed

Return a failed EvalCaseResult when inference failed before producing any invocations.
Preserve the existing session lookup path when a failed inference still has a session id.
Add a regression test for InferenceResult(status=FAILURE, inferences=None).
Clean up two existing lint issues in the touched eval test file so the local changed-file ruff check passes.

This prevents the eval runner from replacing the original inference error with TypeError: object of type 'NoneType' has no len().

Fixes #5876

To verify

.\.venv\Scripts\python.exe -m py_compile src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py
.\.venv\Scripts\python.exe -m pytest tests\unittests\evaluation\test_local_eval_service.py -k "failed_without_inferences or evaluate_single_inference_result" -q --basetemp .tmp\pytest
.\.venv\Scripts\python.exe -m ruff check src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py
.\.venv\Scripts\python.exe -m pyink --check src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py
git diff --check

rohityan · 2026-05-28T17:57:01Z

Hi @he-yufeng , Thank you for your contribution! We appreciate you taking the time to submit this pull request. Please fix formatting errors by running autoformat.sh

he-yufeng · 2026-05-28T20:52:17Z

Thanks for the review. I rebased onto the latest main, ran pyink on the touched files, and also cleaned two existing lint issues in the touched test file so the changed-file checks are clean. Local verification now passes for py_compile, the targeted eval tests, ruff, pyink --check, and git diff --check.

he-yufeng · 2026-05-29T07:57:52Z

Rebased onto current upstream/main and reran the focused eval validation. Validation: test_local_eval_service.py passed (19 passed); ruff check passed; pyink --check passed; py_compile passed; git diff --check passed. I set PYTHONPATH=src so pytest imports this worktree explicitly.

rohityan · 2026-05-29T19:07:31Z

Hi @ankursharmas , can you please review this.

he-yufeng · 2026-06-04T03:56:56Z

Rebased onto current upstream/main and reran the focused eval checks.

Validation on Windows:

PYTHONUTF8=1 PYTHONPATH=src uv run --no-sync pytest tests\unittests\evaluation\test_local_eval_service.py -q --basetemp .tmp\pytest-5878-20260604 -p no:cacheprovider
# 19 passed, 54 warnings

PYTHONUTF8=1 PYTHONPATH=src uv run --no-sync ruff check src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py
PYTHONUTF8=1 PYTHONPATH=src uv run --no-sync pyink src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py --check
PYTHONUTF8=1 PYTHONPATH=src uv run --no-sync python -m py_compile src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py
git diff --check upstream/main..HEAD

he-yufeng · 2026-06-06T19:47:31Z

Rebased this branch onto current upstream/main and force-pushed the cleaned head.

Validation on Windows:

python -m py_compile src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py
PYTHONUTF8=1 PYTHONPATH=src .\.venv\Scripts\python.exe -m pytest tests\unittests\evaluation\test_local_eval_service.py -q --basetemp .tmp\pytest-5878-20260607 -p no:cacheprovider
# 19 passed, 41 warnings
PYTHONUTF8=1 uv run --no-sync pyink src\google\adk\evaluation\local_eval_service.py tests\unittests\evaluation\test_local_eval_service.py --check
# 2 files would be left unchanged
git diff --check upstream/main..HEAD

The local eval test environment needed optional eval dependencies installed in the existing ADK venv (pandas, google-cloud-aiplatform, rouge-score, mcp) before the full focused file could collect and run.

adk-bot · 2026-06-06T19:48:52Z

I will read the instruction file for the adk-pr-analyze skill before analyzing the PR.I will run the CLA verification script first to ensure compliance before performing any analysis or reading any diff.I will fetch the metadata for pull request 5878 using the custom GitHub PR tool.I need to execute the CLA verification script to verify that the contributor (@he-yufeng) has signed the Google Contributor License Agreement (CLA), which is a mandatory compliance gate before analyzing any code or pull requests. Please approve this command.I will fetch the linked issue #5876 to get more details about the problem that this PR aims to solve.I will execute gh pr diff using run_command to view the details of the changes made in PR 5878.I will view the relevant section of local_eval_service.py to trace the context of the bug fix.I will view a bit further down in local_eval_service.py to see what is returned by _evaluate_single_inference_result normally.I will view the import and license block of local_eval_service.py to check for pythonic conventions like physical headers and future annotations.I will search for get_session to see the type signature of the session details return value.I will view base_session_service.py to identify the typing of the get_session return value.I will search for EvalCaseResult definition to check its fields and types.I will view src/google/adk/evaluation/eval_result.py around line 31 to inspect EvalCaseResult.I will view the rest of the fields of EvalCaseResult in eval_result.py.I will view test_local_eval_service.py around line 465 to confirm the exact line numbers and format of the added unit tests.I will view the preceding lines of test_local_eval_service.py to see the original structure and find out where the new test fits in.I will generate a premium, structured Pull Request Analysis Report for PR 5878 as a Markdown artifact so it is persistently stored and formatted neatly.I have completed a thorough, read-only analysis of Pull Request #5878 and its linked GitHub Issue #5876.

The complete, premium structured analysis report has been generated and saved to your workspace artifacts:
👉 pr_5878_analysis_report.md

📋 Summary of Findings & Work Completed

Core Problem Solved: Resolved evaluation execution crashes (TypeError: object of type 'NoneType' has no len()) in the LocalEvalService class when model inference fails completely before spawning any invocations.
Implementation Verification:
- Swapped the unsafe len(inference_result.inferences) check by implementing a robust early guard within _evaluate_single_inference_result.
- Confirmed this correctly routes the failed job into a cleanly structured, failed EvalCaseResult.
Style & Guidelines Compliance:
- The PR perfectly mirrors Google ADK's development paradigms.
- Added a well-structured Arrange-Act-Assert behavior test: test_evaluate_single_inference_result_failed_without_inferences.
- Removed a redundant sys import import block and modernized styling (converting older == None targets to standard is None checks in tests).
Recommendation: Approve. Excellent, highly focused contribution.

adk-bot added the eval [Component] This issue is related to evaluation label May 28, 2026

rohityan self-assigned this May 28, 2026

rohityan added the request clarification [Status] The maintainer need clarification or more information from the author label May 28, 2026

he-yufeng force-pushed the fix/eval-failed-inference-none branch from 7a33065 to 0315922 Compare May 28, 2026 20:51

he-yufeng changed the title ~~fix: handle failed eval inference results~~ fix(eval): handle failed inference results without invocations May 28, 2026

surajksharma07 mentioned this pull request May 29, 2026

LocalEvalService crashes with len(None) when inference_result.inferences is None #5876

Open

he-yufeng force-pushed the fix/eval-failed-inference-none branch from 0315922 to daddfd8 Compare May 29, 2026 07:57

boyangsvl force-pushed the fix/eval-failed-inference-none branch from daddfd8 to 64759da Compare May 29, 2026 18:01

rohityan added needs review [Status] The PR/issue is awaiting review from the maintainer and removed request clarification [Status] The maintainer need clarification or more information from the author labels May 29, 2026

rohityan requested a review from ankursharmas May 29, 2026 19:07

he-yufeng force-pushed the fix/eval-failed-inference-none branch from 64759da to 3a07e3a Compare June 4, 2026 03:56

fix: handle failed eval inference results

0307ada

he-yufeng force-pushed the fix/eval-failed-inference-none branch from 3a07e3a to 0307ada Compare June 6, 2026 19:47

thoang3 mentioned this pull request Jun 7, 2026

GEPA optimizer crashes with KeyError on failed eval cases #6004

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(eval): handle failed inference results without invocations#5878

fix(eval): handle failed inference results without invocations#5878
he-yufeng wants to merge 1 commit into
google:mainfrom
he-yufeng:fix/eval-failed-inference-none

he-yufeng commented May 28, 2026 •

edited

Loading

Uh oh!

rohityan commented May 28, 2026

Uh oh!

he-yufeng commented May 28, 2026

Uh oh!

he-yufeng commented May 29, 2026

Uh oh!

rohityan commented May 29, 2026 •

edited

Loading

Uh oh!

he-yufeng commented Jun 4, 2026

Uh oh!

he-yufeng commented Jun 6, 2026

Uh oh!

adk-bot commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

he-yufeng commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changed

To verify

Uh oh!

rohityan commented May 28, 2026

Uh oh!

he-yufeng commented May 28, 2026

Uh oh!

he-yufeng commented May 29, 2026

Uh oh!

rohityan commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

he-yufeng commented Jun 4, 2026

Uh oh!

he-yufeng commented Jun 6, 2026

Uh oh!

adk-bot commented Jun 6, 2026

📋 Summary of Findings & Work Completed

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

he-yufeng commented May 28, 2026 •

edited

Loading

rohityan commented May 29, 2026 •

edited

Loading