Skip to content

[build-tools] Add eas/run_maestro_tests step with smart retry#3635

Open
hSATAC wants to merge 2 commits intomainfrom
ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode
Open

[build-tools] Add eas/run_maestro_tests step with smart retry#3635
hSATAC wants to merge 2 commits intomainfrom
ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode

Conversation

@hSATAC
Copy link
Copy Markdown
Contributor

@hSATAC hSATAC commented Apr 27, 2026

Why

After porting the Maestro step to TS (#3641), we want to stop re-running the entire flow set on every retry. With dozens of flows a single flaky one masks other failures and inflates the whole run. Smart retry re-runs only the flows that actually failed.

Linear: ENG-19927. Stacked on #3641.

How

After each failed attempt, parse the per-attempt JUnit XML, find the failing testcase names, map them back to flow file paths, and pass only those paths to the next maestro test invocation.

  • parseFailedFlowsFromJUnit returns the failed-flow subset, or null when the result can't be trusted (malformed XML, duplicate testcase names, unknown flow). On null the loop falls back to dumb retry, preserving the previous behavior.
  • Replaces "copy latest attempt's JUnit" with mergeJUnitReports, which keeps the latest result per flow across attempts. The merged final report reflects each flow's best outcome (a flow that flaked once and passed on retry shows as passed).

The flow_name → flow_file_path map is built from Maestro's ai-*.json debug output. PR #3639 (next in stack) replaces this with a self-built map after Maestro 2.5.0 broke this assumption.

Test Plan

  • Unit tests in maestroResultParser.test.ts for parseFailedFlowsFromJUnit (happy path, all-passing, collisions, malformed XML, partial parse) and mergeJUnitReports (per-flow latest, fallback on data errors).
  • New cases in runMaestroTests.test.ts exercising the smart-retry loop.
  • Local end-to-end smoke test against the playground.

@linear
Copy link
Copy Markdown

linear Bot commented Apr 27, 2026

@hSATAC hSATAC added the no changelog PR that doesn't require a changelog entry label Apr 27, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 90.00000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.08%. Comparing base (8229dc8) to head (14e6591).

Files with missing lines Patch % Lines
...d-tools/src/steps/functions/maestroResultParser.ts 90.00% 13 Missing ⚠️
...es/build-tools/src/steps/functions/maestroTests.ts 90.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3635      +/-   ##
==========================================
+ Coverage   55.97%   56.08%   +0.12%     
==========================================
  Files         866      866              
  Lines       37312    37439     +127     
  Branches     7759     7785      +26     
==========================================
+ Hits        20882    20995     +113     
- Misses      16334    16348      +14     
  Partials       96       96              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@hSATAC hSATAC force-pushed the ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode branch from 13e1d4c to c0d875b Compare April 27, 2026 15:57
Copy link
Copy Markdown
Contributor Author

hSATAC commented Apr 28, 2026

@hSATAC hSATAC changed the base branch from main to graphite-base/3635 April 28, 2026 14:09
@hSATAC hSATAC force-pushed the ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode branch from c0d875b to 0c794e5 Compare April 28, 2026 14:09
@hSATAC hSATAC changed the base branch from graphite-base/3635 to ash/eng-19927-port-bash-maestro-step-to-typescript April 28, 2026 14:09
@hSATAC hSATAC force-pushed the ash/eng-19927-port-bash-maestro-step-to-typescript branch from a89b35a to 9806cfe Compare April 29, 2026 07:30
@hSATAC hSATAC force-pushed the ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode branch from 0c794e5 to 64b5fa4 Compare April 29, 2026 07:30
@hSATAC hSATAC requested a review from sjchmiela April 29, 2026 08:04
@hSATAC hSATAC marked this pull request as ready for review April 29, 2026 08:04
@github-actions
Copy link
Copy Markdown

Subscribed to pull request

File Patterns Mentions
**/* @douglowder

Generated by CodeMention

Copy link
Copy Markdown
Contributor

@sjchmiela sjchmiela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's make this opt-in?

if (isFilesystemError(mergeErr)) {
throw new SystemError('Failed to write final_report_path', { cause: mergeErr });
}
logger.warn({ err: mergeErr }, 'Smart merge failed; falling back to copy-latest.');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's make it more user-understandable

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +605 to +607
try {
resolvedRoot = await fs.realpath(projectRoot);
} catch {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like asyncResult(fs.realpath(projectRoot)) for stuff like this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is removed in the next PR as well. I'll skip it here.

Comment thread packages/build-tools/src/steps/functions/maestroResultParser.ts Outdated
);
const fileGroups: FileGroup[] = [];
const skippedFiles: string[] = [];
for (const { filename, content } of contents) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would your parser do if instead of merging in this smart way you did

<testsuites>
  <testsuite>
    {...testsuitesfromfirstfile}
    {...testsuitesfromsecondfile}
    {...testsuitesfromthirdfile}

?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to clarify first — this merge isn’t for our own parsers. All of our internal parsers read the per-attempt attempt-*.xml files directly from junit_report_directory. The merged final_report_path only exists as a human-facing artifact for customers who download it and run their own analysis.

Given that, two reasons I’d prefer to keep the smart merge:

  1. JUnit shape — Concatenating attempt-N <testsuite>s under one <testsuites> creates an ambiguous shape (e.g. flow-a, flow-b, flow-c, flow-b, flow-b for retries). It’s not a shape that Maestro emits.

  2. Backwards compatibility — Changing this to concat would likely break existing customer aggregation logic that depends on the current final_report_path shape.

@hSATAC hSATAC force-pushed the ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode branch from 64b5fa4 to 08a9151 Compare April 30, 2026 07:27
@hSATAC hSATAC force-pushed the ash/eng-19927-port-bash-maestro-step-to-typescript branch from 9806cfe to 9d16d62 Compare April 30, 2026 07:27
@hSATAC hSATAC force-pushed the ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode branch from 08a9151 to 11f2a04 Compare April 30, 2026 07:44
@hSATAC hSATAC force-pushed the ash/eng-19927-port-bash-maestro-step-to-typescript branch from 9d16d62 to b3f6380 Compare April 30, 2026 07:44
@hSATAC hSATAC force-pushed the ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode branch from aa0cf30 to e222a62 Compare April 30, 2026 09:38
Base automatically changed from ash/eng-19927-port-bash-maestro-step-to-typescript to main April 30, 2026 09:47
hSATAC added 2 commits April 30, 2026 17:47
In `reuse_devices` mode, narrow the flow_paths to only the failing flows
on retry. After a failed attempt, parse the per-attempt JUnit XML to
identify the flows that failed and pass just those to the next maestro
invocation.

Falls back to retrying everything when the JUnit cannot be trusted
(missing testcases, conflicting names, malformed XML).

Replaces copy-latest with mergeJUnitReports — picks the latest attempt's
result per flow across retries — so the merged final report reflects the
best outcome each flow achieved. Falls back to copy-latest on data errors.

Signed-off-by: Ash Wu <hsatac@gmail.com>
Signed-off-by: Ash Wu <hsatac@gmail.com>
@hSATAC hSATAC force-pushed the ash/eng-19927-only-retry-failed-maestro-flows-in-reuse_devices-mode branch from e222a62 to 14e6591 Compare April 30, 2026 09:48
@github-actions
Copy link
Copy Markdown

⏩ The changelog entry check has been skipped since the "no changelog" label is present.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no changelog PR that doesn't require a changelog entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants