Skip to content

Preserve last error in call-counting stubs#60886

Merged
jkotas merged 1 commit into
dotnet:mainfrom
jkotas:issue-60819
Oct 26, 2021
Merged

Preserve last error in call-counting stubs#60886
jkotas merged 1 commit into
dotnet:mainfrom
jkotas:issue-60819

Conversation

@jkotas

@jkotas jkotas commented Oct 26, 2021

Copy link
Copy Markdown
Member

Fixes #60819

@jkotas jkotas requested a review from kouvel October 26, 2021 19:08
@ghost ghost added the area-Interop-coreclr label Oct 26, 2021
@jkotas jkotas requested a review from jkoritzinsky October 26, 2021 19:08
@tannergooding

Copy link
Copy Markdown
Member

Is this being considered for backport or servicing in .NET 6 so users can rely on Marshal.GetLastSystemError?

@jkotas

jkotas commented Oct 26, 2021

Copy link
Copy Markdown
Member Author

Yes, I think we should backport it.

@jkotas

jkotas commented Oct 26, 2021

Copy link
Copy Markdown
Member Author

/backport to release/6.0

@github-actions

Copy link
Copy Markdown
Contributor

Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1387213911

@kouvel kouvel left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tannergooding tannergooding left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Is there anything additional we can or should do to help catch cases where BEGIN_PRESERVE_LAST_ERROR should be added to avoid accidentally breaking this, particularly in rare edge cases, in the future?

@jkotas

jkotas commented Oct 26, 2021

Copy link
Copy Markdown
Member Author

We intentionally corrupt last error in checked builds of the runtime #60819 (comment) to increase chances of the problem being caught. It would take a lot longer to catch it without this instrumentation.

I am not sure what we can do better. These sort of bugs are a general problem with unpredictable runtime features like GC or tiered compilation.

It is also fairly common to corrupt the last error in managed code. I think the best fix for that would be to stop depending on thread-local storage for last error, and instead return it as a tuple with the actual result, similar to how it is done in e.g. golang. DllImport source generator may help with that - we would assume that the DllImport source generator can get it right and capture the last error before somebody gets a chance to corrupt it.

@jkotas jkotas merged commit f734307 into dotnet:main Oct 26, 2021
@jkotas jkotas deleted the issue-60819 branch October 26, 2021 22:04
@danmoseley

Copy link
Copy Markdown
Member

Why was seemingly only this test affected? There are many other places we test last error that I would expect to cause the wrong exception etc.

@jkotas

jkotas commented Oct 27, 2021

Copy link
Copy Markdown
Member Author

The error corruption will only happen when we execute slow path of tiered compilation call counting stub. The call counting stub is installed some time after startup and the slow path of call counting stub is executed after it got executed certain number of times. This test was lucky enough to hit the slow path of the tiered compilation call counting stub, and be sensitive to last error corruption (the last error corruption won't cause real problem most of time).

@danmoseley

Copy link
Copy Markdown
Member

Would there be value in running all our tests e.g. locally with the slow path forced?

@kouvel

kouvel commented Oct 27, 2021

Copy link
Copy Markdown
Contributor

Yea it might be useful to lower the call count threshold for debug/checked builds, updated in #60945

@ghost ghost locked as resolved and limited conversation to collaborators Nov 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Test failure Color.RedirectedOutputDoesNotUseAnsiSequences

5 participants