doc: create ai-guidelines and include to CONTRIBUTING#62105
doc: create ai-guidelines and include to CONTRIBUTING#62105RafaelGSS wants to merge 11 commits intonodejs:mainfrom
Conversation
|
Review requested:
|
|
There may be some ideas we can borrow from https://llvm.org/docs/AIToolPolicy.html - for example "good first issue" should not be picked up by AI is a good one. |
I took inspiration from https://github.com/zulip/zulip/blob/main/CONTRIBUTING.md#ai-use-policy-and-guidelines |
| * **Verify accuracy** of any LLM-generated content before including it in a | ||
| PR description or comment. | ||
| * **Complete pull request templates fully** rather than replacing them with | ||
| LLM-generated summaries. |
There was a problem hiding this comment.
Do we have a template? I thought those are for issues, not PRs.
There was a problem hiding this comment.
Not strictly a template: https://github.com/nodejs/node/blob/main/.github/PULL_REQUEST_TEMPLATE.md?plain=1
There was a problem hiding this comment.
It's not possible to fulfil the instructions "Complete pull request templates fully" based on the contents of https://github.com/nodejs/node/blob/main/.github/PULL_REQUEST_TEMPLATE.md?plain=1 so it looks like this sentence needs to be removed.
There was a problem hiding this comment.
I'd be against this contribution policy update. While many different opinions exist on the licensing terms of the code produced by LLMs, my opinion is that the generated code isn't explicitly licensed and attributed to the original authors so it cannot be considered open source regardless of the used prompt.
| * **Verify accuracy** of any LLM-generated content before including it in a | ||
| PR description or comment. | ||
| * **Complete pull request templates fully** rather than replacing them with | ||
| LLM-generated summaries. |
There was a problem hiding this comment.
| * **Verify accuracy** of any LLM-generated content before including it in a | |
| PR description or comment. | |
| * **Complete pull request templates fully** rather than replacing them with | |
| LLM-generated summaries. | |
| * **Verify accuracy** of any LLM-generated content before including it in a | |
| PR description or comment. |
|
In #61478 (comment) , regarding the usage of Claude Code, @mcollina suggested:
I added it to the TSC agenda tomorrow for awareness/context collection before moving to a proper vote. @indutny sorry about the short notice since this is just one day ahead of the meeting, but if you'd like to join the meeting to present your points please let us know. |
mcollina
left a comment
There was a problem hiding this comment.
lgtm with a sentence removed
|
@joyeecheung thanks for considering me for this! I'd be happy to join if the time isn't in conflict with my work meetings tomorrow. Could you send me an invite, please? |
|
nodejs/TSC#1830 this is the issue. It might be reschedueld to the next meeting if there is low attedance given the timezone. I think we can schedule the discussion for April 1st which will be in the morning PT so a lot of people can join, and I would table the vote for that session. I would try to get an answer from the Board by then. |
There was a problem hiding this comment.
After today's Summit discussion, I'd like to recommend that we adjust messaging to be "discouraged but allowed". I think that would generally represent the sentiment that was expressed and hit the median of the collaboratorship's opinion.
If people are chill with that, happy to submit prose suggestions.
| acknowledged honestly (e.g., via an `Assisted-by:` tag in the commit | ||
| metadata) so that reviewers have appropriate context. | ||
|
|
||
| Pull requests that consist of AI-generated code the contributor has not |
There was a problem hiding this comment.
| Pull requests that consist of AI-generated code the contributor has not | |
| Pull requests that contain AI-generated code the contributor has not |
| every change they propose. Pull requests consisting of AI-generated code the | ||
| contributor has not personally understood, tested, and verified will likely be closed | ||
| without review. |
There was a problem hiding this comment.
| every change they propose. Pull requests consisting of AI-generated code the | |
| contributor has not personally understood, tested, and verified will likely be closed | |
| without review. | |
| every change they propose. Pull requests containing AI-generated code the | |
| contributor has not personally understood, tested, and verified will likely be closed | |
| without review. |
| * **Test thoroughly.** AI-generated code must pass the full test suite and | ||
| any manually written tests relevant to the change. Existing tests should not | ||
| be removed or modified without human verification. Do not rely on the LLM | ||
| to assess correctness. |
There was a problem hiding this comment.
Some hallucination machines may write tests based on the implementation of a feature instead of considering the expected behavior of the feature.
| * **Test thoroughly.** AI-generated code must pass the full test suite and | |
| any manually written tests relevant to the change. Existing tests should not | |
| be removed or modified without human verification. Do not rely on the LLM | |
| to assess correctness. | |
| * **Test thoroughly.** AI-generated code must pass the full test suite and | |
| any manually written tests relevant to the change. Existing tests should not | |
| be removed or modified without human verification. Do not rely on the LLM | |
| to assess correctness. It is crucial to manually verify the correctness of | |
| tests against the expected behavior of the feature being tested, | |
| independently of the feature's implementation. |
There was a problem hiding this comment.
Can we extract this and also apply to human-written code too? (but keep it here also)
I believe we have existing decade-old tests that should be subject to this
Mostly "increase coverage" PRs are highly subject to documenting bugs instead of fixing them (not just in Node.js, but overall)
|
The moderation policy would also need to be updated to align with this new policy doc (the moderation policy expressly forbids AI contributions from non-collaborators). |
Opened nodejs/admin#1059 |
| never be "I'm not sure. The AI did it." | ||
|
|
||
| If AI tools assisted in generating a contribution, that should be | ||
| acknowledged honestly (e.g., via an `Assisted-by:` tag in the commit |
There was a problem hiding this comment.
I would prefer for mostly-ai-driven (but reviewed by human) contributions to be explicitly labeled as mostly-ai-driven.
Assisted-by obscures that to an extent.
E.g. if the code is mostly auto-generated, it should be the main author and the human only a co-author and a Signed-off-by to signal acceptance/origin in terms of DCO.
There was a problem hiding this comment.
Hm.
The OpenJSF policy (at https://openjsf.cdn.prismic.io/openjsf/aca4d5GXnQHGZDiZ_OpenJS_AI_Coding_Assistants_Policy.pdf) specifies Assisted-by for AI assistants indeed.
Still, non ideal, but likely not something that can be solved at a project level?
There was a problem hiding this comment.
E.g. if the code is mostly auto-generated, it should be the main author and the human only a co-author and a Signed-off-by to signal acceptance/origin in terms of DCO.
I disagree with having tooling be the contribution's main author. This isn't aligned with taking responsibility for the contributions (signed-off-by !== author) and doesn't sit well with changelog entries.
While you may use assistants for the content it is ultimately you, the human, we expect a contribution from.
There was a problem hiding this comment.
an option could be to also add a label
Co-Authored-By: Beth Griggs <bethanyngriggs@gmail.com>
Co-authored-by: Aditi <62544124+Aditi-1400@users.noreply.github.com> Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Co-authored-by: Tobias Nießen <tniessen@tnie.de> Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com> Co-authored-by: Mike McCready <66998419+MikeMcC399@users.noreply.github.com> Co-authored-by: Efe <dogukankrskl@gmail.com>
c9e7c08 to
3fa6b77
Compare
3fa6b77 to
7944826
Compare
There was a problem hiding this comment.
See #62105 (comment) - I think without applying my suggestion or something along the lines, the current wording encourages mentioning brand or trademark names in the commit message, in general I think this is inappropriate regardless it's about AI or not and turns this non-profit project into a marketing ground of vendors. If somehow there are non-profit agent/model names that are neither a brand nor a trademark, I think that's fine to include it int he commit message. But if they are brands/trademarks for profit, then -1 from me to put them in commit message metadata and they should belong to e.g. PR descriptions.
Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
joyeecheung
left a comment
There was a problem hiding this comment.
LGTM % a suggestion to spell out exactly how the "approval of automated tooling" works
Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
As discussed in today's TSC meeting.
cc: @nodejs/tsc @BridgeAR