ffi: add experimental fast FFI call API #63068
ffi: add experimental fast FFI call API #63068ShogunPanda wants to merge 1 commit intonodejs:mainfrom
Conversation
|
Review requested:
|
|
At first glance, some observations:
|
| ffi_args_heap.resize(nargs); | ||
| values = values_heap.data(); | ||
| ffi_args = ffi_args_heap.data(); | ||
| } |
There was a problem hiding this comment.
This is exactly what MaybeStackBuffer is there for
| } | ||
|
|
||
| return true; | ||
| } |
There was a problem hiding this comment.
C++ style: This should return std::optional<std::pair<FastFFIType, CTypeInfo>>
| std::shared_ptr<void> fast_code; | ||
| std::vector<v8::CTypeInfo> fast_arg_info; | ||
| std::unique_ptr<v8::CFunctionInfo> fast_function_info; | ||
| std::unique_ptr<v8::CFunction> fast_c_function; |
There was a problem hiding this comment.
Feel free to leave a TODO for me to clean up the allocation management here, having 10+ separate heap allocations for each function seems like a lot
| @@ -129,6 +129,9 @@ class EnvironmentOptions : public Options { | |||
| bool experimental_addon_modules = EXPERIMENTALS_DEFAULT_VALUE; | |||
| bool experimental_eventsource = EXPERIMENTALS_DEFAULT_VALUE; | |||
| bool experimental_ffi = EXPERIMENTALS_DEFAULT_VALUE; | |||
| #if HAVE_FAST_FFI | |||
| bool experimental_fast_ffi = EXPERIMENTALS_DEFAULT_VALUE; | |||
| #endif | |||
There was a problem hiding this comment.
Just to echo what @bengl said – It seems like having the flag available unconditionally would not break anything and just make things easier (e.g. save you the file reexecution jumps you're hooping through in the tests).
There was a problem hiding this comment.
This is first-party Node.js core code, right? It probably shouldn't live in deps/ in the long run
| allocate a temporary UTF-8 copy. For performance-sensitive C string APIs, encode | ||
| the string before invoking the native function, for example with `TextEncoder`, | ||
| and declare the parameter as `buffer` or `arraybuffer`. Include the trailing | ||
| `\0` byte when the native API expects a NUL-terminated string. |
There was a problem hiding this comment.
... but that's also a temporary UTF-8 copy, just like passing a string directly would have been?
| kBuffer = 12, | ||
| }; | ||
|
|
||
| bool ToToFastFFIType(ffi_type* type, |
| }; | ||
|
|
||
| bool ToToFastFFIType(ffi_type* type, | ||
| const std::string& type_name, |
There was a problem hiding this comment.
| const std::string& type_name, | |
| std::string_view type_name, |
| #if HAVE_FAST_FFI | ||
| PrepareFastFunction(env, fn.get()); | ||
| const CFunction* fast_c_function = fn->fast_c_function.get(); | ||
| #endif |
There was a problem hiding this comment.
| #endif | |
| #else | |
| const CFunction* fast_c_function = nullptr; | |
| #endif |
that lets you get rid of the much larger conditional below here
Unfortunately that's not the case, as far as I understood this problem. V8 Fast API optimize JS -> C++ entry, Cranelift generates the native wrapper that performs the ABI-correct call to the FFI target. FFI signatures are declared at runtime, while V8 Fast API requires a concrete native signature for each fast callable. Cranelift is what turns the runtime FFI signature into such a concrete callable. A libffi-only Fast API path is possible, but only for a finite set of predefined C++ wrapper signatures, and it would still route through ffi_call(). That would not provide the universal fast path this PR is trying to introduce.
@addaleax Also concurred on this below. I'll remove it.
I'll attach some benchmarks tomorrow so we can compare.
As far as I understand, TCC is LGPL which is not usable in Node.js? Am I wrong? |
Does it? I haven't tried it out myself, but there are CFunction(const void* address, const CFunctionInfo* type_info);
CFunctionInfo(const CTypeInfo& return_info, unsigned int arg_count,
const CTypeInfo* arg_info,
Int64Representation repr = Int64Representation::kNumber);constructors available, which should allow constructing CFunction instances with runtime-supplied type information, no? |
|
I get a little confused here. I guess you're right, but what are they invoking? How are the target functions built? |
|
@ShogunPanda Yeah, so, looking at the code in Like @bengl said, the wrapper logic and its (massive) scaffolding is fairly independent from the core V8 fast call integration, and making these separate PRs (and separate decisions) seems wise. |
As for your questions – they are invoking native functions living in the process's memory, and they are typically built with a compiler. But these don't seem like actual answers to your questions, so I'm not sure I understand what you're saying here |
|
After a brainstorm sessing with @bengl I finally got a confirmation of my interpretation of your request and run a local spike. I checked the direct CFunction(address, CFunctionInfo*) path. It does not work for plain FFI symbols because V8 Fast API signatures include the JS receiver as the first C argument. A native FFI symbol such as int32_t(int32_t) therefore does not match a JS call with one argument; V8 expects a fast callback shaped like int32_t(Local receiver, int32_t). So direct V8 Fast API can use runtime type info, but it still requires an embedder-compatible wrapper. For runtime FFI signatures that means either a finite set of predefined wrappers or generated trampolines. Since I want to have a the "most universal solution" possible, I don't want to introduce predefined wrappers. I've evaluated other possible solutions but so far Cranelift seems to be the only viable. Do you concur on this or am I missing anything? |
Is that requirement made explicit or documented anywhere? I did try manually to remove the receiver argument from some of the Node.js built-in fast API call functions, and it didn't seem to make a difference (obviously this only works if the second argument isn't also a
I think it's still worth thinking about ways in which to remove restrictions on the V8 side (which, yes, that has annoying implications around timelines because it's non-trivial upstream work), but it seems like something that would be significantly cleaner in the medium term |
|
@addaleax Can you point me where you successfully removed it? I can try something similar. |
|
@ShogunPanda Hm, it looks like this just worked "silently" because without You're right that "shifting" away the first argument cannot really be done without some runtime/JIT compilation. I don't know if Cranelift is worth the overhead, since we're using it for a very very specific use case, and it would be possible to implement this for x64/arm64 ourselves, but I also see how an actual compiler library is something worth thinking about in that case. I think I'd still have a mild preference for seeing if there are ways to achieve the same goals by collaborating with the V8 team. Removing the need for a receiver argument, for example, should not be too complex (other than the requirement to modify V8 for this). |
|
I think given all that, for now, the best move is still to start with a PR that adds V8 Fast Calls alone, with no other changes, and then hold off on the rest until the modifying V8 is explored. @ShogunPanda SGTY? |
Signed-off-by: Paolo Insogna <paolo@cowtech.it> Assisted-By: OpenAI:GPT-5.5 <openai/gpt-5.5>
|
Benchmarks ony my machine (Apple M2 Max on MacOS 26): |
Review Guide: Fast FFI
TODO
Summary
This PR adds a V8 Fast API-backed call path for experimental
node:ffi.Fast FFI is not a separate user-facing feature or flag. It is used automatically
for eligible signatures when
--experimental-ffiis enabled. Unsupportedsignatures continue through the SharedBuffer or generic libffi paths.
For implementation details, see:
doc/contributing/ffi-fast-api-internals.mdWhat Changed
Native implementation:
src/ffi/fast.hsrc/ffi/fast.ccsrc/ffi/platforms/*.ccsrc/node_ffi.ccsrc/node_ffi.hsrc/env_properties.hJavaScript routing:
lib/ffi.jslib/internal/ffi/fast-api.jslib/internal/ffi-shared-buffer.jsTests, benchmarks, docs:
test/ffi/test-ffi-fast-buffer.jstest/ffi/test-ffi-shared-buffer.jstest/ffi/test-ffi-calls.jsbenchmark/ffi/*.jsdoc/api/ffi.mddoc/contributing/ffi-fast-api-internals.mdKey Design Points
src/ffi/platforms/*.cc.lib/ffi.jsowns wrapper orchestration.lib/internal/ffi-shared-buffer.jsowns only SharedBuffer wrapping.lib/internal/ffi/fast-api.jsowns Fast API pointer/string/buffer conversions.Reviewer Focus
Please pay particular attention to:
FastFFIMetadatalib/ffi.jsname,length, andpointeron wrappersCorrectness Checklist
i64,u64, and pointer values preserve BigInt behavior.f32/f64preserveNaN, infinities, and-0.byteOffset.Disclaimer
Assisted-By: OpenAI:GPT-5.5 <openai/gpt-5.5>