Platform Handle abstraction + connection negotiation code#45

Merged
pawelchcki merged 1 commit into
mainfrom
pawel/platform_abstraction
Oct 25, 2022
Conversation

@pawelchcki

@pawelchcki pawelchcki commented Aug 25, 2022


First PR to merge in the IPC work. It adds a basic framework for safe handling of file descriptors, both within a process and across processes.

With the recent stabilization of https://doc.rust-lang.org/std/os/unix/io/struct.OwnedFd.html I've removed some no-longer-necessary abstractions that basically reimplemented OwnedFd. I've used the io-lifetimes crate to backport the implementation to older Rust versions.

The two biggest parts of this PR that lay groundwork for the IPC mechanism are:

PlatformHandle<T> - which allows safe, ref-counted sharing of an FD within the process and also facilitates transferring it between processes.

  • it can be used across threads to send data through a shared socket without using locks
  • avoiding mutexes is important as the process could be forked (e.g. in PHP) at any time
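
To make the ref-counting idea concrete, here is a minimal sketch using only the standard library. `SharedHandle`, `duplicate` and `send_from_two_threads` are hypothetical names made up for illustration; they are not the PR's `PlatformHandle<T>` API, and the real implementation may well avoid duplicating the descriptor the way this sketch does.

```rust
use std::io::{self, Read, Write};
use std::marker::PhantomData;
use std::os::unix::io::OwnedFd;
use std::os::unix::net::UnixStream;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a ref-counted handle: the descriptor is closed
/// only when the last clone is dropped.
struct SharedHandle<T> {
    fd: Arc<OwnedFd>,
    _kind: PhantomData<fn() -> T>,
}

impl<T> Clone for SharedHandle<T> {
    fn clone(&self) -> Self {
        SharedHandle { fd: Arc::clone(&self.fd), _kind: PhantomData }
    }
}

impl<T: From<OwnedFd> + Into<OwnedFd>> SharedHandle<T> {
    fn new(inner: T) -> Self {
        SharedHandle { fd: Arc::new(inner.into()), _kind: PhantomData }
    }

    /// Duplicates the descriptor and rebuilds a typed object around the copy.
    fn duplicate(&self) -> io::Result<T> {
        Ok(T::from(self.fd.try_clone()?))
    }
}

/// Two threads write through the same socket without any mutex.
fn send_from_two_threads() -> io::Result<Vec<u8>> {
    let (tx, mut rx) = UnixStream::pair()?;
    let handle = SharedHandle::new(tx);
    let workers: Vec<_> = vec![b'a', b'b']
        .into_iter()
        .map(|byte| {
            let h = handle.clone();
            thread::spawn(move || -> io::Result<()> { h.duplicate()?.write_all(&[byte]) })
        })
        .collect();
    for worker in workers {
        worker.join().expect("worker panicked")?;
    }
    let mut buf = [0u8; 2];
    rx.read_exact(&mut buf)?;
    buf.sort(); // arrival order between the two threads is not deterministic
    Ok(buf.to_vec())
}

fn main() -> io::Result<()> {
    let received = send_from_two_threads()?;
    println!("received {} bytes", received.len());
    Ok(())
}
```

Because no lock is ever held, a fork at an arbitrary point cannot leave the child with a mutex that is permanently locked by a thread that no longer exists.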

Liaison trait - which allows IPC to be set up between processes

  • it implements a mechanism to establish a shared named socket on the system
  • only one process will take ownership of the listener type on the socket
  • the rest will only connect to it when needed
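
The negotiation described in the list above could look roughly like the following. This is a simplified sketch under assumptions: `Role` and `negotiate` are hypothetical names rather than the PR's `Liaison` trait, and it relies on `bind` failing with `AddrInUse` when the socket path already exists. It leaves out any lock-based handling of stale sockets.

```rust
use std::io;
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;

/// Outcome of the negotiation (hypothetical type for this sketch).
enum Role {
    /// This process won the race and owns the listening socket.
    Listener(UnixListener),
    /// Another process already listens; we are connected to it as a client.
    Client(UnixStream),
}

/// Try to become the listener on `path`; if the path is taken, connect instead.
fn negotiate(path: &Path) -> io::Result<Role> {
    match UnixListener::bind(path) {
        Ok(listener) => Ok(Role::Listener(listener)),
        Err(e) if e.kind() == io::ErrorKind::AddrInUse => {
            UnixStream::connect(path).map(Role::Client)
        }
        Err(e) => Err(e),
    }
}

fn main() -> io::Result<()> {
    // Illustrative path only; a real deployment would pick a well-known location.
    let path = std::env::temp_dir().join(format!("liaison-demo-{}.sock", std::process::id()));
    let _ = std::fs::remove_file(&path);

    let first = negotiate(&path)?;
    let second = negotiate(&path)?;
    println!(
        "first is listener: {}, second is client: {}",
        matches!(first, Role::Listener(_)),
        matches!(second, Role::Client(_))
    );

    std::fs::remove_file(&path)
}
```

A socket file left behind by a crashed listener would make `connect` fail in this sketch, which is one reason a production version needs extra coordination.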

@pawelchcki pawelchcki requested review from a team as code owners August 25, 2022 13:53
@pawelchcki pawelchcki force-pushed the pawel/platform_abstraction branch 2 times, most recently from 57f8c14 to 2fca4a2 Compare August 31, 2022 13:27
@pawelchcki pawelchcki force-pushed the pawel/platform_abstraction branch 4 times, most recently from 58c0e45 to 1e0474a Compare September 6, 2022 09:25
Comment thread ddtelemetry/src/worker.rs
expected_scheduled_after - Duration::from_millis(1) < scheduled_in
&& scheduled_in < expected_scheduled_after
);
assert!(expected_scheduled_after - Duration::from_millis(5) < scheduled_in);


@paullegranddc I had to increase the delta, as this test was failing intermittently for me on my laptop


Makes sense. Timing tests are inherently flaky if the clock is not an injected dependency, sadly
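
For reference, injecting the clock could look like the sketch below. `Clock`, `SystemClock`, `FakeClock` and `scheduled_in` are hypothetical names for illustration and do not reflect how `worker.rs` is actually structured.

```rust
use std::cell::Cell;
use std::time::{Duration, Instant};

/// Time source the scheduling code depends on instead of calling Instant::now().
trait Clock {
    fn now(&self) -> Instant;
}

/// Production clock: defers to the operating system.
struct SystemClock;

impl Clock for SystemClock {
    fn now(&self) -> Instant {
        Instant::now()
    }
}

/// Test clock: time only moves when the test says so.
struct FakeClock {
    current: Cell<Instant>,
}

impl FakeClock {
    fn new() -> Self {
        FakeClock { current: Cell::new(Instant::now()) }
    }

    fn advance(&self, by: Duration) {
        self.current.set(self.current.get() + by);
    }
}

impl Clock for FakeClock {
    fn now(&self) -> Instant {
        self.current.get()
    }
}

/// Time remaining until `deadline`, saturating at zero once it has passed.
fn scheduled_in(clock: &dyn Clock, deadline: Instant) -> Duration {
    deadline.saturating_duration_since(clock.now())
}

fn main() {
    let real = SystemClock;
    let remaining = scheduled_in(&real, real.now() + Duration::from_secs(1));
    println!("real clock: about {:?} left", remaining);

    let fake = FakeClock::new();
    let deadline = fake.now() + Duration::from_secs(10);
    fake.advance(Duration::from_secs(4));
    println!("fake clock: exactly {:?} left", scheduled_in(&fake, deadline));
}
```

With a fake clock the assertion can use exact equality, so no tolerance window is needed.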

@pawelchcki pawelchcki force-pushed the pawel/platform_abstraction branch 2 times, most recently from 0442b10 to c074d87 Compare September 6, 2022 09:50

@paullegranddc paullegranddc left a comment


Overall, since there is no use of the actual code yet, it's kind of hard to judge the API

Comment thread ddtelemetry/src/ipc/platform/unix/locks.rs
Comment thread ddtelemetry/src/ipc/platform/unix/locks.rs
Comment thread ddtelemetry/src/ipc/platform/unix/locks.rs Outdated
Comment thread ddtelemetry/src/ipc/platform/unix/locks.rs Outdated
Comment thread ddtelemetry/src/ipc/platform/unix/platform_handle.rs Outdated
Comment thread ddtelemetry/src/ipc/platform/unix/sockets.rs Outdated
Comment thread ddtelemetry/src/ipc/platform/unix/sockets.rs Outdated
Comment thread ddtelemetry/Cargo.toml Outdated
@pawelchcki pawelchcki force-pushed the pawel/platform_abstraction branch from 81bd913 to 2be105b Compare October 6, 2022 13:34
Comment thread ddtelemetry/src/ipc/platform/unix/platform_handle.rs
Comment thread ddtelemetry/src/ipc/platform/unix/locks.rs
Comment thread ddtelemetry/src/ipc/platform/unix/platform_handle.rs Outdated
@pawelchcki pawelchcki force-pushed the pawel/platform_abstraction branch from ed10edb to 8bbba91 Compare October 17, 2022 11:09
Comment thread ddtelemetry-ffi/cbindgen.toml
Comment thread ddtelemetry-ffi/src/unix.rs
Comment thread ddtelemetry/src/ipc/sidecar/unix.rs Outdated
Comment thread ddtelemetry/src/ipc/sidecar/unix.rs Outdated
@pawelchcki pawelchcki force-pushed the pawel/platform_abstraction branch from 11f2a6c to 942a0e8 Compare October 25, 2022 12:22
Includes tools for safe handling of file descriptors, locks and forking
a daemon subprocess / sidecar.

With the recent stabilization of https://doc.rust-lang.org/std/os/unix/io/struct.OwnedFd.html I've removed some no-longer-necessary abstractions that basically reimplemented OwnedFd. I've used the io-lifetimes crate to backport the implementation to older Rust versions.

The two biggest parts of this PR that lay groundwork for the IPC mechanism are:

PlatformHandle<T> - which allows safe, ref-counted sharing of an FD within the process and also facilitates transferring it between processes.

  • it can be used across threads to send data through a shared socket without using locks
  • avoiding mutexes is important as the process could be forked (e.g. in PHP) at any time

Liaison trait - which allows IPC to be set up between processes

  • it implements a mechanism to establish a shared named socket on the system
  • only one process will take ownership of the listener type on the socket
  • the rest will only connect to it when needed
@pawelchcki pawelchcki force-pushed the pawel/platform_abstraction branch from 942a0e8 to 7bc812a Compare October 25, 2022 12:22
@pawelchcki pawelchcki merged commit 58b6edc into main Oct 25, 2022
@pawelchcki pawelchcki deleted the pawel/platform_abstraction branch October 25, 2022 12:30
ivoanjo added a commit that referenced this pull request Jun 18, 2025
* Fix typos in error constant and message

* WIP: Asynchronous cancellation for ddprof_ffi_ProfileExporterV3_send

By default, Ruby (and Java) set a profile export timeout of 30 seconds
AKA in the worst case, a call to `ddprof_ffi_ProfileExporterV3_send`
can block for that amount of time.

For Ruby in particular, we want to call
`ddprof_ffi_ProfileExporterV3_send` from a Ruby thread, but there
needs to be a way to interrupt a long-running `send` in case the
application wants to exit (e.g. the user pressed ctrl+c).

Without a way to interrupt a `send`, the Ruby VM will hang until
the timeout is hit.

To fix this, we make use of tokio's `CancellationToken`: this
concurrency utility is built so that it can be used to signal from
one thread to another that we want to return early.

**NOTE**: I'm marking this PR as WIP since I need some help with
the lifetime of the `CancellationToken`: right now it's getting
dropped by the functions that use it, whereas it should only be
dropped when `ddprof_ffi_CancellationToken_drop` manually does it.

* Simplify `send` function

* Fix crashes when using CancellationToken and add example

The insight needed to fix my first attempt at implementing cancellation
is that the `Box` type automatically drops whatever it was pointing to
when the current scope ends.

Thus, to leak something out of Rust's control, we need to use
`Box::into_raw(Box::new(ThingToBeLeaked()))`, which returns a
`*mut ThingToBeLeaked` aka a raw, unsafe pointer.

Then, to borrow that something for use, we can use
`unsafe { raw_pointer.as_ref() }` to get a `&ThingToBeLeaked` back.

Finally, when we want to free that thing again, we can turn it
back into a box; here I'm doing it implicitly, but one other
way is to use `Box::from_raw`. Then `drop` can be used again,
explicitly or implicitly.
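
As a rough illustration of that leak, borrow and free cycle, here is a self-contained sketch. The `thing_new`, `thing_value` and `thing_drop` helpers and the `DROPS` counter are made up for this example and are not the FFI functions from the commit. Note that `as_ref` on a raw pointer yields an `Option`, which is `None` for a null pointer.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many values have been dropped, to make the free step observable.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct ThingToBeLeaked {
    value: u32,
}

impl Drop for ThingToBeLeaked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Leak: move the value to the heap and hand out a raw pointer, so Rust no
/// longer drops it when the scope ends.
fn thing_new(value: u32) -> *mut ThingToBeLeaked {
    Box::into_raw(Box::new(ThingToBeLeaked { value }))
}

/// Borrow: `as_ref` returns `None` for a null pointer.
///
/// Safety: `ptr` must be null or point to a live value from `thing_new`.
unsafe fn thing_value(ptr: *const ThingToBeLeaked) -> Option<u32> {
    unsafe { ptr.as_ref() }.map(|thing| thing.value)
}

/// Free: rebuild the Box so that dropping it releases the allocation.
///
/// Safety: `ptr` must come from `thing_new` and must not be used afterwards.
unsafe fn thing_drop(ptr: *mut ThingToBeLeaked) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}

fn main() {
    let ptr = thing_new(7);
    println!("value = {:?}", unsafe { thing_value(ptr) });
    unsafe { thing_drop(ptr) };
    println!("drops so far = {}", DROPS.load(Ordering::SeqCst));
}
```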

* Remove irrelevant comment from ffi header

* Run `cargo fmt`

* Minor tweak to comment

* Make linter happy

* Add CancellationToken for sending HTTP requests

* Fix format

* Fix name on Windows

* Make `CancellationToken` optional when calling `Request::send`

* Introduce `ddprof_ffi_CancellationToken_clone` and update example to use it

A cloned CancellationToken is connected to the CancellationToken it was
created from.
Either the cloned or the original token can be used to cancel, or
be provided as an argument to send.
The useful part is that they have independent lifetimes and can be
dropped separately.

Thus, it's possible to do something like:

```c
cancel_t1 = ddprof_ffi_CancellationToken_new();
cancel_t2 = ddprof_ffi_CancellationToken_clone(cancel_t1);

// On thread t1:
    ddprof_ffi_ProfileExporterV3_send(..., cancel_t1);
    ddprof_ffi_CancellationToken_drop(cancel_t1);

// On thread t2:
    ddprof_ffi_CancellationToken_cancel(cancel_t2);
    ddprof_ffi_CancellationToken_drop(cancel_t2);
```

Without clone, both t1 and t2 would need to synchronize to make sure
neither was using the cancel before it could be dropped. With clone,
there is no need for such synchronization, both threads have their
own cancel and should drop that cancel after they are done with it.
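
The same ownership pattern can be sketched in Rust with only the standard library. This is a deliberately simplified stand-in: `CancelFlag` is a hypothetical type built on `Arc<AtomicBool>` and is not tokio's `CancellationToken`, and `send` merely simulates a slow request by polling.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Cloneable cancellation flag; every clone shares the same underlying state
/// but can be dropped independently of the others.
#[derive(Clone, Default)]
struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Simulated slow request: runs for `duration` unless cancelled first.
fn send(duration: Duration, cancel: &CancelFlag) -> Result<(), &'static str> {
    let deadline = Instant::now() + duration;
    while Instant::now() < deadline {
        if cancel.is_cancelled() {
            return Err("cancelled");
        }
        thread::sleep(Duration::from_millis(5));
    }
    Ok(())
}

/// Mirrors the two-thread example above: t1 sends, t2 cancels.
fn run_demo() -> Result<(), &'static str> {
    let cancel_t1 = CancelFlag::default();
    let cancel_t2 = cancel_t1.clone();

    // Thread t1 owns its flag and drops it when the closure finishes.
    let sender = thread::spawn(move || send(Duration::from_secs(30), &cancel_t1));

    // Thread t2 (here: the current thread) cancels and drops its own clone.
    cancel_t2.cancel();
    drop(cancel_t2);

    sender.join().expect("sender thread panicked")
}

fn main() {
    println!("send returned {:?}", run_demo());
}
```

Neither thread needs to know when the other is finished with its flag, since the shared state is freed only when the last clone goes away.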

Co-authored-by: Levi Morrison <levi.morrison@datadoghq.com>