
feat(sandbox): Landlock PTY + direct TCP + DNS (+ #708 basic auth credential resolve) #2

Open
kosaku-sim wants to merge 5 commits into base/upstream-synced from fix/basic-auth-credential-resolve


@kosaku-sim

Summary

Custom patches on top of NVIDIA upstream for simount's AutoDev sandbox deployment. Base branch base/upstream-synced is a snapshot of nvidia/OpenShell:main (commit 3dd6d51c) to keep the diff clean (5 commits).

Commits (latest first)

  1. feat(sandbox): add PTY devices to proxy-mode baseline read-write paths (6be4e01b)

    • Adds /dev/ptmx and /dev/pts to PROXY_BASELINE_READ_WRITE so Landlock does not block VS Code Remote-SSH's node-pty allocation for integrated terminals.
    • Extends unit tests covering proxy-mode baseline enrichment.
  2. fix: broad TCP 443 ACCEPT instead of per-IP rules for direct hosts (4cb2f388)

    • Replaces per-IP iptables ACCEPT rules with a broad TCP 443 ACCEPT when OPENSHELL_DIRECT_TCP_HOSTS is set, because DNS round-robin rotates Google API IPs and invalidates per-IP rules.
  3. feat: allow direct TCP 443 for OPENSHELL_DIRECT_TCP_HOSTS (b91ae834)

    • Bypasses the proxy for HTTPS to pre-declared hosts (e.g. Slack Socket Mode endpoints).
  4. fix: add IP forwarding and NAT for DNS through sandbox veth (59ecdd67)

    • Host-side veth forwarding + MASQUERADE so DNS return traffic reaches sandbox netns.
  5. fix: allow UDP DNS to cluster nameserver in sandbox netns (e63ada7c)

    • iptables ACCEPT for UDP 53 to cluster DNS.

Why a fork branch (not upstream PR)

  • Patches 2–5 are specific to Kubernetes cluster DNS + Slack Socket Mode workflows and are tracked in internal docs.
  • Patch 1 is generally useful (node-pty support under Landlock) but needs NVIDIA input on preferred approach (enumerate devices vs broaden baseline) before an upstream PR.

Deployment status

Build `0.0.18-dev.57+g4cb2f388` is currently running in the production sandbox. This branch is already rebased on top of upstream PR NVIDIA#708 (L7 credential injection for Basic auth), which resolves git-over-HTTPS 401 errors for private repos.

Test plan

  • `cargo build --release -p openshell-sandbox` (aarch64 native, Rust 1.88, 4m23s)
  • `git clone https://github.com/simount/.git` from sandbox succeeds (was 401 before deploy)
  • Sandbox pod restart via `autodev-reconcile.sh restart-sandbox` (cred backup/restore)
  • VS Code Remote-SSH integrated terminal allocates PTY without EACCES
  • Unit tests in `baseline_tests` module (local cargo test)

kosaku-sim and others added 5 commits April 11, 2026 06:58
The sandbox iptables rules unconditionally REJECT all UDP traffic,
which blocks DNS resolution for libraries that bypass HTTP_PROXY
(e.g. Node.js ws used by @slack/socket-mode).

Add an ACCEPT rule for UDP port 53 to the nameserver from
/etc/resolv.conf (or OPENSHELL_DNS_SERVER env override) before
the blanket UDP REJECT, so sandboxed processes can resolve
external hostnames without opening a broad UDP hole.

Fixes: NVIDIA/NemoClaw#409
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
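
The nameserver selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names are hypothetical, the `OUTPUT` chain and the insert position are assumptions, and an IPv4 nameserver is assumed (an IPv6 one would need `ip6tables`).

```rust
use std::net::IpAddr;

/// Pick the DNS server: a valid override (e.g. the value of
/// OPENSHELL_DNS_SERVER) wins; otherwise take the first `nameserver`
/// entry from resolv.conf-style text. An unparsable override falls
/// through to resolv.conf.
pub fn pick_dns_server(env_override: Option<&str>, resolv_conf: &str) -> Option<IpAddr> {
    if let Some(ip) = env_override.and_then(|v| v.trim().parse().ok()) {
        return Some(ip);
    }
    resolv_conf.lines().find_map(|line| {
        let mut parts = line.split_whitespace();
        match (parts.next(), parts.next()) {
            (Some("nameserver"), Some(addr)) => addr.parse().ok(),
            _ => None, // comments, `search`, `options`, blank lines
        }
    })
}

/// Build iptables arguments that accept UDP 53 to one nameserver only.
/// `-I OUTPUT 1` (assumed chain) puts the rule ahead of the blanket REJECT.
pub fn dns_accept_args(dns: IpAddr) -> Vec<String> {
    let dst = dns.to_string();
    ["-I", "OUTPUT", "1", "-p", "udp", "-d", dst.as_str(), "--dport", "53", "-j", "ACCEPT"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}
```

Scoping the rule to a single destination address and port is what keeps the exception narrow: every other UDP packet still hits the REJECT rule.
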
The DNS ACCEPT iptables rule alone is insufficient because the
sandbox netns routes everything via 10.200.0.1 (host veth).
DNS UDP packets reach the host side but the pod network cannot
route responses back to 10.200.0.2 (sandbox IP).

Enable IP forwarding on the host veth and add MASQUERADE so DNS
packets appear to come from the pod IP, allowing CoreDNS to
respond correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
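
As a rough sketch of the host-side setup, the two steps (per-interface forwarding, then MASQUERADE for DNS from the sandbox IP) could be assembled as command lines like this. The function name and the veth interface name are hypothetical, and the exact rule placement in the real commit may differ.

```rust
use std::net::Ipv4Addr;

/// Convert a borrowed argument list into owned strings.
fn owned(args: &[&str]) -> Vec<String> {
    args.iter().map(|s| s.to_string()).collect()
}

/// Command lines (program + args) for the host side of the veth pair:
/// 1. enable IPv4 forwarding on the host veth interface,
/// 2. masquerade UDP 53 from the sandbox IP so replies route back.
pub fn host_dns_nat_commands(host_veth: &str, sandbox_ip: Ipv4Addr) -> Vec<Vec<String>> {
    let forwarding = format!("net.ipv4.conf.{host_veth}.forwarding=1");
    let src = format!("{sandbox_ip}/32");
    vec![
        owned(&["sysctl", "-w", forwarding.as_str()]),
        owned(&[
            "iptables", "-t", "nat", "-A", "POSTROUTING",
            "-s", src.as_str(), "-p", "udp", "--dport", "53",
            "-j", "MASQUERADE",
        ]),
    ]
}
```

Each inner vector could be handed to `std::process::Command`, with the first element as the program. With MASQUERADE in place, CoreDNS sees the pod IP as the source and its reply is translated back to 10.200.0.2 by connection tracking.
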
Libraries like Node.js ws (used by @slack/socket-mode) resolve DNS
then connect directly to the resolved IP on TCP 443, ignoring
HTTP_PROXY. The sandbox iptables rules REJECT all TCP traffic that
bypasses the proxy, breaking these connections even after DNS
resolution succeeds.

Add OPENSHELL_DIRECT_TCP_HOSTS env var (comma-separated hostnames).
At sandbox netns setup, resolve these hosts and install:
- iptables ACCEPT for TCP 443 to resolved IPs (sandbox side)
- MASQUERADE + FORWARD rules (host side) for return routing

This pairs with the DNS ACCEPT rule from the previous commit to
provide full direct connectivity for proxy-unaware libraries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
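
A minimal sketch of the parse-then-resolve step, using only the standard library. The function names are hypothetical and the `OUTPUT` chain is an assumption; the commit's actual resolution and rule-installation code may look different.

```rust
use std::net::{IpAddr, ToSocketAddrs};

/// Split a comma-separated OPENSHELL_DIRECT_TCP_HOSTS value into
/// lowercase hostnames, dropping empty entries and duplicates.
pub fn parse_direct_hosts(raw: &str) -> Vec<String> {
    let mut hosts: Vec<String> = Vec::new();
    for h in raw.split(',').map(str::trim).filter(|h| !h.is_empty()) {
        let h = h.to_ascii_lowercase();
        if !hosts.contains(&h) {
            hosts.push(h);
        }
    }
    hosts
}

/// Resolve one host to the unique set of IPs it currently maps to.
pub fn resolve_443(host: &str) -> std::io::Result<Vec<IpAddr>> {
    let mut ips: Vec<IpAddr> = (host, 443).to_socket_addrs()?.map(|a| a.ip()).collect();
    ips.sort();
    ips.dedup();
    Ok(ips)
}

/// Sandbox-side iptables arguments accepting TCP 443 to one resolved IP.
pub fn tcp443_accept_args(ip: IpAddr) -> Vec<String> {
    let dst = ip.to_string();
    ["-I", "OUTPUT", "1", "-p", "tcp", "-d", dst.as_str(), "--dport", "443", "-j", "ACCEPT"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}
```

Because the hosts are resolved once at netns setup, the installed rules only match the IPs returned at that moment.
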
DNS round-robin causes Google API IPs to change frequently, breaking
per-IP iptables ACCEPT rules and causing 401/timeout errors. Replace
per-IP filtering with broad TCP 443 ACCEPT when OPENSHELL_DIRECT_TCP_HOSTS
is set — apps still route through HTTPS_PROXY for non-NO_PROXY hosts,
so per-IP iptables filtering adds brittleness without security benefit.

Also adds OPENSHELL_DIRECT_TCP_HOSTS entries to NO_PROXY env var so
HTTP clients skip the proxy for those hosts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
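
The NO_PROXY merge could look roughly like the following. This is an illustrative sketch with a hypothetical function name; it assumes a plain comma-separated NO_PROXY value and case-insensitive de-duplication.

```rust
/// Append direct hosts to an existing NO_PROXY value, preserving the
/// original order and skipping empty or already-present entries.
pub fn merge_no_proxy(existing: &str, direct_hosts: &[&str]) -> String {
    let mut entries: Vec<String> = Vec::new();
    let mut push = |raw: &str| {
        let e = raw.trim();
        if !e.is_empty() && !entries.iter().any(|x| x.eq_ignore_ascii_case(e)) {
            entries.push(e.to_string());
        }
    };
    for raw in existing.split(',') {
        push(raw);
    }
    for &host in direct_hosts {
        push(host);
    }
    entries.join(",")
}
```

Keeping existing entries first means a previously configured NO_PROXY is extended rather than reordered.
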
VS Code Remote-SSH launches its server under the sandbox policy, and the
server later allocates PTYs for the integrated terminal via node-pty.
Landlock blocks device-file opens unless explicitly whitelisted, so PTY
allocation fails with EACCES unless both the PTY multiplexer (/dev/ptmx)
and the slave PTY directory (/dev/pts) are writable.

Also extend unit tests: baseline_read_write_includes_core_runtime_and_pty_paths,
enrich_proto_baseline_paths_adds_pty_paths_for_proxy_mode, and
runtime_device_paths_are_not_prepared_for_chown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
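
For illustration, the baseline enrichment might be sketched as below. The constant and function names here are hypothetical stand-ins, not the crate's real API, and the chown helper is only a guess at the behavior implied by the test name `runtime_device_paths_are_not_prepared_for_chown`.

```rust
/// PTY device paths named in the commit message.
const PTY_DEVICE_PATHS: [&str; 2] = ["/dev/ptmx", "/dev/pts"];

/// Add the PTY paths to a read-write path list without duplicating
/// entries that are already present.
pub fn enrich_read_write(paths: &mut Vec<String>) {
    for p in PTY_DEVICE_PATHS {
        if !paths.iter().any(|existing| existing.as_str() == p) {
            paths.push(p.to_string());
        }
    }
}

/// Assumed behavior: device paths (and anything under them) are left
/// alone when preparing policy paths for chown.
pub fn should_prepare_for_chown(path: &str) -> bool {
    !PTY_DEVICE_PATHS
        .iter()
        .any(|dev| path == *dev || path.starts_with(&format!("{dev}/")))
}
```

Enumerating the two device paths, rather than allowing all of /dev, corresponds to the "enumerate devices" option raised in the summary.
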