Commit d9c585b

v0.8.12
# v0.8.12 — GPT-5.5 + DeepSeek providers, Pi tool registration restored, composer & diff hardening

## Features

- **GPT-5.5 is now the default for `openai` and `openai-codex`** — Pi SDK 0.70.0 added `gpt-5.5` to the OpenAI catalog, so `PI_PREFERRED_DEFAULTS` now picks it as the default model for both the `openai` and `openai-codex` auth providers instead of whatever the SDK returned first. New API-key connections and Craft Agents Backend (OpenAI) connections land on `gpt-5.5` out of the box; existing connections keep their explicit model choice. Fixes [#597](#597).
- **DeepSeek is now a supported Pi-backed provider** — Adds DeepSeek to `PROVIDER_METADATA` (dashboard URL), `PI_PROVIDER_DISPLAY` (label + placeholder), and `PI_PREFERRED_DEFAULTS` (`deepseek-v4-pro` / `deepseek-v4-flash`) so connections default to a modern model instead of whatever the Pi SDK returns first. The renderer picks up the new provider automatically via `PI_AUTH_PROVIDER_DOMAINS` (`deepseek.com`) for favicon resolution, the API-setup preset, and the settings page label. The CLI gains a `DEEPSEEK_API_KEY` env key and extracts `resolveApiKey`, `shouldSetupLlmConnection`, and `getProviderDisplayName` as testable exports; `--base-url` auto-setup now works for non-anthropic providers and the validate step shares the same resolver path. Fixes [#600](#600).

## Improvements

- **`source_test` now auto-enables and auto-restarts the turn so tools become callable immediately** — Previously `source_test` only validated a source; users with a valid config but `enabled: false` had to flip the flag manually and restart the session, even though every check passed. The tool now flips `enabled: true` when needed and triggers the session's existing `onSourceActivationRequest` callback so the MCP/API servers are built and applied to the running agent.

  The follow-up fix (this release) routes the successful activation through the same `source_activated` + `auto_retry` machinery that already handled "tool not found on inactive source" errors: after activation, the current turn aborts cleanly and the renderer resends the user's original message with a `[{slug} activated]` suffix — giving the next `query()`/`handlePrompt` a fresh tool list with the new source live. This fixes a Claude-specific bug where `source_test` reported "tools available now" but the SDK had already frozen `mcpServers` at query-start, so `mcp__{slug}__*` tools were invisible to the model until the user typed another message. Pi behaves the same way for consistency (and also required a turn boundary — its subprocess only picks up new proxy tool defs on the next `handlePrompt`). Opt out with `autoEnable: false` to keep pure-validation behavior. No change to Codex or other backends without an activation callback — they still get the `enabled` flip and a clear "restart session to load tools" hint.
- **`spawn_session` accepts `thinkingLevel`** — Agents can now set the reasoning level when delegating to a spawned session (`off` | `low` | `medium` | `high` | `xhigh` | `max`), instead of always inheriting the parent session's level or workspace default. Silently ignored on non-reasoning models (e.g. gpt-4o, gemini-2.5-flash): the Pi provider drivers and Claude SDK both gate the reasoning param on the model's capabilities, so passing `thinkingLevel` to a non-reasoning model is a safe no-op rather than an error. Also fixes a latent bug where `createSession({ thinkingLevel })` in the session-manager API was silently ignored — the option is now honored with caller → workspace → global precedence, matching how `permissionMode` already worked. Fixes [#462](#462).
- **Real typecheck gate for `pi-agent-server`** — The package's `typecheck` script was aliased to `bun run build` (bundler, not `tsc`), so API-shape drifts from the Pi SDK uplift slipped through CI (see the Pi-subprocess tool-registration fix below for the concrete regression that escaped). Added a dedicated `tsc --noEmit -p tsconfig.typecheck.json` step, wired into `typecheck:all`, plus ambient shims for turndown/pdfjs-dist/bash-parser so the new typecheck doesn't need @types packages. Fixed the cascade of pre-existing type drifts it surfaced (`PiCredential` vs `AuthCredential`, `agent_end` event shape, `sdkTurnAnchor` enrichment, `CustomModelEntry` at the dynamic-register call site, `initConfig` nullability in queryLlm closures, and an incorrect generic in web-fetch's `result` helper).

## Bug Fixes

- **WebUI "Add New Label" no longer launches the desktop app** — Typing `#<new-label>` in the WebUI chat input and clicking "Add New Label" previously opened a popover that, on submit, fired a `craftagents://action/new-session` deep link. The browser resolved that scheme by launching the Electron desktop app instead of creating the label in the browser. Root cause: the chat-input call site cherry-picked fields from the EditPopover config and dropped `inlineExecution: true`, falling back to the legacy same-window deep-link path (which happened to work inside Electron but broke across the WebUI ↔ OS boundary). Switched to a full config spread, matching how `AppShell` already invokes the same popover.
- **Attachments no longer leak between sessions (for real this time)** — Attaching a file, switching sessions without sending, and switching back now restores the attachment in the original session across all four attach paths (file picker, OS drag-drop, clipboard paste, web drag).

  The first-pass fix assumed every attachment had a real OS path — true for Finder drag/OS picker, false for paste/web-drag where Chromium synthesises a `File` from a Blob with no disk origin — so draft refs fell back to filename-only values and failed to re-read on hydrate. The new persistence layer is hybrid: file-picker and OS-drag capture the absolute path via `webUtils.getPathForFile` (Electron 32+) and re-read on hydrate through a dedicated `file:readUserAttachment` RPC; paste and web-drag persist bytes inline in the draft (20 MB per-attachment cap — huge pastes log a warn and drop from the draft). Old 0.8.11-format drafts are rejected on load, so attachments saved by the previous broken release disappear once after upgrade instead of silently haunting the composer. Fixes [#572](#572).
- **Custom URL scheme links now open the right app** — Clicking `obsidian://`, `vscode://`, `zed://`, `notion://`, `slack://`, and similar links in chat messages now dispatches to the OS protocol handler instead of being blocked (desktop) or rewritten to `https://<host>/obsidian://...` (WebUI). URL handling switched from a tight allowlist (`http/https/mailto/craftdocs`) to a blocklist of known-dangerous schemes (`javascript:`, `data:`, `vbscript:`, `blob:`, `file:`). The WebUI and Viewer now use an anchor-click fallback for non-http schemes so Chrome routes through the external-protocol dispatcher reliably. Fixes [#590](#590).
- **`/compact` no longer times out prematurely on GPT sessions** — Manual compaction (including "Accept & compact" on a submitted plan) against Pi-backed OpenAI models failed after 60s because the subprocess RPC didn't leave room for GPT-5.4's long summary responses on large conversations. Bumped the timeout to 5 min — truly hung subprocesses are still caught by the stdio death watchdog. Claude sessions were unaffected (they use the SDK's native compact channel).
- **Pi subprocess tool registration restored** — Pi SDK 0.70.0 quietly reshaped `CreateAgentSessionOptions.tools` from an array of tool objects into a `string[]` name allowlist. The subprocess kept passing `AgentTool[]`, so at runtime `allowedToolNames = new Set(objects)` and `.has(name)` returned `false` for every lookup — every custom tool got filtered out by `_refreshToolRegistry`'s allowlist guard, leaving the LLM with only the built-in `[read, bash, edit, write]`. The fix now routes tool objects through `customTools: ToolDefinition[]` plus a matching `tools: string[]` allowlist that includes every registered name, drops the private `_baseToolsOverride + _buildRuntime` defense-in-depth hack, and restores `grep`/`find`/`ls` that were last bundled pre-0.68 in the monolithic `codingTools` array. A regression test now locks the shape contract (every `customTools[].name` ∈ `tools` allowlist) so the next SDK uplift cannot silently drop tools again.
- **Pi `call_llm` honors the requested model** — `queryLlm` routed `call_llm` through the `mini_completion` RPC, which only carried `prompt`. Every `call_llm` silently ran on the connection's mini model (often the stale `pi/gpt-5.1-codex-mini`), ignoring both `request.model` and `request.systemPrompt`. Introduced a new `llm_query` RPC that carries the full `LLMQueryRequest`; the subprocess delegates verbatim to the model-aware `queryLlm`. `PiAgent.queryLlm` tracks pending queries in a map with cleanup on result / generic error / subprocess exit, and the event-adapter `call_llm` override now only fills in `args.model` when absent (never overwrites explicit values). A round-trip invariant test guards the full request envelope byte-for-byte. Fixes [#596](#596).
- **Pi mini completions pick a provider-appropriate model** — `handleMiniCompletion` was failing with "No API key found for openai." for users on ChatGPT Plus / openai-codex / google / github-copilot whenever the connection had no explicit `miniModel`.

  The provider-check fallback in `queryLlm` always assigned `getDefaultSummarizationModel()` (Haiku), which only resolves under anthropic auth — the Pi SDK 0.70.0 default then silently surfaced as an OpenAI model and the misleading auth error bubbled up. A new `pickProviderAppropriateMiniModel` helper walks `PI_PREFERRED_DEFAULTS[authProvider]` for a resolvable, non-denied candidate (anthropic explicitly preserved to keep Haiku as its mini default), and `runQueryWithModel` now fails fast with an actionable message if no model resolves.
- **Composer no longer crashes on malformed drafts** — Hardens the chat input against untrusted draft content: the renderer now coerces draft text at the boundary, `RichTextInput` is defensive against non-string values, and the input area is wrapped in a local error boundary with recovery actions so a bad draft cannot take the whole window down. Fallback UI is localized and passes the staged i18n checks.
- **Bare `@@` diff blocks now render as rich PatchDiff** — Diff normalization moved into a pure helper and made robust for three shapes that previously fell through to the syntax-highlighted `CodeBlock` instead of pierre's diff viewer: bare `@@` marker lines, valid numbered hunks without file headers, and already-valid unified/git patches (now preserved byte-identically). The synthesis path preserves user file headers when present, strips hunk markers cleanly without leaving blank context lines, and counts body lines without mistaking legitimate `---` / `+++` deletion/addition content for headers.

## Dependency Changes

- **Pi SDK uplifted to 0.70.2** — `@mariozechner/pi-{coding-agent,agent-core,ai}` bumped across root, `pi-agent-server`, `server-core`, and `shared`. The 0.66.1 → 0.70.0 step replaced the removed `codingTools` export with a `createCodingTools(cwd)` factory and reshaped `CreateAgentSessionOptions.tools` (see the tool-registration fix above).

  The 0.70.0 → 0.70.2 step is a patch-level bump with no API surface changes exercised by our code. (See the GPT-5.5 Features entry above for the default-model change enabled by the 0.70.0 catalog.)
- **Claude Agent SDK pinned to exact 0.2.111** — The earlier 0.2.111 → 0.2.119 uplift broke Claude-backed sessions with "Claude Code SDK not found". Starting at 0.2.113 the SDK replaced its JS `cli.js` entry point with platform-scoped native binaries (`@anthropic-ai/claude-agent-sdk-{platform}-{arch}`), which our runtime resolver, build scripts, CI verify-bundles steps, and `--preload`-based unified network interceptor all assume do not exist. Pinned back to the last JS-based release and tightened peer-dep ranges from `^0.2.19` / `>=0.2.19` to an exact `0.2.111` so the next reshape cannot land as a silent patch bump. Full native-binary migration (interceptor rehoming) is tracked separately.

## Breaking Changes

- None. `source_test` is backward compatible; `autoEnable` defaults to `true` but the previous validation output is a strict subset of the new output. Pass `autoEnable: false` to reproduce the old behavior exactly. `spawn_session` gains an optional field — existing callers are unaffected.
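The `customTools` / `tools` split behind the tool-registration fix can be sketched as follows. The interfaces here are minimal stand-ins, not the real Pi SDK types (`ToolDefinition` and `CreateAgentSessionOptions` carry more fields); the invariant being checked is the one the release notes describe: every custom tool name must appear in the `string[]` allowlist, or the allowlist guard filters it out.

```typescript
// Minimal stand-ins for the SDK shapes (illustrative only).
interface ToolDefinition { name: string }

interface SessionToolOptions {
  tools: string[]               // post-0.70.0: a name allowlist, not tool objects
  customTools: ToolDefinition[] // full tool objects now travel separately
}

// Build options so every registered custom tool name lands in the allowlist.
function buildToolOptions(custom: ToolDefinition[], builtins: string[]): SessionToolOptions {
  return { customTools: custom, tools: [...builtins, ...custom.map(t => t.name)] }
}

// The shape contract the regression test locks: customTools[].name ⊆ tools.
function allowlistCoversCustomTools(opts: SessionToolOptions): boolean {
  const allowed = new Set(opts.tools)
  return opts.customTools.every(t => allowed.has(t.name))
}

const opts = buildToolOptions(
  [{ name: 'mcp__github__search' }, { name: 'grep' }],
  ['read', 'bash', 'edit', 'write'],
)
console.log(allowlistCoversCustomTools(opts)) // true
```

Passing `AgentTool[]` where the new `string[]` is expected fails exactly this check: `new Set(objects).has(name)` is `false` for every name, which is why only the built-ins survived.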
1 parent 72087dd commit d9c585b

99 files changed

Lines changed: 3814 additions & 439 deletions


apps/cli/package.json

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 {
   "name": "@craft-agent/cli",
-  "version": "0.8.11",
+  "version": "0.8.12",
   "license": "Apache-2.0",
   "description": "Terminal client for Craft Agent server",
   "type": "module",
```

apps/cli/src/commands.test.ts

Lines changed: 38 additions & 1 deletion
```diff
@@ -1,5 +1,5 @@
 import { describe, it, expect } from 'bun:test'
-import { parseArgs } from './index.ts'
+import { parseArgs, resolveApiKey, shouldSetupLlmConnection } from './index.ts'
 
 // ---------------------------------------------------------------------------
 // Arg parsing tests
@@ -225,6 +225,43 @@ describe('parseArgs', () => {
     const args = parseArgs(['bun', 'index.ts', 'run', 'hello'])
     expect(args.workspaceDir).toBeUndefined()
   })
+
+  it('parses --provider deepseek for run', () => {
+    const args = parseArgs(['bun', 'index.ts', '--provider', 'deepseek', 'run', 'hello'])
+    expect(args.provider).toBe('deepseek')
+  })
+})
+
+// ---------------------------------------------------------------------------
+// Provider credential resolution tests
+// ---------------------------------------------------------------------------
+
+describe('resolveApiKey', () => {
+  it('uses DEEPSEEK_API_KEY for the deepseek provider', () => {
+    const prev = process.env.DEEPSEEK_API_KEY
+    process.env.DEEPSEEK_API_KEY = 'deepseek-test-key'
+
+    try {
+      expect(resolveApiKey('deepseek', '')).toBe('deepseek-test-key')
+    } finally {
+      if (prev === undefined) delete process.env.DEEPSEEK_API_KEY
+      else process.env.DEEPSEEK_API_KEY = prev
+    }
+  })
+})
+
+describe('shouldSetupLlmConnection', () => {
+  it('forces setup for non-default providers even when connections already exist', () => {
+    expect(shouldSetupLlmConnection(2, { provider: 'deepseek', baseUrl: '' })).toBe(true)
+  })
+
+  it('skips setup for the default anthropic provider when connections already exist', () => {
+    expect(shouldSetupLlmConnection(2, { provider: 'anthropic', baseUrl: '' })).toBe(false)
+  })
+
+  it('forces setup for custom endpoints', () => {
+    expect(shouldSetupLlmConnection(2, { provider: 'anthropic', baseUrl: 'https://api.example.com' })).toBe(true)
+  })
 })
 
 // ---------------------------------------------------------------------------
```
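The predicate these tests exercise is small enough to restate standalone. This mirrors the body shown in the `apps/cli/src/index.ts` diff, extracted here as a self-contained sketch:

```typescript
// Standalone restatement of the exported predicate under test.
function shouldSetupLlmConnection(
  existingConnectionCount: number,
  args: { provider: string; baseUrl: string },
): boolean {
  // Set up a connection when none exist, when a custom endpoint is
  // requested, or when a non-default provider is explicitly selected.
  return existingConnectionCount === 0 || !!args.baseUrl || args.provider !== 'anthropic'
}

console.log(shouldSetupLlmConnection(2, { provider: 'deepseek', baseUrl: '' }))  // true
console.log(shouldSetupLlmConnection(2, { provider: 'anthropic', baseUrl: '' })) // false
console.log(shouldSetupLlmConnection(0, { provider: 'anthropic', baseUrl: '' })) // true
```

The `provider !== 'anthropic'` clause is the behavior change: previously only an empty connection list or `--base-url` triggered setup, so `--provider deepseek` on a machine with existing connections never created one.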

apps/cli/src/index.ts

Lines changed: 57 additions & 16 deletions
```diff
@@ -520,12 +520,31 @@ const PROVIDER_ENV_KEYS: Record<string, string> = {
   openrouter: 'OPENROUTER_API_KEY',
   groq: 'GROQ_API_KEY',
   mistral: 'MISTRAL_API_KEY',
+  deepseek: 'DEEPSEEK_API_KEY',
   xai: 'XAI_API_KEY',
   cerebras: 'CEREBRAS_API_KEY',
   huggingface: 'HUGGINGFACE_API_KEY',
 }
 
-function resolveApiKey(provider: string, explicit: string): string {
+const PROVIDER_DISPLAY_NAMES: Record<string, string> = {
+  anthropic: 'Anthropic',
+  openai: 'OpenAI',
+  google: 'Google',
+  openrouter: 'OpenRouter',
+  groq: 'Groq',
+  mistral: 'Mistral',
+  deepseek: 'DeepSeek',
+  xai: 'xAI',
+  cerebras: 'Cerebras',
+  huggingface: 'Hugging Face',
+  'amazon-bedrock': 'Amazon Bedrock',
+}
+
+function getProviderDisplayName(provider: string): string {
+  return PROVIDER_DISPLAY_NAMES[provider] ?? provider.charAt(0).toUpperCase() + provider.slice(1)
+}
+
+export function resolveApiKey(provider: string, explicit: string): string {
   if (explicit) return explicit
   if (provider === 'amazon-bedrock') return '' // IAM credentials, not API key
   const envKey = PROVIDER_ENV_KEYS[provider]
@@ -535,6 +554,10 @@ function resolveApiKey(provider: string, explicit: string): string {
   )
 }
 
+export function shouldSetupLlmConnection(existingConnectionCount: number, args: Pick<CliArgs, 'provider' | 'baseUrl'>): boolean {
+  return existingConnectionCount === 0 || !!args.baseUrl || args.provider !== 'anthropic'
+}
+
 async function setupLlmConnection(
   client: CliRpcClient,
   args: CliArgs,
@@ -587,7 +610,7 @@ async function setupLlmConnection(
 
   await client.invoke('LLM_Connection:save', {
     slug: connectionSlug,
-    name: provider.charAt(0).toUpperCase() + provider.slice(1),
+    name: getProviderDisplayName(provider),
    providerType,
    authType,
    createdAt: Date.now(),
@@ -651,7 +674,7 @@ async function cmdRun(args: CliArgs): Promise<void> {
   // (even if other connections exist) so the session routes through it.
   const connections = (await client.invoke('LLM_Connection:list')) as any[]
   let connectionSlug: string | undefined
-  if (!connections?.length || args.baseUrl) {
+  if (shouldSetupLlmConnection(connections?.length ?? 0, args)) {
     const result = await setupLlmConnection(client, args)
     connectionSlug = result.connectionSlug
   }
@@ -1053,12 +1076,17 @@ export function getValidateSteps(): ValidateStep[] {
       // Custom endpoint: always create/update when --base-url is provided
       if (ctx.baseUrl) {
         const provider = ctx.provider || 'anthropic'
-        const key = ctx.apiKey || process.env.ANTHROPIC_API_KEY || ''
+        let key = ''
+        try {
+          key = resolveApiKey(provider, ctx.apiKey || '')
+        } catch (error) {
+          return `0 connections (${error instanceof Error ? error.message : 'missing API key'})`
+        }
         const slug = `${provider}-cli`
         const isAnthropicApi = provider === 'anthropic'
         await client.invoke('LLM_Connection:save', {
           slug,
-          name: `${provider.charAt(0).toUpperCase() + provider.slice(1)} (Custom Endpoint)`,
+          name: `${getProviderDisplayName(provider)} (Custom Endpoint)`,
           providerType: 'pi_compat',
           authType: 'api_key_with_endpoint',
           createdAt: Date.now(),
@@ -1105,22 +1133,34 @@ export function getValidateSteps(): ValidateStep[] {
         return `${r?.length ?? 0} existing + Bedrock IAM (${region})`
       }
 
-      if (r?.length > 0) return `${r.length} connections`
-      // Auto-setup from env for CI environments
-      const envKey = process.env.ANTHROPIC_API_KEY
-      if (!envKey) return `0 connections (no ANTHROPIC_API_KEY)`
-      const slug = 'anthropic-cli'
+      const provider = ctx.provider || 'anthropic'
+      if (!shouldSetupLlmConnection(r?.length ?? 0, { provider, baseUrl: ctx.baseUrl ?? '' })) {
+        return `${r.length} connections`
+      }
+      // Auto-setup from env / flags for the requested provider.
+      let key = ''
+      try {
+        key = resolveApiKey(provider, ctx.apiKey || '')
+      } catch (error) {
+        return `0 connections (${error instanceof Error ? error.message : 'missing API key'})`
+      }
+      const slug = `${provider}-cli`
+      const providerType = provider === 'anthropic' ? 'anthropic' : 'pi'
+      const authType = 'api_key'
       await client.invoke('LLM_Connection:save', {
         slug,
-        name: 'Anthropic',
-        providerType: 'anthropic',
-        authType: 'api_key',
+        name: getProviderDisplayName(provider),
+        providerType,
+        authType,
         createdAt: Date.now(),
       })
-      const result = await client.invoke('settings:setupLlmConnection', { slug, credential: envKey }) as { success: boolean; error?: string }
+      const setupPayload = provider === 'anthropic'
+        ? { slug, credential: key }
+        : { slug, credential: key, piAuthProvider: provider }
+      const result = await client.invoke('settings:setupLlmConnection', setupPayload) as { success: boolean; error?: string }
       if (!result?.success) return `setup failed: ${result?.error ?? 'unknown'}`
       await client.invoke('LLM_Connection:setDefault', slug)
-      return `0 found → created from env`
+      return `0 found → created ${provider} connection`
     },
   },
   {
@@ -1865,7 +1905,7 @@ Connection:
 
 LLM Configuration (for 'run' command):
   --provider <name>  LLM provider (default: anthropic, or $LLM_PROVIDER)
-                     Supported: anthropic, openai, google, openrouter, groq, mistral, xai, ...
+                     Supported: anthropic, openai, google, openrouter, groq, mistral, deepseek, xai, ...
   --model <id>       Model to use (or $LLM_MODEL)
   --api-key <key>    API key (or $LLM_API_KEY, or provider-specific e.g. $OPENAI_API_KEY)
   --base-url <url>   Custom API endpoint (or $LLM_BASE_URL)
@@ -1902,6 +1942,7 @@ Examples:
   craft-cli run --provider openai --model gpt-4o "Summarize this repo"
   OPENAI_API_KEY=sk-... craft-cli run --provider openai "Hello"
   GOOGLE_API_KEY=... craft-cli run --provider google --model gemini-2.0-flash "Hello"
+  DEEPSEEK_API_KEY=sk-... craft-cli run --provider deepseek --model deepseek-v4-flash "Hello"
   echo "Analyze this code" | craft-cli run
   craft-cli ping
   craft-cli sessions
```
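The resolution order in `resolveApiKey` above can be condensed to a self-contained sketch (the real table covers every supported provider; only two entries are shown here, and the error branch is simplified to a one-line throw):

```typescript
// Explicit flag wins, then the provider-specific env var; Bedrock is
// exempt because it authenticates via IAM credentials, not an API key.
const PROVIDER_ENV_KEYS: Record<string, string> = {
  openai: 'OPENAI_API_KEY',
  deepseek: 'DEEPSEEK_API_KEY',
}

function resolveApiKey(provider: string, explicit: string): string {
  if (explicit) return explicit
  if (provider === 'amazon-bedrock') return ''
  const fromEnv = process.env[PROVIDER_ENV_KEYS[provider] ?? ''] ?? ''
  if (fromEnv) return fromEnv
  throw new Error(`No API key found for ${provider}.`)
}

process.env.DEEPSEEK_API_KEY = 'sk-test'
console.log(resolveApiKey('deepseek', ''))          // sk-test
console.log(resolveApiKey('openai', 'sk-explicit')) // sk-explicit
```

Because the validate step now calls this same resolver instead of hard-coding `ANTHROPIC_API_KEY`, the thrown message surfaces directly in the `0 connections (...)` status line.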

apps/electron/package.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
{
22
"name": "@craft-agent/electron",
3-
"version": "0.8.11",
3+
"version": "0.8.12",
44
"description": "Electron desktop app for Craft Agents",
55
"main": "dist/main.cjs",
66
"private": true,

apps/electron/resources/docs/sources.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -152,6 +152,7 @@ The `source_test` tool:
152152
2. **Downloads and caches the icon** if a URL was provided
153153
3. **Tests the connection** to verify the source is reachable
154154
4. **Reports missing fields** (icon, tagline) that should be added
155+
5. **Auto-enables the source** (default): on a clean run it flips `enabled: true` in config if needed and activates the source in the current session so its tools become available without a restart. Pass `autoEnable: false` to keep pure validation behavior.
155156

156157
After validation passes, trigger the appropriate auth flow:
157158
- OAuth sources: `source_oauth_trigger({ sourceSlug: "{slug}" })`
