feat(skill): BM25-based smart skill retrieval for large catalogues by claudianus · Pull Request #726 · MoonshotAI/kimi-code

claudianus · 2026-06-13T19:30:26Z

Related Issue

Resolve #725

Problem

When a user has a large number of skills installed (e.g. 1,000+), the current implementation injects the entire skill listing into the system prompt — including name, description (250 chars), whenToUse, and file path for every skill.

With 1,530 real skills this amounts to ~118,000 tokens in the system prompt, which:

Consumes ~46% of a 256K context window
Loads all SKILL.md file bodies into memory at startup (~88 MB JS heap)
Forces the LLM to scan a massive listing to find the right skill — poor accuracy due to lost-in-the-middle effects

What Changed

A 3-tier auto-detection system based on invocable skill count. No feature flag needed — the tier is selected automatically at session start.

Skill count	Listing format	Strategy
≤ 80	Legacy full listing (name + desc + path)	Prompt-cache optimal for small catalogues
81–300	Compact (name + 80-char desc)	Skill search available
> 300	Names only	Skill search required

New files

skill/search.ts — BM25 search index with synonym expansion. Zero external dependencies. ~24ms to build for 1,500 skills, <1ms per query.

Modified files

skill/registry.ts — Auto-detect tier based on skill count. searchSkills() method with lazy index rebuild. renderSkillPrompt() lazy-loads SKILL.md content from disk only when a skill is activated. Listing thresholds are now configurable via SkillRegistryOptions.
skill/parser.ts — parseSkillMetaFromFile() reads only YAML frontmatter (~20 lines) instead of the full SKILL.md body. Uses LAZY_CONTENT_SENTINEL to distinguish "not loaded" from "genuinely empty body". Also captures a small body snippet for search and derives better flat-skill descriptions from the first sentence.
skill-tool.ts / .md — Skill tool extended with action: "search" alongside existing action: "load". Backward-compatible — skill field remains required.
profile/default/system.md — Two-step skill usage: search first, then load.
session/index.ts — Minor cleanup (removed flag wiring).
flags/registry.ts — No new flags needed.

Tests

New tests merged into existing registry.test.ts (search, tiers, lazy rebuild, configurable thresholds)
New search.test.ts with dedicated SkillSearchIndex unit tests (Unicode, synonyms, threshold, body snippet, aliases/tags)
New integration-proof.test.ts validates what the LLM actually sees with real 1,530 skills
Existing scanner.test.ts updated for lazy loading via renderSkillPrompt
Updated parser.test.ts for body-snippet and first-sentence description fallback coverage
Skill-search tests: 52/52 pass; full @moonshot-ai/agent-core suite: 2,622 pass / 3 unrelated failures / 1 todo (the 3 failures are flaky skill-tool-manager timeout under full-suite load and 2 unrelated MCP startup tests)

Follow-up improvements added after review

Unicode-aware tokenization — \p{L}-based tokenization so non-English skills (Korean, Portuguese, etc.) are searchable.
minScore threshold — SkillSearchIndex.search() accepts an optional minimum score to filter out low-relevance matches.
Body snippet indexing — First 1KB after frontmatter is indexed with a lower weight, improving recall without loading full bodies.
Aliases/tags as weighted terms — Skill metadata aliases and tags are now indexed with high/medium weight.
Configurable listing thresholds — compactListingThreshold and namesOnlyListingThreshold can be passed to SessionSkillRegistry.
Better flat-skill descriptions — Falls back to the first sentence of the body instead of the first raw line.

Performance (measured with 1,530 real skills from antigravity-awesome-skills)

Metric	Before	After	Reduction
System prompt	118,120 tokens	8,436 tokens	93%
Startup memory	~88 MB	~4 MB	95%
Search latency	N/A (LLM scans listing)	0.0–0.2ms (BM25)	—
Lazy content load	N/A (eager at startup)	0.4ms per activation	—

Checklist

Tests added and passing (skill-search 52/52; full suite 2,622 pass with 3 unrelated failures)
Lint clean for modified files (oxlint + sherif)
TypeScript clean (tsc --noEmit, 0 errors)
Changeset generated (minor bump)
No co-author attribution or agent identity
No internal identifiers in diff
Backward-compatible (existing skill callers unchanged)

Follow-up fix

62d38ab8 — fix(skill-search): handle string tags and aliases in skill search index

Real-world skill YAML frontmatter sometimes declares tags and aliases as comma or space-separated strings instead of arrays. Normalize them to string arrays before indexing so multi-keyword queries can match skills correctly.

When >80 skills are installed, the system prompt switches from a full listing to a compact name-only format. The model discovers skills via the Skill tool's new action:"search" endpoint backed by a BM25 index. Three tiers by skill count: ≤ 80: legacy full listing (prompt-cache optimal) 81-300: compact name + description + search > 300: names only + search required Key changes: - skill/search.ts: BM25 index with synonym expansion (zero deps) - skill/registry.ts: auto-detect tier, lazy content loading - skill/parser.ts: parseSkillMetaFromFile (frontmatter-only) - skill-tool.ts: search action on existing Skill tool - system.md: search-first workflow instructions Performance (measured with 1,530 real skills): - System prompt: 118K → 8.4K tokens (93% reduction) - Startup memory: 88MB → 4MB (95% reduction) - Search latency: 0.0-0.2ms per query - Lazy content load: 0.4ms per skill activation Closes MoonshotAI#725

changeset-bot · 2026-06-13T19:30:31Z

🦋 Changeset detected

Latest commit: f357f22

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
@moonshot-ai/kimi-code	Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4eca85320e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-13T19:33:43Z

 export const SkillToolInputSchema: z.ZodType<SkillToolInput> = z.object({
  skill: z.string(),


Allow search calls without a skill name

When the model follows the new instructions/example and calls Skill as {"action":"search","query":"e2e test"}, preflight validation rejects it before execution() runs because this schema still makes skill required; loop/tool-call.ts validates the advertised JSON schema with AJV. This breaks the large-catalog search-first flow unless the model invents a dummy skill name, so skill should be optional for search or the schema should be discriminated by action.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-13T19:33:43Z

+import { join } from 'node:path';
+import { performance } from 'node:perf_hooks';
+
+const SKILLS_DIR = join(homedir(), '.kimi', 'skills');


Avoid depending on a local home skill catalog

These tests depend on ~/.kimi/skills existing and containing the author's real 1,530-skill catalog. In a clean CI or developer checkout this path is usually absent or has different contents, so the registry loads no/different skills and the listing assertions or results[0]! dereference fail. Build a temporary fixture catalog inside the test instead of reading the user's home directory.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-13T19:33:43Z

+      fencesFound++;
+      if (fencesFound === 2) break;
+    }
+    offset += line.length + 1;


Preserve CRLF frontmatter fences when slicing

For SKILL.md files written with CRLF line endings, split(/\r?\n/) removes a two-character newline but this offset only adds + 1, so frontmatterOnly is sliced before the second ---; parseSkillText then reports a missing closing fence and discovery skips the skill. This regresses Windows-authored skills that the full parser already accepts, so compute the fence position on the original buffer or account for \r\n lengths.

Useful? React with 👍 / 👎.

P1: Make skill field optional in SkillToolInputSchema — search-only calls no longer need a dummy skill name. Validation added for load action requiring skill parameter. P1: Replace home-dir-dependent integration tests with self-contained temp fixture (350 SKILL.md files in mkdtemp). Tests are now portable across CI and developer machines. P2: Fix CRLF fence offset in parseSkillMetaFromFile — split on \n and strip trailing \r to correctly account for 2-byte newlines when computing the slice boundary. Also: update skill-tool.test.ts contract to reflect optional skill.

Six smoke tests with real fixture files (350 SKILL.md): 1. Tier auto-detection (names-only for 350 skills) 2. Lazy content loading (sentinel → readFileSync) 3. BM25 search accuracy (8/8 domain queries correct) 4. Schema validation (search without skill name) 5. CRLF frontmatter parsing 6. Full model flow: search → pick → load → render

… stopwords and listing cache

…MoonshotAI#784

claudianus · 2026-06-16T02:23:27Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a6c632b7a3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-16T02:26:56Z

+  const result = parseSkillText({ ...options, text: frontmatterOnly });
+  return { ...result, content: LAZY_CONTENT_SENTINEL };


Preserve body-derived descriptions for flat skills

When a top-level flat .md skill has frontmatter but omits description, the existing parser supports it by deriving the description from the first body line. This lazy path now calls parseSkillText with only the frontmatter, so those valid skills get No description provided. and their body keywords are absent from the BM25 index/listing when loaded through SessionSkillRegistry.loadRoots, making them hard or impossible to discover by search.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-16T02:26:56Z

+
+    // Build the BM25 search index so the model can discover skills
+    // via the `skill_search` tool instead of scanning a full listing.
+    this.searchIndex.build(this.listInvocableSkills());


Filter sub-skills out of the search index

The search index is built from listInvocableSkills(), but the model-facing listing below explicitly filters metadata.isSubSkill !== true. For user sub-skills that do not set disableModelInvocation, Skill search can now return hidden child skill names and prompt the model to load them directly, bypassing the parent skill that previously controlled how sub-skills were exposed; the lazy rebuild path uses the same unfiltered source as well.

Useful? React with 👍 / 👎.

claudianus · 2026-06-16T03:29:24Z

Addressed the latest Codex review comments:

search without skill name — Confirmed the JSON schema already makes skill optional; added a regression test that validates {"action":"search","query":"e2e test"} through the production AJV validator.
local home skill catalog dependency — The integration test already uses a temporary fixture catalog; no change needed.
CRLF frontmatter slicing — Fixed parseSkillMetaFromFile to slice to offset + fenceLine.length instead of offset + 3, so CRLF frontmatter fences are preserved.
body-derived descriptions for flat skills — Added a fallback that re-parses the full file for flat .md skills when no explicit description is found, keeping the lazy sentinel for content.
sub-skills in search index — Added listSearchableSkills() and used it for both initial index build and lazy rebuild, matching the filter already applied in getModelSkillListing().

All affected tests pass and the package typechecks/builds successfully.

claudianus · 2026-06-16T03:47:17Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a6c632b7a3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-16T03:52:03Z

+
+    // Build the BM25 search index so the model can discover skills
+    // via the `skill_search` tool instead of scanning a full listing.
+    this.searchIndex.build(this.listInvocableSkills());


Hide sub-skills from search results

When a skill root declares sub-skills, getModelSkillListing() intentionally filters metadata.isSubSkill so the model only sees the parent skill, but this new index is built from the unfiltered listInvocableSkills(). In large catalogs the model is now told to use search, so queries can surface and then load parent.child directly even though that sub-skill was deliberately absent from the model listing; build and rebuild the index with the same non-sub-skill filter used for the prompt listing.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-16T03:52:04Z

+  const frontmatterOnly = buffer.slice(0, offset + 3);
+  const result = parseSkillText({ ...options, text: frontmatterOnly });


Preserve body-derived flat skill descriptions

For top-level flat .md skills that have frontmatter for fields like name/type but omit description (which is supported because only directory SKILL.md files require a description), this truncated parse leaves parseSkillText() with an empty body and records No description provided. instead of the previous first-body-line fallback. In catalogs that switch to search, those body terms are no longer indexed or listed, so these skills become hard to discover unless the exact name is known; read enough body to compute the fallback or fall back to the full parse when the frontmatter has no description.

Useful? React with 👍 / 👎.

claudianus · 2026-06-16T03:59:16Z

Both points are addressed in commit 7da68027 (now pushed to this PR branch):

Sub-skills are excluded from the BM25 search index via listSearchableSkills().
Flat .md skills without a frontmatter description now fall back to the first body line.

Thanks for the review!

- Fix CRLF frontmatter fence slicing in parseSkillMetaFromFile - Derive flat-skill descriptions from body when frontmatter omits it - Exclude sub-skills from BM25 search index - Add regression tests for schema validation, CRLF, and sub-skill filtering Refs MoonshotAI#726

…evance results

claudianus · 2026-06-16T18:43:54Z

@codex please review this PR. The branch has been squashed into a single commit (93d57da).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f357f22cb4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-16T18:49:02Z

+      const whenToUseTokens = tokenize(skill.metadata.whenToUse ?? '', { removeStopwords: true });
+      const bodySnippetTokens = tokenize(skill.bodySnippet ?? '', { removeStopwords: true });
+      const aliasesTokens = tokenize((skill.metadata.aliases ?? []).join(' '), { removeStopwords: true });
+      const tagsTokens = tokenize((skill.metadata.tags ?? []).join(' '), { removeStopwords: true });


Guard raw skill metadata before tokenizing

When a user skill contains raw YAML metadata such as tags: backend, aliases: docker, or a non-string whenToUse, these values are not normalized by the parser, but the new indexer calls .join()/string tokenization on them. A single such skill will throw while loadRoots() builds the BM25 index, aborting session skill loading and preventing built-in skills from being registered. Please coerce or ignore these fields with Array.isArray/typeof === 'string' before tokenizing.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-16T18:49:02Z

+
+function tokenize(text: string, options: { removeStopwords?: boolean } = {}): string[] {
+  const raw = text
+    .toLowerCase()


Split camelCase before lowercasing tokens

Because tokenize() lowercases the text before calling splitCompoundIdentifier(), the camelCase/PascalCase regex in that helper can never match. In catalogs with skill names like codeReview or ReactHooks, searching for review or hooks will miss the name token unless the same word appears elsewhere, defeating the intended compound-name matching.

Useful? React with 👍 / 👎.

chatgpt-codex-connector Bot reviewed Jun 13, 2026

View reviewed changes

claudianus added 7 commits June 14, 2026 04:41

feat(agent-core): improve skill search with field boosting, synonyms,…

d3be03a

… stopwords and listing cache

Merge branch 'upstream/main' into pr-726-resolve

0508acc

fix(type): expose searchSkills on SkillRegistry interface for upstream …

93f52f6

…MoonshotAI#784

docs: correct names-only tier threshold comment (300, not 200)

165a197

test(skill-tool): cover search action execution path

a6c632b

chatgpt-codex-connector Bot reviewed Jun 16, 2026

View reviewed changes

claudianus force-pushed the main branch from 7da6802 to 4891eb9 Compare June 16, 2026 04:03

claudianus added 7 commits June 16, 2026 13:27

feat(skill-search): support Unicode tokenization for non-English skills

2290fbc

feat(skill-search): add optional minScore threshold to filter low-rel…

f9b8fda

…evance results

feat(skill-search): index a small body snippet to improve recall

6057afd

feat(skill-search): use skill aliases and tags as weighted search terms

25577a2

test(skill-search): add dedicated SkillSearchIndex unit tests

338d6ba

feat(skill-registry): make listing tier thresholds configurable

699f4e5

feat(skill-parser): derive flat-skill description from first sentence

f357f22

chatgpt-codex-connector Bot reviewed Jun 16, 2026

View reviewed changes

		export const SkillToolInputSchema: z.ZodType<SkillToolInput> = z.object({
		skill: z.string(),

		const result = parseSkillText({ ...options, text: frontmatterOnly });
		return { ...result, content: LAZY_CONTENT_SENTINEL };

		const frontmatterOnly = buffer.slice(0, offset + 3);
		const result = parseSkillText({ ...options, text: frontmatterOnly });

Conversation

claudianus commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issue

Problem

What Changed

New files

Modified files

Tests

Follow-up improvements added after review

Performance (measured with 1,530 real skills from antigravity-awesome-skills)

Checklist

Follow-up fix

Uh oh!

changeset-bot Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

claudianus commented Jun 16, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

claudianus commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claudianus commented Jun 16, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

claudianus commented Jun 16, 2026

Uh oh!

claudianus commented Jun 16, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claudianus commented Jun 13, 2026 •

edited

Loading

changeset-bot Bot commented Jun 13, 2026 •

edited

Loading

claudianus commented Jun 16, 2026 •

edited

Loading