feat: add opt-in GCF output format for MCP tool responses (--gcf) #843
blackwell-systems wants to merge 4 commits
GCF (Graph Compact Format) encodes structured data with 56% fewer tokens than JSON on duplication results. Optional dependency, opt-in via --gcf flag or JSCPD_OUTPUT_FORMAT=gcf env var. Falls back to JSON if the package is not installed or encoding fails.
Benchmarked on jscpd response shapes (duplication results + statistics):
- 5 duplications: 49% fewer tokens
- 20 duplications: 59% fewer tokens
- 100 duplications: 58% fewer tokens (14,498 -> 6,112)
- Project statistics: 27-28% fewer tokens
- Overall: 56% savings, 7/7 GCF wins
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
apps/jscpd-server/src/index.ts (1)
26-30: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
CLI flag is registered but not read through Commander.
The `--gcf` flag is registered with Commander but never accessed via `cli.opts().gcf`. Instead, `serialize.ts` directly checks `process.argv.includes("--gcf")`. This works but is non-idiomatic: Commander-parsed options should typically be read through the opts object.
Consider either:
- Reading the flag via `serverOpts.gcf` and passing it to the serialize module, or
- Documenting that the flag is intentionally parsed outside Commander for decoupling reasons.
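The first option could be sketched roughly as follows. This is illustrative only: `isGcfRequested` and `ParsedOptions` are hypothetical names that do not appear in the PR, and the exact, case-sensitive match on the environment variable is an assumption.

```typescript
// Hypothetical helper: decide whether GCF output was requested from the
// options object produced by the CLI parser, not from raw process.argv.
interface ParsedOptions {
  gcf?: boolean; // e.g. the value of cli.opts().gcf once Commander has parsed argv
}

function isGcfRequested(
  parsed: ParsedOptions,
  env: Record<string, string | undefined>,
): boolean {
  // Assumption: either the flag or the environment variable enables GCF.
  return parsed.gcf === true || env["JSCPD_OUTPUT_FORMAT"] === "gcf";
}
```

The entry point would compute this once after parsing and hand the resulting boolean to the serialize module, so the module no longer needs to know how the flag was spelled on the command line.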
apps/jscpd-server/src/server/serialize.ts (1)
37-46: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Remove redundant encodeGeneric null check.
Line 38 checks `encodeGeneric` even though `isGcfEnabled()` already returns `false` when `encodeGeneric === null` (line 30). The `&& encodeGeneric` condition is redundant.
♻️ Simplified version
     export function serialize(data: unknown, indent = 2): string {
    -  if (isGcfEnabled() && encodeGeneric) {
    +  if (isGcfEnabled()) {
         try {
    -      return encodeGeneric(data);
    +      return encodeGeneric!(data);
         } catch {
           // Fall back to JSON on any encoding error
         }
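A note on why the simplified version needs the non-null assertion (`!`): TypeScript cannot narrow a module-level variable through a separate `isGcfEnabled()` call. One possible alternative, sketched here with hypothetical names (`activeEncoder`, `gcfRequested`) and stand-in module state, not the PR's actual code, is a helper that returns the encoder itself, so a single check both gates and narrows.

```typescript
type Encoder = (data: unknown) => string;

// Stand-ins for module state; in the PR the encoder comes from an optional import.
let encodeGeneric: Encoder | null = null;
let gcfRequested = false;

// Returns the encoder only when GCF is both requested and available.
function activeEncoder(): Encoder | null {
  return gcfRequested ? encodeGeneric : null;
}

function serialize(data: unknown, indent = 2): string {
  const encode = activeEncoder();
  if (encode) {
    try {
      return encode(data); // narrowed to Encoder, so no `!` is needed
    } catch {
      // Fall back to JSON on any encoding error.
    }
  }
  return JSON.stringify(data, null, indent);
}
```

Whether this is preferable to the one-line assertion is a style choice; both keep the JSON fallback intact.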
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/jscpd-server/src/server/mcp-server.ts`:
- Line 5: The `serialize` function is imported but not consistently applied
across all tool handlers in the MCP server. The `get_statistics` tool handler
(around line 83) still uses `JSON.stringify` instead of the `serialize`
function, while the other two tool handlers `check_duplication` and
`check_current_directory` have been updated to use `serialize`. Replace the
`JSON.stringify` call in the `get_statistics` tool handler with the `serialize`
function to ensure consistent output formatting when the global configuration
flag is enabled. Note that the `statistics` resource definition around line 149
should continue using `JSON.stringify` since it explicitly declares a JSON
mimeType and resources have different semantics than tools.
---
Nitpick comments:
In `@apps/jscpd-server/src/index.ts`:
- Around line 26-30: The `--gcf` flag is registered with Commander but not read
through the idiomatic `cli.opts()` method. After parsing the command line
options with Commander, read the gcf flag value via `cli.opts().gcf` and pass
this parsed value to the serialize module as a parameter. Then update
`serialize.ts` to accept and use this parameter instead of directly checking
`process.argv.includes("--gcf")` to follow Commander conventions.
In `@apps/jscpd-server/src/server/serialize.ts`:
- Around line 37-46: In the serialize function, remove the redundant `&&
encodeGeneric` condition from the if statement that checks `isGcfEnabled()`.
Since isGcfEnabled() already returns false when encodeGeneric is null (as
verified on line 30), the additional null check is unnecessary. Simply change
the condition to check only `if (isGcfEnabled())` to eliminate the redundancy.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 79092a48-8087-45d9-b580-36b7a43d30b3
📒 Files selected for processing (4)
- apps/jscpd-server/package.json
- apps/jscpd-server/src/index.ts
- apps/jscpd-server/src/server/mcp-server.ts
- apps/jscpd-server/src/server/serialize.ts
♻️ Duplicate comments (1)
apps/jscpd-server/src/server/mcp-server.ts (1)
149-149: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Keep resource payload aligned with declared JSON MIME type.
Line 140 declares `mimeType: "application/json"`, but Line 149 uses `serialize(stats)`, which can output GCF when enabled. That breaks the resource contract for JSON consumers.
🔧 Minimal fix
     return {
       contents: [
         {
           uri: uri.href,
    -      text: serialize(stats),
    +      text: JSON.stringify(stats, null, 2),
         },
       ],
     };
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@apps/jscpd-server/src/server/mcp-server.ts`:
- Line 149: The resource declares mimeType as "application/json" on line 140,
but the text property uses serialize(stats) which can output GCF format when
enabled, creating a contract mismatch. Replace the serialize(stats) call with
JSON.stringify(stats) to ensure the output is always valid JSON that aligns with
the declared mimeType.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: df9601e8-ba40-459b-844f-ec5869ae3fd9
📒 Files selected for processing (2)
- apps/jscpd-server/src/server/mcp-server.ts
- apps/jscpd-server/src/server/serialize.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/jscpd-server/src/server/serialize.ts
🧹 Nitpick comments (1)
apps/jscpd-server/src/server/mcp-server.ts (1)
117-117: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Minor behavioral change: JSON is now pretty-printed when GCF is disabled.
The original code used `JSON.stringify(statistics)` without indentation arguments, producing minified JSON. The new `serialize(statistics)` call defaults to `indent=2`, so when GCF is disabled, the JSON output will now be pretty-printed instead of minified.
This is a very minor change that improves readability and normalizes formatting across all tool responses. However, it technically contradicts the PR objective's claim of "zero behavior changes" when GCF is disabled.
If the minified format was intentional for this specific handler, you can preserve it by calling `serialize(statistics, 0)` instead.
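The difference is easy to check in isolation. The snippet below uses a made-up `statistics` object (not jscpd's real shape) and shows that an indent of `0` reproduces the output of calling `JSON.stringify` with no indentation argument, while an indent of `2` adds line breaks.

```typescript
// Hypothetical sample payload, for illustration only.
const statistics = { total: { lines: 120, duplicatedLines: 12 } };

const minified = JSON.stringify(statistics);            // original call, no indent argument
const pretty = JSON.stringify(statistics, null, 2);     // what a default of indent = 2 produces
const zeroIndent = JSON.stringify(statistics, null, 0); // what passing indent = 0 produces

console.log(minified === zeroIndent); // true
console.log(pretty.includes("\n"));   // true
```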
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@apps/jscpd-server/src/server/mcp-server.ts`:
- Line 117: The serialize(statistics) call on line 117 now produces
pretty-printed JSON with default indentation instead of the original minified
JSON format from JSON.stringify(). To preserve the original minified JSON
behavior and maintain zero behavioral changes when GCF is disabled, modify the
serialize(statistics) call to serialize(statistics, 0) to explicitly disable
indentation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 05215a72-725a-42f6-b7e2-7b62d5510b1d
📒 Files selected for processing (1)
apps/jscpd-server/src/server/mcp-server.ts
Summary
Adds GCF (Graph Compact Format) as an opt-in output format for MCP tool responses. Enable via `--gcf` flag or `JSCPD_OUTPUT_FORMAT=gcf` environment variable. Optional dependency, zero changes to existing behavior.
Token savings on jscpd data shapes
Benchmarked on realistic duplication results and project statistics using the o200k_base tokenizer.
GCF wins 7/7 comparisons. The largest savings come from duplication results where nested `snippetLocation` and `codebaseLocation` objects are flattened into path columns (`"codebaseLocation>file"`, `"codebaseLocation>startLine"`) instead of repeating the full nested structure on every row.
Benchmark script: jscpd-token-benchmark.mjs
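To make the path-column idea concrete, here is a small sketch of flattening nested rows into `>`-joined column names and emitting the header once. It is not the GCF encoder or its wire format: `flattenRow`, `toTable`, and the `|` delimiter are hypothetical choices for illustration, and values containing the delimiter are not escaped.

```typescript
type Row = Record<string, unknown>;

function isPlainObject(value: unknown): value is Row {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Flatten nested objects into a single level keyed by ">"-joined paths.
function flattenRow(row: Row, prefix = ""): Row {
  const flat: Row = {};
  for (const [key, value] of Object.entries(row)) {
    const path = prefix ? `${prefix}>${key}` : key;
    if (isPlainObject(value)) {
      Object.assign(flat, flattenRow(value, path));
    } else {
      flat[path] = value;
    }
  }
  return flat;
}

// Emit the column names once, then one line of values per row.
function toTable(rows: Row[]): string {
  const flatRows = rows.map((row) => flattenRow(row));
  const columns: string[] = [];
  for (const row of flatRows) {
    for (const key of Object.keys(row)) {
      if (!columns.includes(key)) columns.push(key);
    }
  }
  const lines = flatRows.map((row) =>
    columns.map((column) => String(row[column] ?? "")).join("|"),
  );
  return [columns.join("|"), ...lines].join("\n");
}
```

Because key names appear only in the header line, each additional row costs only its values, which is consistent with the larger savings reported for bigger duplication lists.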
Why GCF
Adopted by Speakeasy (Google, Verizon, Mistral AI, DocuSign), OmniRoute (6.1K stars), and others.
What changed
- `apps/jscpd-server/package.json`: added `@blackwell-systems/gcf` as optional dependency (pinned to 2.2.1)
- `apps/jscpd-server/src/index.ts`: added `--gcf` CLI flag
- `apps/jscpd-server/src/server/serialize.ts`: new module, dynamic import with graceful fallback
- `apps/jscpd-server/src/server/mcp-server.ts`: replaced `JSON.stringify` calls with `serialize()` (3 call sites)
Usage
    jscpd-server --gcf .
Or via environment variable:
    JSCPD_OUTPUT_FORMAT=gcf jscpd-server .
Without the flag or env var, behavior is identical to current (JSON output). If `@blackwell-systems/gcf` is not installed, the flag silently falls back to JSON.
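The "dynamic import with graceful fallback" behavior can be sketched as below. This is an assumption-laden illustration, not the contents of `serialize.ts`: `loadOptionalEncoder` is a hypothetical name, the importer is injected so the sketch does not depend on any package being installed, and `encodeGeneric` is assumed to be the exported function only because the review comments refer to it by that name.

```typescript
type Encoder = (data: unknown) => string;

// Try to load an optional encoder; resolve to null instead of throwing when
// the package is missing or does not expose the expected function.
async function loadOptionalEncoder(
  importModule: () => Promise<unknown>,
): Promise<Encoder | null> {
  try {
    const mod = (await importModule()) as { encodeGeneric?: unknown };
    return typeof mod.encodeGeneric === "function"
      ? (mod.encodeGeneric as Encoder)
      : null;
  } catch {
    return null; // package not installed: the caller keeps using JSON
  }
}
```

In a real module the importer would presumably be something like `() => import("@blackwell-systems/gcf")`; a `null` result means every response stays JSON, matching the silent fallback described above.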
Summary by CodeRabbit
`--gcf` command-line option (and matching configuration) to enable GCF output.