feat: add opt-in GCF output format for MCP tool responses (--gcf) #843

Open
blackwell-systems wants to merge 4 commits into kucherenko:master from blackwell-systems:feat/gcf-output-format

@blackwell-systems blackwell-systems commented Jun 23, 2026

Summary

Adds GCF (Graph Compact Format) as an opt-in output format for MCP tool responses. Enable via --gcf flag or JSCPD_OUTPUT_FORMAT=gcf environment variable. Optional dependency, zero changes to existing behavior.

Token savings on jscpd data shapes

Benchmarked on realistic duplication results and project statistics using the o200k_base tokenizer:

| Response type | Size | JSON tokens | GCF tokens | Savings |
|---|---|---|---|---|
| Duplication results | 5 | 755 | 386 | 49% |
| Duplication results | 20 | 2,773 | 1,137 | 59% |
| Duplication results | 50 | 7,482 | 3,319 | 56% |
| Duplication results | 100 | 14,498 | 6,112 | 58% |
| Project statistics | 50 files | 256 | 185 | 28% |
| Project statistics | 200 files | 351 | 252 | 28% |
| Overall | | 26,325 | 11,544 | 56% |
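
For readers checking the Savings column, the percentages appear to be one minus the ratio of GCF tokens to JSON tokens, rounded to a whole percent. The helper below is a hypothetical illustration of that arithmetic using figures from the table. It is not taken from the benchmark script.

```typescript
// Hypothetical helper: savings = 1 - (GCF tokens / JSON tokens), as a whole percent.
function savingsPercent(jsonTokens: number, gcfTokens: number): number {
  return Math.round((1 - gcfTokens / jsonTokens) * 100);
}

// Token counts copied from the benchmark table above.
const rows: [string, number, number][] = [
  ["Duplication results (5)", 755, 386],
  ["Duplication results (100)", 14498, 6112],
  ["Overall", 26325, 11544],
];

for (const [label, json, gcf] of rows) {
  console.log(`${label}: ${savingsPercent(json, gcf)}%`);
}
```

Under that assumption the three rows come out to 49%, 58%, and 56%, matching the table.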

GCF wins 7/7 comparisons. The largest savings come from duplication results where nested snippetLocation and codebaseLocation objects are flattened into path columns ("codebaseLocation>file", "codebaseLocation>startLine") instead of repeating the full nested structure on every row.
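
To illustrate the path-column idea, here is a minimal sketch that flattens nested objects into keys joined with ">". This is not the GCF encoder. `flattenPaths` is a hypothetical helper and the sample values are made up. It only shows how a nested location object can become a set of flat column names that are stated once instead of on every row.

```typescript
type Flat = Record<string, unknown>;

// Hypothetical helper: walk nested plain objects and join key paths with ">".
function flattenPaths(value: Record<string, unknown>, prefix = ""): Flat {
  const out: Flat = {};
  for (const [key, v] of Object.entries(value)) {
    const path = prefix ? `${prefix}>${key}` : key;
    if (v !== null && typeof v === "object" && !Array.isArray(v)) {
      // Recurse into nested objects, carrying the path so far.
      Object.assign(out, flattenPaths(v as Record<string, unknown>, path));
    } else {
      // Scalars and arrays become leaf columns.
      out[path] = v;
    }
  }
  return out;
}

// Sample row with invented values; only the field names mirror the description above.
const row = flattenPaths({
  codebaseLocation: { file: "src/a.ts", startLine: 10 },
  snippetLocation: { startLine: 1 },
});

console.log(Object.keys(row));
```

The resulting keys are "codebaseLocation>file", "codebaseLocation>startLine", and "snippetLocation>startLine", which a tabular encoding could emit once as a header row.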

Benchmark script: jscpd-token-benchmark.mjs

Why GCF

  • 56% fewer tokens on jscpd duplication results vs JSON
  • 100% comprehension on every frontier model (Claude, GPT-5.5, Gemini, Grok). 2,400+ LLM evaluations
  • 43 billion+ lossless round-trips across 5 formats and 6 language implementations. Zero data corruption
  • Zero runtime dependencies. The GCF TypeScript package has no deps
  • Spec v3.2 Stable, 174 conformance fixtures, 6 implementations (Go, TypeScript, Python, Rust, Swift, Kotlin)

Adopted by Speakeasy (Google, Verizon, Mistral AI, DocuSign), OmniRoute (6.1K stars), and others.

What changed

  • apps/jscpd-server/package.json: added @blackwell-systems/gcf as optional dependency (pinned to 2.2.1)
  • apps/jscpd-server/src/index.ts: added --gcf CLI flag
  • apps/jscpd-server/src/server/serialize.ts: new module, dynamic import with graceful fallback
  • apps/jscpd-server/src/server/mcp-server.ts: replaced JSON.stringify calls with serialize() (3 call sites)

Usage

jscpd-server --gcf .

Or via environment variable:

JSCPD_OUTPUT_FORMAT=gcf jscpd-server .

Without the flag or env var, behavior is identical to the current JSON output. If @blackwell-systems/gcf is not installed, the flag silently falls back to JSON.


Summary by CodeRabbit

  • New Features
    • Added optional Graph Compact Format (GCF) serialization for server/MCP tool responses, including duplication checks and statistics.
    • Introduced a --gcf command-line option (and matching configuration) to enable GCF output.
    • If GCF encoding isn’t available or fails, responses automatically fall back to the existing pretty-printed JSON format.

GCF (Graph Compact Format) encodes structured data with 56% fewer tokens
than JSON on duplication results. Optional dependency, opt-in via --gcf
flag or JSCPD_OUTPUT_FORMAT=gcf env var. Falls back to JSON if the
package is not installed or encoding fails.

Benchmarked on jscpd response shapes (duplication results + statistics):
- 5 duplications: 49% fewer tokens
- 20 duplications: 59% fewer tokens
- 100 duplications: 58% fewer tokens (14,498 -> 6,112)
- Project statistics: 27-28% fewer tokens
- Overall: 56% savings, 7/7 GCF wins

coderabbitai Bot commented Jun 23, 2026

Walkthrough

An optional @blackwell-systems/gcf dependency is added to jscpd-server. A new serialize.ts module conditionally encodes data as GCF or JSON based on a --gcf CLI flag or JSCPD_OUTPUT_FORMAT env var. All four MCP tool and resource responses are updated to use serialize instead of JSON.stringify.

Changes

Optional GCF Serialization for MCP Tool Responses

| Layer / File(s) | Summary |
|---|---|
| Serialize module and optional dependency: apps/jscpd-server/package.json, apps/jscpd-server/src/server/serialize.ts | Declares @blackwell-systems/gcf@2.2.1 as an optional dependency. Implements serialize.ts with a dynamic import of encodeGeneric, a memoized isGcfEnabled() check (env var or --gcf argv), and a serialize(data, indent) function that tries GCF encoding and falls back to JSON.stringify on error or when GCF is unavailable. |
| CLI flag and mcp-server wiring: apps/jscpd-server/src/index.ts, apps/jscpd-server/src/server/mcp-server.ts | Adds --gcf option to the server CLI definition. Updates all four MCP tool and resource responses (check_duplication, get_statistics, check_current_directory, statistics) to call serialize instead of JSON.stringify. |
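
The fallback behavior described in the walkthrough can be sketched roughly as follows. This is not the code in serialize.ts. `createSerializer` and `Encoder` are hypothetical names, and the encoder is injected as a parameter (rather than loaded through a dynamic import) so the sketch runs without the optional package installed.

```typescript
type Encoder = (data: unknown) => string;

// Hypothetical factory: `encoder` stands in for encodeGeneric, or null when the
// optional package could not be loaded. `gcfRequested` stands in for the flag/env check.
function createSerializer(encoder: Encoder | null, gcfRequested: boolean) {
  return function serialize(data: unknown, indent = 2): string {
    if (gcfRequested && encoder) {
      try {
        return encoder(data);
      } catch {
        // Fall through to JSON on any encoding error.
      }
    }
    return JSON.stringify(data, null, indent);
  };
}

// Package missing: GCF was requested but no encoder is available.
const withoutPackage = createSerializer(null, true);
// Encoder present but throwing: output should still be JSON.
const failing = createSerializer(() => {
  throw new Error("encode failed");
}, true);

console.log(withoutPackage({ ok: true }, 0)); // {"ok":true}
console.log(failing({ ok: true }, 0)); // {"ok":true}
```

Injecting the encoder also makes the fallback paths straightforward to unit test, which may help with the coverage warning noted in the pre-merge checks.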

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A hop through the JSON fields so wide,
Now GCF can join the encoding ride!
An optional package, a flag on the line,
serialize() picks the best format just fine.
Fallback to JSON if something goes wrong —
The rabbit keeps output neat, steady, and strong! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding an opt-in GCF output format for MCP tool responses via the --gcf flag, which matches the primary objective of the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
apps/jscpd-server/src/index.ts (1)

26-30: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

CLI flag is registered but not read through Commander.

The --gcf flag is registered with Commander but never accessed via cli.opts().gcf. Instead, serialize.ts directly checks process.argv.includes("--gcf"). This works but is non-idiomatic—typically Commander-parsed options should be read through the opts object.

Consider either:

  1. Reading the flag via serverOpts.gcf and passing it to the serialize module, or
  2. Documenting that the flag is intentionally parsed outside Commander for decoupling reasons.
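
A rough sketch of the first option, assuming the options object returned by Commander's opts() exposes a boolean gcf property. `ServerOptions` and `resolveGcfEnabled` are hypothetical names, and the exact-match comparison on the env var is also an assumption about how the PR reads it.

```typescript
// Hypothetical shape of the parsed CLI options (e.g. the result of cli.opts()).
interface ServerOptions {
  gcf?: boolean;
}

// Decide once, from parsed options and the environment, instead of scanning process.argv.
function resolveGcfEnabled(
  opts: ServerOptions,
  env: Record<string, string | undefined>,
): boolean {
  return opts.gcf === true || env.JSCPD_OUTPUT_FORMAT === "gcf";
}

console.log(resolveGcfEnabled({ gcf: true }, {})); // true
console.log(resolveGcfEnabled({}, { JSCPD_OUTPUT_FORMAT: "gcf" })); // true
console.log(resolveGcfEnabled({}, {})); // false
```

The result could then be passed to the serialize module at startup, which keeps argument parsing in one place.
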
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jscpd-server/src/index.ts` around lines 26 - 30, The `--gcf` flag is
registered with Commander but not read through the idiomatic `cli.opts()`
method. After parsing the command line options with Commander, read the gcf flag
value via `cli.opts().gcf` and pass this parsed value to the serialize module as
a parameter. Then update `serialize.ts` to accept and use this parameter instead
of directly checking `process.argv.includes("--gcf")` to follow Commander
conventions.
apps/jscpd-server/src/server/serialize.ts (1)

37-46: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Remove redundant encodeGeneric null check.

Line 38 checks encodeGeneric even though isGcfEnabled() already returns false when encodeGeneric === null (line 30). The && encodeGeneric condition is redundant.

♻️ Simplified version
 export function serialize(data: unknown, indent = 2): string {
-  if (isGcfEnabled() && encodeGeneric) {
+  if (isGcfEnabled()) {
     try {
-      return encodeGeneric(data);
+      return encodeGeneric!(data);
     } catch {
       // Fall back to JSON on any encoding error
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jscpd-server/src/server/serialize.ts` around lines 37 - 46, In the
serialize function, remove the redundant `&& encodeGeneric` condition from the
if statement that checks `isGcfEnabled()`. Since isGcfEnabled() already returns
false when encodeGeneric is null (as verified on line 30), the additional null
check is unnecessary. Simply change the condition to check only `if
(isGcfEnabled())` to eliminate the redundancy.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/jscpd-server/src/server/mcp-server.ts`:
- Line 5: The `serialize` function is imported but not consistently applied
across all tool handlers in the MCP server. The `get_statistics` tool handler
(around line 83) still uses `JSON.stringify` instead of the `serialize`
function, while the other two tool handlers `check_duplication` and
`check_current_directory` have been updated to use `serialize`. Replace the
`JSON.stringify` call in the `get_statistics` tool handler with the `serialize`
function to ensure consistent output formatting when the global configuration
flag is enabled. Note that the `statistics` resource definition around line 149
should continue using `JSON.stringify` since it explicitly declares a JSON
mimeType and resources have different semantics than tools.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 79092a48-8087-45d9-b580-36b7a43d30b3

📥 Commits

Reviewing files that changed from the base of the PR and between 9cf3cc3 and 3ae1463.

📒 Files selected for processing (4)
  • apps/jscpd-server/package.json
  • apps/jscpd-server/src/index.ts
  • apps/jscpd-server/src/server/mcp-server.ts
  • apps/jscpd-server/src/server/serialize.ts

Comment thread apps/jscpd-server/src/server/mcp-server.ts

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
apps/jscpd-server/src/server/mcp-server.ts (1)

149-149: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Keep resource payload aligned with declared JSON MIME type.

Line 140 declares mimeType: "application/json", but Line 149 uses serialize(stats), which can output GCF when enabled. That breaks the resource contract for JSON consumers.

🔧 Minimal fix
         return {
           contents: [
             {
               uri: uri.href,
-              text: serialize(stats),
+              text: JSON.stringify(stats, null, 2),
             },
           ],
         };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jscpd-server/src/server/mcp-server.ts` at line 149, The resource
declares mimeType as "application/json" on line 140, but the text property uses
serialize(stats) which can output GCF format when enabled, creating a contract
mismatch. Replace the serialize(stats) call with JSON.stringify(stats) to ensure
the output is always valid JSON that aligns with the declared mimeType.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df9601e8-ba40-459b-844f-ec5869ae3fd9

📥 Commits

Reviewing files that changed from the base of the PR and between 3ae1463 and 34deaa8.

📒 Files selected for processing (2)
  • apps/jscpd-server/src/server/mcp-server.ts
  • apps/jscpd-server/src/server/serialize.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/jscpd-server/src/server/serialize.ts

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
apps/jscpd-server/src/server/mcp-server.ts (1)

117-117: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Minor behavioral change: JSON is now pretty-printed when GCF is disabled.

The original code used JSON.stringify(statistics) without indentation arguments, producing minified JSON. The new serialize(statistics) call defaults to indent=2, so when GCF is disabled, the JSON output will now be pretty-printed instead of minified.

This is a very minor change that improves readability and normalizes formatting across all tool responses. However, it technically contradicts the PR objective's claim of "zero behavior changes" when GCF is disabled.

If the minified format was intentional for this specific handler, you can preserve it by calling serialize(statistics, 0) instead.
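
The difference is easy to see with plain JSON.stringify. The snippet below uses a made-up statistics object and assumes serialize forwards its indent argument as the third parameter of JSON.stringify, as the walkthrough describes.

```typescript
// Invented sample data; the real statistics shape may differ.
const statistics = { total: { lines: 120, duplicatedLines: 12 } };

const minified = JSON.stringify(statistics); // original behavior
const pretty = JSON.stringify(statistics, null, 2); // new default (indent = 2)
const indentZero = JSON.stringify(statistics, null, 0); // what indent = 0 would produce

console.log(minified === indentZero); // true
console.log(pretty.includes("\n")); // true
```

An indent of 0 produces the same string as omitting the argument, so passing 0 would restore the previous minified output for this handler.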

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jscpd-server/src/server/mcp-server.ts` at line 117, The
serialize(statistics) call on line 117 now produces pretty-printed JSON with
default indentation instead of the original minified JSON format from
JSON.stringify(). To preserve the original minified JSON behavior and maintain
zero behavioral changes when GCF is disabled, modify the serialize(statistics)
call to serialize(statistics, 0) to explicitly disable indentation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 05215a72-725a-42f6-b7e2-7b62d5510b1d

📥 Commits

Reviewing files that changed from the base of the PR and between 8472990 and 2f83019.

📒 Files selected for processing (1)
  • apps/jscpd-server/src/server/mcp-server.ts
