test(edit): add MCP edit performance integration by edgarsskore · Pull Request #489 · wonderwhy-er/DesktopCommanderMCP

edgarsskore · 2026-06-02T08:35:44Z

Summary by CodeRabbit

Tests
- Added long-running integration/performance tests covering large-file edit workflows (Markdown and Python), including exact-match and fuzzy-fallback edit handling, periodic checkpoints, parallel workloads, responsiveness/latency probes, and end-to-end assertions.
Chores
- Added a new npm script to build and run the integration suite and a test runner that executes each integration test, reports pass/fail with durations, and emits an aggregate summary.

coderabbitai · 2026-06-02T08:35:56Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Adds an integration/performance test and npm script that starts the built MCP server over stdio, generates large deterministic markdown and Python fixtures, runs concurrent exact and fuzzy edit_block workflows with paged reads and checkpoints, probes server responsiveness, validates final file contents, and restores configuration on teardown.

Changes

Edit Block Performance Integration Test

| Layer / File(s) | Summary |
|---|---|
| Test Configuration & Utilities `test/integration/edit-block-performance.js` (lines 1–196) | Defines test constants, helpers for MCP calls and assertions, deterministic fixture generators (markdown/Python/fuzzy), marker builders, read-offset calculations, and fuzzy diff parsing. |
| Edit Block Workflow Implementations `test/integration/edit-block-performance.js` (lines 198–530) | Implements three workflows: markdown same-file edits (various editCounts), Python exact-match edits (150), and Python fuzzy-fallback edits (25). Each writes fixtures, performs paged reads and edit_block calls, writes periodic checkpoints, re-reads to verify edits, and enforces performance bounds. |
| Orchestration & Responsiveness Probe `test/integration/edit-block-performance.js` (lines 532–581) | Runs workflows in parallel, starts a responsiveness probe that pings the MCP server at intervals, aggregates durations, and asserts max observed latency stays below configured threshold. |
| MCP Client, Setup & Teardown `test/integration/edit-block-performance.js` (lines 583–658) | Launches the built MCP server via stdio, connects an MCP SDK client, captures server stderr, recreates test directory, validates tools, sets editable config values (allowed dirs, read/write line limits), and restores config and filesystem on teardown. |
| Main Entrypoint & Process Handling `test/integration/edit-block-performance.js` (lines 660–714) | Orchestrates client creation, setup, parallel workflow execution, verification and performance reporting, strict planned-vs-verified checks (with fuzzy special-casing), guarded teardown, client shutdown, and error-exit behavior. |
| Integration Test Runner & NPM Wiring `package.json`, `test/integration/run-all-integration-tests.js` (lines 1–90) | Adds `test:integration` npm script to run build then the runner; runner discovers `.js` tests in `test/integration/`, spawns each test sequentially with `node`, records durations and pass/fail results, prints per-test timings and aggregate summary, and exits nonzero if any test failed. |
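
To make the "deterministic fixture generators" and "read-offset calculations" rows more concrete, here is a minimal sketch of what such helpers could look like. The function names, the marker format, and the `{ offset, length }` page shape are assumptions for illustration only; they are not taken from `edit-block-performance.js`.

```javascript
// Hypothetical helpers: names and marker format are illustrative,
// not copied from the PR's test file.

// Build a deterministic markdown fixture: the same inputs always
// produce the same text, so edits can target known markers.
function buildMarkdownFixture(lineCount, markerEvery) {
  const lines = [];
  for (let i = 0; i < lineCount; i++) {
    // Every `markerEvery`-th line carries a unique marker an edit can replace.
    lines.push(
      i % markerEvery === 0
        ? `<!-- MARKER_${i / markerEvery} -->`
        : `Line ${i}: filler text for the performance fixture.`
    );
  }
  return lines.join('\n');
}

// Compute the pages needed to read `totalLines` lines in chunks of `pageSize`.
function pageOffsets(totalLines, pageSize) {
  const pages = [];
  for (let offset = 0; offset < totalLines; offset += pageSize) {
    pages.push({ offset, length: Math.min(pageSize, totalLines - offset) });
  }
  return pages;
}
```

Because the fixture is a pure function of its arguments, a failed run can be reproduced without storing the large files in the repository.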

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant TestMain as Test Main
  participant MCPServer as MCP Server (dist/index.js)
  participant MCPClient as MCP Client (SDK)
  participant Workflows as Concurrent Workflows
  participant Probe as Responsiveness Probe

  TestMain->>MCPServer: spawn via stdio
  MCPServer-->>TestMain: server stdout/stderr (captured)
  TestMain->>MCPClient: connect()
  TestMain->>MCPClient: enumerate_tools
  TestMain->>MCPClient: get_config / set_config (test limits)

  TestMain->>Workflows: start parallel workflows
  TestMain->>Probe: start probe loop

  par Markdown Workflow
    Workflows->>MCPClient: write_file (large markdown)
    Workflows->>MCPClient: read_file (paged reads)
    Workflows->>MCPClient: edit_block (replace marker)
    Workflows->>MCPClient: read_file (verify)
  and Python Exact Workflow
    Workflows->>MCPClient: write_file (python fixture)
    Workflows->>MCPClient: edit_block (exact match)
    Workflows->>MCPClient: read_file (verify)
  and Python Fuzzy Workflow
    Workflows->>MCPClient: write_file (fuzzy fixture)
    Workflows->>MCPClient: edit_block (near-miss)
    MCPClient-->>Workflows: Differences (extracted exact)
    Workflows->>MCPClient: edit_block (retry with exact)
  and Probe Loop
    loop every 1s
      Probe->>MCPClient: list_tools / ping
      MCPClient-->>Probe: respond (measure latency)
    end
  end

  Workflows-->>TestMain: workflows complete
  Probe-->>TestMain: probe stopped (latencies)
  TestMain->>MCPClient: set_config (restore)
  TestMain->>TestMain: teardown & shutdown
```
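
The probe loop in the diagram can be sketched roughly as follows. This is a minimal illustration, assuming the probe receives a `ping` callback (a stand-in for whatever lightweight MCP call the test uses) and a shared `stopProbe` flag; `assertMaxLatency` and the interval default are hypothetical.

```javascript
// Hypothetical probe sketch; `ping` stands in for a lightweight MCP request.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Ping at a fixed interval until `stopProbe.value` becomes true,
// recording how long each round trip took in milliseconds.
async function runResponsivenessProbe(ping, stopProbe, intervalMs = 1000) {
  const latencies = [];
  while (!stopProbe.value) {
    const startedAt = Date.now();
    await ping();
    latencies.push(Date.now() - startedAt);
    await sleep(intervalMs);
  }
  return latencies;
}

// Fail if the slowest observed ping exceeded the threshold; return the max.
function assertMaxLatency(latencies, thresholdMs) {
  const max = Math.max(0, ...latencies);
  if (max > thresholdMs) {
    throw new Error(`Max probe latency ${max}ms exceeded ${thresholdMs}ms`);
  }
  return max;
}
```

Measuring the maximum rather than the mean is what makes the probe sensitive to a single long event-loop stall during a heavy edit.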

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through fixtures large and bright,
Replaced old markers in the quiet night.
Fuzzy fell back, exact matches flew,
Pings kept time as edits grew and grew.
Tests finished green — a carrot-shaped view!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding integration tests for MCP edit block performance, including a performance test script, test runner, and npm script. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/test-edit-block-performance-integration.js`:
- Around line 595-603: The test calls callTool(client, 'get_config', {}) without
first asserting the MCP advertises the 'get_config' tool; add an assertion
similar to the existing loop that checks tools.tools.some(tool => tool.name ===
'get_config') (using the same tools variable from client.listTools) so the test
validates presence of 'get_config' before invoking callTool and fails with the
correct message if the tool is missing.
- Around line 591-623: The setup() routine can fail after creating TEST_DIR and
mutating server state (via callTool('set_config_value')), leaving the MCP
process and partial config changes running; modify main() to guard the setup
call by wrapping it in try/catch (or move setup into the existing try block) and
on any setup error run the same cleanup/teardown logic used in the finally block
(stop the MCP process, remove TEST_DIR, and revert any config changes if
possible), ensuring resources created during setup are always cleaned even if
setup throws; reference setup(), main(), TEST_DIR, callTool and the
'set_config_value' tool when applying the fix.
- Around line 613-620: The test sets 'fileWriteLineLimit' to 50 via
callTool(client, 'set_config_value', { key, value, origin: 'llm' }) which is too
small for the generated fixtures (thousands of lines) and can cause writes to
fail before exercising edit_block; increase the value for the
'fileWriteLineLimit' entry (in the array iterated by the for..of that calls
set_config_value and assertToolSuccess) to a number larger than the fixture size
(e.g., several thousand) so write_file calls won't be truncated or rejected
during the integration test.
- Around line 510-521: The responsiveness probe (started via
runResponsivenessProbe with stopProbe) is not stopped/drained if Promise.all
rejects; modify runParallelWorkflows to ensure the probe is always stopped and
awaited: wrap the Promise.all([...]) call in a try/finally (or
try/catch/finally), set stopProbe.value = true in the finally block, then await
responsivenessProbe in that same finally to drain it before rethrowing any
error; keep existing variables workflowResults and responsivenessProbe names so
the change is local to runParallelWorkflows and ensures the probe won’t continue
pinging during teardown.
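
The last finding (always stop and drain the probe) can be sketched as below. This is a simplified illustration, not the PR's actual `runParallelWorkflows`: it assumes the workflows are passed in as async functions and that `startProbe` is a hypothetical callback returning the probe promise. The names `stopProbe`, `workflowResults`, and `responsivenessProbe` follow the review comment.

```javascript
// Sketch of the try/finally pattern from the review comment.
// `workflows` and `startProbe` are assumed parameters for illustration.
async function runParallelWorkflows(workflows, startProbe) {
  const stopProbe = { value: false };
  const responsivenessProbe = startProbe(stopProbe);

  let workflowResults;
  let latencies;
  try {
    // Rejects as soon as any workflow fails.
    workflowResults = await Promise.all(workflows.map((run) => run()));
  } finally {
    // Runs on success and on failure: signal the probe to stop, then
    // wait for its loop to exit so it cannot ping during teardown.
    stopProbe.value = true;
    latencies = await responsivenessProbe;
  }
  return { workflowResults, latencies };
}
```

When a workflow throws, the `finally` block still drains the probe and the original error is rethrown afterwards. One caveat of this sketch: if the probe itself rejects inside `finally`, its error replaces the workflow error.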

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cac780dd-aca7-4fd3-bfc5-805c7c7fb7c7

📥 Commits

Reviewing files that changed from the base of the PR and between ce4669c and d66c3e2.

📒 Files selected for processing (1)

test/test-edit-block-performance-integration.js

coderabbitai

🧹 Nitpick comments (1)

test/integration/run-all-integration-tests.js (1)

16-43: 🏗️ Heavy lift

Consider adding a timeout mechanism for hung tests.

Currently, if a test hangs indefinitely (e.g., waiting for a resource that never arrives), the runner will also hang. While CI pipelines typically have external timeouts, adding a configurable per-test timeout would improve developer experience and provide clearer error messages when tests exceed expected durations.

💡 Example timeout implementation

```diff
-function runTestFile(testFile) {
+function runTestFile(testFile, timeoutMs = 300000) { // default 5 min
   return new Promise((resolve) => {
     console.log(`\nRunning integration test: ${testFile}`);
     const startedAt = Date.now();
     const proc = spawn('node', [testFile], {
       cwd: __dirname,
       stdio: 'inherit',
       shell: false,
     });

+    const timer = setTimeout(() => {
+      proc.kill('SIGTERM');
+      const duration = Date.now() - startedAt;
+      console.error(`FAIL ${testFile} (${duration}ms): timeout after ${timeoutMs}ms`);
+      resolve({ file: testFile, success: false, duration, error: 'timeout' });
+    }, timeoutMs);
+
     proc.on('close', (code) => {
+      clearTimeout(timer);
       const duration = Date.now() - startedAt;
       if (code === 0) {
         console.log(`PASS ${testFile} (${duration}ms)`);
         resolve({ file: testFile, success: true, duration });
       } else {
         console.error(`FAIL ${testFile} (${duration}ms, exit code ${code})`);
         resolve({ file: testFile, success: false, duration, exitCode: code });
       }
     });

     proc.on('error', (error) => {
+      clearTimeout(timer);
       const duration = Date.now() - startedAt;
       console.error(`FAIL ${testFile} (${duration}ms): ${error.message}`);
       resolve({ file: testFile, success: false, duration, error: error.message });
     });
   });
 }
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/integration/run-all-integration-tests.js` around lines 16 - 43, The
runTestFile function lacks a per-test timeout and can hang indefinitely; add a
configurable timeout (e.g., from an env var or argument like TEST_TIMEOUT_MS
with a sane default) inside runTestFile that starts a timer after spawning the
child, and on timeout logs a clear FAIL message, kills the child process
(proc.kill()), and resolves the promise with success:false, duration and a
timeout flag/message; ensure you clear the timeout when proc emits 'close' or
'error' to avoid leaks and race conditions.
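
Two small pieces of that prompt can be sketched in isolation: reading the timeout from `TEST_TIMEOUT_MS`, and guarding against the race where a killed child still emits `'close'` after the timeout has already reported a failure. Both helper names are hypothetical, and the 300000 ms default simply mirrors the example above.

```javascript
// Hypothetical helpers; neither name comes from the PR.

// Resolve the per-test timeout from the environment, falling back to a
// default when the variable is unset, non-numeric, or not positive.
function resolveTimeoutMs(env = process.env, defaultMs = 300000) {
  const raw = env.TEST_TIMEOUT_MS;
  const parsed = Number(raw);
  return raw !== undefined && Number.isFinite(parsed) && parsed > 0
    ? parsed
    : defaultMs;
}

// Wrap a promise `resolve` so only the first outcome is reported.
// Returns true if this call settled the promise, false if it was ignored.
function settleOnce(resolve) {
  let settled = false;
  return (result) => {
    if (settled) return false;
    settled = true;
    resolve(result);
    return true;
  };
}
```

In a runner, the timeout, `'close'`, and `'error'` handlers would each call the wrapped function and only log when it returns `true`, so a timed-out test is reported once rather than once for the timeout and again for the resulting `'close'` event.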

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 601046cf-19b1-4482-86ff-828497e16c48

📥 Commits

Reviewing files that changed from the base of the PR and between 98d7745 and 5bd6800.

📒 Files selected for processing (2)

package.json
test/integration/run-all-integration-tests.js

🚧 Files skipped from review as they are similar to previous changes (1)

package.json

…itch The integration test set DESKTOP_COMMANDER_DISABLE_TELEMETRY on the spawned server, but nothing read it — telemetry was gated only on the persisted `telemetryEnabled` config, so test/CI runs fired real GA4 + BigQuery-proxy events. Add an env-based kill-switch that short-circuits both send paths, independent of config and without mutating the user's config.
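
A minimal sketch of such an env-based kill-switch, under stated assumptions: only the variable name `DESKTOP_COMMANDER_DISABLE_TELEMETRY` comes from the commit message. The accepted values, the function names, and the `transport` callback are hypothetical and do not reflect the server's actual telemetry code.

```javascript
// Assumption: which values count as "disabled" is illustrative only.
function isTelemetryDisabledByEnv(env = process.env) {
  const raw = (env.DESKTOP_COMMANDER_DISABLE_TELEMETRY ?? '').trim().toLowerCase();
  return raw === '1' || raw === 'true' || raw === 'yes';
}

// Hypothetical send wrapper: checks the environment before doing any work,
// so no network call happens and the persisted config is never touched.
// Returns true if the event was handed to the transport, false if skipped.
async function sendTelemetry(event, transport, env = process.env) {
  if (isTelemetryDisabledByEnv(env)) return false;
  await transport(event);
  return true;
}
```

Checking the environment at send time, rather than writing `telemetryEnabled` to the config, keeps the switch scoped to the spawned process.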

Adds a DOCX same-file edit workflow (40 edits) to the parallel performance suite so the docx edit_block path (find/replace on pretty-printed document XML + zip repack) is exercised under the concurrent responsiveness probe. Targets <w:t xml:space="preserve">...</w:t> elements and verifies via the DOCX outline read.

test(edit): add MCP edit performance integration

d66c3e2

coderabbitai Bot reviewed Jun 2, 2026

edgarsskore added 4 commits June 2, 2026 11:46

test(edit): gate MCP performance integration

33ec0fb

test(edit): harden performance probe cleanup

98d7745

test(integration): add integration test runner

5bd6800

test(integration): print integration timings

70b6dc4

coderabbitai Bot reviewed Jun 2, 2026

wonderwhy-er added 2 commits June 4, 2026 12:53

wonderwhy-er merged commit 3881aed into main Jun 4, 2026
2 checks passed

coderabbitai Bot mentioned this pull request Jun 9, 2026

fix(edit_block): run fuzzy search in a worker thread to keep event lo… #500

Merged

wonderwhy-er merged 7 commits into main from investigate/edit-block-performance

2 participants
