test(edit): add MCP edit performance integration#489
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds an integration/performance test and npm script that starts the built MCP server over stdio, generates large deterministic markdown and Python fixtures, runs concurrent exact and fuzzy edit_block workflows with paged reads and checkpoints, probes server responsiveness, validates final file contents, and restores configuration on teardown. ChangesEdit Block Performance Integration Test
Sequence Diagram(s)sequenceDiagram
participant TestMain as Test Main
participant MCPServer as MCP Server (dist/index.js)
participant MCPClient as MCP Client (SDK)
participant Workflows as Concurrent Workflows
participant Probe as Responsiveness Probe
TestMain->>MCPServer: spawn via stdio
MCPServer-->>TestMain: server stdout/stderr (captured)
TestMain->>MCPClient: connect()
TestMain->>MCPClient: enumerate_tools
TestMain->>MCPClient: get_config / set_config (test limits)
TestMain->>Workflows: start parallel workflows
TestMain->>Probe: start probe loop
par Markdown Workflow
Workflows->>MCPClient: write_file (large markdown)
Workflows->>MCPClient: read_file (paged reads)
Workflows->>MCPClient: edit_block (replace marker)
Workflows->>MCPClient: read_file (verify)
and Python Exact Workflow
Workflows->>MCPClient: write_file (python fixture)
Workflows->>MCPClient: edit_block (exact match)
Workflows->>MCPClient: read_file (verify)
and Python Fuzzy Workflow
Workflows->>MCPClient: write_file (fuzzy fixture)
Workflows->>MCPClient: edit_block (near-miss)
MCPClient-->>Workflows: Differences (extracted exact)
Workflows->>MCPClient: edit_block (retry with exact)
and Probe Loop
loop every 1s
Probe->>MCPClient: list_tools / ping
MCPClient-->>Probe: respond (measure latency)
end
end
Workflows-->>TestMain: workflows complete
Probe-->>TestMain: probe stopped (latencies)
TestMain->>MCPClient: set_config (restore)
TestMain->>TestMain: teardown & shutdown
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/test-edit-block-performance-integration.js`:
- Around line 595-603: The test calls callTool(client, 'get_config', {}) without
first asserting the MCP advertises the 'get_config' tool; add an assertion
similar to the existing loop that checks tools.tools.some(tool => tool.name ===
'get_config') (using the same tools variable from client.listTools) so the test
validates presence of 'get_config' before invoking callTool and fails with the
correct message if the tool is missing.
- Around line 591-623: The setup() routine can fail after creating TEST_DIR and
mutating server state (via callTool('set_config_value')), leaving the MCP
process and partial config changes running; modify main() to guard the setup
call by wrapping it in try/catch (or move setup into the existing try block) and
on any setup error run the same cleanup/teardown logic used in the finally block
(stop the MCP process, remove TEST_DIR, and revert any config changes if
possible), ensuring resources created during setup are always cleaned even if
setup throws; reference setup(), main(), TEST_DIR, callTool and the
'set_config_value' tool when applying the fix.
- Around line 613-620: The test sets 'fileWriteLineLimit' to 50 via
callTool(client, 'set_config_value', { key, value, origin: 'llm' }) which is too
small for the generated fixtures (thousands of lines) and can cause writes to
fail before exercising edit_block; increase the value for the
'fileWriteLineLimit' entry (in the array iterated by the for..of that calls
set_config_value and assertToolSuccess) to a number larger than the fixture size
(e.g., several thousand) so write_file calls won't be truncated or rejected
during the integration test.
- Around line 510-521: The responsiveness probe (started via
runResponsivenessProbe with stopProbe) is not stopped/drained if Promise.all
rejects; modify runParallelWorkflows to ensure the probe is always stopped and
awaited: wrap the Promise.all([...]) call in a try/finally (or
try/catch/finally), set stopProbe.value = true in the finally block, then await
responsivenessProbe in that same finally to drain it before rethrowing any
error; keep existing variables workflowResults and responsivenessProbe names so
the change is local to runParallelWorkflows and ensures the probe won’t continue
pinging during teardown.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cac780dd-aca7-4fd3-bfc5-805c7c7fb7c7
📒 Files selected for processing (1)
test/test-edit-block-performance-integration.js
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/integration/run-all-integration-tests.js (1)
16-43: 🏗️ Heavy liftConsider adding a timeout mechanism for hung tests.
Currently, if a test hangs indefinitely (e.g., waiting for a resource that never arrives), the runner will also hang. While CI pipelines typically have external timeouts, adding a configurable per-test timeout would improve developer experience and provide clearer error messages when tests exceed expected durations.
💡 Example timeout implementation
-function runTestFile(testFile) { +function runTestFile(testFile, timeoutMs = 300000) { // default 5 min return new Promise((resolve) => { console.log(`\nRunning integration test: ${testFile}`); const startedAt = Date.now(); const proc = spawn('node', [testFile], { cwd: __dirname, stdio: 'inherit', shell: false, }); + const timer = setTimeout(() => { + proc.kill('SIGTERM'); + const duration = Date.now() - startedAt; + console.error(`FAIL ${testFile} (${duration}ms): timeout after ${timeoutMs}ms`); + resolve({ file: testFile, success: false, duration, error: 'timeout' }); + }, timeoutMs); + proc.on('close', (code) => { + clearTimeout(timer); const duration = Date.now() - startedAt; if (code === 0) { console.log(`PASS ${testFile} (${duration}ms)`); resolve({ file: testFile, success: true, duration }); } else { console.error(`FAIL ${testFile} (${duration}ms, exit code ${code})`); resolve({ file: testFile, success: false, duration, exitCode: code }); } }); proc.on('error', (error) => { + clearTimeout(timer); const duration = Date.now() - startedAt; console.error(`FAIL ${testFile} (${duration}ms): ${error.message}`); resolve({ file: testFile, success: false, duration, error: error.message }); }); }); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/integration/run-all-integration-tests.js` around lines 16 - 43, The runTestFile function lacks a per-test timeout and can hang indefinitely; add a configurable timeout (e.g., from an env var or argument like TEST_TIMEOUT_MS with a sane default) inside runTestFile that starts a timer after spawning the child, and on timeout logs a clear FAIL message, kills the child process (proc.kill()), and resolves the promise with success:false, duration and a timeout flag/message; ensure you clear the timeout when proc emits 'close' or 'error' to avoid leaks and race conditions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/integration/run-all-integration-tests.js`:
- Around line 16-43: The runTestFile function lacks a per-test timeout and can
hang indefinitely; add a configurable timeout (e.g., from an env var or argument
like TEST_TIMEOUT_MS with a sane default) inside runTestFile that starts a timer
after spawning the child, and on timeout logs a clear FAIL message, kills the
child process (proc.kill()), and resolves the promise with success:false,
duration and a timeout flag/message; ensure you clear the timeout when proc
emits 'close' or 'error' to avoid leaks and race conditions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 601046cf-19b1-4482-86ff-828497e16c48
📒 Files selected for processing (2)
package.jsontest/integration/run-all-integration-tests.js
🚧 Files skipped from review as they are similar to previous changes (1)
- package.json
…itch The integration test set DESKTOP_COMMANDER_DISABLE_TELEMETRY on the spawned server, but nothing read it — telemetry was gated only on the persisted `telemetryEnabled` config, so test/CI runs fired real GA4 + BigQuery-proxy events. Add an env-based kill-switch that short-circuits both send paths, independent of config and without mutating the user's config.
Adds a DOCX same-file edit workflow (40 edits) to the parallel performance suite so the docx edit_block path (find/replace on pretty-printed document XML + zip repack) is exercised under the concurrent responsiveness probe. Targets <w:t xml:space="preserve">...</w:t> elements and verifies via the DOCX outline read.
Summary by CodeRabbit