feat(telemetry): send tool-call duration_ms with server_call_tool #492

Merged
wonderwhy-er merged 1 commit into main from feat/tool-call-duration-telemetry
Jun 5, 2026

wonderwhy-er (Owner) commented Jun 4, 2026

What

Makes per-tool-call latency observable in telemetry so we can detect MCP performance regressions/improvements across versions and clients.

Previously server_call_tool was fired before the handler ran, so it could never carry timing. Duration was computed after execution but only fed into the local in-memory toolHistory (surfaced via get_recent_tool_calls) and never sent anywhere.

Change

Moves the single server_call_tool capture from before the switch into a finally block at the end of the CallToolRequestSchema handler, enriched with two new fields:

  • duration_ms: Date.now() - startTime
  • is_error: sent as a string (consistent with how remote is sent)

To make this possible, telemetryData, result, and a new isError flag are hoisted above the try so the finally can read them. isError is taken from result.isError on the normal path and set to true in the catch.
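
The hoisting described above can be sketched as follows. This is a minimal illustration, not the code in src/server.ts: `handleToolCall`, `captureCallTool`, `ToolResult`, and the `name` telemetry key are hypothetical stand-ins, and the emitter records events in memory instead of sending them.

```typescript
// Hypothetical shapes for illustration; the real types live in src/server.ts.
type ToolResult = { content: string; isError?: boolean };
type TelemetryEvent = Record<string, string | number>;

// Stand-in for capture_call_tool: records events in memory instead of sending them.
const sentEvents: TelemetryEvent[] = [];
function captureCallTool(event: string, params: TelemetryEvent): void {
  sentEvents.push({ event, ...params });
}

async function handleToolCall(
  name: string,
  runTool: () => Promise<ToolResult>,
): Promise<ToolResult> {
  const startTime = Date.now();

  // Declared before the try so that catch and finally can both see them.
  const telemetryData: TelemetryEvent = { name };
  let result: ToolResult | undefined;
  let isError = false;

  try {
    result = await runTool();
    // Normal path: a tool may report a handled error via result.isError.
    isError = result.isError === true;
    return result;
  } catch (error) {
    // Hard-crash path: the tool threw instead of returning a result.
    isError = true;
    result = { content: `Error: ${String(error)}`, isError: true };
    return result;
  } finally {
    // Runs on every path above, after the return value has been chosen.
    captureCallTool('server_call_tool', {
      ...telemetryData,
      duration_ms: Date.now() - startTime,
      is_error: String(isError),
    });
  }
}
```

Had `isError` been declared with `let` inside the `try`, it would be out of scope in the `finally`, which is why the declarations move up.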

No new event, no volume change — just two new params on the existing server_call_tool.

Why finally

It fires on all exit paths: success, handled-error (isError result), and hard crash (the top-level catch). The event is only missed if a tool never returns and never throws — a true hang. That's an accepted, known limitation (and the price of not adding a second start-fired event).
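
The hang limitation can be seen in a small sketch. The names here (`instrumented`, `hang`, `settlesWithin`, `emitted`) are hypothetical and only model the behaviour: a promise that never settles stands in for a tool that hangs, and the `finally` block for it is never reached.

```typescript
// Hypothetical recorder standing in for the real telemetry call.
const emitted: string[] = [];

async function instrumented(name: string, run: () => Promise<string>): Promise<string> {
  try {
    return await run();
  } finally {
    // Reached only once run() resolves or rejects.
    emitted.push(name);
  }
}

// A promise that never settles models a tool that hangs.
const hang = (): Promise<string> => new Promise<string>(() => {});

// Resolves to 'timeout' if the call has not settled within `ms` milliseconds.
function settlesWithin(call: Promise<string>, ms: number): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const timer = setTimeout(() => resolve('timeout'), ms);
    call.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

Waiting on `settlesWithin(instrumented('stuck', hang), 50)` yields `'timeout'` and leaves `emitted` without a `'stuck'` entry. Covering that case would need something fired before execution, which is the second event the PR chose not to add.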

Local toolHistory duration tracking is unchanged.

Testing

  • npm run build — clean, no TS errors
  • test-telemetry-handling.js, test-ui-event-tracking.js — pass
  • Full suite: 39/40. The single failure (test-file-handlers.js Test 9, read_file image-content host compatibility) reproduces on clean origin/main and is unrelated to this change.

Follow-up for reviewer

Once this is collecting, we can build BigQuery views on mcp.merged for p50/p95 duration_ms per tool_name per app_version to watch for regressions.
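
As a rough illustration of what such a view would compute, here is a sketch of p50/p95 per tool_name per app_version over in-memory rows. It assumes a nearest-rank percentile, which may differ from whatever quantile function the BigQuery views end up using; `ToolCallRow`, `LatencySummary`, and both function names are hypothetical.

```typescript
// Hypothetical row shape using the column names mentioned above.
type ToolCallRow = { tool_name: string; app_version: string; duration_ms: number };
type LatencySummary = { calls: number; p50: number; p95: number };

// Nearest-rank percentile over an ascending, non-empty array.
function percentile(sortedValues: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sortedValues.length);
  return sortedValues[Math.max(rank, 1) - 1];
}

function latencyByToolAndVersion(rows: ToolCallRow[]): Map<string, LatencySummary> {
  // Group durations under a "tool@version" key.
  const groups = new Map<string, number[]>();
  for (const row of rows) {
    const key = `${row.tool_name}@${row.app_version}`;
    const durations = groups.get(key) ?? [];
    durations.push(row.duration_ms);
    groups.set(key, durations);
  }

  const summary = new Map<string, LatencySummary>();
  groups.forEach((durations, key) => {
    const sorted = [...durations].sort((a, b) => a - b);
    summary.set(key, {
      calls: sorted.length,
      p50: percentile(sorted, 50),
      p95: percentile(sorted, 95),
    });
  });
  return summary;
}
```

Comparing the summary for the same tool across two app_version values is the regression check the follow-up has in mind.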

Summary by CodeRabbit

  • Refactor
    • Enhanced internal telemetry event handling to ensure reliable tracking across all execution paths, including error scenarios.

Move the server_call_tool event from before execution into a finally
block so it can carry duration_ms and is_error. The finally guarantees
it fires on success, handled-error, and hard-crash paths. Only a true
hang (never returns/throws) is uninstrumented. No new event added.

- Hoist telemetryData/result/isError above the try so finally can read them
- duration_ms = Date.now() - startTime, is_error sent as string (proxy convention)
- Keeps local toolHistory duration tracking unchanged

coderabbitai Bot (Contributor) commented Jun 4, 2026

Walkthrough

The PR refactors telemetry emission in the CallToolRequestSchema handler by moving state tracking and event emission to a finally block. Telemetry variables are hoisted outside the try scope, error states are captured from tool results and exceptions, and a single server_call_tool event is emitted from finally with duration and error flags, ensuring telemetry is captured even on crashes.

Changes

Telemetry Emission Restructuring

| Layer / File(s) | Summary |
| --- | --- |
| Telemetry state variable hoisting (src/server.ts) | Telemetry data, result, and error-flag variables are hoisted outside the try block so they persist through exception paths and are accessible to the finally block. |
| Error state tracking through execution (src/server.ts) | Success and failure outcomes are captured by setting the error flag after tool execution and explicitly in the catch block, ensuring accurate error reporting in emitted telemetry. |
| Finally-block telemetry emission (src/server.ts) | A finally block is added to emit the server_call_tool telemetry event with merged metadata, duration, and error flag, guaranteeing emission even on crash paths where the catch block may not fully complete. |

Sequence Diagram

sequenceDiagram
  participant Handler as CallToolRequestSchema
  participant Tool as Tool Execution
  participant Telemetry as Telemetry
  Handler->>Handler: Hoist telemetryData, result, isError vars
  Handler->>Handler: trackToolCall setup
  Handler->>Tool: Execute tool
  alt Tool succeeds
    Tool-->>Handler: result with isError=false
    Handler->>Handler: Set isError from result.isError
  else Tool throws
    Tool-->>Handler: Exception
    Handler->>Handler: Catch: set isError=true
  end
  Handler->>Telemetry: Finally: capture_call_tool with telemetryData + duration_ms + is_error
  Telemetry-->>Handler: Event emitted

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • wonderwhy-er/DesktopCommanderMCP#153 ("Added call_tool"): Both PRs modify src/server.ts's CallToolRequestSchema handler telemetry for the server_call_tool event. #153 switches the capture call to capture_call_tool, while this PR refactors/centralizes telemetry emission via a new finally block.
  • wonderwhy-er/DesktopCommanderMCP#486: Both PRs modify src/server.ts's CallToolRequestSchema handler telemetry—this PR restructures the server_call_tool emission via a new finally, while the other adds module-level remote-call state that downstream per-tool telemetry uses.
  • wonderwhy-er/DesktopCommanderMCP#251: Both PRs modify src/server.ts's CallToolRequest tool-execution flow around result/error handling and post-execution logic (this PR refactors telemetry emission in finally, while the other adds toolHistory recording and duration).

Suggested reviewers

  • edgarsskore

Poem

🐰 A finally block hops into view,
Telemetry safe through and through,
State variables held in the morning light,
Catching both crashes and victories bright! 🎯

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding duration_ms telemetry field to the server_call_tool event, which is the primary objective of this refactoring. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/server.ts`:
- Around line 1575-1583: In the finally block where
capture_call_tool('server_call_tool', {...}) is invoked, wrap that call in a
local try/catch so any exception from the telemetry/analytics emission cannot
escape and affect the handler's normal return/error flow; inside the catch, log
the telemetry error (do not rethrow). Keep existing telemetry payload fields
(telemetryData, duration_ms: Date.now() - startTime, is_error: String(isError))
when calling capture_call_tool and ensure only logging happens on failure.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 3881aed and 4cc7c34.

📒 Files selected for processing (1)
  • src/server.ts

Comment thread src/server.ts
Comment on lines +1575 to +1583
    } finally {
        // Single tool-call telemetry event, fired AFTER execution so it can carry
        // timing. In a finally so it still fires on the hard-crash path (the catch
        // above). Only missed if a tool never returns or throws (a true hang).
        capture_call_tool('server_call_tool', {
            ...telemetryData,
            duration_ms: Date.now() - startTime,
            is_error: String(isError),
        });

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Guard telemetry emission in finally so analytics failures can't fail the tool call.

A throw from capture_call_tool(...) inside finally will override the normal return/error response path and turn successful tool executions into handler failures. Wrap this emit in a local try/catch (log-only on failure).

Proposed hardening patch
     } finally {
         // Single tool-call telemetry event, fired AFTER execution so it can carry
         // timing. In a finally so it still fires on the hard-crash path (the catch
         // above). Only missed if a tool never returns or throws (a true hang).
-        capture_call_tool('server_call_tool', {
-            ...telemetryData,
-            duration_ms: Date.now() - startTime,
-            is_error: String(isError),
-        });
+        try {
+            capture_call_tool('server_call_tool', {
+                ...telemetryData,
+                duration_ms: Date.now() - startTime,
+                is_error: String(isError),
+            });
+        } catch (telemetryError) {
+            logToStderr('warn', `server_call_tool telemetry failed: ${telemetryError}`);
+        }
     }
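
The failure mode behind this suggestion is standard JavaScript behaviour: an exception thrown inside `finally` replaces the pending `return` from the `try`. A minimal sketch, using a hypothetical `failingCapture` in place of a broken analytics client and an in-memory `warnings` array in place of `logToStderr`:

```typescript
// Hypothetical emitter that always fails, standing in for a broken analytics client.
function failingCapture(): void {
  throw new Error('telemetry backend unavailable');
}

// Unguarded: the throw in finally discards the pending return value.
function unguarded(): string {
  try {
    return 'tool result';
  } finally {
    failingCapture();
  }
}

// Guarded: the telemetry failure is recorded and the return value survives.
const warnings: string[] = [];
function guarded(): string {
  try {
    return 'tool result';
  } finally {
    try {
      failingCapture();
    } catch (telemetryError) {
      warnings.push(`server_call_tool telemetry failed: ${String(telemetryError)}`);
    }
  }
}
```

Calling `unguarded()` throws even though the tool itself succeeded, while `guarded()` returns `'tool result'` and leaves one entry in `warnings`.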

Comment thread src/server.ts
@@ -1453,6 +1454,7 @@ server.setRequestHandler(CallToolRequestSchema, async (request: CallToolRequest)

// Add tool call to history (exclude only get_recent_tool_calls to prevent recursion)
const duration = Date.now() - startTime;

Collaborator

I think this is the value to use for tool-call duration, since it excludes the overhead that follows. Currently duration is recalculated in the finally block on line 1581, and there is some overhead between this point and that one, so this earlier value describes the tool call more precisely. Maybe the duration should be computed sooner than the finally block.

Owner Author

In the end I decided to leave it as is.
The duration you point to is computed inside the try block.
If an error is thrown, it may never get calculated.
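
For reference, the trade-off in this thread can be sketched as a hypothetical middle ground that the PR did not adopt: record the duration in a hoisted variable right after the tool returns, and fall back to measuring in `finally` only when the `try` never got that far. `timeToolCall`, `Timing`, and the injected `now` clock are illustrative names, not code from src/server.ts.

```typescript
type Timing = { durationMs: number; measuredIn: 'try' | 'finally' };

// `now` is injected so the sketch can be exercised with a fake clock.
function timeToolCall(
  run: () => void,
  now: () => number,
  report: (timing: Timing) => void,
): void {
  const startTime = now();
  let toolDurationMs: number | undefined; // hoisted so finally can read it

  try {
    run();
    // Measured immediately after the tool returns, before any bookkeeping.
    toolDurationMs = now() - startTime;
    // ... post-processing such as history recording would happen here ...
  } finally {
    report(
      toolDurationMs !== undefined
        ? { durationMs: toolDurationMs, measuredIn: 'try' }
        : { durationMs: now() - startTime, measuredIn: 'finally' },
    );
  }
}
```

On the success path this reports the tighter measurement. When `run` throws, the assignment is skipped, so the value comes from `finally`, which is the case the author's reply is concerned with.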

@wonderwhy-er wonderwhy-er merged commit 6712a80 into main Jun 5, 2026
2 checks passed