fix: add Promise.race hard timeout to feature flags fetch (#465)#467
Conversation
On Windows + Node 24 / undici 7.x, AbortController.abort() fails to interrupt an in-progress TCP connect — the fetch hangs until the OS-level TCP timeout (~30s). This blocks the MCP initialize response for users on high-latency networks (e.g. Australia). Changes: - fetchFlags(): wrap fetch in Promise.race with hard 3s timeout alongside AbortController, so the JS-level timeout is enforced regardless of whether AbortController actually interrupts the underlying socket - waitForFreshFlags(): add 5s safety timeout so it can never hang indefinitely (protects the new-user onboarding path that awaits this inside the MCP initialize handler) - Lower fetch timeout from 5s to 3s — flags load from cache anyway, a fresh fetch is not worth perceived startup latency - Add test/test-feature-flags-timeout.js with 6 tests covering: AbortController behavior, Promise.race pattern, slow-response servers, new-user onboarding path, and simulated broken AbortController Fixes #465
|
CodeAnt AI is reviewing your PR. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR hardens feature flags fetch timeout behavior by adding a JS-level hard timeout via ChangesFeature Flags Fetch Timeout Hardening
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
test/test-feature-flags-timeout.js (2)
287-300: ⚡ Quick winThis test does not cover the new
waitForFreshFlags()safety timeout.Here the onboarding path awaits
freshFetchPromisedirectly, so it only proves the fetch path returns in ~3s. IffreshFetchPromisenever resolves, the real protection is now thePromise.raceinsidewaitForFreshFlags(), and this test would miss that regression.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/test-feature-flags-timeout.js` around lines 287 - 300, The test currently awaits freshFetchPromise directly and therefore doesn't exercise the waitForFreshFlags() timeout behavior; update the test to call the actual waitForFreshFlags() (or the public method that races freshFetchPromise with the timeout) instead of awaiting freshFetchPromise, e.g. trigger the fire-and-forget fetch via currentFetchPattern/invoke initialize() and then await waitForFreshFlags() so the Promise.race timeout path is exercised and the test will fail if the safety timeout doesn't resolve.
96-113: ⚡ Quick winThis copied helper has already drifted from production.
currentFetchPattern()clearsabortTimeouton both success and failure, whileFeatureFlagManager.fetchFlags()only clears it on success and never clears the hard-timeout timer. That means this suite can stay green while the shipped path still leaks timers. Please share the timeout wrapper or exercise the real manager instead of copying the logic here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/test-feature-flags-timeout.js` around lines 96 - 113, The test helper currentFetchPattern() diverges from production behavior in FeatureFlagManager.fetchFlags() by clearing abortTimeout on both paths and not reusing the shared timeout wrapper, which masks a timer leak; to fix, stop duplicating logic and either import/use the shared timeout wrapper used by FeatureFlagManager or invoke FeatureFlagManager.fetchFlags() in the test instead of currentFetchPattern(), ensuring the test exercises the real behavior (i.e., match how abortTimeout and the hard-timeout are handled in production and do not silently clear the hard-timeout timer if production doesn't).src/utils/feature-flags.ts (1)
122-123: ⚡ Quick winCancel or
unref()thewaitForFreshFlags()safety timer.If
freshFetchPromiseresolves immediately, this 5s timer still stays active. On the initialize path that can keep the process open after the response is already sent.Suggested fix
async waitForFreshFlags(): Promise<void> { if (this.freshFetchPromise) { - const safetyTimeout = new Promise<void>((resolve) => setTimeout(resolve, 5000)); - await Promise.race([this.freshFetchPromise, safetyTimeout]); + let timeoutHandle: NodeJS.Timeout | undefined; + const safetyTimeout = new Promise<void>((resolve) => { + timeoutHandle = setTimeout(resolve, 5000); + timeoutHandle.unref(); + }); + try { + await Promise.race([this.freshFetchPromise, safetyTimeout]); + } finally { + if (timeoutHandle) { + clearTimeout(timeoutHandle); + } + } } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/feature-flags.ts` around lines 122 - 123, The safety timer created as safetyTimeout (new Promise with setTimeout) keeps the Node timer active even when this.freshFetchPromise resolves; change the implementation to create the timeout with setTimeout (store the returned timer id), call timer.unref() if available to avoid keeping the event loop alive, and after awaiting Promise.race([this.freshFetchPromise, safetyTimeoutPromise]) clearTimeout(timer) (or otherwise cancel the timer) so the timer is not left running; update the code paths around waitForFreshFlags / the initialization that reference this.freshFetchPromise to use this cancelable/unrefable timer.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/utils/feature-flags.ts`:
- Around line 160-182: The fetch timeout logic leaks timers; store the hard
timeout handle and move cleanup into a finally block so both timers are always
cleared: capture the return value of setTimeout(...) for hardTimeout, wrap the
Promise.race([fetchPromise, hardTimeout]) await in try/catch/finally (or
try/finally) and in finally call clearTimeout(abortTimeout) and
clearTimeout(hardTimeout) (and if desired call controller.abort() there), and
apply the same pattern to waitForFreshFlags() to ensure no timer handles remain
active on error/timeout paths; refer to FETCH_TIMEOUT_MS, controller,
abortTimeout, hardTimeout, fetchPromise, Promise.race and waitForFreshFlags() to
locate the changes.
---
Nitpick comments:
In `@src/utils/feature-flags.ts`:
- Around line 122-123: The safety timer created as safetyTimeout (new Promise
with setTimeout) keeps the Node timer active even when this.freshFetchPromise
resolves; change the implementation to create the timeout with setTimeout (store
the returned timer id), call timer.unref() if available to avoid keeping the
event loop alive, and after awaiting Promise.race([this.freshFetchPromise,
safetyTimeoutPromise]) clearTimeout(timer) (or otherwise cancel the timer) so
the timer is not left running; update the code paths around waitForFreshFlags /
the initialization that reference this.freshFetchPromise to use this
cancelable/unrefable timer.
In `@test/test-feature-flags-timeout.js`:
- Around line 287-300: The test currently awaits freshFetchPromise directly and
therefore doesn't exercise the waitForFreshFlags() timeout behavior; update the
test to call the actual waitForFreshFlags() (or the public method that races
freshFetchPromise with the timeout) instead of awaiting freshFetchPromise, e.g.
trigger the fire-and-forget fetch via currentFetchPattern/invoke initialize()
and then await waitForFreshFlags() so the Promise.race timeout path is exercised
and the test will fail if the safety timeout doesn't resolve.
- Around line 96-113: The test helper currentFetchPattern() diverges from
production behavior in FeatureFlagManager.fetchFlags() by clearing abortTimeout
on both paths and not reusing the shared timeout wrapper, which masks a timer
leak; to fix, stop duplicating logic and either import/use the shared timeout
wrapper used by FeatureFlagManager or invoke FeatureFlagManager.fetchFlags() in
the test instead of currentFetchPattern(), ensuring the test exercises the real
behavior (i.e., match how abortTimeout and the hard-timeout are handled in
production and do not silently clear the hard-timeout timer if production
doesn't).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 11a6c6fc-d58c-4f7d-aa6e-5ffe16f39b73
📒 Files selected for processing (2)
src/utils/feature-flags.tstest/test-feature-flags-timeout.js
Tested locally — fix confirmed workingThanks for the rapid turnaround on this. Tested the
1.2 s is effectively at parity with the "WiFi off" baseline from the original issue (~1.6 s) — meaning the feature flags fetch no longer blocks the Three side observations from the test run1. The Worth flagging because it's in the issue body and anyone hitting the same problem will try it. Pointing This actually reinforces the case for the code-level fix: a userspace workaround that relies on socket-level fail-fast behavior is platform-dependent and unreliable. 2. The bug reproduces on Node 22.20 (system), not only Node 24.15 (DXT-bundled). My original report was based on the Node bundled in Claude Desktop's DXT runtime (Node 24.15 / undici 7.24.4). For this test, 3. Minor unrelated bug: After "args": [
"C:\\\\MyProjects\\\\DesktopCommanderMCP\\\\dist\\\\index.js"
]That's quadruple backslashes in JSON, which JSON-parses to Likely the setup script applies something like LogsHappy to attach the three logs as artifacts if useful — the relevant deltas are the timestamps between Thanks again for prioritising this so fast. |
|
@e-grn thank you for reporting, investigating and testing the solution! Merged, will go in to next release this week. |
User description
Problem
On Windows + Node 24 / undici 7.x,
AbortController.abort()fails to interrupt an in-progress TCP connect. The fetch hangs until the OS-level TCP timeout (~30s on Windows), blocking the MCP initialize response for users on high-latency networks.Reported by a user in Australia — every cold start of Desktop Commander blocks for ~30s. With WiFi disabled (DNS fails fast), startup completes in ~1.6s.
See #465 for full details and logs.
Root Cause
Two paths where the slow fetch can block startup:
fetchFlags()usesAbortControllerwith a 5s timeout, but on affected platforms the abort signal doesn't interrupt the TCP socket — it hangs for ~30swaitForFreshFlags()(called in the MCP initialize handler for new users without a cache) awaits the fetch promise with no upper boundFix
fetchFlags(): Wrap fetch inPromise.racewith a hard 3s timeout alongside AbortController. Even if AbortController fails to interrupt the socket,Promise.raceensures rejection at the JS level.waitForFreshFlags(): Add 5s safety timeout so it can never hang indefinitely.Tests
Added
test/test-feature-flags-timeout.jswith 6 tests:waitForFreshFlags()with hanging fetchNote: Tests pass on macOS where AbortController works correctly. The broken AbortController simulation (test 6) proves the fix works regardless of platform. Needs testing on Windows to confirm the real-world fix.
Fixes #465
CodeAnt-AI Description
Prevent feature flags fetch from blocking startup
What Changed
Impact
✅ Faster MCP startup on slow networks✅ Fewer startup hangs for new users✅ Clearer protection against stalled feature-flag fetches🔄 Retrigger CodeAnt AI Review
Details
💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.
Summary by CodeRabbit
Bug Fixes
Tests