Refactor file search by wonderwhy-er · Pull Request #229 · wonderwhy-er/DesktopCommanderMCP

wonderwhy-er · 2025-08-25T13:16:41Z

Summary by CodeRabbit

New Features
- Session-based streaming search: start_search, get_more_search_results, stop_search, list_searches — progressive results, pagination and cancellation.
Refactor
- Replaces one-shot search with streaming sessions and clearer options (files vs content, case/includeHidden/context defaults, pagination).
Chores
- File-search now prefers a session-backed engine with a Node.js fallback; usage tracking updated for new tools.
Tests
- Tests migrated to drive streaming sessions, poll for completion, and ensure session cleanup.

… negative reading

coderabbitai · 2025-08-25T13:16:49Z

Walkthrough

Adds a session-based streaming search subsystem: new search-manager and search-handlers, replaces one-shot search_files/search_code with start/get_more/stop/list search tools, updates schemas and server routing, adapts filesystem search to prefer ripgrep sessions with a NodeJS fallback, and migrates tests to the streaming API.

Changes

Cohort / File(s)	Summary
Handlers — search refactor `src/handlers/edit-search-handlers.ts`, `src/handlers/filesystem-handlers.ts`, `src/handlers/index.ts`, `src/handlers/search-handlers.ts`	Removes legacy `handleSearchCode` and `handleSearchFiles`; adds streaming handlers `handleStartSearch`, `handleGetMoreSearchResults`, `handleStopSearch`, `handleListSearches`; exposes `search-handlers` via barrel export; updates handler exports.
Search manager `src/search-manager.ts`	New session-based ripgrep orchestrator (`SearchManager` singleton) with start/read/terminate/list APIs, pagination/tail reads, process wiring/parse logic, auto-cleanup, timeouts, and telemetry.
Server wiring `src/server.ts`	Replaces tools `search_files`/`search_code` with `start_search`/`get_more_search_results`; adds `stop_search`/`list_searches`; swaps input schemas and routes calls to new handlers.
Schemas `src/tools/schemas.ts`	Removes `SearchFilesArgsSchema`/`SearchCodeArgsSchema`; adds `StartSearchArgsSchema`, `GetMoreSearchResultsArgsSchema`, `StopSearchArgsSchema`, `ListSearchesArgsSchema` with renamed/expanded fields (e.g., `searchType`, `timeout_ms`, defaults).
Tools / Filesystem `src/tools/filesystem.ts`, `src/tools/search.ts`	`searchFiles` now prefers `search-manager` sessions and falls back to a NodeJS recursive search; telemetry updated to indicate ripgrep vs fallback; `searchTextInFiles` adds ripgrep-fallback logging and adjusts exclude dirs.
Usage tracking `src/utils/usageTracker.ts`	Updates `TOOL_CATEGORIES` to include new search tool names (`start_search`, `get_more_search_results`, `stop_search`, `list_searches`) under searchOperations.
Tests (streaming migration) `test/test-search-code-edge-cases.js`, `test/test-search-code.js`, `test/test_improved_search_truncation.js`, `test/test_search_truncation.js`	Tests migrated from single-shot API to streaming API: use `handleStartSearch`, poll via `handleGetMoreSearchResults`, call `handleStopSearch`, add session cleanup and teardown, expanded scenarios, and updated assertions/imports to new handlers.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Server
  participant Handlers as search-handlers
  participant Manager as searchManager
  participant RG as ripgrep process

  Client->>Server: start_search(args)
  Server->>Handlers: handleStartSearch(args)
  Handlers->>Manager: startSearch(options)
  Manager->>RG: spawn ripgrep (stream JSON)
  RG-->>Manager: streaming lines
  Manager-->>Handlers: { sessionId, initial results, status }
  Handlers-->>Server: ServerResult (text summary)
  Server-->>Client: sessionId + initial summary

  loop polling
    Client->>Server: get_more_search_results({ sessionId, offset, length })
    Server->>Handlers: handleGetMoreSearchResults(...)
    Handlers->>Manager: readSearchResults(sessionId, ...)
    Manager-->>Handlers: { results, hasMore, isComplete }
    Handlers-->>Server: ServerResult (paged summary)
    Server-->>Client: incremental results/status
  end

  alt stop requested
    Client->>Server: stop_search({ sessionId })
    Server->>Handlers: handleStopSearch(...)
    Handlers->>Manager: terminateSearch(sessionId)
    Manager-->>Handlers: terminated status
    Handlers-->>Server: confirmation
    Server-->>Client: stopped message
  end

sequenceDiagram
  autonumber
  actor Client
  participant Server
  participant Handlers as search-handlers
  participant Manager as searchManager

  Client->>Server: list_searches()
  Server->>Handlers: handleListSearches()
  Handlers->>Manager: listSearchSessions()
  Manager-->>Handlers: sessions summary
  Handlers-->>Server: ServerResult (list)
  Server-->>Client: sessions list

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Improve prompts for cross os work #170 — Modifies/search-related handlers; likely overlaps with removed handleSearchCode and new search subsystem.
Allow reading urls and some cleanup #63 — Touches the same search handlers and utilities; likely conflicts with the streaming-search refactor.

Suggested labels

size:L

Suggested reviewers

serg33v
dmitry-ottic-ai

Poem

I thump my paw and start the quest,
Sessions hum and ripgrep does its best.
I fetch some bits, then peek some more,
Stop, list, and tidy up the store.
Hooray — the burrow’s search is blessed! 🥕🐇

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch refactor-file-search

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbit in a new review comment at the desired location with your query.
PR comments: Tag @coderabbit in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbit gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbit read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbit help to get the list of available commands.

Other keywords and placeholders

Add @coderabbit ignore or @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbit summary or @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbit or @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 10

🧹 Nitpick comments (25)

src/tools/search.ts (2)
270-281: Telemetry on ripgrep fallback: OK — ensure messages are sanitized.

The new capture('searchTextInFiles_ripgrep_fallback', { error }) is helpful. Keep error payloads strictly to sanitized strings (no file paths, patterns, or user content) to avoid accidental PII/log leakage. Given other capture calls, this is a minor consistency reminder.

277-281: Behavior drift: fallback excludes "dist" but primary ripgrep path does not.

This will produce different results depending on the path taken. Either exclude dist consistently in ripgrep or stop excluding it in the fallback. Easiest immediate fix: remove "dist" from the fallback exclude list.

Apply this minimal diff:
-    return searchCodeFallback({
-      ...options,
-      excludeDirs: ['node_modules', '.git', 'dist']
-    });
+    return searchCodeFallback({
+      ...options,
+      excludeDirs: ['node_modules', '.git']
+    });
Alternative (longer-term): add a consistent exclude to ripgrep with a glob like -g '!dist/**', or thread an excludeDirs option through searchCode so both code paths honor the same exclusions.
src/handlers/edit-search-handlers.ts (1)

13-13: Export-only module for edit handler — LGTM.

This simplifies responsibility after moving search handling to search-handlers.

Consider renaming this file to edit-handlers.ts to better reflect its current scope and reduce confusion with the new search-handlers.ts.
test/test_search_truncation.js (3)
10-16: Make sessionId extraction regex tighter and more robust.

Current pattern is greedy and may capture unintended trailing text if the header line format changes.

Use a stricter capture:
-  const sessionIdMatch = result.content[0].text.match(/Started .+ session: (.+)/);
+  const sessionIdMatch = result.content[0].text.match(/Started .* session:\s*([a-zA-Z0-9_-]+)/);
69-77: Use a single constant for the API limit to avoid drift.

You’re hardcoding 1,000,000 here while other tests use 1,048,576. Centralize to one value for consistency and future tweaks.

Suggested change within this test:
-        if (combinedLength > 1000000) {
+        const apiLimit = 1048576; // 1 MiB
+        if (combinedLength > apiLimit) {
             console.log('⚠️  Combined results quite large - may need truncation handling');
-        } else if (finalResult.content[0].text.includes('Results truncated')) {
+        } else if (finalResult.content[0].text.includes('Results truncated')) {
7-8: Reduce flakiness by relaxing the timeout for large repos.

A 10s overall timeout with 100ms polling might be tight on CI or larger trees. Consider 20–30s.

Example:
-async function searchAndWaitForCompletion(searchArgs, timeout = 10000) {
+async function searchAndWaitForCompletion(searchArgs, timeout = 30000) {
Also applies to: 20-33
test/test_improved_search_truncation.js (3)
11-16: Tighten the sessionId regex.

Avoid greedy capture to keep just the session ID token.
-  const sessionIdMatch = result.content[0].text.match(/Started .+ session: (.+)/);
+  const sessionIdMatch = result.content[0].text.match(/Started .* session:\s*([a-zA-Z0-9_-]+)/);
70-79: Good diagnostics and safety analysis. Minor polish suggestion.

You already use apiLimit = 1048576. Consider reusing it for the earlier thresholds to keep a single source of truth (e.g., warn if totalLength > apiLimit, caution if > 0.8 * apiLimit).

Example:
-        if (totalLength > 1000000) {
+        if (totalLength > apiLimit) {
             console.log('❌ Results still quite large - over 1MB combined');
-        } else if (totalLength > 800000) {
+        } else if (totalLength > Math.floor(0.8 * apiLimit)) {
             console.log('⚠️  Results approaching limits but acceptable');
         } else {
             console.log('✅ Results well within safe limits');
         }
Also applies to: 93-102

7-8: Bump timeout to reduce CI flakiness.

Same rationale as the sibling test — consider increasing to 30s.
-async function searchAndWaitForCompletion(searchArgs, timeout = 10000) {
+async function searchAndWaitForCompletion(searchArgs, timeout = 30000) {
Also applies to: 20-33
src/search-manager.ts (4)
368-387: Context lines are counted as matches, inflating totals.

processBufferedOutput increments totalMatches for both match and context records. That makes “Total results” ambiguous and higher than actual match count.

Recommendation: track two counters (e.g., matchCount and displayCount), and ensure handlers label them clearly (e.g., “Total matches” vs “Lines output (with context)”). Keep displayCount for pagination math if needed.

Would you like a small patch to separate counters and update the handlers’ summary strings accordingly?

4-4: Cyclic dependency risk with validatePath import.

This module imports validatePath from ./tools/filesystem.js, while src/tools/filesystem.ts dynamically imports ../search-manager.js. Although the dynamic import mitigates hard cycles at load time, it’s fragile and easy to regress.

Suggestion: extract validatePath (and path-allowlisting helpers) into a standalone path-utils.ts used by both modules to break the cycle cleanly.

98-105: Timeout cleanup: clear the termination timer on process exit.

You arm a setTimeout to kill ripgrep, but never clear it. If the process exits early, the timer still fires later (harmless but noisy).

Add a handle to the timer and clear it in both close and error handlers.
@@
-    if (options.timeout) {
-      setTimeout(() => {
+    let killTimer: NodeJS.Timeout | null = null;
+    if (options.timeout) {
+      killTimer = setTimeout(() => {
         if (!session.isComplete && !session.process.killed) {
           session.process.kill('SIGTERM');
         }
       }, options.timeout);
     }
@@
-    process.on('close', (code: number) => {
+    process.on('close', (code: number) => {
+      if (killTimer) { clearTimeout(killTimer); killTimer = null; }
@@
-    process.on('error', (error: Error) => {
+    process.on('error', (error: Error) => {
+      if (killTimer) { clearTimeout(killTimer); killTimer = null; }
Also applies to: 322-349

389-427: Lossy multi-match handling per line.

For parsed.type === 'match', only the first submatch is returned (submatches[0]). Lines with multiple occurrences are collapsed to one result.

If downstream consumers care about every occurrence, emit one result per submatch (or include submatches in the payload). Otherwise, document that matches are at most one per line.
src/tools/filesystem.ts (1)
971-1026: Fallback Node.js search semantics differ (no glob support, heavy per-file validatePath).

The fallback uses name.includes(pattern) (substring) so *.js won’t work. Consider supporting simple globs and | splitting for parity with ripgrep.

Calling validatePath(fullPath) for every entry is expensive. Given you validated rootPath and you remain inside it, you can:

avoid per-file validation, or

at least skip following symlinks (use lstat + isSymbolicLink) to prevent escaping the allowed tree.

Optional parity improvement:
-                if (entry.name.toLowerCase().includes(pattern.toLowerCase())) {
+                const patterns = pattern.split('|').map(p => p.trim()).filter(Boolean);
+                const matches = patterns.length
+                  ? patterns.some(glob => minimatch(entry.name, glob, { nocase: true }))
+                  : entry.name.toLowerCase().includes(pattern.toLowerCase());
+                if (matches) {
                     results.push(fullPath);
                 }
You’ll need minimatch (or implement a tiny glob matcher). If you want, I can wire this with zero new deps.
test/test-search-code.js (1)
526-543: Tests reach into private internals (searchManager.sessions?.clear?.()), increasing coupling.

Prefer a public shutdown/reset API to keep tests decoupled from private state. For example, expose:
// in src/search-manager.ts
export function shutdownSearchManager() {
  const sessions = searchManager.listSearchSessions();
  for (const s of sessions) searchManager.terminateSearch(s.id);
  stopSearchManagerCleanup();
  // Optional: expose a safe clear method on SearchManager guarded for tests
  (searchManager as any).clearAllSessions?.();
}
Or add a clearAllSessions() method to SearchManager under a __TEST__ guard.
test/test-search-code-edge-cases.js (3)

261-275: Concurrent searches test doesn’t assert completion.

You start multiple searches concurrently but only check the initial responses. Consider driving them to completion (via searchAndWaitForCompletion) to validate concurrency in the streaming pipeline.

285-300: Very short timeout test: broaden assertions to accept graceful termination text.

Depending on the exact ripgrep exit code and handler logic, the initial start text may still be “running”. It’s safer to assert for “Status: RUNNING” or a termination hint (e.g., “terminated”/“completed”) in addition to current checks.

328-340: Zero max results semantics are unclear.

Passing maxResults: 0 currently results in no -m flag (effectively “unlimited”). If “0 means unrestricted” is the intended contract, add an assertion here that ensures results can still be returned; otherwise, tighten the schema and handler to reject zero.

src/handlers/index.ts (1)

6-6: Barrel export looks good. Verify deprecation of legacy search handlers.

You still export ./edit-search-handlers.js. If the legacy handlers are deprecated, consider removing that export to avoid accidental use; keep it if you need backward compatibility.
src/tools/schemas.ts (1)
113-117: Constrain pagination types and semantics

Make intent explicit and prevent accidental zero/negative lengths. Also clarify that offset supports negative values.
 export const GetMoreSearchResultsArgsSchema = z.object({
   sessionId: z.string(),
-  offset: z.number().optional().default(0),    // Same as file reading
-  length: z.number().optional().default(100),  // Same as file reading (but smaller default)
+  offset: z.number().int().default(0),         // Negative = tail behavior
+  length: z.number().int().positive().default(100),
 });
src/handlers/search-handlers.ts (3)
95-98: Avoid redundant || fallbacks; rely on Zod defaults

safeParse with .default() already provides values. || masks intentional zeroes and complicates negatives.
-    const results = searchManager.readSearchResults(
-      parsed.data.sessionId,
-      parsed.data.offset || 0,
-      parsed.data.length || 100
-    );
+    const results = searchManager.readSearchResults(
+      parsed.data.sessionId,
+      parsed.data.offset,
+      parsed.data.length
+    );
@@
-    const offset = parsed.data.offset || 0;
+    const offset = parsed.data.offset;
Also applies to: 116-117

14-14: Prefer unknown over any for handler inputs

You’re validating with Zod; accept unknown and keep any out of the type surface.
-export async function handleStartSearch(args: any): Promise<ServerResult> {
+export async function handleStartSearch(args: unknown): Promise<ServerResult> {
@@
-export async function handleGetMoreSearchResults(args: any): Promise<ServerResult> {
+export async function handleGetMoreSearchResults(args: unknown): Promise<ServerResult> {
@@
-export async function handleStopSearch(args: any): Promise<ServerResult> {
+export async function handleStopSearch(args: unknown): Promise<ServerResult> {
Optionally, define precise types via z.infer<typeof Schema> after Zod parse.

Also applies to: 84-84, 172-172

2-7: Unused import

ListSearchesArgsSchema isn’t used in this module.

Remove it from the import list to keep the file tidy.
-  StopSearchArgsSchema,
-  ListSearchesArgsSchema
+  StopSearchArgsSchema
src/server.ts (2)
304-331: Tool doc: mention that initial results are included on start

The handler prints initial results (up to 10) on start. The description says it “returns immediately with a session ID”, which might imply no results. Clarify that it may include initial results and status.
-                        immediately with a session ID. Use get_more_search_results to get results as they
+                        immediately with a session ID and may include initial results. Use get_more_search_results to get results as they
808-822: Routing: new search tools wired correctly

Switch cases call the new handlers. Consider temporary aliases for removed tools to ease migration (return a helpful error pointing to the new API).

Example:
             default:
                 capture('server_unknown_tool', {name});
                 result = {
                     content: [{type: "text", text: `Error: Unknown tool: ${name}`}],
                     isError: true,
                 };
could be extended as:
+            case "search_files":
+            case "search_code":
+                result = {
+                    content: [{
+                        type: "text",
+                        text: `The tools "${name}" have been replaced by start_search/get_more_search_results/stop_search/list_searches. Please migrate.`
+                    }],
+                    isError: true,
+                };
+                break;

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a7e10a7 and 02ed3f9.

📒 Files selected for processing (14)

src/handlers/edit-search-handlers.ts (1 hunks)
src/handlers/filesystem-handlers.ts (0 hunks)
src/handlers/index.ts (1 hunks)
src/handlers/search-handlers.ts (1 hunks)
src/search-manager.ts (1 hunks)
src/server.ts (3 hunks)
src/tools/filesystem.ts (2 hunks)
src/tools/schemas.ts (1 hunks)
src/tools/search.ts (1 hunks)
src/utils/usageTracker.ts (1 hunks)
test/test-search-code-edge-cases.js (19 hunks)
test/test-search-code.js (17 hunks)
test/test_improved_search_truncation.js (1 hunks)
test/test_search_truncation.js (1 hunks)

💤 Files with no reviewable changes (1)

src/handlers/filesystem-handlers.ts

🧰 Additional context used

🧬 Code graph analysis (11)

src/tools/search.ts (1)

src/utils/capture.ts (1)

capture (277-284)

src/tools/filesystem.ts (3)

src/search-manager.ts (1)

searchManager (431-431)

src/utils/capture.ts (1)

capture (277-284)

src/handlers/filesystem-handlers.ts (1)

handleSearchFiles (244-281)

src/handlers/search-handlers.ts (4)

src/types.ts (1)

ServerResult (51-55)

src/tools/schemas.ts (3)

StartSearchArgsSchema (101-111)

GetMoreSearchResultsArgsSchema (113-117)

StopSearchArgsSchema (119-121)

src/search-manager.ts (1)

searchManager (431-431)

src/utils/capture.ts (1)

capture (277-284)

test/test_search_truncation.js (1)

src/handlers/search-handlers.ts (3)

handleStartSearch (14-79)

handleGetMoreSearchResults (84-167)

handleStopSearch (172-207)

src/search-manager.ts (3)

src/tools/search.ts (1)

SearchResult (9-13)

src/tools/filesystem.ts (1)

validatePath (226-281)

src/utils/capture.ts (1)

capture (277-284)

src/handlers/edit-search-handlers.ts (2)

src/tools/edit.ts (3)

handleEditBlock (345-354)

performSearchReplace (96-300)

SearchReplace (12-15)

test/test-edit-block-occurrences.js (1)

testEmptySearchString (282-306)

src/tools/schemas.ts (1)

src/tools/edit.ts (2)

performSearchReplace (96-300)

SearchReplace (12-15)

test/test_improved_search_truncation.js (1)

src/handlers/search-handlers.ts (3)

handleStartSearch (14-79)

handleGetMoreSearchResults (84-167)

handleStopSearch (172-207)

src/server.ts (1)

src/tools/schemas.ts (4)

StartSearchArgsSchema (101-111)

GetMoreSearchResultsArgsSchema (113-117)

StopSearchArgsSchema (119-121)

ListSearchesArgsSchema (123-123)

test/test-search-code-edge-cases.js (3)

test/test-search-code.js (33)

searchAndWaitForCompletion (203-207)

searchAndWaitForCompletion (227-232)

searchAndWaitForCompletion (247-252)

searchAndWaitForCompletion (269-274)

searchAndWaitForCompletion (291-296)

searchAndWaitForCompletion (319-324)

searchAndWaitForCompletion (344-349)

searchAndWaitForCompletion (369-374)

searchAndWaitForCompletion (393-397)

searchAndWaitForCompletion (475-479)

result (34-34)

result (415-419)

result (442-445)

result (455-458)

sessionIdMatch (37-37)

sessionId (41-41)

sessionId (165-165)

startTime (45-45)

moreResults (47-47)

text (212-212)

text (234-234)

text (254-254)

text (276-276)

text (301-301)

text (326-326)

text (351-351)

text (379-379)

text (402-402)

text (423-423)

text (446-446)

text (459-459)

text (481-481)

isValidResponse (424-424)

src/handlers/search-handlers.ts (4)

handleStartSearch (14-79)

handleGetMoreSearchResults (84-167)

handleStopSearch (172-207)

handleListSearches (212-248)

src/search-manager.ts (2)

searchManager (431-431)

stopSearchManagerCleanup (459-464)

test/test-search-code.js (3)

test/test-search-code-edge-cases.js (22)

result (30-30)

result (285-290)

result (309-314)

result (328-333)

result (348-353)

result (370-374)

sessionIdMatch (33-33)

sessionId (37-37)

sessionId (125-125)

startTime (41-41)

startTime (234-234)

text (164-164)

text (184-184)

text (202-202)

text (220-220)

text (246-246)

text (293-293)

text (335-335)

text (356-356)

text (378-378)

text (419-419)

text (451-451)

src/handlers/search-handlers.ts (4)

handleStartSearch (14-79)

handleGetMoreSearchResults (84-167)

handleStopSearch (172-207)

handleListSearches (212-248)

src/search-manager.ts (2)

searchManager (431-431)

stopSearchManagerCleanup (459-464)

🔇 Additional comments (7)

test/test-search-code.js (1)
470-485: Strengthen file-search test to assert exclusions

File: test/test-search-code.js, lines 470–485

Add negative assertions to ensure only .js files are returned and catch any regression in the glob‐filtering logic:
   const text = finalResult.content[0].text;
   assert(text.includes('test1.js'), 'Should find JavaScript files');
+  assert(!text.includes('test2.ts'),  'Should not list non-matching TypeScript files');
+  assert(!text.includes('nested.py'), 'Should not list non-matching Python files');
Please verify that using pattern: '*.js' is correctly applied in buildRipgrepArgs and that finalResult.content[0].text reflects the complete list of matched files so these negative checks will fail if non‐matching files slip through.
src/tools/schemas.ts (2)

98-99: Empty feedback schema: OK to leave empty

This matches the tool description that the form is user-filled. No issues.

119-123: Stop/List search schemas look good

Schemas are minimal and consistent with tool wiring.
src/handlers/search-handlers.ts (1)
23-34: No changes needed: startSearch arguments conform to SearchSessionOptions

I’ve verified that SearchSessionOptions is defined as:
export interface SearchSessionOptions {
  rootPath: string;
  pattern: string;
  searchType: 'files' | 'content';
  filePattern?: string;
  ignoreCase?: boolean;
  maxResults?: number;
  includeHidden?: boolean;
  contextLines?: number;
  timeout?: number;
}
And the handler passes exactly those keys—rootPath, pattern, searchType, filePattern, ignoreCase, maxResults, includeHidden, contextLines, and timeout—to startSearch. No renaming or adjustments are required.
src/server.ts (3)

42-46: Schema imports: OK

New search schemas are correctly imported and used below.

333-357: Get-more tool doc aligns with handler behavior

Offset/length semantics and tail behavior match the implementation.

374-383: List tool doc consistent

Docs reflect the output built in handleListSearches.

coderabbitai · 2025-08-25T13:29:36Z

+      for (const searchResult of result.results.slice(0, 10)) {
+        if (searchResult.type === 'content') {
+          output += `📄 ${searchResult.file}:${searchResult.line} - ${searchResult.match?.substring(0, 100)}${searchResult.match && searchResult.match.length > 100 ? '...' : ''}\n`;
+        } else {
+          output += `📁 ${searchResult.file}\n`;
+        }
+      }


🛠️ Refactor suggestion

Guard against undefined match text to avoid “undefined” in output

When match is missing, template interpolation prints “undefined”. Use a safe default and compute ellipsis robustly.

- for (const searchResult of result.results.slice(0, 10)) { - if (searchResult.type === 'content') { - output += `📄 ${searchResult.file}:${searchResult.line} - ${searchResult.match?.substring(0, 100)}${searchResult.match && searchResult.match.length > 100 ? '...' : ''}\n`; - } else { - output += `📁 ${searchResult.file}\n`; - } - } + for (const searchResult of result.results.slice(0, 10)) { + if (searchResult.type === 'content') { + const raw = searchResult.match ?? ''; + const snippet = raw.slice(0, 100); + const ellipsis = raw.length > 100 ? '...' : ''; + output += `📄 ${searchResult.file}:${searchResult.line} - ${snippet}${ellipsis}\n`; + } else { + output += `📁 ${searchResult.file}\n`; + } + }

- for (const result of results.results) { - if (result.type === 'content') { - output += `📄 ${result.file}:${result.line} - ${result.match?.substring(0, 100)}${result.match && result.match.length > 100 ? '...' : ''}\n`; - } else { - output += `📁 ${result.file}\n`; - } - } + for (const result of results.results) { + if (result.type === 'content') { + const raw = result.match ?? ''; + const snippet = raw.slice(0, 100); + const ellipsis = raw.length > 100 ? '...' : ''; + output += `📄 ${result.file}:${result.line} - ${snippet}${ellipsis}\n`; + } else { + output += `📁 ${result.file}\n`; + } + }

Also applies to: 137-143

🤖 Prompt for AI Agents

In src/handlers/search-handlers.ts around lines 48-54 (and similarly at 137-143), the template interpolation can print "undefined" when searchResult.match is missing; change the logic to first coerce match to a safe string (e.g. const matchText = (searchResult.match ?? '') as string), compute a boolean for truncation using matchText.length > 100, and then use matchText.substring(0, 100) plus '...' only when truncated; update both locations so they never interpolate undefined and the ellipsis is added only when the actual string is longer than the limit.

coderabbitai · 2025-08-25T13:29:36Z

+    const errorMessage = error instanceof Error ? error.message : String(error);
+    capture('search_session_start_error', { error: errorMessage });
+


⚠️ Potential issue

Telemetry may leak PII/file paths via raw error messages

capture('search_session_start_error', { error: errorMessage }) can include absolute paths, patterns, or environment details. The capture pipeline sends to Google Analytics (see src/utils/capture.ts lines 276-283), which risks unintended data exfiltration.

Harden by redacting the message and only sending error class + coarse flags:

} catch (error) { const errorMessage = error instanceof Error ? error.message : String(error); - capture('search_session_start_error', { error: errorMessage }); + const errorName = error instanceof Error ? error.name : 'UnknownError'; + const containsPathLikeData = /[\\/][^ \n\r\t]{1,}([\\/][^ \n\r\t]{1,})+/.test(errorMessage); + capture('search_session_start_error', { error_name: errorName, contains_path_like_data: containsPathLikeData });

Optionally log the full error locally (not to telemetry) if needed. Do not send raw strings off-box.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const errorMessage = error instanceof Error ? error.message : String(error);

capture('search_session_start_error', { error: errorMessage });

} catch (error) {

const errorMessage = error instanceof Error ? error.message : String(error);

const errorName = error instanceof Error ? error.name : 'UnknownError';

const containsPathLikeData = /[\\/][^ \n\r\t]{1,}([\\/][^ \n\r\t]{1,})+/.test(errorMessage);

capture('search_session_start_error', { error_name: errorName, contains_path_like_data: containsPathLikeData });

// …any additional error handling…

}

🤖 Prompt for AI Agents

In src/handlers/search-handlers.ts around lines 71-73, sending the raw error message to telemetry can leak PII or file paths; instead redact the message before sending by replacing it with the error class/name and coarse flags (e.g., { type: error.constructor?.name || 'Error', isTimeout: boolean, isNetwork: boolean }) and remove or truncate any string details; optionally write the full error (stack/message) to local logs only (not via capture) for debugging; ensure capture('search_session_start_error', ...) no longer includes raw errorMessage but only the sanitized fields.

coderabbitai · 2025-08-25T13:29:36Z

+                    name: "stop_search", 
+                    description: `
+                        Stop an active search.
+
+                        Stops the background search process gracefully. Use this when you've found
+                        what you need or if a search is taking too long. Similar to force_terminate
+                        for terminal processes.
+
+                        The search will still be available for reading final results until it's
+                        automatically cleaned up after 5 minutes.
+
+                        ${CMD_PREFIX_DESCRIPTION}`,
+                    inputSchema: zodToJsonSchema(StopSearchArgsSchema),
+                },


💡 Verification agent

🧩 Analysis chain

Stop tool doc: verify auto-cleanup timing

Doc promises cleanup after 5 minutes. Ensure searchManager actually performs TTL cleanup.

If not implemented, either implement cleanup or adjust the description.

🏁 Script executed:

#!/bin/bash rg -n "cleanup|TTL|setTimeout|setInterval|expire|gc|Garbage|5 ?min" src -C2

Length of output: 11491

🏁 Script executed:

#!/bin/bash sed -n '430,480p' src/search-manager.ts

Length of output: 1122

Stop tool doc: clarify cleanup timing

The implementation does include a periodic TTL cleanup that runs every 5 minutes (via startCleanupIfNeeded → setInterval) and deletes sessions older than 5 minutes (cleanupSessions default maxAge = 5 * 60 * 1000). However, completed (and errored) searches are also removed after just 2 minutes by a separate setTimeout in SearchManager.

Please update the stop_search description in src/server.ts (around lines 366–372) to accurately reflect the two-phase cleanup:

Completed/error sessions auto-clean after ~2 minutes.

A full sweep removes any remaining stale sessions every 5 minutes.

This will keep the documentation in sync with the actual behavior.

🤖 Prompt for AI Agents

In src/server.ts around lines 359 to 372, the stop_search description text is inaccurate about cleanup timing; update the multi-line description so it states that completed or errored searches are removed after approximately 2 minutes (via SearchManager setTimeout), and that a periodic full sweep runs every 5 minutes to clean up any remaining stale sessions (via the TTL cleanup); keep the rest of the wording (including ${CMD_PREFIX_DESCRIPTION}) intact and ensure the new text clearly mentions the two-phase cleanup durations (~2 minutes for completed/errored sessions, 5 minutes for full sweeps).

coderabbitai · 2025-08-25T13:29:37Z

 export async function searchFiles(rootPath: string, pattern: string): Promise<string[]> {
+    // Use the new search manager for better performance
+    // This provides a temporary compatibility layer until we fully migrate to search sessions
+    const { searchManager } = await import('../search-manager.js');
+
+    try {
+        const result = await searchManager.startSearch({
+            rootPath,
+            pattern,
+            searchType: 'files',
+            ignoreCase: true,
+            maxResults: 5000, // Higher limit for compatibility
+        });
+
+        const sessionId = result.sessionId;
+
+        // Poll for results until complete
+        let allResults: string[] = [];
+        let isComplete = result.isComplete;
+        let startTime = Date.now();
+
+        // Add initial results
+        for (const searchResult of result.results) {
+            if (searchResult.type === 'file') {
+                allResults.push(searchResult.file);
+            }
+        }
+
+        while (!isComplete) {
+            await new Promise(resolve => setTimeout(resolve, 100)); // Wait 100ms
+
+            const results = searchManager.readSearchResults(sessionId);
+            isComplete = results.isComplete;
+
+            // Add new file paths to results
+            for (const searchResult of results.results) {
+                if (searchResult.file !== '__LAST_READ_MARKER__' && searchResult.type === 'file') {
+                    allResults.push(searchResult.file);
+                }
+            }
+
+            // Safety check to prevent infinite loops (30 second timeout)
+            if (Date.now() - startTime > 30000) {
+                searchManager.terminateSearch(sessionId);
+                break;
+            }
+        }
+
+        // Log only the count of found files, not their paths
+        capture('server_search_files_complete', {
+            resultsCount: allResults.length,
+            patternLength: pattern.length,
+            usedRipgrep: true
+        });
+
+        return allResults;
+    } catch (error) {
+        // Fallback to original Node.js implementation if ripgrep fails
+        capture('server_search_files_ripgrep_fallback', {
+            error: error instanceof Error ? error.message : 'Unknown error'
+        });
+
+        return await searchFilesNodeJS(rootPath, pattern);
+    }
+}


🛠️ Refactor suggestion

⚠️ Potential issue

Duplicate and truncated results when polling search sessions (offset is never advanced).

searchFiles repeatedly calls searchManager.readSearchResults(sessionId) without offset, which always returns the first page. You then keep appending, causing duplicates and possibly never enumerating beyond the first 100 entries.

Advance the offset and de-duplicate:

export async function searchFiles(rootPath: string, pattern: string): Promise<string[]> { @@ - // Poll for results until complete - let allResults: string[] = []; + // Poll for results until complete + const allResults: string[] = []; + const seen = new Set<string>(); let isComplete = result.isComplete; let startTime = Date.now(); + let offset = 0; + const PAGE = 1000; // Add initial results - for (const searchResult of result.results) { + for (const searchResult of result.results) { if (searchResult.type === 'file') { - allResults.push(searchResult.file); + if (!seen.has(searchResult.file)) { + seen.add(searchResult.file); + allResults.push(searchResult.file); + } } } + offset += result.results.length; while (!isComplete) { await new Promise(resolve => setTimeout(resolve, 100)); // Wait 100ms - - const results = searchManager.readSearchResults(sessionId); - isComplete = results.isComplete; - - // Add new file paths to results - for (const searchResult of results.results) { - if (searchResult.file !== '__LAST_READ_MARKER__' && searchResult.type === 'file') { - allResults.push(searchResult.file); - } - } + const page = searchManager.readSearchResults(sessionId, offset, PAGE); + isComplete = page.isComplete; + // Add new file paths to results + for (const searchResult of page.results) { + if (searchResult.type === 'file' && !seen.has(searchResult.file)) { + seen.add(searchResult.file); + allResults.push(searchResult.file); + } + } + offset += page.returnedCount; @@ if (Date.now() - startTime > 30000) { searchManager.terminateSearch(sessionId); break; } }

Also, __LAST_READ_MARKER__ isn’t produced anywhere; remove related filtering.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export async function searchFiles(rootPath: string, pattern: string): Promise<string[]> {

// Use the new search manager for better performance

// This provides a temporary compatibility layer until we fully migrate to search sessions

const { searchManager } = await import('../search-manager.js');

try {

const result = await searchManager.startSearch({

rootPath,

pattern,

searchType: 'files',

ignoreCase: true,

maxResults: 5000, // Higher limit for compatibility

});

const sessionId = result.sessionId;

// Poll for results until complete

let allResults: string[] = [];

let isComplete = result.isComplete;

let startTime = Date.now();

// Add initial results

for (const searchResult of result.results) {

if (searchResult.type === 'file') {

allResults.push(searchResult.file);

}

}

while (!isComplete) {

await new Promise(resolve => setTimeout(resolve, 100)); // Wait 100ms

const results = searchManager.readSearchResults(sessionId);

isComplete = results.isComplete;

// Add new file paths to results

for (const searchResult of results.results) {

if (searchResult.file !== '__LAST_READ_MARKER__' && searchResult.type === 'file') {

allResults.push(searchResult.file);

}

}

// Safety check to prevent infinite loops (30 second timeout)

if (Date.now() - startTime > 30000) {

searchManager.terminateSearch(sessionId);

break;

}

}

// Log only the count of found files, not their paths

capture('server_search_files_complete', {

resultsCount: allResults.length,

patternLength: pattern.length,

usedRipgrep: true

});

return allResults;

} catch (error) {

// Fallback to original Node.js implementation if ripgrep fails

capture('server_search_files_ripgrep_fallback', {

error: error instanceof Error ? error.message : 'Unknown error'

});

return await searchFilesNodeJS(rootPath, pattern);

}

}

export async function searchFiles(rootPath: string, pattern: string): Promise<string[]> {

// Use the new search manager for better performance

// This provides a temporary compatibility layer until we fully migrate to search sessions

const { searchManager } = await import('../search-manager.js');

try {

const result = await searchManager.startSearch({

rootPath,

pattern,

searchType: 'files',

ignoreCase: true,

maxResults: 5000, // Higher limit for compatibility

});

const sessionId = result.sessionId;

// Poll for results until complete

const allResults: string[] = [];

const seen = new Set<string>();

let isComplete = result.isComplete;

let startTime = Date.now();

let offset = 0;

const PAGE = 1000;

// Add initial results

for (const searchResult of result.results) {

if (searchResult.type === 'file') {

if (!seen.has(searchResult.file)) {

seen.add(searchResult.file);

allResults.push(searchResult.file);

}

}

}

offset += result.results.length;

while (!isComplete) {

await new Promise(resolve => setTimeout(resolve, 100)); // Wait 100ms

const page = searchManager.readSearchResults(sessionId, offset, PAGE);

isComplete = page.isComplete;

// Add new file paths to results

for (const searchResult of page.results) {

if (searchResult.type === 'file' && !seen.has(searchResult.file)) {

seen.add(searchResult.file);

allResults.push(searchResult.file);

}

}

offset += page.returnedCount;

// Safety check to prevent infinite loops (30 second timeout)

if (Date.now() - startTime > 30000) {

searchManager.terminateSearch(sessionId);

break;

}

}

// Log only the count of found files, not their paths

capture('server_search_files_complete', {

resultsCount: allResults.length,

patternLength: pattern.length,

usedRipgrep: true

});

return allResults;

} catch (error) {

// Fallback to original Node.js implementation if ripgrep fails

capture('server_search_files_ripgrep_fallback', {

error: error instanceof Error ? error.message : 'Unknown error'

});

return await searchFilesNodeJS(rootPath, pattern);

}

}

🤖 Prompt for AI Agents

In src/tools/filesystem.ts around lines 905 to 969, the polling loop calls searchManager.readSearchResults(sessionId) without passing an offset so it always returns the first page, causing duplicates and truncation; remove the bogus '__LAST_READ_MARKER__' check, track and advance an offset on each read (initialize offset to the number of items already consumed from the initial result), pass that offset into readSearchResults(sessionId, offset), and append only new file entries while deduplicating (e.g., maintain a Set of seen file paths or check against allResults) so you never re-add the same path; also increment the offset by the number of results returned each poll and keep the existing timeout/termination logic.

coderabbitai · 2025-08-25T13:29:37Z

+export const StartSearchArgsSchema = z.object({
+  path: z.string(),
+  pattern: z.string(),
+  searchType: z.enum(['files', 'content']).default('files'),
+  filePattern: z.string().optional(),
+  ignoreCase: z.boolean().optional().default(true),
+  maxResults: z.number().optional(),
+  includeHidden: z.boolean().optional().default(false),
+  contextLines: z.number().optional().default(5),
+  timeout_ms: z.number().optional(), // Match process naming convention
+});


🛠️ Refactor suggestion

Tighten validation and simplify defaults in StartSearchArgsSchema

Prevent empty strings and invalid ranges.

Remove redundant .optional() before .default(); default() already handles undefined.

Add integer/positive guards to avoid resource spikes and accidental negatives.

Apply:

export const StartSearchArgsSchema = z.object({ - path: z.string(), - pattern: z.string(), + path: z.string().min(1, 'path cannot be empty'), + pattern: z.string().min(1, 'pattern cannot be empty'), searchType: z.enum(['files', 'content']).default('files'), filePattern: z.string().optional(), - ignoreCase: z.boolean().optional().default(true), - maxResults: z.number().optional(), - includeHidden: z.boolean().optional().default(false), - contextLines: z.number().optional().default(5), - timeout_ms: z.number().optional(), // Match process naming convention + ignoreCase: z.boolean().default(true), + maxResults: z.number().int().positive().max(50000).optional(), + includeHidden: z.boolean().default(false), + contextLines: z.number().int().min(0).default(5), + timeout_ms: z.number().int().positive().max(300000).optional(), // cap to 5 minutes });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export const StartSearchArgsSchema = z.object({

path: z.string(),

pattern: z.string(),

searchType: z.enum(['files', 'content']).default('files'),

filePattern: z.string().optional(),

ignoreCase: z.boolean().optional().default(true),

maxResults: z.number().optional(),

includeHidden: z.boolean().optional().default(false),

contextLines: z.number().optional().default(5),

timeout_ms: z.number().optional(), // Match process naming convention

});

export const StartSearchArgsSchema = z.object({

path: z.string().min(1, 'path cannot be empty'),

pattern: z.string().min(1, 'pattern cannot be empty'),

searchType: z.enum(['files', 'content']).default('files'),

filePattern: z.string().optional(),

ignoreCase: z.boolean().default(true),

maxResults: z.number().int().positive().max(50000).optional(),

includeHidden: z.boolean().default(false),

contextLines: z.number().int().min(0).default(5),

timeout_ms: z.number().int().positive().max(300000).optional(), // cap to 5 minutes

});

🤖 Prompt for AI Agents

In src/tools/schemas.ts around lines 101 to 111, tighten validation and simplify defaults for StartSearchArgsSchema: make path and pattern required non-empty strings (e.g., z.string().min(1)), make filePattern optional but if present require non-empty string, remove unnecessary .optional() before .default() on boolean/defaulted fields (use z.boolean().default(...)), enforce integer/positive guards on numeric fields (e.g., maxResults: z.number().int().positive().max(10000).optional(), contextLines: z.number().int().min(0).max(100).default(5), timeout_ms: z.number().int().positive().max(1200000).optional()), and keep searchType enum/default as is; adjust ranges/caps as shown to prevent negatives and resource spikes.

coderabbitai · 2025-08-25T13:29:37Z

+  search: ['start_search', 'get_more_search_results', 'stop_search', 'list_searches'],
  config: ['get_config', 'set_config_value'],
  process: ['list_processes', 'kill_process']
 };


💡 Verification agent

🧩 Analysis chain

New streaming search tools are correctly mapped to searchOperations.

Looks good and aligns with the new session-based API. Recommend scanning the repo for stale references to the old tools to keep analytics clean.

Run this quick check to ensure no lingering old tool names remain:

🏁 Script executed:

#!/bin/bash # Find references to deprecated tool names rg -nP --ignore-case '(search_files|search_code)\b' -g '!dist/**' -S

Length of output: 354

Documentation Cleanup: Remove deprecated search_files and search_code references

We’ve identified two occurrences of the old tool names that should be updated:

FAQ.md (line 271): “Both search_files and search_code have a default 30-second timeout”

README.md (lines 404–405): table entries for search_files and search_code

Please update or remove these references to keep the docs in sync with the new session-based API.

🤖 Prompt for AI Agents

In src/utils/usageTracker.ts around lines 43 to 46, the documentation references to deprecated tools `search_files` and `search_code` are out of sync with the new session-based API; update the docs by removing or replacing those tool names: remove occurrences in FAQ.md (line ~271) and README.md (lines ~404-405) or replace them with the current session-based equivalents and any relevant timeout text, ensuring table entries and explanatory sentences reflect the new API naming and behavior.

coderabbitai · 2025-08-25T13:29:37Z

-// Test script to verify search result truncation
-import { handleSearchCode } from '../dist/handlers/edit-search-handlers.js';
+// Test script to verify search result behavior using new streaming API
+import { handleStartSearch, handleGetMoreSearchResults } from '../dist/handlers/search-handlers.js';


⚠️ Potential issue

ReferenceError: handleStopSearch is used but not imported.

handleStopSearch is called in finally but isn't imported, which will break the test at runtime.

Apply this diff:

-import { handleStartSearch, handleGetMoreSearchResults } from '../dist/handlers/search-handlers.js'; +import { handleStartSearch, handleGetMoreSearchResults, handleStopSearch } from '../dist/handlers/search-handlers.js';

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

import { handleStartSearch, handleGetMoreSearchResults } from '../dist/handlers/search-handlers.js';

import { handleStartSearch, handleGetMoreSearchResults, handleStopSearch } from '../dist/handlers/search-handlers.js';

🤖 Prompt for AI Agents

In test/test_search_truncation.js around line 2, the test calls handleStopSearch in a finally block but the import statement only imports handleStartSearch and handleGetMoreSearchResults; add handleStopSearch to the named import from '../dist/handlers/search-handlers.js' so the function is available at runtime and the test no longer throws a ReferenceError.

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (3)

src/search-manager.ts (3)

293-301: Scope -i to content searches only.

-i has no effect for rg --files and can confuse behavior. Limit it to content mode.

-    // Common options
-    if (options.ignoreCase !== false) {
-      args.push('-i');
-    }
+    // Case-insensitive: only meaningful for content searches
+    if (options.searchType === 'content' && options.ignoreCase !== false) {
+      args.push('-i');
+    }

306-316: filePattern flag: use '-g' for content filters, '--glob' for file mode.

Keeps behavior consistent and avoids option collisions with --files. Also split multiple patterns by '|'.

-    if (options.filePattern) {
+    if (options.filePattern) {
       const patterns = options.filePattern
         .split('|')
         .map(p => p.trim())
         .filter(Boolean);
-      
-      patterns.forEach(pattern => {
-        args.push('-g', pattern);
-      });
+      for (const p of patterns) {
+        if (options.searchType === 'content') {
+          args.push('-g', p);
+        } else {
+          args.push('--glob', p);
+        }
+      }
     }

318-335: Content mode missing '--' separator; may treat pattern starting with '-' as a flag.

Insert '--' before PATTERN and pass PATH after it. This prevents accidental option parsing.

-    if (options.searchType === 'files') {
+    if (options.searchType === 'files') {
@@
-    } else {
-      // Content search: pattern is the search term
-      args.push(options.pattern);
-    }
-    
-    // Add the root path
-    args.push(options.rootPath);
+    } else {
+      // Content search: terminate options before the pattern
+      args.push('--', options.pattern, options.rootPath);
+      return args; // rootPath already added
+    }
+    // Add the root path (files mode)
+    args.push(options.rootPath);

Also applies to: 336-338

🧹 Nitpick comments (5)

src/search-manager.ts (5)
97-107: Timeout telemetry and default may mislead; record actual timeout used.

You set a default 1000ms for exact filenames but log hasTimeout based only on options.timeout. Capture the resolved timeout and consider a more generous default.
-    const timeoutMs = options.timeout || (this.isExactFilename(options.pattern) ? 1000 : undefined);
+    const timeoutMs = options.timeout ?? (this.isExactFilename(options.pattern) ? 1500 : undefined);
@@
-    capture('search_session_started', {
+    capture('search_session_started', {
       sessionId,
       searchType: options.searchType,
-      hasTimeout: !!options.timeout
+      hasTimeout: !!timeoutMs,
+      timeoutMs
     });
Also applies to: 109-114

115-119: Avoid arbitrary sleep before returning initial state.

Fixed waits (50/100ms) add latency and are brittle. Prefer returning immediately and let the client poll, or await the first 'data' event (with a short cap).
-    const waitTime = this.isExactFilename(options.pattern) ? 50 : 100;
-    await new Promise(resolve => setTimeout(resolve, waitTime));
+    const firstChunk = new Promise<void>(resolve => {
+      const onData = () => { session.process.stdout?.off('data', onData); resolve(); };
+      session.process.stdout?.once('data', onData);
+      setTimeout(resolve, 40); // cap
+    });
+    await firstChunk;
359-385: Double cleanup logic; unify retention policy.

You both auto-delete after 2 minutes in handlers and have cleanupSessions(maxAge=5m). Pick one source of truth to avoid surprise eviction.
-      // Auto-cleanup completed sessions after 2 minutes
-      setTimeout(() => {
-        this.sessions.delete(session.id);
-        capture('search_session_auto_cleaned', { sessionId: session.id });
-      }, 2 * 60 * 1000);
+      // Rely on cleanupSessions(maxAge) only; no per-session timer
And same for the error handler block.

Also applies to: 397-402

426-464: Content JSON parsing: handle 'summary' and missing submatches defensively.

Avoid accessing submatches[0] when empty; optionally use summary to confirm totals.
-        if (parsed.type === 'match') {
-          // Return first submatch (ripgrep can have multiple matches per line)
-          const submatch = parsed.data.submatches[0];
+        if (parsed.type === 'match') {
+          const submatch = parsed.data?.submatches?.[0];
           return {
             file: parsed.data.path.text,
             line: parsed.data.line_number,
             match: submatch?.match?.text || parsed.data.lines.text,
             type: 'content'
           };
         }
+        if (parsed.type === 'summary') {
+          // Optional: could reconcile totalMatches with parsed.data.stats?.matchedLines
+          return null;
+        }
471-493: Cleanup interval start condition may stop too early.

You start interval only when there are active (not complete) sessions. If all sessions become complete quickly, the immediate 1s check may stop the interval before cleanupSessions can retire old sessions. Consider running cleanup on a fixed schedule or track completed-but-not-cleaned sessions count.
-  if (!cleanupInterval && searchManager.getActiveSessionCount() > 0) {
+  if (!cleanupInterval) {

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 02ed3f9 and 428a606.

📒 Files selected for processing (1)

src/search-manager.ts (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/search-manager.ts (3)

src/tools/search.ts (4)

SearchResult (9-13)

searchCode (16-146)

code (98-144)

resolve (78-145)

src/tools/filesystem.ts (1)

validatePath (226-281)

src/utils/capture.ts (1)

capture (277-284)

🔇 Additional comments (1)

src/search-manager.ts (1)

7-12: Remove misleading SearchResult divergence warning
No file imports both SearchResult definitions—each lives in its own module without collision risk. Interfaces can remain as-is.

Likely an incorrect or invalid review comment.

coderabbitai · 2025-08-28T14:39:04Z

+  readSearchResults(
+    sessionId: string, 
+    offset: number = 0, 
+    length: number = 100
+  ): {
+    results: SearchResult[];
+    returnedCount: number;        // Renamed from newResultsCount
+    totalResults: number;
+    isComplete: boolean;
+    isError: boolean;
+    error?: string;
+    hasMoreResults: boolean;      // New field
+    runtime: number;
+  } {
+    const session = this.sessions.get(sessionId);
+
+    if (!session) {
+      throw new Error(`Search session ${sessionId} not found`);
+    }
+
+    // Get all results (excluding internal markers)
+    const allResults = session.results.filter(r => r.file !== '__LAST_READ_MARKER__');
+
+    // Handle negative offsets (tail behavior) - like file reading
+    if (offset < 0) {
+      const tailCount = Math.abs(offset);
+      const tailResults = allResults.slice(-tailCount);
+      return {
+        results: tailResults,
+        returnedCount: tailResults.length,
+        totalResults: session.totalMatches,
+        isComplete: session.isComplete,
+        isError: session.isError,
+        error: session.error,
+        hasMoreResults: false, // Tail always returns what's available
+        runtime: Date.now() - session.startTime
+      };
+    }


⚠️ Potential issue

Tail reads don't refresh lastReadTime, risking premature cleanup.

Update lastReadTime for all reads, including tails, to honor activity.

- // Handle negative offsets (tail behavior) - like file reading + // Refresh last read time for any access + session.lastReadTime = Date.now(); + // Handle negative offsets (tail behavior) - like file reading if (offset < 0) { @@ - session.lastReadTime = Date.now(); + // lastReadTime already updated above

Also applies to: 177-179

🤖 Prompt for AI Agents

In src/search-manager.ts around lines 134-171 (and also apply the same change at 177-179), the tail-read branch returns results without updating the session's lastReadTime which can cause the session to be cleaned up prematurely; update session.lastReadTime = Date.now() for all read paths (including the negative offset/tail branch) before any return, and likewise ensure the normal (non-tail) return path and the subsequent returns at lines ~177-179 also set session.lastReadTime so every read counts as activity.

coderabbitai · 2025-08-28T14:39:04Z

+    if (options.maxResults && options.maxResults > 0) {
+      args.push('-m', options.maxResults.toString());
+    }
+


🛠️ Refactor suggestion

⚠️ Potential issue

Apply -m only to content mode; enforce maxResults in file mode in-process.

rg -m limits matches, not files from --files. Gate it to content; for file mode, kill the process once enough paths are collected.

- if (options.maxResults && options.maxResults > 0) { - args.push('-m', options.maxResults.toString()); - } + if (options.searchType === 'content' && options.maxResults && options.maxResults > 0) { + args.push('-m', options.maxResults.toString()); + }

And enforce for file mode in the collector:

@@ const result = this.parseLine(line, session.options.searchType); if (result) { session.results.push(result); session.totalMatches++; + // Enforce file-mode maxResults proactively + if ( + session.options.searchType === 'files' && + session.options.maxResults && + session.options.maxResults > 0 && + session.results.filter(r => r.type === 'file').length >= session.options.maxResults && + !session.process.killed + ) { + session.process.kill('SIGTERM'); + } }

🤖 Prompt for AI Agents

In src/search-manager.ts around lines 302-305, the code currently unconditionally appends the rg "-m" arg from options.maxResults; change this so "-m" is only added when options.mode === 'content'; when options.mode === 'file' do not pass "-m" to rg, and instead enforce the limit in-process by counting unique file paths in the collector and terminating the ripgrep child process once the count reaches options.maxResults (close/kill the child and stop collecting). Also ensure the collector returns at most maxResults items (deduplicate file paths if needed) and that any cleanup/stream listeners are handled when you terminate the process.

High Priority Fixes: - Fixed ripgrep command construction: case-insensitive flag only for content searches - Fixed missing '--' separator for content searches to prevent pattern interpretation as flags - Fixed different glob flag usage: use '--glob' for file mode, '-g' for content mode - Fixed search result counting: separate matches from context lines to avoid inflated totals - Fixed fallback search behavior: consistent exclusions between ripgrep and NodeJS fallback Medium Priority Fixes: - Improved timeout management: added timer cleanup and better defaults - Better type safety: use 'unknown' instead of 'any' for handler inputs - Improved test reliability: increased timeouts from 10s to 30s, tighter regex for session ID extraction - Fixed API limit constants: use single source of truth (1048576) across tests - Enhanced search result reporting: show separate match counts vs total results

coderabbitai

Actionable comments posted: 5

♻️ Duplicate comments (4)

test/test_improved_search_truncation.js (1)

2-2: Import handleStopSearch to prevent ReferenceError.

handleStopSearch is used later (Line 39) but not imported.

Apply:
-import { handleStartSearch, handleGetMoreSearchResults } from '../dist/handlers/search-handlers.js';
+import { handleStartSearch, handleGetMoreSearchResults, handleStopSearch } from '../dist/handlers/search-handlers.js';

src/search-manager.ts (3)

329-331: Apply -m only in content mode and enforce maxResults in file mode by terminating rg

-m has no effect with --files. Gate it to content searches and proactively stop the process once enough file paths are collected.

@@
-    if (options.maxResults && options.maxResults > 0) {
-      args.push('-m', options.maxResults.toString());
-    }
+    if (options.searchType === 'content' && options.maxResults && options.maxResults > 0) {
+      args.push('-m', options.maxResults.toString());
+    }

@@
       if (result) {
         session.results.push(result);
         // Separate counting of matches vs context lines
         if (result.type === 'content' && line.includes('"type":"context"')) {
           session.totalContextLines++;
         } else {
           session.totalMatches++;
         }
+        // Enforce file-mode limits locally (rg -m doesn't limit --files)
+        if (
+          session.options.searchType === 'files' &&
+          session.options.maxResults &&
+          session.options.maxResults > 0 &&
+          session.totalMatches >= session.options.maxResults &&
+          !session.process.killed
+        ) {
+          session.process.kill('SIGTERM');
+        }
       }

Also applies to: 439-451

349-364: Support multi-glob patterns in file mode (a|b) via repeated --glob and reduce fuzzy matches

Split options.pattern on | and push one --glob per token; only wrap non-glob tokens with wildcards.

-    if (options.searchType === 'files') {
-      // For file search: determine how to treat the pattern
-      if (this.isExactFilename(options.pattern)) {
-        // Exact filename: use --glob with the exact pattern
-        args.push('--glob', options.pattern);
-      } else if (this.isGlobPattern(options.pattern)) {
-        // Already a glob pattern: use --glob as-is
-        args.push('--glob', options.pattern);
-      } else {
-        // Substring/fuzzy search: wrap with wildcards
-        args.push('--glob', `*${options.pattern}*`);
-      }
-      // Add the root path for file mode
-      args.push(options.rootPath);
-    } else {
+    if (options.searchType === 'files') {
+      const tokens = (options.pattern || '')
+        .split('|')
+        .map(t => t.trim())
+        .filter(Boolean);
+      for (const token of tokens.length ? tokens : [options.pattern]) {
+        if (this.isExactFilename(token) || this.isGlobPattern(token)) {
+          args.push('--glob', token);
+        } else {
+          args.push('--glob', `*${token}*`);
+        }
+      }
+      // Add the root path for file mode
+      args.push(options.rootPath);
+    } else {

173-197: Refresh lastReadTime for all reads; tail should still reflect “more to come” while session is open

Tail reads don’t update activity and set hasMoreResults=false even if the process is still running.

@@
-    const session = this.sessions.get(sessionId);
+    const session = this.sessions.get(sessionId);
@@
-    // Handle negative offsets (tail behavior) - like file reading
+    // Refresh last read time for any access
+    session.lastReadTime = Date.now();
+    // Handle negative offsets (tail behavior) - like file reading
     if (offset < 0) {
@@
-      return {
+      return {
         results: tailResults,
         returnedCount: tailResults.length,
         totalResults: session.totalMatches + session.totalContextLines,
         totalMatches: session.totalMatches, // Actual matches only
         isComplete: session.isComplete,
         isError: session.isError,
         error: session.error,
-        hasMoreResults: false, // Tail always returns what's available
+        hasMoreResults: !session.isComplete,
         runtime: Date.now() - session.startTime
       };
     }

🧹 Nitpick comments (5)

test/test_improved_search_truncation.js (3)
51-57: Reduce flakiness: bound search scope and set backend timeout.

Searching . with pattern . and maxResults: 50000 can overwhelm CI and cause timeouts. Narrow the scope and set timeout_ms.

Apply:
         const searchArgs = {
             path: '.',
             pattern: '.',  // Match almost every line - this should be a lot of results
             searchType: 'content',
             maxResults: 50000,  // Very high limit to get lots of results, but may be capped
-            ignoreCase: true
+            ignoreCase: true,
+            filePattern: '**/*.{ts,tsx,js,jsx,md,json}',
+            includeHidden: false,
+            timeout_ms: 20000
         };
11-15: Session ID regex: align with other tests or relax.

Other tests use /Started .+ session: (.+)/. Your stricter class may break if session IDs include characters beyond [A-Za-z0-9_-].

Apply:
-  const sessionIdMatch = result.content[0].text.match(/Started .* session:\s*([a-zA-Z0-9_-]+)/);
+  const sessionIdMatch = result.content[0].text.match(/Started .+ session:\s*(\S+)/);
82-88: Don’t rely on a “Results truncated” sentinel not emitted by handlers.

search-handlers don’t guarantee that phrase. Prefer checking for pagination hints or completion status.

Apply:
-        if (finalResult.content[0].text.includes('Results truncated')) {
+        if (finalResult.content[0].text.includes('Results truncated')) {
             console.log('✅ Results properly truncated with warning message');
             const truncationIndex = finalResult.content[0].text.indexOf('Results truncated');
             console.log('Truncation message:', finalResult.content[0].text.substring(truncationIndex, truncationIndex + 150));
-        } else {
-            console.log('ℹ️  Results complete, no truncation needed with new streaming API');
+        } else if (finalResult.content[0].text.includes('More results available')) {
+            console.log('ℹ️  Many results; pagination hint present.');
+        } else {
+            console.log('ℹ️  Results complete, no truncation or pagination hint.');
         }
src/search-manager.ts (2)
3-3: Remove unused import

path is unused.
-import path from 'path';
7-12: Rename SearchResult in src/search-manager.ts to avoid collision
Two distinct SearchResult interfaces exist (src/tools/search.ts vs src/search-manager.ts); rename this one to SearchStreamResult (re-export an alias for compatibility).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 428a606 and 79f9114.

📒 Files selected for processing (5)

src/handlers/search-handlers.ts (1 hunks)
src/search-manager.ts (1 hunks)
src/tools/search.ts (1 hunks)
test/test_improved_search_truncation.js (1 hunks)
test/test_search_truncation.js (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

src/tools/search.ts
test/test_search_truncation.js
src/handlers/search-handlers.ts

🧰 Additional context used

🧬 Code graph analysis (2)

src/search-manager.ts (3)

src/tools/search.ts (1)

SearchResult (9-13)

src/tools/filesystem.ts (1)

validatePath (226-281)

src/utils/capture.ts (1)

capture (277-284)

test/test_improved_search_truncation.js (3)

test/test-search-code.js (7)

result (34-34)

result (415-419)

result (442-445)

result (455-458)

sessionIdMatch (37-37)

sessionId (41-41)

sessionId (165-165)

test/test-search-code-edge-cases.js (9)

result (30-30)

result (285-290)

result (309-314)

result (328-333)

result (348-353)

result (370-374)

sessionIdMatch (33-33)

sessionId (37-37)

sessionId (125-125)

src/handlers/search-handlers.ts (3)

handleStartSearch (13-78)

handleGetMoreSearchResults (83-166)

handleStopSearch (171-206)

🪛 Biome (2.1.2)

test/test_improved_search_truncation.js

[error] 74-74: This variable is used before its declaration.

The variable is declared here:

(lint/correctness/noInvalidUseBeforeDeclaration)

[error] 76-77: This variable is used before its declaration.

The variable is declared here:

(lint/correctness/noInvalidUseBeforeDeclaration)

[error] 94-94: Shouldn't redeclare 'apiLimit'. Consider to delete it or rename it.

'apiLimit' is defined here:

(lint/suspicious/noRedeclare)

🔇 Additional comments (3)

test/test_improved_search_truncation.js (1)

109-115: process.exit in tests: confirm runner expectations.

Hard exits can bypass cleanup/reporting in some runners. If this test is executed standalone, fine; otherwise consider returning a rejected promise instead of process.exit.

src/search-manager.ts (2)

320-323: Good: -i scoped to content and -- used before the pattern

These address flag misparsing and unnecessary case folding.

Also applies to: 365-368

445-450: Good: context lines tracked separately from matches

This aligns counters with rg’s match semantics.

coderabbitai · 2025-08-28T14:56:59Z

+    return {
+      sessionId,
+      isComplete: session.isComplete,
+      isError: session.isError,
+      results: [...session.results],
+      totalResults: session.totalMatches,
+      runtime: Date.now() - session.startTime
+    };


🛠️ Refactor suggestion

Align totalResults semantics across startSearch/read/list to avoid confusion

startSearch returns matches-only, while readSearchResults and listSearchSessions include context lines. Return the same aggregate everywhere.

return { sessionId, isComplete: session.isComplete, isError: session.isError, results: [...session.results], - totalResults: session.totalMatches, + totalResults: session.totalMatches + session.totalContextLines, runtime: Date.now() - session.startTime };

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In src/search-manager.ts around lines 144 to 151, startSearch currently returns totalResults as session.totalMatches (matches-only) while readSearchResults and listSearchSessions return an aggregate that includes context lines; change startSearch to return the same aggregate by using the aggregate field (e.g. session.totalResults) instead of session.totalMatches, and ensure all three functions consistently use the same field name and value (or compute the aggregate the same way) so totalResults semantics match across startSearch/read/list.

coderabbitai · 2025-08-28T14:56:59Z

+    process.stderr?.on('data', (data: Buffer) => {
+      const errorText = data.toString();
+      session.error = (session.error || '') + errorText;
+      capture('search_session_error', { 
+        sessionId: session.id,
+        error: errorText.substring(0, 200) // Limit error length for telemetry
+      });
+    });


⚠️ Potential issue

Sanitize error telemetry (paths/patterns can leak PII to GA)

Avoid sending raw stderr (absolute paths, query patterns). Sanitize before storing/sending.

- process.stderr?.on('data', (data: Buffer) => { - const errorText = data.toString(); - session.error = (session.error || '') + errorText; - capture('search_session_error', { - sessionId: session.id, - error: errorText.substring(0, 200) // Limit error length for telemetry - }); - }); + process.stderr?.on('data', (data: Buffer) => { + const errorText = data.toString(); + const sanitized = this.sanitizeError(errorText, session.options.rootPath, session.options.pattern); + session.error = (session.error || '') + sanitized; + capture('search_session_error', { + sessionId: session.id, + error: sanitized.substring(0, 200) + }); + });

Add this helper inside the class:

private sanitizeError(text: string, rootPath?: string, pattern?: string): string { let t = text.replaceAll('\u0000', ''); if (rootPath) { // Collapse absolute paths under the validated root const normRoot = rootPath.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&'); t = t.replace(new RegExp(normRoot + '[^\\s"]*', 'g'), '<path>'); } if (pattern) { const escaped = pattern.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&'); t = t.replace(new RegExp(escaped, 'g'), '<pattern>'); } // Drop overly verbose permission lines return t.slice(0, 2000); }

🤖 Prompt for AI Agents

In src/search-manager.ts around lines 381 to 388, raw stderr is being appended to session.error and sent to telemetry which may leak absolute paths or query patterns; add a private sanitizeError(text: string, rootPath?: string, pattern?: string): string helper inside the class that removes null chars, collapses any occurrences of a validated rootPath to '<path>' (escaping regex metacharacters), replaces occurrences of a provided pattern with '<pattern>' (escaped), and trims the result to a safe maximum length (e.g. 2000 chars); then call this sanitizer before appending to session.error and before capture(), and still send only the first 200 chars to telemetry (capture uses sanitizedText.substring(0,200)).

coderabbitai · 2025-08-28T14:56:59Z

+  const result = await handleStartSearch(searchArgs);
+
+  // Extract session ID from result with tighter regex
+  const sessionIdMatch = result.content[0].text.match(/Started .* session:\s*([a-zA-Z0-9_-]+)/);
+  if (!sessionIdMatch) {
+    throw new Error('Could not extract session ID from search result');
+  }
+  const sessionId = sessionIdMatch[1];
+


🛠️ Refactor suggestion

Guard against start_search failures.

If handleStartSearch returns isError, the test keeps going and fails later with a vague error.

Apply:

const result = await handleStartSearch(searchArgs); + if (result.isError) { + throw new Error(`start_search failed: ${result.content?.[0]?.text ?? 'Unknown error'}`); + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const result = await handleStartSearch(searchArgs);

// Extract session ID from result with tighter regex

const sessionIdMatch = result.content[0].text.match(/Started .* session:\s*([a-zA-Z0-9_-]+)/);

if (!sessionIdMatch) {

throw new Error('Could not extract session ID from search result');

}

const sessionId = sessionIdMatch[1];

const result = await handleStartSearch(searchArgs);

if (result.isError) {

throw new Error(

`start_search failed: ${result.content?.[0]?.text ?? 'Unknown error'}`

);

}

// Extract session ID from result with tighter regex

const sessionIdMatch = result.content[0].text.match(

/Started .* session:\s*([a-zA-Z0-9_-]+)/

);

if (!sessionIdMatch) {

throw new Error('Could not extract session ID from search result');

}

const sessionId = sessionIdMatch[1];

🤖 Prompt for AI Agents

In test/test_improved_search_truncation.js around lines 8 to 16, the test assumes handleStartSearch succeeded; add an early guard that checks if result.isError (or equivalent failure flag) and immediately throw or fail the test with the error message/details from result to stop execution and produce a clear failure; then proceed to extract sessionId as before, but also validate result.content and result.content[0].text exist before running the regex and throw an explicit error if they are missing.

coderabbitai · 2025-08-28T14:56:59Z

+    const startTime = Date.now();
+    while (Date.now() - startTime < timeout) {
+      const moreResults = await handleGetMoreSearchResults({ sessionId });
+
+      if (moreResults.content[0].text.includes('✅ Search completed')) {
+        return { initialResult: result, finalResult: moreResults, sessionId };
+      }
+
+      if (moreResults.content[0].text.includes('❌ ERROR')) {
+        throw new Error(`Search failed: ${moreResults.content[0].text}`);
+      }
+
+      // Wait a bit before polling again
+      await new Promise(resolve => setTimeout(resolve, 100));
+    }


🛠️ Refactor suggestion

Poll with explicit tail pagination and check isError.

get_more schema typically expects offset and length. Relying on implicit defaults is brittle; also checking for '❌ ERROR' text is unreliable—use isError.

Apply:

- // Wait for completion by polling - const startTime = Date.now(); + // Wait for completion by polling (tail last N results) + const startTime = Date.now(); + const length = 200; + const offset = -length; while (Date.now() - startTime < timeout) { - const moreResults = await handleGetMoreSearchResults({ sessionId }); + const moreResults = await handleGetMoreSearchResults({ sessionId, offset, length }); - if (moreResults.content[0].text.includes('✅ Search completed')) { + if (moreResults.content[0].text.includes('✅ Search completed')) { return { initialResult: result, finalResult: moreResults, sessionId }; } - if (moreResults.content[0].text.includes('❌ ERROR')) { - throw new Error(`Search failed: ${moreResults.content[0].text}`); - } + if (moreResults.isError) { + throw new Error(`get_more_search_results failed: ${moreResults.content?.[0]?.text ?? 'Unknown error'}`); + } // Wait a bit before polling again - await new Promise(resolve => setTimeout(resolve, 100)); + await new Promise(resolve => setTimeout(resolve, 200)); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const startTime = Date.now();

while (Date.now() - startTime < timeout) {

const moreResults = await handleGetMoreSearchResults({ sessionId });

if (moreResults.content[0].text.includes('✅ Search completed')) {

return { initialResult: result, finalResult: moreResults, sessionId };

}

if (moreResults.content[0].text.includes('❌ ERROR')) {

throw new Error(`Search failed: ${moreResults.content[0].text}`);

}

// Wait a bit before polling again

await new Promise(resolve => setTimeout(resolve, 100));

}

// Wait for completion by polling (tail last N results)

const startTime = Date.now();

const length = 200;

const offset = -length;

while (Date.now() - startTime < timeout) {

const moreResults = await handleGetMoreSearchResults({ sessionId, offset, length });

if (moreResults.content[0].text.includes('✅ Search completed')) {

return { initialResult: result, finalResult: moreResults, sessionId };

}

if (moreResults.isError) {

throw new Error(`get_more_search_results failed: ${moreResults.content?.[0]?.text ?? 'Unknown error'}`);

}

// Wait a bit before polling again

await new Promise(resolve => setTimeout(resolve, 200));

}

🤖 Prompt for AI Agents

In test/test_improved_search_truncation.js around lines 19 to 33, the polling loop calls handleGetMoreSearchResults without explicit tail pagination and checks for errors by inspecting result text; update the call to pass explicit pagination parameters (e.g., include offset and length or a tail object as the get_more schema requires) so we don't rely on implicit defaults, and replace the text-based error detection with checking the returned isError flag (throw if isError is true); keep the existing completion check but reference the correct returned fields after adding pagination and isError checks.

coderabbitai · 2025-08-28T14:56:59Z

+        const apiLimit = 1048576; // 1 MiB - use consistent constant
+
+        // Check if we're within the safe limits using single source of truth
+        if (totalLength > apiLimit) {
+            console.log('❌ Results still quite large - over 1MB combined');
+        } else if (totalLength > Math.floor(0.8 * apiLimit)) {
+            console.log('⚠️  Results approaching limits but acceptable');


⚠️ Potential issue

Fix duplicate/redeclared apiLimit and “use before declaration” lints.

apiLimit is declared twice (Lines 71, 94), triggering Biome errors. Define once and reuse.

Apply:

- const apiLimit = 1048576; // 1 MiB - use consistent constant + const API_LIMIT_BYTES = 1024 * 1024; // 1 MiB - if (totalLength > apiLimit) { + if (totalLength > API_LIMIT_BYTES) { - } else if (totalLength > Math.floor(0.8 * apiLimit)) { + } else if (totalLength > Math.floor(0.8 * API_LIMIT_BYTES)) {

- const apiLimit = 1048576; // 1MB API limit - const safetyMargin = apiLimit - totalLength; + // reuse API_LIMIT_BYTES defined above + const safetyMargin = API_LIMIT_BYTES - totalLength;

- console.log(` API limit: ${apiLimit.toLocaleString()} characters`); + console.log(` API limit: ${API_LIMIT_BYTES.toLocaleString()} characters`); console.log(` Safety margin: ${safetyMargin.toLocaleString()} characters`); - console.log(` Utilization: ${((totalLength / apiLimit) * 100).toFixed(1)}%`); + console.log(` Utilization: ${((totalLength / API_LIMIT_BYTES) * 100).toFixed(1)}%`);

Also applies to: 94-103

🧰 Tools

🪛 Biome (2.1.2)

[error] 74-74: This variable is used before its declaration.

The variable is declared here:

(lint/correctness/noInvalidUseBeforeDeclaration)

[error] 76-77: This variable is used before its declaration.

The variable is declared here:

(lint/correctness/noInvalidUseBeforeDeclaration)

🤖 Prompt for AI Agents

In test/test_improved_search_truncation.js around lines 71-77 and 94-103, apiLimit is declared twice causing duplicate/redeclare and use-before-declaration lint errors; remove the second declaration and define apiLimit once (e.g., at the top of the test or before its first use), then update all references to reuse that single constant so both checks use the same apiLimit and no variable is redeclared or referenced before initialization.

wonderwhy-er added 5 commits August 23, 2025 00:04

Experiment with better search

b0a871b

Improve prompt

841f7a6

Fix tests

a4dc727

Improve search API and bring it in line with file read with paged and…

16e15f6

… negative reading

Refactor tests

02ed3f9

coderabbitai Bot reviewed Aug 25, 2025

View reviewed changes

Improve exact file search

428a606

coderabbitai Bot reviewed Aug 28, 2025

View reviewed changes

wonderwhy-er merged commit 227050b into main Aug 28, 2025
2 checks passed

This was referenced Aug 29, 2025

Fix search case sensitivity in file search #231

Merged

Few more fixes for search in files #232

Merged

wonderwhy-er deleted the refactor-file-search branch September 10, 2025 08:49

coderabbitai Bot mentioned this pull request Sep 12, 2025

Search improvements #240

Merged

coderabbitai Bot mentioned this pull request Sep 26, 2025

File improve search error reporting #249

Merged

coderabbitai Bot mentioned this pull request Nov 28, 2025

Feature/sheets handling #282

Merged

coderabbitai Bot mentioned this pull request Dec 19, 2025

Feature/welcome page ab test #298

Merged

coderabbitai Bot mentioned this pull request Feb 19, 2026

feat: DOCX support via existing read_file, write_file, edit_block, and start_search tools #341

Merged

coderabbitai Bot mentioned this pull request Mar 19, 2026

feat: add replace_lines tool - token-efficient line-based editing #385

Closed

		const errorMessage = error instanceof Error ? error.message : String(error);
		capture('search_session_start_error', { error: errorMessage });

	import { handleStartSearch, handleGetMoreSearchResults } from '../dist/handlers/search-handlers.js';
	import { handleStartSearch, handleGetMoreSearchResults, handleStopSearch } from '../dist/handlers/search-handlers.js';

Uh oh!

Conversation

wonderwhy-er commented Aug 25, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Aug 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

wonderwhy-er commented Aug 25, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Aug 25, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)