Skip to content

List dir optimisation#252

Closed
wonderwhy-er wants to merge 5 commits into
mainfrom
list-dir-optimisation
Closed

List dir optimisation#252
wonderwhy-er wants to merge 5 commits into
mainfrom
list-dir-optimisation

Conversation

@wonderwhy-er

@wonderwhy-er wonderwhy-er commented Oct 1, 2025

Copy link
Copy Markdown
Owner

Summary by CodeRabbit

  • New Features

    • Directory listing now respects .gitignore by default, with an option to include hidden files.
    • Produces a clearer hierarchical view with sorted items and sensible truncation for large folders.
    • Improved robustness with better handling of inaccessible paths and timeouts.
  • Documentation

    • Updated tool description to explain new options (respectGitignore, includeHidden) and revised behavior.

…restore lost chats or pass context between chats about work that was done recently
- Add writeInterval property to track setInterval reference
- Use unref() to allow graceful process shutdown
- Add cleanup() method to properly dispose resources
- Fixes test hang issue where Node process never exited
@coderabbitai

coderabbitai Bot commented Oct 1, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Adds gitignore and hidden-file options to directory listing. Refactors listDirectory to use ripgrep for discovery and builds a hierarchical listing with depth limiting and truncation. Updates handler and schema to pass new options; adjusts tool description. Minor whitespace change in utils.

Changes

Cohort / File(s) Summary
Filesystem Tool Refactor
src/tools/filesystem.ts
Replaces recursive FS traversal with ripgrep-based discovery. Expands listDirectory(dirPath, depth) to listDirectory(dirPath, depth, respectGitignore, includeHidden) with defaults. Builds and renders hierarchical path tree, supports gitignore/hidden flags, sorting, truncation, and error/timeout handling.
Handler Update
src/handlers/filesystem-handlers.ts
Updates handleListDirectory to call listDirectory with two new boolean options: respectGitignore, includeHidden.
Tool Schema Update
src/tools/schemas.ts
Adds optional booleans to ListDirectoryArgsSchema: respectGitignore (default true), includeHidden (default false).
Server Tool Description
src/server.ts
Updates ListDirectory tool description to document gitignore and hidden-file support; removes prior access-denied note.
Non-functional Whitespace
src/utils/toolHistory.ts
Adds trailing blank line; no logic changes.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant C as Client
    participant S as Server
    participant H as FilesystemHandler
    participant T as listDirectory (Tool)
    participant RG as ripgrep (child process)

    C->>S: Request ListDirectory(dir, depth, respectGitignore?, includeHidden?)
    S->>H: handleListDirectory(args)
    H->>T: listDirectory(dir, depth, respectGitignore, includeHidden)
    T->>RG: spawn rg with flags (--hidden?, --no-ignore?)
    RG-->>T: file path matches (up to depth+1)
    T->>T: Build path tree, group/sort, truncate per dir
    T-->>H: Rendered hierarchical listing
    H-->>S: Result
    S-->>C: Response

    note over T,RG: New flow uses ripgrep for discovery and post-processing for hierarchy.
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • dmitry-ottic-ai

Poem

I burrow through dirs with whiskered delight,
Ripgrep sniffs paths in the soft moonlight.
Hidden or heeded by .gitignore’s law,
I sort tiny twigs with a tidy paw.
Depths respected, branches neat—
A rabbit’s tree is now complete. 🐇🌲

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “List dir optimisation” succinctly reflects the key change in this pull request by indicating that the directory listing functionality has been optimized, aligning with the refactoring to use a ripgrep-based approach and new filtering options.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch list-dir-optimisation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (5)
src/utils/toolHistory.ts (1)

85-85: Nit: stray blank line

The extra blank line after the log makes for noisy diffs; consider removing to keep file tidy.

src/handlers/filesystem-handlers.ts (1)

214-219: Pass-through args look correct; add timeout and use measured duration

  • Good: forwarding path, depth, respectGitignore, includeHidden matches the new API.
  • Suggest: wrap the call in withTimeout (like read_file) to bound ripgrep stalls; also use the computed duration for telemetry or remove it to avoid unused var.

Apply this diff to add a handler-level timeout and avoid an unused variable:

 export async function handleListDirectory(args: unknown): Promise<ServerResult> {
   try {
-    const startTime = Date.now();
     const parsed = ListDirectoryArgsSchema.parse(args);
-    const entries = await listDirectory(
-      parsed.path, 
-      parsed.depth, 
-      parsed.respectGitignore,
-      parsed.includeHidden
-    );
-    const duration = Date.now() - startTime;
+    const HANDLER_TIMEOUT = 30000; // 30s
+    const entries = await withTimeout(
+      listDirectory(
+        parsed.path,
+        parsed.depth,
+        parsed.respectGitignore,
+        parsed.includeHidden
+      ),
+      HANDLER_TIMEOUT,
+      'List directory handler',
+      null
+    );
 
     const resultText = entries.join('\n');
src/tools/filesystem.ts (3)

895-905: Validate and clamp depth to sane bounds

Depth can be zero/negative from user input; scanDepth = depth + 1 would become ≤1. Clamp to ≥1 and optionally to a reasonable upper bound.

 export async function listDirectory(
     dirPath: string, 
-    depth: number = 2,
+    depth: number = 2,
     respectGitignore: boolean = true,
     includeHidden: boolean = false
 ): Promise<string[]> {
   const validPath = await validatePath(dirPath);
-  
+  // Normalize depth
+  if (!Number.isFinite(depth) || depth < 1) depth = 1;
+  if (depth > 10) depth = 10; // guardrail; consider making this configurable

1011-1013: Optional: show directories first

Currently children are sorted lexicographically, intermixing files and directories. For readability, list dirs first, then files.

-    for (const children of pathsByParent.values()) {
-        children.sort((a, b) => a.localeCompare(b));
-    }
+    for (const [parent, children] of pathsByParent) {
+        children.sort((a, b) => {
+            const ta = allPaths.get(a) === 'dir' ? 0 : 1;
+            const tb = allPaths.get(b) === 'dir' ? 0 : 1;
+            return ta - tb || a.localeCompare(b);
+        });
+        pathsByParent.set(parent, children);
+    }

8-10: Dependency note

Using @vscode/ripgrep’s bundled rg is fine; ensure it’s declared in deps and packaged for all targets. If rgPath resolution fails, catch ENOENT and fall back to Node traversal (like searchFilesNodeJS does).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4fb8d28 and a1bd840.

📒 Files selected for processing (5)
  • src/handlers/filesystem-handlers.ts (1 hunks)
  • src/server.ts (1 hunks)
  • src/tools/filesystem.ts (2 hunks)
  • src/tools/schemas.ts (1 hunks)
  • src/utils/toolHistory.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/handlers/filesystem-handlers.ts (2)
test/test-home-directory.js (1)
  • entries (224-224)
src/tools/filesystem.ts (1)
  • listDirectory (895-1051)
🔇 Additional comments (2)
src/server.ts (1)

335-345: Docs update LGTM

The ListDirectory description cleanly documents respectGitignore (default true) and includeHidden (default false) and aligns with the schema and handler changes.

src/tools/schemas.ts (1)

67-69: Schema additions LGTM

Defaults and types for respectGitignore and includeHidden align with server docs and handler usage.

Comment thread src/tools/filesystem.ts
Comment on lines +927 to +957
const files = await new Promise<string[]>((resolve, reject) => {
const rgProcess = spawn(rgPath, rgArgs);
const output: string[] = [];
let errorOutput = '';

rgProcess.stdout?.on('data', (data: Buffer) => {
output.push(data.toString());
});

rgProcess.stderr?.on('data', (data: Buffer) => {
errorOutput += data.toString();
});

rgProcess.on('close', (code: number) => {
// Code 0 = success, 1 = no matches, 2 = some files couldn't be searched
if (code === 0 || code === 1 || code === 2) {
const allOutput = output.join('');
const fileList = allOutput
.split('\n')
.map(line => line.trim())
.filter(Boolean);
resolve(fileList);
} else {
reject(new Error(`ripgrep failed with code ${code}: ${errorOutput}`));
}
});

rgProcess.on('error', (error: Error) => {
reject(error);
});
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Child process can hang: add timeout and kill ripgrep

Spawning rg without a timeout is an external-call hazard; a slow filesystem or huge tree can block the request thread indefinitely. Add a guard that kills the process after a bounded interval and surfaces a clear error.

Apply this diff to add a timeout:

-    const files = await new Promise<string[]>((resolve, reject) => {
-        const rgProcess = spawn(rgPath, rgArgs);
+    const files = await new Promise<string[]>((resolve, reject) => {
+        const RG_TIMEOUT_MS = 30000;
+        const rgProcess = spawn(rgPath, rgArgs);
         const output: string[] = [];
         let errorOutput = '';
+        let timedOut = false;
+        const timer = setTimeout(() => {
+            timedOut = true;
+            try { rgProcess.kill('SIGKILL'); } catch {}
+        }, RG_TIMEOUT_MS);
         
         rgProcess.stdout?.on('data', (data: Buffer) => {
             output.push(data.toString());
         });
         
         rgProcess.stderr?.on('data', (data: Buffer) => {
             errorOutput += data.toString();
         });
         
         rgProcess.on('close', (code: number) => {
+            clearTimeout(timer);
+            if (timedOut) {
+                return reject(new Error(`ripgrep timed out after ${RG_TIMEOUT_MS}ms`));
+            }
             // Code 0 = success, 1 = no matches, 2 = some files couldn't be searched
             if (code === 0 || code === 1 || code === 2) {
                 const allOutput = output.join('');
                 const fileList = allOutput
                     .split('\n')
                     .map(line => line.trim())
                     .filter(Boolean);
                 resolve(fileList);
             } else {
                 reject(new Error(`ripgrep failed with code ${code}: ${errorOutput}`));
             }
         });
         
         rgProcess.on('error', (error: Error) => {
-            reject(error);
+            clearTimeout(timer);
+            reject(error);
         });
     });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const files = await new Promise<string[]>((resolve, reject) => {
const rgProcess = spawn(rgPath, rgArgs);
const output: string[] = [];
let errorOutput = '';
rgProcess.stdout?.on('data', (data: Buffer) => {
output.push(data.toString());
});
rgProcess.stderr?.on('data', (data: Buffer) => {
errorOutput += data.toString();
});
rgProcess.on('close', (code: number) => {
// Code 0 = success, 1 = no matches, 2 = some files couldn't be searched
if (code === 0 || code === 1 || code === 2) {
const allOutput = output.join('');
const fileList = allOutput
.split('\n')
.map(line => line.trim())
.filter(Boolean);
resolve(fileList);
} else {
reject(new Error(`ripgrep failed with code ${code}: ${errorOutput}`));
}
});
rgProcess.on('error', (error: Error) => {
reject(error);
});
});
const files = await new Promise<string[]>((resolve, reject) => {
const RG_TIMEOUT_MS = 30000;
const rgProcess = spawn(rgPath, rgArgs);
const output: string[] = [];
let errorOutput = '';
let timedOut = false;
const timer = setTimeout(() => {
timedOut = true;
try { rgProcess.kill('SIGKILL'); } catch {}
}, RG_TIMEOUT_MS);
rgProcess.stdout?.on('data', (data: Buffer) => {
output.push(data.toString());
});
rgProcess.stderr?.on('data', (data: Buffer) => {
errorOutput += data.toString();
});
rgProcess.on('close', (code: number) => {
clearTimeout(timer);
if (timedOut) {
return reject(new Error(`ripgrep timed out after ${RG_TIMEOUT_MS}ms`));
}
// Code 0 = success, 1 = no matches, 2 = some files couldn't be searched
if (code === 0 || code === 1 || code === 2) {
const allOutput = output.join('');
const fileList = allOutput
.split('\n')
.map(line => line.trim())
.filter(Boolean);
resolve(fileList);
} else {
reject(new Error(`ripgrep failed with code ${code}: ${errorOutput}`));
}
});
rgProcess.on('error', (error: Error) => {
clearTimeout(timer);
reject(error);
});
});
🤖 Prompt for AI Agents
In src/tools/filesystem.ts around lines 927 to 957, the ripgrep child process is
spawned without a timeout which can hang on slow or huge filesystems; add a
watchdog that sets a bounded timeout (e.g., configurable constant) which, when
reached, kills the rg process (call kill on the child and remove listeners),
rejects the promise with a clear timeout error message including rg args/path,
and clears the timer on normal close or error to avoid leaks; ensure the timeout
is cleared in the 'close' and 'error' handlers and handle the case where kill
itself errors so the promise always resolves/rejects exactly once.

Comment thread src/tools/filesystem.ts
Comment on lines +963 to +980
for (const filePath of files) {
// Get relative path from the root
const relativePath = path.relative(validPath, filePath);
if (!relativePath || relativePath.startsWith('..')) continue;

const parts = relativePath.split(path.sep);

// Add all parent directories up to the requested depth
for (let i = 1; i < parts.length && i <= depth; i++) {
const dirPath = parts.slice(0, i).join(path.sep);
directories.add(dirPath);
}

// Add the file itself only if it's within the requested depth
if (parts.length <= depth) {
fileSet.add(relativePath);
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Empty directories are omitted (ripgrep --files lists files only)

Deriving directories solely from files misses directories that contain no files (or only subdirs with no files within the scanned depth). list_directory should show those too; current output will be blank in such cases.

Minimal approach: supplement the directories set with a shallow Node.js scan up to depth for dir entries not implied by files.

     for (const filePath of files) {
       ...
     }
 
-    // Build the result list
+    // Supplement: include directories that contain no files within the scanned depth
+    await supplementMissingDirectories(validPath, depth, directories, includeHidden, respectGitignore);
+
+    // Build the result list

Add this helper in the same module:

async function supplementMissingDirectories(
  root: string,
  maxDepth: number,
  directories: Set<string>,
  includeHidden: boolean,
  respectGitignore: boolean
) {
  const queue: Array<{abs: string; rel: string; d: number}> = [{abs: root, rel: '', d: 0}];
  const gitignoreFilters = new Set<string>(); // Optional: populate from ripgrep/globs if needed

  while (queue.length) {
    const {abs, rel, d} = queue.shift()!;
    if (d >= maxDepth) continue;

    let entries: import('fs').Dirent[];
    try {
      entries = await (await import('fs/promises')).readdir(abs, { withFileTypes: true });
    } catch {
      continue;
    }

    for (const e of entries) {
      const isHidden = e.name.startsWith('.');
      if (!includeHidden && isHidden) continue;

      const childRel = rel ? path.join(rel, e.name) : e.name;
      const childAbs = path.join(abs, e.name);

      if (e.isDirectory()) {
        // Optionally skip gitignored directories when respectGitignore is true (left as enhancement)
        directories.add(childRel);
        queue.push({abs: childAbs, rel: childRel, d: d + 1});
      }
    }
  }
}

Note: If strict gitignore parity is required for these empty dirs, wire in ignore parsing; otherwise this at least prevents vanishing empty trees.

Comment thread src/tools/filesystem.ts
Comment on lines +1016 to 1041
const outputPath = (pathStr: string, depth: number = 0) => {
const type = allPaths.get(pathStr);
if (!type) return;

const prefix = type === 'dir' ? '[DIR]' : '[FILE]';
const indent = ' '.repeat(depth);
results.push(`${indent}${prefix} ${pathStr}`);

// Get children of this directory
const children = pathsByParent.get(pathStr) || [];

// Check if this directory should be trimmed
if (children.length > MAX_ITEMS_PER_DIR) {
const hiddenCount = children.length - MAX_ITEMS_PER_DIR;
results.push(`${indent} ⚠️ ${pathStr}/ contains ${children.length} items (${hiddenCount} hidden). List this directory directly to see all.`);

// Output only first MAX_ITEMS_PER_DIR children
for (let i = 0; i < MAX_ITEMS_PER_DIR; i++) {
outputPath(children[i], depth);
}
} else {
// Output all children
for (const child of children) {
outputPath(child, depth);
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Hierarchy rendering bug: indentation never increases (param shadowing and wrong recursion)

outputPath uses a parameter named depth that shadows the outer depth, and recursion calls pass the same value, so all lines render at the same indent. This breaks the “hierarchical listing” claim.

Apply this diff to fix indentation and avoid shadowing:

-    const outputPath = (pathStr: string, depth: number = 0) => {
-        const type = allPaths.get(pathStr);
+    const outputPath = (pathStr: string, level: number = 0) => {
+        const type = allPaths.get(pathStr);
         if (!type) return;
-        
-        const prefix = type === 'dir' ? '[DIR]' : '[FILE]';
-        const indent = '  '.repeat(depth);
+        const prefix = type === 'dir' ? '[DIR]' : '[FILE]';
+        const indent = '  '.repeat(level);
         results.push(`${indent}${prefix} ${pathStr}`);
         
         // Get children of this directory
         const children = pathsByParent.get(pathStr) || [];
         
         // Check if this directory should be trimmed
         if (children.length > MAX_ITEMS_PER_DIR) {
             const hiddenCount = children.length - MAX_ITEMS_PER_DIR;
             results.push(`${indent}    ⚠️ ${pathStr}/ contains ${children.length} items (${hiddenCount} hidden). List this directory directly to see all.`);
             
             // Output only first MAX_ITEMS_PER_DIR children
             for (let i = 0; i < MAX_ITEMS_PER_DIR; i++) {
-                outputPath(children[i], depth);
+                outputPath(children[i], level + 1);
             }
         } else {
             // Output all children
             for (const child of children) {
-                outputPath(child, depth);
+                outputPath(child, level + 1);
             }
         }
     };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const outputPath = (pathStr: string, depth: number = 0) => {
const type = allPaths.get(pathStr);
if (!type) return;
const prefix = type === 'dir' ? '[DIR]' : '[FILE]';
const indent = ' '.repeat(depth);
results.push(`${indent}${prefix} ${pathStr}`);
// Get children of this directory
const children = pathsByParent.get(pathStr) || [];
// Check if this directory should be trimmed
if (children.length > MAX_ITEMS_PER_DIR) {
const hiddenCount = children.length - MAX_ITEMS_PER_DIR;
results.push(`${indent} ⚠️ ${pathStr}/ contains ${children.length} items (${hiddenCount} hidden). List this directory directly to see all.`);
// Output only first MAX_ITEMS_PER_DIR children
for (let i = 0; i < MAX_ITEMS_PER_DIR; i++) {
outputPath(children[i], depth);
}
} else {
// Output all children
for (const child of children) {
outputPath(child, depth);
}
}
const outputPath = (pathStr: string, level: number = 0) => {
const type = allPaths.get(pathStr);
if (!type) return;
const prefix = type === 'dir' ? '[DIR]' : '[FILE]';
const indent = ' '.repeat(level);
results.push(`${indent}${prefix} ${pathStr}`);
// Get children of this directory
const children = pathsByParent.get(pathStr) || [];
// Check if this directory should be trimmed
if (children.length > MAX_ITEMS_PER_DIR) {
const hiddenCount = children.length - MAX_ITEMS_PER_DIR;
results.push(`${indent} ⚠️ ${pathStr}/ contains ${children.length} items (${hiddenCount} hidden). List this directory directly to see all.`);
// Output only first MAX_ITEMS_PER_DIR children
for (let i = 0; i < MAX_ITEMS_PER_DIR; i++) {
outputPath(children[i], level + 1);
}
} else {
// Output all children
for (const child of children) {
outputPath(child, level + 1);
}
}
};
🤖 Prompt for AI Agents
In src/tools/filesystem.ts around lines 1016 to 1041 the depth parameter is
being shadowed/kept constant so indentation never increases; fix by renaming the
parameter (or an inner variable) to avoid shadowing and pass an incremented
depth when recursing into children (e.g., outputPath(child, depth + 1)) both in
the trimmed branch and the else branch so children render with increased
indentation; also ensure any indent computation uses the (renamed) current depth
variable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant