All notable changes to cpd (Rust) are documented here. Releases follow Semantic Versioning.
- Razor (.razor) support — new tokenizer for Razor files, parsing HTML content and Razor keyword blocks (thanks to @chrisc-onaorg in #829)
- Fix `cargo test` on Windows by using `CARGO_BIN_EXE_cpd` or `.exe` suffix to locate the test binary
- Fix `cargo fmt` formatting issues across the workspace
- Extract methods and optimize intervals in `cpd-core` (bumped to 0.1.6)
- `cpd-core` updated to 0.1.6, `cpd-tokenizer` updated to 0.1.7 across all dependent crates
- Emit scan-root-relative paths in all reporters when `absolute: false` (or the default). Previously, `jscpd /abs/path` from a different CWD left absolute paths in SARIF/JSON/XML/HTML/CSV/Markdown/console output, and Windows/macOS path canonicalization could leave `\\?\` or `./` prefixes. Paths are now normalized against the canonicalized scan root (with CWD fallback) and stripped of any leading `./` or `.\\` component. Fixes #827
- Fix `--skip-local` to match jscpd v4 TypeScript semantics: it now filters clones where both fragments are under the same scan root, instead of only skipping clones in the same parent directory
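
The path normalization described in the entry above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code used by the reporters: the helper name `relativize` is hypothetical, and the sketch skips canonicalization, the CWD fallback, and Windows `\\?\` prefix handling.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: make `path` relative to `scan_root` and drop any
/// `.` components. A path outside the root keeps its original location.
fn relativize(path: &Path, scan_root: &Path) -> PathBuf {
    // `strip_prefix` fails when `path` is not under `scan_root`;
    // in that case fall back to the path as given.
    let rel = path.strip_prefix(scan_root).unwrap_or(path);
    rel.components()
        .filter(|c| !matches!(c, Component::CurDir))
        .collect()
}

fn main() {
    let root = Path::new("/abs/path");
    println!("{}", relativize(Path::new("/abs/path/src/lib.rs"), root).display());
    println!("{}", relativize(Path::new("./src/lib.rs"), root).display());
}
```

Filtering on `Component::CurDir` rather than trimming a string prefix keeps the sketch independent of the platform's path separator.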
- DRY duplication in reporters: extract shared helpers (`print_clone_header`, `print_clone_locations`, `print_snippet`, `write_report_file`, report statistics, test fixtures, etc.) into `cpd-reporter/src/shared.rs`. Console, console-full, CSV, JSON, HTML, Markdown, silent, XML, and SARIF reporters now reuse the same implementation, reducing the monorepo's reported duplication ratio from 5.0% to 0.56% and fixing a latent `--absolute` path relativization bug in the same pass
- Move blame enrichment from `gitoxide` to `git blame --porcelain`; capture elapsed time after blame so timing includes blame work
- Resolve `needless_borrow` clippy warnings in CSV and Markdown reporters
- Add Nix and Homebrew install instructions to Rust READMEs. #818
- Update project homepage URLs to `https://jscpd.dev` in all `Cargo.toml` and npm `package.json` files, add curl install method to READMEs, clean up outdated badges
- Remove defunct Universal Analytics tracking pixels from all READMEs
- GitHub Action for jscpd (Rust v5): `jscpd-copy-paste-detector` action for GitHub Actions Marketplace. Scan your repo for copy/paste in CI with `uses: kucherenko/jscpd/.github/workflows/action.yml@v5`
- Fix platform binary resolution when `cpd` is installed as a nested dependency (e.g. in a project's `node_modules` via a parent package). The runner now correctly locates the platform-specific binary relative to the installed package rather than assuming a top-level install. Fixes #816
- Prevent mmap exhaustion crashes when scanning repositories with more files than `vm.max_map_count` (default 131 072 on Linux). The walker previously held a live `Mmap` per discovered file; each rayon worker now opens and drops its mapping within the processing closure, capping concurrent mappings to the thread-pool size (typically 8–32). Fixes #813
- Fix `--pattern` not matching relative paths when the scan root is absolute (e.g. CWD). Patterns like `src/**/*.ts` now match correctly by comparing against both the relative path and the full absolute path, and bare patterns like `*.ts` gain a `**/` prefix to match at any depth. Fixes #811
- Fix trailing-newline off-by-one in line-count filter: files not ending with `\n` now count the final line correctly
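
The trailing-newline fix above comes down to counting a final unterminated line. The following is a minimal sketch of that rule, assuming lines are delimited by `\n` only; the function name `count_lines` is hypothetical and the real filter may be implemented differently (for example with `memchr`).

```rust
/// Hypothetical line counter: every `\n` ends a line, and a non-empty
/// buffer that does not end with `\n` has one more, unterminated, line.
fn count_lines(bytes: &[u8]) -> usize {
    let newlines = bytes.iter().filter(|&&b| b == b'\n').count();
    match bytes.last() {
        // Empty input, or the last line is already terminated.
        None | Some(&b'\n') => newlines,
        // The final line has no terminator but still counts.
        Some(_) => newlines + 1,
    }
}

fn main() {
    for sample in ["a\nb\n", "a\nb", ""] {
        println!("{:?} -> {} line(s)", sample, count_lines(sample.as_bytes()));
    }
}
```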
- Prevent stack overflow when scanning directories containing deeply-nested JS/TS files (e.g. Bun's `test/bundler` with 320K+ nested for-loops). OXC's recursive-descent parser allocates one stack frame per AST nesting level; pathological inputs now exceed the default 8 MiB thread stack. Fixed by building a local rayon `ThreadPool` with 64 MiB stacks instead of using the global pool (which silently fails on re-init)
- Default `--max-size` to `1mb`: files exceeding the limit are skipped at walk time, consistent with jscpd v4's `maxSize` behavior. This prevents OXC from ever seeing megabyte-scale generated files that would overflow the stack
- `--workers N` now correctly takes effect on every `run()` call (previously `build_global()` silently no-op'd after the first invocation)
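
The idea behind the larger stacks can be illustrated without rayon. The sketch below swaps in a plain `std::thread::Builder` thread in place of the rayon `ThreadPool` named above, so it shows the stack-size technique only; `depth` and `run_with_big_stack` are hypothetical names, and the recursion is a stand-in for a recursive-descent parser.

```rust
use std::thread;

/// Deliberately recursive so that stack use grows with `n`,
/// standing in for a parser that recurses once per nesting level.
fn depth(n: u64) -> u64 {
    if n == 0 { 0 } else { 1 + depth(n - 1) }
}

/// Run `f` on a dedicated thread with a 64 MiB stack and return its result.
fn run_with_big_stack<T, F>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(64 * 1024 * 1024)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let levels = run_with_big_stack(|| depth(100_000));
    println!("recursed {levels} levels deep");
}
```

Because the stack size is set on each builder, nothing depends on process-wide state that can only be initialized once.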
- v4 config backward compatibility: `.jscpd.json` fields `path`, `pattern`, `ignore`, and `ignorePattern` are now read and applied, matching jscpd v4 behavior
- `ignore` and `ignorePattern` are now distinct: `ignore` matches file-level globs, `ignorePattern` matches code-level regex patterns (previously conflated)
- `.jscpd.json` `path` config support: reads scan directories from the `path` field, resolving relative paths against the config file's directory
- `jscpd` npm wrapper package: publishes the same Rust binary under the `jscpd` name on npm with v5.x versioning
- `--exit-code` now matches v4 behavior: accepts optional integer value (`--exit-code` exits 1, `--exit-code 2` exits 2); `--threshold` and `--exit-code` are now independent
- Performance improvements: memory-mapped file I/O (via `memmap2`) eliminates heap copies of file contents; SIMD-accelerated line counting (via `memchr`); parallel detection pipeline uses `flat_map` to avoid intermediate allocations; JS tokenizer no longer clones source strings before parsing (thanks to @auterium, #808)
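
The optional-integer form of `--exit-code` described above can be sketched as a small hand-rolled parser. This is only an illustration of the documented semantics: the function name `requested_exit_code` is hypothetical, and the actual CLI may rely on an argument-parsing library instead.

```rust
/// Hypothetical parser: returns the exit code requested via `--exit-code [N]`,
/// or `None` when the flag is absent. A bare flag means exit code 1.
fn requested_exit_code(args: &[&str]) -> Option<i32> {
    let pos = args.iter().position(|a| *a == "--exit-code")?;
    let code = args
        .get(pos + 1)
        // Only treat the next token as the value if it is an integer;
        // otherwise it is some other argument, such as a path.
        .and_then(|next| next.parse::<i32>().ok())
        .unwrap_or(1);
    Some(code)
}

fn main() {
    println!("{:?}", requested_exit_code(&["jscpd", "--exit-code"]));
    println!("{:?}", requested_exit_code(&["jscpd", "--exit-code", "2"]));
    println!("{:?}", requested_exit_code(&["jscpd", "src"]));
}
```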
- Fixed `--exit-code` to match jscpd v4's `--exitCode` behavior (was boolean, now optional integer)
- Fixed unique temp dir generation in reporter tests (added PID to prevent race conditions under parallel test runners)
- CLI alignment with jscpd v4: new `--absolute`, `--ignore-case`, `--formats-exts`, `--formats-names` flags; fixed `--threshold`, improved `--max-size`
- Detection and statistics aligned with jscpd for consistent output across Rust and TypeScript versions
- Side-by-side blame comparison in console-full reporter
- Clone list display in console reporter
- HTML reporter now outputs `jscpd-report.html` at the `output_dir` root
- Resolved all clippy warnings across workspace
- Fixed unique temp dir generation in tests (use `as_nanos()` instead of `subsec_nanos()`)
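
The temp-dir fixes in this changelog rest on two details: `subsec_nanos()` returns only the fractional part of the current second, so its values recur every second, while `as_nanos()` returns the full duration; and the process ID separates test processes that start at the same instant. A minimal sketch combining both, with the hypothetical helper name `unique_temp_dir`:

```rust
use std::env;
use std::path::PathBuf;
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: build a temp-dir path from a prefix, the process ID,
/// and the full nanosecond timestamp. It does not create the directory.
fn unique_temp_dir(prefix: &str) -> PathBuf {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the Unix epoch")
        .as_nanos();
    env::temp_dir().join(format!("{prefix}-{}-{nanos}", process::id()))
}

fn main() {
    println!("{}", unique_temp_dir("cpd-test").display());
}
```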
- Rust-based cpd CLI with full feature parity to TypeScript jscpd
- Cross-platform binary distribution via npm platform packages (linux-x64-gnu, linux-arm64-gnu, linux-x64-musl, darwin-arm64, darwin-x64, windows-x64-msvc)
- 13 reporters: json, console, xml, csv, html, markdown, sarif, ai, badge, xcode, threshold, silent, console-full
- Time reporter for execution timing
- CLI short-form aliases matching TypeScript jscpd conventions
- ReportContext data structure for extensible reporter signatures
- Trusted Publishing support for crates.io via OIDC
- Fixed Vue SFC tokenization to dispatch each block to its own sub-format
- Fixed entire-file duplicates silently dropped by RabinKarp store flush logic
- Fixed ReDoS hang on Lisp/Elisp files
- Fixed crash on malformed package.json when reading config
- Initial Rust workspace with cpd-core, cpd-tokenizer, cpd-finder, cpd-reporter, and jscpd crates
- Cross-format detection for Vue SFC, Svelte, Astro, and Markdown files
- Shebang detection for extensionless scripts
- First stable Rust release — replaces the TypeScript-based CLI with a native binary
- Reporter trait signature changed to use ReportContext instead of Statistics directly