
docs: Add doc for features: row based spill, adaptive hash partition, memory management pushdown #700

Open
wangxinshuo-bolt wants to merge 2 commits into bytedance:main from wangxinshuo-bolt:add_doc


wangxinshuo-bolt commented Jul 3, 2026

Collaborator

What problem does this PR solve?

Issue Number: close #695

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 🚀 Performance improvement (optimization)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🔨 Refactoring (no logic changes)
  • 🔧 Build/CI or Infrastructure changes
  • 📝 Documentation only

Description

  1. Adds a blog post explaining why Bolt moved off-heap memory reservation, repayment, spill, and OOM decisions to the Bolt side. It also covers the follow-up RSS quota calibration mechanism, which uses process RSS feedback to reduce premature spills and OOMs when logical quota usage is higher than actual resident memory usage.
  2. Adds a blog post describing row-based spill for operators whose execution state is already row-oriented. The post explains how writing RowContainer-backed state directly to disk can reduce column-row conversion, serialization, restore, and merge overhead.
  3. Adds a blog post explaining how HashJoin spill partition counts can be adjusted using runtime row-count information. The post covers first-spill capacity sampling, adaptive joinPartitionBits, adaptive joinRepartitionBits, compression selection, fallback behavior, and limitations such as row-width variance and skew.
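
As a rough illustration of the third item, the sketch below shows one way a partition bit count could be derived from a runtime row-count estimate and a sampled per-partition row capacity. It is a minimal sketch under stated assumptions, not Bolt's implementation: the function name `choosePartitionBits`, its parameters, and the clamp-to-range policy are all hypothetical, and the real joinPartitionBits logic in the blog post may differ.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not a Bolt API): pick how many hash bits to use when
// partitioning spilled join input. The partition count is 2^bits.
//
// estimatedRows    - runtime estimate of rows that will be spilled (assumed input)
// rowsPerPartition - rows one partition can hold, e.g. sampled at first spill
// minBits/maxBits  - allowed range; the caller must ensure minBits <= maxBits
inline uint8_t choosePartitionBits(
    uint64_t estimatedRows,
    uint64_t rowsPerPartition,
    uint8_t minBits,
    uint8_t maxBits) {
  // Fallback: without a usable estimate, keep the smallest allowed value.
  if (estimatedRows == 0 || rowsPerPartition == 0) {
    return minBits;
  }
  // Overflow-safe ceiling division: partitions needed to fit all rows.
  const uint64_t needed = estimatedRows / rowsPerPartition +
      (estimatedRows % rowsPerPartition != 0 ? 1 : 0);
  // Smallest bit count whose partition count covers `needed`.
  uint8_t bits = 0;
  while (bits < 63 && (uint64_t{1} << bits) < needed) {
    ++bits;
  }
  return std::clamp(bits, minBits, maxBits);
}
```

With 1,000,000 estimated rows and a sampled capacity of 100,000 rows per partition, this sketch needs 10 partitions and therefore picks 4 bits (16 partitions). Because it reasons only in row counts, it shares the limitations the description mentions: wide rows or skewed keys can still overflow a partition.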

Performance Impact

  • No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).

  • Positive Impact: I have run benchmarks.

    Click to view Benchmark Results
    Paste your google-benchmark or TPC-H results here.
    Before: 10.5s
    After:   8.2s  (+20%)
    
  • Negative Impact: Explained below (e.g., trade-off for correctness).

Release Note

Please describe the changes in this PR

Release Note:
- Add docs for three features: row-based spill, adaptive hash partition, and memory management pushdown

Checklist (For Author)

  • I have added/updated unit tests (ctest).
  • I have verified the code with local build (Release/Debug).
  • I have run clang-format / linters.
  • (Optional) I have run Sanitizers (ASAN/TSAN) locally for complex C++ changes.
  • No need to test or manual test.

Breaking Changes

  • No

  • Yes (Description: ...)

    Click to view Breaking Changes
    Breaking Changes:
    - Description of the breaking change.
    - Possible solutions or workarounds.
    - Any other relevant information.
    

wangxinshuo-bolt changed the title from "Add doc" to "[Docs] Add doc for features: row based spill, adaptive hash partition, memory management pushdown" Jul 3, 2026
wangxinshuo-bolt changed the title from "[Docs] Add doc for features: row based spill, adaptive hash partition, memory management pushdown" to "docs: Add doc for features: row based spill, adaptive hash partition, memory management pushdown" Jul 3, 2026
wangxinshuo-bolt marked this pull request as ready for review July 3, 2026 12:23

Development

Successfully merging this pull request may close these issues.

[Documentation] Add blog posts for Bolt features and performance improvements
