fix: add json_to_map control char parse config #686

Open
zhangxffff wants to merge 1 commit into bytedance:main from zhangxffff:fix/json-to-map-escape-control-chars

@zhangxffff zhangxffff commented Jun 30, 2026

Collaborator

What problem does this PR solve?

Issue Number: related to #672

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 🚀 Performance improvement (optimization)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🔨 Refactoring (no logic changes)
  • 🔧 Build/CI or Infrastructure changes
  • 📝 Documentation only

Description

In PR #674, we aligned the handling of unescaped control characters such as '\n' in JSON with the Hive UDF implementation. However, this requires an extra parsing pass to handle the control characters, so this PR adds a new json_to_map_escape_control_chars config to control whether json_to_map handles them.

When json_to_map_escape_control_chars is true, json_to_map handles unescaped control characters and returns a valid map.

When json_to_map_escape_control_chars is false, json_to_map returns null if the JSON contains unescaped control characters.
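
To make the two modes concrete, here is a minimal Python sketch of the behavior described above. It is only an illustration, not the code in this PR: the helper `escape_control_chars`, the Python function `json_to_map`, and its `escape_enabled` parameter are hypothetical names, and the sketch assumes that "handling" a control character means escaping it inside string literals before a strict parse.

```python
import json

# Short escapes defined by JSON; other control chars fall back to \u00XX.
_SHORT_ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t", "\b": "\\b", "\f": "\\f"}


def escape_control_chars(text: str) -> str:
    """Extra pass: escape raw control chars (< 0x20) inside JSON string literals."""
    out = []
    in_string = False
    after_backslash = False  # previous char was a backslash inside a string
    for ch in text:
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif after_backslash:
            after_backslash = False
            out.append(ch)
        elif ch == "\\":
            after_backslash = True
            out.append(ch)
        elif ch == '"':
            in_string = False
            out.append(ch)
        elif ord(ch) < 0x20:
            out.append(_SHORT_ESCAPES.get(ch, f"\\u{ord(ch):04x}"))
        else:
            out.append(ch)
    return "".join(out)


def json_to_map(text, escape_enabled: bool):
    """Hypothetical stand-in for json_to_map; escape_enabled mirrors the config."""
    if text is None:
        return None
    if escape_enabled:
        text = escape_control_chars(text)
    try:
        # json.loads is strict by default and rejects raw control chars in strings.
        value = json.loads(text)
    except json.JSONDecodeError:
        return None
    # Assumption: only a top-level JSON object maps to a map; anything else is null.
    return value if isinstance(value, dict) else None
```

With the flag on, an input such as `'{"a": "line1\nline2"}'` (containing a raw newline) parses to a map whose value keeps the newline; with the flag off, the same input yields `None`. Input without raw control characters behaves the same in both modes, and the flag-off path skips the extra pass entirely.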

Performance Impact

  • No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).

  • Positive Impact: I have run benchmarks.

    
  • Negative Impact: Explained below (e.g., trade-off for correctness).

Release Note

Please describe the changes in this PR

Release Note:
- Fixed a crash in `substr` when input is null.
- optimized `group by` performance by 20%.

Checklist (For Author)

  • I have added/updated unit tests (ctest).
  • I have verified the code with local build (Release/Debug).
  • I have run clang-format / linters.
  • (Optional) I have run Sanitizers (ASAN/TSAN) locally for complex C++ changes.
  • No need to test or manual test.

Breaking Changes

  • No

  • Yes (Description: ...)

    Click to view Breaking Changes
    Breaking Changes:
    - Description of the breaking change.
    - Possible solutions or workarounds.
    - Any other relevant information.
    

@zhangxffff zhangxffff requested a review from guhaiyan0221 July 1, 2026 09:22