Skip to content

feat(data-collection): create DataCollection option in client#6702

Open
ericapisani wants to merge 16 commits into
masterfrom
ep/db-spec-experiement-foundation-dict
Open

feat(data-collection): create DataCollection option in client#6702
ericapisani wants to merge 16 commits into
masterfrom
ep/db-spec-experiement-foundation-dict

narrow the type on a variable

0c3be5c
Select commit
Loading
Failed to load commit list.
@sentry/warden / warden: code-review completed Jul 3, 2026 in 0s

4 issues

code-review: Found 4 issues (1 medium, 3 low)

Medium

`should_send_default_pii()` ignores `data_collection["user_info"]`, causing silent divergence - `sentry_sdk/client.py:357`

After this change, EventScrubber is initialized from data_collection["user_info"], but should_send_default_pii() (line 753) still reads only options["send_default_pii"]. A user who migrates to data_collection={"user_info": True} without setting send_default_pii=True will find EventScrubber won't scrub PII fields, yet all integration code paths guarded by should_send_default_pii() — including scope user attachment (scope.py:1749), cookies, IP addresses, and dozens of integrations — will still suppress PII, leaving the configuration in a silently split state.

Low

Spotlight re-derivation treats `None` option values differently than `resolve_data_collection` - `sentry_sdk/client.py:640-643`

Using is not False identity checks for include_local_variables and include_source_context means an explicit None value is treated as True here, while resolve_data_collection passes None directly to _map_from_send_default_pii (where it's falsy). If a user passes either option as None, the spotlight override would produce a data_collection inconsistent with the initial resolution. Consider using the raw option values directly (matching resolve_data_collection's approach) or explicitly coercing with bool(...).

_http_headers_from_value silently mishandles non-dict values via string-contains fallback - `sentry_sdk/consts.py:1282`

_resolve_explicit passes d.get("http_headers", {}) straight into _http_headers_from_value without verifying the value is a dict. Inside, the guard "request" in val performs a substring check when val is a string. If a user passes a string like "off" (which contains neither "request" nor "response"), the code silently falls back to the deny_list default — meaning a user attempting to disable header collection actually still collects headers (a PII concern). If instead the string contains "request" or "response" as a substring (e.g. "request_off"), val["request"] performs string indexing and raises a cryptic TypeError/AttributeError in _kvcb_from_value rather than a clear validation error. The test test_http_headers_collection_defaults exercises "off" and confirms the silent-fallback path, but no validation rejects malformed values.

Also found at:

  • sentry_sdk/data_collection.py:152-158
DeprecationWarning stacklevel points to internal SDK code instead of user code - `sentry_sdk/data_collection.py:249-251`

With stacklevel=2, the warning points to _get_options() in client.py rather than the user's sentry_sdk.init() call, making it hard for users to locate the offending line in their own code. A stacklevel of ~5 is needed to reach user code via resolve_data_collection_get_options_Client.__init___init → user code.

Also found at:

  • sentry_sdk/client.py:28

⏱ 11m 19s · 4.6M in / 105.0k out · $4.37

Annotations

Check warning on line 357 in sentry_sdk/client.py

See this annotation in the file changed.

@sentry-warden sentry-warden / warden: code-review

`should_send_default_pii()` ignores `data_collection["user_info"]`, causing silent divergence

After this change, `EventScrubber` is initialized from `data_collection["user_info"]`, but `should_send_default_pii()` (line 753) still reads only `options["send_default_pii"]`. A user who migrates to `data_collection={"user_info": True}` without setting `send_default_pii=True` will find EventScrubber won't scrub PII fields, yet all integration code paths guarded by `should_send_default_pii()` — including scope user attachment (`scope.py:1749`), cookies, IP addresses, and dozens of integrations — will still suppress PII, leaving the configuration in a silently split state.