macos: fix AXVisibleCharacterRange and UTF-16 indexing#12881
Conversation
| // pins.items is monotonic in Pin order (the formatter emits in | ||
| // document order). Pin.before is O(pages) for cross-node | ||
| // comparisons, so we use binary search rather than a linear scan. | ||
| const start = std.sort.partitionPoint( | ||
| Pin, | ||
| pins.items, | ||
| self.pages.getTopLeft(.viewport), | ||
| struct { | ||
| fn pred(ctx: Pin, pin: Pin) bool { | ||
| return pin.before(ctx); | ||
| } | ||
| }.pred, | ||
| ); | ||
| const end = start + std.sort.partitionPoint( | ||
| Pin, | ||
| pins.items[start..], | ||
| self.pages.getBottomRight(.viewport).?, | ||
| struct { | ||
| fn pred(ctx: Pin, pin: Pin) bool { | ||
| return !ctx.before(pin); | ||
| } | ||
| }.pred, | ||
| ); |
There was a problem hiding this comment.
This approach could be a lot more efficient if we didn't need to allocate and search through the entire pin_map.
I separately implemented a new pin_offsets formatter capability (#12882) that would makes this operation much cheaper.
af68b58 to
7dda884
Compare
bo2themax
left a comment
There was a problem hiding this comment.
macOS part looks good to me 👍
Several accessibility methods on SurfaceView returned values in the wrong coordinate system, which made the visible-range and selection APIs unusable across screen readers, translation tools, and AI autocomplete apps: - accessibilityVisibleCharacterRange returned the full scrollback as the visible range. AX clients would fetch hundreds of KB to translate a single sentence on screen. - accessibilityNumberOfCharacters reported grapheme count rather than UTF-16 code units. Any range derived from it indexed into the wrong bytes when the buffer contained emoji. - accessibilityLine(for:) treated the parameter as a grapheme index, same problem. - accessibilitySelectedTextRange returned viewport-cell-linear offsets (y*cols+x), which are unrelated to UTF-16 NSRange semantics. The selection's range and text disagreed for non-ASCII content. The main idea is a self-consistent screen-text snapshot. The Zig core adds Screen.screenText which uses ScreenFormatter directly to emit the full screen along with a per-byte Pin array, then binary-searches the array for the viewport's start/end byte offsets. The C API exports this as ghostty_surface_read_screen / ghostty_screen_text_s. The macOS layer caches a Ghostty.SurfaceView.ScreenText snapshot that holds the text, the viewport NSRange in UTF-16, the total UTF-16 length, and a precomputed lineStarts table. The AX overrides read from this single cache so all reported lengths and ranges live in the same coordinate system. accessibilitySelectedTextRange searches the cached text for the selection's bytes and returns the unique match, or NSNotFound when the selection appears more than once. A wrong-but-valid range would mislead AX clients more than NSNotFound. The preexisting cachedScreenContents and cachedVisibleContents stay in place because GetTerminalDetailsIntent still reads them. Folding those into the new cache can be a separate cleanup. The overall result was verified using some small AX scripts to confirm correct AXVisibleCharacterRange, AXNumberOfCharacters, AXStringForRange, and AXSelectedTextRange behavior. Note that AXRangeForPosition and AXBoundsForRange are still not yet implemented. Those need a little more foundational work and can come in follow-up changes.
This allows it to be shared with a future iOS implementation.
Extend ScreenText to include the active selection's UTF-8 byte offsets alongside the text and viewport. The selection range is computed as part of the same formatter pass as the text, so they're guaranteed to be in sync. accessibilitySelectedTextRange now reads the cached selection range directly, and we now return NSNotFound when there's no selection per the NSAccessibility convention for read-only content.
e778459 to
359d3d3
Compare
mitchellh
left a comment
There was a problem hiding this comment.
Awesome, thanks for looking at this. I think the only sus part is the Screen.zig changes. I get what you're going for and why formatters directly don't provide this,but I want to spend more time thinking about if there is a better way to expose this.
Assume you mean this bit, or something else? while (r.end > r.start and text[r.end - 1] == '\n') r.end -= 1; |
Several accessibility methods on
SurfaceViewreturned values in the wrong coordinate system, which made the visible-range and selection APIs unusable across screen readers, translation tools, and AI autocomplete apps:accessibilityVisibleCharacterRangereturned the full scrollback as the visible range. AX clients would fetch hundreds of KB to translate a single sentence on screen.accessibilityNumberOfCharactersreported grapheme count rather than UTF-16 code units. Any range derived from it indexed into the wrong bytes when the buffer contained emoji.accessibilityLine(for:)treated the parameter as a grapheme index, same problem.accessibilitySelectedTextRangereturned viewport-cell-linear offsets (y*cols+x), which are unrelated to UTF-16NSRangesemantics. The selection's range and text disagreed for non-ASCII content.The main idea is a self-consistent screen-text snapshot. The Zig core adds
Screen.screenTextwhich usesScreenFormatterdirectly to emit the full screen along with a per-bytePinarray, then binary-searches the array for the viewport's start/end byte offsets. The C API exports this asghostty_surface_read_screen/ghostty_screen_text_s.The macOS layer caches a
Ghostty.SurfaceView.ScreenTextsnapshot that holds the text, the viewportNSRangein UTF-16, the total UTF-16 length, and a precomputedlineStartstable. The AX overrides read from this single cache so all reported lengths and ranges live in the same coordinate system.accessibilitySelectedTextRangesearches the cached text for the selection's bytes and returns the unique match, orNSNotFoundwhen the selection appears more than once. A wrong-but-valid range would mislead AX clients more thanNSNotFound.The preexisting
cachedScreenContentsandcachedVisibleContentsstay in place becauseGetTerminalDetailsIntentstill reads them. Folding those into the new cache can be a separate cleanup.The overall result was verified using some small AX scripts to confirm correct
AXVisibleCharacterRange,AXNumberOfCharacters,AXStringForRange, andAXSelectedTextRangebehavior.Note that
AXRangeForPositionandAXBoundsForRangeare still not yet implemented. Those need a little more foundational work and can come in follow-up changes.See: #9932
AI Disclosure: I did the initial investigation using Claude Opus 4.7, which was helpful in generating the test cases and planning the overall phased approach.