Track service-oracle execution time only when the service runs, not on replay by ndr-ds · Pull Request #6517 · linera-io/linera-protocol

ndr-ds · 2026-06-19T22:13:39Z

Draft — flagged for execution-team review. This changes consensus-relevant
execution (service-oracle execution-time accounting). Please sanity-check the reasoning
before this is considered for merge/backport.

Motivation

When a validator re-executes a certified block it replays the recorded oracle responses, so
the computed outcome must be a deterministic function of the block and those responses — the
chain worker compares the re-computed (messages, state_hash) against the certified one and
raises CorruptedChainState on any mismatch.

run_service_oracle_query (linera-execution/src/runtime.rs) breaks this for blocks that
call a service as an oracle (query_service, used by pm-oracle):

- The service response is correctly memoized and replayed: execution_state_actor.rs
  runs the service inside TransactionTracker::oracle(), whose closure is skipped on replay.
- But the runtime times the whole request with Instant::now()…elapsed() and feeds it to
  ResourceController::track_service_oracle_execution, which accumulates it and hard-aborts
  with MaximumServiceOracleExecutionTimeExceeded once the running total crosses
  policy.maximum_service_oracle_execution_ms.
- That timing runs on every execution, including replay. Two honest validators
  re-executing the same block measure different wall-clock times (host load, scheduling), so
  one can cross the limit and abort while the other completes → different
  (messages, state_hash) → CorruptedChainState.
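
To make the failure mode concrete, here is a minimal, simplified model of the accounting described above. It is not the linera-execution code: `Tracker`, `ExecError` and `replay_with_local_timing` are hypothetical names, and the per-call elapsed times are passed in as plain values so the divergence can be reproduced without depending on real wall-clock noise.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the part of the resource controller that
/// accumulates service-oracle execution time.
#[derive(Debug, Default)]
pub struct Tracker {
    pub service_oracle_execution: Duration,
}

/// Hypothetical error type; only the variant name mirrors the PR text.
#[derive(Debug, PartialEq, Eq)]
pub enum ExecError {
    MaximumServiceOracleExecutionTimeExceeded,
}

impl Tracker {
    /// Adds `elapsed` to the running total and aborts once the total
    /// crosses `limit`.
    pub fn track(&mut self, elapsed: Duration, limit: Duration) -> Result<(), ExecError> {
        self.service_oracle_execution += elapsed;
        if self.service_oracle_execution > limit {
            return Err(ExecError::MaximumServiceOracleExecutionTimeExceeded);
        }
        Ok(())
    }
}

/// Models the old behaviour: every validator replays the same recorded
/// responses, but charges whatever wall-clock time it measured locally
/// for each call (`measured` stands in for those local measurements).
pub fn replay_with_local_timing(measured: &[Duration], limit: Duration) -> Result<(), ExecError> {
    let mut tracker = Tracker::default();
    for elapsed in measured {
        tracker.track(*elapsed, limit)?;
    }
    Ok(())
}
```

With a 10 ms limit, a validator that measures 4 ms + 5 ms completes, while one that measures 4 ms + 7 ms for the same block aborts. The block and the recorded responses are identical in both cases; only the local measurement differs.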

Wall-clock leaking into the block outcome is a determinism violation, and a candidate root
cause for the recent per-validator-disjoint "Corrupted chain state" surge on testnet_conway
(the affected chains are PM chains, and pm-oracle calls query_service).

Proposal

Measure the service-oracle execution time where the service actually runs — inside the
actor's oracle() closure, which is skipped during replay — and report it back to the runtime
to track. On replay the closure does not run, so the reported time is Duration::ZERO and
nothing accumulates. The per-call deadline (used only by the proposer during validation) is
unchanged. Wall-clock time is now only ever consumed by the proposer at validation time, never
compared across replaying validators.
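
The shape of the proposed change can be sketched as follows. This is an illustration under assumptions, not the actual diff: the free function `oracle` and its `recorded` parameter are hypothetical simplifications of the closure-based TransactionTracker::oracle() mechanism described above. The point it shows is that the timer lives inside the branch that actually runs the service, so the replay branch has nothing to measure.

```rust
use std::time::{Duration, Instant};

/// Hypothetical simplification of an oracle call.
///
/// * `recorded` is `Some(response)` when replaying a certified block and
///   `None` when the block is being executed for the first time.
/// * `run_service` does the real service work; it is never called on replay.
///
/// Returns the response together with the time spent running the service,
/// which is `Duration::ZERO` on replay.
pub fn oracle<F>(recorded: Option<Vec<u8>>, run_service: F) -> (Vec<u8>, Duration)
where
    F: FnOnce() -> Vec<u8>,
{
    match recorded {
        // Replay: reuse the recorded response, report no execution time.
        Some(response) => (response, Duration::ZERO),
        // First execution: time only the service call itself.
        None => {
            let start = Instant::now();
            let response = run_service();
            (response, start.elapsed())
        }
    }
}
```

The caller would then pass the returned duration to whatever accumulates the limit. Because the replay branch always reports zero, no replaying validator can cross the limit due to its own host load.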

Test Plan

- New regression test test_query_service_does_not_track_execution_time_on_replay
  (contract_runtime_apis.rs): replays a query_service call and asserts
  controller.tracker.service_oracle_execution == Duration::ZERO. It fails on the old code
  (which tracks the round-trip elapsed(), always > 0) and passes with the fix.
- cargo clippy -p linera-execution is clean.

Release Plan

These changes should be backported to the latest testnet branch, then
be released in a validator hotfix.
(Consensus determinism — affects certified-block re-execution.)

Links

Same incident as #6507 / #6508 (Fix JournalingError::must_reload_view to delegate to the
inner store error), different mechanism.

Open questions for reviewers

Is there any case where a replaying validator should still enforce a service-oracle time
limit? (I believe not — the response is certified and no real service work runs.)
Alternative considered: record the measured execution time inside the OracleResponse so it
is certified and replayed deterministically, rather than discarded on replay. This PR takes
the smaller change (discard on replay); happy to switch if you prefer the recorded-time
approach.
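
For comparison, the recorded-time alternative could look roughly like the sketch below. It is not part of this PR, and `RecordedServiceResponse`, `LimitExceeded` and `charge_recorded` are hypothetical names rather than existing linera-execution types. The assumption is that the proposer stores its measured time next to the response, and every replaying validator charges that stored value instead of measuring anything itself.

```rust
use std::time::Duration;

/// Hypothetical oracle response that carries the proposer's measured
/// execution time alongside the response bytes, so both are certified.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RecordedServiceResponse {
    pub bytes: Vec<u8>,
    pub execution_time: Duration,
}

/// Hypothetical error returned when the recorded total crosses the limit.
#[derive(Debug, PartialEq, Eq)]
pub struct LimitExceeded;

/// Charges the recorded times against `limit`. The result depends only on
/// the certified data, so every replaying validator computes the same one.
pub fn charge_recorded(
    responses: &[RecordedServiceResponse],
    limit: Duration,
) -> Result<Duration, LimitExceeded> {
    let mut total = Duration::ZERO;
    for response in responses {
        total += response.execution_time;
        if total > limit {
            return Err(LimitExceeded);
        }
    }
    Ok(total)
}
```

The trade-off in this sketch is that the limit stays enforceable on replay, at the cost of changing the recorded response format, whereas discarding the time on replay leaves the format untouched.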

Commit d07eecd: Track service-oracle execution time only when the service runs, not on replay

ndr-ds wants to merge 1 commit into main from ndr-ds/fix-service-oracle-replay-determinism
