lazily

Lazy reactive primitives for Rust — Context, Slots, Cells with automatic dependency tracking and cache invalidation.

Overview

lazily provides five core primitives for reactive computation:

Context — owns all reactive state and manages the dependency graph
Slot — a lazily-computed cached value that automatically tracks dependencies
Cell — a mutable value that invalidates dependent Slots when changed
Signal — an eager derived value that recomputes the instant a dependency invalidates, with no intermediate unset value
Effect — a side-effect callback that automatically reruns after tracked dependencies invalidate

Values are lazy by default: dependents are marked dirty on invalidation but only validated or recomputed when accessed. When you need eager push-style semantics — recompute immediately, observe v1 -> v2 with no unset window — reach for Signal, which layers a puller effect over a memoized slot. The Slot -> Cell -> Signal progression lets you choose lazy or eager per derived value within one graph. ctx.memo() Slots use a memo guard: if recomputation produces the same value, downstream dirty caches and effects are left alone. Multiple updates can be grouped with ctx.batch(...) so invalidation and effect reruns happen once after the outermost batch exits.

Development

Minimum supported Rust version (MSRV): 1.88 — declared via rust-version in Cargo.toml. The crate uses let_chains (stabilized in 1.88) pervasively.

Run the local CI-equivalent suite with:

make check

The Makefile also exposes focused targets such as make test-tokio, make test-loom, make benchmark-check, and make benchmark-update.

Usage

#![allow(unused)]
fn main() {
use lazily::Context;

let ctx = Context::new();

// Create a mutable cell
let counter = ctx.cell(0i32);

// Create a derived value (automatically tracks dependencies)
let doubled = ctx.computed(|ctx| {
    let val = counter.get(ctx);
    val * 2
});

assert_eq!(doubled.get(&ctx), 0);

// Mutate the cell — dependents are marked dirty (not recomputed yet)
counter.set(&ctx, 5);

// Slot recomputes lazily on next access
assert_eq!(doubled.get(&ctx), 10);

// Effects run immediately and then after tracked dependencies change
let effect = ctx.effect(move |ctx| {
    println!("counter = {}", counter.get(ctx));
});

counter.set(&ctx, 6); // schedules and runs the effect once
effect.dispose(&ctx); // unsubscribes and prevents future reruns

// Batch writes coalesce invalidation and effect reruns.
ctx.batch(|ctx| {
    counter.set(ctx, 7);
    counter.set(ctx, 8);
});
}

Why Lazy?

	Lazy (Slots)	Eager (Signals)
When does recomputation happen?	On access (`get`)	Immediately on change
Wasted work	Zero — only compute what’s read	Can compute values nobody uses
Glitch-free	By construction	Requires topological sorting
Ordering	Irrelevant — pull-based	Critical — push-based DAG walk
Use case	Request handling, data pipelines	UI rendering, real-time updates

In a web server handling requests, you might have 50 computed values available but any given request only uses 5. With eager reactivity, all 50 recompute on every change. With lazy, only the 5 actually accessed compute.

lazily defaults to lazy but does not force the choice on you: derive with ctx.computed()/ctx.memo() for pull-based laziness, or ctx.signal() for the eager column above (UI rendering, real-time mirrors, always-materialized values). Both share the same context, dependency graph, glitch-freedom, and memo guard — pick per value.

Core Concepts

Context

Context owns all Slots and Cells. It manages the dependency graph and provides the API for creating, reading, and mutating reactive values. Think of it as the “world” for your reactive computations — in web frameworks, this maps to a request context, application scope, or component tree.

The current Context is intentionally single-threaded. It uses RefCell and non-Send callback storage to keep the fast path allocation-only and mutex-free. Create independent contexts per OS thread for local graphs, or use ThreadSafeContext when one reactive graph must be shared across threads.

Slot

A SlotHandle<T> wraps a compute function Fn(&Context) -> T. The result is cached after first access. Dependencies are discovered automatically via a thread-local tracking stack — any Slot or Cell accessed during computation becomes a dependency. ctx.computed() is the ergonomic name for a derived value; ctx.slot() is the same primitive. Use ctx.memo() when T: PartialEq and equal recomputations should suppress downstream work.

When a dependency is invalidated, the Slot marks its cached value dirty. It does not validate or recompute until ctx.get() is called again. For ctx.memo() slots, if recomputation returns a value equal to the previous cache, downstream dirty Slots become fresh without recomputing, and scheduled effects that only depended on unchanged Slots skip cleanup/rerun.

Dependencies are dynamic. Every time a Slot recomputes, it re-discovers its dependencies from scratch. If your compute function has conditional branches that access different Cells depending on state, the dependency graph updates automatically. No stale subscriptions, no manual cleanup.

Cell

A CellHandle<T> holds a mutable value. cell.set(&ctx, value) and ctx.set_cell() compare old and new values via PartialEq — if unchanged, no invalidation occurs. If changed, all dependent Slots are recursively marked dirty.

Signal

A SignalHandle<T> is an eager derived value — the eager counterpart to a lazy Slot, one step further along the Slot -> Cell -> Signal progression. Where a Slot only marks itself dirty on invalidation and recomputes on the next read, a Signal recomputes the instant a dependency is invalidated, before the invalidating set/set_cell/batch call returns. The value is always materialized, so observers never see an intermediate unset value — a dependency change drives the value directly from v1 to v2.

#![allow(unused)]
fn main() {
let n = ctx.cell(1);
let doubled = ctx.signal(|ctx| n.get(ctx) * 2); // materialized now: 2
n.set(&ctx, 5);                                  // doubled is already 10 — eager
assert_eq!(doubled.get(&ctx), 10);
}

A Signal is composed from existing primitives, not a parallel engine: a memoized Slot (ctx.memo) supplies glitch-free, pull-based, memo-guarded recomputation, and a small puller Effect re-materializes that slot after every invalidation to supply the eagerness. Consequently a Signal inherits the memo guard (an equal recompute suppresses downstream work) and diamond glitch-freedom (D = f(A, g(A)) never surfaces a mixed new-A/old-g(A) intermediate), and batched writes settle to one consistent recomputation at batch exit.

ctx.signal() requires T: PartialEq + 'static (the memo guard); get_signal additionally requires T: Clone. signal.dispose(&ctx) removes the eager puller — the value stays readable but reverts to lazy (recompute-on-read) behavior. The same primitive is available on ThreadSafeContext (signal, returning a Send + Sync ThreadSafeSignalHandle<T>) and AsyncContext (signal_async, with a non-blocking get_signal snapshot and an awaiting get_signal_async); see SPEC.md for the per-context type bounds and the async eagerness caveat.

Batch Updates

ctx.batch(|ctx| { ... }) groups multiple cell updates and explicit slot/cell clears into one invalidation pass. Nested batches flush only when the outermost batch exits. Direct ctx.get_cell() reads inside the callback see the latest cell value immediately; changed-cell dependents are marked dirty after the batch, so Slot reads during the callback return their pre-batch cached value until the batch completes.

Effect

An EffectHandle represents a side-effect callback registered with ctx.effect(). Effects run immediately, track any Slots or Cells read during that run, and rerun after those dependencies invalidate. Scheduled effect reruns are flushed after the invalidation pass, so diamond dependency paths coalesce to one rerun. Effects scheduled only by dirty Slot dependencies first validate those Slots and skip cleanup/rerun when values are unchanged.

Effects can return a cleanup closure. Cleanup runs before the next rerun and when the handle is disposed:

#![allow(unused)]
fn main() {
let effect = ctx.effect(move |ctx| {
    let value = counter.get(ctx);
    move || println!("cleanup for {value}")
});

effect.dispose(&ctx);
}

API

Method	Purpose
`Context::new()`	Create a new context
`ctx.computed(\|ctx\| T)`	Create a derived lazily-computed value
`ctx.slot(\|ctx\| T)`	Create a lazily-computed slot; synonym of `ctx.computed()`
`ctx.memo(\|ctx\| T)`	Create a lazily-computed slot with a `PartialEq` memoization guard
`slot.get(&ctx)`	Get value (computes if unset)
`ctx.get(&slot)`	Context method alias for `slot.get(&ctx)`
`ctx.cell(value)`	Create a mutable cell
`cell.get(&ctx)`	Get cell value
`ctx.get_cell(&cell)`	Context method alias for `cell.get(&ctx)`
`ctx.set_cell(&cell, value)`	Update cell (marks dependents dirty if changed)
`cell.set(&ctx, value)`	Handle method alias for `ctx.set_cell(&cell, value)`
`ctx.signal(\|ctx\| T)`	Create an eager derived value (recomputes on invalidation, no unset window); `T: PartialEq + 'static`
`signal.get(&ctx)`	Get the signal’s value (`T: Clone`); also `ctx.get_signal(&signal)`
`signal.dispose(&ctx)`	Remove the eager puller; value reverts to lazy recompute-on-read
`signal.is_active(&ctx)`	Check whether the eager puller is still registered
`ctx.batch(\|ctx\| { ... })`	Defer changed-cell dirty marking and explicit clears until the outermost batch exits
`ctx.effect(\|ctx\| { ... })`	Run an effect immediately and rerun it after tracked dependencies invalidate
`ctx.is_set(&slot)`	Check if slot has a cached, fresh value
`slot.clear(&ctx)`	Clear cached value and cascade to dependents
`cell.clear_dependents(&ctx)`	Clear downstream slots without changing cell value
`effect.dispose(&ctx)`	Dispose an effect and unsubscribe dependencies
`effect.is_active(&ctx)`	Check whether an effect is still registered

ThreadSafeContext

ThreadSafeContext is the mutex-backed counterpart for sharing one reactive graph across OS threads. It mirrors the core Context methods while requiring Send + Sync + 'static values and compute/effect callbacks. The graph lock is released before user compute callbacks, effect callbacks, or cleanup closures run, so callbacks can re-enter the same context without deadlocking. If a slot is invalidated while its callback is running, the stale result is discarded and the getter retries before returning a fresh value.

Design

Lazy by default, eager on demand: Slots mark dirty on invalidation and validate/recompute on access; ctx.signal() opts a value into eager recomputation (a memo-slot + puller-effect composition) with no intermediate unset state
Ergonomic aliases: ctx.computed() names derived values while preserving ctx.slot() for low-level terminology
PartialEq guard: CellHandle::set() only invalidates when value actually changes
Memo guard: Dirty ctx.memo() Slots compare recomputed values and suppress downstream recomputation/effect reruns when values are equal
Dynamic dependencies: Edges re-discovered on each recomputation (no stale subscriptions)
Batching: Multiple writes share one invalidation/effect flush boundary
Effect scheduling: Effects rerun after dependency invalidation and coalesce duplicate schedules
Slot-id-indexed contiguous node storage for the single-threaded fast path
Interior mutability via RefCell (single-threaded)
Thread-local tracking stack for automatic dependency discovery
Zero mandatory runtime dependencies in the default library surface
Optional instrumentation feature for benchmark counters, lock timing, and thread-safe lock attribution

Threading Roadmap

lazily-rs guarantees local, single-threaded Context graphs plus an explicit ThreadSafeContext for shared graphs. SlotHandle<T> and CellHandle<T> are Send + Sync when T is Send + Sync, and EffectHandle is also Send + Sync, but handles must be used with their owning context.

Enable the optional loom feature to run the thread-safe synchronization model:

cargo test --features loom --test thread_safe_loom

Enable the optional tokio feature for sync-on-Tokio integration tests and the tokio_sync example:

cargo test --features tokio
cargo run --example tokio_sync --features tokio

The feature proves ThreadSafeContext can be shared through tokio::spawn and tokio::task::spawn_blocking. It does not add async computations or effects; those need the separate AsyncContext design captured in SPEC.md, including in-flight future deduplication, stale completion handling, cleanup ordering, and separate Send versus LocalSet surfaces.

ThreadSafeContext intentionally keeps one mutex-backed graph lock while fresh cached slot reads use a per-slot read-mostly cached-value sidecar. Dependency edges, dirty/revision state, cached-value publication, batching, and effect queues still mutate under the graph mutex. In-flight recompute waiters use per-slot generation Condvar sidecars so they can park while the compute owner runs user code, and a completion only wakes waiters for that finished slot. Per-slot dependency summaries let cell-only dirty refreshes claim the SlotId-partitioned recompute sidecar and skip the graph-locked get_refresh dependency scan before the final publish mutation. Changed-cell and slot-value invalidation build an explicit frontier plan, then apply dirty flags, revisions, and effect scheduling in one graph-mutex mutation boundary; slot-only changed-cell frontiers may publish dirty state through per-node sidecars with cache-revision dirty epochs instead. The thread_safe_graph_propagation benchmarks compare fan-out eager validation, fan-out/fan-in lazy dirty epoch publication, and fan-in batched flush behavior with lock attribution, effect queue pushes, dependency-edge counters, sidecar dirty marks, sidecar fallbacks, and dirty epoch advances. Sharded-lock or CAS variants should wait for lock wait/hold benchmark evidence and a Loom or Shuttle safety model for stale in-flight completion, invalidation during compute, dynamic dependency cleanup/disposal, effect scheduling/disposal, and re-entrant callbacks. A lock-free versioned optimistic read path is deferred until cached values can be retained independently of graph-protected erased-value storage.

Benchmarks

See BENCHMARKS.md for full benchmark results, regression budgets, lock attribution, and instrumentation profiles.

Multi-Language

lazily is implemented across three languages with shared semantics:

	lazily-rs	lazily-zig	lazily-py
Context	Owned `Context` struct	Explicit allocator	Plain `dict`
Slot creation	`Box<dyn Fn>` closures	`comptime` function pointers	Lambdas
Cell equality	`PartialEq` trait	`std.meta.eql`	`!=` operator
Thread safety	Single-threaded `Context`; explicit `ThreadSafeContext`	Mutex by default	GIL
Storage	Unified generics	`.direct` / `.indirect`	Object identity

Cross-Channel Compatibility

The cross-language family should use one graph-state protocol across channels: IpcMessage::Snapshot and IpcMessage::Delta. Rust FFI is viable as a narrow C ABI adapter with opaque handles and owned byte buffers, not by sharing live Rust contexts, closures, typed handles, or references across the ABI.

IPC, WebSocket frames, WebRTC data channels, and FFI byte buffers can then carry the same permission-filtered snapshots and deltas. Transport code owns framing, memory ownership, reliability, and back-pressure; lazily semantics stay in the shared message schema.

Enable the ffi feature for the C ABI adapter. It exposes an opaque LazilyFfiChannel, JSON IpcMessage validation/classification helpers, and Rust-owned LazilyFfiBytes buffers with an explicit free function. The adapter re-encodes every accepted frame as canonical IpcMessage JSON, so FFI callers share the same state plane as other channels.

lazily-zig — Zig implementation with FFI support
lazily-py — Python implementation with context-as-dict
Blog post: Lazily — Reactive Primitives Done Right

License

MIT

lazily-rs Specification

Rust library for lazy evaluation with context-aware dependency tracking and cache invalidation. Counterpart to lazily-zig and lazily-py.

Core Concepts

Context

Container for all slots and cells. Owns all allocations via interior mutability using a single RefCell<ContextInner>.

#![allow(unused)]
fn main() {
struct ContextInner {
    nodes: Vec<Option<Node>>,
    next_id: u64,
    free_ids: Vec<u64>,
    pending_effects: VecDeque<SlotId>,
    scheduled_effects: HashSet<SlotId>,
    flushing_effects: bool,
    batch_depth: usize,
    batched_cells: HashSet<SlotId>,
    batched_cell_clears: HashSet<SlotId>,
    batched_slots: HashSet<SlotId>,
}

pub struct Context {
    inner: RefCell<ContextInner>,
}
}

API:

Method	Purpose
`Context::new()`	Create a new context
`ctx.computed(\|ctx\| T)`	Create a derived lazily-computed value
`ctx.slot(\|ctx\| T)`	Create a lazily-computed slot; synonym of `ctx.computed()`
`ctx.memo(\|ctx\| T)`	Create a lazily-computed slot with a `PartialEq` memoization guard
`slot.get(&ctx)`	Get value (computes if unset)
`ctx.get(&slot)`	Context method alias for `slot.get(&ctx)`
`ctx.get_rc(&slot)`	Get slot value as `Rc<T>`, avoiding deep clone
`ctx.cell(value)`	Create a mutable cell
`cell.get(&ctx)`	Get cell value
`ctx.get_cell(&cell)`	Context method alias for `cell.get(&ctx)`
`ctx.get_cell_rc(&cell)`	Get cell value as `Rc<T>`, avoiding deep clone
`ctx.set_cell(&cell, value)`	Update cell (marks dependents dirty if changed)
`cell.set(&ctx, value)`	Handle method alias for `ctx.set_cell(&cell, value)`
`ctx.batch(\|ctx\| { ... })`	Defer changed-cell dirty marking and explicit clears until the outermost batch exits
`ctx.effect(\|ctx\| { ... })`	Run an effect immediately and rerun it after tracked dependencies invalidate
`ctx.is_set(&slot)`	Check if slot has a cached, fresh value
`slot.clear(&ctx)`	Clear cached value and cascade to dependents
`cell.clear_dependents(&ctx)`	Clear downstream slots without changing cell value
`effect.dispose(&ctx)`	Dispose an effect, unsubscribe dependencies, and run cleanup
`effect.is_active(&ctx)`	Check whether an effect is still registered

Context stores nodes in a slot-id-indexed Vec<Option<Node>> rather than a hash map. SlotId values are allocated sequentially; effect disposal returns the ID to a free list for reuse, preventing unbounded Vec growth from transient effects while keeping lookups contiguous and hash-free.

Dependency and dependent edges use SmallVec<[SlotId; 4]> rather than HashSet<SlotId>. For the typical 1-3 dependency fan-out, SmallVec stores edges inline without heap allocation, avoids hash computation overhead, and eliminates the temporary Vec<SlotId> allocations that were previously required in every refresh/clear/dirty hot path (replaced with SmallVec::clone() or std::mem::take). Sets that require true dedup semantics (scheduled effects, batch queues, tracking frames) remain as HashSet<SlotId>.

The single-threaded Context consolidates all mutable state behind one RefCell<ContextInner> instead of ten separate RefCell fields. This reduces borrow-check overhead from 4-6 flag modifications per get() to 1 and eliminates the risk of re-entrant borrows across fields. Slot and effect compute closures are stored as Rc<dyn Fn> so they can be cloned cheaply without unsafe pointer copies.

ThreadSafeContext

Mutex-backed counterpart to Context for sharing one reactive graph across OS threads. It mirrors the core local-context API and requires thread-safe values and callbacks.

#![allow(unused)]
fn main() {
pub struct ThreadSafeContext {
    inner: Arc<ThreadSafeInner>,
}
}

API:

Method	Purpose
`ThreadSafeContext::new()`	Create a new thread-safe context
`ctx.computed(\|ctx\| T)`	Create a `Send + Sync` derived lazily-computed value
`ctx.slot(\|ctx\| T)`	Create a `Send + Sync` lazily-computed slot
`ctx.memo(\|ctx\| T)`	Create a `Send + Sync` lazily-computed slot with a `PartialEq` memoization guard
`slot.get(&ctx)`	Get value from any thread (computes if unset)
`ctx.get(&slot)`	Context method alias for `slot.get(&ctx)`
`ctx.cell(value)`	Create a mutable `Send + Sync` cell
`cell.get(&ctx)`	Get cell value from any thread
`ctx.get_cell(&cell)`	Context method alias for `cell.get(&ctx)`
`ctx.set_cell(&cell, value)`	Update cell and invalidate dependents across threads
`ctx.batch(\|ctx\| { ... })`	Defer invalidation until the outermost shared batch exits
`ctx.effect(\|ctx\| { ... })`	Run a `Send + Sync` effect immediately and rerun it after tracked dependencies invalidate
`ctx.clear(&slot)`	Clear cached value and cascade to dependents
`ctx.clear_cell_dependents(&cell)`	Clear downstream slots without changing cell value
`ctx.dispose_effect(&effect)`	Dispose an effect, unsubscribe dependencies, and run cleanup
`ctx.is_effect_active(&effect)`	Check whether an effect is still registered

Slot

Lazily-computed cached value with dependency tracking. A Slot is fresh, dirty, or unset; dirty slots may retain a previous cached value for memo validation. ctx.memo() creates a Slot whose values implement PartialEq so dirty caches can be compared against recomputed values.

#![allow(unused)]
fn main() {
type ComputeFn = dyn Fn(&Context) -> Rc<dyn Any>;
type EqualsFn = dyn Fn(&dyn Any, &dyn Any) -> bool;

struct SlotNode {
    value: Option<Rc<dyn Any>>,
    type_id: TypeId,
    compute: Rc<ComputeFn>,
    equals: Option<Box<EqualsFn>>,
    dependencies: SmallVec<[SlotId; 4]>,
    dependents: SmallVec<[SlotId; 4]>,
    dirty: bool,
    force_recompute: bool,
}
}

Semantics:

Activation: First ctx.get() calls the compute function, caches the result
Computed alias: ctx.computed() creates the same Slot as ctx.slot() for derived-value ergonomics
Invalidation: Marks the cached value dirty and marks downstream slots dirty without discarding their cached values
Clearing: Explicit slot.clear(&ctx) removes the cached value and clears all dependent slots recursively
Memo guard: Dirty ctx.memo() slots compare recomputed values with the previous cache via PartialEq; equal values make downstream dirty slots fresh without recomputing them
Dependencies: If Slot B accesses Slot A during computation, B depends on A. If A clears, B clears automatically; if A’s value changes after dirty validation, B is forced stale
Immutable by default: Once set, a Slot’s value doesn’t change — only clear + recompute
Dynamic: Dependencies re-discovered on each recomputation (no stale subscriptions)

Cell

Mutable value container. Changing a Cell’s value marks dependent Slots dirty.

#![allow(unused)]
fn main() {
struct CellNode {
    value: Rc<dyn Any>,
    type_id: TypeId,
    dependents: SmallVec<[SlotId; 4]>,
}
}

Semantics:

ctx.set_cell() and cell.set(&ctx, value) compare old and new via PartialEq
If unchanged, no invalidation occurs (no-op)
If changed, dependent Slots are marked dirty while cached values are preserved for memo validation

Effect

Side-effect callback that automatically tracks dependencies. Effects run immediately on creation, then rerun after any Cell or Slot read during the last run is invalidated.

#![allow(unused)]
fn main() {
type EffectFn = dyn Fn(&Context) -> Option<Box<dyn FnOnce()>>;

struct EffectNode {
    run: Rc<EffectFn>,
    dependencies: SmallVec<[SlotId; 4]>,
    cleanup: Option<Box<dyn FnOnce()>>,
    force_run: bool,
}
}

Semantics:

Immediate activation: ctx.effect() runs the callback once during creation
Auto-tracking: Any Slot or Cell accessed during the callback becomes a dependency
Scheduling: Dependency invalidation schedules the effect, then the context flushes scheduled effects after the invalidation pass
Coalescing: An effect scheduled through multiple dependency paths in the same invalidation pass runs once
Memo guard: Effects scheduled by dirty slot dependencies first validate those slots and skip cleanup/rerun when values are unchanged
Cleanup: Returning a cleanup closure runs it before the next rerun and on disposal
Disposal: effect.dispose(&ctx) unsubscribes from dependencies, removes pending scheduled work, and prevents future reruns

Signal

Eager derived value. A Signal sits one step beyond a Slot on the Slot -> Cell -> Signal progression: where a Slot is lazy (invalidation only marks it dirty; the value is recomputed on the next read), a Signal is eager — it recomputes the instant any dependency is invalidated. The value is always materialized, so observers never see an intermediate unset value: a dependency change drives the value directly from v1 to v2.

A Signal is composed from existing primitives: a memoized Slot (ctx.memo) plus a small puller Effect that re-materializes the slot after every invalidation. This composition is intentional — it inherits the Slot’s glitch-free, pull-based recomputation and the memo guard, while the Effect supplies eagerness.

#![allow(unused)]
fn main() {
let n = ctx.cell(1);
let doubled = ctx.signal(|ctx| n.get(ctx) * 2); // materialized now: 2
n.set(&ctx, 5);                                  // doubled is already 10
assert_eq!(doubled.get(&ctx), 10);
}

Semantics:

Eager activation: ctx.signal() computes the value once at creation; the value is set from the start
Eager recomputation: Dependency invalidation recomputes the value during the invalidation flush, before the invalidating set_cell/set/batch call returns — no read is required to drive it
No unset state: The backing slot is invalidated via dirty-marking (not hard-cleared), so the value transitions v1 -> v2 and is never observed as unset
Memo guard: Backed by ctx.memo, a recomputation that yields an equal value (via PartialEq) does not invalidate downstream dependents
Glitch-free: Recomputation is pull-based; a Signal that reads other Signals/Slots always observes values consistent with the current inputs (e.g. a diamond D = f(A, g(A)) never surfaces a mixed new-A/old-g(A) intermediate)
Batch coalescing: Writes inside ctx.batch() settle to a single consistent recomputation at batch exit
Type bounds: signal<T> requires T: PartialEq + 'static (for the memo guard); get_signal additionally requires T: Clone
Disposal: signal.dispose(&ctx) removes the eager puller; the value remains readable and reverts to lazy (recomputed on next read) behavior

Signal across context types

The eager-Signal primitive is exposed on all three context types with the same memo-slot + puller-effect composition, so shared-graph and async consumers get the same always-set, glitch-free v1 -> v2 derived values that the single-threaded Context provides (#lzsignalparity).

ThreadSafeContext::signal — shared-graph counterpart. Returns a ThreadSafeSignalHandle<T> with .get/.dispose/.is_active(&ctx) helpers and matching ctx.get_signal/dispose_signal/is_signal_active. Recomputation is eager (driven during the invalidation flush before the set_cell/batch call returns), glitch-free, memo-guarded, and batch-coalesced — identical to the single-threaded semantics above. Type bounds add Send + Sync: signal<T> requires T: PartialEq + Send + Sync + 'static; get_signal additionally requires T: Clone. The handle is Copy + Send + Sync and may be read from any thread sharing the context.
AsyncContext::signal_async — async counterpart. Returns an AsyncSignalHandle<T> backed by memo_async plus an effect_async puller that awaits the slot after every invalidation. Reads: ctx.get_signal (or handle.get) returns Option<T> as a non-blocking snapshot; ctx.get_signal_async (or handle.get_async) awaits the up-to-date value. Inside a slot/effect callback, AsyncComputeContext::get_signal_async reads a signal and registers its backing slot as a dependency, enabling chained async signals and downstream observers. Type bounds: T: PartialEq + Clone + Send + Sync + 'static.
- Eagerness is runtime-driven: because resolution is asynchronous, the puller drives the recompute to completion on the runtime shortly after the invalidating write rather than synchronously within it.
- Propagation is not suppressed on equal recompute: the async memo guard keeps the value correct on an equal recompute, but — unlike the single-threaded/thread-safe graph — does not suppress downstream propagation (async invalidation force-reruns effect dependents on every upstream change). No inconsistent (glitch) value is ever observed. This matches the documented async memo does not suppress downstream propagation behavior of memo_async.

Batch

Write-coalescing boundary for multiple cell updates or explicit slot clears.

#![allow(unused)]
fn main() {
ctx.batch(|ctx| {
    ctx.set_cell(&a, 1);
    ctx.set_cell(&b, 2);
});
}

Semantics:

Outermost boundary: Nested batches flush only when the outermost batch exits
Changed cells: ctx.set_cell() still updates the cell value immediately, but dependent dirty marking is queued until batch exit
Explicit clears: slot.clear(&ctx) and cell.clear_dependents(&ctx) are queued until batch exit
Coalescing: Repeated updates to the same cell or clears of the same slot queue one invalidation root
Thread-safe local batching: same-thread ThreadSafeContext batch writes buffer changed cells and clears in a thread-local batch frame, then merge that frame into the graph-owned batch queue at batch exit; cross-thread writes during another active batch still fall back to the graph-owned queue
Effect flushing: Effects scheduled by batched invalidation rerun after the batch invalidation pass and coalesce duplicate schedules
Reads during a batch: Direct ctx.get_cell() reads see the latest cell value immediately; dependent slot reads keep their pre-batch cached value until dirty marking flushes at batch exit

State Machine

A finite state machine built on top of CellHandle and the reactive graph. StateMachine<S, E> wraps a CellHandle<S> as the current state and a pure transition function Fn(&S, &E) -> Option<S>.

#![allow(unused)]
fn main() {
use lazily::{Context, StateMachine};

let ctx = Context::new();
let m = StateMachine::new(&ctx, Door::Closed, |s, e| match (s, e) {
    (Door::Closed, DoorEvent::Button) => Some(Door::Opening),
    (Door::Opening, DoorEvent::Open) => Some(Door::Open),
    _ => None,
});

m.send(&ctx, DoorEvent::Button);
assert_eq!(m.state(&ctx), Door::Opening);
}

API:

Method	Description
`StateMachine::new(ctx, initial, transition_fn)`	Create with initial state + pure transition function
`send(ctx, event) -> bool`	Evaluate transition; `true` if accepted, `false` if rejected (`None`)
`state(ctx) -> S`	Read the current state
`state_handle() -> CellHandle<S>`	Underlying cell for reactive dependencies
`on_transition(ctx, \|old, new\| ...) -> EffectHandle`	Observer that fires on each state change with `(old, new)`
`state_is(ctx, target) -> SignalHandle<bool>`	Eager signal: `true` when in `target` state

Semantics:

PartialEq guard: A transition to an equal state is accepted (true) but does not invalidate dependents (the CellHandle equality guard suppresses no-op updates). To force re-entry, call cell.clear_dependents(ctx) before send.
Reactive integration: Any ctx.computed, ctx.memo, ctx.signal, or ctx.effect that reads state_handle() automatically recomputes/reruns on transition.
On-enter / on-exit: Use ctx.effect with cleanup — the effect body is on-enter, the returned cleanup closure is on-exit (runs before the next rerun). Alternatively, use on_transition for a single (old, new) observer.
Batch atomicity: ctx.batch() coalesces multiple send() calls — effects fire once after the batch settles.
Single-threaded: StateMachine is backed by Context (single-threaded RefCell).

`ThreadSafeStateMachine` — cross-thread

ThreadSafeStateMachine<S, E> is the lock-backed counterpart to StateMachine, mirroring the same API over ThreadSafeContext. The transition function and state must be Send + Sync + 'static, so the machine and its owning context can be shared across OS threads. The machine is Clone — cloning yields another handle to the same state cell and transition function.

#![allow(unused)]
fn main() {
use lazily::{ThreadSafeContext, ThreadSafeStateMachine};
use std::sync::Arc;

let ctx = Arc::new(ThreadSafeContext::new());
let m = ThreadSafeStateMachine::new(&ctx, Door::Closed, |s, e| match (s, e) {
    (Door::Closed, DoorEvent::Button) => Some(Door::Opening),
    _ => None,
});

m.send(&ctx, DoorEvent::Button);
assert_eq!(m.state(&ctx), Door::Opening);
}

on_transition returns an EffectHandle (dispose via ctx.dispose_effect); state_is returns a ThreadSafeSignalHandle<bool>. Observers fire synchronously within the invalidating send/batch call, preserving the glitch-free pull-based ordering of ThreadSafeContext.

`AsyncStateMachine` — Tokio (`async` feature)

AsyncStateMachine<S, E> is the async counterpart, backed by AsyncContext. The state lives in an AsyncCellHandle<S>; because cells are the synchronous input layer of AsyncContext, send and state are synchronous. Reactive observers use the async effect/signal APIs: on_transition returns an AsyncEffectHandle and state_is returns an AsyncSignalHandle<bool>. Because resolution is asynchronous, eager recomputation settles on the runtime rather than synchronously within send.

#![allow(unused)]
fn main() {
use lazily::{AsyncContext, AsyncStateMachine};

let ctx = AsyncContext::new();
let m = AsyncStateMachine::new(&ctx, Door::Closed, |s, e| match (s, e) {
    (Door::Closed, DoorEvent::Button) => Some(Door::Opening),
    _ => None,
});

m.send(&ctx, DoorEvent::Button);
assert_eq!(m.state(&ctx), Door::Opening);
}

Regression property harness

The default Rust test suite includes tests/property_graph.rs, a proptest harness that drives a fixed reactive graph with generated programs of cell sets, equal-value sets, cell.clear_dependents, slot.clear, memo-guarded dependencies, batch boundaries, reads, and effect disposal/recreation.

Each generated program is checked against a pure model:

Cell and slot reads must match the model after every operation.
Equal-value cell sets must not schedule effect cleanup/rerun work.
Same-parity cell changes through a memo slot must preserve downstream effect run counts when the observable output is unchanged.
Explicit slot and cell-dependent clears must hard-clear cached slots when no effect is active, and must rerun active effects to re-prime their dependencies.
Operations inside ctx.batch must not clear cached dependent slots or run effect cleanups/reruns until the outermost batch exits; dependent reads during the batch must continue to see the pre-batch cached value.

SlotId

Unique identifier for reactive nodes. Lightweight Copy type wrapping a u64.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SlotId(u64);
}

Both SlotHandle<T> and CellHandle<T> wrap a SlotId with PhantomData<T> for type safety.

Dependency Tracking

Uses a thread-local tracking stack (mirroring lazily-zig’s TrackingFrame approach).

When a Slot computes, it pushes a frame onto the tracking stack
When an Effect runs, it also pushes a frame onto the tracking stack
Any nested slot/cell access sees the parent frame
The child registers the parent as a dependent
When a dependency clears, slot dependents hard-clear recursively; when a Cell changes, slot dependents are marked dirty and effect dependents are scheduled

Threading and Concurrency Contract

Current `Context`

Context is intentionally local to one OS thread. It owns RefCell graph state, cached values as Box<dyn Any>, compute callbacks as Box<dyn Fn(&Context)>, effect callbacks as Box<dyn Fn(&Context)>, and cleanups as Box<dyn FnOnce()>. Those storage choices avoid synchronization overhead for the common single-threaded path and make Context neither Send nor Sync.

Current guarantees:

Independent Context instances may be used on different OS threads
A single Context must not be moved into, shared with, or accessed from another thread
SlotHandle<T> and CellHandle<T> are lightweight ids and are Send + Sync when T is Send + Sync, but they are only meaningful with their owning context
EffectHandle is a lightweight id; effect execution and cleanup remain tied to the owning context thread
Dependency tracking is thread-local; a compute/effect callback cannot split work onto another thread and expect nested reads there to attach to the original tracking frame

`ThreadSafeContext`

Thread-safe support is explicit rather than a silent change to Context. ThreadSafeContext mirrors the existing Context methods while preserving the single-threaded fast path.

API bounds:

Method family	Additional bounds
`cell`, `get_cell`, `set_cell`	`T: PartialEq + Clone + Send + Sync + 'static`
`slot`, `computed`	`T: Clone + Send + Sync + 'static`; compute closure `Fn(&ThreadSafeContext) -> T + Send + Sync + 'static`
`memo`	`T: PartialEq + Clone + Send + Sync + 'static`; compute closure `Send + Sync + 'static`
`effect`	effect callback `Fn(&ThreadSafeContext) -> R + Send + Sync + 'static`; cleanup `FnOnce() + Send + 'static`
handles	remain id-only and copyable; usable from any thread only with the owning `ThreadSafeContext`

Locking model:

Uses one context-level Mutex synchronization primitive for graph state before introducing finer-grained graph locks
ThreadSafeState stores nodes in a slot-id-indexed Vec<Option<ThreadSafeNode>> matching the single-threaded Context, eliminating hash-map lookup overhead on every node access. Slot IDs are reused via a free list on effect disposal.
Fresh cached slot reads use a per-slot read-mostly cached-value sidecar; dependency-edge changes, invalidation frontier application, batch queues, effect queues, and disposal remain graph mutex mutations
Read-mostly cached slot access is versioned optimistically: the getter loads a per-slot atomic cache revision before cloning the retained cached Arc, then validates the revision and dirty/force flags again after the clone. Any concurrent invalidation, clear, or value publish changes the revision and forces the getter onto the graph-validated refresh path.
Each thread-safe slot also owns a per-slot recompute/value-publish sidecar for the cached-value visibility flags, in-flight bit, waiter Condvar, and revision used to reject stale callback results. Graph-state dirty/revision fields remain mirrored under the context mutex for dependency-frontier traversal and tests.
Each thread-safe slot sidecar mirrors a per-slot dependency summary: the current dependency ids plus a slot-dependency count. A cell-only dirty refresh may claim the SlotId-partitioned recompute sidecar, snapshot old dependencies, and skip the graph-locked get_refresh dependency scan. The owner still takes the final publish graph mutation to diff dynamic dependencies, publish the value, notify dependents, and reject stale in-flight revisions.
Cells and slots mirror their dependent frontiers into per-node sidecars keyed by SlotId. Changed-cell invalidation may use these sidecars without taking the context graph mutex only when no callback is actively discovering dependencies, the context is not inside a batch, and the discovered frontier contains slots only. The per-slot cache revision acts as the dirty epoch for sidecar publication, and instrumentation records each epoch advance so parallel writers can publish version state without the graph mutex while cached reads still reject mid-read invalidation races. Effect scheduling, batching, dynamic dependency discovery, disposal, and any frontier that reaches an effect fall back to the graph mutex.
Do not hold the graph lock while running user compute callbacks, effect callbacks, or cleanup closures
Re-acquire the lock only to publish computed values, dependency edges, invalidation state, and pending effect work
Slot refresh must avoid helper-level lock churn: a fresh cached get should clone the value through the per-slot fast path without taking a get_refresh graph lock or recursively validating unchanged dependencies, dependency refresh should not take a separate node-kind probe lock before recursively validating a dependency, cell dependencies should not be probed as refreshable slots, clean dirty flags should be folded into the refresh decision lock, and recompute must diff old/new dependency sets at publish so unchanged edges stay subscribed while only stale edges are removed
Recompute dependency tracking must skip graph-lock edge registration for dependencies already present in the slot’s previous dependency set, while still eagerly registering newly discovered dependencies during the callback so concurrent invalidation can mark the in-flight result stale
Effect rerun dependency tracking must also preserve unchanged edges: dependencies already present on the effect remain subscribed through the rerun, newly discovered dependencies are registered during tracking, and stale dependencies are removed in one post-callback graph mutation before the next cleanup is stored
Re-entrant user code must be able to call back into the same context without deadlocking
Concurrent first access and dirty same-slot contention share one in-flight computation for the current slot revision; waiters check the per-slot recompute sidecar before the get_refresh/publish graph-lock path, park on that slot’s notification primitive, then return the published cache or retry if an invalidation makes the in-flight result stale
Recompute waiters observe the per-slot in-flight/revision state while holding the sidecar mutex, then park on the same sidecar Condvar. Finishers publish value and dirty-state sidecar updates before clearing the in-flight bit and notifying, so a stale in-flight completion cannot be missed.
Recompute notifications are scoped to the slot that finished. A completion for one in-flight slot must not wake waiters parked behind another in-flight slot.
Per-slot recompute wakeups use a waiter-counted handoff instead of notify_all: the finisher calls notify_one when waiters exist, and each awakened waiter notifies the next parked waiter after observing the completed sidecar state. This drains all waiters without a completion-wide wakeup stampede.
Optimistic cached reads fall back whenever the context is dirty, forced to recompute, or racing with a sidecar revision change. They do not publish dependency edges, flush effects, observe batch-local unflushed invalidations as fresh, or replace graph-locked refresh for ambiguous callback/dependency states.
If an upstream invalidation happens while a slot callback is running, the in-flight stale result is not published as fresh; the getter retries until it can return a value that matches the latest dependency state
Batch exit, effect scheduling, disposal, and explicit clears must each have a single atomic graph mutation boundary and one coalesced effect flush per outermost invalidation pass
The outermost thread-safe batch exit must collect dependents for all changed cells and apply one coalesced frontier invalidation, so a shared dependent reached through many changed cells is marked dirty and advances revision once per batch flush
Thread-safe invalidation uses an explicit InvalidationPlan computed from a frontier work queue under the graph mutex instead of recursive dependent walks. Changed-cell and slot-value-change roots snapshot dependent frontiers, coalesce duplicate slot ids in one invalidation pass, preserve direct changed-value force_recompute upgrades when a slot is reached through both direct and downstream paths, snapshot hard-clear frontiers for explicit slot/cell clears, then apply dirty, clear, revision, and effect-scheduling mutations at the same graph mutation boundary. The plan shape is partitionable for future bounded worker traversal, but this prototype keeps snapshot and application under the context mutex until benchmark and model-checking evidence proves a parallel apply path safe.
Thread-safe stress coverage must run the same contention script under both LowConcurrency and HighConcurrency, mixing batched cell writes, explicit slot and cell-dependent clears, effect cleanup/rerun, effect disposal racing with writers, and concurrent cached reads. The harness lives in tests/thread_safe_stress.rs and is part of make check.

Lock strategy evaluation:

Keep one context-level graph synchronization primitive until benchmark instrumentation shows a finer-grained design improves the relevant workload without trading off other contention cases
ThreadSafeContext uses read-mostly per-slot cached-value sidecars for fresh cached reads, a per-slot recompute/value-publish sidecar for in-flight same-slot waiters, a per-slot dependency summary for cell-only dirty refresh routing, and per-node dependent frontier sidecars for slot-only changed-cell invalidation; same-thread batches use thread-local batch frames to coalesce changed-cell queueing before the graph-owned batch flush; dependency graph mutations still require the context mutex
The read-mostly cached-value sidecar is an optimistic validation path, not a lock-free graph replacement. Its atomic cache revision rejects mid-read invalidation and mid-read publish races, then falls back to the existing graph refresh path.
ThreadSafeContext uses per-slot sidecar recompute Condvars for in-flight waiters. Those Condvars guard only per-slot in-flight/revision/cache-visibility state, use waiter-counted notify_one handoff wakeups to avoid broad notify_all contention, and must not mutate dependency graph state independently of the context mutex
The read-mostly prototype is benchmark-gated by the 1/2/4/8/16-worker same_slot_write_read, independent_slots, read_mostly_waiters, and batched_write_bursts matrix after the #lazybatch1 and #lazybatch2 invalidation/read-churn fixes
The dependent-frontier sidecar prototype is benchmark-gated by thread_safe_contention / independent_slots and set_cell_invalidation / independent_slot_contention at 8 and 16 workers. It should reduce set_cell_invalidation graph-lock acquisitions for independent slot-only roots without changing effect, batch, or dynamic-dependency semantics.
A sidecar frontier invalidation that reaches a slot with recompute in flight falls back to the graph-locked invalidation path, so stale publishes cannot clear newer dirty markers.
High-parallel graph propagation profiles gate the lazy-invalidation path with thread_safe_graph_propagation at 8 and 16 workers. The matrix compares fan-out eager validation, fan-out lazy dirty epoch publication, fan-in lazy dirty epoch publication, and fan-in batched flush behavior using throughput, p50/p95 latency, lock attribution, effect queue pushes, dependency-edge counters, sidecar dirty marks, sidecar fallbacks, and dirty epoch advances.
The local batch-frame prototype is benchmark-gated by set_cell_invalidation / batched_write_bursts and thread_safe_contention / batched_write_bursts at 8 and 16 workers. It should reduce per-write graph queueing during same-thread batches while preserving one coalesced dirty/effect flush at the outermost batch exit.
Effect-heavy contention profiles gate any queue or batch synchronization change. thread_safe_effect_contention isolates effect queue coalescing, cleanup execution, and nested batch flush behavior at 8 and 16 workers with deterministic lock-site budgets before a sharded graph-lock design can be considered.
Synchronization strategy comparison is release-gated by one fixed evidence table. The current std::sync mutex/Condvar path is the baseline; narrower Condvar wakeups are adopted only for per-slot recompute waiters; parking_lot style parking and targeted CAS remain candidates. A candidate must report throughput plus p50/p95 latency for the required 8/16-worker contention and effect-heavy cases, stay within lock-site budgets, and carry Loom/Shuttle proof for stale completion, effect scheduling/disposal, batch flush, and re-entrant callbacks before release.
Benchmark watch items from generated README deltas must be confirmed with a controlled A/B rerun before tuning. The rerun should use the same benchmark filter on the same host/toolchain when possible, isolate baseline and current code in clean worktrees or Criterion baselines, and record whether the signal reproduces. If confidence intervals overlap or Criterion reports no statistically significant change, document the watch item and avoid speculative synchronization changes.
Edge storage A/B benchmarking procedure: the vec_edges feature flag switches EdgeVec from SmallVec<[SlotId; 4]> (default) to Vec<SlotId>. To compare:
1. cargo bench --bench context -- dependency_fan_out,set_cell_invalidation/high_fan_out --save-baseline smallvec
2. cargo bench --bench context --features vec_edges -- dependency_fan_out,set_cell_invalidation/high_fan_out --baseline smallvec
3. Compare Criterion output — if confidence intervals overlap, keep SmallVec; if Vec is faster at the tested fan-out widths, reconsider the default. The comparison must run on the same host/toolchain with no other workload.

Cached-read strategy is runtime-selectable (#vd5v / #rdstrat1). Both read paths are compiled in and chosen at context construction via ThreadSafeContext::with_read_strategy(ReadStrategy), defaulting to LowConcurrency; the slot sidecar ThreadSafeSlotFastPath.value is a CachedReadStorage enum:

LowConcurrency → parking_lot::RwLock<Option<Arc<dyn Any + Send + Sync>>> read — optimal uncontended / low core counts (the default).
HighConcurrency → arc_swap::ArcSwapOption<Arc<dyn Any + Send + Sync>> wait-free load — no read lock; optimal at 8+ cores. (arc-swap’s RefCnt is Sized-only, so it stores Arc<Arc<dyn Any>>; the extra outer Arc is allocated only on the cold publish path, never on the read.)

Both reconstruct &T via the inline type_id without vtable indirection, and both carry the same atomic cache_revision + dirty/force_recompute validation envelope (loaded before, re-checked after the clone), so a get starting after a completed cross-thread invalidation cannot return the pre-invalidation value regardless of strategy. The runtime selection costs one per-read enum branch (the price of compiling both paths). 0.9.0 shipped arc-swap as the default; 0.10.0 (#rdstrat1) flips the default to LowConcurrency with explicit opt-in to HighConcurrency. Verified by the full default suite (both strategies) and the thread_safe_loom model (the validation algorithm is identical across variants). The inline small-Copy seqlock fast path that subsumes this tradeoff for small values is #rdstrat2 (implemented; opt-in via slot_copy/computed_copy/memo_copy — see Inline small-Copy seqlock below).

Contention tradeoff (rigorous isolated-worktree A/B, #xtwf). This is a deliberate low-contention-for-high-contention trade, not a free win. Comparing 463ca71 (arc-swap) against its parent 06fd3c2 (the parking_lot::RwLock read) on a shared Criterion target dir, same host/toolchain:

benchmark	arc-swap vs RwLock
`cached_reads/thread_safe_context` (1 thread)	+3.1% (slower)
`read_mostly_waiters/4`	+11.9% (slower)
`read_mostly_waiters/8`	−6.5% (faster)
`read_mostly_waiters/16`	−28.6% (faster)

arc-swap’s wait-free read wins at high core counts where RwLock read-lock cache-line traffic dominates, but its debt-tracking load plus the Arc<Arc<…>> double-indirection costs a few percent when uncontended. lazily-rs ThreadSafeContext targets highly-parallel reactive graphs, so the 8/16-worker win is the operative case and the ≤~3% uncontended/low-contention regression is accepted (see the revised benchmark gate below). The contended numbers carry wide confidence intervals; treat the crossover (~4→8 workers) as approximate.

Any future sharding or CAS path must include a Loom or Shuttle safety model covering concurrent first get, stale in-flight completion, invalidation during compute, effect scheduling/disposal, and re-entrant callbacks before it can replace the single-graph-lock design
The current sidecar Mutex/Condvar waiter path, optimistic cached-read fallback, and explicit invalidation-plan safety envelope are covered by cargo test --features loom --test thread_safe_loom, which models concurrent first get, scoped slot notification, waiter-counted handoff wakeup draining, stale in-flight completion and retry, read-mostly waiter handoff, mid-read optimistic validation fallback, invalidation during compute, fast-frontier fallback while dependency discovery is active, dynamic dependency switch/disposal cleanup, effect scheduling/disposal races, re-entrant callback graph access, duplicate diamond paths marking each frontier slot once, effect enqueue coalescing, nested batch invalidation flushing only at the outermost boundary, and the inline small-Copy seqlock (#rdstrat2) refusing torn reads and stale post-invalidation reads under concurrent single-writer publish
A lock-strategy change must preserve the rule that user compute/effect/cleanup callbacks never run while holding graph-state locks

Sharded/versioned storage evaluation:

After the frontier invalidation and read-churn prototypes, the thread_safe_contention benchmark and instrumentation rows are the gate for storage changes. The read-mostly cached-value sidecar is limited to fresh cached slot reads; write-side graph mutation remains serialized so the same benchmark matrix can show whether read-side wins trade off same-slot writes, independent slots, or batched aggregate writes.
Do not replace the single graph lock with sharded storage in the current implementation. Sharding may help independent roots and slots, but it does not remove serialization for same-root writes, shared aggregate slots, effect queues, batch-depth accounting, disposal, or dynamic dependency edge changes. A shard design must first define shard ownership for dependency edges that cross shards, a deterministic merge/apply order for dirty/revision/effect mutations, and a Loom or Shuttle model for deadlock-free multi-shard invalidation and disposal.
Do not treat versioned optimistic reads as an invalidation optimization. Versioned reads target fresh cached get latency, while the isolated set_cell_invalidation profiles attribute invalidation pressure to write-side graph mutation. The current prototype keeps independently retained sidecar Arc snapshots plus atomic dirty/revision validation so a get starting after a cross-thread invalidation cannot return the pre-invalidation cached value.
The next storage experiment, if pursued, should be a benchmark-gated prototype rather than a replacement: shard independent ThreadSafeState mutation by stable node id, keep effect queue and batch flush as one deterministic merge boundary, and require the isolated thread_safe_effect_contention profiles, set_cell_invalidation matrix, and Loom/Shuttle coverage to improve before adopting it.

Typed cache fast-path

Context and ThreadSafeContext store cached values as type-erased trait objects (Rc<dyn Any> and Arc<dyn Any + Send + Sync>). Every cached read calls downcast_ref::<T>(), which invokes a virtual type_id() through the trait object’s vtable, compares the returned TypeId, and then performs a pointer cast. The vtable call adds indirect branching overhead to the hot cached-read path.

The typed cache fast-path stores a TypeId inline in each node at creation time. Cached reads compare the stored TypeId directly (a single inline u64 equality check) and, on match, use an unchecked pointer cast to recover the typed reference. This eliminates the vtable indirection on every cached slot and cell read for both Context and ThreadSafeContext.

Storage layout:

SlotNode.type_id: TypeId — set once at slot() / computed() / memo() creation from TypeId::of::<T>()
CellNode.type_id: TypeId — set once at cell() creation
ThreadSafeSlotFastPath.type_id: TypeId — set once at slot creation
ThreadSafeCellFastPath.type_id: TypeId — set once at cell creation

Read-path fast-path:

Load the node’s type_id field (inline u64 load, no vtable)
Compare with TypeId::of::<T>()
On match, cast the stored value pointer to &T without going through dyn Any::downcast_ref
On mismatch, panic with the same “type mismatch” message as before

get_rc() and get_cell_rc() follow the same pattern but clone the reference-counted pointer instead of cloning the inner value.

The compute-function signature (dyn Fn(&Context) -> Rc<dyn Any> for Context, dyn Fn(&ThreadSafeContext) -> Box<ThreadSafeAny> for ThreadSafeContext) remains unchanged; the typed cache is a storage-side optimization that does not affect the closure API.

Benchmark gate:

cached_reads/context must show a measurable improvement over the downcast_ref baseline. The current baseline is approximately 8 ns per cached read; the typed cache target is to reduce this by the vtable call overhead (typically 0.5–1.5 ns on x86-64).
cached_reads/thread_safe_context (single-thread, uncontended) is a bounded-regression gate, not a no-regression gate, as of the #vd5v lock-free read. The arc-swap read sidecar may regress the uncontended cached read by up to ~3% in exchange for the high-contention win (read_mostly_waiters −6%/−29% at 8/16 workers; see Lock strategy evaluation → contention tradeoff). A regression beyond ~3% uncontended, or any regression in the 8/16-worker read-contention matrix, fails the gate and must be investigated. The earlier inline-TypeId vtable-elimination win for cached_reads/context (single-threaded Context) is unaffected and must still hold.
The improvement must reproduce under controlled Criterion A/B comparison (same host, same toolchain, non-overlapping confidence intervals).

Future implications:

Fully lock-free cached reads require typed storage so that a reader can reconstruct a typed reference from an atomically published pointer without vtable indirection. The inline TypeId is a prerequisite: it proves the type at compile time and makes the unchecked cast sound. Implemented (#vd5v): ThreadSafeSlotFastPath.value is now an arc_swap::ArcSwapOption, so read_fresh loads the published snapshot wait-free and recovers &T via the inline type_id — see Lock strategy evaluation above.
A future ErasedValue storage type could replace Rc<dyn Any> / Arc<dyn Any> entirely, storing the value inline for small types and avoiding heap allocation on compute. The current inline-TypeId step preserves the Rc<dyn Any> / Arc<dyn Any> layout while unlocking the fast-path read optimization. Partially implemented for ThreadSafeContext cached reads (#rdstrat2): the slot cached-read sidecar (CachedReadStorage::Inline) stores small Copy values inline behind a wait-free seqlock — see Inline small-Copy seqlock below. The node still retains its Arc<dyn Any> value (the inline buffer is a read-acceleration duplicate); a full ErasedValue that removes the Arc for non-Copy types remains future work.

Inline small-`Copy` seqlock (#rdstrat2)

For small Copy values, the ThreadSafeContext slot cached-read sidecar (ThreadSafeSlotFastPath.value) selects a third CachedReadStorage variant, Inline, instead of the strategy-selected Locked/LockFree path. The value’s bytes are stored inline in [AtomicU8; INLINE_CAP] (INLINE_CAP = 24, alignment bound 16) behind a single-writer / multi-reader seqlock, with no heap Arc, no refcount traffic on either read or publish. The inline path is optimal under both ReadStrategy modes; the runtime mode only governs the large / non-Copy fallback.

Soundness conditions. Inline is chosen only when T: Copy and size_of::<T>() <= INLINE_CAP and align_of::<T>() <= 16. Copy removes any Drop / ownership hazard from a discarded torn read; the size/alignment bound keeps the byte copy in-bounds. The bytes are read/written with relaxed atomic per-byte operations (not a plain memcpy), so a reader racing the single writer is well-defined under the Rust memory model — unlike a classic non-atomic seqlock, which has a formally-UB benign data race. The inline type_id proves T so the validated byte snapshot can be reconstructed into T without a vtable.
Single-writer invariant. Every write (value publish in recompute_slot_now, clear in apply_locked) runs while holding the graph state write lock, so writes are serialized; only reads are lock-free. The seq counter is even when stable, odd while a write is in progress; a reader observing an odd or changed seq discards its snapshot and retries. The closing Release store of the even seq and the reader’s bracketing Acquire loads (with the canonical Acquire fence) make an accepted snapshot a consistent image of exactly one publish.
Same validation envelope. The inline path carries the identical atomic cache_revision + dirty / force_recompute envelope as the other two strategies, so a read racing a publish/invalidation is rejected identically.
Opt-in constructors. Inline selection is not automatic on the generic slot / computed / memo constructors: stable Rust cannot branch on T: Copy inside a generic fn that lacks the bound (method resolution is pre-monomorphization, so a Copy-gated impl is never applicable where the bound is unprovable; automatic detection would require nightly specialization). The inline path is therefore opt-in through the Copy-bounded slot_copy / computed_copy / memo_copy constructors, which fall back transparently to the strategy path when the value exceeds the inline size/alignment bound.
Loom gate. The seqlock orderings (single-writer publish vs. concurrent lock-free readers, plus the cache_revision/dirty envelope) are modeled by cargo test --features loom --test thread_safe_loom (inline_seqlock_reader_never_observes_torn_value, inline_seqlock_envelope_rejects_torn_and_stale_under_concurrent_publish, inline_seqlock_read_after_completed_invalidation_is_rejected). The two single-property models run unbounded (exhaustive); the combined envelope model uses a preemption-bounded Builder (bound 4, 60 s duration cap) because the full 6-atomic envelope across two threads plus the driving body makes the unbounded permutation space non-terminating — the bound is validated by confirming the model still flags an injected torn-read regression, satisfying the lock-strategy Loom gate before landing.

Tokio integration is scoped in two stages:

Synchronous thread-safe sharing first: ThreadSafeContext should work inside tokio::spawn and tokio::task::spawn_blocking when all captured values and callbacks satisfy the Send + Sync bounds above. This is exposed behind the optional tokio feature with async tests and the tokio_sync example; it must not introduce async compute/effect semantics.
True async computations/effects are separate future work. They need explicit semantics for in-flight future deduplication, cancellation, dependency tracking across .await, stale future completion, cleanup ordering, and Send versus LocalSet futures.

AsyncContext

True async support is a new explicit async context surface, not an overload of Context or ThreadSafeContext. The AsyncContext API lives behind a separate async feature flag so downstream users do not accidentally accept the larger semantic surface. The async feature depends on tokio for runtime primitives (spawn, JoinHandle, notification).

AsyncContext type definitions

#![allow(unused)]
fn main() {
pub struct AsyncContext {
    inner: Arc<AsyncContextInner>,
}

pub struct AsyncSlotHandle<T> {
    id: SlotId,
    _marker: PhantomData<T>,
}

pub struct AsyncCellHandle<T> {
    id: SlotId,
    _marker: PhantomData<T>,
}

pub struct AsyncEffectHandle {
    id: SlotId,
}

pub struct AsyncComputeContext<'a> {
    context_id: AsyncContextId,
    node_id: SlotId,
    inner: &'a AsyncContextInner,
}
}

AsyncContext API surface

Method	Signature	Purpose
`new`	`fn new() -> Self`	Create a new async context
`cell`	`fn cell<T>(&self, value: T) -> AsyncCellHandle<T>`	Create a mutable cell (`T: PartialEq + Clone + Send + Sync + 'static`)
`get_cell`	`fn get_cell<T>(&self, handle: &AsyncCellHandle<T>) -> T`	Get cell value (synchronous)
`set_cell`	`fn set_cell<T>(&self, handle: &AsyncCellHandle<T>, value: T)`	Update cell and invalidate dependents
`computed_async`	`fn computed_async<T, F, Fut>(&self, compute: F) -> AsyncSlotHandle<T>`	Create an async computed slot
`get`	`fn get<T>(&self, handle: &AsyncSlotHandle<T>) -> Option<T>`	Synchronous cached read; returns `Some(T)` if resolved, `None` otherwise. Avoids async overhead on warm paths
`get_async`	`async fn get_async<T>(&self, handle: &AsyncSlotHandle<T>) -> T`	Await slot value; uses `get()` fast-path for resolved slots, otherwise spawns async compute
`memo_async`	`fn memo_async<T, F, Fut>(&self, compute: F) -> AsyncSlotHandle<T>`	Like `computed_async` with `PartialEq` memo guard
`effect_async`	`fn effect_async<F, Fut, C, CleanupFut>(&self, effect: F) -> AsyncEffectHandle`	Create an async effect
`dispose_async_effect`	`fn dispose_async_effect(&self, handle: &AsyncEffectHandle)`	Dispose async effect and await cleanup
`batch`	`fn batch<F, R>(&self, run: F) -> R`	Synchronous batch boundary; schedules async reruns at batch exit

API bounds:

Method family	Additional bounds
`get`	`T: Clone + Send + Sync + 'static`
`cell`, `get_cell`, `set_cell`	`T: PartialEq + Clone + Send + Sync + 'static`
`computed_async`, `memo_async`	`T: PartialEq + Clone + Send + Sync + 'static`; compute `Fn(AsyncComputeContext) -> Fut + Send + Sync + 'static`; future `Future<Output = T> + Send + 'static`
`effect_async`	effect `Fn(AsyncComputeContext) -> Fut + Send + Sync + 'static`; future `Future<Output = Option<C>> + Send + 'static`; cleanup `FnOnce() -> CleanupFut + Send + 'static`; cleanup future `Future<Output = ()> + Send + 'static`
handles	remain id-only and copyable; usable from any task only with the owning `AsyncContext`

AsyncSlotNode state machine

Each async slot tracks its state through a finite state machine:

#![allow(unused)]
fn main() {
enum AsyncSlotState {
    Empty,
    Computing { revision: u64, handle: JoinHandle<()> },
    Resolved,
    Error,
}
}

States:

Empty: no cached value, no in-flight computation. Entered on creation and after hard clear.
Computing: a JoinHandle tracks the in-flight future for the current revision. Concurrent get_async callers attach waiters to the same in-flight result instead of spawning duplicate futures.
Resolved: the cached value is fresh. The value remains until dependency invalidation transitions back to Computing.
Error: the last computation failed. Callers receive the error or retry on the next get_async.

State transitions:

Empty → Computing: first get_async call or dependency invalidation when no cached value exists.
Computing → Resolved: future completes with Ok, and the recorded revision still matches the current slot revision. The value is cached.
Computing → Error: future completes with Err, and the recorded revision still matches.
Computing → Computing (stale): dependency invalidation advances the slot revision during an in-flight computation. The completing future finds its revision no longer matches and discards the result. A new future is spawned for the updated revision.
Resolved → Computing: dependency invalidation marks the cached value stale and spawns a new computation.
Error → Computing: get_async retry after an error.

Revision tracking ensures stale completions are discarded: an async computation records the slot revision at start; at publish time the graph accepts the value only if the revision is still current.

AsyncContext cancellation contract

Waiter cancellation is safe: dropping one get_async future does not cancel the shared in-flight computation while other waiters still need it. Each waiter holds a shared handle (e.g., oneshot receiver or Shared<...>); dropping the receiver does not abort the JoinHandle.
Stale completion handling: when dependency invalidation advances the slot revision during an in-flight computation, the completing future finds its recorded revision no longer matches and discards the result. Waiting callers are retried against the new revision or attached to the newly spawned future.
Explicit cancellation: slot.clear(), dependency invalidation, or context disposal may mark the in-flight revision as canceled. If the runtime provides an abort handle, the task is aborted. User futures must be cancellation-safe because aborting drops them at an .await boundary.
Context disposal: dropping the AsyncContext cancels all in-flight computations via their JoinHandle::abort() handles and awaits completion of all active cleanup futures before returning.
Effect cleanup futures must complete before the next effect body starts. Disposal removes pending reruns before awaiting cleanup.

`get_async` re-resolve contract (#k03k)

get_async must treat the slot state as authoritative and re-resolve rather than assert, because the slot can change between its lock acquisitions and a notifier can close under it. It runs an outer loop that, each pass, re-reads the slot via the get() fast path and then re-locks to attach to / spawn a computation. Two concurrency windows are load-bearing:

Resolved-since-get(): the slot can transition Computing → Resolved between the get() fast-path check (which releases the lock) and the re-lock. Observing Resolved at the re-lock is therefore expected and the cached value is read directly — it is not an unreachable state.
Notifier dropped: the per-computation watch senders can all drop without a final Resolved send when an in-flight compute is superseded by a newer revision (the stale Computing → Computing transition early-returns) or the slot is invalidated. A recv.changed() error means “the world changed”, not a fatal error: the awaiter restarts the outer loop and re-resolves from current slot state (returning the now-published value, attaching to the new in-flight compute, or respawning).

Neither window is a data inconsistency — the published value is always correct; the contract is that get_async never panics on these benign races. Covered by async_context_concurrent_set_and_get_async_never_panics_k03k, which fails deterministically against the prior assert-based implementation.

Deterministic window coverage. Unlike ThreadSafeContext, the async resolve loop cannot be modeled with Loom: AsyncContext runs on tokio’s async executor and tokio::sync::watch, while Loom only shims synchronous loom::sync primitives and has no async runtime. Each window is instead pinned by a targeted deterministic test in tests/async_resolve_loop.rs (async + instrumentation features):

Window 1 is forced via a one-shot instrumentation-gated seam (AsyncContext::__install_window1_hook) that resolves the slot inside the synchronous gap between the fast-path get() and the re-lock — the gap has no .await, so cooperative scheduling alone cannot reach it. The test asserts the reader returned through the Resolved-after-re-lock arm via __window1_resolved_hits. The seam compiles out of default/release builds.
Window 2 is forced by gating an in-flight compute and superseding it with a newer revision so the notifier drops without a final send, asserting the waiter re-resolves to the latest value rather than panicking (mirrors the broader async_stress.rs::get_async_waiter_cancellation_and_stale_completion_keep_latest).

Exhaustive interleaving exploration of the async path (beyond these two known windows) would require a Shuttle model, which in turn requires making async_context.rs generic over its concurrency primitives — a larger architectural change tracked separately, not a Loom drop-in.

Async race stress coverage must exercise get_async waiter cancellation, stale in-flight completion after dependency invalidation, dynamic dependency replacement across awaited slot reads, and async effect cleanup-before-rerun ordering. The harness lives in tests/async_stress.rs under the async feature so make test-async and make check run it with the rest of AsyncContext coverage.

AsyncContext dependency tracking

Async compute and effect callbacks do not use thread-local tracking stacks. Instead, each callback receives an AsyncComputeContext:

#![allow(unused)]
fn main() {
impl<'a> AsyncComputeContext<'a> {
    pub async fn get_async<T>(&self, handle: &AsyncSlotHandle<T>) -> T;
    pub fn get_cell<T>(&self, handle: &AsyncCellHandle<T>) -> T;
}
}

get_async on the compute context records the accessed slot as a dependency before awaiting its value.
get_cell on the compute context records the accessed cell as a dependency synchronously.
Async reads register the graph edge immediately, so source invalidation while the future is suspended can cancel or supersede the in-flight computation before it publishes stale data.
Dependencies are collected into a HashSet<SlotId> attached to the async node. On rerun, stale dependencies are removed and new dependencies are registered.
This design survives executor thread migration and suspension/resume across .await points because the dependency set is carried by the AsyncComputeContext, not a thread-local.

AsyncContext async effects

#![allow(unused)]
fn main() {
fn effect_async<F, Fut, C, CleanupFut>(&self, effect: F) -> AsyncEffectHandle
where
    F: Fn(AsyncComputeContext) -> Fut + Send + Sync + 'static,
    Fut: Future<Output = Option<C>> + Send + 'static,
    C: FnOnce() -> CleanupFut + Send + 'static,
    CleanupFut: Future<Output = ()> + Send + 'static;
}

Serialized reruns: async effect reruns are serialized per effect. A rerun does not start until the previous cleanup future completes.
Cleanup ordering: the cleanup future from the previous run completes before the next effect body starts. Disposal awaits the current cleanup before removing the effect node.
Auto-tracking: the effect body receives an AsyncComputeContext and tracks dependencies through get_async and get_cell calls.
Dependency invalidation schedules an async rerun after the current invalidation pass. The rerun is spawned on the runtime executor, not inline.
Effect disposal: removes pending scheduled reruns, awaits the current cleanup future, and unsubscribes dependency edges.

AsyncContext batch support

ctx.batch() is a synchronous boundary. Cell updates queue invalidation roots.
At batch exit, queued roots trigger invalidation propagation. Async slots and effects are scheduled for rerun but do not execute inside the batch callback.
Async reruns execute after the batch returns, on the runtime executor.
Batching semantics remain synchronous at the graph mutation boundary: invalidations schedule async reruns only after the outermost batch exits.

AsyncContext feature flag

[features]
async = ["dep:tokio"]

The async feature depends on Tokio for spawn, JoinHandle, and runtime primitives. It is separate from the tokio feature (which covers synchronous ThreadSafeContext sharing inside Tokio tasks). The async feature implies tokio.

Integration tests live in tests/async_integration.rs and are gated behind #![cfg(feature = "async")]. Run with make test-async (or cargo test --locked --features async). The make check target includes test-async alongside test, test-tokio, and test-loom.

AsyncContext implementation notes

Async graph locks must never be held while polling user futures or cleanup futures. Acquire the lock only to read/write graph state, then release before polling.
Nested async slot reads register dependencies on the awaiting parent before awaiting the child result.
The sync tokio feature is not enough to enable this API; true async support uses the separate async feature flag.
The Send async context requires Send + Sync + 'static values, callbacks, futures, and cleanup futures. A future LocalAsyncContext may support !Send futures on tokio::task::LocalSet, but handles must not be interchangeable with the Send async context.
In-flight future deduplication: each async slot has one published cache and at most one in-flight computation for the current slot revision. Concurrent get_async callers await the same in-flight result instead of spawning duplicate futures.
Synchronous cached-read fast-path: get() returns the cached value synchronously when the slot is Resolved, avoiding async overhead. get_async() calls get() first; only unresolved or dirty slots enter the async spawn path.

Invalidation Semantics

ctx.set_cell() → if value changed (PartialEq) → mark all dependent slots dirty
slot.clear(&ctx) → remove cached value → cascade clear to all dependents
cell.clear_dependents(&ctx) → clear all dependent slots without changing cell value
ctx.batch() → queue changed cells and explicit slot/cell clears, then flush queued roots when the outermost batch exits
Slot invalidation → preserve cached value as dirty → validate/recompute on next ctx.get() access
Thread-safe slot invalidation walks an explicit coalesced frontier, so diamond paths mark each reachable slot at most once per invalidation pass unless a later direct changed-value path upgrades that slot to forced recompute
If a dirty ctx.memo() slot recomputes to an equal value, downstream dirty slots become fresh without recomputing
Slot clearing → remove cached value → hard-clear dependents recursively
Effects rerun after the invalidation pass if any tracked dependency invalidated
Effects scheduled only by dirty slot dependencies skip rerun if those slots validate unchanged
Effect cleanup runs before rerun and on disposal

Design Goals

Lazy evaluation: Values computed only when first accessed or when dirty caches are validated
Ergonomic derived values: ctx.computed() is the preferred spelling for ordinary derived slots
Fine-grained reactivity: Only affected dependents recompute
Memoized invalidation: Equal intermediate ctx.memo() recomputation suppresses downstream recomputation/effect reruns
Effects: Side effects are scheduled from the same dependency graph as slots
Batching: Multiple writes can share one invalidation/effect flush boundary
Zero mandatory runtime dependencies: The default library surface uses only the Rust standard library and smallvec; Tokio is optional and Criterion is dev-only for benchmarks
Single-threaded fast path: Context uses a single RefCell<ContextInner> with no mutex overhead and no unsafe code
Contiguous local storage: Both Context and ThreadSafeContext index nodes directly by SlotId in a Vec<Option<Node>> to avoid hash-map lookup and churn; slot IDs are reused via a free list to prevent unbounded growth from transient effects
Explicit thread-safe path: ThreadSafeContext uses a context-level lock and Send + Sync bounds for shared reactive graphs
Performance tracking: Criterion benchmarks cover both Context and ThreadSafeContext for cached reads, cold first access, dependency fan-out, memo equality suppression, effect flushing, and batch storms; ThreadSafeContext also tracks a 1/2/4/8/16-worker contention matrix that separates hot shared-slot writes, independent per-worker slots, read-mostly waiters, and batched write bursts.
Benchmark instrumentation: The optional instrumentation feature exposes lightweight counters for recompute starts, duplicate speculative thread-safe computes, dependency edge churn, effect queue depth, reactive node allocations, aggregate ThreadSafeContext lock wait/hold timing, and per-operation thread-safe lock attribution.

Performance Benchmarks

The benchmark suite is a development-only surface under benches/context.rs. It must compile with cargo bench --no-run and should remain focused on public API behavior rather than private graph internals.

Required benchmark scenarios:

Cached reads after the slot has already been computed
Cold first get including graph construction and initial dependency capture
Dependency fan-out invalidation followed by dependent reads
Memo equality suppression where an equal intermediate value prevents downstream recomputation
Effect flushing after dependency mutation
Batch storms that coalesce many writes into one invalidation/effect flush boundary
ThreadSafeContext set_cell invalidation isolation, split into:
- high fan-out changed-cell invalidation without dependent reads
- same-root/same-slot write contention without dependent reads
- independent per-worker roots and slots without dependent reads
- batched write bursts over per-worker cell groups without dependent reads
ThreadSafeContext contention at 1, 2, 4, 8, and 16 workers, split into:
- same-root/same-slot write plus read contention
- independent per-worker roots and computed slots
- read-mostly waiters with one writer and many readers
- batched write bursts over per-worker cell groups
ThreadSafeContext effect-heavy contention at 8 and 16 workers, split into:
- effect queue coalescing across batched worker writes
- effect cleanup execution during concurrent cell updates
- nested batch flushes that schedule computed-effect dependencies
ThreadSafeContext synchronization model checking with the optional loom feature

The optional instrumentation feature adds instrumentation_snapshot() and reset_instrumentation() to both context types and exports InstrumentationSnapshot. The snapshot records:

Reactive node allocation events as a stable allocation proxy
Slot recompute callback starts
Duplicate speculative ThreadSafeContext recomputes that lose publication races; this should remain zero when in-flight deduplication is effective
Dependency edges added and removed
Effect queue pushes and maximum pending queue depth
ThreadSafeContext lock/coordination acquisitions plus total wait and hold nanoseconds

ThreadSafeContext::lock_profile_snapshot() returns per-operation lock and coordination counters for the thread-safe path. The buckets are intentionally high-level: unattributed/other work, get refresh, dependency edge add/remove, set_cell invalidation, recompute publication, and in-flight recompute waiting. For the per-slot sidecar recompute Condvars, the in-flight wait bucket records the parked wait and reacquire boundary. The bucket acquisition counts must sum to the aggregate lock_acquisitions counter so profile consumers can attribute contention without losing the stable summary fields.

The instrumentation profile bench lives in benches/profile.rs and is gated behind required-features = ["instrumentation"]; compile it with cargo bench --features instrumentation --no-run.

The benchmark report harness lives at scripts/update-benchmark-results.py. It runs cargo bench --features instrumentation, reads Criterion estimate files from target/criterion, captures examples/instrumentation_profile.rs counter snapshots in target/lazily-instrumentation-profile.csv, and rewrites the generated README section between  and . The generated section must include the current Cargo package version, the refresh command, the Criterion baseline comparison workflow, one timing row for each required benchmark scenario above, p50/p95 Criterion sample latency rows for the required 8/16-worker same-slot, independent-slot, read-mostly, batched-write, and effect-heavy cases, and instrumentation rows covering recomputes, duplicate speculative recomputes, dependency edge churn, effect queue depth, node allocations, lock wait/hold time, and per-operation ThreadSafeContext lock attribution for every 1/2/4/8/16-worker contention matrix profile and every set_cell invalidation isolation profile. The generated section must also publish regression budgets for the slow contention profiles and their lock-site acquisition totals. --check verifies that the README section is already current without rewriting it and fails when required p50/p95 latency rows are missing or any instrumentation profile exceeds its lock-acquisition budget; --no-run reuses existing Criterion estimate files for a report-only refresh after a manual baseline comparison run while refreshing the instrumentation CSV unless it is also running in check mode.

Serialization (`lazily-serde` feature gate)

The serde feature is declared in Cargo.toml but currently has no implementation. Its purpose is to produce a serializable snapshot of context state so the planned lazily-ipc (snapshot + incremental update protocol) and lazily-distributed (CRDT/Raft remote graphs) layers have a stable on-the-wire representation to build on. This section fixes the design of that feature gate before any code lands.

The type-erasure problem

Context and ThreadSafeContext store cached values as type-erased trait objects (Rc<dyn Any> and Arc<dyn Any + Send + Sync>; see Typed cache fast-path). serde::Serialize is not object-safe — it has a generic serialize<S: Serializer> method — so a dyn Any cannot be serialized directly, and the concrete type T is gone by the time a snapshot walks the node Vec. Any serialization design must recover the ability to call a monomorphized Serialize/Deserialize for each node’s erased T. Two approaches were considered.

Approach A — trait bounds (erased-serde style)

Replace dyn Any cache storage with a serialize-aware trait object — a sealed dyn ReactiveValue: Any whose vtable also carries an erased_serde-style erased_serialize(&self, &mut dyn Serializer). Every signal-creation API (slot, computed, memo, cell) gains a T: Serialize + DeserializeOwned bound under #[cfg(feature = "serde")].

Pros: serialization is total — every cached node is serializable with no per-node opt-in; one storage type, one source of truth.
Cons:
- The T: Serialize bound is viral: it propagates through the whole public API and leaks onto Context itself even for signals that are never serialized, forcing callers to satisfy it for purely local reactive state.
- Requires either the erased-serde crate or a hand-rolled vtable equivalent — acceptable under a feature gate but still added surface.
- Breaks the typed cache fast-path. That optimization recovers &T via an unchecked pointer cast keyed on an inline TypeId; swapping the trait object out from under it for a serialize-aware type would have to preserve the same unchecked-cast guarantee.
- Deserialize is the hard half: reconstruction needs the concrete type at the call site, so a type-tag → constructor registry is required anyway — Approach A does not avoid the registry, it only adds the viral bound on top of it.

Approach B — type-erased closures captured at creation (recommended)

Keep dyn Any storage untouched. At signal creation under the serde feature, capture a monomorphized serde vtable — a small &'static struct of function pointers — alongside the TypeId the typed cache fast-path already records:

#![allow(unused)]
fn main() {
#[cfg(feature = "serde")]
pub struct SerdeVTable {
    pub type_tag: &'static str,                                  // stable cross-process key
    pub serialize:   fn(&dyn Any, &mut dyn erased_serde::Serializer)
                       -> Result<(), erased_serde::Error>,
    pub deserialize: fn(&mut dyn erased_serde::Deserializer)
                       -> Result<Rc<dyn Any>, erased_serde::Error>,
}
}

Each SlotNode/CellNode (and the ThreadSafe*FastPath mirrors) gains a #[cfg(feature = "serde")] serde_vtable: Option<&'static SerdeVTable> field, set once at slot() / computed() / memo() / cell() creation. The thunks are monomorphized per T at the construction call site, so the concrete type is captured exactly where it is still known; invoking serialize downcasts the &dyn Any back to &T and calls the real T: Serialize.

Pros:
- Composes with the typed cache fast-path — both capture per-T metadata at creation; the read hot path is untouched and pays zero overhead (thunks run only at snapshot time).
- Opt-in per node. A signal whose T: !Serialize stores None and is emitted as an Opaque placeholder, matching how lazily-ipc will snapshot only an explicitly shared subgraph rather than the whole context. No viral bound on Context.
- The deserialize thunk doubles as the type-tag → constructor registry Approach A needs anyway; populating it at slot construction keeps tags and constructors in one place.
Cons:
- Per-node storage grows by one pointer (Option<&'static SerdeVTable>, 8 bytes) — eliminated entirely when the serde feature is off via #[cfg(feature = "serde")] on the field.
- Without specialization on stable Rust, “serialize if T: Serialize, else None” needs a sealed MaybeSerialize<T> autoref/marker helper or explicit *_serde constructor variants rather than a blanket impl.

Decision

Adopt Approach B. The lazily-ipc/lazily-distributed roadmap serializes an explicit allowlisted RemoteOp set (#39c5), not arbitrary nodes, so the totality Approach A buys is unneeded — and it costs a viral bound plus a rework of the typed cache fast-path to buy it. Approach B keeps the default build byte-for-byte identical (field compiled out), preserves the zero-overhead read path, and reuses the TypeId-at-creation pattern already proven by the typed cache.

Feature-gate shape

serde = ["dep:serde"] (declared) gains dep:erased-serde under the same gate; both stay optional, preserving the zero mandatory runtime dependency goal.
A sealed MaybeSerialize<T> helper resolves the vtable to Some when T: Serialize + DeserializeOwned and None otherwise, so existing constructor signatures are unchanged and non-serializable signals keep compiling.
Context::snapshot() / ThreadSafeContext::snapshot() walk live nodes, invoke each present vtable, and emit ContextSnapshot { nodes: Vec<NodeSnapshot { slot_id, type_tag, payload | Opaque }> }. Restoration reads type_tag, looks up the deserialize thunk, and rebuilds typed cache entries.
This snapshot type is the input to #ipc2 (snapshot + incremental update protocol) and the value substrate for #ipc3 (CRDT vs Raft).

IPC snapshot + incremental update protocol (`lazily-ipc`)

lazily-ipc transmits a reactive graph’s state to a remote observer and keeps it in sync as the graph mutates. It builds directly on the ContextSnapshot from the lazily-serde design and reuses the existing batch-flush and cache-revision machinery as its consistency boundary rather than inventing a new one.

Two message kinds

Snapshot — full graph state. Sent on connect and on resync.
Delta — incremental change set. Sent once per outermost batch-flush invalidation pass — the single atomic graph-mutation boundary the thread-safe path already guarantees (see Threading and Concurrency Contract: “Batch exit … a single atomic graph mutation boundary and one coalesced effect flush per outermost invalidation pass”). Coalescing is therefore free: one delta per flush, never one per set_cell.

Epoch / versioning

A context-level monotonic ipc_epoch: u64 advances once per outermost batch flush, not per write. It is independent of the per-slot cache_revision atomics (which remain the read-path dirty epoch) — ipc_epoch is the wire sequence number.

Snapshot carries epoch.
Each Delta carries { base_epoch, epoch } with epoch == base_epoch + 1. Deltas are strictly sequential, so a receiver detects any gap, reorder, or sender restart by checking base_epoch == last_epoch.

Payloads

Snapshot { epoch: u64, nodes: Vec<NodeSnapshot>,           // NodeSnapshot from lazily-serde
           edges: Vec<(SlotId /*dependent*/, SlotId /*dependency*/)>,
           roots: Vec<SlotId> }                            // cells + source slots

Delta { base_epoch: u64, epoch: u64, ops: Vec<DeltaOp> }

DeltaOp =
  | CellSet    { slot_id, payload }           // changed-value cell write (PartialEq-guarded)
  | SlotValue  { slot_id, payload }           // a recompute published a new value
  | Invalidate { slot_id }                    // dirtied, not yet recomputed (lazy)
  | NodeAdd    { slot_id, type_tag, payload | Opaque }
  | NodeRemove { slot_id }                    // freed slot id (free-list reuse → Remove then Add)
  | EdgeAdd    { dependent, dependency }
  | EdgeRemove { dependent, dependency }

Consistency invariants (inherited, not re-derived)

PartialEq cell guard: an equal set_cell invalidates nothing, so it emits no CellSet and no downstream ops — the wire is silent exactly when the graph is (SPEC Invalidation Semantics).
memo equality suppression: a dirty memo() that recomputes to an equal value emits no SlotValue and no downstream Invalidate, mirroring the local “downstream dirty slots become fresh without recomputing” rule.
Coalesced frontier: a dependent reached through many changed cells in one batch appears at most once per delta — the same once-per-pass guarantee the thread-safe frontier already enforces.

Lazy reconciliation

Because lazily-rs is lazy, a flush can invalidate a slot without producing a new value. Two receiver modes:

Value-mirror (default for IPC): at flush the sender resolves each invalidated allowlisted slot via ctx.get() so the delta carries concrete SlotValues. The receiver stays a pure data mirror holding no compute closures. Trades local laziness for a value-complete wire image.
Mirror-lazy: the sender emits bare Invalidate and the receiver keeps a stale marker, recomputing only on its own read. This requires the compute closures to be replicated too and is therefore deferred to lazily-distributed (#ipc3), not lazily-ipc.

Resync / gap handling

The receiver tracks last_epoch. On a Delta whose base_epoch != last_epoch (gap, reorder, or sender restart) it discards the delta and requests a Snapshot; the sender replies with a fresh Snapshot { epoch } and resumes deltas from there. Messages are length-prefixed and tagged Snapshot / Delta via serde/erased-serde; the protocol is transport-agnostic (unix socket, pipe, WebSocket — the last feeds the #yxjw signaling server).

Permission boundary (forward link to #39c5)

Only nodes on the per-peer allowlist (#39c5 RemoteOp) are serialized into a snapshot or delta; non-allowlisted nodes are omitted entirely — not even as Opaque — so a peer cannot infer their existence. The allowlist is applied at snapshot/delta construction, before serialization, so the filter is the same on the full and incremental paths.

Implemented (#39c5)

The permission policy layer ships behind the distributed feature in src/distributed.rs:

NodeId / PeerId — wire-stable identifiers (decoupled from the internal SlotId), serde-derived under the serde feature.
OpKind (Read / Write / TriggerEffect) and RemoteOp { kind, node } — the gated, serializable unit a peer requests; the three kinds are gated independently (a read grant never implies write or effect-trigger).
PeerPermissions — default-deny per-peer allowlist with allow, allow_many, revoke (prunes empty peer entries), revoke_peer, is_allowed, and a fail-closed check → Result<(), PermissionDenied>.
filter_readable(peer, nodes) enforces the omission invariant above: non-readable nodes are dropped from the result entirely, preserving input order, so it can be applied at snapshot/delta construction before serialization.

PeerPermissions is local server-side state and is intentionally not serializable; only the wire-facing RemoteOp family is. Higher layers (lazily-ipc snapshot/delta construction, the lazily-distributed CRDT cell plane, and the single-writer effect authority) gate every remote request through check and build observable subgraphs through filter_readable.

Feature gate

A new ipc = ["serde"] feature adds the pure-protocol Snapshot/Delta types plus a transport-agnostic IpcSink / IpcSource trait pair. No transport dependency enters the core crate. The Delta/ipc_epoch model is a single-writer linear log; whether multi-writer needs CRDT merge or Raft consensus on top of that log is exactly the #ipc3 question.

Implemented surface:

Snapshot { epoch, nodes, edges, roots }, NodeSnapshot, NodeState, and EdgeSnapshot define the full graph image.
Delta { base_epoch, epoch, ops } and DeltaOp define the one-flush incremental image. Delta::next(base_epoch, ops) enforces epoch == base_epoch + 1; Delta::apply_status(last_epoch) returns Apply or ResyncRequired.
Snapshot::filter_readable and Delta::filter_readable apply PeerPermissions before serialization. Non-readable nodes and operations are omitted entirely; edges are retained only when both endpoints are readable.
IpcMessage, IpcSink, and IpcSource keep Unix sockets, pipes, WebSockets, and shared-memory ring buffers outside the core crate.
ShmBlobArena, ShmBlobRef, and IpcValue::SharedBlob provide the shared memory payload path. The arena writes a fixed header before each payload with generation, epoch, length, and checksum metadata; readers validate that header before accepting a descriptor. IpcMessage control frames can carry a ShmBlobRef instead of embedding large bytes inline.

Formal companion: lazily-spec/formal/lean models the shared IPC Snapshot/Delta state machine in Lean 4 and proves the epoch sequencing, fail-closed resync, PartialEq/memo suppression, batch coalescing, and eager Signal slot_value invariants. It is intentionally a spec-layer oracle; Rust implementation behavior remains covered by the crate tests and conformance fixtures.

Shared-memory IPC is therefore a supported transport direction, not a separate reactive-graph mode: the shared memory segment carries large blob payloads, and the ordinary control transport carries framed IpcMessages with blob descriptors. Each process keeps its own local Context / ThreadSafeContext and reconciles via snapshots and deltas. A live Context is not shared across process address spaces.

Cross-language channel compatibility (FFI / IPC / WebSocket / WebRTC data)

Yes: lazily-rs has a viable FFI strategy, but the FFI layer should be an adapter around the same transport-agnostic state plane used by IPC and distributed peers. It should not expose the closure-based Rust Context, ThreadSafeContext, SlotHandle<T>, CellHandle<T>, or &T cached values directly across an ABI boundary.

Compatibility model

The cross-language lazily family has one canonical message plane:

IpcMessage::Snapshot and IpcMessage::Delta are the graph-state payloads.
NodeId, PeerId, RemoteOp, Snapshot, Delta, and DeltaOp are the wire-facing contract; internal SlotId values and typed handles remain local implementation details.
IpcPayload is opaque serialized value bytes. The producing language owns type-aware encoding through stable type_tags; the channel only moves bytes.
ShmBlobRef is a descriptor carried by a control frame. Shared memory stores large payload bytes, but reconciliation still happens through ordinary IpcMessages.

Every supported channel carries that same message plane:

Channel	Compatibility strategy
FFI	C ABI exposes opaque context/session handles plus owned byte buffers for `IpcMessage` encode/decode, snapshot export, delta apply, and memory release. No Rust references, trait objects, closures, or typed handles cross the boundary.
IPC	Unix sockets, pipes, local TCP, or process channels carry length-prefixed serialized `IpcMessage`s. Shared-memory IPC is an optimization for large `IpcValue::SharedBlob` payload bytes, not a separate graph-sharing mode.
WebSocket	One WebSocket frame carries one serialized `IpcMessage` or a negotiated fragment. The #yxjw signaling server may relay the frame as opaque payload and must not parse CRDT/IPC state.
WebRTC data	Reliable ordered data channels carry the same serialized `IpcMessage`s after #yxjw peer discovery. Unordered or unreliable channels are only acceptable for optional lossy telemetry; `Delta`s need ordered reliable delivery or receiver-side gap detection and snapshot resync.

FFI boundary shape

The Rust FFI surface is deliberately narrow:

#![allow(unused)]
fn main() {
#[repr(C)]
pub struct LazilyFfiBytes {
    pub ptr: *mut u8,
    pub len: usize,
}

#[repr(C)]
pub enum LazilyFfiStatus {
    Ok = 0,
    Empty = 1,
    NullPointer = 2,
    InvalidMessage = 3,
    EncodeFailed = 4,
    Panic = 5,
}

#[repr(C)]
pub enum LazilyFfiMessageKind {
    Unknown = 0,
    Snapshot = 1,
    Delta = 2,
}
}

The ffi feature exports extern "C" functions for creating/freeing an opaque LazilyFfiChannel, validating/classifying JSON-encoded IpcMessage frames, enqueueing accepted frames, receiving Rust-owned LazilyFfiBytes frames, and freeing buffers allocated by Rust. All allocation ownership is explicit: the caller owns input bytes, Rust owns output buffers until the paired free function is called. Errors return LazilyFfiStatus; panics must be caught before crossing the C ABI.

The implemented channel is a local ABI adapter. It decodes each accepted frame as IpcMessage, then re-encodes canonical JSON bytes before enqueueing or returning a cloned frame. That keeps FFI byte transport aligned with IPC, WebSocket, and WebRTC data transport while leaving snapshot export, delta application to a live graph, and richer typed convenience APIs as higher-level work on top of the same message plane.

The FFI layer may expose convenience helpers for local cells, but those helpers still encode/decode through the same type_tag + payload registry used by lazily-serde. A foreign runtime can therefore choose either direct local FFI calls or framed IPC/WebSocket/WebRTC transport without changing the graph-state protocol.

Shared library build

The crate produces both an rlib (for Rust consumers) and a cdylib (for FFI consumers) via crate-type = ["lib", "cdylib"] in Cargo.toml.

make build-ffi builds the shared library with the ffi feature enabled.
make ffi-headers generates a C header file (target/lazily.h) via cbindgen using cbindgen.toml.
Future options include safer-ffi (safer FFI wrappers with auto-generated headers) and diplomat (multi-language binding generation). The current cbindgen approach is sufficient for C ABI consumers.

Binary serialization (`ipc-binary` feature)

The ipc-binary feature adds optional binary serialization via postcard alongside the default JSON codec. Binary frames are significantly smaller and faster for performance-critical paths between same-language or binary-aware peers.

IpcMessage::encode_binary() / IpcMessage::decode_binary() — postcard encode/decode on the IpcMessage type.
IpcMessage::encode_json() / IpcMessage::decode_json() — JSON encode/decode (gated behind the ffi feature which pulls in serde_json).
FFI binary functions: lazily_ffi_channel_send_binary, lazily_ffi_channel_recv_binary, lazily_ffi_ipc_message_validate_binary, lazily_ffi_ipc_message_kind_binary, lazily_ffi_ipc_message_clone_binary — mirror the JSON FFI functions but use the postcard codec.
EncodeError / DecodeError — codec-agnostic error types with Json and Binary variants gated by their respective features.

The binary codec is not self-describing; peers must agree on the schema. For cross-language use, JSON remains the default; binary is for same-Rust or postcard-aware transports.

WebRTC data channel transport (`webrtc-data` feature)

WebRTC data channels carry IpcMessage frames peer-to-peer after the signaling-client (#yxjw) completes SDP/ICE negotiation. This is the internet-scale transport layer — no server relay needed for graph state after the initial signaling handshake.

Feature gate

webrtc-data = ["ipc", "dep:str0m", "dep:tokio"]

Uses str0m for the WebRTC stack (pure Rust, no C dependencies) and tokio for async runtime integration. Separate from signaling-client so consumers can use signaling without incurring the full WebRTC dependency.

Transport interface

#![allow(unused)]
fn main() {
#[cfg(feature = "webrtc-data")]
pub struct WebRtcDataChannel {
    // str0m session wrapping a single data channel
}

#[cfg(feature = "webrtc-data")]
impl IpcSink for WebRtcDataChannel {
    type Error = WebRtcDataError;
    fn send(&mut self, message: &IpcMessage) -> Result<(), Self::Error>;
}

#[cfg(feature = "webrtc-data")]
impl IpcSource for WebRtcDataChannel {
    type Error = WebRtcDataError;
    fn recv(&mut self) -> Result<Option<IpcMessage>, Self::Error>;
}
}

Channel contract

Ordered + reliable: data channels must be created with ordered: true, maxRetransmits: None so Delta delivery matches the single-writer epoch contract. Unordered/unreliable channels are only acceptable for optional lossy telemetry, never for graph state.
Framing: each IpcMessage is length-prefixed (4-byte LE length + payload). Binary or JSON codec negotiated during capability handshake.
Back-pressure: send blocks or yields when the SCTP congestion window is full; the caller must not flood faster than the channel drains.
Reconnect: on channel close or SCTP failure, the transport signals Err so the caller can re-signaling and re-establish a fresh channel. The Delta resync mechanism handles any gap.

Lifecycle

SignalingClient exchanges SDP offer/answer with peer via #yxjw
ICE candidates trickle through the signaling channel
On ICE completion, WebRtcDataChannel::from_sdp(local_sdp, remote_sdp) creates the str0m session and opens the data channel
Capability handshake on the data channel (protocol id, codec, features)
IpcMessage frames flow bidirectionally
On disconnect, re-signaling via SignalingClient and repeat

Integration test surface

tests/webrtc_data.rs — gated behind #[cfg(feature = "webrtc-data")]
Loopback test: create two str0m sessions back-to-back, send IpcMessage::Snapshot and IpcMessage::Delta, verify round-trip
Codec negotiation: JSON and binary on the same channel
Ordered delivery: send 100 deltas, verify epoch order on recv
make test-webrtc-data target in Makefile

str0m backends: loopback vs networked (`webrtc-str0m` feature)

The webrtc-str0m feature ships two concrete DataChannel backends over the same sans-IO str0m pump loop, differing only in transport and clock:

Str0mLoopback (src/str0m_backend.rs) — two Rtc instances in one thread, connected by an in-memory packet route advanced on a synthetic clock. No sockets, no threads, no wall-clock dependency, so the full ICE/DTLS/SCTP handshake is deterministically testable in-process. This is the unit/CI substrate for the WebRtcSink/WebRtcSource bridge.
Str0mNet (src/str0m_net.rs) — one Rtc driven over a real UDP socket by a background driver thread, with the SDP offer/answer and trickled ICE candidates exchanged by the caller (typically over SignalingClient, #yxjw). This is the real “beyond signaling” peer-to-peer path that can reach a peer on another host.

Str0mNet lifecycle:

Str0mNet::offer(bind) → binds the UDP socket, opens the lazily-ipc channel, returns the SDP offer string; Str0mNet::answer(bind, offer) returns the SDP answer string. The offerer applies the peer’s answer with accept_answer.
Each peer exposes its host candidate via local_candidate(); the caller trickles it to the remote, which feeds it to add_remote_candidate().
The driver thread pumps poll_output → UDP send_to, and UDP recv_from → handle_input, advancing real timers, until the SCTP data channel opens (wait_open). Inbound frames queue for try_recv_frame; outbound frames requested before open are buffered and flushed on open.
On ChannelClose, socket failure, or a dead Rtc, the channel reports closed so the sync sink/source surface Err and the caller re-signals.

Because Str0mNet needs live two-peer connectivity it cannot use the synthetic clock. tests/str0m_net.rs exercises a real two-socket round trip over 127.0.0.1 (real UDP/DTLS/SCTP/timers); a cross-host round trip through the live signaling Worker is operator-gated.

`Str0mNet` outbound backpressure contract (#lzstr0mframe)

The Str0mNet driver’s outbound frame queue is bounded at MAX_PENDING_FRAMES (1024). When a caller’s send_frame rate exceeds the SCTP drain rate, the driver applies backpressure on two layers:

Channel::write backpressure (Ok(false)) — str0m returns Ok(false) when the SCTP send buffer is full. The driver’s flush loop re-queues the frame and yields (rather than popping it), so the next poll_output / recv_from cycle can drain the SCTP window and accept the frame on the following iteration. Pre-#lzstr0mframe this branch was a bare if ch.write(...).is_err() { break; } that ignored the bool, silently dropping every frame that hit the Ok(false) path — violating the ordered/reliable DataChannel invariant WebRtcSink/WebRtcSource rely on.
Queue cap (Str0mNetError::Backpressure) — once the driver’s out_pending VecDeque reaches MAX_PENDING_FRAMES, send_frame itself returns Err(Str0mNetError::Backpressure) so the caller applies flow control (sleep / await / shed load) instead of growing memory without bound. The counter is decremented when Channel::write accepts the frame (Ok(true)) and reset to zero on driver exit (the queue will never drain after close).

The regression test burst_of_frames_arrives_in_order_under_backpressure exercises this contract by bursting 100 × 8 KiB frames through a single Str0mNetChannel and asserting all 100 arrive in order at the remote peer, retrying on Backpressure as needed.

`Str0mNet` driver I/O error handling (#lzstr0mpolldrive)

The driver’s UDP I/O surfaces failures instead of silently dropping packets:

socket.send_to — pre-fix this was let _ = socket.send_to(...), which discarded every error: ENOBUFS (send-buffer pressure), ECONNREFUSED (ICMP port-unreachable, peer down), ENETUNREACH/EHOSTUNREACH (route flap), EBADF (socket closed). The corresponding ICE/DTLS/SCTP packet was silently lost and the handshake stalled without diagnostics. Post-fix, WouldBlock/Interrupted retryable errors continue the drain loop (str0m re-emits the Transmit on a later poll_output); any other error breaks the driver ('outer), surfacing Closed so the caller re-signals.
Read-timeout cap as command-poll interval — recv_from waits at most COMMAND_POLL_INTERVAL (15 ms) so control commands (Send / AcceptAnswer / AddRemoteCandidate / Shutdown) read from cmd_rx at the top of each outer iteration stay bounded-latency. This is not a str0m timing parameter: str0m is fed an accurate time advance via Input::Timeout(now) whenever the socket times out without data, and an “early” Input::Timeout (every 15 ms during idle) is harmless — str0m just re-emits its pending deadline if it isn’t time yet.

Signaling glue: `Str0mNet` over `SignalingClient` (#lzwebrtcwire)

Str0mNet exchanges its SDP offer/answer and trickled ICE candidates out of band; SignalingClient (#yxjw, below) is the out-of-band channel. The webrtc_signaling module (enabled when both signaling-client and webrtc-str0m are on) is the wire between them — two async driver functions that own the full handshake:

offer_to_peer(client, peer, bind, timeout) — binds the socket via Str0mNet::offer, sends the SDP offer and the local local_candidate() to peer over the signaling client, then pumps incoming ServerMessages (answer → accept_answer, ice → add_remote_candidate) until the data channel opens, returning the connected Str0mNet.
answer_next_offer(client, bind, timeout) — waits for the next offer frame, produces the SDP answer via Str0mNet::answer, returns the answer + local candidate over signaling, applies any ICE candidate that raced ahead of the offer, then pumps until open. Returns the offering PeerId and the connected Str0mNet.

Both pump loops re-check Str0mNet::is_open() on a short poll tick as well as on each signaling frame, because the channel opens on the backend’s driver thread, off the signaling path. The caller is responsible for learning the target peer is present (from the welcome roster or a peer-joined frame) before offering; an offer to an absent peer is dropped by the relay and surfaces only as a timeout.

tests/webrtc_signaling.rs drives this end to end over a loopback signaling relay: an in-process tokio-tungstenite server implementing the #yxjw roster + from-stamped routing on 127.0.0.1, two real SignalingClient WebSocket connections, and the real Str0mNet UDP/DTLS/SCTP transport — proving a permission-filtered Snapshot crosses a data channel negotiated entirely through SignalingClient. The only remaining slice is the live two-host / NAT run through the deployed #yxjw Worker, which is operator-gated (#h6qb).

Capability negotiation

Each non-local session starts with a small compatibility handshake before graph state flows:

protocol id: lazily-ipc
protocol major version
codec (json today; binary codecs can be transport crates as long as they encode the same IpcMessage schema)
maximum frame size and fragmentation support
ordered/reliable delivery guarantee
PeerId and session/graph id
supported features such as shared-blob, crdt-cell-plane, and signaling-relay

If peers disagree on protocol major version, codec, ordering guarantees, or required feature flags, they fail closed before applying any Snapshot or Delta.

Cross-language family rules

The shared semantics are lazy slots, mutable cells, dynamic dependency tracking, PartialEq/equality-guarded invalidation, memo equality suppression, batching, and permission-filtered snapshots/deltas.
Compute closures are language-local. Cross-language sync shares the cell state plane by default; derived slots converge remotely only when peers use a shared compiled graph or an explicit compute-descriptor system.
JavaScript/TypeScript peers must keep PeerId values at or below Number.MAX_SAFE_INTEGER, matching the #s0fc signaling protocol.
Permission filtering happens before serialization on every channel. A WebSocket relay, WebRTC data channel, or FFI caller must not receive nodes or operations that PeerPermissions would omit.
Channel code must preserve back-pressure and resync behavior: if frame delivery gaps, reorders, truncates, or exceeds negotiated size, the receiver requests a fresh Snapshot instead of applying a partial delta.

This keeps FFI viable without making it a special semantic path. FFI, IPC, WebSocket, and WebRTC data differ only in framing, ownership, and reliability; the lazily family stays compatible because all channels carry the same permission-filtered IpcMessage state plane.

Multi-writer coordination: CRDT vs Raft (`lazily-distributed`)

lazily-ipc (above) is a single-writer linear log: one authority mutates the graph and streams Deltas stamped by a monotonic ipc_epoch. lazily-distributed asks the harder question — when multiple peers may write the same shared reactive graph, what coordination model orders those writes: a CRDT (conflict-free replicated data types, eventual consistency) or Raft (leader-ordered consensus, strong consistency)?

The reactive structure collapses most of the question

The key observation is that not all graph state is writable. lazily-rs has exactly two node kinds with respect to authorship:

Cells / source slots — externally writable. These are the only state a peer can directly set.
Derived slots (computed / memo) — pure deterministic functions of their dependencies. They are never written; they recompute. Their values, and the dynamic dependency topology discovered during recompute, are a deterministic view of the cell state — provided every peer runs identical compute closures (the closure-replication prerequisite already flagged as lazily-ipc’s mirror-lazy mode).

So coordination is only needed on the small cell plane. The entire derived graph — typically the large majority of nodes, plus all edges and the effect schedule — converges automatically once the cells converge and recompute runs. This is the same property the local engine already relies on: derived state is a function, not a source of truth.

The two models on the cell plane

Aspect	CRDT (cell-plane registers)	Raft (leader-ordered log)
Consistency	Eventual; peers converge after delivery	Strong; one total order, every peer identical
Availability	Local-first — peers read/write while partitioned	Minority partition cannot write; needs quorum
Write latency	Local (no round-trip)	Quorum round-trip to leader per write
Offline peers	Native (merge on reconnect)	Not supported (writes need quorum)
P2P / WAN fit	Direct (no leader); fits #yxjw signaling	Awkward — leader election over WAN, quorum cost
Conflict model	Per-cell merge (LWW / MV register)	None — serialized, last in order wins by fiat
Extends #ipc2	Per-peer `Delta`s + causal stamps, merged	One global Raft-replicated `Delta` log
Cost	Every writable cell must be a CRDT	Election, log replication, quorum machinery

Recommendation — CRDT cell plane (HLC-stamped registers), not Raft

Adopt a CRDT layer on the cell plane only, with derived slots recomputing deterministically on each peer. Concretely:

Each replicated cell is a register CRDT keyed by a hybrid logical clock (HLC) — wall-clock for human-meaningful ordering, logical counter for causal tiebreak. Two flavors, chosen per cell via a trait:
- LWW-register (last-write-wins) — default; “current value” semantics that most reactive cells want. Silently drops the losing concurrent write.
- MV-register (multi-value) — surfaces concurrent writes as a set for the compute layer (or app) to resolve, when dropping a write is unacceptable.
- Additive cells can opt into a PN-counter instead of a register.
The local PartialEq invalidation guard still applies — after merge: a merge that yields an equal value invalidates nothing, exactly as a local equal set_cell does. memo equality suppression likewise holds post-merge, so convergent peers do the same downstream work.
lazily-ipc’s Delta generalizes from one monotonic ipc_epoch to per-peer causal stamps: each peer keeps its own sequence; cross-peer order comes from the HLC/dot metadata carried on each CellSet. Delivery can be out-of-order; merge is commutative/associative/idempotent so gaps self-heal without the snapshot-resync that the single-writer log needed.

Raft is the wrong default because the lazily-distributed roadmap is explicitly availability-first — P2P signaling (#yxjw) and offline peers — and Raft trades exactly that away for a global total order that the reactive model does not need: derived state is already deterministic, and the writable surface is small.

The narrow exception — irreversible effects need an authority

CRDT convergence is correct for state. It is not sufficient for effects that perform irreversible external actions (send an email, charge a card, fire a webhook): convergence may run the same effect on every peer, or run it twice as merges arrive. Pure state can converge; an external side effect cannot be merged.

For that narrow class, gate the effect behind a single-writer effect authority — a designated peer (or a small Raft group owning only the effect-intent log, not the whole graph) decides when an irreversible effect fires, at-most-once. This is a hybrid: CRDT for the state plane, a single-writer/Raft authority for the irreversible-effect plane, with the #39c5 RemoteOp allowlist already gating which remote writes and effects a peer may trigger at all. The large reactive core stays leaderless and local-first; only the small irreversible-effect tail pays for consensus.

Open prototype gates (deferred to implementation)

HLC skew bounds and the LWW-vs-MV default per cell category.
Whether closure replication (required for peers to recompute derived slots) is shipped as serialized compute descriptors or restricted to a shared compiled graph — this gates how much of the derived plane can live remotely at all.
Delta-state vs op-based CRDT encoding on the wire, reusing lazily-serde.

Internet-scale peer discovery: signaling server (`#yxjw`)

The CRDT cell plane (above) is leaderless and local-first, but peers still have to find each other and open transport before any Delta can flow. On a LAN that is mDNS or a known address; across the internet it needs a rendezvous point. #yxjw is that rendezvous: a small Cloudflare Worker signaling server that brokers peer discovery and relays the WebRTC SDP/ICE handshake so peers can establish direct P2P data channels, falling back to server relay of opaque payloads when a direct channel cannot be formed. It is strictly a discovery + relay layer — it never parses or merges CRDT state, so it stays trivially horizontally scalable and never becomes the consistency authority the CRDT design deliberately avoids.

Why a Cloudflare Worker + Durable Objects

One Durable Object per session id. A WebSocket upgrade to GET /session/:id routes to a SignalingRoom DO keyed by idFromName(sessionId). Cloudflare guarantees a single global instance per id, so each session gets a lock-free single-threaded coordination point for its roster with no external store. Scale is achieved by sharding sessions across DO instances, not by growing one server — which matches the availability-first, P2P posture of the CRDT recommendation (it “fits #yxjw signaling”).
Edge-local. Workers run close to peers worldwide, minimizing handshake latency; the DO migrates to wherever its session is most active.

Roles and protocol

The server tracks a per-session roster of connected peers and forwards three classes of frame. PeerId is the same u64 as Rust PeerId (serialized by serde as a bare JSON number; ids must stay ≤ Number.MAX_SAFE_INTEGER).

Client → server: join { peer, capabilities? }, offer { to, sdp }, answer { to, sdp }, ice { to, candidate }, relay { to, payload }, leave.
Server → client: welcome { peer, peers } (roster on join), peer-joined/peer-left, forwarded offer/answer/ice/relay stamped with the real from, and error { code, message }.

Anti-spoofing: the from on every forwarded frame is the sender connection’s registered peer id, never a client-supplied field, so a peer cannot impersonate another.

Permission boundary (reuses #39c5)

Admission and relay are gated by SignalingPermissions, the discovery-layer mirror of lazily::distributed::PeerPermissions:

open mode — any peer may join and signal any other joined peer (trusted / LAN / common discovery case).
allowlist mode — default-deny: a peer may join only when explicitly granted, and may send directed frames only to explicitly allowed targets, exactly as #39c5 gates RemoteOp. This is the discovery-layer half of the same boundary; the Rust data plane still re-checks every RemoteOp locally.

Reconnect / resync

The roster lives in the DO; it is authoritative. A peer that drops simply re-joins and receives a fresh welcome roster — no snapshot epoch is needed at this layer because signaling carries no CRDT state (the data plane’s HLC/dot-stamped Deltas self-heal independently per the CRDT design).

Implemented (#yxjw)

Ships as a standalone TypeScript Worker under signaling/ (its own Node toolchain; not part of the Rust crate build):

src/protocol.ts — wire types + untrusted-frame validation/codec.
src/permissions.ts — SignalingPermissions (open / default-deny allowlist).
src/room-core.ts — transport-agnostic RoomCore: roster, routing, anti-spoofing, permission gating.
src/room.ts — SignalingRoom Durable Object (thin WebSocket adapter).
src/index.ts — Worker entry: /health + /session/:id routing.
test/ — 24 vitest tests (protocol/permissions/room-core units plus an end-to-end Worker + DO + WebSocket test in the workerd runtime).

Open gates (deferred)

TURN/relay fallback policy when both peers are behind symmetric NAT (today the server can relay payloads, but a dedicated TURN allocation is out of scope).
Authenticated admission tokens feeding the allowlist grants from an external identity source rather than static configuration.
Capacity/back-pressure limits per session DO.

Consumable clients (#s0fc)

So a project can depend on the signaling endpoint for distributed peer discovery (the plan for agent-doc), the endpoint ships two clients that speak one shared wire protocol — this section is the normative source of truth both conform to.

Wire protocol (normative). All frames are JSON with a type tag. PeerId is a u64 serialized as a bare JSON number (Rust PeerId(u64) ⇄ TS number; keep ids ≤ 2^53).

Client → server: join {peer, capabilities?}, offer {to, sdp}, answer {to, sdp}, ice {to, candidate}, relay {to, payload}, leave.
Server → client: welcome {peer, peers}, peer-joined {peer}, peer-left {peer}, offer/answer/ice/relay (each stamped from), error {code, message}.

Rust client (signaling-client feature, src/signaling_client.rs): lazily::SignalingClient::connect(base_url, session, peer) opens a tokio-tungstenite WebSocket to {base_url}/session/{session}, joins, and exposes offer/answer/ice/relay/leave plus recv() for ServerMessages. ClientMessage/ServerMessage are serde-tagged (rename_all = "kebab-case") and reuse the #39c5 PeerId. Conformance tests assert the exact JSON shapes above. The feature pulls tokio-tungstenite (rustls) only when enabled; the default build is unaffected.

TypeScript client (@lazily/signaling package, signaling/src/client.ts): SignalingClient.connect(baseUrl, session, peer) (or attach(socket, peer) for a pre-opened socket) works against any WebSocket-like transport (browser, Node ≥ 22, or injected), with onMessage + the same send helpers. The package exports ./client and ./protocol. Unit tests plus an end-to-end test drive the real Worker + Durable Object in workerd.

Both clients are covered in CI (cargo test --features signaling-client; the Worker job’s npm run check). The Rust conformance tests and the TS protocol share the byte-for-byte frame shapes defined above, so the two implementations stay wire-compatible.

Differences from lazily-zig

Aspect	lazily-zig	lazily-rs
Context	Explicit allocator	Owned allocations
Slot creation	`comptime` function pointers	Closures (`Box<dyn Fn>`)
Storage modes	`.direct` / `.indirect`	Unified via generics
FFI	Built-in `StringView`	Via `#[no_mangle]` + `extern "C"`
Thread safety	Mutex by default; `-Dthread_safe=false` removes locking	`Context` is single-threaded (`RefCell`); `ThreadSafeContext` uses a context-level lock

Differences from lazily-py

Aspect	lazily-py	lazily-rs
Context	Plain `dict`	Typed `Context` struct
Slot keys	Object identity	`SlotId` (u64)
Cell equality	`!=` operator	`PartialEq` trait
Context resolvers	`resolve_ctx` functions	Direct context passing
Dependencies	Zero mandatory runtime crates by default; optional Tokio support and dev-only Criterion benchmarks	Zero (pure Rust)

lazily Wire Protocol

Language-agnostic protocol reference for the lazily reactive-graph family (lazily-rs, lazily-py, lazily-zig). This document describes the wire format, message schemas, and transport contracts. Language-specific APIs live in each binding’s own documentation.

Message Plane

All channels (FFI, IPC, WebSocket, WebRTC data) carry the same two message kinds:

Snapshot — full graph image, sent on connect and on resync
Delta — incremental change set, sent once per outermost batch flush

These are tagged as IpcMessage:

{ "Snapshot": { ... } }
{ "Delta": { ... } }

Wire Types

NodeId

Stable wire identifier for a reactive node (cell or slot). Decoupled from language-internal allocation IDs.

{ "node": 1 }

Wire format: u64 wrapped in a "node" field.

PeerId

Identifies a remote peer.

{ "peer": 42 }

Wire format: u64. JavaScript peers must keep this at or below Number.MAX_SAFE_INTEGER.

OpKind

Access category for a remote operation.

Value	Meaning
`"Read"`	Read node value into snapshot/delta
`"Write"`	Write new value to source cell
`"TriggerEffect"`	Trigger effect on irreversible-effect plane

RemoteOp

A single operation a remote peer may request.

{ "kind": "Read", "node": 1 }

IpcPayload

Opaque serialized value bytes. The producing language owns type-aware encoding through type_tag; the channel only moves bytes.

Wire format: array of u8 (JSON array of integers).

NodeState

Serialization state for a node.

{ "Payload": [1, 2, 3, 4] }
"Opaque"
{ "SharedBlob": { "offset": 0, "len": 16, "generation": 1, "epoch": 9, "checksum": 123456789 } }

Variant	Meaning
`{ "Payload": [...] }`	Inline serialized value bytes
`"Opaque"`	Known node whose value cannot be serialized
`{ "SharedBlob": { ... } }`	Descriptor for bytes in shared memory

ShmBlobRef

Descriptor for a payload stored in a shared-memory blob arena.

Field	Type	Meaning
`offset`	`u64`	Byte offset from arena start
`len`	`u64`	Payload length in bytes
`generation`	`u64`	Per-write generation (stale rejection)
`epoch`	`u64`	IPC epoch of the publishing message
`checksum`	`u64`	FNV-1a payload checksum

IpcValue

Value stored inline or by shared-memory blob reference.

{ "Inline": [10, 20, 30] }
{ "SharedBlob": { "offset": 40, "len": 17, "generation": 2, "epoch": 9, "checksum": 987654321 } }

Snapshot Message

Full graph image sent on connect or resync.

Schema

Snapshot {
  epoch: u64,
  nodes: Vec<NodeSnapshot>,
  edges: Vec<EdgeSnapshot>,
  roots: Vec<NodeId>
}

NodeSnapshot

NodeSnapshot {
  node: NodeId,
  type_tag: string,
  state: NodeState
}

EdgeSnapshot

EdgeSnapshot {
  dependent: NodeId,
  dependency: NodeId
}

Example: Minimal snapshot

{
  "Snapshot": {
    "epoch": 1,
    "nodes": [
      {
        "node": 1,
        "type_tag": "i32",
        "state": { "Payload": [1, 2, 3, 4] }
      }
    ],
    "edges": [],
    "roots": [1]
  }
}

Example: Multi-node snapshot with opaque node

{
  "Snapshot": {
    "epoch": 7,
    "nodes": [
      { "node": 1, "type_tag": "i32", "state": { "Payload": [1, 2, 3] } },
      { "node": 2, "type_tag": "f64", "state": { "Payload": [0, 0, 0, 0, 0, 0, 240, 63] } },
      { "node": 3, "type_tag": "opaque-type", "state": "Opaque" }
    ],
    "edges": [
      { "dependent": 2, "dependency": 1 },
      { "dependent": 3, "dependency": 1 }
    ],
    "roots": [1, 2]
  }
}

Example: Snapshot with shared-blob node

{
  "Snapshot": {
    "epoch": 9,
    "nodes": [
      {
        "node": 7,
        "type_tag": "text/plain",
        "state": {
          "SharedBlob": {
            "offset": 0,
            "len": 16,
            "generation": 1,
            "epoch": 9,
            "checksum": 123456789
          }
        }
      }
    ],
    "edges": [],
    "roots": [7]
  }
}

Delta Message

Incremental change set emitted after one outermost batch flush.

Schema

Delta {
  base_epoch: u64,
  epoch: u64,
  ops: Vec<DeltaOp>
}

Sequential deltas satisfy epoch == base_epoch + 1. A receiver detects gaps, reorders, or sender restarts by checking base_epoch == last_epoch.

DeltaOp Variants

Variant	Fields	Meaning
`CellSet`	`node`, `payload` (IpcValue)	Source cell changed to new value
`SlotValue`	`node`, `payload` (IpcValue)	Lazily recomputed slot published a value
`Invalidate`	`node`	Node dirtied without a concrete value
`NodeAdd`	`node`, `type_tag`, `state` (NodeState)	New node became visible
`NodeRemove`	`node`	Node was removed
`EdgeAdd`	`dependent`, `dependency`	Dependency edge added
`EdgeRemove`	`dependent`, `dependency`	Dependency edge removed

Example: Sequential delta with all op variants

{
  "Delta": {
    "base_epoch": 40,
    "epoch": 41,
    "ops": [
      { "CellSet": { "node": 1, "payload": { "Inline": [10] } } },
      { "SlotValue": { "node": 2, "payload": { "Inline": [20] } } },
      { "Invalidate": { "node": 3 } },
      { "NodeAdd": { "node": 4, "type_tag": "u64", "state": { "Payload": [64] } } },
      { "NodeRemove": { "node": 5 } },
      { "EdgeAdd": { "dependent": 2, "dependency": 1 } },
      { "EdgeRemove": { "dependent": 3, "dependency": 1 } }
    ]
  }
}

Example: Non-sequential delta (gap)

{
  "Delta": {
    "base_epoch": 12,
    "epoch": 13,
    "ops": []
  }
}

When the receiver’s last_epoch is 10, this delta has a gap (expected 10→11, got 12→13). The receiver must discard it and request a fresh Snapshot.

Example: Delta with shared-blob payload

{
  "Delta": {
    "base_epoch": 8,
    "epoch": 9,
    "ops": [
      {
        "SlotValue": {
          "node": 7,
          "payload": {
            "SharedBlob": {
              "offset": 40,
              "len": 17,
              "generation": 2,
              "epoch": 9,
              "checksum": 987654321
            }
          }
        }
      }
    ]
  }
}

Epoch Contract

ipc_epoch is a monotonic u64 that advances once per outermost batch flush.
Snapshot carries epoch.
Delta carries { base_epoch, epoch } with epoch == base_epoch + 1.
On Delta where base_epoch != last_epoch: discard the delta, request a fresh Snapshot, resume from the snapshot’s epoch.

Consistency Invariants

PartialEq cell guard: equal CellSet produces no wire ops.
Memo equality suppression: a dirty memo slot that recomputes to an equal value emits no SlotValue or downstream Invalidate.
Coalesced frontier: a dependent reached through many changed cells in one batch appears at most once per delta.
Eager Signal nodes always carry a value: an eager Signal (see below) is recomputed during the invalidation flush, so when it changes it appears in the delta as a concrete SlotValue (never a bare Invalidate). A purely lazy slot that was not read before the flush may instead appear as Invalidate with no value. Both are valid wire states for the same SlotValue/Invalidate op set; the distinction is computation timing, not message format.

Eager Signal Nodes

A Signal is the eager derived value in the Slot -> Cell -> Signal family: it recomputes the instant a dependency invalidates rather than on next read. It is not a new wire type. A Signal is composed from a memoized backing slot plus a local puller effect, and only the backing slot is graph state, so on the wire a Signal node is an ordinary slot node:

Snapshot: the backing slot appears as a NodeSnapshot with its materialized value in NodeState (Payload/SharedBlob), like any other readable slot.
Delta: a value change appears as SlotValue for the backing slot’s NodeId. Because the value is eagerly materialized at flush time it is always concrete; eager nodes do not emit bare Invalidate.
Memo guard still applies: an eager recompute that yields an equal value (PartialEq) suppresses the SlotValue and any downstream Invalidate, exactly as for ctx.memo slots.
The puller effect is local: it drives eager recomputation but is not serialized as a node and produces no TriggerEffect op. Eagerness is a producer-side scheduling property; remote peers receive the same permission-filtered Snapshot/Delta state plane regardless of whether a node is lazy or eager.

Peers therefore need no protocol change to consume signals from an eager producer — a Signal is observed as a slot that is reliably present in every delta that changes it.

Permission Boundary

Only nodes on the per-peer allowlist are serialized. Non-allowlisted nodes are omitted entirely (not even as Opaque) so a peer cannot infer their existence. Edges are retained only when both endpoints are readable.

This filter is applied at snapshot/delta construction time, before serialization, on all channels without exception.

Serialization

JSON (default)

serde_json with derived Serialize/Deserialize. All examples above use JSON. This is the cross-language default.

Binary (optional)

postcard compact binary encoding via the ipc-binary feature. Smaller and faster than JSON, but not self-describing — peers must agree on the schema. For same-language or postcard-aware transports only.

Binary frames decode through IpcMessage::decode_binary(bytes) and encode through IpcMessage::encode_binary().

Transport Contracts

FFI (C ABI)

Opaque channel handle + owned byte buffers
Functions: channel_new, channel_free, channel_send, channel_recv, ipc_message_validate, ipc_message_kind, ipc_message_clone, bytes_free
Binary variants: same functions with _binary suffix
Ownership: caller owns input bytes; Rust owns output buffers until the paired free function is called
Errors return LazilyFfiStatus enum; panics are caught before the C ABI

IPC (Unix socket / pipe / local TCP)

Length-prefixed serialized IpcMessage frames
Shared-memory optional for large IpcValue::SharedBlob payloads
IpcSink / IpcSource trait interface

WebSocket

One WebSocket text/binary frame carries one serialized IpcMessage
Signaling server (#yxjw) relays frames as opaque payload
Server must not parse CRDT/IPC state

WebRTC Data Channel

Reliable ordered data channels only (for graph state)
Length-prefixed framing: 4-byte LE length + payload
JSON or binary codec negotiated during capability handshake
On channel failure: re-signaling via SignalingClient, delta resync covers gaps
Unordered/unreliable channels only for optional lossy telemetry

Capability Negotiation

Each non-local session starts with a handshake:

Field	Description
Protocol id	`"lazily-ipc"`
Protocol major version	`1`
Codec	`"json"` or `"binary"`
Maximum frame size	Negotiated maximum
Ordered/reliable	Required for graph state
PeerId	Session participant
Supported features	`shared-blob`, `crdt-cell-plane`, etc.

If peers disagree on protocol major version, codec, or ordering guarantees, they fail closed before applying any Snapshot or Delta.

Cross-Language Family Rules

Compute closures are language-local. Cross-language sync shares the cell state plane; derived slots converge remotely only when peers use a shared compiled graph or explicit compute descriptors.
Permission filtering happens before serialization on every channel.
Channel code must preserve back-pressure and resync behavior.
All channels carry the same permission-filtered IpcMessage state plane.

Conformance Test Vectors

Canonical JSON fixtures in tests/conformance/ validate wire-format agreement across all language bindings:

Fixture	Coverage
`snapshot_minimal.json`	Single payload node, no edges
`snapshot_multi_node.json`	Multiple nodes, opaque state, edges
`snapshot_shared_blob.json`	Shared-memory blob reference
`delta_sequential.json`	All 7 DeltaOp variants
`delta_non_sequential.json`	Gap requiring resync
`delta_shared_blob.json`	Delta with shared-blob payload

Each fixture contains:

{
  "description": "...",
  "protocol_version": 1,
  "kind": "Snapshot" | "Delta",
  "assertions": { ... },
  "wire": { <IpcMessage> }
}

Language bindings should:

Parse wire into native types
Validate assertions (field values, counts, state kinds)
Re-serialize and verify byte-exact match

lazily Benchmark Results

Generated benchmark data for the lazily reactive primitives library.

Benchmark Results

Generated for package lazily version 0.12.2.

Environment: rustc 1.96.0 (ac68faa20 2026-05-25) on x86_64-unknown-linux-gnu.

Refresh command:

python3 scripts/update-benchmark-results.py

Regression workflow:

cargo bench --features instrumentation -- --save-baseline before
# apply the performance patch
cargo bench --features instrumentation -- --baseline before
python3 scripts/update-benchmark-results.py --no-run

Regression budgets enforced by python3 scripts/update-benchmark-results.py --check:

Profile	Max lock acquisitions	Site lock budgets
thread_safe_set_cell_invalidation_independent_slot_contention_16	192	set_cell_invalidation<=0, dependency_edge<=16, get_refresh<=32, publish<=32
thread_safe_set_cell_invalidation_batched_write_bursts_16	900	other<=800, set_cell_invalidation<=16, dependency_edge<=64, get_refresh<=2, publish<=2
thread_safe_contention_same_slot_write_read_16	1000	get_refresh<=160, publish<=256, in_flight_wait<=700, set_cell_invalidation<=180
thread_safe_contention_independent_slots_16	1100	other<=450, get_refresh<=64, publish<=320, dependency_edge<=16, set_cell_invalidation<=300
thread_safe_contention_read_mostly_waiters_16	256	get_refresh<=128, publish<=64, in_flight_wait<=96
thread_safe_contention_batched_write_bursts_16	950	other<=800, get_refresh<=128, dependency_edge<=64, set_cell_invalidation<=16, publish<=64, in_flight_wait<=64
thread_safe_effect_contention_queue_coalescing_16	2600	other<=900, dependency_edge<=1600, set_cell_invalidation<=16, get_refresh<=64, publish<=0
thread_safe_effect_contention_cleanup_execution_16	1300	other<=450, dependency_edge<=700, set_cell_invalidation<=256, get_refresh<=0, publish<=0
thread_safe_effect_contention_batch_flush_16	1500	other<=1300, get_refresh<=32, dependency_edge<=96, set_cell_invalidation<=16, publish<=32

Budgets use deterministic lock acquisition counts instead of elapsed wait/hold time.

Synchronization strategy adoption gate:

Strategy	Status	Required throughput evidence	Required p50/p95 latency evidence	Lock-site and safety gate
current_std_mutex_condvar	baseline	thread_safe_contention and thread_safe_effect_contention at 8/16 workers	p50/p95 latency for same-slot, read-mostly, batch, and effect-heavy cases	must stay within current lock-site budgets and Loom safety coverage
narrower_condvar_wakeups	adopted for per-slot recompute waiters	same-slot write/read and read-mostly waiter throughput at 8/16 workers	p50/p95 latency for waiter wakeup handoff and stale-completion retry	must not regress effect queue, cleanup, or batch flush budgets
parking_lot_style_parking	candidate only	same contention matrix measured against current_std_mutex_condvar	p50/p95 latency for parking/unparking under 8/16 workers	requires no worse lock-site budgets plus a deadlock/starvation model
targeted_cas	candidate only	fresh cached reads and independent-slot throughput at 8/16 workers	p50/p95 latency for revision validation fallback and publish races	requires unchanged effect/batch/disposal budgets plus Loom/Shuttle proof

Candidates do not replace the current strategy before the same run reports throughput, p50/p95 latency, and lock-site budgets for the required 8/16-worker cases.

Required latency evidence uses Criterion sample per-iteration timing.

Watch-item A/B follow-up:

Watch item	Baseline/current refs	Focused command	Controlled rerun result	Decision
cached ThreadSafeContext read latency	a8b6fc3 vs c917401	`cargo bench --features instrumentation --bench context -- cached_reads/thread_safe_context`	73.48 ns baseline vs 73.20 ns current on warm-cache repeat	no tuning; the archived 56.5 ns row did not reproduce under controlled A/B
effect cleanup contention at 16 workers	a8b6fc3 vs c917401	`cargo bench --features instrumentation --bench context -- thread_safe_effect_contention/cleanup_execution/16`	2.31 ms baseline vs 2.43 ms current on warm-cache repeat with overlapping CIs	keep watching; Criterion reported no statistically significant change

Group	Case	p50	p95	Samples
thread_safe_contention	same_slot_write_read / 8	2.578 ms	2.746 ms	10
thread_safe_contention	same_slot_write_read / 16	7.185 ms	8.642 ms	10
thread_safe_contention	independent_slots / 8	1.133 ms	1.220 ms	10
thread_safe_contention	independent_slots / 16	2.885 ms	3.035 ms	10
thread_safe_contention	read_mostly_waiters / 8	533.466 us	578.630 us	10
thread_safe_contention	read_mostly_waiters / 16	1.105 ms	1.186 ms	10
thread_safe_contention	batched_write_bursts / 8	2.734 ms	3.132 ms	10
thread_safe_contention	batched_write_bursts / 16	5.097 ms	5.433 ms	10
thread_safe_effect_contention	queue_coalescing / 8	1.464 ms	1.739 ms	10
thread_safe_effect_contention	queue_coalescing / 16	4.039 ms	4.208 ms	10
thread_safe_effect_contention	cleanup_execution / 8	1.457 ms	1.719 ms	10
thread_safe_effect_contention	cleanup_execution / 16	3.948 ms	4.323 ms	10
thread_safe_effect_contention	batch_flush / 8	2.646 ms	2.929 ms	10
thread_safe_effect_contention	batch_flush / 16	7.896 ms	8.064 ms	10
thread_safe_graph_propagation	fan_out_eager_validation / 8	3.538 ms	3.629 ms	10
thread_safe_graph_propagation	fan_out_eager_validation / 16	6.130 ms	6.357 ms	10
thread_safe_graph_propagation	fan_out_lazy_dirty_epochs / 8	2.551 ms	2.671 ms	10
thread_safe_graph_propagation	fan_out_lazy_dirty_epochs / 16	4.442 ms	4.535 ms	10
thread_safe_graph_propagation	fan_in_lazy_dirty_epochs / 8	574.260 us	585.218 us	10
thread_safe_graph_propagation	fan_in_lazy_dirty_epochs / 16	1.126 ms	1.138 ms	10
thread_safe_graph_propagation	fan_in_batched_flush / 8	1.167 ms	1.287 ms	10
thread_safe_graph_propagation	fan_in_batched_flush / 16	2.271 ms	2.452 ms	10

Criterion estimates are local mean wall-clock time per iteration.

Group	Case	Mean	95% CI
cached_reads	context	8.027 ns	7.987 ns - 8.066 ns
cached_reads	thread_safe_context	65.223 ns	64.895 ns - 65.584 ns
cold_first_get	context	80.054 ns	77.778 ns - 82.208 ns
cold_first_get	thread_safe_context	990.971 ns	959.974 ns - 1.021 us
dependency_fan_out	context / 32	3.107 us	2.962 us - 3.248 us
dependency_fan_out	context / 256	45.763 us	43.643 us - 48.510 us
dependency_fan_out	thread_safe_context / 32	20.950 us	20.551 us - 21.347 us
dependency_fan_out	thread_safe_context / 256	160.588 us	159.203 us - 162.024 us
set_cell_invalidation	high_fan_out / 512	92.241 us	89.998 us - 94.223 us
set_cell_invalidation	same_slot_contention / 1	38.338 us	37.310 us - 39.419 us
set_cell_invalidation	same_slot_contention / 2	80.472 us	78.918 us - 81.907 us
set_cell_invalidation	same_slot_contention / 4	186.897 us	180.693 us - 192.872 us
set_cell_invalidation	same_slot_contention / 8	541.804 us	517.595 us - 568.932 us
set_cell_invalidation	same_slot_contention / 16	1.759 ms	1.718 ms - 1.794 ms
set_cell_invalidation	independent_slot_contention / 1	39.527 us	38.588 us - 40.454 us
set_cell_invalidation	independent_slot_contention / 2	73.353 us	71.989 us - 74.760 us
set_cell_invalidation	independent_slot_contention / 4	128.280 us	126.976 us - 129.514 us
set_cell_invalidation	independent_slot_contention / 8	281.716 us	269.854 us - 293.557 us
set_cell_invalidation	independent_slot_contention / 16	612.042 us	607.064 us - 617.024 us
set_cell_invalidation	batched_write_bursts / 1	128.103 us	126.066 us - 130.111 us
set_cell_invalidation	batched_write_bursts / 2	226.325 us	218.580 us - 232.553 us
set_cell_invalidation	batched_write_bursts / 4	480.877 us	463.297 us - 496.610 us
set_cell_invalidation	batched_write_bursts / 8	1.401 ms	1.353 ms - 1.449 ms
set_cell_invalidation	batched_write_bursts / 16	4.002 ms	3.922 ms - 4.075 ms
memo_equality_suppression	context	1.881 us	1.758 us - 2.005 us
memo_equality_suppression	thread_safe_context	33.002 us	32.645 us - 33.355 us
effect_flushing	context	47.561 ns	47.315 ns - 47.835 ns
effect_flushing	thread_safe_context	919.973 ns	914.308 ns - 926.589 ns
batch_storms	context / 64	2.764 us	2.744 us - 2.786 us
batch_storms	thread_safe_context / 64	7.082 us	6.899 us - 7.306 us
thread_safe_contention	same_slot_write_read / 1	98.686 us	96.377 us - 100.859 us
thread_safe_contention	same_slot_write_read / 2	266.923 us	261.168 us - 272.455 us
thread_safe_contention	same_slot_write_read / 4	871.034 us	824.425 us - 916.400 us
thread_safe_contention	same_slot_write_read / 8	2.558 ms	2.488 ms - 2.625 ms
thread_safe_contention	same_slot_write_read / 16	7.411 ms	7.144 ms - 7.749 ms
thread_safe_contention	independent_slots / 1	101.937 us	100.782 us - 102.937 us
thread_safe_contention	independent_slots / 2	212.770 us	206.378 us - 217.797 us
thread_safe_contention	independent_slots / 4	463.904 us	453.067 us - 474.423 us
thread_safe_contention	independent_slots / 8	1.133 ms	1.109 ms - 1.159 ms
thread_safe_contention	independent_slots / 16	2.850 ms	2.739 ms - 2.949 ms
thread_safe_contention	read_mostly_waiters / 1	103.037 us	101.347 us - 104.750 us
thread_safe_contention	read_mostly_waiters / 2	138.765 us	134.847 us - 142.827 us
thread_safe_contention	read_mostly_waiters / 4	277.379 us	271.246 us - 282.068 us
thread_safe_contention	read_mostly_waiters / 8	522.464 us	489.359 us - 552.208 us
thread_safe_contention	read_mostly_waiters / 16	1.116 ms	1.097 ms - 1.136 ms
thread_safe_contention	batched_write_bursts / 1	208.558 us	206.766 us - 210.767 us
thread_safe_contention	batched_write_bursts / 2	547.941 us	521.314 us - 575.389 us
thread_safe_contention	batched_write_bursts / 4	1.467 ms	1.454 ms - 1.480 ms
thread_safe_contention	batched_write_bursts / 8	2.796 ms	2.716 ms - 2.890 ms
thread_safe_contention	batched_write_bursts / 16	4.970 ms	4.733 ms - 5.184 ms
thread_safe_effect_contention	queue_coalescing / 8	1.491 ms	1.437 ms - 1.558 ms
thread_safe_effect_contention	queue_coalescing / 16	4.055 ms	4.013 ms - 4.102 ms
thread_safe_effect_contention	cleanup_execution / 8	1.498 ms	1.445 ms - 1.561 ms
thread_safe_effect_contention	cleanup_execution / 16	4.042 ms	3.893 ms - 4.185 ms
thread_safe_effect_contention	batch_flush / 8	2.689 ms	2.607 ms - 2.771 ms
thread_safe_effect_contention	batch_flush / 16	7.856 ms	7.715 ms - 7.961 ms
thread_safe_graph_propagation	fan_out_eager_validation / 8	3.558 ms	3.531 ms - 3.587 ms
thread_safe_graph_propagation	fan_out_eager_validation / 16	6.154 ms	6.104 ms - 6.211 ms
thread_safe_graph_propagation	fan_out_lazy_dirty_epochs / 8	2.544 ms	2.503 ms - 2.584 ms
thread_safe_graph_propagation	fan_out_lazy_dirty_epochs / 16	4.451 ms	4.427 ms - 4.477 ms
thread_safe_graph_propagation	fan_in_lazy_dirty_epochs / 8	572.971 us	567.279 us - 578.239 us
thread_safe_graph_propagation	fan_in_lazy_dirty_epochs / 16	1.109 ms	1.074 ms - 1.130 ms
thread_safe_graph_propagation	fan_in_batched_flush / 8	1.169 ms	1.144 ms - 1.201 ms
thread_safe_graph_propagation	fan_in_batched_flush / 16	2.296 ms	2.257 ms - 2.341 ms
profile_instrumentation	context_snapshot	267.600 ns	265.902 ns - 269.343 ns
profile_instrumentation	thread_safe_snapshot	295.465 us	294.338 us - 296.488 us
async_cached_resolve	async_context	4.005 us	3.829 us - 4.197 us
async_cached_resolve	sync_context_baseline	73.063 ns	69.741 ns - 77.406 ns
async_cached_resolve	sync_get	11.537 ns	11.488 ns - 11.585 ns
async_cached_resolve	thread_safe_context_baseline	1.394 us	1.388 us - 1.399 us
async_cold_resolve	async_context	4.261 us	4.114 us - 4.397 us
async_cold_resolve	sync_context_baseline	83.178 ns	79.646 ns - 86.312 ns
async_cold_resolve	thread_safe_context_baseline	1.010 us	983.439 ns - 1.036 us
async_invalidation_throughput	async_context	245.023 us	232.026 us - 258.919 us
async_invalidation_throughput	sync_context_baseline	2.785 us	2.763 us - 2.808 us
async_invalidation_throughput	thread_safe_context_baseline	39.314 us	39.079 us - 39.593 us
async_cancellation_throughput	async_invalidate_in_flight	77.176 us	64.769 us - 87.137 us
async_concurrent_contention	async_context / 1	71.400 us	70.290 us - 72.484 us
async_concurrent_contention	async_context / 4	391.647 us	375.605 us - 406.235 us
async_concurrent_contention	async_context / 16	1.939 ms	1.834 ms - 2.040 ms
async_concurrent_contention	thread_safe_context_baseline / 1	60.390 us	58.969 us - 61.565 us
async_concurrent_contention	thread_safe_context_baseline / 4	509.481 us	496.452 us - 522.465 us
async_concurrent_contention	thread_safe_context_baseline / 16	3.494 ms	3.408 ms - 3.553 ms
async_effect_throughput	async_context	188.121 ms	187.959 ms - 188.267 ms
async_batch_throughput	async_context	76.304 us	73.323 us - 79.503 us
async_batch_throughput	sync_context_baseline	10.755 us	10.676 us - 10.843 us
tokio_sync_cached_read	single_task	1.488 us	1.476 us - 1.501 us
tokio_sync_cached_read	spawn_read	5.051 us	4.882 us - 5.218 us
tokio_sync_cold_first_get	single_task	1.425 us	1.414 us - 1.435 us
tokio_sync_cold_first_get	spawn_compute	5.226 us	4.983 us - 5.462 us
tokio_sync_invalidation	single_task	39.148 us	38.805 us - 39.520 us
tokio_sync_concurrent_contention	same_slot_write_read / 1	43.792 us	43.051 us - 44.677 us
tokio_sync_concurrent_contention	same_slot_write_read / 4	402.701 us	389.818 us - 416.472 us
tokio_sync_concurrent_contention	same_slot_write_read / 16	3.734 ms	3.570 ms - 3.882 ms
tokio_sync_concurrent_contention	independent_slots / 1	44.150 us	43.337 us - 44.963 us
tokio_sync_concurrent_contention	independent_slots / 4	254.247 us	241.953 us - 268.158 us
tokio_sync_concurrent_contention	independent_slots / 16	1.460 ms	1.401 ms - 1.518 ms
tokio_sync_batch	spawn_batch	43.228 us	42.846 us - 43.603 us
tokio_sync_effect	single_task	10.087 ms	10.083 ms - 10.093 ms
typed_cache_reads	context_cell	2.513 ns	2.497 ns - 2.530 ns
typed_cache_reads	context_rc_cell	2.771 ns	2.754 ns - 2.788 ns
typed_cache_reads	context_rc_slot	8.245 ns	8.197 ns - 8.297 ns
typed_cache_reads	context_slot	7.961 ns	7.925 ns - 7.998 ns
typed_cache_reads	thread_safe_cell	25.089 ns	24.867 ns - 25.314 ns
typed_cache_reads	thread_safe_slot	64.338 ns	64.048 ns - 64.615 ns

Instrumentation snapshots are single local profile runs captured by examples/instrumentation_profile.rs.

Profile	Alloc	Recomputes	Edges +	Edges -	Effect pushes	Max queue	Lock acquisitions	Lock wait	Lock hold	Sidecar frontiers	Sidecar dirty marks	Sidecar fallbacks	Dirty epochs
context_memo_effect	4	3	4	1	2	1	0	0.000 ns	0.000 ns	0	0	0	0
context_fan_out_32	33	64	64	32	0	0	0	0.000 ns	0.000 ns	0	0	0	0
context_batch_storm_64	65	0	128	64	2	1	0	0.000 ns	0.000 ns	0	0	0	0
thread_safe_first_get_2	2	1	1	0	0	0	11	19.240 us	18.020 us	0	0	0	0
thread_safe_set_cell_invalidation_high_fan_out_512	0	0	0	0	0	0	0	0.000 ns	0.000 ns	1	512	0	512
thread_safe_set_cell_invalidation_same_slot_contention_1	2	1	1	0	0	0	8	290.000 ns	2.880 us	16	16	0	16
thread_safe_set_cell_invalidation_same_slot_contention_2	2	1	1	0	0	0	8	280.000 ns	1.170 us	32	32	0	32
thread_safe_set_cell_invalidation_same_slot_contention_4	2	1	1	0	0	0	8	280.000 ns	1.060 us	64	64	0	64
thread_safe_set_cell_invalidation_same_slot_contention_8	2	1	1	0	0	0	8	280.000 ns	1.110 us	128	128	0	128
thread_safe_set_cell_invalidation_same_slot_contention_16	2	1	1	0	0	0	8	380.000 ns	5.080 us	256	256	0	256
thread_safe_set_cell_invalidation_independent_slot_contention_1	2	1	1	0	0	0	8	380.000 ns	7.940 us	15	15	0	15
thread_safe_set_cell_invalidation_independent_slot_contention_2	4	2	2	0	0	0	16	610.000 ns	2.410 us	31	31	0	31
thread_safe_set_cell_invalidation_independent_slot_contention_4	8	4	4	0	0	0	32	1.140 us	4.920 us	63	63	0	63
thread_safe_set_cell_invalidation_independent_slot_contention_8	16	8	8	0	0	0	64	2.270 us	9.930 us	127	127	0	127
thread_safe_set_cell_invalidation_independent_slot_contention_16	32	16	16	0	0	0	128	4.570 us	18.810 us	255	255	0	255
thread_safe_set_cell_invalidation_batched_write_bursts_1	5	1	4	0	0	0	97	3.650 us	60.860 us	0	0	0	15
thread_safe_set_cell_invalidation_batched_write_bursts_2	9	1	8	0	0	0	159	193.830 us	106.741 us	0	0	0	21
thread_safe_set_cell_invalidation_batched_write_bursts_4	17	1	16	0	0	0	199	439.375 us	150.901 us	0	0	0	6
thread_safe_set_cell_invalidation_batched_write_bursts_8	33	1	32	0	0	0	369	2.189 ms	231.381 us	0	0	0	4
thread_safe_set_cell_invalidation_batched_write_bursts_16	65	1	64	0	0	0	715	9.403 ms	432.702 us	0	0	0	2
thread_safe_contention_same_slot_write_read_1	2	17	1	0	0	0	24	1.080 us	17.510 us	16	16	0	16
thread_safe_contention_same_slot_write_read_2	2	28	1	0	0	0	77	16.610 us	46.410 us	23	23	9	32
thread_safe_contention_same_slot_write_read_4	2	52	1	0	0	0	226	127.742 us	107.120 us	33	33	31	64
thread_safe_contention_same_slot_write_read_8	2	111	1	0	0	0	449	207.906 us	200.263 us	70	70	58	128
thread_safe_contention_same_slot_write_read_16	2	211	1	0	0	0	862	1.132 ms	429.864 us	138	138	118	256
thread_safe_contention_independent_slots_1	2	16	1	0	0	0	23	980.000 ns	15.180 us	15	15	0	15
thread_safe_contention_independent_slots_2	4	33	2	0	0	0	89	19.151 us	39.491 us	17	17	14	31
thread_safe_contention_independent_slots_4	8	67	4	0	0	0	199	416.563 us	95.721 us	25	25	38	63
thread_safe_contention_independent_slots_8	16	135	8	0	0	0	465	4.256 ms	296.231 us	11	11	116	127
thread_safe_contention_independent_slots_16	32	271	16	0	0	0	939	19.906 ms	630.554 us	9	9	246	255
thread_safe_contention_read_mostly_waiters_1	2	17	1	0	0	0	24	1.110 us	17.340 us	16	16	0	16
thread_safe_contention_read_mostly_waiters_2	2	17	1	0	0	0	24	1.030 us	14.530 us	16	16	0	16
thread_safe_contention_read_mostly_waiters_4	2	17	1	0	0	0	41	2.500 us	26.930 us	16	16	0	16
thread_safe_contention_read_mostly_waiters_8	2	17	1	0	0	0	59	22.950 us	31.790 us	16	16	0	16
thread_safe_contention_read_mostly_waiters_16	2	17	1	0	0	0	83	217.170 us	114.471 us	16	16	0	16
thread_safe_contention_batched_write_bursts_1	5	16	4	0	0	0	112	4.490 us	64.750 us	0	0	0	15
thread_safe_contention_batched_write_bursts_2	9	21	8	0	0	0	188	66.861 us	109.832 us	0	0	0	20
thread_safe_contention_batched_write_bursts_4	17	37	16	0	0	0	389	432.502 us	282.893 us	0	0	0	38
thread_safe_contention_batched_write_bursts_8	33	10	32	0	0	0	400	2.433 ms	298.504 us	0	0	0	9
thread_safe_contention_batched_write_bursts_16	65	3	64	0	0	0	717	10.003 ms	462.403 us	0	0	0	2
thread_safe_effect_contention_queue_coalescing_8	33	0	32	0	8	1	406	2.018 ms	269.084 us	0	0	0	0
thread_safe_effect_contention_queue_coalescing_16	65	0	64	0	10	1	775	9.282 ms	531.114 us	0	0	0	0
thread_safe_effect_contention_cleanup_execution_8	9	0	8	8	32	1	408	1.845 ms	190.321 us	0	0	127	0
thread_safe_effect_contention_cleanup_execution_16	17	0	16	16	32	1	696	7.538 ms	305.824 us	0	0	255	0
thread_safe_effect_contention_batch_flush_8	34	3	33	0	3	1	634	2.638 ms	269.453 us	0	0	0	2
thread_safe_effect_contention_batch_flush_16	66	2	65	0	3	1	1239	21.843 ms	617.275 us	0	0	0	1
thread_safe_graph_propagation_fan_out_eager_validation_8	34	549	64	0	49	1	1132	24.084 ms	6.033 ms	10	320	118	4096
thread_safe_graph_propagation_fan_out_eager_validation_16	34	555	64	0	49	1	1380	117.009 ms	11.557 ms	17	544	239	8192
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8	33	64	32	0	0	0	226	10.090 us	76.281 us	128	4096	0	4096
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16	33	64	32	0	0	0	226	9.670 us	69.370 us	256	8192	0	8192
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8	65	66	64	0	0	0	328	11.840 us	105.540 us	508	540	0	572
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16	129	130	128	0	0	0	648	27.661 us	197.172 us	1020	1084	0	1148
thread_safe_graph_propagation_fan_in_batched_flush_8	66	66	65	0	3	1	605	2.600 ms	397.774 us	0	0	0	73
thread_safe_graph_propagation_fan_in_batched_flush_16	130	135	129	0	5	1	1319	7.690 ms	744.236 us	0	0	0	170

ThreadSafe lock attribution for contention profiles:

Profile	Site	Lock acquisitions	Lock wait	Lock hold
thread_safe_set_cell_invalidation_same_slot_contention_1	other	4	110.000 ns	1.830 us
thread_safe_set_cell_invalidation_same_slot_contention_1	get_refresh	2	90.000 ns	230.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_1	dependency_edge	1	50.000 ns	480.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_1	publish	1	40.000 ns	340.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_2	other	4	110.000 ns	240.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_2	get_refresh	2	100.000 ns	200.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_2	dependency_edge	1	30.000 ns	400.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_2	publish	1	40.000 ns	330.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_4	other	4	120.000 ns	210.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_4	get_refresh	2	90.000 ns	160.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_4	dependency_edge	1	30.000 ns	370.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_4	publish	1	40.000 ns	320.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_8	other	4	120.000 ns	240.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_8	get_refresh	2	90.000 ns	200.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_8	dependency_edge	1	20.000 ns	340.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_8	publish	1	50.000 ns	330.000 ns
thread_safe_set_cell_invalidation_same_slot_contention_16	other	4	180.000 ns	1.320 us
thread_safe_set_cell_invalidation_same_slot_contention_16	get_refresh	2	130.000 ns	1.360 us
thread_safe_set_cell_invalidation_same_slot_contention_16	dependency_edge	1	20.000 ns	1.190 us
thread_safe_set_cell_invalidation_same_slot_contention_16	publish	1	50.000 ns	1.210 us
thread_safe_set_cell_invalidation_independent_slot_contention_1	other	4	180.000 ns	2.550 us
thread_safe_set_cell_invalidation_independent_slot_contention_1	get_refresh	2	130.000 ns	2.340 us
thread_safe_set_cell_invalidation_independent_slot_contention_1	dependency_edge	1	20.000 ns	1.670 us
thread_safe_set_cell_invalidation_independent_slot_contention_1	publish	1	50.000 ns	1.380 us
thread_safe_set_cell_invalidation_independent_slot_contention_2	other	8	250.000 ns	590.000 ns
thread_safe_set_cell_invalidation_independent_slot_contention_2	get_refresh	4	220.000 ns	410.000 ns
thread_safe_set_cell_invalidation_independent_slot_contention_2	dependency_edge	2	50.000 ns	710.000 ns
thread_safe_set_cell_invalidation_independent_slot_contention_2	publish	2	90.000 ns	700.000 ns
thread_safe_set_cell_invalidation_independent_slot_contention_4	other	16	510.000 ns	1.500 us
thread_safe_set_cell_invalidation_independent_slot_contention_4	get_refresh	8	340.000 ns	680.000 ns
thread_safe_set_cell_invalidation_independent_slot_contention_4	dependency_edge	4	100.000 ns	1.330 us
thread_safe_set_cell_invalidation_independent_slot_contention_4	publish	4	190.000 ns	1.410 us
thread_safe_set_cell_invalidation_independent_slot_contention_8	other	32	950.000 ns	3.330 us
thread_safe_set_cell_invalidation_independent_slot_contention_8	get_refresh	16	680.000 ns	1.260 us
thread_safe_set_cell_invalidation_independent_slot_contention_8	dependency_edge	8	250.000 ns	2.600 us
thread_safe_set_cell_invalidation_independent_slot_contention_8	publish	8	390.000 ns	2.740 us
thread_safe_set_cell_invalidation_independent_slot_contention_16	other	64	1.990 us	5.380 us
thread_safe_set_cell_invalidation_independent_slot_contention_16	get_refresh	32	1.390 us	2.460 us
thread_safe_set_cell_invalidation_independent_slot_contention_16	dependency_edge	16	430.000 ns	5.200 us
thread_safe_set_cell_invalidation_independent_slot_contention_16	publish	16	760.000 ns	5.770 us
thread_safe_set_cell_invalidation_batched_write_bursts_1	other	74	2.770 us	15.940 us
thread_safe_set_cell_invalidation_batched_write_bursts_1	get_refresh	2	80.000 ns	290.000 ns
thread_safe_set_cell_invalidation_batched_write_bursts_1	dependency_edge	4	170.000 ns	1.760 us
thread_safe_set_cell_invalidation_batched_write_bursts_1	set_cell_invalidation	16	590.000 ns	42.450 us
thread_safe_set_cell_invalidation_batched_write_bursts_1	publish	1	40.000 ns	420.000 ns
thread_safe_set_cell_invalidation_batched_write_bursts_2	other	126	186.900 us	41.250 us
thread_safe_set_cell_invalidation_batched_write_bursts_2	get_refresh	2	80.000 ns	180.000 ns
thread_safe_set_cell_invalidation_batched_write_bursts_2	dependency_edge	8	340.000 ns	6.010 us
thread_safe_set_cell_invalidation_batched_write_bursts_2	set_cell_invalidation	22	6.460 us	58.951 us
thread_safe_set_cell_invalidation_batched_write_bursts_2	publish	1	50.000 ns	350.000 ns
thread_safe_set_cell_invalidation_batched_write_bursts_4	other	174	434.615 us	99.261 us
thread_safe_set_cell_invalidation_batched_write_bursts_4	get_refresh	2	80.000 ns	190.000 ns
thread_safe_set_cell_invalidation_batched_write_bursts_4	dependency_edge	16	700.000 ns	7.690 us
thread_safe_set_cell_invalidation_batched_write_bursts_4	set_cell_invalidation	6	3.930 us	43.400 us
thread_safe_set_cell_invalidation_batched_write_bursts_4	publish	1	50.000 ns	360.000 ns
thread_safe_set_cell_invalidation_batched_write_bursts_8	other	330	2.187 ms	199.591 us
thread_safe_set_cell_invalidation_batched_write_bursts_8	get_refresh	2	130.000 ns	1.410 us
thread_safe_set_cell_invalidation_batched_write_bursts_8	dependency_edge	32	1.600 us	17.200 us
thread_safe_set_cell_invalidation_batched_write_bursts_8	set_cell_invalidation	4	200.000 ns	11.920 us
thread_safe_set_cell_invalidation_batched_write_bursts_8	publish	1	50.000 ns	1.260 us
thread_safe_set_cell_invalidation_batched_write_bursts_16	other	646	9.400 ms	384.152 us
thread_safe_set_cell_invalidation_batched_write_bursts_16	get_refresh	2	90.000 ns	280.000 ns
thread_safe_set_cell_invalidation_batched_write_bursts_16	dependency_edge	64	3.180 us	34.280 us
thread_safe_set_cell_invalidation_batched_write_bursts_16	set_cell_invalidation	2	90.000 ns	13.590 us
thread_safe_set_cell_invalidation_batched_write_bursts_16	publish	1	50.000 ns	400.000 ns
thread_safe_contention_same_slot_write_read_1	other	4	130.000 ns	770.000 ns
thread_safe_contention_same_slot_write_read_1	get_refresh	2	100.000 ns	510.000 ns
thread_safe_contention_same_slot_write_read_1	dependency_edge	1	30.000 ns	570.000 ns
thread_safe_contention_same_slot_write_read_1	publish	17	820.000 ns	15.660 us
thread_safe_contention_same_slot_write_read_2	other	22	3.030 us	990.000 ns
thread_safe_contention_same_slot_write_read_2	get_refresh	2	90.000 ns	190.000 ns
thread_safe_contention_same_slot_write_read_2	dependency_edge	1	30.000 ns	360.000 ns
thread_safe_contention_same_slot_write_read_2	set_cell_invalidation	9	1.920 us	13.920 us
thread_safe_contention_same_slot_write_read_2	publish	28	11.540 us	30.950 us
thread_safe_contention_same_slot_write_read_2	in_flight_wait	15	0.000 ns	0.000 ns
thread_safe_contention_same_slot_write_read_4	other	63	45.511 us	2.670 us
thread_safe_contention_same_slot_write_read_4	get_refresh	2	80.000 ns	150.000 ns
thread_safe_contention_same_slot_write_read_4	dependency_edge	1	30.000 ns	330.000 ns
thread_safe_contention_same_slot_write_read_4	set_cell_invalidation	31	30.140 us	43.440 us
thread_safe_contention_same_slot_write_read_4	publish	52	51.981 us	60.530 us
thread_safe_contention_same_slot_write_read_4	in_flight_wait	77	0.000 ns	0.000 ns
thread_safe_contention_same_slot_write_read_8	other	117	62.331 us	3.711 us
thread_safe_contention_same_slot_write_read_8	get_refresh	12	12.790 us	6.830 us
thread_safe_contention_same_slot_write_read_8	dependency_edge	1	20.000 ns	350.000 ns
thread_safe_contention_same_slot_write_read_8	set_cell_invalidation	58	70.113 us	62.631 us
thread_safe_contention_same_slot_write_read_8	publish	111	62.652 us	126.741 us
thread_safe_contention_same_slot_write_read_8	in_flight_wait	150	0.000 ns	0.000 ns
thread_safe_contention_same_slot_write_read_16	other	229	407.693 us	8.420 us
thread_safe_contention_same_slot_write_read_16	get_refresh	16	12.330 us	5.900 us
thread_safe_contention_same_slot_write_read_16	dependency_edge	1	20.000 ns	1.130 us
thread_safe_contention_same_slot_write_read_16	set_cell_invalidation	118	558.902 us	150.022 us
thread_safe_contention_same_slot_write_read_16	publish	211	152.932 us	264.392 us
thread_safe_contention_same_slot_write_read_16	in_flight_wait	287	0.000 ns	0.000 ns
thread_safe_contention_independent_slots_1	other	4	140.000 ns	1.130 us
thread_safe_contention_independent_slots_1	get_refresh	2	90.000 ns	310.000 ns
thread_safe_contention_independent_slots_1	dependency_edge	1	20.000 ns	1.180 us
thread_safe_contention_independent_slots_1	publish	16	730.000 ns	12.560 us
thread_safe_contention_independent_slots_2	other	36	2.640 us	1.450 us
thread_safe_contention_independent_slots_2	get_refresh	4	200.000 ns	420.000 ns
thread_safe_contention_independent_slots_2	dependency_edge	2	50.000 ns	710.000 ns
thread_safe_contention_independent_slots_2	set_cell_invalidation	14	7.690 us	10.741 us
thread_safe_contention_independent_slots_2	publish	33	8.571 us	26.170 us
thread_safe_contention_independent_slots_4	other	82	103.730 us	3.910 us
thread_safe_contention_independent_slots_4	get_refresh	8	380.000 ns	650.000 ns
thread_safe_contention_independent_slots_4	dependency_edge	4	100.000 ns	1.320 us
thread_safe_contention_independent_slots_4	set_cell_invalidation	38	203.262 us	32.581 us
thread_safe_contention_independent_slots_4	publish	67	109.091 us	57.260 us
thread_safe_contention_independent_slots_8	other	190	1.107 ms	10.290 us
thread_safe_contention_independent_slots_8	get_refresh	16	740.000 ns	1.310 us
thread_safe_contention_independent_slots_8	dependency_edge	8	220.000 ns	2.810 us
thread_safe_contention_independent_slots_8	set_cell_invalidation	116	1.523 ms	130.711 us
thread_safe_contention_independent_slots_8	publish	135	1.625 ms	151.110 us
thread_safe_contention_independent_slots_16	other	374	5.514 ms	20.750 us
thread_safe_contention_independent_slots_16	get_refresh	32	1.400 us	2.970 us
thread_safe_contention_independent_slots_16	dependency_edge	16	430.000 ns	6.240 us
thread_safe_contention_independent_slots_16	set_cell_invalidation	246	6.024 ms	288.392 us
thread_safe_contention_independent_slots_16	publish	271	8.367 ms	312.202 us
thread_safe_contention_read_mostly_waiters_1	other	4	130.000 ns	1.430 us
thread_safe_contention_read_mostly_waiters_1	get_refresh	2	140.000 ns	880.000 ns
thread_safe_contention_read_mostly_waiters_1	dependency_edge	1	20.000 ns	1.180 us
thread_safe_contention_read_mostly_waiters_1	publish	17	820.000 ns	13.850 us
thread_safe_contention_read_mostly_waiters_2	other	4	130.000 ns	310.000 ns
thread_safe_contention_read_mostly_waiters_2	get_refresh	2	90.000 ns	180.000 ns
thread_safe_contention_read_mostly_waiters_2	dependency_edge	1	20.000 ns	380.000 ns
thread_safe_contention_read_mostly_waiters_2	publish	17	790.000 ns	13.660 us
thread_safe_contention_read_mostly_waiters_4	other	4	130.000 ns	250.000 ns
thread_safe_contention_read_mostly_waiters_4	get_refresh	8	1.560 us	2.200 us
thread_safe_contention_read_mostly_waiters_4	dependency_edge	1	20.000 ns	360.000 ns
thread_safe_contention_read_mostly_waiters_4	publish	17	790.000 ns	24.120 us
thread_safe_contention_read_mostly_waiters_4	in_flight_wait	11	0.000 ns	0.000 ns
thread_safe_contention_read_mostly_waiters_8	other	4	150.000 ns	390.000 ns
thread_safe_contention_read_mostly_waiters_8	get_refresh	16	14.830 us	5.350 us
thread_safe_contention_read_mostly_waiters_8	dependency_edge	1	30.000 ns	360.000 ns
thread_safe_contention_read_mostly_waiters_8	publish	17	7.940 us	25.690 us
thread_safe_contention_read_mostly_waiters_8	in_flight_wait	21	0.000 ns	0.000 ns
thread_safe_contention_read_mostly_waiters_16	other	4	180.000 ns	1.330 us
thread_safe_contention_read_mostly_waiters_16	get_refresh	22	211.850 us	16.180 us
thread_safe_contention_read_mostly_waiters_16	dependency_edge	1	30.000 ns	1.190 us
thread_safe_contention_read_mostly_waiters_16	publish	17	5.110 us	95.771 us
thread_safe_contention_read_mostly_waiters_16	in_flight_wait	39	0.000 ns	0.000 ns
thread_safe_contention_batched_write_bursts_1	other	74	2.790 us	14.810 us
thread_safe_contention_batched_write_bursts_1	get_refresh	2	110.000 ns	340.000 ns
thread_safe_contention_batched_write_bursts_1	dependency_edge	4	170.000 ns	1.790 us
thread_safe_contention_batched_write_bursts_1	set_cell_invalidation	16	670.000 ns	29.360 us
thread_safe_contention_batched_write_bursts_1	publish	16	750.000 ns	18.450 us
thread_safe_contention_batched_write_bursts_2	other	124	52.991 us	31.280 us
thread_safe_contention_batched_write_bursts_2	get_refresh	2	90.000 ns	180.000 ns
thread_safe_contention_batched_write_bursts_2	dependency_edge	8	320.000 ns	4.270 us
thread_safe_contention_batched_write_bursts_2	set_cell_invalidation	21	9.380 us	44.482 us
thread_safe_contention_batched_write_bursts_2	publish	21	4.080 us	29.620 us
thread_safe_contention_batched_write_bursts_2	in_flight_wait	12	0.000 ns	0.000 ns
thread_safe_contention_batched_write_bursts_4	other	237	388.651 us	83.570 us
thread_safe_contention_batched_write_bursts_4	get_refresh	10	32.921 us	4.740 us
thread_safe_contention_batched_write_bursts_4	dependency_edge	16	740.000 ns	7.550 us
thread_safe_contention_batched_write_bursts_4	set_cell_invalidation	38	3.930 us	94.652 us
thread_safe_contention_batched_write_bursts_4	publish	37	6.260 us	92.381 us
thread_safe_contention_batched_write_bursts_4	in_flight_wait	51	0.000 ns	0.000 ns
thread_safe_contention_batched_write_bursts_8	other	340	2.431 ms	199.842 us
thread_safe_contention_batched_write_bursts_8	get_refresh	2	90.000 ns	390.000 ns
thread_safe_contention_batched_write_bursts_8	dependency_edge	32	1.560 us	15.721 us
thread_safe_contention_batched_write_bursts_8	set_cell_invalidation	9	480.000 ns	49.211 us
thread_safe_contention_batched_write_bursts_8	publish	10	540.000 ns	33.340 us
thread_safe_contention_batched_write_bursts_8	in_flight_wait	7	0.000 ns	0.000 ns
thread_safe_contention_batched_write_bursts_16	other	646	9.999 ms	389.373 us
thread_safe_contention_batched_write_bursts_16	get_refresh	2	120.000 ns	1.270 us
thread_safe_contention_batched_write_bursts_16	dependency_edge	64	3.040 us	34.950 us
thread_safe_contention_batched_write_bursts_16	set_cell_invalidation	2	110.000 ns	17.680 us
thread_safe_contention_batched_write_bursts_16	publish	3	150.000 ns	19.130 us
thread_safe_effect_contention_queue_coalescing_8	other	365	2.016 ms	199.513 us
thread_safe_effect_contention_queue_coalescing_8	dependency_edge	32	1.420 us	11.631 us
thread_safe_effect_contention_queue_coalescing_8	set_cell_invalidation	9	520.000 ns	57.940 us
thread_safe_effect_contention_queue_coalescing_16	other	702	9.279 ms	470.073 us
thread_safe_effect_contention_queue_coalescing_16	dependency_edge	64	2.810 us	24.800 us
thread_safe_effect_contention_queue_coalescing_16	set_cell_invalidation	9	440.000 ns	36.241 us
thread_safe_effect_contention_cleanup_execution_8	other	265	599.415 us	62.861 us
thread_safe_effect_contention_cleanup_execution_8	dependency_edge	16	800.000 ns	11.120 us
thread_safe_effect_contention_cleanup_execution_8	set_cell_invalidation	127	1.245 ms	116.340 us
thread_safe_effect_contention_cleanup_execution_16	other	409	3.289 ms	93.061 us
thread_safe_effect_contention_cleanup_execution_16	dependency_edge	32	1.530 us	13.520 us
thread_safe_effect_contention_cleanup_execution_16	set_cell_invalidation	255	4.247 ms	199.243 us
thread_safe_effect_contention_batch_flush_8	other	594	2.636 ms	226.842 us
thread_safe_effect_contention_batch_flush_8	get_refresh	2	110.000 ns	840.000 ns
thread_safe_effect_contention_batch_flush_8	dependency_edge	33	1.500 us	17.470 us
thread_safe_effect_contention_batch_flush_8	set_cell_invalidation	2	150.000 ns	16.160 us
thread_safe_effect_contention_batch_flush_8	publish	3	170.000 ns	8.141 us
thread_safe_effect_contention_batch_flush_16	other	1169	21.839 ms	558.475 us
thread_safe_effect_contention_batch_flush_16	get_refresh	2	110.000 ns	270.000 ns
thread_safe_effect_contention_batch_flush_16	dependency_edge	65	3.080 us	35.400 us
thread_safe_effect_contention_batch_flush_16	set_cell_invalidation	1	60.000 ns	12.710 us
thread_safe_effect_contention_batch_flush_16	publish	2	100.000 ns	10.420 us
thread_safe_graph_propagation_fan_out_eager_validation_8	other	337	5.266 ms	166.412 us
thread_safe_graph_propagation_fan_out_eager_validation_8	get_refresh	64	3.000 us	5.140 us
thread_safe_graph_propagation_fan_out_eager_validation_8	dependency_edge	64	2.280 us	29.920 us
thread_safe_graph_propagation_fan_out_eager_validation_8	set_cell_invalidation	118	14.440 ms	5.280 ms
thread_safe_graph_propagation_fan_out_eager_validation_8	publish	549	4.373 ms	552.165 us
thread_safe_graph_propagation_fan_out_eager_validation_16	other	458	30.162 ms	173.621 us
thread_safe_graph_propagation_fan_out_eager_validation_16	get_refresh	64	3.040 us	5.550 us
thread_safe_graph_propagation_fan_out_eager_validation_16	dependency_edge	64	2.500 us	30.240 us
thread_safe_graph_propagation_fan_out_eager_validation_16	set_cell_invalidation	239	76.647 ms	10.794 ms
thread_safe_graph_propagation_fan_out_eager_validation_16	publish	555	10.195 ms	553.344 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8	other	66	3.290 us	7.310 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8	get_refresh	64	2.820 us	5.830 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8	dependency_edge	32	900.000 ns	18.280 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8	publish	64	3.080 us	44.861 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16	other	66	2.760 us	5.380 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16	get_refresh	64	2.860 us	5.660 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16	dependency_edge	32	880.000 ns	18.490 us
thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16	publish	64	3.170 us	39.840 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8	other	130	3.800 us	9.170 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8	get_refresh	68	3.620 us	10.250 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8	dependency_edge	64	2.440 us	28.580 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8	publish	66	1.980 us	57.540 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16	other	258	12.270 us	18.010 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16	get_refresh	132	6.440 us	12.920 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16	dependency_edge	128	5.030 us	60.300 us
thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16	publish	130	3.921 us	105.942 us
thread_safe_graph_propagation_fan_in_batched_flush_8	other	403	2.572 ms	216.262 us
thread_safe_graph_propagation_fan_in_batched_flush_8	get_refresh	68	3.290 us	8.300 us
thread_safe_graph_propagation_fan_in_batched_flush_8	dependency_edge	65	2.350 us	28.090 us
thread_safe_graph_propagation_fan_in_batched_flush_8	set_cell_invalidation	3	5.310 us	87.820 us
thread_safe_graph_propagation_fan_in_batched_flush_8	publish	66	16.150 us	57.302 us
thread_safe_graph_propagation_fan_in_batched_flush_16	other	795	7.602 ms	346.913 us
thread_safe_graph_propagation_fan_in_batched_flush_16	get_refresh	254	15.790 us	29.590 us
thread_safe_graph_propagation_fan_in_batched_flush_16	dependency_edge	129	4.860 us	58.880 us
thread_safe_graph_propagation_fan_in_batched_flush_16	set_cell_invalidation	6	2.760 us	186.002 us
thread_safe_graph_propagation_fan_in_batched_flush_16	publish	135	65.141 us	122.851 us

Multi-Language

lazily is implemented across three languages with shared semantics:

	lazily-rs	lazily-zig	lazily-py
Context	Owned `Context` struct	Explicit allocator	Plain `dict`
Slot creation	`Box<dyn Fn>` closures	`comptime` function pointers	Lambdas
Cell equality	`PartialEq` trait	`std.meta.eql`	`!=` operator
Thread safety	Single-threaded `Context`; explicit `ThreadSafeContext`	Mutex by default	GIL
Storage	Unified generics	`.direct` / `.indirect`	Object identity

lazily-zig — Zig implementation with FFI support
lazily-py — Python implementation with context-as-dict
Blog post: Lazily — Reactive Primitives Done Right

License

MIT

lazily v0.11.0

Prepared for the operator publish (#12b1). Last published: crates.io 0.10.3. Tag v0.11.0 is intended to point at this release commit on main.

Highlights

This minor lands eager Signal primitives across the single-threaded, thread-safe, and async contexts, plus a full WebRTC DataChannel transport stack on top of the v0.10.x reactive core: a sans-IO str0m backend, a real-socket networked backend, a WebSocket fallback, signaling, and the glue that drives a complete handshake end to end.

Features

#3dmm / #x7sp - add SignalHandle, ThreadSafeSignalHandle, and AsyncSignalHandle as eager derived values backed by memo slots plus puller effects. Signals provide always-materialized v1 -> v2 updates with no observable unset window, inherit memo equality suppression, and are documented in the README, SPEC, PROTOCOL, and mdBook docs.
#lzwebrtcwire — wire SignalingClient to Str0mNet. New webrtc_signaling module (offer_to_peer / answer_next_offer) owns the full SDP offer/answer + trickled-ICE handshake over SignalingClient, pumping frames into accept_answer / add_remote_candidate until the data channel opens. Integration test brokers two real SignalingClient WebSocket peers through an in-process #yxjw-protocol loopback relay and proves a permission-filtered Snapshot crosses the negotiated channel.
#lzwebrtcnet — networked str0m DataChannel backend (Str0mNet) over a real UDP socket with the str0m DTLS/SCTP/ICE driver.
#97xn — multi-channel reactive bridge hub.
#akp3 — WebSocket DataChannel backend (in-process loopback over a real WS handshake).
#webrtcbackend — concrete sans-IO str0m DataChannel backend.
#webrtc2 / #webrtc3 — WebRTC DataChannel IPC transport abstraction, loopback integration tests, and Criterion benchmarks.

CI / tests

#lzleanmodel - CI now builds the sibling Lean formal model so protocol invariants stay checked alongside the Rust suite.
#lzspecconf — IPC conformance run against the canonical lazily-spec fixtures.
#k03k / #lzasync — deterministic async resolve-loop window coverage.

Remaining (operator-gated)

Live two-host / NAT validation of Str0mNet through the deployed #yxjw Cloudflare Worker (#lzwebrtcnet-e2e, part of #h6qb) — cannot be done in CI.

Publish checklist (#12b1)

cargo publish (dry-run verified clean: 72 files, 233 KiB compressed).
gh release create v0.11.0 --notes-file RELEASE_NOTES_v0.11.0.md --title "lazily v0.11.0".
Rotate the crates.io token if expired before step 1.

Keyboard shortcuts

lazily-rs