lazily
Lazy reactive primitives for Rust — Context, Slots, Cells with automatic dependency tracking and cache invalidation.
Overview
lazily provides five core primitives for reactive computation:
- Context — owns all reactive state and manages the dependency graph
- Slot — a lazily-computed cached value that automatically tracks dependencies
- Cell — a mutable value that invalidates dependent Slots when changed
- Signal — an eager derived value that recomputes the instant a dependency invalidates, with no intermediate unset value
- Effect — a side-effect callback that automatically reruns after tracked dependencies invalidate
Values are lazy by default: dependents are marked dirty on invalidation but only validated or recomputed when accessed. When you need eager push-style semantics — recompute immediately, observe v1 -> v2 with no unset window — reach for Signal, which layers a puller effect over a memoized slot. The Slot -> Cell -> Signal progression lets you choose lazy or eager per derived value within one graph.
ctx.memo() Slots use a memo guard: if recomputation produces the same value, downstream dirty caches and effects are left alone.
Multiple updates can be grouped with ctx.batch(...) so invalidation and effect reruns happen once after the outermost batch exits.
Development
Minimum supported Rust version (MSRV): 1.88 — declared via rust-version in Cargo.toml. The crate uses let_chains (stabilized in 1.88) pervasively.
Run the local CI-equivalent suite with:
make check
The Makefile also exposes focused targets such as make test-tokio,
make test-loom, make benchmark-check, and make benchmark-update.
Usage
#![allow(unused)]
fn main() {
use lazily::Context;
let ctx = Context::new();
// Create a mutable cell
let counter = ctx.cell(0i32);
// Create a derived value (automatically tracks dependencies)
let doubled = ctx.computed(|ctx| {
let val = counter.get(ctx);
val * 2
});
assert_eq!(doubled.get(&ctx), 0);
// Mutate the cell — dependents are marked dirty (not recomputed yet)
counter.set(&ctx, 5);
// Slot recomputes lazily on next access
assert_eq!(doubled.get(&ctx), 10);
// Effects run immediately and then after tracked dependencies change
let effect = ctx.effect(move |ctx| {
println!("counter = {}", counter.get(ctx));
});
counter.set(&ctx, 6); // schedules and runs the effect once
effect.dispose(&ctx); // unsubscribes and prevents future reruns
// Batch writes coalesce invalidation and effect reruns.
ctx.batch(|ctx| {
counter.set(ctx, 7);
counter.set(ctx, 8);
});
}
Why Lazy?
| Lazy (Slots) | Eager (Signals) | |
|---|---|---|
| When does recomputation happen? | On access (get) | Immediately on change |
| Wasted work | Zero — only compute what’s read | Can compute values nobody uses |
| Glitch-free | By construction | Requires topological sorting |
| Ordering | Irrelevant — pull-based | Critical — push-based DAG walk |
| Use case | Request handling, data pipelines | UI rendering, real-time updates |
In a web server handling requests, you might have 50 computed values available but any given request only uses 5. With eager reactivity, all 50 recompute on every change. With lazy, only the 5 actually accessed compute.
lazily defaults to lazy but does not force the choice on you: derive with ctx.computed()/ctx.memo() for pull-based laziness, or ctx.signal() for the eager column above (UI rendering, real-time mirrors, always-materialized values). Both share the same context, dependency graph, glitch-freedom, and memo guard — pick per value.
Core Concepts
Context
Context owns all Slots and Cells. It manages the dependency graph and provides the API for creating, reading, and mutating reactive values. Think of it as the “world” for your reactive computations — in web frameworks, this maps to a request context, application scope, or component tree.
The current Context is intentionally single-threaded. It uses RefCell and
non-Send callback storage to keep the fast path allocation-only and mutex-free.
Create independent contexts per OS thread for local graphs, or use
ThreadSafeContext when one reactive graph must be shared across threads.
Slot
A SlotHandle<T> wraps a compute function Fn(&Context) -> T. The result is cached after first access. Dependencies are discovered automatically via a thread-local tracking stack — any Slot or Cell accessed during computation becomes a dependency. ctx.computed() is the ergonomic name for a derived value; ctx.slot() is the same primitive. Use ctx.memo() when T: PartialEq and equal recomputations should suppress downstream work.
When a dependency is invalidated, the Slot marks its cached value dirty. It does not validate or recompute until ctx.get() is called again.
For ctx.memo() slots, if recomputation returns a value equal to the previous cache, downstream dirty Slots become fresh without recomputing, and scheduled effects that only depended on unchanged Slots skip cleanup/rerun.
Dependencies are dynamic. Every time a Slot recomputes, it re-discovers its dependencies from scratch. If your compute function has conditional branches that access different Cells depending on state, the dependency graph updates automatically. No stale subscriptions, no manual cleanup.
Cell
A CellHandle<T> holds a mutable value. cell.set(&ctx, value) and ctx.set_cell() compare old and new values via PartialEq — if unchanged, no invalidation occurs. If changed, all dependent Slots are recursively marked dirty.
Signal
A SignalHandle<T> is an eager derived value — the eager counterpart to a lazy Slot, one step further along the Slot -> Cell -> Signal progression. Where a Slot only marks itself dirty on invalidation and recomputes on the next read, a Signal recomputes the instant a dependency is invalidated, before the invalidating set/set_cell/batch call returns. The value is always materialized, so observers never see an intermediate unset value — a dependency change drives the value directly from v1 to v2.
#![allow(unused)]
fn main() {
let n = ctx.cell(1);
let doubled = ctx.signal(|ctx| n.get(ctx) * 2); // materialized now: 2
n.set(&ctx, 5); // doubled is already 10 — eager
assert_eq!(doubled.get(&ctx), 10);
}
A Signal is composed from existing primitives, not a parallel engine: a memoized Slot (ctx.memo) supplies glitch-free, pull-based, memo-guarded recomputation, and a small puller Effect re-materializes that slot after every invalidation to supply the eagerness. Consequently a Signal inherits the memo guard (an equal recompute suppresses downstream work) and diamond glitch-freedom (D = f(A, g(A)) never surfaces a mixed new-A/old-g(A) intermediate), and batched writes settle to one consistent recomputation at batch exit.
ctx.signal() requires T: PartialEq + 'static (the memo guard); get_signal additionally requires T: Clone. signal.dispose(&ctx) removes the eager puller — the value stays readable but reverts to lazy (recompute-on-read) behavior. The same primitive is available on ThreadSafeContext (signal, returning a Send + Sync ThreadSafeSignalHandle<T>) and AsyncContext (signal_async, with a non-blocking get_signal snapshot and an awaiting get_signal_async); see SPEC.md for the per-context type bounds and the async eagerness caveat.
Batch Updates
ctx.batch(|ctx| { ... }) groups multiple cell updates and explicit slot/cell clears into one invalidation pass. Nested batches flush only when the outermost batch exits. Direct ctx.get_cell() reads inside the callback see the latest cell value immediately; changed-cell dependents are marked dirty after the batch, so Slot reads during the callback return their pre-batch cached value until the batch completes.
Effect
An EffectHandle represents a side-effect callback registered with ctx.effect(). Effects run immediately, track any Slots or Cells read during that run, and rerun after those dependencies invalidate. Scheduled effect reruns are flushed after the invalidation pass, so diamond dependency paths coalesce to one rerun. Effects scheduled only by dirty Slot dependencies first validate those Slots and skip cleanup/rerun when values are unchanged.
Effects can return a cleanup closure. Cleanup runs before the next rerun and when the handle is disposed:
#![allow(unused)]
fn main() {
let effect = ctx.effect(move |ctx| {
let value = counter.get(ctx);
move || println!("cleanup for {value}")
});
effect.dispose(&ctx);
}
API
| Method | Purpose |
|---|---|
Context::new() | Create a new context |
ctx.computed(|ctx| T) | Create a derived lazily-computed value |
ctx.slot(|ctx| T) | Create a lazily-computed slot; synonym of ctx.computed() |
ctx.memo(|ctx| T) | Create a lazily-computed slot with a PartialEq memoization guard |
slot.get(&ctx) | Get value (computes if unset) |
ctx.get(&slot) | Context method alias for slot.get(&ctx) |
ctx.cell(value) | Create a mutable cell |
cell.get(&ctx) | Get cell value |
ctx.get_cell(&cell) | Context method alias for cell.get(&ctx) |
ctx.set_cell(&cell, value) | Update cell (marks dependents dirty if changed) |
cell.set(&ctx, value) | Handle method alias for ctx.set_cell(&cell, value) |
ctx.signal(|ctx| T) | Create an eager derived value (recomputes on invalidation, no unset window); T: PartialEq + 'static |
signal.get(&ctx) | Get the signal’s value (T: Clone); also ctx.get_signal(&signal) |
signal.dispose(&ctx) | Remove the eager puller; value reverts to lazy recompute-on-read |
signal.is_active(&ctx) | Check whether the eager puller is still registered |
ctx.batch(|ctx| { ... }) | Defer changed-cell dirty marking and explicit clears until the outermost batch exits |
ctx.effect(|ctx| { ... }) | Run an effect immediately and rerun it after tracked dependencies invalidate |
ctx.is_set(&slot) | Check if slot has a cached, fresh value |
slot.clear(&ctx) | Clear cached value and cascade to dependents |
cell.clear_dependents(&ctx) | Clear downstream slots without changing cell value |
effect.dispose(&ctx) | Dispose an effect and unsubscribe dependencies |
effect.is_active(&ctx) | Check whether an effect is still registered |
ThreadSafeContext
ThreadSafeContext is the mutex-backed counterpart for sharing one reactive
graph across OS threads. It mirrors the core Context methods while requiring
Send + Sync + 'static values and compute/effect callbacks. The graph lock is
released before user compute callbacks, effect callbacks, or cleanup closures
run, so callbacks can re-enter the same context without deadlocking. If a slot
is invalidated while its callback is running, the stale result is discarded and
the getter retries before returning a fresh value.
Design
- Lazy by default, eager on demand: Slots mark dirty on invalidation and validate/recompute on access;
ctx.signal()opts a value into eager recomputation (a memo-slot + puller-effect composition) with no intermediate unset state - Ergonomic aliases:
ctx.computed()names derived values while preservingctx.slot()for low-level terminology - PartialEq guard:
CellHandle::set()only invalidates when value actually changes - Memo guard: Dirty
ctx.memo()Slots compare recomputed values and suppress downstream recomputation/effect reruns when values are equal - Dynamic dependencies: Edges re-discovered on each recomputation (no stale subscriptions)
- Batching: Multiple writes share one invalidation/effect flush boundary
- Effect scheduling: Effects rerun after dependency invalidation and coalesce duplicate schedules
- Slot-id-indexed contiguous node storage for the single-threaded fast path
- Interior mutability via
RefCell(single-threaded) - Thread-local tracking stack for automatic dependency discovery
- Zero mandatory runtime dependencies in the default library surface
- Optional
instrumentationfeature for benchmark counters, lock timing, and thread-safe lock attribution
Threading Roadmap
lazily-rs guarantees local, single-threaded Context graphs plus an explicit
ThreadSafeContext for shared graphs. SlotHandle<T> and CellHandle<T> are
Send + Sync when T is Send + Sync, and EffectHandle is also Send + Sync,
but handles must be used with their owning context.
Enable the optional loom feature to run the thread-safe synchronization model:
cargo test --features loom --test thread_safe_loom
Enable the optional tokio feature for sync-on-Tokio integration tests and the
tokio_sync example:
cargo test --features tokio
cargo run --example tokio_sync --features tokio
The feature proves ThreadSafeContext can be shared through tokio::spawn and
tokio::task::spawn_blocking. It does not add async computations or effects;
those need the separate AsyncContext design captured in SPEC.md, including
in-flight future deduplication, stale completion handling, cleanup ordering, and
separate Send versus LocalSet surfaces.
ThreadSafeContext intentionally keeps one mutex-backed graph lock while
fresh cached slot reads use a per-slot read-mostly cached-value sidecar.
Dependency edges, dirty/revision state, cached-value publication, batching, and
effect queues still mutate under the graph mutex. In-flight recompute waiters
use per-slot generation Condvar sidecars so they can park while the compute
owner runs user code, and a completion only wakes waiters for that finished
slot. Per-slot dependency summaries let cell-only dirty refreshes claim the
SlotId-partitioned recompute sidecar and skip the graph-locked get_refresh
dependency scan before the final publish mutation. Changed-cell and slot-value
invalidation build an explicit frontier plan, then apply dirty flags, revisions,
and effect scheduling in one graph-mutex mutation boundary; slot-only
changed-cell frontiers may publish dirty state through per-node sidecars with
cache-revision dirty epochs instead. The thread_safe_graph_propagation benchmarks
compare fan-out eager validation, fan-out/fan-in lazy dirty epoch publication,
and fan-in batched flush behavior with lock attribution, effect queue pushes,
dependency-edge counters, sidecar dirty marks, sidecar fallbacks, and dirty
epoch advances. Sharded-lock or CAS variants should wait for lock wait/hold
benchmark evidence and a Loom or Shuttle safety model for stale in-flight
completion, invalidation during compute, dynamic dependency cleanup/disposal,
effect scheduling/disposal, and re-entrant callbacks. A lock-free versioned
optimistic read path is deferred
until cached values can be retained independently of graph-protected
erased-value storage.
Benchmarks
See BENCHMARKS.md for full benchmark results, regression budgets, lock attribution, and instrumentation profiles.
Multi-Language
lazily is implemented across three languages with shared semantics:
| lazily-rs | lazily-zig | lazily-py | |
|---|---|---|---|
| Context | Owned Context struct | Explicit allocator | Plain dict |
| Slot creation | Box<dyn Fn> closures | comptime function pointers | Lambdas |
| Cell equality | PartialEq trait | std.meta.eql | != operator |
| Thread safety | Single-threaded Context; explicit ThreadSafeContext | Mutex by default | GIL |
| Storage | Unified generics | .direct / .indirect | Object identity |
Cross-Channel Compatibility
The cross-language family should use one graph-state protocol across channels:
IpcMessage::Snapshot and IpcMessage::Delta. Rust FFI is viable as a narrow
C ABI adapter with opaque handles and owned byte buffers, not by sharing live
Rust contexts, closures, typed handles, or references across the ABI.
IPC, WebSocket frames, WebRTC data channels, and FFI byte buffers can then carry the same permission-filtered snapshots and deltas. Transport code owns framing, memory ownership, reliability, and back-pressure; lazily semantics stay in the shared message schema.
Enable the ffi feature for the C ABI adapter. It exposes an opaque
LazilyFfiChannel, JSON IpcMessage validation/classification helpers, and
Rust-owned LazilyFfiBytes buffers with an explicit free function. The adapter
re-encodes every accepted frame as canonical IpcMessage JSON, so FFI callers
share the same state plane as other channels.
Related
- lazily-zig — Zig implementation with FFI support
- lazily-py — Python implementation with context-as-dict
- Blog post: Lazily — Reactive Primitives Done Right
License
MIT
lazily-rs Specification
Rust library for lazy evaluation with context-aware dependency tracking and cache invalidation. Counterpart to lazily-zig and lazily-py.
Core Concepts
Context
Container for all slots and cells. Owns all allocations via interior mutability
using a single RefCell<ContextInner>.
#![allow(unused)]
fn main() {
struct ContextInner {
nodes: Vec<Option<Node>>,
next_id: u64,
free_ids: Vec<u64>,
pending_effects: VecDeque<SlotId>,
scheduled_effects: HashSet<SlotId>,
flushing_effects: bool,
batch_depth: usize,
batched_cells: HashSet<SlotId>,
batched_cell_clears: HashSet<SlotId>,
batched_slots: HashSet<SlotId>,
}
pub struct Context {
inner: RefCell<ContextInner>,
}
}
API:
| Method | Purpose |
|---|---|
Context::new() | Create a new context |
ctx.computed(|ctx| T) | Create a derived lazily-computed value |
ctx.slot(|ctx| T) | Create a lazily-computed slot; synonym of ctx.computed() |
ctx.memo(|ctx| T) | Create a lazily-computed slot with a PartialEq memoization guard |
slot.get(&ctx) | Get value (computes if unset) |
ctx.get(&slot) | Context method alias for slot.get(&ctx) |
ctx.get_rc(&slot) | Get slot value as Rc<T>, avoiding deep clone |
ctx.cell(value) | Create a mutable cell |
cell.get(&ctx) | Get cell value |
ctx.get_cell(&cell) | Context method alias for cell.get(&ctx) |
ctx.get_cell_rc(&cell) | Get cell value as Rc<T>, avoiding deep clone |
ctx.set_cell(&cell, value) | Update cell (marks dependents dirty if changed) |
cell.set(&ctx, value) | Handle method alias for ctx.set_cell(&cell, value) |
ctx.batch(|ctx| { ... }) | Defer changed-cell dirty marking and explicit clears until the outermost batch exits |
ctx.effect(|ctx| { ... }) | Run an effect immediately and rerun it after tracked dependencies invalidate |
ctx.is_set(&slot) | Check if slot has a cached, fresh value |
slot.clear(&ctx) | Clear cached value and cascade to dependents |
cell.clear_dependents(&ctx) | Clear downstream slots without changing cell value |
effect.dispose(&ctx) | Dispose an effect, unsubscribe dependencies, and run cleanup |
effect.is_active(&ctx) | Check whether an effect is still registered |
Context stores nodes in a slot-id-indexed Vec<Option<Node>> rather than a
hash map. SlotId values are allocated sequentially; effect disposal returns the
ID to a free list for reuse, preventing unbounded Vec growth from transient
effects while keeping lookups contiguous and hash-free.
Dependency and dependent edges use SmallVec<[SlotId; 4]> rather than HashSet<SlotId>.
For the typical 1-3 dependency fan-out, SmallVec stores edges inline without heap
allocation, avoids hash computation overhead, and eliminates the temporary
Vec<SlotId> allocations that were previously required in every refresh/clear/dirty
hot path (replaced with SmallVec::clone() or std::mem::take). Sets that require
true dedup semantics (scheduled effects, batch queues, tracking frames) remain as
HashSet<SlotId>.
The single-threaded Context consolidates all mutable state behind one
RefCell<ContextInner> instead of ten separate RefCell fields. This reduces
borrow-check overhead from 4-6 flag modifications per get() to 1 and eliminates
the risk of re-entrant borrows across fields. Slot and effect compute closures are
stored as Rc<dyn Fn> so they can be cloned cheaply without unsafe pointer copies.
ThreadSafeContext
Mutex-backed counterpart to Context for sharing one reactive graph across OS
threads. It mirrors the core local-context API and requires thread-safe values
and callbacks.
#![allow(unused)]
fn main() {
pub struct ThreadSafeContext {
inner: Arc<ThreadSafeInner>,
}
}
API:
| Method | Purpose |
|---|---|
ThreadSafeContext::new() | Create a new thread-safe context |
ctx.computed(|ctx| T) | Create a Send + Sync derived lazily-computed value |
ctx.slot(|ctx| T) | Create a Send + Sync lazily-computed slot |
ctx.memo(|ctx| T) | Create a Send + Sync lazily-computed slot with a PartialEq memoization guard |
slot.get(&ctx) | Get value from any thread (computes if unset) |
ctx.get(&slot) | Context method alias for slot.get(&ctx) |
ctx.cell(value) | Create a mutable Send + Sync cell |
cell.get(&ctx) | Get cell value from any thread |
ctx.get_cell(&cell) | Context method alias for cell.get(&ctx) |
ctx.set_cell(&cell, value) | Update cell and invalidate dependents across threads |
ctx.batch(|ctx| { ... }) | Defer invalidation until the outermost shared batch exits |
ctx.effect(|ctx| { ... }) | Run a Send + Sync effect immediately and rerun it after tracked dependencies invalidate |
ctx.clear(&slot) | Clear cached value and cascade to dependents |
ctx.clear_cell_dependents(&cell) | Clear downstream slots without changing cell value |
ctx.dispose_effect(&effect) | Dispose an effect, unsubscribe dependencies, and run cleanup |
ctx.is_effect_active(&effect) | Check whether an effect is still registered |
Slot
Lazily-computed cached value with dependency tracking. A Slot is fresh, dirty, or unset; dirty slots may retain a previous cached value for memo validation.
ctx.memo() creates a Slot whose values implement PartialEq so dirty caches can be compared against recomputed values.
#![allow(unused)]
fn main() {
type ComputeFn = dyn Fn(&Context) -> Rc<dyn Any>;
type EqualsFn = dyn Fn(&dyn Any, &dyn Any) -> bool;
struct SlotNode {
value: Option<Rc<dyn Any>>,
type_id: TypeId,
compute: Rc<ComputeFn>,
equals: Option<Box<EqualsFn>>,
dependencies: SmallVec<[SlotId; 4]>,
dependents: SmallVec<[SlotId; 4]>,
dirty: bool,
force_recompute: bool,
}
}
Semantics:
- Activation: First
ctx.get()calls the compute function, caches the result - Computed alias:
ctx.computed()creates the same Slot asctx.slot()for derived-value ergonomics - Invalidation: Marks the cached value dirty and marks downstream slots dirty without discarding their cached values
- Clearing: Explicit
slot.clear(&ctx)removes the cached value and clears all dependent slots recursively - Memo guard: Dirty
ctx.memo()slots compare recomputed values with the previous cache viaPartialEq; equal values make downstream dirty slots fresh without recomputing them - Dependencies: If Slot B accesses Slot A during computation, B depends on A. If A clears, B clears automatically; if A’s value changes after dirty validation, B is forced stale
- Immutable by default: Once set, a Slot’s value doesn’t change — only clear + recompute
- Dynamic: Dependencies re-discovered on each recomputation (no stale subscriptions)
Cell
Mutable value container. Changing a Cell’s value marks dependent Slots dirty.
#![allow(unused)]
fn main() {
struct CellNode {
value: Rc<dyn Any>,
type_id: TypeId,
dependents: SmallVec<[SlotId; 4]>,
}
}
Semantics:
ctx.set_cell()andcell.set(&ctx, value)compare old and new viaPartialEq- If unchanged, no invalidation occurs (no-op)
- If changed, dependent Slots are marked dirty while cached values are preserved for memo validation
Effect
Side-effect callback that automatically tracks dependencies. Effects run immediately on creation, then rerun after any Cell or Slot read during the last run is invalidated.
#![allow(unused)]
fn main() {
type EffectFn = dyn Fn(&Context) -> Option<Box<dyn FnOnce()>>;
struct EffectNode {
run: Rc<EffectFn>,
dependencies: SmallVec<[SlotId; 4]>,
cleanup: Option<Box<dyn FnOnce()>>,
force_run: bool,
}
}
Semantics:
- Immediate activation:
ctx.effect()runs the callback once during creation - Auto-tracking: Any Slot or Cell accessed during the callback becomes a dependency
- Scheduling: Dependency invalidation schedules the effect, then the context flushes scheduled effects after the invalidation pass
- Coalescing: An effect scheduled through multiple dependency paths in the same invalidation pass runs once
- Memo guard: Effects scheduled by dirty slot dependencies first validate those slots and skip cleanup/rerun when values are unchanged
- Cleanup: Returning a cleanup closure runs it before the next rerun and on disposal
- Disposal:
effect.dispose(&ctx)unsubscribes from dependencies, removes pending scheduled work, and prevents future reruns
Signal
Eager derived value. A Signal sits one step beyond a Slot on the
Slot -> Cell -> Signal progression: where a Slot is lazy (invalidation
only marks it dirty; the value is recomputed on the next read), a Signal is
eager — it recomputes the instant any dependency is invalidated. The value
is always materialized, so observers never see an intermediate unset value: a
dependency change drives the value directly from v1 to v2.
A Signal is composed from existing primitives: a memoized Slot (ctx.memo)
plus a small puller Effect that re-materializes the slot after every
invalidation. This composition is intentional — it inherits the Slot’s
glitch-free, pull-based recomputation and the memo guard, while the Effect
supplies eagerness.
#![allow(unused)]
fn main() {
let n = ctx.cell(1);
let doubled = ctx.signal(|ctx| n.get(ctx) * 2); // materialized now: 2
n.set(&ctx, 5); // doubled is already 10
assert_eq!(doubled.get(&ctx), 10);
}
Semantics:
- Eager activation:
ctx.signal()computes the value once at creation; the value is set from the start - Eager recomputation: Dependency invalidation recomputes the value during the invalidation flush, before the invalidating
set_cell/set/batchcall returns — no read is required to drive it - No unset state: The backing slot is invalidated via dirty-marking (not hard-cleared), so the value transitions
v1 -> v2and is never observed as unset - Memo guard: Backed by
ctx.memo, a recomputation that yields an equal value (viaPartialEq) does not invalidate downstream dependents - Glitch-free: Recomputation is pull-based; a Signal that reads other Signals/Slots always observes values consistent with the current inputs (e.g. a diamond
D = f(A, g(A))never surfaces a mixed new-A/old-g(A)intermediate) - Batch coalescing: Writes inside
ctx.batch()settle to a single consistent recomputation at batch exit - Type bounds:
signal<T>requiresT: PartialEq + 'static(for the memo guard);get_signaladditionally requiresT: Clone - Disposal:
signal.dispose(&ctx)removes the eager puller; the value remains readable and reverts to lazy (recomputed on next read) behavior
Signal across context types
The eager-Signal primitive is exposed on all three context types with the same
memo-slot + puller-effect composition, so shared-graph and async consumers get
the same always-set, glitch-free v1 -> v2 derived values that the
single-threaded Context provides (#lzsignalparity).
ThreadSafeContext::signal— shared-graph counterpart. Returns aThreadSafeSignalHandle<T>with.get/.dispose/.is_active(&ctx)helpers and matchingctx.get_signal/dispose_signal/is_signal_active. Recomputation is eager (driven during the invalidation flush before theset_cell/batchcall returns), glitch-free, memo-guarded, and batch-coalesced — identical to the single-threaded semantics above. Type bounds addSend + Sync:signal<T>requiresT: PartialEq + Send + Sync + 'static;get_signaladditionally requiresT: Clone. The handle isCopy + Send + Syncand may be read from any thread sharing the context.AsyncContext::signal_async— async counterpart. Returns anAsyncSignalHandle<T>backed bymemo_asyncplus aneffect_asyncpuller that awaits the slot after every invalidation. Reads:ctx.get_signal(orhandle.get) returnsOption<T>as a non-blocking snapshot;ctx.get_signal_async(orhandle.get_async) awaits the up-to-date value. Inside a slot/effect callback,AsyncComputeContext::get_signal_asyncreads a signal and registers its backing slot as a dependency, enabling chained async signals and downstream observers. Type bounds:T: PartialEq + Clone + Send + Sync + 'static.- Eagerness is runtime-driven: because resolution is asynchronous, the puller drives the recompute to completion on the runtime shortly after the invalidating write rather than synchronously within it.
- Propagation is not suppressed on equal recompute: the async memo guard
keeps the value correct on an equal recompute, but — unlike the
single-threaded/thread-safe graph — does not suppress downstream
propagation (async invalidation force-reruns effect dependents on every
upstream change). No inconsistent (glitch) value is ever observed. This
matches the documented
async memo does not suppress downstream propagationbehavior ofmemo_async.
Batch
Write-coalescing boundary for multiple cell updates or explicit slot clears.
#![allow(unused)]
fn main() {
ctx.batch(|ctx| {
ctx.set_cell(&a, 1);
ctx.set_cell(&b, 2);
});
}
Semantics:
- Outermost boundary: Nested batches flush only when the outermost batch exits
- Changed cells:
ctx.set_cell()still updates the cell value immediately, but dependent dirty marking is queued until batch exit - Explicit clears:
slot.clear(&ctx)andcell.clear_dependents(&ctx)are queued until batch exit - Coalescing: Repeated updates to the same cell or clears of the same slot queue one invalidation root
- Thread-safe local batching: same-thread
ThreadSafeContextbatch writes buffer changed cells and clears in a thread-local batch frame, then merge that frame into the graph-owned batch queue at batch exit; cross-thread writes during another active batch still fall back to the graph-owned queue - Effect flushing: Effects scheduled by batched invalidation rerun after the batch invalidation pass and coalesce duplicate schedules
- Reads during a batch: Direct
ctx.get_cell()reads see the latest cell value immediately; dependent slot reads keep their pre-batch cached value until dirty marking flushes at batch exit
State Machine
A finite state machine built on top of CellHandle and the reactive graph.
StateMachine<S, E> wraps a CellHandle<S> as the current state and a pure
transition function Fn(&S, &E) -> Option<S>.
#![allow(unused)]
fn main() {
use lazily::{Context, StateMachine};
let ctx = Context::new();
let m = StateMachine::new(&ctx, Door::Closed, |s, e| match (s, e) {
(Door::Closed, DoorEvent::Button) => Some(Door::Opening),
(Door::Opening, DoorEvent::Open) => Some(Door::Open),
_ => None,
});
m.send(&ctx, DoorEvent::Button);
assert_eq!(m.state(&ctx), Door::Opening);
}
API:
| Method | Description |
|---|---|
StateMachine::new(ctx, initial, transition_fn) | Create with initial state + pure transition function |
send(ctx, event) -> bool | Evaluate transition; true if accepted, false if rejected (None) |
state(ctx) -> S | Read the current state |
state_handle() -> CellHandle<S> | Underlying cell for reactive dependencies |
on_transition(ctx, |old, new| ...) -> EffectHandle | Observer that fires on each state change with (old, new) |
state_is(ctx, target) -> SignalHandle<bool> | Eager signal: true when in target state |
Semantics:
- PartialEq guard: A transition to an equal state is accepted (
true) but does not invalidate dependents (theCellHandleequality guard suppresses no-op updates). To force re-entry, callcell.clear_dependents(ctx)beforesend. - Reactive integration: Any
ctx.computed,ctx.memo,ctx.signal, orctx.effectthat readsstate_handle()automatically recomputes/reruns on transition. - On-enter / on-exit: Use
ctx.effectwith cleanup — the effect body is on-enter, the returned cleanup closure is on-exit (runs before the next rerun). Alternatively, useon_transitionfor a single(old, new)observer. - Batch atomicity:
ctx.batch()coalesces multiplesend()calls — effects fire once after the batch settles. - Single-threaded:
StateMachineis backed byContext(single-threadedRefCell).
ThreadSafeStateMachine — cross-thread
ThreadSafeStateMachine<S, E> is the lock-backed counterpart to
StateMachine, mirroring the same API over ThreadSafeContext.
The transition function and state must be Send + Sync + 'static, so the
machine and its owning context can be shared across OS threads. The machine is
Clone — cloning yields another handle to the same state cell and transition
function.
#![allow(unused)]
fn main() {
use lazily::{ThreadSafeContext, ThreadSafeStateMachine};
use std::sync::Arc;
let ctx = Arc::new(ThreadSafeContext::new());
let m = ThreadSafeStateMachine::new(&ctx, Door::Closed, |s, e| match (s, e) {
(Door::Closed, DoorEvent::Button) => Some(Door::Opening),
_ => None,
});
m.send(&ctx, DoorEvent::Button);
assert_eq!(m.state(&ctx), Door::Opening);
}
on_transition returns an EffectHandle (dispose via
ctx.dispose_effect); state_is returns a ThreadSafeSignalHandle<bool>.
Observers fire synchronously within the invalidating send/batch call,
preserving the glitch-free pull-based ordering of ThreadSafeContext.
AsyncStateMachine — Tokio (async feature)
AsyncStateMachine<S, E> is the async counterpart, backed by
AsyncContext. The state lives in an AsyncCellHandle<S>;
because cells are the synchronous input layer of AsyncContext, send and
state are synchronous. Reactive observers use the async effect/signal APIs:
on_transition returns an AsyncEffectHandle and state_is returns an
AsyncSignalHandle<bool>. Because resolution is asynchronous, eager
recomputation settles on the runtime rather than synchronously within send.
#![allow(unused)]
fn main() {
use lazily::{AsyncContext, AsyncStateMachine};
let ctx = AsyncContext::new();
let m = AsyncStateMachine::new(&ctx, Door::Closed, |s, e| match (s, e) {
(Door::Closed, DoorEvent::Button) => Some(Door::Opening),
_ => None,
});
m.send(&ctx, DoorEvent::Button);
assert_eq!(m.state(&ctx), Door::Opening);
}
Regression property harness
The default Rust test suite includes tests/property_graph.rs, a proptest
harness that drives a fixed reactive graph with generated programs of cell
sets, equal-value sets, cell.clear_dependents, slot.clear, memo-guarded
dependencies, batch boundaries, reads, and effect disposal/recreation.
Each generated program is checked against a pure model:
- Cell and slot reads must match the model after every operation.
- Equal-value cell sets must not schedule effect cleanup/rerun work.
- Same-parity cell changes through a
memoslot must preserve downstream effect run counts when the observable output is unchanged. - Explicit slot and cell-dependent clears must hard-clear cached slots when no effect is active, and must rerun active effects to re-prime their dependencies.
- Operations inside
ctx.batchmust not clear cached dependent slots or run effect cleanups/reruns until the outermost batch exits; dependent reads during the batch must continue to see the pre-batch cached value.
SlotId
Unique identifier for reactive nodes. Lightweight Copy type wrapping a u64.
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SlotId(u64);
}
Both SlotHandle<T> and CellHandle<T> wrap a SlotId with PhantomData<T> for type safety.
Dependency Tracking
Uses a thread-local tracking stack (mirroring lazily-zig’s TrackingFrame approach).
- When a Slot computes, it pushes a frame onto the tracking stack
- When an Effect runs, it also pushes a frame onto the tracking stack
- Any nested slot/cell access sees the parent frame
- The child registers the parent as a dependent
- When a dependency clears, slot dependents hard-clear recursively; when a Cell changes, slot dependents are marked dirty and effect dependents are scheduled
Threading and Concurrency Contract
Current Context
Context is intentionally local to one OS thread. It owns RefCell graph state,
cached values as Box<dyn Any>, compute callbacks as Box<dyn Fn(&Context)>,
effect callbacks as Box<dyn Fn(&Context)>, and cleanups as Box<dyn FnOnce()>.
Those storage choices avoid synchronization overhead for the common single-threaded
path and make Context neither Send nor Sync.
Current guarantees:
- Independent
Contextinstances may be used on different OS threads - A single
Contextmust not be moved into, shared with, or accessed from another thread SlotHandle<T>andCellHandle<T>are lightweight ids and areSend + SyncwhenTisSend + Sync, but they are only meaningful with their owning contextEffectHandleis a lightweight id; effect execution and cleanup remain tied to the owning context thread- Dependency tracking is thread-local; a compute/effect callback cannot split work onto another thread and expect nested reads there to attach to the original tracking frame
ThreadSafeContext
Thread-safe support is explicit rather than a silent change to Context.
ThreadSafeContext mirrors the existing Context methods while preserving the
single-threaded fast path.
API bounds:
| Method family | Additional bounds |
|---|---|
cell, get_cell, set_cell | T: PartialEq + Clone + Send + Sync + 'static |
slot, computed | T: Clone + Send + Sync + 'static; compute closure Fn(&ThreadSafeContext) -> T + Send + Sync + 'static |
memo | T: PartialEq + Clone + Send + Sync + 'static; compute closure Send + Sync + 'static |
effect | effect callback Fn(&ThreadSafeContext) -> R + Send + Sync + 'static; cleanup FnOnce() + Send + 'static |
| handles | remain id-only and copyable; usable from any thread only with the owning ThreadSafeContext |
Locking model:
- Uses one context-level
Mutexsynchronization primitive for graph state before introducing finer-grained graph locks ThreadSafeStatestores nodes in a slot-id-indexedVec<Option<ThreadSafeNode>>matching the single-threadedContext, eliminating hash-map lookup overhead on every node access. Slot IDs are reused via a free list on effect disposal.- Fresh cached slot reads use a per-slot read-mostly cached-value sidecar; dependency-edge changes, invalidation frontier application, batch queues, effect queues, and disposal remain graph mutex mutations
- Read-mostly cached slot access is versioned optimistically: the getter loads a per-slot atomic cache revision before cloning the retained cached
Arc, then validates the revision and dirty/force flags again after the clone. Any concurrent invalidation, clear, or value publish changes the revision and forces the getter onto the graph-validated refresh path. - Each thread-safe slot also owns a per-slot recompute/value-publish sidecar for the cached-value visibility flags, in-flight bit, waiter
Condvar, and revision used to reject stale callback results. Graph-state dirty/revision fields remain mirrored under the context mutex for dependency-frontier traversal and tests. - Each thread-safe slot sidecar mirrors a per-slot dependency summary: the
current dependency ids plus a slot-dependency count. A cell-only dirty refresh
may claim the SlotId-partitioned recompute sidecar, snapshot old dependencies,
and skip the graph-locked
get_refreshdependency scan. The owner still takes the finalpublishgraph mutation to diff dynamic dependencies, publish the value, notify dependents, and reject stale in-flight revisions. - Cells and slots mirror their dependent frontiers into per-node sidecars keyed
by
SlotId. Changed-cell invalidation may use these sidecars without taking the context graph mutex only when no callback is actively discovering dependencies, the context is not inside a batch, and the discovered frontier contains slots only. The per-slot cache revision acts as the dirty epoch for sidecar publication, and instrumentation records each epoch advance so parallel writers can publish version state without the graph mutex while cached reads still reject mid-read invalidation races. Effect scheduling, batching, dynamic dependency discovery, disposal, and any frontier that reaches an effect fall back to the graph mutex. - Do not hold the graph lock while running user compute callbacks, effect callbacks, or cleanup closures
- Re-acquire the lock only to publish computed values, dependency edges, invalidation state, and pending effect work
- Slot refresh must avoid helper-level lock churn: a fresh cached get should clone the value through the per-slot fast path without taking a
get_refreshgraph lock or recursively validating unchanged dependencies, dependency refresh should not take a separate node-kind probe lock before recursively validating a dependency, cell dependencies should not be probed as refreshable slots, clean dirty flags should be folded into the refresh decision lock, and recompute must diff old/new dependency sets at publish so unchanged edges stay subscribed while only stale edges are removed - Recompute dependency tracking must skip graph-lock edge registration for dependencies already present in the slot’s previous dependency set, while still eagerly registering newly discovered dependencies during the callback so concurrent invalidation can mark the in-flight result stale
- Effect rerun dependency tracking must also preserve unchanged edges: dependencies already present on the effect remain subscribed through the rerun, newly discovered dependencies are registered during tracking, and stale dependencies are removed in one post-callback graph mutation before the next cleanup is stored
- Re-entrant user code must be able to call back into the same context without deadlocking
- Concurrent first access and dirty same-slot contention share one in-flight computation for the current slot revision; waiters check the per-slot recompute sidecar before the
get_refresh/publishgraph-lock path, park on that slot’s notification primitive, then return the published cache or retry if an invalidation makes the in-flight result stale - Recompute waiters observe the per-slot in-flight/revision state while holding the sidecar mutex, then park on the same sidecar
Condvar. Finishers publish value and dirty-state sidecar updates before clearing the in-flight bit and notifying, so a stale in-flight completion cannot be missed. - Recompute notifications are scoped to the slot that finished. A completion for one in-flight slot must not wake waiters parked behind another in-flight slot.
- Per-slot recompute wakeups use a waiter-counted handoff instead of
notify_all: the finisher callsnotify_onewhen waiters exist, and each awakened waiter notifies the next parked waiter after observing the completed sidecar state. This drains all waiters without a completion-wide wakeup stampede. - Optimistic cached reads fall back whenever the context is dirty, forced to recompute, or racing with a sidecar revision change. They do not publish dependency edges, flush effects, observe batch-local unflushed invalidations as fresh, or replace graph-locked refresh for ambiguous callback/dependency states.
- If an upstream invalidation happens while a slot callback is running, the in-flight stale result is not published as fresh; the getter retries until it can return a value that matches the latest dependency state
- Batch exit, effect scheduling, disposal, and explicit clears must each have a single atomic graph mutation boundary and one coalesced effect flush per outermost invalidation pass
- The outermost thread-safe batch exit must collect dependents for all changed cells and apply one coalesced frontier invalidation, so a shared dependent reached through many changed cells is marked dirty and advances revision once per batch flush
- Thread-safe invalidation uses an explicit
InvalidationPlancomputed from a frontier work queue under the graph mutex instead of recursive dependent walks. Changed-cell and slot-value-change roots snapshot dependent frontiers, coalesce duplicate slot ids in one invalidation pass, preserve direct changed-valueforce_recomputeupgrades when a slot is reached through both direct and downstream paths, snapshot hard-clear frontiers for explicit slot/cell clears, then apply dirty, clear, revision, and effect-scheduling mutations at the same graph mutation boundary. The plan shape is partitionable for future bounded worker traversal, but this prototype keeps snapshot and application under the context mutex until benchmark and model-checking evidence proves a parallel apply path safe. - Thread-safe stress coverage must run the same contention script under both
LowConcurrencyandHighConcurrency, mixing batched cell writes, explicit slot and cell-dependent clears, effect cleanup/rerun, effect disposal racing with writers, and concurrent cached reads. The harness lives intests/thread_safe_stress.rsand is part ofmake check.
Lock strategy evaluation:
-
Keep one context-level graph synchronization primitive until benchmark instrumentation shows a finer-grained design improves the relevant workload without trading off other contention cases
-
ThreadSafeContextuses read-mostly per-slot cached-value sidecars for fresh cached reads, a per-slot recompute/value-publish sidecar for in-flight same-slot waiters, a per-slot dependency summary for cell-only dirty refresh routing, and per-node dependent frontier sidecars for slot-only changed-cell invalidation; same-thread batches use thread-local batch frames to coalesce changed-cell queueing before the graph-owned batch flush; dependency graph mutations still require the context mutex -
The read-mostly cached-value sidecar is an optimistic validation path, not a lock-free graph replacement. Its atomic cache revision rejects mid-read invalidation and mid-read publish races, then falls back to the existing graph refresh path.
-
ThreadSafeContextuses per-slot sidecar recomputeCondvars for in-flight waiters. Those Condvars guard only per-slot in-flight/revision/cache-visibility state, use waiter-countednotify_onehandoff wakeups to avoid broadnotify_allcontention, and must not mutate dependency graph state independently of the context mutex -
The read-mostly prototype is benchmark-gated by the 1/2/4/8/16-worker
same_slot_write_read,independent_slots,read_mostly_waiters, andbatched_write_burstsmatrix after the#lazybatch1and#lazybatch2invalidation/read-churn fixes -
The dependent-frontier sidecar prototype is benchmark-gated by
thread_safe_contention / independent_slotsandset_cell_invalidation / independent_slot_contentionat 8 and 16 workers. It should reduceset_cell_invalidationgraph-lock acquisitions for independent slot-only roots without changing effect, batch, or dynamic-dependency semantics. -
A sidecar frontier invalidation that reaches a slot with recompute in flight falls back to the graph-locked invalidation path, so stale publishes cannot clear newer dirty markers.
-
High-parallel graph propagation profiles gate the lazy-invalidation path with
thread_safe_graph_propagationat 8 and 16 workers. The matrix compares fan-out eager validation, fan-out lazy dirty epoch publication, fan-in lazy dirty epoch publication, and fan-in batched flush behavior using throughput, p50/p95 latency, lock attribution, effect queue pushes, dependency-edge counters, sidecar dirty marks, sidecar fallbacks, and dirty epoch advances. -
The local batch-frame prototype is benchmark-gated by
set_cell_invalidation / batched_write_burstsandthread_safe_contention / batched_write_burstsat 8 and 16 workers. It should reduce per-write graph queueing during same-thread batches while preserving one coalesced dirty/effect flush at the outermost batch exit. -
Effect-heavy contention profiles gate any queue or batch synchronization change.
thread_safe_effect_contentionisolates effect queue coalescing, cleanup execution, and nested batch flush behavior at 8 and 16 workers with deterministic lock-site budgets before a sharded graph-lock design can be considered. -
Synchronization strategy comparison is release-gated by one fixed evidence table. The current
std::syncmutex/Condvar path is the baseline; narrower Condvar wakeups are adopted only for per-slot recompute waiters; parking_lot style parking and targeted CAS remain candidates. A candidate must report throughput plus p50/p95 latency for the required 8/16-worker contention and effect-heavy cases, stay within lock-site budgets, and carry Loom/Shuttle proof for stale completion, effect scheduling/disposal, batch flush, and re-entrant callbacks before release. -
Benchmark watch items from generated README deltas must be confirmed with a controlled A/B rerun before tuning. The rerun should use the same benchmark filter on the same host/toolchain when possible, isolate baseline and current code in clean worktrees or Criterion baselines, and record whether the signal reproduces. If confidence intervals overlap or Criterion reports no statistically significant change, document the watch item and avoid speculative synchronization changes.
-
Edge storage A/B benchmarking procedure: the
vec_edgesfeature flag switchesEdgeVecfromSmallVec<[SlotId; 4]>(default) toVec<SlotId>. To compare:cargo bench --bench context -- dependency_fan_out,set_cell_invalidation/high_fan_out --save-baseline smallveccargo bench --bench context --features vec_edges -- dependency_fan_out,set_cell_invalidation/high_fan_out --baseline smallvec- Compare Criterion output — if confidence intervals overlap, keep SmallVec; if Vec is faster at the tested fan-out widths, reconsider the default. The comparison must run on the same host/toolchain with no other workload.
-
Cached-read strategy is runtime-selectable (#vd5v / #rdstrat1). Both read paths are compiled in and chosen at context construction via
ThreadSafeContext::with_read_strategy(ReadStrategy), defaulting toLowConcurrency; the slot sidecarThreadSafeSlotFastPath.valueis aCachedReadStorageenum:LowConcurrency→parking_lot::RwLock<Option<Arc<dyn Any + Send + Sync>>>read — optimal uncontended / low core counts (the default).HighConcurrency→arc_swap::ArcSwapOption<Arc<dyn Any + Send + Sync>>wait-free load — no read lock; optimal at 8+ cores. (arc-swap’sRefCntisSized-only, so it storesArc<Arc<dyn Any>>; the extra outerArcis allocated only on the cold publish path, never on the read.)
Both reconstruct
&Tvia the inlinetype_idwithout vtable indirection, and both carry the same atomiccache_revision+dirty/force_recomputevalidation envelope (loaded before, re-checked after the clone), so agetstarting after a completed cross-thread invalidation cannot return the pre-invalidation value regardless of strategy. The runtime selection costs one per-read enum branch (the price of compiling both paths). 0.9.0 shipped arc-swap as the default; 0.10.0 (#rdstrat1) flips the default toLowConcurrencywith explicit opt-in toHighConcurrency. Verified by the full default suite (both strategies) and thethread_safe_loommodel (the validation algorithm is identical across variants). The inline small-Copyseqlock fast path that subsumes this tradeoff for small values is #rdstrat2 (implemented; opt-in viaslot_copy/computed_copy/memo_copy— see Inline small-Copyseqlock below).Contention tradeoff (rigorous isolated-worktree A/B, #xtwf). This is a deliberate low-contention-for-high-contention trade, not a free win. Comparing
463ca71(arc-swap) against its parent06fd3c2(theparking_lot::RwLockread) on a shared Criterion target dir, same host/toolchain:benchmark arc-swap vs RwLock cached_reads/thread_safe_context(1 thread)+3.1% (slower) read_mostly_waiters/4+11.9% (slower) read_mostly_waiters/8−6.5% (faster) read_mostly_waiters/16−28.6% (faster) arc-swap’s wait-free read wins at high core counts where
RwLockread-lock cache-line traffic dominates, but its debt-tracking load plus theArc<Arc<…>>double-indirection costs a few percent when uncontended. lazily-rsThreadSafeContexttargets highly-parallel reactive graphs, so the 8/16-worker win is the operative case and the ≤~3% uncontended/low-contention regression is accepted (see the revised benchmark gate below). The contended numbers carry wide confidence intervals; treat the crossover (~4→8 workers) as approximate. -
Any future sharding or CAS path must include a Loom or Shuttle safety model covering concurrent first get, stale in-flight completion, invalidation during compute, effect scheduling/disposal, and re-entrant callbacks before it can replace the single-graph-lock design
-
The current sidecar
Mutex/Condvarwaiter path, optimistic cached-read fallback, and explicit invalidation-plan safety envelope are covered bycargo test --features loom --test thread_safe_loom, which models concurrent first get, scoped slot notification, waiter-counted handoff wakeup draining, stale in-flight completion and retry, read-mostly waiter handoff, mid-read optimistic validation fallback, invalidation during compute, fast-frontier fallback while dependency discovery is active, dynamic dependency switch/disposal cleanup, effect scheduling/disposal races, re-entrant callback graph access, duplicate diamond paths marking each frontier slot once, effect enqueue coalescing, nested batch invalidation flushing only at the outermost boundary, and the inline small-Copyseqlock (#rdstrat2) refusing torn reads and stale post-invalidation reads under concurrent single-writer publish -
A lock-strategy change must preserve the rule that user compute/effect/cleanup callbacks never run while holding graph-state locks
Sharded/versioned storage evaluation:
- After the frontier invalidation and read-churn prototypes, the
thread_safe_contentionbenchmark and instrumentation rows are the gate for storage changes. The read-mostly cached-value sidecar is limited to fresh cached slot reads; write-side graph mutation remains serialized so the same benchmark matrix can show whether read-side wins trade off same-slot writes, independent slots, or batched aggregate writes. - Do not replace the single graph lock with sharded storage in the current implementation. Sharding may help independent roots and slots, but it does not remove serialization for same-root writes, shared aggregate slots, effect queues, batch-depth accounting, disposal, or dynamic dependency edge changes. A shard design must first define shard ownership for dependency edges that cross shards, a deterministic merge/apply order for dirty/revision/effect mutations, and a Loom or Shuttle model for deadlock-free multi-shard invalidation and disposal.
- Do not treat versioned optimistic reads as an invalidation optimization.
Versioned reads target fresh cached
getlatency, while the isolatedset_cell_invalidationprofiles attribute invalidation pressure to write-side graph mutation. The current prototype keeps independently retained sidecarArcsnapshots plus atomic dirty/revision validation so agetstarting after a cross-thread invalidation cannot return the pre-invalidation cached value. - The next storage experiment, if pursued, should be a benchmark-gated
prototype rather than a replacement: shard independent
ThreadSafeStatemutation by stable node id, keep effect queue and batch flush as one deterministic merge boundary, and require the isolatedthread_safe_effect_contentionprofiles,set_cell_invalidationmatrix, and Loom/Shuttle coverage to improve before adopting it.
Typed cache fast-path
Context and ThreadSafeContext store cached values as type-erased trait
objects (Rc<dyn Any> and Arc<dyn Any + Send + Sync>). Every cached read
calls downcast_ref::<T>(), which invokes a virtual type_id() through the
trait object’s vtable, compares the returned TypeId, and then performs a
pointer cast. The vtable call adds indirect branching overhead to the hot
cached-read path.
The typed cache fast-path stores a TypeId inline in each node at creation
time. Cached reads compare the stored TypeId directly (a single inline
u64 equality check) and, on match, use an unchecked pointer cast to recover
the typed reference. This eliminates the vtable indirection on every cached
slot and cell read for both Context and ThreadSafeContext.
Storage layout:
SlotNode.type_id: TypeId— set once atslot()/computed()/memo()creation fromTypeId::of::<T>()CellNode.type_id: TypeId— set once atcell()creationThreadSafeSlotFastPath.type_id: TypeId— set once at slot creationThreadSafeCellFastPath.type_id: TypeId— set once at cell creation
Read-path fast-path:
- Load the node’s
type_idfield (inlineu64load, no vtable) - Compare with
TypeId::of::<T>() - On match, cast the stored value pointer to
&Twithout going throughdyn Any::downcast_ref - On mismatch, panic with the same “type mismatch” message as before
get_rc() and get_cell_rc() follow the same pattern but clone the
reference-counted pointer instead of cloning the inner value.
The compute-function signature (dyn Fn(&Context) -> Rc<dyn Any> for
Context, dyn Fn(&ThreadSafeContext) -> Box<ThreadSafeAny> for
ThreadSafeContext) remains unchanged; the typed cache is a storage-side
optimization that does not affect the closure API.
Benchmark gate:
cached_reads/contextmust show a measurable improvement over thedowncast_refbaseline. The current baseline is approximately 8 ns per cached read; the typed cache target is to reduce this by the vtable call overhead (typically 0.5–1.5 ns on x86-64).cached_reads/thread_safe_context(single-thread, uncontended) is a bounded-regression gate, not a no-regression gate, as of the #vd5v lock-free read. Thearc-swapread sidecar may regress the uncontended cached read by up to ~3% in exchange for the high-contention win (read_mostly_waiters−6%/−29% at 8/16 workers; see Lock strategy evaluation → contention tradeoff). A regression beyond ~3% uncontended, or any regression in the 8/16-worker read-contention matrix, fails the gate and must be investigated. The earlier inline-TypeIdvtable-elimination win forcached_reads/context(single-threadedContext) is unaffected and must still hold.- The improvement must reproduce under controlled Criterion A/B comparison (same host, same toolchain, non-overlapping confidence intervals).
Future implications:
- Fully lock-free cached reads require typed storage so that a reader can
reconstruct a typed reference from an atomically published pointer without
vtable indirection. The inline
TypeIdis a prerequisite: it proves the type at compile time and makes the unchecked cast sound. Implemented (#vd5v):ThreadSafeSlotFastPath.valueis now anarc_swap::ArcSwapOption, soread_freshloads the published snapshot wait-free and recovers&Tvia the inlinetype_id— see Lock strategy evaluation above. - A future
ErasedValuestorage type could replaceRc<dyn Any>/Arc<dyn Any>entirely, storing the value inline for small types and avoiding heap allocation on compute. The current inline-TypeIdstep preserves theRc<dyn Any>/Arc<dyn Any>layout while unlocking the fast-path read optimization. Partially implemented forThreadSafeContextcached reads (#rdstrat2): the slot cached-read sidecar (CachedReadStorage::Inline) stores smallCopyvalues inline behind a wait-free seqlock — see Inline small-Copyseqlock below. The node still retains itsArc<dyn Any>value (the inline buffer is a read-acceleration duplicate); a fullErasedValuethat removes theArcfor non-Copytypes remains future work.
Inline small-Copy seqlock (#rdstrat2)
For small Copy values, the ThreadSafeContext slot cached-read sidecar
(ThreadSafeSlotFastPath.value) selects a third CachedReadStorage variant,
Inline, instead of the strategy-selected Locked/LockFree path. The value’s
bytes are stored inline in [AtomicU8; INLINE_CAP] (INLINE_CAP = 24,
alignment bound 16) behind a single-writer / multi-reader seqlock, with no
heap Arc, no refcount traffic on either read or publish. The inline path is
optimal under both ReadStrategy modes; the runtime mode only governs the
large / non-Copy fallback.
- Soundness conditions. Inline is chosen only when
T: Copyandsize_of::<T>() <= INLINE_CAPandalign_of::<T>() <= 16.Copyremoves anyDrop/ ownership hazard from a discarded torn read; the size/alignment bound keeps the byte copy in-bounds. The bytes are read/written with relaxed atomic per-byte operations (not a plainmemcpy), so a reader racing the single writer is well-defined under the Rust memory model — unlike a classic non-atomic seqlock, which has a formally-UB benign data race. The inlinetype_idprovesTso the validated byte snapshot can be reconstructed intoTwithout a vtable. - Single-writer invariant. Every
write(value publish inrecompute_slot_now, clear inapply_locked) runs while holding the graph state write lock, so writes are serialized; only reads are lock-free. Theseqcounter is even when stable, odd while a write is in progress; a reader observing an odd or changedseqdiscards its snapshot and retries. The closingReleasestore of the evenseqand the reader’s bracketing Acquire loads (with the canonical Acquire fence) make an accepted snapshot a consistent image of exactly one publish. - Same validation envelope. The inline path carries the identical atomic
cache_revision+dirty/force_recomputeenvelope as the other two strategies, so a read racing a publish/invalidation is rejected identically. - Opt-in constructors. Inline selection is not automatic on the generic
slot/computed/memoconstructors: stable Rust cannot branch onT: Copyinside a generic fn that lacks the bound (method resolution is pre-monomorphization, so aCopy-gated impl is never applicable where the bound is unprovable; automatic detection would require nightlyspecialization). The inline path is therefore opt-in through theCopy-boundedslot_copy/computed_copy/memo_copyconstructors, which fall back transparently to the strategy path when the value exceeds the inline size/alignment bound. - Loom gate. The seqlock orderings (single-writer publish vs. concurrent
lock-free readers, plus the
cache_revision/dirtyenvelope) are modeled bycargo test --features loom --test thread_safe_loom(inline_seqlock_reader_never_observes_torn_value,inline_seqlock_envelope_rejects_torn_and_stale_under_concurrent_publish,inline_seqlock_read_after_completed_invalidation_is_rejected). The two single-property models run unbounded (exhaustive); the combined envelope model uses a preemption-boundedBuilder(bound 4, 60 s duration cap) because the full 6-atomic envelope across two threads plus the driving body makes the unbounded permutation space non-terminating — the bound is validated by confirming the model still flags an injected torn-read regression, satisfying the lock-strategy Loom gate before landing.
Tokio integration is scoped in two stages:
- Synchronous thread-safe sharing first:
ThreadSafeContextshould work insidetokio::spawnandtokio::task::spawn_blockingwhen all captured values and callbacks satisfy theSend + Syncbounds above. This is exposed behind the optionaltokiofeature with async tests and thetokio_syncexample; it must not introduce async compute/effect semantics. - True async computations/effects are separate future work. They need explicit
semantics for in-flight future deduplication, cancellation, dependency tracking
across
.await, stale future completion, cleanup ordering, andSendversusLocalSetfutures.
AsyncContext
True async support is a new explicit async context surface, not an overload of
Context or ThreadSafeContext. The AsyncContext API lives behind a separate
async feature flag so downstream users do not accidentally accept the larger
semantic surface. The async feature depends on tokio for runtime primitives
(spawn, JoinHandle, notification).
AsyncContext type definitions
#![allow(unused)]
fn main() {
pub struct AsyncContext {
inner: Arc<AsyncContextInner>,
}
pub struct AsyncSlotHandle<T> {
id: SlotId,
_marker: PhantomData<T>,
}
pub struct AsyncCellHandle<T> {
id: SlotId,
_marker: PhantomData<T>,
}
pub struct AsyncEffectHandle {
id: SlotId,
}
pub struct AsyncComputeContext<'a> {
context_id: AsyncContextId,
node_id: SlotId,
inner: &'a AsyncContextInner,
}
}
AsyncContext API surface
| Method | Signature | Purpose |
|---|---|---|
new | fn new() -> Self | Create a new async context |
cell | fn cell<T>(&self, value: T) -> AsyncCellHandle<T> | Create a mutable cell (T: PartialEq + Clone + Send + Sync + 'static) |
get_cell | fn get_cell<T>(&self, handle: &AsyncCellHandle<T>) -> T | Get cell value (synchronous) |
set_cell | fn set_cell<T>(&self, handle: &AsyncCellHandle<T>, value: T) | Update cell and invalidate dependents |
computed_async | fn computed_async<T, F, Fut>(&self, compute: F) -> AsyncSlotHandle<T> | Create an async computed slot |
get | fn get<T>(&self, handle: &AsyncSlotHandle<T>) -> Option<T> | Synchronous cached read; returns Some(T) if resolved, None otherwise. Avoids async overhead on warm paths |
get_async | async fn get_async<T>(&self, handle: &AsyncSlotHandle<T>) -> T | Await slot value; uses get() fast-path for resolved slots, otherwise spawns async compute |
memo_async | fn memo_async<T, F, Fut>(&self, compute: F) -> AsyncSlotHandle<T> | Like computed_async with PartialEq memo guard |
effect_async | fn effect_async<F, Fut, C, CleanupFut>(&self, effect: F) -> AsyncEffectHandle | Create an async effect |
dispose_async_effect | fn dispose_async_effect(&self, handle: &AsyncEffectHandle) | Dispose async effect and await cleanup |
batch | fn batch<F, R>(&self, run: F) -> R | Synchronous batch boundary; schedules async reruns at batch exit |
API bounds:
| Method family | Additional bounds |
|---|---|
get | T: Clone + Send + Sync + 'static |
cell, get_cell, set_cell | T: PartialEq + Clone + Send + Sync + 'static |
computed_async, memo_async | T: PartialEq + Clone + Send + Sync + 'static; compute Fn(AsyncComputeContext) -> Fut + Send + Sync + 'static; future Future<Output = T> + Send + 'static |
effect_async | effect Fn(AsyncComputeContext) -> Fut + Send + Sync + 'static; future Future<Output = Option<C>> + Send + 'static; cleanup FnOnce() -> CleanupFut + Send + 'static; cleanup future Future<Output = ()> + Send + 'static |
| handles | remain id-only and copyable; usable from any task only with the owning AsyncContext |
AsyncSlotNode state machine
Each async slot tracks its state through a finite state machine:
#![allow(unused)]
fn main() {
enum AsyncSlotState {
Empty,
Computing { revision: u64, handle: JoinHandle<()> },
Resolved,
Error,
}
}
States:
- Empty: no cached value, no in-flight computation. Entered on creation and after hard clear.
- Computing: a
JoinHandletracks the in-flight future for the current revision. Concurrentget_asynccallers attach waiters to the same in-flight result instead of spawning duplicate futures. - Resolved: the cached value is fresh. The value remains until dependency invalidation transitions back to Computing.
- Error: the last computation failed. Callers receive the error or retry on
the next
get_async.
State transitions:
- Empty → Computing: first
get_asynccall or dependency invalidation when no cached value exists. - Computing → Resolved: future completes with
Ok, and the recorded revision still matches the current slot revision. The value is cached. - Computing → Error: future completes with
Err, and the recorded revision still matches. - Computing → Computing (stale): dependency invalidation advances the slot revision during an in-flight computation. The completing future finds its revision no longer matches and discards the result. A new future is spawned for the updated revision.
- Resolved → Computing: dependency invalidation marks the cached value stale and spawns a new computation.
- Error → Computing:
get_asyncretry after an error.
Revision tracking ensures stale completions are discarded: an async computation records the slot revision at start; at publish time the graph accepts the value only if the revision is still current.
AsyncContext cancellation contract
-
Waiter cancellation is safe: dropping one
get_asyncfuture does not cancel the shared in-flight computation while other waiters still need it. Each waiter holds a shared handle (e.g., oneshot receiver orShared<...>); dropping the receiver does not abort theJoinHandle. -
Stale completion handling: when dependency invalidation advances the slot revision during an in-flight computation, the completing future finds its recorded revision no longer matches and discards the result. Waiting callers are retried against the new revision or attached to the newly spawned future.
-
Explicit cancellation:
slot.clear(), dependency invalidation, or context disposal may mark the in-flight revision as canceled. If the runtime provides an abort handle, the task is aborted. User futures must be cancellation-safe because aborting drops them at an.awaitboundary. -
Context disposal: dropping the
AsyncContextcancels all in-flight computations via theirJoinHandle::abort()handles and awaits completion of all active cleanup futures before returning. -
Effect cleanup futures must complete before the next effect body starts. Disposal removes pending reruns before awaiting cleanup.
get_async re-resolve contract (#k03k)
get_async must treat the slot state as authoritative and re-resolve rather
than assert, because the slot can change between its lock acquisitions and a
notifier can close under it. It runs an outer loop that, each pass, re-reads the
slot via the get() fast path and then re-locks to attach to / spawn a
computation. Two concurrency windows are load-bearing:
- Resolved-since-
get(): the slot can transitionComputing → Resolvedbetween theget()fast-path check (which releases the lock) and the re-lock. ObservingResolvedat the re-lock is therefore expected and the cached value is read directly — it is not an unreachable state. - Notifier dropped: the per-computation
watchsenders can all drop without a finalResolvedsend when an in-flight compute is superseded by a newer revision (the staleComputing → Computingtransition early-returns) or the slot is invalidated. Arecv.changed()error means “the world changed”, not a fatal error: the awaiter restarts the outer loop and re-resolves from current slot state (returning the now-published value, attaching to the new in-flight compute, or respawning).
Neither window is a data inconsistency — the published value is always correct;
the contract is that get_async never panics on these benign races. Covered by
async_context_concurrent_set_and_get_async_never_panics_k03k, which fails
deterministically against the prior assert-based implementation.
Deterministic window coverage. Unlike ThreadSafeContext, the async resolve
loop cannot be modeled with Loom: AsyncContext runs on tokio’s async
executor and tokio::sync::watch, while Loom only shims synchronous loom::sync
primitives and has no async runtime. Each window is instead pinned by a targeted
deterministic test in tests/async_resolve_loop.rs (async + instrumentation
features):
- Window 1 is forced via a one-shot
instrumentation-gated seam (AsyncContext::__install_window1_hook) that resolves the slot inside the synchronous gap between the fast-pathget()and the re-lock — the gap has no.await, so cooperative scheduling alone cannot reach it. The test asserts the reader returned through theResolved-after-re-lock arm via__window1_resolved_hits. The seam compiles out of default/release builds. - Window 2 is forced by gating an in-flight compute and superseding it with a
newer revision so the notifier drops without a final send, asserting the waiter
re-resolves to the latest value rather than panicking (mirrors the broader
async_stress.rs::get_async_waiter_cancellation_and_stale_completion_keep_latest).
Exhaustive interleaving exploration of the async path (beyond these two known
windows) would require a Shuttle model, which in turn requires making
async_context.rs generic over its concurrency primitives — a larger
architectural change tracked separately, not a Loom drop-in.
Async race stress coverage must exercise get_async waiter cancellation, stale
in-flight completion after dependency invalidation, dynamic dependency
replacement across awaited slot reads, and async effect cleanup-before-rerun
ordering. The harness lives in tests/async_stress.rs under the async feature
so make test-async and make check run it with the rest of AsyncContext
coverage.
AsyncContext dependency tracking
Async compute and effect callbacks do not use thread-local tracking stacks.
Instead, each callback receives an AsyncComputeContext:
#![allow(unused)]
fn main() {
impl<'a> AsyncComputeContext<'a> {
pub async fn get_async<T>(&self, handle: &AsyncSlotHandle<T>) -> T;
pub fn get_cell<T>(&self, handle: &AsyncCellHandle<T>) -> T;
}
}
get_asyncon the compute context records the accessed slot as a dependency before awaiting its value.get_cellon the compute context records the accessed cell as a dependency synchronously.- Async reads register the graph edge immediately, so source invalidation while the future is suspended can cancel or supersede the in-flight computation before it publishes stale data.
- Dependencies are collected into a
HashSet<SlotId>attached to the async node. On rerun, stale dependencies are removed and new dependencies are registered. - This design survives executor thread migration and suspension/resume across
.awaitpoints because the dependency set is carried by theAsyncComputeContext, not a thread-local.
AsyncContext async effects
#![allow(unused)]
fn main() {
fn effect_async<F, Fut, C, CleanupFut>(&self, effect: F) -> AsyncEffectHandle
where
F: Fn(AsyncComputeContext) -> Fut + Send + Sync + 'static,
Fut: Future<Output = Option<C>> + Send + 'static,
C: FnOnce() -> CleanupFut + Send + 'static,
CleanupFut: Future<Output = ()> + Send + 'static;
}
- Serialized reruns: async effect reruns are serialized per effect. A rerun does not start until the previous cleanup future completes.
- Cleanup ordering: the cleanup future from the previous run completes before the next effect body starts. Disposal awaits the current cleanup before removing the effect node.
- Auto-tracking: the effect body receives an
AsyncComputeContextand tracks dependencies throughget_asyncandget_cellcalls. - Dependency invalidation schedules an async rerun after the current invalidation pass. The rerun is spawned on the runtime executor, not inline.
- Effect disposal: removes pending scheduled reruns, awaits the current cleanup future, and unsubscribes dependency edges.
AsyncContext batch support
ctx.batch()is a synchronous boundary. Cell updates queue invalidation roots.- At batch exit, queued roots trigger invalidation propagation. Async slots and effects are scheduled for rerun but do not execute inside the batch callback.
- Async reruns execute after the batch returns, on the runtime executor.
- Batching semantics remain synchronous at the graph mutation boundary: invalidations schedule async reruns only after the outermost batch exits.
AsyncContext feature flag
[features]
async = ["dep:tokio"]
The async feature depends on Tokio for spawn, JoinHandle, and runtime
primitives. It is separate from the tokio feature (which covers synchronous
ThreadSafeContext sharing inside Tokio tasks). The async feature implies
tokio.
Integration tests live in tests/async_integration.rs and are gated behind
#![cfg(feature = "async")]. Run with make test-async (or
cargo test --locked --features async). The make check target includes
test-async alongside test, test-tokio, and test-loom.
AsyncContext implementation notes
- Async graph locks must never be held while polling user futures or cleanup futures. Acquire the lock only to read/write graph state, then release before polling.
- Nested async slot reads register dependencies on the awaiting parent before awaiting the child result.
- The sync
tokiofeature is not enough to enable this API; true async support uses the separateasyncfeature flag. - The
Sendasync context requiresSend + Sync + 'staticvalues, callbacks, futures, and cleanup futures. A futureLocalAsyncContextmay support!Sendfutures ontokio::task::LocalSet, but handles must not be interchangeable with theSendasync context. - In-flight future deduplication: each async slot has one published cache and
at most one in-flight computation for the current slot revision. Concurrent
get_asynccallers await the same in-flight result instead of spawning duplicate futures. - Synchronous cached-read fast-path:
get()returns the cached value synchronously when the slot isResolved, avoiding async overhead.get_async()callsget()first; only unresolved or dirty slots enter the async spawn path.
Invalidation Semantics
ctx.set_cell()→ if value changed (PartialEq) → mark all dependent slots dirtyslot.clear(&ctx)→ remove cached value → cascade clear to all dependentscell.clear_dependents(&ctx)→ clear all dependent slots without changing cell valuectx.batch()→ queue changed cells and explicit slot/cell clears, then flush queued roots when the outermost batch exits- Slot invalidation → preserve cached value as dirty → validate/recompute on next
ctx.get()access - Thread-safe slot invalidation walks an explicit coalesced frontier, so diamond paths mark each reachable slot at most once per invalidation pass unless a later direct changed-value path upgrades that slot to forced recompute
- If a dirty
ctx.memo()slot recomputes to an equal value, downstream dirty slots become fresh without recomputing - Slot clearing → remove cached value → hard-clear dependents recursively
- Effects rerun after the invalidation pass if any tracked dependency invalidated
- Effects scheduled only by dirty slot dependencies skip rerun if those slots validate unchanged
- Effect cleanup runs before rerun and on disposal
Design Goals
- Lazy evaluation: Values computed only when first accessed or when dirty caches are validated
- Ergonomic derived values:
ctx.computed()is the preferred spelling for ordinary derived slots - Fine-grained reactivity: Only affected dependents recompute
- Memoized invalidation: Equal intermediate
ctx.memo()recomputation suppresses downstream recomputation/effect reruns - Effects: Side effects are scheduled from the same dependency graph as slots
- Batching: Multiple writes can share one invalidation/effect flush boundary
- Zero mandatory runtime dependencies: The default library surface uses only the Rust standard library and
smallvec; Tokio is optional and Criterion is dev-only for benchmarks - Single-threaded fast path:
Contextuses a singleRefCell<ContextInner>with no mutex overhead and nounsafecode - Contiguous local storage: Both
ContextandThreadSafeContextindex nodes directly bySlotIdin aVec<Option<Node>>to avoid hash-map lookup and churn; slot IDs are reused via a free list to prevent unbounded growth from transient effects - Explicit thread-safe path:
ThreadSafeContextuses a context-level lock andSend + Syncbounds for shared reactive graphs - Performance tracking: Criterion benchmarks cover both
ContextandThreadSafeContextfor cached reads, cold first access, dependency fan-out, memo equality suppression, effect flushing, and batch storms;ThreadSafeContextalso tracks a 1/2/4/8/16-worker contention matrix that separates hot shared-slot writes, independent per-worker slots, read-mostly waiters, and batched write bursts. - Benchmark instrumentation: The optional
instrumentationfeature exposes lightweight counters for recompute starts, duplicate speculative thread-safe computes, dependency edge churn, effect queue depth, reactive node allocations, aggregateThreadSafeContextlock wait/hold timing, and per-operation thread-safe lock attribution.
Performance Benchmarks
The benchmark suite is a development-only surface under benches/context.rs.
It must compile with cargo bench --no-run and should remain focused on public
API behavior rather than private graph internals.
Required benchmark scenarios:
- Cached reads after the slot has already been computed
- Cold first
getincluding graph construction and initial dependency capture - Dependency fan-out invalidation followed by dependent reads
- Memo equality suppression where an equal intermediate value prevents downstream recomputation
- Effect flushing after dependency mutation
- Batch storms that coalesce many writes into one invalidation/effect flush boundary
ThreadSafeContextset_cellinvalidation isolation, split into:- high fan-out changed-cell invalidation without dependent reads
- same-root/same-slot write contention without dependent reads
- independent per-worker roots and slots without dependent reads
- batched write bursts over per-worker cell groups without dependent reads
ThreadSafeContextcontention at 1, 2, 4, 8, and 16 workers, split into:- same-root/same-slot write plus read contention
- independent per-worker roots and computed slots
- read-mostly waiters with one writer and many readers
- batched write bursts over per-worker cell groups
ThreadSafeContexteffect-heavy contention at 8 and 16 workers, split into:- effect queue coalescing across batched worker writes
- effect cleanup execution during concurrent cell updates
- nested batch flushes that schedule computed-effect dependencies
ThreadSafeContextsynchronization model checking with the optionalloomfeature
The optional instrumentation feature adds instrumentation_snapshot() and
reset_instrumentation() to both context types and exports
InstrumentationSnapshot. The snapshot records:
- Reactive node allocation events as a stable allocation proxy
- Slot recompute callback starts
- Duplicate speculative
ThreadSafeContextrecomputes that lose publication races; this should remain zero when in-flight deduplication is effective - Dependency edges added and removed
- Effect queue pushes and maximum pending queue depth
ThreadSafeContextlock/coordination acquisitions plus total wait and hold nanoseconds
ThreadSafeContext::lock_profile_snapshot() returns per-operation lock and
coordination counters for the thread-safe path. The buckets are intentionally
high-level: unattributed/other work, get refresh, dependency edge add/remove,
set_cell invalidation, recompute publication, and in-flight recompute waiting.
For the per-slot sidecar recompute Condvars, the in-flight wait bucket records
the parked wait and reacquire boundary. The bucket acquisition counts must sum
to the aggregate lock_acquisitions counter so profile consumers can attribute
contention without losing the stable summary fields.
The instrumentation profile bench lives in benches/profile.rs and is gated
behind required-features = ["instrumentation"]; compile it with
cargo bench --features instrumentation --no-run.
The benchmark report harness lives at
scripts/update-benchmark-results.py. It runs
cargo bench --features instrumentation, reads Criterion estimate files from
target/criterion, captures examples/instrumentation_profile.rs counter
snapshots in target/lazily-instrumentation-profile.csv, and rewrites the
generated README section between
<!-- benchmark-results:start --> and <!-- benchmark-results:end -->.
The generated section must include the current Cargo package version, the
refresh command, the Criterion baseline comparison workflow, one timing row for
each required benchmark scenario above, p50/p95 Criterion sample latency rows
for the required 8/16-worker same-slot, independent-slot, read-mostly,
batched-write, and effect-heavy cases, and instrumentation rows covering
recomputes, duplicate speculative recomputes, dependency edge churn, effect
queue depth, node allocations, lock wait/hold time, and per-operation
ThreadSafeContext lock attribution for every 1/2/4/8/16-worker contention
matrix profile and every set_cell invalidation isolation profile. The generated
section must also publish regression budgets for the slow contention profiles and
their lock-site acquisition totals. --check verifies that the README section is
already current without rewriting it and fails when required p50/p95 latency
rows are missing or any instrumentation profile exceeds its lock-acquisition
budget; --no-run reuses existing Criterion estimate files for a report-only
refresh after a manual baseline comparison run while refreshing the
instrumentation CSV unless it is also running in check mode.
Serialization (lazily-serde feature gate)
The serde feature is declared in Cargo.toml but currently has no
implementation. Its purpose is to produce a serializable snapshot of context
state so the planned lazily-ipc (snapshot + incremental update protocol) and
lazily-distributed (CRDT/Raft remote graphs) layers have a stable on-the-wire
representation to build on. This section fixes the design of that feature gate
before any code lands.
The type-erasure problem
Context and ThreadSafeContext store cached values as type-erased trait
objects (Rc<dyn Any> and Arc<dyn Any + Send + Sync>; see Typed cache
fast-path). serde::Serialize is not object-safe — it has a generic
serialize<S: Serializer> method — so a dyn Any cannot be serialized
directly, and the concrete type T is gone by the time a snapshot walks the
node Vec. Any serialization design must recover the ability to call a
monomorphized Serialize/Deserialize for each node’s erased T. Two
approaches were considered.
Approach A — trait bounds (erased-serde style)
Replace dyn Any cache storage with a serialize-aware trait object — a sealed
dyn ReactiveValue: Any whose vtable also carries an erased_serde-style
erased_serialize(&self, &mut dyn Serializer). Every signal-creation API
(slot, computed, memo, cell) gains a T: Serialize + DeserializeOwned
bound under #[cfg(feature = "serde")].
- Pros: serialization is total — every cached node is serializable with no per-node opt-in; one storage type, one source of truth.
- Cons:
- The
T: Serializebound is viral: it propagates through the whole public API and leaks ontoContextitself even for signals that are never serialized, forcing callers to satisfy it for purely local reactive state. - Requires either the
erased-serdecrate or a hand-rolled vtable equivalent — acceptable under a feature gate but still added surface. - Breaks the typed cache fast-path. That optimization recovers
&Tvia an unchecked pointer cast keyed on an inlineTypeId; swapping the trait object out from under it for a serialize-aware type would have to preserve the same unchecked-cast guarantee. Deserializeis the hard half: reconstruction needs the concrete type at the call site, so a type-tag → constructor registry is required anyway — Approach A does not avoid the registry, it only adds the viral bound on top of it.
- The
Approach B — type-erased closures captured at creation (recommended)
Keep dyn Any storage untouched. At signal creation under the serde feature,
capture a monomorphized serde vtable — a small &'static struct of
function pointers — alongside the TypeId the typed cache fast-path already
records:
#![allow(unused)]
fn main() {
#[cfg(feature = "serde")]
pub struct SerdeVTable {
pub type_tag: &'static str, // stable cross-process key
pub serialize: fn(&dyn Any, &mut dyn erased_serde::Serializer)
-> Result<(), erased_serde::Error>,
pub deserialize: fn(&mut dyn erased_serde::Deserializer)
-> Result<Rc<dyn Any>, erased_serde::Error>,
}
}
Each SlotNode/CellNode (and the ThreadSafe*FastPath mirrors) gains a
#[cfg(feature = "serde")] serde_vtable: Option<&'static SerdeVTable> field,
set once at slot() / computed() / memo() / cell() creation. The thunks
are monomorphized per T at the construction call site, so the concrete type
is captured exactly where it is still known; invoking serialize downcasts the
&dyn Any back to &T and calls the real T: Serialize.
- Pros:
- Composes with the typed cache fast-path — both capture per-
Tmetadata at creation; the read hot path is untouched and pays zero overhead (thunks run only at snapshot time). - Opt-in per node. A signal whose
T: !SerializestoresNoneand is emitted as anOpaqueplaceholder, matching howlazily-ipcwill snapshot only an explicitly shared subgraph rather than the whole context. No viral bound onContext. - The
deserializethunk doubles as the type-tag → constructor registry Approach A needs anyway; populating it at slot construction keeps tags and constructors in one place.
- Composes with the typed cache fast-path — both capture per-
- Cons:
- Per-node storage grows by one pointer (
Option<&'static SerdeVTable>, 8 bytes) — eliminated entirely when theserdefeature is off via#[cfg(feature = "serde")]on the field. - Without specialization on stable Rust, “serialize if
T: Serialize, elseNone” needs a sealedMaybeSerialize<T>autoref/marker helper or explicit*_serdeconstructor variants rather than a blanket impl.
- Per-node storage grows by one pointer (
Decision
Adopt Approach B. The lazily-ipc/lazily-distributed roadmap serializes
an explicit allowlisted RemoteOp set (#39c5), not arbitrary nodes, so the
totality Approach A buys is unneeded — and it costs a viral bound plus a
rework of the typed cache fast-path to buy it. Approach B keeps the default
build byte-for-byte identical (field compiled out), preserves the zero-overhead
read path, and reuses the TypeId-at-creation pattern already proven by the
typed cache.
Feature-gate shape
serde = ["dep:serde"](declared) gainsdep:erased-serdeunder the same gate; both stay optional, preserving the zero mandatory runtime dependency goal.- A sealed
MaybeSerialize<T>helper resolves the vtable toSomewhenT: Serialize + DeserializeOwnedandNoneotherwise, so existing constructor signatures are unchanged and non-serializable signals keep compiling. Context::snapshot()/ThreadSafeContext::snapshot()walk live nodes, invoke each present vtable, and emitContextSnapshot { nodes: Vec<NodeSnapshot { slot_id, type_tag, payload | Opaque }> }. Restoration readstype_tag, looks up thedeserializethunk, and rebuilds typed cache entries.- This snapshot type is the input to #ipc2 (snapshot + incremental update protocol) and the value substrate for #ipc3 (CRDT vs Raft).
IPC snapshot + incremental update protocol (lazily-ipc)
lazily-ipc transmits a reactive graph’s state to a remote observer and keeps
it in sync as the graph mutates. It builds directly on the ContextSnapshot
from the lazily-serde design and reuses the existing batch-flush and
cache-revision machinery as its consistency boundary rather than inventing a
new one.
Two message kinds
Snapshot— full graph state. Sent on connect and on resync.Delta— incremental change set. Sent once per outermost batch-flush invalidation pass — the single atomic graph-mutation boundary the thread-safe path already guarantees (see Threading and Concurrency Contract: “Batch exit … a single atomic graph mutation boundary and one coalesced effect flush per outermost invalidation pass”). Coalescing is therefore free: one delta per flush, never one perset_cell.
Epoch / versioning
A context-level monotonic ipc_epoch: u64 advances once per outermost batch
flush, not per write. It is independent of the per-slot cache_revision
atomics (which remain the read-path dirty epoch) — ipc_epoch is the wire
sequence number.
Snapshotcarriesepoch.- Each
Deltacarries{ base_epoch, epoch }withepoch == base_epoch + 1. Deltas are strictly sequential, so a receiver detects any gap, reorder, or sender restart by checkingbase_epoch == last_epoch.
Payloads
Snapshot { epoch: u64, nodes: Vec<NodeSnapshot>, // NodeSnapshot from lazily-serde
edges: Vec<(SlotId /*dependent*/, SlotId /*dependency*/)>,
roots: Vec<SlotId> } // cells + source slots
Delta { base_epoch: u64, epoch: u64, ops: Vec<DeltaOp> }
DeltaOp =
| CellSet { slot_id, payload } // changed-value cell write (PartialEq-guarded)
| SlotValue { slot_id, payload } // a recompute published a new value
| Invalidate { slot_id } // dirtied, not yet recomputed (lazy)
| NodeAdd { slot_id, type_tag, payload | Opaque }
| NodeRemove { slot_id } // freed slot id (free-list reuse → Remove then Add)
| EdgeAdd { dependent, dependency }
| EdgeRemove { dependent, dependency }
Consistency invariants (inherited, not re-derived)
- PartialEq cell guard: an equal
set_cellinvalidates nothing, so it emits noCellSetand no downstream ops — the wire is silent exactly when the graph is (SPEC Invalidation Semantics). - memo equality suppression: a dirty
memo()that recomputes to an equal value emits noSlotValueand no downstreamInvalidate, mirroring the local “downstream dirty slots become fresh without recomputing” rule. - Coalesced frontier: a dependent reached through many changed cells in one batch appears at most once per delta — the same once-per-pass guarantee the thread-safe frontier already enforces.
Lazy reconciliation
Because lazily-rs is lazy, a flush can invalidate a slot without producing a new value. Two receiver modes:
- Value-mirror (default for IPC): at flush the sender resolves each
invalidated allowlisted slot via
ctx.get()so the delta carries concreteSlotValues. The receiver stays a pure data mirror holding no compute closures. Trades local laziness for a value-complete wire image. - Mirror-lazy: the sender emits bare
Invalidateand the receiver keeps a stale marker, recomputing only on its own read. This requires the compute closures to be replicated too and is therefore deferred tolazily-distributed(#ipc3), notlazily-ipc.
Resync / gap handling
The receiver tracks last_epoch. On a Delta whose base_epoch != last_epoch
(gap, reorder, or sender restart) it discards the delta and requests a
Snapshot; the sender replies with a fresh Snapshot { epoch } and resumes
deltas from there. Messages are length-prefixed and tagged Snapshot / Delta
via serde/erased-serde; the protocol is transport-agnostic (unix socket,
pipe, WebSocket — the last feeds the #yxjw signaling server).
Permission boundary (forward link to #39c5)
Only nodes on the per-peer allowlist (#39c5 RemoteOp) are serialized into a
snapshot or delta; non-allowlisted nodes are omitted entirely — not even as
Opaque — so a peer cannot infer their existence. The allowlist is applied at
snapshot/delta construction, before serialization, so the filter is the same
on the full and incremental paths.
Implemented (#39c5)
The permission policy layer ships behind the distributed feature in
src/distributed.rs:
NodeId/PeerId— wire-stable identifiers (decoupled from the internalSlotId),serde-derived under theserdefeature.OpKind(Read/Write/TriggerEffect) andRemoteOp { kind, node }— the gated, serializable unit a peer requests; the three kinds are gated independently (a read grant never implies write or effect-trigger).PeerPermissions— default-deny per-peer allowlist withallow,allow_many,revoke(prunes empty peer entries),revoke_peer,is_allowed, and a fail-closedcheck→Result<(), PermissionDenied>.filter_readable(peer, nodes)enforces the omission invariant above: non-readable nodes are dropped from the result entirely, preserving input order, so it can be applied at snapshot/delta construction before serialization.
PeerPermissions is local server-side state and is intentionally not
serializable; only the wire-facing RemoteOp family is. Higher layers
(lazily-ipc snapshot/delta construction, the lazily-distributed CRDT cell
plane, and the single-writer effect authority) gate every remote request
through check and build observable subgraphs through filter_readable.
Feature gate
A new ipc = ["serde"] feature adds the pure-protocol Snapshot/Delta types
plus a transport-agnostic IpcSink / IpcSource trait pair. No transport
dependency enters the core crate. The Delta/ipc_epoch model is a
single-writer linear log; whether multi-writer needs CRDT merge or Raft
consensus on top of that log is exactly the #ipc3 question.
Implemented surface:
Snapshot { epoch, nodes, edges, roots },NodeSnapshot,NodeState, andEdgeSnapshotdefine the full graph image.Delta { base_epoch, epoch, ops }andDeltaOpdefine the one-flush incremental image.Delta::next(base_epoch, ops)enforcesepoch == base_epoch + 1;Delta::apply_status(last_epoch)returnsApplyorResyncRequired.Snapshot::filter_readableandDelta::filter_readableapplyPeerPermissionsbefore serialization. Non-readable nodes and operations are omitted entirely; edges are retained only when both endpoints are readable.IpcMessage,IpcSink, andIpcSourcekeep Unix sockets, pipes, WebSockets, and shared-memory ring buffers outside the core crate.ShmBlobArena,ShmBlobRef, andIpcValue::SharedBlobprovide the shared memory payload path. The arena writes a fixed header before each payload with generation, epoch, length, and checksum metadata; readers validate that header before accepting a descriptor.IpcMessagecontrol frames can carry aShmBlobRefinstead of embedding large bytes inline.
Formal companion: lazily-spec/formal/lean models the shared IPC
Snapshot/Delta state machine in Lean 4 and proves the epoch sequencing,
fail-closed resync, PartialEq/memo suppression, batch coalescing, and eager
Signal slot_value invariants. It is intentionally a spec-layer oracle; Rust
implementation behavior remains covered by the crate tests and conformance
fixtures.
Shared-memory IPC is therefore a supported transport direction, not a separate
reactive-graph mode: the shared memory segment carries large blob payloads, and
the ordinary control transport carries framed IpcMessages with blob
descriptors. Each process keeps its own local Context / ThreadSafeContext
and reconciles via snapshots and deltas. A live Context is not shared across
process address spaces.
Cross-language channel compatibility (FFI / IPC / WebSocket / WebRTC data)
Yes: lazily-rs has a viable FFI strategy, but the FFI layer should be an
adapter around the same transport-agnostic state plane used by IPC and
distributed peers. It should not expose the closure-based Rust Context,
ThreadSafeContext, SlotHandle<T>, CellHandle<T>, or &T cached values
directly across an ABI boundary.
Compatibility model
The cross-language lazily family has one canonical message plane:
IpcMessage::SnapshotandIpcMessage::Deltaare the graph-state payloads.NodeId,PeerId,RemoteOp,Snapshot,Delta, andDeltaOpare the wire-facing contract; internalSlotIdvalues and typed handles remain local implementation details.IpcPayloadis opaque serialized value bytes. The producing language owns type-aware encoding through stabletype_tags; the channel only moves bytes.ShmBlobRefis a descriptor carried by a control frame. Shared memory stores large payload bytes, but reconciliation still happens through ordinaryIpcMessages.
Every supported channel carries that same message plane:
| Channel | Compatibility strategy |
|---|---|
| FFI | C ABI exposes opaque context/session handles plus owned byte buffers for IpcMessage encode/decode, snapshot export, delta apply, and memory release. No Rust references, trait objects, closures, or typed handles cross the boundary. |
| IPC | Unix sockets, pipes, local TCP, or process channels carry length-prefixed serialized IpcMessages. Shared-memory IPC is an optimization for large IpcValue::SharedBlob payload bytes, not a separate graph-sharing mode. |
| WebSocket | One WebSocket frame carries one serialized IpcMessage or a negotiated fragment. The #yxjw signaling server may relay the frame as opaque payload and must not parse CRDT/IPC state. |
| WebRTC data | Reliable ordered data channels carry the same serialized IpcMessages after #yxjw peer discovery. Unordered or unreliable channels are only acceptable for optional lossy telemetry; Deltas need ordered reliable delivery or receiver-side gap detection and snapshot resync. |
FFI boundary shape
The Rust FFI surface is deliberately narrow:
#![allow(unused)]
fn main() {
#[repr(C)]
pub struct LazilyFfiBytes {
pub ptr: *mut u8,
pub len: usize,
}
#[repr(C)]
pub enum LazilyFfiStatus {
Ok = 0,
Empty = 1,
NullPointer = 2,
InvalidMessage = 3,
EncodeFailed = 4,
Panic = 5,
}
#[repr(C)]
pub enum LazilyFfiMessageKind {
Unknown = 0,
Snapshot = 1,
Delta = 2,
}
}
The ffi feature exports extern "C" functions for creating/freeing an opaque
LazilyFfiChannel, validating/classifying JSON-encoded IpcMessage frames,
enqueueing accepted frames, receiving Rust-owned LazilyFfiBytes frames, and
freeing buffers allocated by Rust. All allocation ownership is explicit: the
caller owns input bytes, Rust owns output buffers until the paired free function
is called. Errors return LazilyFfiStatus; panics must be caught before crossing the C ABI.
The implemented channel is a local ABI adapter. It decodes each accepted frame
as IpcMessage, then re-encodes canonical JSON bytes before enqueueing or
returning a cloned frame. That keeps FFI byte transport aligned with IPC,
WebSocket, and WebRTC data transport while leaving snapshot export, delta
application to a live graph, and richer typed convenience APIs as higher-level
work on top of the same message plane.
The FFI layer may expose convenience helpers for local cells, but those helpers
still encode/decode through the same type_tag + payload registry used by
lazily-serde. A foreign runtime can therefore choose either direct local FFI
calls or framed IPC/WebSocket/WebRTC transport without changing the graph-state
protocol.
Shared library build
The crate produces both an rlib (for Rust consumers) and a cdylib (for
FFI consumers) via crate-type = ["lib", "cdylib"] in Cargo.toml.
make build-ffibuilds the shared library with theffifeature enabled.make ffi-headersgenerates a C header file (target/lazily.h) viacbindgenusingcbindgen.toml.- Future options include
safer-ffi(safer FFI wrappers with auto-generated headers) anddiplomat(multi-language binding generation). The currentcbindgenapproach is sufficient for C ABI consumers.
Binary serialization (ipc-binary feature)
The ipc-binary feature adds optional binary serialization via postcard
alongside the default JSON codec. Binary frames are significantly smaller and
faster for performance-critical paths between same-language or
binary-aware peers.
IpcMessage::encode_binary()/IpcMessage::decode_binary()— postcard encode/decode on theIpcMessagetype.IpcMessage::encode_json()/IpcMessage::decode_json()— JSON encode/decode (gated behind theffifeature which pulls inserde_json).- FFI binary functions:
lazily_ffi_channel_send_binary,lazily_ffi_channel_recv_binary,lazily_ffi_ipc_message_validate_binary,lazily_ffi_ipc_message_kind_binary,lazily_ffi_ipc_message_clone_binary— mirror the JSON FFI functions but use the postcard codec. EncodeError/DecodeError— codec-agnostic error types withJsonandBinaryvariants gated by their respective features.
The binary codec is not self-describing; peers must agree on the schema. For cross-language use, JSON remains the default; binary is for same-Rust or postcard-aware transports.
WebRTC data channel transport (webrtc-data feature)
WebRTC data channels carry IpcMessage frames peer-to-peer after the
signaling-client (#yxjw) completes SDP/ICE negotiation. This is the
internet-scale transport layer — no server relay needed for graph state
after the initial signaling handshake.
Feature gate
webrtc-data = ["ipc", "dep:str0m", "dep:tokio"]
Uses str0m for the WebRTC stack (pure Rust, no C dependencies) and
tokio for async runtime integration. Separate from signaling-client
so consumers can use signaling without incurring the full WebRTC
dependency.
Transport interface
#![allow(unused)]
fn main() {
#[cfg(feature = "webrtc-data")]
pub struct WebRtcDataChannel {
// str0m session wrapping a single data channel
}
#[cfg(feature = "webrtc-data")]
impl IpcSink for WebRtcDataChannel {
type Error = WebRtcDataError;
fn send(&mut self, message: &IpcMessage) -> Result<(), Self::Error>;
}
#[cfg(feature = "webrtc-data")]
impl IpcSource for WebRtcDataChannel {
type Error = WebRtcDataError;
fn recv(&mut self) -> Result<Option<IpcMessage>, Self::Error>;
}
}
Channel contract
- Ordered + reliable: data channels must be created with
ordered: true, maxRetransmits: NonesoDeltadelivery matches the single-writer epoch contract. Unordered/unreliable channels are only acceptable for optional lossy telemetry, never for graph state. - Framing: each
IpcMessageis length-prefixed (4-byte LE length + payload). Binary or JSON codec negotiated during capability handshake. - Back-pressure:
sendblocks or yields when the SCTP congestion window is full; the caller must not flood faster than the channel drains. - Reconnect: on channel close or SCTP failure, the transport signals
Errso the caller can re-signaling and re-establish a fresh channel. TheDeltaresync mechanism handles any gap.
Lifecycle
SignalingClientexchanges SDP offer/answer with peer via #yxjw- ICE candidates trickle through the signaling channel
- On ICE completion,
WebRtcDataChannel::from_sdp(local_sdp, remote_sdp)creates the str0m session and opens the data channel - Capability handshake on the data channel (protocol id, codec, features)
IpcMessageframes flow bidirectionally- On disconnect, re-signaling via
SignalingClientand repeat
Integration test surface
tests/webrtc_data.rs— gated behind#[cfg(feature = "webrtc-data")]- Loopback test: create two str0m sessions back-to-back, send
IpcMessage::SnapshotandIpcMessage::Delta, verify round-trip - Codec negotiation: JSON and binary on the same channel
- Ordered delivery: send 100 deltas, verify epoch order on recv
make test-webrtc-datatarget in Makefile
str0m backends: loopback vs networked (webrtc-str0m feature)
The webrtc-str0m feature ships two concrete DataChannel backends over the
same sans-IO str0m pump loop, differing only in transport and clock:
Str0mLoopback(src/str0m_backend.rs) — twoRtcinstances in one thread, connected by an in-memory packet route advanced on a synthetic clock. No sockets, no threads, no wall-clock dependency, so the full ICE/DTLS/SCTP handshake is deterministically testable in-process. This is the unit/CI substrate for theWebRtcSink/WebRtcSourcebridge.Str0mNet(src/str0m_net.rs) — oneRtcdriven over a real UDP socket by a background driver thread, with the SDP offer/answer and trickled ICE candidates exchanged by the caller (typically overSignalingClient, #yxjw). This is the real “beyond signaling” peer-to-peer path that can reach a peer on another host.
Str0mNet lifecycle:
Str0mNet::offer(bind)→ binds the UDP socket, opens thelazily-ipcchannel, returns the SDP offer string;Str0mNet::answer(bind, offer)returns the SDP answer string. The offerer applies the peer’s answer withaccept_answer.- Each peer exposes its host candidate via
local_candidate(); the caller trickles it to the remote, which feeds it toadd_remote_candidate(). - The driver thread pumps
poll_output→ UDPsend_to, and UDPrecv_from→handle_input, advancing real timers, until the SCTP data channel opens (wait_open). Inbound frames queue fortry_recv_frame; outbound frames requested before open are buffered and flushed on open. - On
ChannelClose, socket failure, or a deadRtc, the channel reports closed so the sync sink/source surfaceErrand the caller re-signals.
Because Str0mNet needs live two-peer connectivity it cannot use the synthetic
clock. tests/str0m_net.rs exercises a real two-socket round trip over
127.0.0.1 (real UDP/DTLS/SCTP/timers); a cross-host round trip through the live
signaling Worker is operator-gated.
Str0mNet outbound backpressure contract (#lzstr0mframe)
The Str0mNet driver’s outbound frame queue is bounded at
MAX_PENDING_FRAMES (1024). When a caller’s send_frame rate exceeds the SCTP
drain rate, the driver applies backpressure on two layers:
Channel::writebackpressure (Ok(false)) — str0m returnsOk(false)when the SCTP send buffer is full. The driver’s flush loop re-queues the frame and yields (rather than popping it), so the nextpoll_output/recv_fromcycle can drain the SCTP window and accept the frame on the following iteration. Pre-#lzstr0mframe this branch was a bareif ch.write(...).is_err() { break; }that ignored thebool, silently dropping every frame that hit theOk(false)path — violating the ordered/reliable DataChannel invariantWebRtcSink/WebRtcSourcerely on.- Queue cap (
Str0mNetError::Backpressure) — once the driver’sout_pendingVecDeque reachesMAX_PENDING_FRAMES,send_frameitself returnsErr(Str0mNetError::Backpressure)so the caller applies flow control (sleep / await / shed load) instead of growing memory without bound. The counter is decremented whenChannel::writeaccepts the frame (Ok(true)) and reset to zero on driver exit (the queue will never drain after close).
The regression test burst_of_frames_arrives_in_order_under_backpressure
exercises this contract by bursting 100 × 8 KiB frames through a single
Str0mNetChannel and asserting all 100 arrive in order at the remote peer,
retrying on Backpressure as needed.
Str0mNet driver I/O error handling (#lzstr0mpolldrive)
The driver’s UDP I/O surfaces failures instead of silently dropping packets:
socket.send_to— pre-fix this waslet _ = socket.send_to(...), which discarded every error:ENOBUFS(send-buffer pressure),ECONNREFUSED(ICMP port-unreachable, peer down),ENETUNREACH/EHOSTUNREACH(route flap),EBADF(socket closed). The corresponding ICE/DTLS/SCTP packet was silently lost and the handshake stalled without diagnostics. Post-fix,WouldBlock/Interruptedretryable errorscontinuethe drain loop (str0m re-emits theTransmiton a laterpoll_output); any other error breaks the driver ('outer), surfacingClosedso the caller re-signals.- Read-timeout cap as command-poll interval —
recv_fromwaits at mostCOMMAND_POLL_INTERVAL(15 ms) so control commands (Send/AcceptAnswer/AddRemoteCandidate/Shutdown) read fromcmd_rxat the top of each outer iteration stay bounded-latency. This is not a str0m timing parameter: str0m is fed an accurate time advance viaInput::Timeout(now)whenever the socket times out without data, and an “early”Input::Timeout(every 15 ms during idle) is harmless — str0m just re-emits its pending deadline if it isn’t time yet.
Signaling glue: Str0mNet over SignalingClient (#lzwebrtcwire)
Str0mNet exchanges its SDP offer/answer and trickled ICE candidates out of
band; SignalingClient (#yxjw, below) is the out-of-band channel. The
webrtc_signaling module (enabled when both signaling-client and
webrtc-str0m are on) is the wire between them — two async driver functions
that own the full handshake:
offer_to_peer(client, peer, bind, timeout)— binds the socket viaStr0mNet::offer, sends the SDPofferand the locallocal_candidate()topeerover the signaling client, then pumps incomingServerMessages (answer→accept_answer,ice→add_remote_candidate) until the data channel opens, returning the connectedStr0mNet.answer_next_offer(client, bind, timeout)— waits for the nextofferframe, produces the SDP answer viaStr0mNet::answer, returns the answer + local candidate over signaling, applies any ICE candidate that raced ahead of the offer, then pumps until open. Returns the offeringPeerIdand the connectedStr0mNet.
Both pump loops re-check Str0mNet::is_open() on a short poll tick as well as on
each signaling frame, because the channel opens on the backend’s driver thread,
off the signaling path. The caller is responsible for learning the target peer is
present (from the welcome roster or a peer-joined frame) before offering; an
offer to an absent peer is dropped by the relay and surfaces only as a timeout.
tests/webrtc_signaling.rs drives this end to end over a loopback signaling
relay: an in-process tokio-tungstenite server implementing the #yxjw roster +
from-stamped routing on 127.0.0.1, two real SignalingClient WebSocket
connections, and the real Str0mNet UDP/DTLS/SCTP transport — proving a
permission-filtered Snapshot crosses a data channel negotiated entirely through
SignalingClient. The only remaining slice is the live two-host / NAT run through
the deployed #yxjw Worker, which is operator-gated (#h6qb).
Capability negotiation
Each non-local session starts with a small compatibility handshake before graph state flows:
- protocol id:
lazily-ipc - protocol major version
- codec (
jsontoday; binary codecs can be transport crates as long as they encode the sameIpcMessageschema) - maximum frame size and fragmentation support
- ordered/reliable delivery guarantee
PeerIdand session/graph id- supported features such as
shared-blob,crdt-cell-plane, andsignaling-relay
If peers disagree on protocol major version, codec, ordering guarantees, or
required feature flags, they fail closed before applying any Snapshot or
Delta.
Cross-language family rules
- The shared semantics are lazy slots, mutable cells, dynamic dependency
tracking,
PartialEq/equality-guarded invalidation, memo equality suppression, batching, and permission-filtered snapshots/deltas. - Compute closures are language-local. Cross-language sync shares the cell state plane by default; derived slots converge remotely only when peers use a shared compiled graph or an explicit compute-descriptor system.
- JavaScript/TypeScript peers must keep
PeerIdvalues at or belowNumber.MAX_SAFE_INTEGER, matching the #s0fc signaling protocol. - Permission filtering happens before serialization on every channel. A
WebSocket relay, WebRTC data channel, or FFI caller must not receive nodes or
operations that
PeerPermissionswould omit. - Channel code must preserve back-pressure and resync behavior: if frame
delivery gaps, reorders, truncates, or exceeds negotiated size, the receiver
requests a fresh
Snapshotinstead of applying a partial delta.
This keeps FFI viable without making it a special semantic path. FFI, IPC,
WebSocket, and WebRTC data differ only in framing, ownership, and reliability;
the lazily family stays compatible because all channels carry the same
permission-filtered IpcMessage state plane.
Multi-writer coordination: CRDT vs Raft (lazily-distributed)
lazily-ipc (above) is a single-writer linear log: one authority mutates
the graph and streams Deltas stamped by a monotonic ipc_epoch.
lazily-distributed asks the harder question — when multiple peers may
write the same shared reactive graph, what coordination model orders those
writes: a CRDT (conflict-free replicated data types, eventual consistency) or
Raft (leader-ordered consensus, strong consistency)?
The reactive structure collapses most of the question
The key observation is that not all graph state is writable. lazily-rs has exactly two node kinds with respect to authorship:
- Cells / source slots — externally writable. These are the only state a peer can directly set.
- Derived slots (
computed/memo) — pure deterministic functions of their dependencies. They are never written; they recompute. Their values, and the dynamic dependency topology discovered during recompute, are a deterministic view of the cell state — provided every peer runs identical compute closures (the closure-replication prerequisite already flagged aslazily-ipc’s mirror-lazy mode).
So coordination is only needed on the small cell plane. The entire derived graph — typically the large majority of nodes, plus all edges and the effect schedule — converges automatically once the cells converge and recompute runs. This is the same property the local engine already relies on: derived state is a function, not a source of truth.
The two models on the cell plane
| Aspect | CRDT (cell-plane registers) | Raft (leader-ordered log) |
|---|---|---|
| Consistency | Eventual; peers converge after delivery | Strong; one total order, every peer identical |
| Availability | Local-first — peers read/write while partitioned | Minority partition cannot write; needs quorum |
| Write latency | Local (no round-trip) | Quorum round-trip to leader per write |
| Offline peers | Native (merge on reconnect) | Not supported (writes need quorum) |
| P2P / WAN fit | Direct (no leader); fits #yxjw signaling | Awkward — leader election over WAN, quorum cost |
| Conflict model | Per-cell merge (LWW / MV register) | None — serialized, last in order wins by fiat |
| Extends #ipc2 | Per-peer Deltas + causal stamps, merged | One global Raft-replicated Delta log |
| Cost | Every writable cell must be a CRDT | Election, log replication, quorum machinery |
Recommendation — CRDT cell plane (HLC-stamped registers), not Raft
Adopt a CRDT layer on the cell plane only, with derived slots recomputing deterministically on each peer. Concretely:
- Each replicated cell is a register CRDT keyed by a hybrid logical clock
(HLC) — wall-clock for human-meaningful ordering, logical counter for
causal tiebreak. Two flavors, chosen per cell via a trait:
- LWW-register (last-write-wins) — default; “current value” semantics that most reactive cells want. Silently drops the losing concurrent write.
- MV-register (multi-value) — surfaces concurrent writes as a set for the compute layer (or app) to resolve, when dropping a write is unacceptable.
- Additive cells can opt into a PN-counter instead of a register.
- The local PartialEq invalidation guard still applies — after merge: a
merge that yields an equal value invalidates nothing, exactly as a local equal
set_celldoes. memo equality suppression likewise holds post-merge, so convergent peers do the same downstream work. lazily-ipc’sDeltageneralizes from one monotonicipc_epochto per-peer causal stamps: each peer keeps its own sequence; cross-peer order comes from the HLC/dot metadata carried on eachCellSet. Delivery can be out-of-order; merge is commutative/associative/idempotent so gaps self-heal without the snapshot-resync that the single-writer log needed.
Raft is the wrong default because the lazily-distributed roadmap is
explicitly availability-first — P2P signaling (#yxjw) and offline peers —
and Raft trades exactly that away for a global total order that the reactive
model does not need: derived state is already deterministic, and the writable
surface is small.
The narrow exception — irreversible effects need an authority
CRDT convergence is correct for state. It is not sufficient for effects that perform irreversible external actions (send an email, charge a card, fire a webhook): convergence may run the same effect on every peer, or run it twice as merges arrive. Pure state can converge; an external side effect cannot be merged.
For that narrow class, gate the effect behind a single-writer effect
authority — a designated peer (or a small Raft group owning only the
effect-intent log, not the whole graph) decides when an irreversible effect
fires, at-most-once. This is a hybrid: CRDT for the state plane, a
single-writer/Raft authority for the irreversible-effect plane, with the
#39c5 RemoteOp allowlist already gating which remote writes and effects a peer
may trigger at all. The large reactive core stays leaderless and local-first;
only the small irreversible-effect tail pays for consensus.
Open prototype gates (deferred to implementation)
- HLC skew bounds and the LWW-vs-MV default per cell category.
- Whether closure replication (required for peers to recompute derived slots) is shipped as serialized compute descriptors or restricted to a shared compiled graph — this gates how much of the derived plane can live remotely at all.
- Delta-state vs op-based CRDT encoding on the wire, reusing
lazily-serde.
Internet-scale peer discovery: signaling server (#yxjw)
The CRDT cell plane (above) is leaderless and local-first, but peers still have
to find each other and open transport before any Delta can flow. On a LAN
that is mDNS or a known address; across the internet it needs a rendezvous
point. #yxjw is that rendezvous: a small Cloudflare Worker signaling
server that brokers peer discovery and relays the WebRTC SDP/ICE handshake so
peers can establish direct P2P data channels, falling back to server relay
of opaque payloads when a direct channel cannot be formed. It is strictly a
discovery + relay layer — it never parses or merges CRDT state, so it stays
trivially horizontally scalable and never becomes the consistency authority the
CRDT design deliberately avoids.
Why a Cloudflare Worker + Durable Objects
- One Durable Object per session id. A WebSocket upgrade to
GET /session/:idroutes to aSignalingRoomDO keyed byidFromName(sessionId). Cloudflare guarantees a single global instance per id, so each session gets a lock-free single-threaded coordination point for its roster with no external store. Scale is achieved by sharding sessions across DO instances, not by growing one server — which matches the availability-first, P2P posture of the CRDT recommendation (it “fits #yxjw signaling”). - Edge-local. Workers run close to peers worldwide, minimizing handshake latency; the DO migrates to wherever its session is most active.
Roles and protocol
The server tracks a per-session roster of connected peers and forwards three
classes of frame. PeerId is the same u64 as Rust PeerId (serialized by
serde as a bare JSON number; ids must stay ≤ Number.MAX_SAFE_INTEGER).
- Client → server:
join { peer, capabilities? },offer { to, sdp },answer { to, sdp },ice { to, candidate },relay { to, payload },leave. - Server → client:
welcome { peer, peers }(roster on join),peer-joined/peer-left, forwardedoffer/answer/ice/relaystamped with the realfrom, anderror { code, message }.
Anti-spoofing: the from on every forwarded frame is the sender
connection’s registered peer id, never a client-supplied field, so a peer
cannot impersonate another.
Permission boundary (reuses #39c5)
Admission and relay are gated by SignalingPermissions, the discovery-layer
mirror of lazily::distributed::PeerPermissions:
openmode — any peer may join and signal any other joined peer (trusted / LAN / common discovery case).allowlistmode — default-deny: a peer may join only when explicitly granted, and may send directed frames only to explicitly allowed targets, exactly as #39c5 gatesRemoteOp. This is the discovery-layer half of the same boundary; the Rust data plane still re-checks everyRemoteOplocally.
Reconnect / resync
The roster lives in the DO; it is authoritative. A peer that drops simply
re-joins and receives a fresh welcome roster — no snapshot epoch is needed
at this layer because signaling carries no CRDT state (the data plane’s
HLC/dot-stamped Deltas self-heal independently per the CRDT design).
Implemented (#yxjw)
Ships as a standalone TypeScript Worker under signaling/ (its own Node
toolchain; not part of the Rust crate build):
src/protocol.ts— wire types + untrusted-frame validation/codec.src/permissions.ts—SignalingPermissions(open/ default-denyallowlist).src/room-core.ts— transport-agnosticRoomCore: roster, routing, anti-spoofing, permission gating.src/room.ts—SignalingRoomDurable Object (thin WebSocket adapter).src/index.ts— Worker entry:/health+/session/:idrouting.test/— 24 vitest tests (protocol/permissions/room-core units plus an end-to-end Worker + DO + WebSocket test in the workerd runtime).
Open gates (deferred)
- TURN/relay fallback policy when both peers are behind symmetric NAT (today the server can relay payloads, but a dedicated TURN allocation is out of scope).
- Authenticated admission tokens feeding the
allowlistgrants from an external identity source rather than static configuration. - Capacity/back-pressure limits per session DO.
Consumable clients (#s0fc)
So a project can depend on the signaling endpoint for distributed peer discovery (the plan for agent-doc), the endpoint ships two clients that speak one shared wire protocol — this section is the normative source of truth both conform to.
Wire protocol (normative). All frames are JSON with a type tag. PeerId
is a u64 serialized as a bare JSON number (Rust PeerId(u64) ⇄ TS number;
keep ids ≤ 2^53).
- Client → server:
join {peer, capabilities?},offer {to, sdp},answer {to, sdp},ice {to, candidate},relay {to, payload},leave. - Server → client:
welcome {peer, peers},peer-joined {peer},peer-left {peer},offer/answer/ice/relay(each stampedfrom),error {code, message}.
Rust client (signaling-client feature, src/signaling_client.rs):
lazily::SignalingClient::connect(base_url, session, peer) opens a
tokio-tungstenite WebSocket to {base_url}/session/{session}, joins, and
exposes offer/answer/ice/relay/leave plus recv() for
ServerMessages. ClientMessage/ServerMessage are serde-tagged
(rename_all = "kebab-case") and reuse the #39c5 PeerId. Conformance tests
assert the exact JSON shapes above. The feature pulls tokio-tungstenite
(rustls) only when enabled; the default build is unaffected.
TypeScript client (@lazily/signaling package, signaling/src/client.ts):
SignalingClient.connect(baseUrl, session, peer) (or attach(socket, peer) for
a pre-opened socket) works against any WebSocket-like transport (browser,
Node ≥ 22, or injected), with onMessage + the same send helpers. The package
exports ./client and ./protocol. Unit tests plus an end-to-end test drive
the real Worker + Durable Object in workerd.
Both clients are covered in CI (cargo test --features signaling-client; the
Worker job’s npm run check). The Rust conformance tests and the TS protocol
share the byte-for-byte frame shapes defined above, so the two implementations
stay wire-compatible.
Differences from lazily-zig
| Aspect | lazily-zig | lazily-rs |
|---|---|---|
| Context | Explicit allocator | Owned allocations |
| Slot creation | comptime function pointers | Closures (Box<dyn Fn>) |
| Storage modes | .direct / .indirect | Unified via generics |
| FFI | Built-in StringView | Via #[no_mangle] + extern "C" |
| Thread safety | Mutex by default; -Dthread_safe=false removes locking | Context is single-threaded (RefCell); ThreadSafeContext uses a context-level lock |
Differences from lazily-py
| Aspect | lazily-py | lazily-rs |
|---|---|---|
| Context | Plain dict | Typed Context struct |
| Slot keys | Object identity | SlotId (u64) |
| Cell equality | != operator | PartialEq trait |
| Context resolvers | resolve_ctx functions | Direct context passing |
| Dependencies | Zero mandatory runtime crates by default; optional Tokio support and dev-only Criterion benchmarks | Zero (pure Rust) |
lazily Wire Protocol
Language-agnostic protocol reference for the lazily reactive-graph family (lazily-rs, lazily-py, lazily-zig). This document describes the wire format, message schemas, and transport contracts. Language-specific APIs live in each binding’s own documentation.
Message Plane
All channels (FFI, IPC, WebSocket, WebRTC data) carry the same two message kinds:
Snapshot— full graph image, sent on connect and on resyncDelta— incremental change set, sent once per outermost batch flush
These are tagged as IpcMessage:
{ "Snapshot": { ... } }
{ "Delta": { ... } }
Wire Types
NodeId
Stable wire identifier for a reactive node (cell or slot). Decoupled from language-internal allocation IDs.
{ "node": 1 }
Wire format: u64 wrapped in a "node" field.
PeerId
Identifies a remote peer.
{ "peer": 42 }
Wire format: u64. JavaScript peers must keep this at or below
Number.MAX_SAFE_INTEGER.
OpKind
Access category for a remote operation.
| Value | Meaning |
|---|---|
"Read" | Read node value into snapshot/delta |
"Write" | Write new value to source cell |
"TriggerEffect" | Trigger effect on irreversible-effect plane |
RemoteOp
A single operation a remote peer may request.
{ "kind": "Read", "node": 1 }
IpcPayload
Opaque serialized value bytes. The producing language owns type-aware encoding
through type_tag; the channel only moves bytes.
Wire format: array of u8 (JSON array of integers).
NodeState
Serialization state for a node.
{ "Payload": [1, 2, 3, 4] }
"Opaque"
{ "SharedBlob": { "offset": 0, "len": 16, "generation": 1, "epoch": 9, "checksum": 123456789 } }
| Variant | Meaning |
|---|---|
{ "Payload": [...] } | Inline serialized value bytes |
"Opaque" | Known node whose value cannot be serialized |
{ "SharedBlob": { ... } } | Descriptor for bytes in shared memory |
ShmBlobRef
Descriptor for a payload stored in a shared-memory blob arena.
| Field | Type | Meaning |
|---|---|---|
offset | u64 | Byte offset from arena start |
len | u64 | Payload length in bytes |
generation | u64 | Per-write generation (stale rejection) |
epoch | u64 | IPC epoch of the publishing message |
checksum | u64 | FNV-1a payload checksum |
IpcValue
Value stored inline or by shared-memory blob reference.
{ "Inline": [10, 20, 30] }
{ "SharedBlob": { "offset": 40, "len": 17, "generation": 2, "epoch": 9, "checksum": 987654321 } }
Snapshot Message
Full graph image sent on connect or resync.
Schema
Snapshot {
epoch: u64,
nodes: Vec<NodeSnapshot>,
edges: Vec<EdgeSnapshot>,
roots: Vec<NodeId>
}
NodeSnapshot
NodeSnapshot {
node: NodeId,
type_tag: string,
state: NodeState
}
EdgeSnapshot
EdgeSnapshot {
dependent: NodeId,
dependency: NodeId
}
Example: Minimal snapshot
{
"Snapshot": {
"epoch": 1,
"nodes": [
{
"node": 1,
"type_tag": "i32",
"state": { "Payload": [1, 2, 3, 4] }
}
],
"edges": [],
"roots": [1]
}
}
Example: Multi-node snapshot with opaque node
{
"Snapshot": {
"epoch": 7,
"nodes": [
{ "node": 1, "type_tag": "i32", "state": { "Payload": [1, 2, 3] } },
{ "node": 2, "type_tag": "f64", "state": { "Payload": [0, 0, 0, 0, 0, 0, 240, 63] } },
{ "node": 3, "type_tag": "opaque-type", "state": "Opaque" }
],
"edges": [
{ "dependent": 2, "dependency": 1 },
{ "dependent": 3, "dependency": 1 }
],
"roots": [1, 2]
}
}
Example: Snapshot with shared-blob node
{
"Snapshot": {
"epoch": 9,
"nodes": [
{
"node": 7,
"type_tag": "text/plain",
"state": {
"SharedBlob": {
"offset": 0,
"len": 16,
"generation": 1,
"epoch": 9,
"checksum": 123456789
}
}
}
],
"edges": [],
"roots": [7]
}
}
Delta Message
Incremental change set emitted after one outermost batch flush.
Schema
Delta {
base_epoch: u64,
epoch: u64,
ops: Vec<DeltaOp>
}
Sequential deltas satisfy epoch == base_epoch + 1. A receiver detects gaps,
reorders, or sender restarts by checking base_epoch == last_epoch.
DeltaOp Variants
| Variant | Fields | Meaning |
|---|---|---|
CellSet | node, payload (IpcValue) | Source cell changed to new value |
SlotValue | node, payload (IpcValue) | Lazily recomputed slot published a value |
Invalidate | node | Node dirtied without a concrete value |
NodeAdd | node, type_tag, state (NodeState) | New node became visible |
NodeRemove | node | Node was removed |
EdgeAdd | dependent, dependency | Dependency edge added |
EdgeRemove | dependent, dependency | Dependency edge removed |
Example: Sequential delta with all op variants
{
"Delta": {
"base_epoch": 40,
"epoch": 41,
"ops": [
{ "CellSet": { "node": 1, "payload": { "Inline": [10] } } },
{ "SlotValue": { "node": 2, "payload": { "Inline": [20] } } },
{ "Invalidate": { "node": 3 } },
{ "NodeAdd": { "node": 4, "type_tag": "u64", "state": { "Payload": [64] } } },
{ "NodeRemove": { "node": 5 } },
{ "EdgeAdd": { "dependent": 2, "dependency": 1 } },
{ "EdgeRemove": { "dependent": 3, "dependency": 1 } }
]
}
}
Example: Non-sequential delta (gap)
{
"Delta": {
"base_epoch": 12,
"epoch": 13,
"ops": []
}
}
When the receiver’s last_epoch is 10, this delta has a gap (expected 10→11,
got 12→13). The receiver must discard it and request a fresh Snapshot.
Example: Delta with shared-blob payload
{
"Delta": {
"base_epoch": 8,
"epoch": 9,
"ops": [
{
"SlotValue": {
"node": 7,
"payload": {
"SharedBlob": {
"offset": 40,
"len": 17,
"generation": 2,
"epoch": 9,
"checksum": 987654321
}
}
}
}
]
}
}
Epoch Contract
ipc_epochis a monotonicu64that advances once per outermost batch flush.Snapshotcarriesepoch.Deltacarries{ base_epoch, epoch }withepoch == base_epoch + 1.- On
Deltawherebase_epoch != last_epoch: discard the delta, request a freshSnapshot, resume from the snapshot’sepoch.
Consistency Invariants
- PartialEq cell guard: equal
CellSetproduces no wire ops. - Memo equality suppression: a dirty memo slot that recomputes to an equal
value emits no
SlotValueor downstreamInvalidate. - Coalesced frontier: a dependent reached through many changed cells in one batch appears at most once per delta.
- Eager Signal nodes always carry a value: an eager
Signal(see below) is recomputed during the invalidation flush, so when it changes it appears in the delta as a concreteSlotValue(never a bareInvalidate). A purely lazy slot that was not read before the flush may instead appear asInvalidatewith no value. Both are valid wire states for the sameSlotValue/Invalidateop set; the distinction is computation timing, not message format.
Eager Signal Nodes
A Signal is the eager derived value in the Slot -> Cell -> Signal family: it
recomputes the instant a dependency invalidates rather than on next read. It is
not a new wire type. A Signal is composed from a memoized backing slot plus a
local puller effect, and only the backing slot is graph state, so on the wire a
Signal node is an ordinary slot node:
- Snapshot: the backing slot appears as a
NodeSnapshotwith its materialized value inNodeState(Payload/SharedBlob), like any other readable slot. - Delta: a value change appears as
SlotValuefor the backing slot’sNodeId. Because the value is eagerly materialized at flush time it is always concrete; eager nodes do not emit bareInvalidate. - Memo guard still applies: an eager recompute that yields an equal value
(
PartialEq) suppresses theSlotValueand any downstreamInvalidate, exactly as forctx.memoslots. - The puller effect is local: it drives eager recomputation but is not
serialized as a node and produces no
TriggerEffectop. Eagerness is a producer-side scheduling property; remote peers receive the same permission-filteredSnapshot/Deltastate plane regardless of whether a node is lazy or eager.
Peers therefore need no protocol change to consume signals from an eager producer — a Signal is observed as a slot that is reliably present in every delta that changes it.
Permission Boundary
Only nodes on the per-peer allowlist are serialized. Non-allowlisted nodes are
omitted entirely (not even as Opaque) so a peer cannot infer their
existence. Edges are retained only when both endpoints are readable.
This filter is applied at snapshot/delta construction time, before serialization, on all channels without exception.
Serialization
JSON (default)
serde_json with derived Serialize/Deserialize. All examples above use
JSON. This is the cross-language default.
Binary (optional)
postcard compact binary encoding via the ipc-binary feature. Smaller and
faster than JSON, but not self-describing — peers must agree on the schema.
For same-language or postcard-aware transports only.
Binary frames decode through IpcMessage::decode_binary(bytes) and encode
through IpcMessage::encode_binary().
Transport Contracts
FFI (C ABI)
- Opaque channel handle + owned byte buffers
- Functions:
channel_new,channel_free,channel_send,channel_recv,ipc_message_validate,ipc_message_kind,ipc_message_clone,bytes_free - Binary variants: same functions with
_binarysuffix - Ownership: caller owns input bytes; Rust owns output buffers until the paired free function is called
- Errors return
LazilyFfiStatusenum; panics are caught before the C ABI
IPC (Unix socket / pipe / local TCP)
- Length-prefixed serialized
IpcMessageframes - Shared-memory optional for large
IpcValue::SharedBlobpayloads IpcSink/IpcSourcetrait interface
WebSocket
- One WebSocket text/binary frame carries one serialized
IpcMessage - Signaling server (#yxjw) relays frames as opaque payload
- Server must not parse CRDT/IPC state
WebRTC Data Channel
- Reliable ordered data channels only (for graph state)
- Length-prefixed framing: 4-byte LE length + payload
- JSON or binary codec negotiated during capability handshake
- On channel failure: re-signaling via
SignalingClient, delta resync covers gaps - Unordered/unreliable channels only for optional lossy telemetry
Capability Negotiation
Each non-local session starts with a handshake:
| Field | Description |
|---|---|
| Protocol id | "lazily-ipc" |
| Protocol major version | 1 |
| Codec | "json" or "binary" |
| Maximum frame size | Negotiated maximum |
| Ordered/reliable | Required for graph state |
| PeerId | Session participant |
| Supported features | shared-blob, crdt-cell-plane, etc. |
If peers disagree on protocol major version, codec, or ordering guarantees,
they fail closed before applying any Snapshot or Delta.
Cross-Language Family Rules
- Compute closures are language-local. Cross-language sync shares the cell state plane; derived slots converge remotely only when peers use a shared compiled graph or explicit compute descriptors.
- Permission filtering happens before serialization on every channel.
- Channel code must preserve back-pressure and resync behavior.
- All channels carry the same permission-filtered
IpcMessagestate plane.
Conformance Test Vectors
Canonical JSON fixtures in tests/conformance/ validate wire-format agreement
across all language bindings:
| Fixture | Coverage |
|---|---|
snapshot_minimal.json | Single payload node, no edges |
snapshot_multi_node.json | Multiple nodes, opaque state, edges |
snapshot_shared_blob.json | Shared-memory blob reference |
delta_sequential.json | All 7 DeltaOp variants |
delta_non_sequential.json | Gap requiring resync |
delta_shared_blob.json | Delta with shared-blob payload |
Each fixture contains:
{
"description": "...",
"protocol_version": 1,
"kind": "Snapshot" | "Delta",
"assertions": { ... },
"wire": { <IpcMessage> }
}
Language bindings should:
- Parse
wireinto native types - Validate
assertions(field values, counts, state kinds) - Re-serialize and verify byte-exact match
lazily Benchmark Results
Generated benchmark data for the lazily reactive primitives library.
Benchmark Results
Generated for package lazily version 0.12.2.
Environment: rustc 1.96.0 (ac68faa20 2026-05-25) on x86_64-unknown-linux-gnu.
Refresh command:
python3 scripts/update-benchmark-results.py
Regression workflow:
cargo bench --features instrumentation -- --save-baseline before
# apply the performance patch
cargo bench --features instrumentation -- --baseline before
python3 scripts/update-benchmark-results.py --no-run
Regression budgets enforced by python3 scripts/update-benchmark-results.py --check:
| Profile | Max lock acquisitions | Site lock budgets |
|---|---|---|
| thread_safe_set_cell_invalidation_independent_slot_contention_16 | 192 | set_cell_invalidation<=0, dependency_edge<=16, get_refresh<=32, publish<=32 |
| thread_safe_set_cell_invalidation_batched_write_bursts_16 | 900 | other<=800, set_cell_invalidation<=16, dependency_edge<=64, get_refresh<=2, publish<=2 |
| thread_safe_contention_same_slot_write_read_16 | 1000 | get_refresh<=160, publish<=256, in_flight_wait<=700, set_cell_invalidation<=180 |
| thread_safe_contention_independent_slots_16 | 1100 | other<=450, get_refresh<=64, publish<=320, dependency_edge<=16, set_cell_invalidation<=300 |
| thread_safe_contention_read_mostly_waiters_16 | 256 | get_refresh<=128, publish<=64, in_flight_wait<=96 |
| thread_safe_contention_batched_write_bursts_16 | 950 | other<=800, get_refresh<=128, dependency_edge<=64, set_cell_invalidation<=16, publish<=64, in_flight_wait<=64 |
| thread_safe_effect_contention_queue_coalescing_16 | 2600 | other<=900, dependency_edge<=1600, set_cell_invalidation<=16, get_refresh<=64, publish<=0 |
| thread_safe_effect_contention_cleanup_execution_16 | 1300 | other<=450, dependency_edge<=700, set_cell_invalidation<=256, get_refresh<=0, publish<=0 |
| thread_safe_effect_contention_batch_flush_16 | 1500 | other<=1300, get_refresh<=32, dependency_edge<=96, set_cell_invalidation<=16, publish<=32 |
Budgets use deterministic lock acquisition counts instead of elapsed wait/hold time.
Synchronization strategy adoption gate:
| Strategy | Status | Required throughput evidence | Required p50/p95 latency evidence | Lock-site and safety gate |
|---|---|---|---|---|
| current_std_mutex_condvar | baseline | thread_safe_contention and thread_safe_effect_contention at 8/16 workers | p50/p95 latency for same-slot, read-mostly, batch, and effect-heavy cases | must stay within current lock-site budgets and Loom safety coverage |
| narrower_condvar_wakeups | adopted for per-slot recompute waiters | same-slot write/read and read-mostly waiter throughput at 8/16 workers | p50/p95 latency for waiter wakeup handoff and stale-completion retry | must not regress effect queue, cleanup, or batch flush budgets |
| parking_lot_style_parking | candidate only | same contention matrix measured against current_std_mutex_condvar | p50/p95 latency for parking/unparking under 8/16 workers | requires no worse lock-site budgets plus a deadlock/starvation model |
| targeted_cas | candidate only | fresh cached reads and independent-slot throughput at 8/16 workers | p50/p95 latency for revision validation fallback and publish races | requires unchanged effect/batch/disposal budgets plus Loom/Shuttle proof |
Candidates do not replace the current strategy before the same run reports throughput, p50/p95 latency, and lock-site budgets for the required 8/16-worker cases.
Required latency evidence uses Criterion sample per-iteration timing.
Watch-item A/B follow-up:
| Watch item | Baseline/current refs | Focused command | Controlled rerun result | Decision |
|---|---|---|---|---|
| cached ThreadSafeContext read latency | a8b6fc3 vs c917401 | cargo bench --features instrumentation --bench context -- cached_reads/thread_safe_context | 73.48 ns baseline vs 73.20 ns current on warm-cache repeat | no tuning; the archived 56.5 ns row did not reproduce under controlled A/B |
| effect cleanup contention at 16 workers | a8b6fc3 vs c917401 | cargo bench --features instrumentation --bench context -- thread_safe_effect_contention/cleanup_execution/16 | 2.31 ms baseline vs 2.43 ms current on warm-cache repeat with overlapping CIs | keep watching; Criterion reported no statistically significant change |
| Group | Case | p50 | p95 | Samples |
|---|---|---|---|---|
| thread_safe_contention | same_slot_write_read / 8 | 2.578 ms | 2.746 ms | 10 |
| thread_safe_contention | same_slot_write_read / 16 | 7.185 ms | 8.642 ms | 10 |
| thread_safe_contention | independent_slots / 8 | 1.133 ms | 1.220 ms | 10 |
| thread_safe_contention | independent_slots / 16 | 2.885 ms | 3.035 ms | 10 |
| thread_safe_contention | read_mostly_waiters / 8 | 533.466 us | 578.630 us | 10 |
| thread_safe_contention | read_mostly_waiters / 16 | 1.105 ms | 1.186 ms | 10 |
| thread_safe_contention | batched_write_bursts / 8 | 2.734 ms | 3.132 ms | 10 |
| thread_safe_contention | batched_write_bursts / 16 | 5.097 ms | 5.433 ms | 10 |
| thread_safe_effect_contention | queue_coalescing / 8 | 1.464 ms | 1.739 ms | 10 |
| thread_safe_effect_contention | queue_coalescing / 16 | 4.039 ms | 4.208 ms | 10 |
| thread_safe_effect_contention | cleanup_execution / 8 | 1.457 ms | 1.719 ms | 10 |
| thread_safe_effect_contention | cleanup_execution / 16 | 3.948 ms | 4.323 ms | 10 |
| thread_safe_effect_contention | batch_flush / 8 | 2.646 ms | 2.929 ms | 10 |
| thread_safe_effect_contention | batch_flush / 16 | 7.896 ms | 8.064 ms | 10 |
| thread_safe_graph_propagation | fan_out_eager_validation / 8 | 3.538 ms | 3.629 ms | 10 |
| thread_safe_graph_propagation | fan_out_eager_validation / 16 | 6.130 ms | 6.357 ms | 10 |
| thread_safe_graph_propagation | fan_out_lazy_dirty_epochs / 8 | 2.551 ms | 2.671 ms | 10 |
| thread_safe_graph_propagation | fan_out_lazy_dirty_epochs / 16 | 4.442 ms | 4.535 ms | 10 |
| thread_safe_graph_propagation | fan_in_lazy_dirty_epochs / 8 | 574.260 us | 585.218 us | 10 |
| thread_safe_graph_propagation | fan_in_lazy_dirty_epochs / 16 | 1.126 ms | 1.138 ms | 10 |
| thread_safe_graph_propagation | fan_in_batched_flush / 8 | 1.167 ms | 1.287 ms | 10 |
| thread_safe_graph_propagation | fan_in_batched_flush / 16 | 2.271 ms | 2.452 ms | 10 |
Criterion estimates are local mean wall-clock time per iteration.
| Group | Case | Mean | 95% CI |
|---|---|---|---|
| cached_reads | context | 8.027 ns | 7.987 ns - 8.066 ns |
| cached_reads | thread_safe_context | 65.223 ns | 64.895 ns - 65.584 ns |
| cold_first_get | context | 80.054 ns | 77.778 ns - 82.208 ns |
| cold_first_get | thread_safe_context | 990.971 ns | 959.974 ns - 1.021 us |
| dependency_fan_out | context / 32 | 3.107 us | 2.962 us - 3.248 us |
| dependency_fan_out | context / 256 | 45.763 us | 43.643 us - 48.510 us |
| dependency_fan_out | thread_safe_context / 32 | 20.950 us | 20.551 us - 21.347 us |
| dependency_fan_out | thread_safe_context / 256 | 160.588 us | 159.203 us - 162.024 us |
| set_cell_invalidation | high_fan_out / 512 | 92.241 us | 89.998 us - 94.223 us |
| set_cell_invalidation | same_slot_contention / 1 | 38.338 us | 37.310 us - 39.419 us |
| set_cell_invalidation | same_slot_contention / 2 | 80.472 us | 78.918 us - 81.907 us |
| set_cell_invalidation | same_slot_contention / 4 | 186.897 us | 180.693 us - 192.872 us |
| set_cell_invalidation | same_slot_contention / 8 | 541.804 us | 517.595 us - 568.932 us |
| set_cell_invalidation | same_slot_contention / 16 | 1.759 ms | 1.718 ms - 1.794 ms |
| set_cell_invalidation | independent_slot_contention / 1 | 39.527 us | 38.588 us - 40.454 us |
| set_cell_invalidation | independent_slot_contention / 2 | 73.353 us | 71.989 us - 74.760 us |
| set_cell_invalidation | independent_slot_contention / 4 | 128.280 us | 126.976 us - 129.514 us |
| set_cell_invalidation | independent_slot_contention / 8 | 281.716 us | 269.854 us - 293.557 us |
| set_cell_invalidation | independent_slot_contention / 16 | 612.042 us | 607.064 us - 617.024 us |
| set_cell_invalidation | batched_write_bursts / 1 | 128.103 us | 126.066 us - 130.111 us |
| set_cell_invalidation | batched_write_bursts / 2 | 226.325 us | 218.580 us - 232.553 us |
| set_cell_invalidation | batched_write_bursts / 4 | 480.877 us | 463.297 us - 496.610 us |
| set_cell_invalidation | batched_write_bursts / 8 | 1.401 ms | 1.353 ms - 1.449 ms |
| set_cell_invalidation | batched_write_bursts / 16 | 4.002 ms | 3.922 ms - 4.075 ms |
| memo_equality_suppression | context | 1.881 us | 1.758 us - 2.005 us |
| memo_equality_suppression | thread_safe_context | 33.002 us | 32.645 us - 33.355 us |
| effect_flushing | context | 47.561 ns | 47.315 ns - 47.835 ns |
| effect_flushing | thread_safe_context | 919.973 ns | 914.308 ns - 926.589 ns |
| batch_storms | context / 64 | 2.764 us | 2.744 us - 2.786 us |
| batch_storms | thread_safe_context / 64 | 7.082 us | 6.899 us - 7.306 us |
| thread_safe_contention | same_slot_write_read / 1 | 98.686 us | 96.377 us - 100.859 us |
| thread_safe_contention | same_slot_write_read / 2 | 266.923 us | 261.168 us - 272.455 us |
| thread_safe_contention | same_slot_write_read / 4 | 871.034 us | 824.425 us - 916.400 us |
| thread_safe_contention | same_slot_write_read / 8 | 2.558 ms | 2.488 ms - 2.625 ms |
| thread_safe_contention | same_slot_write_read / 16 | 7.411 ms | 7.144 ms - 7.749 ms |
| thread_safe_contention | independent_slots / 1 | 101.937 us | 100.782 us - 102.937 us |
| thread_safe_contention | independent_slots / 2 | 212.770 us | 206.378 us - 217.797 us |
| thread_safe_contention | independent_slots / 4 | 463.904 us | 453.067 us - 474.423 us |
| thread_safe_contention | independent_slots / 8 | 1.133 ms | 1.109 ms - 1.159 ms |
| thread_safe_contention | independent_slots / 16 | 2.850 ms | 2.739 ms - 2.949 ms |
| thread_safe_contention | read_mostly_waiters / 1 | 103.037 us | 101.347 us - 104.750 us |
| thread_safe_contention | read_mostly_waiters / 2 | 138.765 us | 134.847 us - 142.827 us |
| thread_safe_contention | read_mostly_waiters / 4 | 277.379 us | 271.246 us - 282.068 us |
| thread_safe_contention | read_mostly_waiters / 8 | 522.464 us | 489.359 us - 552.208 us |
| thread_safe_contention | read_mostly_waiters / 16 | 1.116 ms | 1.097 ms - 1.136 ms |
| thread_safe_contention | batched_write_bursts / 1 | 208.558 us | 206.766 us - 210.767 us |
| thread_safe_contention | batched_write_bursts / 2 | 547.941 us | 521.314 us - 575.389 us |
| thread_safe_contention | batched_write_bursts / 4 | 1.467 ms | 1.454 ms - 1.480 ms |
| thread_safe_contention | batched_write_bursts / 8 | 2.796 ms | 2.716 ms - 2.890 ms |
| thread_safe_contention | batched_write_bursts / 16 | 4.970 ms | 4.733 ms - 5.184 ms |
| thread_safe_effect_contention | queue_coalescing / 8 | 1.491 ms | 1.437 ms - 1.558 ms |
| thread_safe_effect_contention | queue_coalescing / 16 | 4.055 ms | 4.013 ms - 4.102 ms |
| thread_safe_effect_contention | cleanup_execution / 8 | 1.498 ms | 1.445 ms - 1.561 ms |
| thread_safe_effect_contention | cleanup_execution / 16 | 4.042 ms | 3.893 ms - 4.185 ms |
| thread_safe_effect_contention | batch_flush / 8 | 2.689 ms | 2.607 ms - 2.771 ms |
| thread_safe_effect_contention | batch_flush / 16 | 7.856 ms | 7.715 ms - 7.961 ms |
| thread_safe_graph_propagation | fan_out_eager_validation / 8 | 3.558 ms | 3.531 ms - 3.587 ms |
| thread_safe_graph_propagation | fan_out_eager_validation / 16 | 6.154 ms | 6.104 ms - 6.211 ms |
| thread_safe_graph_propagation | fan_out_lazy_dirty_epochs / 8 | 2.544 ms | 2.503 ms - 2.584 ms |
| thread_safe_graph_propagation | fan_out_lazy_dirty_epochs / 16 | 4.451 ms | 4.427 ms - 4.477 ms |
| thread_safe_graph_propagation | fan_in_lazy_dirty_epochs / 8 | 572.971 us | 567.279 us - 578.239 us |
| thread_safe_graph_propagation | fan_in_lazy_dirty_epochs / 16 | 1.109 ms | 1.074 ms - 1.130 ms |
| thread_safe_graph_propagation | fan_in_batched_flush / 8 | 1.169 ms | 1.144 ms - 1.201 ms |
| thread_safe_graph_propagation | fan_in_batched_flush / 16 | 2.296 ms | 2.257 ms - 2.341 ms |
| profile_instrumentation | context_snapshot | 267.600 ns | 265.902 ns - 269.343 ns |
| profile_instrumentation | thread_safe_snapshot | 295.465 us | 294.338 us - 296.488 us |
| async_cached_resolve | async_context | 4.005 us | 3.829 us - 4.197 us |
| async_cached_resolve | sync_context_baseline | 73.063 ns | 69.741 ns - 77.406 ns |
| async_cached_resolve | sync_get | 11.537 ns | 11.488 ns - 11.585 ns |
| async_cached_resolve | thread_safe_context_baseline | 1.394 us | 1.388 us - 1.399 us |
| async_cold_resolve | async_context | 4.261 us | 4.114 us - 4.397 us |
| async_cold_resolve | sync_context_baseline | 83.178 ns | 79.646 ns - 86.312 ns |
| async_cold_resolve | thread_safe_context_baseline | 1.010 us | 983.439 ns - 1.036 us |
| async_invalidation_throughput | async_context | 245.023 us | 232.026 us - 258.919 us |
| async_invalidation_throughput | sync_context_baseline | 2.785 us | 2.763 us - 2.808 us |
| async_invalidation_throughput | thread_safe_context_baseline | 39.314 us | 39.079 us - 39.593 us |
| async_cancellation_throughput | async_invalidate_in_flight | 77.176 us | 64.769 us - 87.137 us |
| async_concurrent_contention | async_context / 1 | 71.400 us | 70.290 us - 72.484 us |
| async_concurrent_contention | async_context / 4 | 391.647 us | 375.605 us - 406.235 us |
| async_concurrent_contention | async_context / 16 | 1.939 ms | 1.834 ms - 2.040 ms |
| async_concurrent_contention | thread_safe_context_baseline / 1 | 60.390 us | 58.969 us - 61.565 us |
| async_concurrent_contention | thread_safe_context_baseline / 4 | 509.481 us | 496.452 us - 522.465 us |
| async_concurrent_contention | thread_safe_context_baseline / 16 | 3.494 ms | 3.408 ms - 3.553 ms |
| async_effect_throughput | async_context | 188.121 ms | 187.959 ms - 188.267 ms |
| async_batch_throughput | async_context | 76.304 us | 73.323 us - 79.503 us |
| async_batch_throughput | sync_context_baseline | 10.755 us | 10.676 us - 10.843 us |
| tokio_sync_cached_read | single_task | 1.488 us | 1.476 us - 1.501 us |
| tokio_sync_cached_read | spawn_read | 5.051 us | 4.882 us - 5.218 us |
| tokio_sync_cold_first_get | single_task | 1.425 us | 1.414 us - 1.435 us |
| tokio_sync_cold_first_get | spawn_compute | 5.226 us | 4.983 us - 5.462 us |
| tokio_sync_invalidation | single_task | 39.148 us | 38.805 us - 39.520 us |
| tokio_sync_concurrent_contention | same_slot_write_read / 1 | 43.792 us | 43.051 us - 44.677 us |
| tokio_sync_concurrent_contention | same_slot_write_read / 4 | 402.701 us | 389.818 us - 416.472 us |
| tokio_sync_concurrent_contention | same_slot_write_read / 16 | 3.734 ms | 3.570 ms - 3.882 ms |
| tokio_sync_concurrent_contention | independent_slots / 1 | 44.150 us | 43.337 us - 44.963 us |
| tokio_sync_concurrent_contention | independent_slots / 4 | 254.247 us | 241.953 us - 268.158 us |
| tokio_sync_concurrent_contention | independent_slots / 16 | 1.460 ms | 1.401 ms - 1.518 ms |
| tokio_sync_batch | spawn_batch | 43.228 us | 42.846 us - 43.603 us |
| tokio_sync_effect | single_task | 10.087 ms | 10.083 ms - 10.093 ms |
| typed_cache_reads | context_cell | 2.513 ns | 2.497 ns - 2.530 ns |
| typed_cache_reads | context_rc_cell | 2.771 ns | 2.754 ns - 2.788 ns |
| typed_cache_reads | context_rc_slot | 8.245 ns | 8.197 ns - 8.297 ns |
| typed_cache_reads | context_slot | 7.961 ns | 7.925 ns - 7.998 ns |
| typed_cache_reads | thread_safe_cell | 25.089 ns | 24.867 ns - 25.314 ns |
| typed_cache_reads | thread_safe_slot | 64.338 ns | 64.048 ns - 64.615 ns |
Instrumentation snapshots are single local profile runs captured by
examples/instrumentation_profile.rs.
| Profile | Alloc | Recomputes | Duplicate recomputes | Edges + | Edges - | Effect pushes | Max queue | Lock acquisitions | Lock wait | Lock hold | Sidecar frontiers | Sidecar dirty marks | Sidecar fallbacks | Dirty epochs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| context_memo_effect | 4 | 3 | 0 | 4 | 1 | 2 | 1 | 0 | 0.000 ns | 0.000 ns | 0 | 0 | 0 | 0 |
| context_fan_out_32 | 33 | 64 | 0 | 64 | 32 | 0 | 0 | 0 | 0.000 ns | 0.000 ns | 0 | 0 | 0 | 0 |
| context_batch_storm_64 | 65 | 0 | 0 | 128 | 64 | 2 | 1 | 0 | 0.000 ns | 0.000 ns | 0 | 0 | 0 | 0 |
| thread_safe_first_get_2 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 11 | 19.240 us | 18.020 us | 0 | 0 | 0 | 0 |
| thread_safe_set_cell_invalidation_high_fan_out_512 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.000 ns | 0.000 ns | 1 | 512 | 0 | 512 |
| thread_safe_set_cell_invalidation_same_slot_contention_1 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 8 | 290.000 ns | 2.880 us | 16 | 16 | 0 | 16 |
| thread_safe_set_cell_invalidation_same_slot_contention_2 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 8 | 280.000 ns | 1.170 us | 32 | 32 | 0 | 32 |
| thread_safe_set_cell_invalidation_same_slot_contention_4 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 8 | 280.000 ns | 1.060 us | 64 | 64 | 0 | 64 |
| thread_safe_set_cell_invalidation_same_slot_contention_8 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 8 | 280.000 ns | 1.110 us | 128 | 128 | 0 | 128 |
| thread_safe_set_cell_invalidation_same_slot_contention_16 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 8 | 380.000 ns | 5.080 us | 256 | 256 | 0 | 256 |
| thread_safe_set_cell_invalidation_independent_slot_contention_1 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 8 | 380.000 ns | 7.940 us | 15 | 15 | 0 | 15 |
| thread_safe_set_cell_invalidation_independent_slot_contention_2 | 4 | 2 | 0 | 2 | 0 | 0 | 0 | 16 | 610.000 ns | 2.410 us | 31 | 31 | 0 | 31 |
| thread_safe_set_cell_invalidation_independent_slot_contention_4 | 8 | 4 | 0 | 4 | 0 | 0 | 0 | 32 | 1.140 us | 4.920 us | 63 | 63 | 0 | 63 |
| thread_safe_set_cell_invalidation_independent_slot_contention_8 | 16 | 8 | 0 | 8 | 0 | 0 | 0 | 64 | 2.270 us | 9.930 us | 127 | 127 | 0 | 127 |
| thread_safe_set_cell_invalidation_independent_slot_contention_16 | 32 | 16 | 0 | 16 | 0 | 0 | 0 | 128 | 4.570 us | 18.810 us | 255 | 255 | 0 | 255 |
| thread_safe_set_cell_invalidation_batched_write_bursts_1 | 5 | 1 | 0 | 4 | 0 | 0 | 0 | 97 | 3.650 us | 60.860 us | 0 | 0 | 0 | 15 |
| thread_safe_set_cell_invalidation_batched_write_bursts_2 | 9 | 1 | 0 | 8 | 0 | 0 | 0 | 159 | 193.830 us | 106.741 us | 0 | 0 | 0 | 21 |
| thread_safe_set_cell_invalidation_batched_write_bursts_4 | 17 | 1 | 0 | 16 | 0 | 0 | 0 | 199 | 439.375 us | 150.901 us | 0 | 0 | 0 | 6 |
| thread_safe_set_cell_invalidation_batched_write_bursts_8 | 33 | 1 | 0 | 32 | 0 | 0 | 0 | 369 | 2.189 ms | 231.381 us | 0 | 0 | 0 | 4 |
| thread_safe_set_cell_invalidation_batched_write_bursts_16 | 65 | 1 | 0 | 64 | 0 | 0 | 0 | 715 | 9.403 ms | 432.702 us | 0 | 0 | 0 | 2 |
| thread_safe_contention_same_slot_write_read_1 | 2 | 17 | 0 | 1 | 0 | 0 | 0 | 24 | 1.080 us | 17.510 us | 16 | 16 | 0 | 16 |
| thread_safe_contention_same_slot_write_read_2 | 2 | 28 | 0 | 1 | 0 | 0 | 0 | 77 | 16.610 us | 46.410 us | 23 | 23 | 9 | 32 |
| thread_safe_contention_same_slot_write_read_4 | 2 | 52 | 0 | 1 | 0 | 0 | 0 | 226 | 127.742 us | 107.120 us | 33 | 33 | 31 | 64 |
| thread_safe_contention_same_slot_write_read_8 | 2 | 111 | 0 | 1 | 0 | 0 | 0 | 449 | 207.906 us | 200.263 us | 70 | 70 | 58 | 128 |
| thread_safe_contention_same_slot_write_read_16 | 2 | 211 | 0 | 1 | 0 | 0 | 0 | 862 | 1.132 ms | 429.864 us | 138 | 138 | 118 | 256 |
| thread_safe_contention_independent_slots_1 | 2 | 16 | 0 | 1 | 0 | 0 | 0 | 23 | 980.000 ns | 15.180 us | 15 | 15 | 0 | 15 |
| thread_safe_contention_independent_slots_2 | 4 | 33 | 0 | 2 | 0 | 0 | 0 | 89 | 19.151 us | 39.491 us | 17 | 17 | 14 | 31 |
| thread_safe_contention_independent_slots_4 | 8 | 67 | 0 | 4 | 0 | 0 | 0 | 199 | 416.563 us | 95.721 us | 25 | 25 | 38 | 63 |
| thread_safe_contention_independent_slots_8 | 16 | 135 | 0 | 8 | 0 | 0 | 0 | 465 | 4.256 ms | 296.231 us | 11 | 11 | 116 | 127 |
| thread_safe_contention_independent_slots_16 | 32 | 271 | 0 | 16 | 0 | 0 | 0 | 939 | 19.906 ms | 630.554 us | 9 | 9 | 246 | 255 |
| thread_safe_contention_read_mostly_waiters_1 | 2 | 17 | 0 | 1 | 0 | 0 | 0 | 24 | 1.110 us | 17.340 us | 16 | 16 | 0 | 16 |
| thread_safe_contention_read_mostly_waiters_2 | 2 | 17 | 0 | 1 | 0 | 0 | 0 | 24 | 1.030 us | 14.530 us | 16 | 16 | 0 | 16 |
| thread_safe_contention_read_mostly_waiters_4 | 2 | 17 | 0 | 1 | 0 | 0 | 0 | 41 | 2.500 us | 26.930 us | 16 | 16 | 0 | 16 |
| thread_safe_contention_read_mostly_waiters_8 | 2 | 17 | 0 | 1 | 0 | 0 | 0 | 59 | 22.950 us | 31.790 us | 16 | 16 | 0 | 16 |
| thread_safe_contention_read_mostly_waiters_16 | 2 | 17 | 0 | 1 | 0 | 0 | 0 | 83 | 217.170 us | 114.471 us | 16 | 16 | 0 | 16 |
| thread_safe_contention_batched_write_bursts_1 | 5 | 16 | 0 | 4 | 0 | 0 | 0 | 112 | 4.490 us | 64.750 us | 0 | 0 | 0 | 15 |
| thread_safe_contention_batched_write_bursts_2 | 9 | 21 | 0 | 8 | 0 | 0 | 0 | 188 | 66.861 us | 109.832 us | 0 | 0 | 0 | 20 |
| thread_safe_contention_batched_write_bursts_4 | 17 | 37 | 0 | 16 | 0 | 0 | 0 | 389 | 432.502 us | 282.893 us | 0 | 0 | 0 | 38 |
| thread_safe_contention_batched_write_bursts_8 | 33 | 10 | 0 | 32 | 0 | 0 | 0 | 400 | 2.433 ms | 298.504 us | 0 | 0 | 0 | 9 |
| thread_safe_contention_batched_write_bursts_16 | 65 | 3 | 0 | 64 | 0 | 0 | 0 | 717 | 10.003 ms | 462.403 us | 0 | 0 | 0 | 2 |
| thread_safe_effect_contention_queue_coalescing_8 | 33 | 0 | 0 | 32 | 0 | 8 | 1 | 406 | 2.018 ms | 269.084 us | 0 | 0 | 0 | 0 |
| thread_safe_effect_contention_queue_coalescing_16 | 65 | 0 | 0 | 64 | 0 | 10 | 1 | 775 | 9.282 ms | 531.114 us | 0 | 0 | 0 | 0 |
| thread_safe_effect_contention_cleanup_execution_8 | 9 | 0 | 0 | 8 | 8 | 32 | 1 | 408 | 1.845 ms | 190.321 us | 0 | 0 | 127 | 0 |
| thread_safe_effect_contention_cleanup_execution_16 | 17 | 0 | 0 | 16 | 16 | 32 | 1 | 696 | 7.538 ms | 305.824 us | 0 | 0 | 255 | 0 |
| thread_safe_effect_contention_batch_flush_8 | 34 | 3 | 0 | 33 | 0 | 3 | 1 | 634 | 2.638 ms | 269.453 us | 0 | 0 | 0 | 2 |
| thread_safe_effect_contention_batch_flush_16 | 66 | 2 | 0 | 65 | 0 | 3 | 1 | 1239 | 21.843 ms | 617.275 us | 0 | 0 | 0 | 1 |
| thread_safe_graph_propagation_fan_out_eager_validation_8 | 34 | 549 | 0 | 64 | 0 | 49 | 1 | 1132 | 24.084 ms | 6.033 ms | 10 | 320 | 118 | 4096 |
| thread_safe_graph_propagation_fan_out_eager_validation_16 | 34 | 555 | 0 | 64 | 0 | 49 | 1 | 1380 | 117.009 ms | 11.557 ms | 17 | 544 | 239 | 8192 |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8 | 33 | 64 | 0 | 32 | 0 | 0 | 0 | 226 | 10.090 us | 76.281 us | 128 | 4096 | 0 | 4096 |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16 | 33 | 64 | 0 | 32 | 0 | 0 | 0 | 226 | 9.670 us | 69.370 us | 256 | 8192 | 0 | 8192 |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8 | 65 | 66 | 0 | 64 | 0 | 0 | 0 | 328 | 11.840 us | 105.540 us | 508 | 540 | 0 | 572 |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16 | 129 | 130 | 0 | 128 | 0 | 0 | 0 | 648 | 27.661 us | 197.172 us | 1020 | 1084 | 0 | 1148 |
| thread_safe_graph_propagation_fan_in_batched_flush_8 | 66 | 66 | 0 | 65 | 0 | 3 | 1 | 605 | 2.600 ms | 397.774 us | 0 | 0 | 0 | 73 |
| thread_safe_graph_propagation_fan_in_batched_flush_16 | 130 | 135 | 0 | 129 | 0 | 5 | 1 | 1319 | 7.690 ms | 744.236 us | 0 | 0 | 0 | 170 |
ThreadSafe lock attribution for contention profiles:
| Profile | Site | Lock acquisitions | Lock wait | Lock hold |
|---|---|---|---|---|
| thread_safe_set_cell_invalidation_same_slot_contention_1 | other | 4 | 110.000 ns | 1.830 us |
| thread_safe_set_cell_invalidation_same_slot_contention_1 | get_refresh | 2 | 90.000 ns | 230.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_1 | dependency_edge | 1 | 50.000 ns | 480.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_1 | publish | 1 | 40.000 ns | 340.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_2 | other | 4 | 110.000 ns | 240.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_2 | get_refresh | 2 | 100.000 ns | 200.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_2 | dependency_edge | 1 | 30.000 ns | 400.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_2 | publish | 1 | 40.000 ns | 330.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_4 | other | 4 | 120.000 ns | 210.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_4 | get_refresh | 2 | 90.000 ns | 160.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_4 | dependency_edge | 1 | 30.000 ns | 370.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_4 | publish | 1 | 40.000 ns | 320.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_8 | other | 4 | 120.000 ns | 240.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_8 | get_refresh | 2 | 90.000 ns | 200.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_8 | dependency_edge | 1 | 20.000 ns | 340.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_8 | publish | 1 | 50.000 ns | 330.000 ns |
| thread_safe_set_cell_invalidation_same_slot_contention_16 | other | 4 | 180.000 ns | 1.320 us |
| thread_safe_set_cell_invalidation_same_slot_contention_16 | get_refresh | 2 | 130.000 ns | 1.360 us |
| thread_safe_set_cell_invalidation_same_slot_contention_16 | dependency_edge | 1 | 20.000 ns | 1.190 us |
| thread_safe_set_cell_invalidation_same_slot_contention_16 | publish | 1 | 50.000 ns | 1.210 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_1 | other | 4 | 180.000 ns | 2.550 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_1 | get_refresh | 2 | 130.000 ns | 2.340 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_1 | dependency_edge | 1 | 20.000 ns | 1.670 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_1 | publish | 1 | 50.000 ns | 1.380 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_2 | other | 8 | 250.000 ns | 590.000 ns |
| thread_safe_set_cell_invalidation_independent_slot_contention_2 | get_refresh | 4 | 220.000 ns | 410.000 ns |
| thread_safe_set_cell_invalidation_independent_slot_contention_2 | dependency_edge | 2 | 50.000 ns | 710.000 ns |
| thread_safe_set_cell_invalidation_independent_slot_contention_2 | publish | 2 | 90.000 ns | 700.000 ns |
| thread_safe_set_cell_invalidation_independent_slot_contention_4 | other | 16 | 510.000 ns | 1.500 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_4 | get_refresh | 8 | 340.000 ns | 680.000 ns |
| thread_safe_set_cell_invalidation_independent_slot_contention_4 | dependency_edge | 4 | 100.000 ns | 1.330 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_4 | publish | 4 | 190.000 ns | 1.410 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_8 | other | 32 | 950.000 ns | 3.330 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_8 | get_refresh | 16 | 680.000 ns | 1.260 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_8 | dependency_edge | 8 | 250.000 ns | 2.600 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_8 | publish | 8 | 390.000 ns | 2.740 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_16 | other | 64 | 1.990 us | 5.380 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_16 | get_refresh | 32 | 1.390 us | 2.460 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_16 | dependency_edge | 16 | 430.000 ns | 5.200 us |
| thread_safe_set_cell_invalidation_independent_slot_contention_16 | publish | 16 | 760.000 ns | 5.770 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_1 | other | 74 | 2.770 us | 15.940 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_1 | get_refresh | 2 | 80.000 ns | 290.000 ns |
| thread_safe_set_cell_invalidation_batched_write_bursts_1 | dependency_edge | 4 | 170.000 ns | 1.760 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_1 | set_cell_invalidation | 16 | 590.000 ns | 42.450 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_1 | publish | 1 | 40.000 ns | 420.000 ns |
| thread_safe_set_cell_invalidation_batched_write_bursts_2 | other | 126 | 186.900 us | 41.250 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_2 | get_refresh | 2 | 80.000 ns | 180.000 ns |
| thread_safe_set_cell_invalidation_batched_write_bursts_2 | dependency_edge | 8 | 340.000 ns | 6.010 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_2 | set_cell_invalidation | 22 | 6.460 us | 58.951 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_2 | publish | 1 | 50.000 ns | 350.000 ns |
| thread_safe_set_cell_invalidation_batched_write_bursts_4 | other | 174 | 434.615 us | 99.261 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_4 | get_refresh | 2 | 80.000 ns | 190.000 ns |
| thread_safe_set_cell_invalidation_batched_write_bursts_4 | dependency_edge | 16 | 700.000 ns | 7.690 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_4 | set_cell_invalidation | 6 | 3.930 us | 43.400 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_4 | publish | 1 | 50.000 ns | 360.000 ns |
| thread_safe_set_cell_invalidation_batched_write_bursts_8 | other | 330 | 2.187 ms | 199.591 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_8 | get_refresh | 2 | 130.000 ns | 1.410 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_8 | dependency_edge | 32 | 1.600 us | 17.200 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_8 | set_cell_invalidation | 4 | 200.000 ns | 11.920 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_8 | publish | 1 | 50.000 ns | 1.260 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_16 | other | 646 | 9.400 ms | 384.152 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_16 | get_refresh | 2 | 90.000 ns | 280.000 ns |
| thread_safe_set_cell_invalidation_batched_write_bursts_16 | dependency_edge | 64 | 3.180 us | 34.280 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_16 | set_cell_invalidation | 2 | 90.000 ns | 13.590 us |
| thread_safe_set_cell_invalidation_batched_write_bursts_16 | publish | 1 | 50.000 ns | 400.000 ns |
| thread_safe_contention_same_slot_write_read_1 | other | 4 | 130.000 ns | 770.000 ns |
| thread_safe_contention_same_slot_write_read_1 | get_refresh | 2 | 100.000 ns | 510.000 ns |
| thread_safe_contention_same_slot_write_read_1 | dependency_edge | 1 | 30.000 ns | 570.000 ns |
| thread_safe_contention_same_slot_write_read_1 | publish | 17 | 820.000 ns | 15.660 us |
| thread_safe_contention_same_slot_write_read_2 | other | 22 | 3.030 us | 990.000 ns |
| thread_safe_contention_same_slot_write_read_2 | get_refresh | 2 | 90.000 ns | 190.000 ns |
| thread_safe_contention_same_slot_write_read_2 | dependency_edge | 1 | 30.000 ns | 360.000 ns |
| thread_safe_contention_same_slot_write_read_2 | set_cell_invalidation | 9 | 1.920 us | 13.920 us |
| thread_safe_contention_same_slot_write_read_2 | publish | 28 | 11.540 us | 30.950 us |
| thread_safe_contention_same_slot_write_read_2 | in_flight_wait | 15 | 0.000 ns | 0.000 ns |
| thread_safe_contention_same_slot_write_read_4 | other | 63 | 45.511 us | 2.670 us |
| thread_safe_contention_same_slot_write_read_4 | get_refresh | 2 | 80.000 ns | 150.000 ns |
| thread_safe_contention_same_slot_write_read_4 | dependency_edge | 1 | 30.000 ns | 330.000 ns |
| thread_safe_contention_same_slot_write_read_4 | set_cell_invalidation | 31 | 30.140 us | 43.440 us |
| thread_safe_contention_same_slot_write_read_4 | publish | 52 | 51.981 us | 60.530 us |
| thread_safe_contention_same_slot_write_read_4 | in_flight_wait | 77 | 0.000 ns | 0.000 ns |
| thread_safe_contention_same_slot_write_read_8 | other | 117 | 62.331 us | 3.711 us |
| thread_safe_contention_same_slot_write_read_8 | get_refresh | 12 | 12.790 us | 6.830 us |
| thread_safe_contention_same_slot_write_read_8 | dependency_edge | 1 | 20.000 ns | 350.000 ns |
| thread_safe_contention_same_slot_write_read_8 | set_cell_invalidation | 58 | 70.113 us | 62.631 us |
| thread_safe_contention_same_slot_write_read_8 | publish | 111 | 62.652 us | 126.741 us |
| thread_safe_contention_same_slot_write_read_8 | in_flight_wait | 150 | 0.000 ns | 0.000 ns |
| thread_safe_contention_same_slot_write_read_16 | other | 229 | 407.693 us | 8.420 us |
| thread_safe_contention_same_slot_write_read_16 | get_refresh | 16 | 12.330 us | 5.900 us |
| thread_safe_contention_same_slot_write_read_16 | dependency_edge | 1 | 20.000 ns | 1.130 us |
| thread_safe_contention_same_slot_write_read_16 | set_cell_invalidation | 118 | 558.902 us | 150.022 us |
| thread_safe_contention_same_slot_write_read_16 | publish | 211 | 152.932 us | 264.392 us |
| thread_safe_contention_same_slot_write_read_16 | in_flight_wait | 287 | 0.000 ns | 0.000 ns |
| thread_safe_contention_independent_slots_1 | other | 4 | 140.000 ns | 1.130 us |
| thread_safe_contention_independent_slots_1 | get_refresh | 2 | 90.000 ns | 310.000 ns |
| thread_safe_contention_independent_slots_1 | dependency_edge | 1 | 20.000 ns | 1.180 us |
| thread_safe_contention_independent_slots_1 | publish | 16 | 730.000 ns | 12.560 us |
| thread_safe_contention_independent_slots_2 | other | 36 | 2.640 us | 1.450 us |
| thread_safe_contention_independent_slots_2 | get_refresh | 4 | 200.000 ns | 420.000 ns |
| thread_safe_contention_independent_slots_2 | dependency_edge | 2 | 50.000 ns | 710.000 ns |
| thread_safe_contention_independent_slots_2 | set_cell_invalidation | 14 | 7.690 us | 10.741 us |
| thread_safe_contention_independent_slots_2 | publish | 33 | 8.571 us | 26.170 us |
| thread_safe_contention_independent_slots_4 | other | 82 | 103.730 us | 3.910 us |
| thread_safe_contention_independent_slots_4 | get_refresh | 8 | 380.000 ns | 650.000 ns |
| thread_safe_contention_independent_slots_4 | dependency_edge | 4 | 100.000 ns | 1.320 us |
| thread_safe_contention_independent_slots_4 | set_cell_invalidation | 38 | 203.262 us | 32.581 us |
| thread_safe_contention_independent_slots_4 | publish | 67 | 109.091 us | 57.260 us |
| thread_safe_contention_independent_slots_8 | other | 190 | 1.107 ms | 10.290 us |
| thread_safe_contention_independent_slots_8 | get_refresh | 16 | 740.000 ns | 1.310 us |
| thread_safe_contention_independent_slots_8 | dependency_edge | 8 | 220.000 ns | 2.810 us |
| thread_safe_contention_independent_slots_8 | set_cell_invalidation | 116 | 1.523 ms | 130.711 us |
| thread_safe_contention_independent_slots_8 | publish | 135 | 1.625 ms | 151.110 us |
| thread_safe_contention_independent_slots_16 | other | 374 | 5.514 ms | 20.750 us |
| thread_safe_contention_independent_slots_16 | get_refresh | 32 | 1.400 us | 2.970 us |
| thread_safe_contention_independent_slots_16 | dependency_edge | 16 | 430.000 ns | 6.240 us |
| thread_safe_contention_independent_slots_16 | set_cell_invalidation | 246 | 6.024 ms | 288.392 us |
| thread_safe_contention_independent_slots_16 | publish | 271 | 8.367 ms | 312.202 us |
| thread_safe_contention_read_mostly_waiters_1 | other | 4 | 130.000 ns | 1.430 us |
| thread_safe_contention_read_mostly_waiters_1 | get_refresh | 2 | 140.000 ns | 880.000 ns |
| thread_safe_contention_read_mostly_waiters_1 | dependency_edge | 1 | 20.000 ns | 1.180 us |
| thread_safe_contention_read_mostly_waiters_1 | publish | 17 | 820.000 ns | 13.850 us |
| thread_safe_contention_read_mostly_waiters_2 | other | 4 | 130.000 ns | 310.000 ns |
| thread_safe_contention_read_mostly_waiters_2 | get_refresh | 2 | 90.000 ns | 180.000 ns |
| thread_safe_contention_read_mostly_waiters_2 | dependency_edge | 1 | 20.000 ns | 380.000 ns |
| thread_safe_contention_read_mostly_waiters_2 | publish | 17 | 790.000 ns | 13.660 us |
| thread_safe_contention_read_mostly_waiters_4 | other | 4 | 130.000 ns | 250.000 ns |
| thread_safe_contention_read_mostly_waiters_4 | get_refresh | 8 | 1.560 us | 2.200 us |
| thread_safe_contention_read_mostly_waiters_4 | dependency_edge | 1 | 20.000 ns | 360.000 ns |
| thread_safe_contention_read_mostly_waiters_4 | publish | 17 | 790.000 ns | 24.120 us |
| thread_safe_contention_read_mostly_waiters_4 | in_flight_wait | 11 | 0.000 ns | 0.000 ns |
| thread_safe_contention_read_mostly_waiters_8 | other | 4 | 150.000 ns | 390.000 ns |
| thread_safe_contention_read_mostly_waiters_8 | get_refresh | 16 | 14.830 us | 5.350 us |
| thread_safe_contention_read_mostly_waiters_8 | dependency_edge | 1 | 30.000 ns | 360.000 ns |
| thread_safe_contention_read_mostly_waiters_8 | publish | 17 | 7.940 us | 25.690 us |
| thread_safe_contention_read_mostly_waiters_8 | in_flight_wait | 21 | 0.000 ns | 0.000 ns |
| thread_safe_contention_read_mostly_waiters_16 | other | 4 | 180.000 ns | 1.330 us |
| thread_safe_contention_read_mostly_waiters_16 | get_refresh | 22 | 211.850 us | 16.180 us |
| thread_safe_contention_read_mostly_waiters_16 | dependency_edge | 1 | 30.000 ns | 1.190 us |
| thread_safe_contention_read_mostly_waiters_16 | publish | 17 | 5.110 us | 95.771 us |
| thread_safe_contention_read_mostly_waiters_16 | in_flight_wait | 39 | 0.000 ns | 0.000 ns |
| thread_safe_contention_batched_write_bursts_1 | other | 74 | 2.790 us | 14.810 us |
| thread_safe_contention_batched_write_bursts_1 | get_refresh | 2 | 110.000 ns | 340.000 ns |
| thread_safe_contention_batched_write_bursts_1 | dependency_edge | 4 | 170.000 ns | 1.790 us |
| thread_safe_contention_batched_write_bursts_1 | set_cell_invalidation | 16 | 670.000 ns | 29.360 us |
| thread_safe_contention_batched_write_bursts_1 | publish | 16 | 750.000 ns | 18.450 us |
| thread_safe_contention_batched_write_bursts_2 | other | 124 | 52.991 us | 31.280 us |
| thread_safe_contention_batched_write_bursts_2 | get_refresh | 2 | 90.000 ns | 180.000 ns |
| thread_safe_contention_batched_write_bursts_2 | dependency_edge | 8 | 320.000 ns | 4.270 us |
| thread_safe_contention_batched_write_bursts_2 | set_cell_invalidation | 21 | 9.380 us | 44.482 us |
| thread_safe_contention_batched_write_bursts_2 | publish | 21 | 4.080 us | 29.620 us |
| thread_safe_contention_batched_write_bursts_2 | in_flight_wait | 12 | 0.000 ns | 0.000 ns |
| thread_safe_contention_batched_write_bursts_4 | other | 237 | 388.651 us | 83.570 us |
| thread_safe_contention_batched_write_bursts_4 | get_refresh | 10 | 32.921 us | 4.740 us |
| thread_safe_contention_batched_write_bursts_4 | dependency_edge | 16 | 740.000 ns | 7.550 us |
| thread_safe_contention_batched_write_bursts_4 | set_cell_invalidation | 38 | 3.930 us | 94.652 us |
| thread_safe_contention_batched_write_bursts_4 | publish | 37 | 6.260 us | 92.381 us |
| thread_safe_contention_batched_write_bursts_4 | in_flight_wait | 51 | 0.000 ns | 0.000 ns |
| thread_safe_contention_batched_write_bursts_8 | other | 340 | 2.431 ms | 199.842 us |
| thread_safe_contention_batched_write_bursts_8 | get_refresh | 2 | 90.000 ns | 390.000 ns |
| thread_safe_contention_batched_write_bursts_8 | dependency_edge | 32 | 1.560 us | 15.721 us |
| thread_safe_contention_batched_write_bursts_8 | set_cell_invalidation | 9 | 480.000 ns | 49.211 us |
| thread_safe_contention_batched_write_bursts_8 | publish | 10 | 540.000 ns | 33.340 us |
| thread_safe_contention_batched_write_bursts_8 | in_flight_wait | 7 | 0.000 ns | 0.000 ns |
| thread_safe_contention_batched_write_bursts_16 | other | 646 | 9.999 ms | 389.373 us |
| thread_safe_contention_batched_write_bursts_16 | get_refresh | 2 | 120.000 ns | 1.270 us |
| thread_safe_contention_batched_write_bursts_16 | dependency_edge | 64 | 3.040 us | 34.950 us |
| thread_safe_contention_batched_write_bursts_16 | set_cell_invalidation | 2 | 110.000 ns | 17.680 us |
| thread_safe_contention_batched_write_bursts_16 | publish | 3 | 150.000 ns | 19.130 us |
| thread_safe_effect_contention_queue_coalescing_8 | other | 365 | 2.016 ms | 199.513 us |
| thread_safe_effect_contention_queue_coalescing_8 | dependency_edge | 32 | 1.420 us | 11.631 us |
| thread_safe_effect_contention_queue_coalescing_8 | set_cell_invalidation | 9 | 520.000 ns | 57.940 us |
| thread_safe_effect_contention_queue_coalescing_16 | other | 702 | 9.279 ms | 470.073 us |
| thread_safe_effect_contention_queue_coalescing_16 | dependency_edge | 64 | 2.810 us | 24.800 us |
| thread_safe_effect_contention_queue_coalescing_16 | set_cell_invalidation | 9 | 440.000 ns | 36.241 us |
| thread_safe_effect_contention_cleanup_execution_8 | other | 265 | 599.415 us | 62.861 us |
| thread_safe_effect_contention_cleanup_execution_8 | dependency_edge | 16 | 800.000 ns | 11.120 us |
| thread_safe_effect_contention_cleanup_execution_8 | set_cell_invalidation | 127 | 1.245 ms | 116.340 us |
| thread_safe_effect_contention_cleanup_execution_16 | other | 409 | 3.289 ms | 93.061 us |
| thread_safe_effect_contention_cleanup_execution_16 | dependency_edge | 32 | 1.530 us | 13.520 us |
| thread_safe_effect_contention_cleanup_execution_16 | set_cell_invalidation | 255 | 4.247 ms | 199.243 us |
| thread_safe_effect_contention_batch_flush_8 | other | 594 | 2.636 ms | 226.842 us |
| thread_safe_effect_contention_batch_flush_8 | get_refresh | 2 | 110.000 ns | 840.000 ns |
| thread_safe_effect_contention_batch_flush_8 | dependency_edge | 33 | 1.500 us | 17.470 us |
| thread_safe_effect_contention_batch_flush_8 | set_cell_invalidation | 2 | 150.000 ns | 16.160 us |
| thread_safe_effect_contention_batch_flush_8 | publish | 3 | 170.000 ns | 8.141 us |
| thread_safe_effect_contention_batch_flush_16 | other | 1169 | 21.839 ms | 558.475 us |
| thread_safe_effect_contention_batch_flush_16 | get_refresh | 2 | 110.000 ns | 270.000 ns |
| thread_safe_effect_contention_batch_flush_16 | dependency_edge | 65 | 3.080 us | 35.400 us |
| thread_safe_effect_contention_batch_flush_16 | set_cell_invalidation | 1 | 60.000 ns | 12.710 us |
| thread_safe_effect_contention_batch_flush_16 | publish | 2 | 100.000 ns | 10.420 us |
| thread_safe_graph_propagation_fan_out_eager_validation_8 | other | 337 | 5.266 ms | 166.412 us |
| thread_safe_graph_propagation_fan_out_eager_validation_8 | get_refresh | 64 | 3.000 us | 5.140 us |
| thread_safe_graph_propagation_fan_out_eager_validation_8 | dependency_edge | 64 | 2.280 us | 29.920 us |
| thread_safe_graph_propagation_fan_out_eager_validation_8 | set_cell_invalidation | 118 | 14.440 ms | 5.280 ms |
| thread_safe_graph_propagation_fan_out_eager_validation_8 | publish | 549 | 4.373 ms | 552.165 us |
| thread_safe_graph_propagation_fan_out_eager_validation_16 | other | 458 | 30.162 ms | 173.621 us |
| thread_safe_graph_propagation_fan_out_eager_validation_16 | get_refresh | 64 | 3.040 us | 5.550 us |
| thread_safe_graph_propagation_fan_out_eager_validation_16 | dependency_edge | 64 | 2.500 us | 30.240 us |
| thread_safe_graph_propagation_fan_out_eager_validation_16 | set_cell_invalidation | 239 | 76.647 ms | 10.794 ms |
| thread_safe_graph_propagation_fan_out_eager_validation_16 | publish | 555 | 10.195 ms | 553.344 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8 | other | 66 | 3.290 us | 7.310 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8 | get_refresh | 64 | 2.820 us | 5.830 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8 | dependency_edge | 32 | 900.000 ns | 18.280 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_8 | publish | 64 | 3.080 us | 44.861 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16 | other | 66 | 2.760 us | 5.380 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16 | get_refresh | 64 | 2.860 us | 5.660 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16 | dependency_edge | 32 | 880.000 ns | 18.490 us |
| thread_safe_graph_propagation_fan_out_lazy_dirty_epochs_16 | publish | 64 | 3.170 us | 39.840 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8 | other | 130 | 3.800 us | 9.170 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8 | get_refresh | 68 | 3.620 us | 10.250 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8 | dependency_edge | 64 | 2.440 us | 28.580 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_8 | publish | 66 | 1.980 us | 57.540 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16 | other | 258 | 12.270 us | 18.010 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16 | get_refresh | 132 | 6.440 us | 12.920 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16 | dependency_edge | 128 | 5.030 us | 60.300 us |
| thread_safe_graph_propagation_fan_in_lazy_dirty_epochs_16 | publish | 130 | 3.921 us | 105.942 us |
| thread_safe_graph_propagation_fan_in_batched_flush_8 | other | 403 | 2.572 ms | 216.262 us |
| thread_safe_graph_propagation_fan_in_batched_flush_8 | get_refresh | 68 | 3.290 us | 8.300 us |
| thread_safe_graph_propagation_fan_in_batched_flush_8 | dependency_edge | 65 | 2.350 us | 28.090 us |
| thread_safe_graph_propagation_fan_in_batched_flush_8 | set_cell_invalidation | 3 | 5.310 us | 87.820 us |
| thread_safe_graph_propagation_fan_in_batched_flush_8 | publish | 66 | 16.150 us | 57.302 us |
| thread_safe_graph_propagation_fan_in_batched_flush_16 | other | 795 | 7.602 ms | 346.913 us |
| thread_safe_graph_propagation_fan_in_batched_flush_16 | get_refresh | 254 | 15.790 us | 29.590 us |
| thread_safe_graph_propagation_fan_in_batched_flush_16 | dependency_edge | 129 | 4.860 us | 58.880 us |
| thread_safe_graph_propagation_fan_in_batched_flush_16 | set_cell_invalidation | 6 | 2.760 us | 186.002 us |
| thread_safe_graph_propagation_fan_in_batched_flush_16 | publish | 135 | 65.141 us | 122.851 us |
Multi-Language
lazily is implemented across three languages with shared semantics:
| lazily-rs | lazily-zig | lazily-py | |
|---|---|---|---|
| Context | Owned Context struct | Explicit allocator | Plain dict |
| Slot creation | Box<dyn Fn> closures | comptime function pointers | Lambdas |
| Cell equality | PartialEq trait | std.meta.eql | != operator |
| Thread safety | Single-threaded Context; explicit ThreadSafeContext | Mutex by default | GIL |
| Storage | Unified generics | .direct / .indirect | Object identity |
Related
- lazily-zig — Zig implementation with FFI support
- lazily-py — Python implementation with context-as-dict
- Blog post: Lazily — Reactive Primitives Done Right
License
MIT
lazily v0.11.0
Prepared for the operator publish (#12b1). Last published: crates.io 0.10.3.
Tag v0.11.0 is intended to point at this release commit on main.
Highlights
This minor lands eager Signal primitives across the single-threaded, thread-safe, and async contexts, plus a full WebRTC DataChannel transport stack on top of the v0.10.x reactive core: a sans-IO str0m backend, a real-socket networked backend, a WebSocket fallback, signaling, and the glue that drives a complete handshake end to end.
Features
- #3dmm / #x7sp - add
SignalHandle,ThreadSafeSignalHandle, andAsyncSignalHandleas eager derived values backed by memo slots plus puller effects. Signals provide always-materializedv1 -> v2updates with no observable unset window, inherit memo equality suppression, and are documented in the README, SPEC, PROTOCOL, and mdBook docs. - #lzwebrtcwire — wire
SignalingClienttoStr0mNet. Newwebrtc_signalingmodule (offer_to_peer/answer_next_offer) owns the full SDP offer/answer + trickled-ICE handshake overSignalingClient, pumping frames intoaccept_answer/add_remote_candidateuntil the data channel opens. Integration test brokers two realSignalingClientWebSocket peers through an in-process #yxjw-protocol loopback relay and proves a permission-filteredSnapshotcrosses the negotiated channel. - #lzwebrtcnet — networked str0m
DataChannelbackend (Str0mNet) over a real UDP socket with the str0m DTLS/SCTP/ICE driver. - #97xn — multi-channel reactive bridge hub.
- #akp3 — WebSocket
DataChannelbackend (in-process loopback over a real WS handshake). - #webrtcbackend — concrete sans-IO str0m
DataChannelbackend. - #webrtc2 / #webrtc3 — WebRTC
DataChannelIPC transport abstraction, loopback integration tests, and Criterion benchmarks.
CI / tests
- #lzleanmodel - CI now builds the sibling Lean formal model so protocol invariants stay checked alongside the Rust suite.
- #lzspecconf — IPC conformance run against the canonical lazily-spec fixtures.
- #k03k / #lzasync — deterministic async resolve-loop window coverage.
Remaining (operator-gated)
- Live two-host / NAT validation of
Str0mNetthrough the deployed #yxjw Cloudflare Worker (#lzwebrtcnet-e2e, part of #h6qb) — cannot be done in CI.
Publish checklist (#12b1)
cargo publish(dry-run verified clean: 72 files, 233 KiB compressed).gh release create v0.11.0 --notes-file RELEASE_NOTES_v0.11.0.md --title "lazily v0.11.0".- Rotate the crates.io token if expired before step 1.