The Offset System¶
The Challenge of Time Representation¶
In digital signal processing, we often work with data sampled at specific rates and need to synchronize multiple streams. Representing time precisely in this setting is harder than it first appears:
- Floating-point precision: Representing time as floating-point seconds leads to rounding errors. At 2048 Hz the inter-sample interval is 0.00048828125 seconds. That value is a power of two and so happens to be exact in binary, but a 64-bit float carries only about 16 significant decimal digits, and most decimal quantities (0.1 s, for example) have no exact binary representation. Once timestamps grow large or are built up by repeated arithmetic, rounding accumulates; after millions of samples (hours of data) the error can drift past a sample period. For pipelines designed to run for days or longer, this is unacceptable.
- Integer nanosecond overflow: Integer nanoseconds avoid rounding but introduce alignment problems. At 2048 Hz the sample period is 1/2048 s = 488281.25 ns, which is not an integer, so nanosecond timestamps cannot represent every sample point exactly. They also require very large integers for long time spans.
- Sample-count ambiguity: Integer sample counts are exact for a single rate, but don't generalize. Sample 1000 at 2048 Hz and sample 2000 at 4096 Hz represent the same moment, but there is no obvious way to compare them without knowing both rates.
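The three pitfalls can be checked directly in a few lines. This is only an illustrative sketch using Python's standard library; the variable names are made up for the example and are not part of any offset API.

```python
from fractions import Fraction

# Sample period at 2048 Hz in nanoseconds, kept as an exact rational number.
period_ns = Fraction(10**9, 2048)
whole_ns = period_ns.denominator == 1  # False: 488281.25 ns is not an integer

# Sample 1000 at 2048 Hz and sample 2000 at 4096 Hz are the same instant,
# but that only shows once both rates are brought into the comparison.
t_a = Fraction(1000, 2048)
t_b = Fraction(2000, 4096)
same_instant = t_a == t_b

# Floating-point accumulation: repeatedly adding a step that has no exact
# binary representation (0.1 s here) does not land on the exact total.
acc = 0.0
for _ in range(10):
    acc += 0.1
float_drift = acc != 1.0

print(float(period_ns), whole_ns, same_instant, float_drift)
```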
The Offset Solution¶
An offset is a sample count at the maximum sample rate (MAX_RATE, 16384
by default). Because all allowed sample rates are power-of-2 divisors of
MAX_RATE, every sample at every rate falls exactly on an integer offset.
This gives a single integer timeline that is:
- Exact — no floating-point rounding, no fractional nanoseconds
- Universal — the same offset value means the same moment regardless of sample rate
- Compact — smaller integers than nanoseconds for the same time span
Key properties:
- One second of data always spans MAX_RATE offset units (16384 by default), regardless of the actual sample rate
- Converting between offsets and samples at any allowed rate is always an exact integer division
- Two buffers at different sample rates that cover the same time span have identical offset and offset_end values
For example, 1 second of data:
- At 2048 Hz: 2048 samples, offset span = 16384
- At 4096 Hz: 4096 samples, offset span = 16384
- At 512 Hz: 512 samples, offset span = 16384
This makes alignment across channels trivial — just compare offsets.
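The arithmetic behind these numbers can be sketched as follows. The helper names samples_to_offset and offset_to_samples are hypothetical, written only for this illustration; they are not the library's conversion methods, and the only value taken from the text is the default MAX_RATE of 16384.

```python
MAX_RATE = 16384  # default maximum sample rate stated in the text


def _check_rate(rate: int) -> None:
    """Reject rates that are not power-of-2 divisors of MAX_RATE."""
    if rate <= 0 or rate & (rate - 1) or MAX_RATE % rate:
        raise ValueError(f"rate {rate} is not a power-of-2 divisor of {MAX_RATE}")


def samples_to_offset(n_samples: int, rate: int) -> int:
    """Hypothetical helper: sample count at `rate` -> offset units."""
    _check_rate(rate)
    return n_samples * (MAX_RATE // rate)


def offset_to_samples(offset: int, rate: int) -> int:
    """Hypothetical helper: offset units -> sample count at `rate`."""
    _check_rate(rate)
    step = MAX_RATE // rate  # offset units between consecutive samples
    if offset % step:
        raise ValueError(f"offset {offset} is not on the {rate} Hz sample grid")
    return offset // step


# One second of data at each rate spans the same number of offset units.
spans = {rate: samples_to_offset(rate, rate) for rate in (512, 2048, 4096)}
print(spans)

# The ambiguous pair from earlier maps to one and the same offset.
print(samples_to_offset(1000, 2048), samples_to_offset(2000, 4096))
```

Because both conversions are integer multiplications or divisions by a power of two, no rounding step is involved anywhere in the round trip.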
Visualizing Offset Alignment¶
To see why this works, imagine a simplified system with MAX_RATE=16. One
second of data spans 16 offset units, and every sample at every rate falls
exactly on an offset boundary:
offsets         |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
                0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
sample rate 16  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x
sample rate 8   x     x     x     x     x     x     x     x     x
sample rate 4   x           x           x           x           x
A sample at rate 8 lands on every 2nd offset, a sample at rate 4 on every 4th,
and so on. The same principle applies at the real default of MAX_RATE=16384 —
the scale is larger but the alignment guarantee is identical.
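The toy grid can also be generated programmatically. This is a minimal sketch with a hypothetical helper, sample_offsets, that lists the offsets hit by each rate over one second when MAX_RATE is 16.

```python
TOY_MAX_RATE = 16  # simplified value from the diagram, not the real default


def sample_offsets(rate: int, max_rate: int = TOY_MAX_RATE) -> list[int]:
    """Offsets hit by samples at `rate` over one second, end point included."""
    step = max_rate // rate  # offset units between consecutive samples
    return list(range(0, max_rate + 1, step))


grid = {rate: sample_offsets(rate) for rate in (16, 8, 4)}
for rate, offsets in grid.items():
    print(f"rate {rate:>2}: {offsets}")

# Every coarser grid is a subset of every finer one, which is why buffers
# at different rates can share boundaries exactly.
nested = set(grid[4]) <= set(grid[8]) <= set(grid[16])
```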
Reference Time and Stride¶
Offsets are counted from a configurable reference time (offset_ref_start,
zero by default). For applications that need to anchor offsets to an absolute
epoch (e.g. GPS time), set this before creating any elements.
Source elements produce data in chunks whose size in offset units is controlled
by SAMPLE_STRIDE_AT_MAX_RATE (16384 by default — one second of data). The
Offset.sample_stride(rate) method converts this to samples at a given rate,
which is how sources know how many samples to produce per iteration.
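As a rough sketch of the stride conversion, the function below mirrors what the text says Offset.sample_stride(rate) does. Its body is an assumption based on that description, not the library's implementation, and chunk_bounds is a purely hypothetical helper added to show how successive chunks tile the offset timeline from a zero reference.

```python
MAX_RATE = 16384                   # default from the text
SAMPLE_STRIDE_AT_MAX_RATE = 16384  # default from the text: one second of data


def sample_stride(rate: int) -> int:
    """Samples per chunk at `rate` (assumed behaviour of Offset.sample_stride)."""
    if rate <= 0 or MAX_RATE % rate:
        raise ValueError(f"rate {rate} does not divide {MAX_RATE}")
    return SAMPLE_STRIDE_AT_MAX_RATE // (MAX_RATE // rate)


def chunk_bounds(index: int) -> tuple[int, int]:
    """Hypothetical helper: (offset, offset_end) of the index-th chunk."""
    start = index * SAMPLE_STRIDE_AT_MAX_RATE
    return start, start + SAMPLE_STRIDE_AT_MAX_RATE


for rate in (512, 2048, 16384):
    print(rate, "Hz ->", sample_stride(rate), "samples per chunk")
print(chunk_bounds(2))
```

With the default stride of one second, the number of samples per chunk equals the sample rate; a smaller stride would shrink every chunk proportionally while keeping chunk boundaries aligned across rates.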
Large Offset Values¶
For very long-running pipelines, offset values can grow large. The Offset
class handles this transparently: conversion methods such as Offset.tons
switch to integer arithmetic when values are large enough that floating-point
precision would be lost.
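To illustrate why integer arithmetic matters here, the sketch below compares a floating-point conversion from offsets to nanoseconds with an all-integer one. Both functions are hypothetical stand-ins written for this example; they are not the library's code, and the point at which the real class switches paths is not shown.

```python
MAX_RATE = 16384          # default from the text
NS_PER_SECOND = 10**9


def offset_to_ns_float(offset: int) -> int:
    """Hypothetical float path: fine for small offsets, lossy for large ones."""
    return round(offset / MAX_RATE * 1e9)


def offset_to_ns_int(offset: int) -> int:
    """Hypothetical integer path: exact at any magnitude (Python ints are unbounded)."""
    return offset * NS_PER_SECOND // MAX_RATE


# The offsets below are multiples of 32, so each maps to a whole number of ns.
small = 16384                   # one second after the reference
big = 16384 * 10**9 + 32        # about 10**9 seconds after the reference

small_agrees = offset_to_ns_float(small) == offset_to_ns_int(small)
big_agrees = offset_to_ns_float(big) == offset_to_ns_int(big)

print(small_agrees, big_agrees)
print(offset_to_ns_int(big))
```

At around 10^18 nanoseconds a 64-bit float can only represent multiples of 128, so the float path cannot return the exact answer for the large offset, while the integer path can.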
For practical usage of the offset API, see Offsets and Time.