A toy project that shows the flow, not the money. This is for learning about event-driven design and latency. Not trading advice!

What this is

YoctoTrader is a compact, readable pipeline:

Simulated Feed ──> Disruptor Ring ──> Strategy ──> Risk Gate ──> Order Publisher
                         │                                         │
                         └────────────── Latency Recorder <────────┘
  • Feed: Gaussian random-walk prices (no I/O).
  • Disruptor: single-producer ring buffer; reuses event objects to avoid GC churn.
  • Strategy: fast/slow moving-average crossover.
  • Risk Gate: simple position cap and order-rate throttle.
  • Order Publisher: prints orders (placeholder for an OMS/gateway).
  • Latency: end-to-end nanoseconds, summarized with HdrHistogram.

The code is intentionally minimal so you can profile, tweak, and see cause↔effect quickly.
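The fast/slow crossover at the heart of the Strategy stage can be sketched roughly like this (a minimal stand-in: the class name SmaCrossover and its method names are illustrative, not the repo's actual code):

```java
// Hypothetical sketch of a fast/slow moving-average crossover.
import java.util.ArrayDeque;

class SmaCrossover {
    private final ArrayDeque<Double> fast = new ArrayDeque<>();
    private final ArrayDeque<Double> slow = new ArrayDeque<>();
    private final int fastLen, slowLen;
    private double fastSum, slowSum;

    SmaCrossover(int fastLen, int slowLen) { this.fastLen = fastLen; this.slowLen = slowLen; }

    /** Returns +1 (BUY), -1 (SELL), or 0 (no signal) for a new price. */
    int onPrice(double px) {
        fastSum += push(fast, fastLen, px);
        slowSum += push(slow, slowLen, px);
        if (slow.size() < slowLen) return 0;            // still warming up
        double f = fastSum / fastLen, s = slowSum / slowLen;
        return f > s ? 1 : (f < s ? -1 : 0);
    }

    // Adds px, evicts the oldest value if full; returns the net change to the running sum.
    private static double push(ArrayDeque<Double> q, int cap, double px) {
        double delta = px;
        q.addLast(px);
        if (q.size() > cap) delta -= q.removeFirst();
        return delta;
    }

    public static void main(String[] args) {
        SmaCrossover strat = new SmaCrossover(2, 4);
        int last = 0;
        for (double px : new double[]{100, 101, 102, 103, 99, 95}) last = strat.onPrice(px);
        System.out.println(last);  // prints -1: falling prices pull the fast MA below the slow MA
    }
}
```

Running sums keep each tick O(1), which matters once you are pushing hundreds of thousands of ticks per second through the ring.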


The code

Get the code on GitHub: https://github.com/simocoder/yoctotrader

Docker Hub

docker run --rm -it simocoder/yoctotrader:latest

Or GHCR

docker run --rm -it ghcr.io/simocoder/yoctotrader:latest


Quick start

# build a shaded jar (all deps inside)
mvn -q clean package

# run
java -jar target/yoctotrader-1.0.0.jar

You’ll see frequent ORDER BUY/SELL lines and occasional latency summaries like:

e2e latency ns: p50=... p90=... p99=... max=... (count=...)

If you prefer Java-21 bytecode:

mvn -q -Pjava21 clean package

Reading the logs

Example:

[yoctotrader-20] INFO com.example.yoctotrader.engine.OrderPublisher - ORDER SELL qty=1 px=89.53 ts=74009571300808
  • yoctotrader-20 — thread name from the consumer.
  • ORDER SELL — strategy signal (fast MA below slow MA ⇒ SELL).
  • qty=1 — fixed size in this toy.
  • px=... — current simulated mid price.
  • ts=... — producer nanotime when the tick entered the ring (used for latency).

The stream is chatty because the random walk wiggles constantly and we process at a high tick rate. The RiskGate holds position within ±5 and limits order frequency (default 1 ms gap).
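The RiskGate's two checks (position cap and minimum gap) can be sketched like this (illustrative names; the repo's RiskGate may differ in detail):

```java
// Hypothetical sketch of the RiskGate behavior described above:
// reject orders that would push |position| past the cap, and enforce
// a minimum nanosecond gap between accepted orders.
class ThrottledRiskGate {
    private final int maxAbsPosition;
    private final long minGapNanos;
    private int position;
    private long lastOrderNanos = Long.MIN_VALUE / 2;  // effectively "long ago"

    ThrottledRiskGate(int maxAbsPosition, long minGapNanos) {
        this.maxAbsPosition = maxAbsPosition;
        this.minGapNanos = minGapNanos;
    }

    /** side: +1 BUY / -1 SELL. Returns true if the order passes both checks. */
    boolean allow(int side, long nowNanos) {
        if (Math.abs(position + side) > maxAbsPosition) return false;  // position cap
        if (nowNanos - lastOrderNanos < minGapNanos) return false;     // rate throttle
        position += side;
        lastOrderNanos = nowNanos;
        return true;
    }

    public static void main(String[] args) {
        ThrottledRiskGate gate = new ThrottledRiskGate(5, 1_000_000);  // ±5, 1 ms gap
        long t = 0;
        int accepted = 0;
        for (int i = 0; i < 10; i++)
            if (gate.allow(+1, t += 2_000_000)) accepted++;            // orders 2 ms apart
        System.out.println(accepted);  // prints 5: the position cap kicks in
    }
}
```

Note that both checks must pass before any state mutates, so a rejected order leaves position and the throttle clock untouched.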


Design notes (short)

  • Single-writer ring: Disruptor excels when one producer hands off to one (or a few) consumers, minimizing locks and cache misses.
  • Allocation discipline: MarketDataEvent instances are reused. The hot path avoids new allocations.
  • Time base: We stamp using System.nanoTime() at publish; the consumer computes e2e latency against “now.”
  • Separation of concerns: Strategy → Risk → Publisher are cleanly split so you can swap any piece without touching the others.
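The time-base point can be made concrete with a small sketch. Here a plain sorted array stands in for HdrHistogram (which the real project uses for percentile summaries), so this is an illustration of the stamp-at-publish / measure-at-consume pattern, not the repo's LatencyRecorder:

```java
// Minimal sketch of the time base described above: stamp System.nanoTime()
// at publish, compute end-to-end latency at consume, summarize percentiles.
import java.util.Arrays;

class LatencySketch {
    public static void main(String[] args) {
        long[] e2e = new long[1_000];
        for (int i = 0; i < e2e.length; i++) {
            long publishTs = System.nanoTime();      // producer stamps the tick
            // ... tick travels ring -> strategy -> risk -> publisher ...
            e2e[i] = System.nanoTime() - publishTs;  // consumer measures against "now"
        }
        Arrays.sort(e2e);
        long p50 = e2e[e2e.length / 2];
        long p99 = e2e[(int) (e2e.length * 0.99)];
        System.out.printf("e2e latency ns: p50=%d p99=%d max=%d (count=%d)%n",
                p50, p99, e2e[e2e.length - 1], e2e.length);
    }
}
```

Because producer and consumer read the same monotonic clock in the same process, no clock synchronization is needed; this breaks down the moment you split them across processes or hosts.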

Repo layout

src/main/java/com/example/yoctotrader/
  Main.java                      # wire-up and run-loop
  events/MarketDataEvent.java    # reusable event object
  feed/RandomWalkFeed.java       # Gaussian random-walk
  engine/Strategy.java           # MA(8) vs MA(32)
  engine/RiskGate.java           # position cap + min order gap
  engine/Order.java              # tiny order record
  engine/OrderPublisher.java     # logs instead of sending to an exchange
  engine/LatencyRecorder.java    # HdrHistogram percentiles
pom.xml                          # shaded jar, Java 17 by default
README.md

Build artifacts land in target/.


Parameters you’ll probably tweak

Open Main.java:

final long targetRatePerSec = 200_000;    // feed pressure
RiskGate risk = new RiskGate(5, 1_000_000); // max ±5; 1 ms min gap
Strategy strat = new Strategy(8, 32);     // fast vs slow MA
  • Less noise: set targetRatePerSec to 10_000, increase gap to 10_000_000 (10 ms), or widen the MAs (e.g., 16 vs 64).
  • More pressure: shrink the gap, increase feed rate, try BusySpin vs other wait strategies.

Experiments (suggested)

  1. Latency vs ring/wait strategy

    • Compare BusySpinWaitStrategy to YieldingWaitStrategy and BlockingWaitStrategy.
    • Watch p50/p99 change under the same feed rate.
  2. Tick pressure

    • Sweep targetRatePerSec (1k → 200k). Find the knee where p99 jumps.
  3. Smoother signal

    • Use bigger moving averages to reduce flips; track net position.
  4. Add position/PnL printouts

    • Add public double fastMA() / slowMA() getters and log them with each order.
  5. Thread pinning (Linux)

    • Run with taskset -c 2 java ... and pin producer/consumer to isolated cores.
  6. Profiling

    • -XX:StartFlightRecording=filename=jfr.jfr,dumponexit=true
    • Try async-profiler to verify no accidental allocations in the hot path.

Why Disruptor here?

  • Single-producer → single-consumer is common in feed/strategy handoffs.
  • Lock-free sequence claims + cache-friendly ring beats queues with locks in this pattern.
  • It makes back-pressure explicit: when the consumer lags, the producer stalls as soon as it reaches the slowest consumer's tail of the ring.

This is an educational fit; production systems may add batching, fan-out, IPC, kernel-bypass NICs, and binary encodings.

Disruptor, in plain terms

Disruptor (from LMAX Exchange) is a high-performance in-process messaging pattern built around a pre-allocated ring buffer and sequence counters instead of conventional blocking queues. It aims for very low latency and high throughput with minimal GC.

Why it’s fast

  • Pre-allocated events: the ring buffer is filled once; handlers reuse event objects → near-zero allocations on the hot path.
  • Sequences, not locks: producers/consumers coordinate with monotonic sequence numbers and memory fences (CAS/volatile), avoiding OS locks.
  • Single-writer principle: one producer per sequence stream removes contention on writes.
  • Pluggable waiting: consumers use a wait strategy (spin, yield, park, block) to trade CPU for latency.

Core pieces

  • RingBuffer<T> — fixed-size circular buffer (size must be a power of two).
  • Sequencer / Sequence — coordinates claimed and published positions.
  • EventHandler<T> — consumer callback (onEvent).
  • WorkHandler<T> / WorkerPool — work-queue mode (each event goes to exactly one worker).
  • ProducerType.SINGLE|MULTI — optimize for one or many producers.
  • WaitStrategy — how consumers wait for new data.

Publish/consume flow

  1. Producer claims the next slot (sequence) → writes into the pre-allocated event.
  2. Producer publishes the sequence; the ring’s cursor advances.
  3. Consumer(s) wait until their next needed sequence is available → process the event → advance their own sequence.
  4. Back-pressure: if consumers lag, the producer cannot claim beyond the slowest consumer plus ring capacity.

Tiny example

// Imports for the LMAX Disruptor DSL
import com.lmax.disruptor.BusySpinWaitStrategy;
import com.lmax.disruptor.EventHandler;
import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.dsl.ProducerType;
import java.util.concurrent.Executors;

// Event object reused in the ring
class PriceEvent {
    long seq, ts;
    double px;
    void set(long s, long t, double p) { seq = s; ts = t; px = p; }
}

// Consumer handler
EventHandler<PriceEvent> handler = (evt, seq, endOfBatch) -> {
  // process evt without allocating
};

int ringSize = 1 << 16; // power of two
Disruptor<PriceEvent> disruptor = new Disruptor<>(
    PriceEvent::new,                // pre-allocate
    ringSize,
    Executors.defaultThreadFactory(),
    ProducerType.SINGLE,
    new BusySpinWaitStrategy()
);
disruptor.handleEventsWith(handler);
RingBuffer<PriceEvent> rb = disruptor.start();

// Producer side
long s = rb.next();
try {
  PriceEvent e = rb.get(s);
  e.set(s, System.nanoTime(), 123.45);
} finally {
  rb.publish(s);
}

Wait strategies (rule-of-thumb)

Strategy               Latency    CPU use   Notes
BusySpinWaitStrategy   lowest     highest   Great for dedicated cores.
YieldingWaitStrategy   very low   high      Good compromise on busy boxes.
Sleeping/Blocking      higher     low       Better for shared servers; less jitter-sensitive.

When to use it

  • Hot, latency-sensitive pipelines (market-data processing, in-process telemetry/log fans, trading signal stages).
  • You need predictable GC and millions of events/sec in a single process.

When not to bother

  • Typical web APIs/microservices: standard executors/queues are simpler.
  • Cross-process distribution: prefer IPC transports (Aeron, shared memory, sockets); Disruptor is in-process.

In this project (YoctoTrader)

  • We run single producer → single consumer with ProducerType.SINGLE and BusySpinWaitStrategy.
  • MarketDataEvent objects are reused in the ring (no per-tick allocation).
  • The pipeline is Feed → Disruptor → Strategy → Risk → Publisher, with latency measured from producer stamp to consumer handling.

Gotchas

  • Ring size must be a power of two (bit-masking indexes is part of the speed).
  • Back-pressure is real: a slow consumer will stall producers once the ring fills.
  • Multiple producers need ProducerType.MULTI and careful tuning.
  • Treat the Disruptor as a mechanical sympathy tool: pin threads, keep handlers lean, avoid allocations/synchronization in onEvent.
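The power-of-two requirement is not arbitrary: it lets the ring map a sequence to a slot with a single bitwise AND instead of a modulo. A tiny self-contained illustration:

```java
// Why ring size must be a power of two: slot index = sequence & (size - 1),
// which equals sequence % size only when size is a power of two.
class RingIndexDemo {
    public static void main(String[] args) {
        int ringSize = 1 << 4;       // 16 slots (power of two)
        int mask = ringSize - 1;     // 0b1111
        for (long seq : new long[]{0, 15, 16, 17, 100}) {
            // bit-mask and modulo agree because ringSize is a power of two
            assert (seq & mask) == seq % ringSize;
            System.out.println(seq + " -> slot " + (seq & mask));
        }
    }
}
```

On the hot path that AND is one cheap instruction per claim, whereas integer division/modulo is markedly slower; the same trick appears in most ring-buffer implementations.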

Build details

  • Shaded jar via maven-shade-plugin so java -jar works without external deps.
  • Java 17 bytecode by default for wider compatibility; use -Pjava21 if you want 21.
  • Deps: com.lmax:disruptor, org.hdrhistogram:HdrHistogram, org.slf4j:slf4j-api (+ slf4j-simple runtime).

Not a trading system

This is a lab. The feed is synthetic. There’s no exchange connectivity, FIX/SBE, venue-specific throttles, or full risk controls. Treat the outputs like signals in a scope, not anything to trade on.


Next steps

  • Swap the feed for Aeron IPC or UDP multicast.
  • Replace POJOs with SBE or Chronicle Wire encodings.
  • Split producer/consumer into separate processes and measure IPC budgets.
  • Add a tiny stateful OMS (acks, timeouts, cancel/reject paths).
  • Introduce a backtest that replays a recorded feed deterministically.

Appendix: JVM flags I sometimes use

java -XX:+AlwaysActAsServerClassMachine -XX:+UseNUMA -XX:+UseStringDeduplication -jar target/yoctotrader-1.0.0.jar

Remove any flag your JDK complains about; they’re optional.


Copyleft 🄯 YoctoTrader. Educational code; adapt freely. If you improve the labs, I’d love to hear what you measured.