Hyperliquid module — vision

What the Hyperliquid module is becoming — replayable search harness, frontier-planner architecture, deterministic substrate.

What the Hyperliquid module is becoming

The Hyperliquid intelligence module (codename HLQ) is evolving from a one-shot trader search pipeline into a replayable Hyperliquid search harness inside Axe.

The target shape is not "let the model do everything." The model plans and operates over a deterministic environment. The harness remains the source of truth for evidence, state transitions, provenance, and replay.

In practice, this means:

  • a pi-based Hyperliquid harness exposes bounded reads, working-set edits, and terminal decisions
  • a frontier model accessed through Codex OAuth acts as the heavyweight planner/operator
  • small domain-specific auxiliary models handle narrow, high-frequency tasks where latency, cost, and consistency matter
  • every run stays inspectable, replayable, and grounded in stable artifacts rather than hidden prompt state

Architecture at a glance

HLQ's target architecture has three layers.

1. Deterministic harness

The harness is the execution substrate. It defines:

  • bounded actions over replayable environment views
  • explicit keep/drop/prune working-set semantics
  • structured terminal outcomes, including finalize and abstain
  • trajectory logs and per-step provenance
  • stable replay packs for evaluation and debugging

This layer is the contract surface. It is where truth lives.
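As a rough sketch of that contract surface (all names here are illustrative, not the real HLQ API), the pieces fit together as bounded actions, explicit keep/drop working-set transitions, structured terminal outcomes, and a trajectory log that records every step:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class TerminalOutcome(Enum):
    FINALIZE = "finalize"   # answer backed by retained evidence
    ABSTAIN = "abstain"     # signal too weak to conclude

@dataclass(frozen=True)
class Action:
    name: str               # must be one of the harness's bounded action names
    args: tuple = ()

@dataclass
class Step:
    action: Action
    observation: str        # deterministic environment view returned for this action
    kept: tuple = ()        # artifact ids added to the working set
    dropped: tuple = ()     # artifact ids removed from the working set

@dataclass
class Trajectory:
    steps: list = field(default_factory=list)
    outcome: Optional[TerminalOutcome] = None

    def apply(self, step: Step, working_set: set) -> set:
        # explicit keep/drop transition; appending the step gives the
        # per-step provenance needed to reconstruct state later
        working_set |= set(step.kept)
        working_set -= set(step.dropped)
        self.steps.append(step)
        return working_set
```

Because every transition goes through `apply`, the working set at any point is a pure function of the trajectory log, which is what makes step-by-step reconstruction possible.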

2. Frontier planner/operator

A heavyweight frontier model, accessed through Codex OAuth, sits on top of the harness.

Its job is to:

  • interpret the user brief
  • decide which harness actions to take next
  • decompose a task into shallow steps
  • package retained evidence into a final answer
  • operate conservatively when signal is weak

The frontier model is used for broad reasoning and flexible planning, not as the authority on facts.
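A minimal sketch of that operating loop, assuming a `harness` object and a `planner` callable (both names hypothetical): the frontier model only proposes the next move, the harness executes it and owns all resulting state, and running out of budget defaults to abstaining rather than guessing.

```python
def run_episode(planner, harness, max_steps=20):
    """Drive a bounded investigation: the planner proposes, the harness disposes.

    `planner(brief, retained)` returns either a harness action name (str)
    or a terminal decision ("finalize" / "abstain"). `harness` is assumed
    to expose `brief`, `execute(action)`, and `retained()` -- illustrative
    names, not a real API.
    """
    for _ in range(max_steps):
        decision = planner(harness.brief, harness.retained())
        if decision in ("finalize", "abstain"):
            # terminal outcomes are structured, never free-form text
            return decision, harness.retained()
        harness.execute(decision)  # bounded read or working-set edit
    # budget exhausted with weak signal: operate conservatively
    return "abstain", harness.retained()
```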

3. Auxiliary domain models

Small specialized models are added where they clearly improve the system.

Early candidates include:

  • action policy selection
  • keep/drop/prune policy
  • abstain calibration
  • verifier or reranker passes on retained evidence

These models are not independent agents. They are cheap, narrow components that support the main planner and improve consistency on repeatable subproblems.
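One way to picture this (a sketch with illustrative names): an auxiliary model slots in as a narrow scoring hook behind a fixed decision interface, so the keep/drop policy stays a pure function regardless of whether the score comes from a trained specialist or a placeholder heuristic.

```python
from typing import Callable

def make_keep_policy(scorer: Callable[[str], float], threshold: float = 0.5):
    """Wrap a narrow scoring model (or heuristic) as a keep/drop decision.

    `scorer` maps a candidate artifact to a relevance score in [0, 1].
    Swapping the scorer does not change the contract surface, so the
    harness and its replay semantics are unaffected.
    """
    def keep(artifact: str) -> bool:
        return scorer(artifact) >= threshold
    return keep

# A trivial stand-in heuristic until a trained specialist earns its place.
def length_heuristic(artifact: str) -> float:
    return min(len(artifact) / 100.0, 1.0)
```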

Why this architecture

This design follows a simple rule: use the frontier model for open-ended reasoning, and use small models for bounded domain decisions.

That gives HLQ four advantages.

Grounded execution

The harness, not the prompt, defines what the model can see and do. Evidence is retrieved through explicit actions and retained through explicit state changes.

Replayability

The same replay pack and the same code path should produce the same environment views and the same auditable trajectory. That makes debugging, evaluation, and regression testing possible.
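In testing terms, the property reads as: replaying the same pack through the same code path yields an identical trajectory. A sketch of that regression check, with `run(pack)` standing in for the real episode runner (hypothetical name):

```python
import json

def run(pack: dict) -> list:
    """Stand-in for the real episode runner: deterministic given the pack.

    Here it just walks the pack's recorded actions; the point is the
    check below, which any real runner should also satisfy.
    """
    return [{"action": a, "view": pack["views"][a]} for a in pack["actions"]]

def assert_replayable(pack: dict):
    # same replay pack + same code path => byte-identical trajectory
    first = json.dumps(run(pack), sort_keys=True)
    second = json.dumps(run(pack), sort_keys=True)
    assert first == second, "non-deterministic trajectory"
```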

Better economics

Heavy frontier calls are reserved for planning and synthesis. Narrow, repeated decisions can move to small models where latency and cost matter more than generality.

Safer iteration

We do not need to solve the full autonomous-search problem up front. We can improve planner quality, abstain behavior, and evidence management without changing the core harness contract.

Why not start with a separate parallel search model

We are not starting with an independent parallel search model as a second planner.

That path adds complexity before the harness contract and operating loop are mature. It creates another source of policy behavior to train, evaluate, and debug before we have enough evidence that the extra planner is necessary.

Instead, HLQ starts with:

  • one deterministic harness
  • one heavyweight planner/operator
  • a small number of targeted auxiliary models for narrow tasks

If a more independent search policy becomes useful later, it can be introduced against a stable harness and measured cleanly. It should not be the starting point.

Source of truth and contract surfaces

The deterministic harness stays central.

It is responsible for:

  • provenance on every retained artifact
  • replay-pack compatibility
  • explicit working-set transitions
  • bounded terminal outputs
  • bridge-mediated access to domain artifacts and DSL surfaces

The bridge DSL and replay-pack formats matter because they let planning improve without making the environment opaque. The planner can change. The truth surface should stay stable.

Phased plan

Phase 0: Freeze the harness contract

Goal: lock the environment and logging surfaces before adding more model complexity

Deliverables:

  • stable replay-pack schema
  • stable action names and terminal outputs
  • explicit keep/drop/prune semantics
  • trajectory logging that reconstructs working state step by step
  • bridge-backed deterministic environment views

Success looks like:

  • reproducible episodes
  • auditable failures
  • no ambiguity about what the planner saw or retained
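As one concrete (purely illustrative) reading of "stable replay-pack schema": a versioned envelope that the loader refuses to accept when the version or required fields drift, so schema changes are deliberate rather than silent.

```python
import json

REPLAY_PACK_VERSION = 1  # illustrative; bump only with a migration path

def load_replay_pack(raw: str) -> dict:
    """Parse and validate a replay pack against the frozen schema."""
    pack = json.loads(raw)
    if pack.get("version") != REPLAY_PACK_VERSION:
        raise ValueError(f"unsupported replay-pack version: {pack.get('version')}")
    for key in ("brief", "views", "expected_outcome"):
        if key not in pack:
            raise ValueError(f"replay pack missing required field: {key}")
    return pack
```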

Phase 1: Frontier-operated harness

Goal: run the harness with Codex OAuth as the main planner/operator

Deliverables:

  • prompt and control loop for bounded harness operation
  • evidence packaging for final outputs
  • conservative abstain behavior
  • baseline evals on replay packs

Success looks like:

  • useful multi-step investigations without changing the harness contract
  • better decomposition and synthesis than the one-shot path
  • clear traces for why a run finalized or abstained

Phase 2: Add targeted small models

Goal: move repeated narrow decisions to cheaper specialist components

Priority areas:

  • action policy hints
  • keep/drop/prune decisions
  • abstain calibration
  • verifier/reranker over candidate retained evidence

Success looks like:

  • lower latency and lower cost on repeated steps
  • improved consistency on bounded subproblems
  • unchanged replayability and auditability
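Abstain calibration, for instance, can start as nothing more than a thresholded decision fitted on replay packs. A sketch (the thresholds below are placeholders, not measured values):

```python
def calibrated_abstain(confidence: float, evidence_count: int,
                       min_confidence: float = 0.7, min_evidence: int = 2) -> bool:
    """Return True if the run should abstain rather than finalize.

    Thresholds are illustrative; in practice they would be fitted against
    labeled replay packs so abstain/finalize rates track observed accuracy.
    """
    return confidence < min_confidence or evidence_count < min_evidence
```

Because the decision is a pure function of logged quantities, tuning it later does not touch the harness contract or break replayability.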

Phase 3: Expand domain coverage carefully

Goal: grow from Hyperliquid-first harnessing into broader cross-domain investigation without losing determinism

Possible extensions:

  • richer bridge DSL views
  • more replay tasks and evaluation packs
  • additional domain-specific specialists
  • monitoring and alerting on top of the same harness contracts

Success looks like:

  • more useful investigations from the same core loop
  • stronger evidence handling, not more hidden agent behavior

Design principles

  • Harness first. The environment contract comes before policy complexity.
  • Provenance by default. Every conclusion should point back to retained evidence.
  • Replay over intuition. If behavior cannot be replayed, it is hard to trust.
  • Small models earn their place. They exist to solve narrow problems better, faster, or cheaper.
  • Abstention is a feature. HLQ should decline to overclaim when retained evidence is weak.

In one sentence

HLQ's vision is a deterministic Hyperliquid search harness, operated by a frontier planner and supported by narrow specialist models, with provenance and replayability kept as the non-negotiable source of truth.