Hyperliquid module — vision
What the Hyperliquid module is becoming — replayable search harness, frontier-planner architecture, deterministic substrate.
What the Hyperliquid module is becoming
The Hyperliquid intelligence module (codename HLQ) is evolving from a one-shot trader search pipeline into a replayable Hyperliquid search harness inside Axe.
The target shape is not "let the model do everything." The model plans and operates over a deterministic environment. The harness remains the source of truth for evidence, state transitions, provenance, and replay.
In practice, this means:
- a pi-based Hyperliquid harness exposes bounded reads, working-set edits, and terminal decisions
- a frontier model accessed through Codex OAuth acts as the heavyweight planner/operator
- small domain-specific auxiliary models handle narrow, high-frequency tasks where latency, cost, and consistency matter
- every run stays inspectable, replayable, and grounded in stable artifacts rather than hidden prompt state
Architecture at a glance
HLQ's target architecture has three layers.
1. Deterministic harness
The harness is the execution substrate. It defines:
- bounded actions over replayable environment views
- explicit keep/drop/prune working-set semantics
- structured terminal outcomes, including finalize and abstain
- trajectory logs and per-step provenance
- stable replay packs for evaluation and debugging
This layer is the contract surface. It is where truth lives.
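The contract surface above can be sketched as a handful of types. Everything here is illustrative: names like `Terminal`, `Step`, `WorkingSet`, and `Trajectory` are assumptions about shape, not HLQ's actual identifiers.

```python
# Illustrative sketch of the harness contract surface; all names are
# assumptions, not the module's real API.
from dataclasses import dataclass, field
from enum import Enum


class Terminal(Enum):
    FINALIZE = "finalize"  # answer, backed by retained evidence
    ABSTAIN = "abstain"    # decline when retained evidence is weak


@dataclass
class Step:
    action: str          # bounded action name, e.g. "read_fills"
    args: dict           # explicit, loggable arguments
    observation_id: str  # points at a replayable environment view


@dataclass
class WorkingSet:
    """Explicit keep/drop semantics over retained artifacts."""
    kept: dict = field(default_factory=dict)  # artifact_id -> provenance

    def keep(self, artifact_id: str, provenance: str) -> None:
        self.kept[artifact_id] = provenance

    def drop(self, artifact_id: str) -> None:
        self.kept.pop(artifact_id, None)


@dataclass
class Trajectory:
    """Per-step log from which a run can be reconstructed afterwards."""
    steps: list = field(default_factory=list)

    def record(self, step: Step) -> None:
        self.steps.append(step)
```

The point of the sketch is that every state change is an explicit, named operation the log can capture, rather than hidden prompt state.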
2. Frontier planner/operator
A heavyweight frontier model, accessed through Codex OAuth, sits on top of the harness.
Its job is to:
- interpret the user brief
- decide which harness actions to take next
- decompose a task into shallow steps
- package retained evidence into a final answer
- operate conservatively when signal is weak
The frontier model is used for broad reasoning and flexible planning, not as the authority on facts.
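The operating loop this implies can be sketched as below. The `planner` and `harness_read` callables are stand-ins (assumptions) for the Codex OAuth call and the real harness; the loop itself shows the division of labor: the planner only chooses among bounded moves, and the harness supplies everything that is seen.

```python
def run_episode(brief, harness_read, planner, budget=8):
    """Drive the harness with a planner until finalize/abstain or budget.

    `harness_read` and `planner` are illustrative callables, not real APIs.
    """
    working_set = {}  # artifact_id -> evidence, retained explicitly
    trajectory = []   # auditable per-step log of planner decisions
    for step in range(budget):
        decision = planner(brief, working_set, budget - step)
        trajectory.append(decision)
        if decision["type"] == "finalize":
            return {"outcome": "finalize",
                    "evidence": dict(working_set),
                    "trajectory": trajectory}
        if decision["type"] == "abstain":
            return {"outcome": "abstain", "trajectory": trajectory}
        # Bounded read: the harness, not the prompt, defines what is seen.
        artifact_id, evidence = harness_read(decision["action"], decision["args"])
        working_set[artifact_id] = evidence
    # Budget exhausted without a terminal decision: conservative default.
    return {"outcome": "abstain", "trajectory": trajectory}
```

Note the conservative default: running out of budget without a terminal decision resolves to abstain, never to an unsupported answer.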
3. Auxiliary domain models
Small specialized models are added where they clearly improve the system.
Early candidates include:
- action policy selection
- keep/drop/prune policy
- abstain calibration
- verifier or reranker passes on retained evidence
These models are not independent agents. They are cheap, narrow components that support the main planner and improve consistency on repeatable subproblems.
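To make "cheap, narrow component" concrete, a keep/drop policy could be as small as a scored filter that the main loop calls at each step. The scoring function below is a stand-in (an assumption) for a small trained model; the shape of the interface is the point.

```python
# Sketch of an auxiliary keep/drop policy: a deterministic, narrow
# component that scores candidate artifacts. It is not an independent
# agent; `score` stands in for a small domain model.

def keep_drop_policy(candidates, score, keep_threshold=0.5, max_kept=4):
    """Return (kept, dropped) lists; deterministic given its inputs."""
    scored = sorted(candidates, key=score, reverse=True)
    kept = [c for c in scored[:max_kept] if score(c) >= keep_threshold]
    dropped = [c for c in candidates if c not in kept]
    return kept, dropped
```

Because the component is a pure function of its inputs, it logs and replays like any other harness step.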
Why this architecture
This design follows a simple rule: use the frontier model for open-ended reasoning, and use small models for bounded domain decisions.
That gives HLQ four advantages.
Grounded execution
The harness, not the prompt, defines what the model can see and do. Evidence is retrieved through explicit actions and retained through explicit state changes.
Replayability
The same replay pack and the same code path should produce the same environment views and the same auditable trajectory. That makes debugging, evaluation, and regression testing possible.
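That property can be enforced mechanically with a regression check: replay the same pack twice through the same code path and compare trajectory digests. The `replay` function here is a deterministic toy stand-in for the real harness.

```python
# Sketch of a replayability regression check; `replay` is an
# illustrative stand-in, not the real code path.
import hashlib
import json


def trajectory_hash(trajectory):
    """Stable digest of a trajectory for regression comparison."""
    blob = json.dumps(trajectory, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()


def replay(pack):
    # Deterministic toy replay: each task in the pack yields one step.
    return [{"step": i, "task": t} for i, t in enumerate(pack["tasks"])]


def assert_replayable(pack):
    first = trajectory_hash(replay(pack))
    second = trajectory_hash(replay(pack))
    assert first == second, "same pack + same code path must match"
    return first
```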
Better economics
Heavy frontier calls are reserved for planning and synthesis. Narrow, repeated decisions can move to small models where latency and cost matter more than generality.
Safer iteration
We do not need to solve the full autonomous-search problem up front. We can improve planner quality, abstain behavior, and evidence management without changing the core harness contract.
Why not start with a separate parallel search model
We are not starting with an independent parallel search model as a second planner.
That path adds complexity before the harness contract and operating loop are mature. It creates another source of policy behavior to train, evaluate, and debug before we have enough evidence that the extra planner is necessary.
Instead, HLQ starts with:
- one deterministic harness
- one heavyweight planner/operator
- a small number of targeted auxiliary models for narrow tasks
If a more independent search policy becomes useful later, it can be introduced against a stable harness and measured cleanly. It should not be the starting point.
Source of truth and contract surfaces
The deterministic harness stays central.
It is responsible for:
- provenance on every retained artifact
- replay-pack compatibility
- explicit working-set transitions
- bounded terminal outputs
- bridge-mediated access to domain artifacts and DSL surfaces
The bridge DSL and replay-pack formats matter because they let planning improve without making the environment opaque. The planner can change. The truth surface should stay stable.
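"Provenance on every retained artifact" could be as simple as a record the harness attaches at keep time, so that no artifact enters the working set without one. Field names here are assumptions about what auditability requires, not the actual format.

```python
# Sketch of per-artifact provenance; field names are assumptions.
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Provenance:
    artifact_id: str  # stable id inside the replay pack
    source_view: str  # which bridge-mediated view produced it
    action: str       # the bounded action that retrieved it
    step: int         # position in the trajectory


def retain(working_set, artifact, prov: Provenance):
    """Keep an artifact only together with its provenance record."""
    working_set[prov.artifact_id] = {"artifact": artifact,
                                     "provenance": asdict(prov)}
    return working_set
```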
Phased plan
Phase 0: Freeze the harness contract
Goal: lock the environment and logging surfaces before adding more model complexity
Deliverables:
- stable replay-pack schema
- stable action names and terminal outputs
- explicit keep/drop/prune semantics
- trajectory logging from which working state can be reconstructed step by step
- bridge-backed deterministic environment views
Success looks like:
- reproducible episodes
- auditable failures
- no ambiguity about what the planner saw or retained
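A "stable replay-pack schema" deliverable can start as a small validator run on every pack. The field names below (`schema_version`, `episode_id`, `views`, `trajectory`) are assumptions, not the real format; the check that every step points at a stored environment view is what makes "no ambiguity about what the planner saw" testable.

```python
# Minimal sketch of a replay-pack schema check; all field names are
# assumptions about what a stable pack format would need.

REQUIRED_FIELDS = {"schema_version", "episode_id", "views", "trajectory"}


def validate_replay_pack(pack: dict) -> list:
    """Return a list of problems; an empty list means the pack is usable."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - pack.keys())]
    if not problems:
        for i, step in enumerate(pack["trajectory"]):
            if "action" not in step or "observation_id" not in step:
                problems.append(f"step {i}: needs action and observation_id")
            elif step["observation_id"] not in pack["views"]:
                # Every step must point at a stored environment view,
                # otherwise the episode cannot be reconstructed.
                problems.append(f"step {i}: unknown view {step['observation_id']}")
    return problems
```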
Phase 1: Frontier-operated harness
Goal: run the harness with Codex OAuth as the main planner/operator
Deliverables:
- prompt and control loop for bounded harness operation
- evidence packaging for final outputs
- conservative abstain behavior
- baseline evals on replay packs
Success looks like:
- useful multi-step investigations without changing the harness contract
- better decomposition and synthesis than the one-shot path
- clear traces for why a run finalized or abstained
Phase 2: Add targeted small models
Goal: move repeated narrow decisions to cheaper specialist components
Priority areas:
- action policy hints
- keep/drop/prune decisions
- abstain calibration
- verifier/reranker over candidate retained evidence
Success looks like:
- lower latency and lower cost on repeated steps
- improved consistency on bounded subproblems
- unchanged replayability and auditability
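Of the priority areas above, abstain calibration is the most self-contained to sketch: pick a confidence threshold from logged episodes so that finalized answers meet a target precision, and abstain below it. The episode tuples are illustrative stand-ins for real run data.

```python
# Sketch of abstain calibration from logged (confidence, was_correct)
# pairs; the data shape is an assumption, not HLQ's real logs.

def calibrate_abstain(episodes, min_precision=0.9):
    """Return the lowest threshold at which finalized answers (those at
    or above the threshold) reach the target precision. If no threshold
    does, return one above all confidences, i.e. always abstain."""
    thresholds = sorted({c for c, _ in episodes})
    for t in thresholds:
        finalized = [ok for c, ok in episodes if c >= t]
        if finalized and sum(finalized) / len(finalized) >= min_precision:
            return t
    return max(thresholds) + 1.0 if thresholds else 1.0
```

Falling back to "always abstain" when no threshold reaches the target keeps the component aligned with the abstention-is-a-feature principle.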
Phase 3: Expand domain coverage carefully
Goal: grow from Hyperliquid-first harnessing into broader cross-domain investigation without losing determinism
Possible extensions:
- richer bridge DSL views
- more replay tasks and evaluation packs
- additional domain-specific specialists
- monitoring and alerting on top of the same harness contracts
Success looks like:
- more useful investigations from the same core loop
- stronger evidence handling, not more hidden agent behavior
Design principles
- Harness first. The environment contract comes before policy complexity.
- Provenance by default. Every conclusion should point back to retained evidence.
- Replay over intuition. If behavior cannot be replayed, it is hard to trust.
- Small models earn their place. They exist to solve narrow problems better, faster, or cheaper.
- Abstention is a feature. HLQ should decline to overclaim when retained evidence is weak.
In one sentence
HLQ's vision is a deterministic Hyperliquid search harness, operated by a frontier planner and supported by narrow specialist models, with provenance and replayability kept as the non-negotiable source of truth.