Trajectory Logging
Optional trajectory-native JSONL logging for search requests, plus the additive v1 search-harness trajectory contract.
HLQ can optionally emit trajectory-native JSONL logs for search requests.
This document now serves two purposes:
- describe the current shipped one-shot logging substrate
- freeze the additive v1 search-harness trajectory contract
Current shipped substrate
Current logging remains intentionally narrow:
- logging is disabled by default
- current query behavior is unchanged
- records are append-only side effects for offline analysis and training prep
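Enable logging by setting the HLQ_TRAJECTORY_LOG_DIR environment variable to the directory where the logs should be written.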
When this environment variable is unset, HLQ does not write any trajectory logs.
When enabled, current one-shot logging appends to the following files under the directory pointed to by HLQ_TRAJECTORY_LOG_DIR:
- episodes.jsonl
- steps.jsonl
- artifacts.jsonl
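The three files can be read back for offline analysis with a few lines of Python. This is a minimal sketch, not part of HLQ: the helper name load_trajectory_logs is hypothetical, and it assumes each file holds one JSON object per line.

```python
import json
import os
from pathlib import Path

# File names taken from the list above.
LOG_FILES = ("episodes.jsonl", "steps.jsonl", "artifacts.jsonl")


def load_trajectory_logs(log_dir=None):
    """Return {file name: list of parsed records}; missing files yield []."""
    # Hypothetical helper: fall back to the environment variable when no
    # directory is passed explicitly.
    log_dir = log_dir or os.environ.get("HLQ_TRAJECTORY_LOG_DIR")
    if not log_dir:
        # Logging disabled: there is nothing to read.
        return {name: [] for name in LOG_FILES}
    records = {}
    for name in LOG_FILES:
        path = Path(log_dir) / name
        if not path.exists():
            records[name] = []
            continue
        with path.open(encoding="utf-8") as fh:
            # Skip blank lines; every other line is assumed to be one JSON object.
            records[name] = [json.loads(line) for line in fh if line.strip()]
    return records
```

Because the files are append-only, a reader like this can be rerun at any time without coordinating with the writer, although a line that is still being written may fail to parse.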
The current substrate logs:
- help/capability requests
- single-query dry-run executions
- single-query live executions
- minimal error records for unexpected exceptions
The current substrate does not yet implement:
- workflow graphs
- explicit branch policies
- route scoring
- public API guarantees around trajectory payloads
Current record shape
Episode
One user-visible request. Includes:
- episode_id
- user_query
- topic_family
- query_type
- query_mode
- time_horizon
- status
- source_surface
Step
One coarse product-side action. Includes:
- step_id
- episode_id
- step_type
- parsed_slots
- route_family
- chosen_route
- retrieval_stage
- requested_slots
- supported_slots_estimated
- unsupported_slots_estimated
- post_execution_coverage
- provenance_snapshot
Artifact
Typed evidence or execution objects touched by the step. Current v1 emits artifacts such as:
- capability_manifest
- ann_shortlist
- sql_render
- sql_result_set
- coverage_report
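Since every step record carries an episode_id, the steps of one request can be gathered under their episode. The sketch below is illustrative only: group_steps_by_episode is a hypothetical helper, and it assumes episode and step records have already been parsed into dictionaries.

```python
from collections import defaultdict


def group_steps_by_episode(episodes, steps):
    """Attach each step to its episode using the shared episode_id field."""
    steps_by_episode = defaultdict(list)
    for step in steps:
        steps_by_episode[step["episode_id"]].append(step)
    # Steps whose episode_id matches no episode record are left out.
    return {
        ep["episode_id"]: {
            "episode": ep,
            "steps": steps_by_episode.get(ep["episode_id"], []),
        }
        for ep in episodes
    }
```

The document does not state how artifact records reference steps, so the sketch deliberately leaves artifacts out.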
Additive v1 search-harness contract
The shipped one-shot path is still terminal-heavy:
- one query
- one route
- one execution
- one result payload with provenance
The planned harness path is multi-step:
- each environment read is explicit
- working-set edits are explicit
- branching is explicit
- the terminal outcome is only the last step in a visible search trace
The provenance contract in provenance.md remains valid. The harness trajectory adds stepwise state transitions on top of that rather than replacing provenance.
Harness log structure
A harness run should emit one episode record, zero or more non-terminal step records, and exactly one terminal record (a finalize or abstain step).
Required episode-level fields:
- episode_id
- query
- anchor_market
- window_id
- policy_id
- step_budget
- step_count
- terminal_action
Required per-step fields:
- step_id
- step_index
- step_type
- action_name
- action_args
- artifact_ids_read
- working_set_before
- working_set_after
- context_pressure_class
Recommended per-step fields:
- branch_parent_step_id
- subquery_type
- selected_artifact_ids
- dropped_artifact_ids
- stop_candidate
- notes
For deterministic replay, step_index is the primary ordering field. Wall-clock timestamps are optional and non-normative.
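A consumer could check the required fields and restore replay order roughly as follows. This is a sketch under stated assumptions: the constant and function names are hypothetical, and records are assumed to be plain dictionaries parsed from JSONL.

```python
# Required field names copied from the lists above.
REQUIRED_EPISODE_FIELDS = frozenset({
    "episode_id", "query", "anchor_market", "window_id",
    "policy_id", "step_budget", "step_count", "terminal_action",
})
REQUIRED_STEP_FIELDS = frozenset({
    "step_id", "step_index", "step_type", "action_name", "action_args",
    "artifact_ids_read", "working_set_before", "working_set_after",
    "context_pressure_class",
})


def missing_fields(record, required):
    """Return the required field names absent from a record, sorted."""
    return sorted(required - set(record))


def replay_order(steps):
    """Order steps by step_index; timestamps are ignored on purpose."""
    return sorted(steps, key=lambda step: step["step_index"])
```

Sorting on step_index alone keeps replay independent of clock skew, which matches the non-normative status of wall-clock timestamps.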
Frozen v1 harness step types
The following step types are frozen for the Phase 0 contract:
| Step type | Meaning |
|---|---|
| env_read | A deterministic environment read action returned one or more artifacts |
| branch_subquery | The policy requested a shallow child retrieval turn |
| keep_artifact | The policy added an artifact to the working set |
| drop_artifact | The policy removed an artifact from the working set |
| prune_working_set | The policy removed multiple inactive artifacts from the working set |
| decision_update | The policy revised a provisional stop or confidence state before termination |
| finalize | The episode ended with finalize_signal or finalize_low_signal |
| abstain | The episode ended with no bounded decision |
action_name should match the corresponding harness action name from search_harness.md.
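One way a log reader might pin the frozen step types is an enumeration that rejects unknown values. The class and helper names below are hypothetical; only the string values come from the table above.

```python
from enum import Enum


class HarnessStepType(str, Enum):
    """The eight step types frozen for the Phase 0 contract."""
    ENV_READ = "env_read"
    BRANCH_SUBQUERY = "branch_subquery"
    KEEP_ARTIFACT = "keep_artifact"
    DROP_ARTIFACT = "drop_artifact"
    PRUNE_WORKING_SET = "prune_working_set"
    DECISION_UPDATE = "decision_update"
    FINALIZE = "finalize"
    ABSTAIN = "abstain"


# finalize and abstain are the two step types that end an episode.
TERMINAL_STEP_TYPES = frozenset({HarnessStepType.FINALIZE, HarnessStepType.ABSTAIN})


def is_terminal(step_type):
    """Return True for terminal step types; raise ValueError for unknown ones."""
    return HarnessStepType(step_type) in TERMINAL_STEP_TYPES
```

Raising on unknown values is a design choice for this sketch; a more lenient reader could skip unrecognized step types instead.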
Step-type specific fields
env_read
Required:
- action_name
- action_args
- artifact_ids_read
- working_set_before
- working_set_after
Rules:
- reading does not imply keeping
- working_set_after may be identical to working_set_before
branch_subquery
Required:
- subquery_type
- branch_parent_step_id
Rules:
- the child turn writes any produced artifacts into the same episode registry
- child artifacts must still be explicitly kept if the policy wants them in the working set
keep_artifact
Required:
selected_artifact_ids
Rules:
- for v1, this should usually contain exactly one artifact ID
- working_set_after must contain the selected artifact IDs
drop_artifact
Required:
dropped_artifact_ids
Rules:
- dropping removes an artifact from active context only
- the artifact remains in the replay environment for possible later revisit
prune_working_set
Required:
- dropped_artifact_ids
- action_args.reason
Rules:
- use this for multi-artifact cleanup when context pressure is high
- the reason should be short and schema-bounded
decision_update
Required:
stop_candidate
Rules:
- this is non-terminal
- use it to record a provisional leaning before finalize or abstain
finalize
Required:
- terminal_action=finalize
- action_args.decision_class
- selected_artifact_ids
- action_args.stop_reason
Rules:
- decision_class must be finalize_signal or finalize_low_signal
- selected_artifact_ids must equal the terminal retained set
abstain
Required:
- terminal_action=abstain
- selected_artifact_ids
- action_args.stop_reason
Rules:
- decision_class must be omitted or null
- selected_artifact_ids may be empty
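The terminal rules for finalize and abstain can be checked mechanically. The following is a minimal sketch, not the harness's validator: check_terminal_step is a hypothetical name, and it assumes that "the terminal retained set" means the working_set_after of the terminal step.

```python
FINALIZE_DECISION_CLASSES = frozenset({"finalize_signal", "finalize_low_signal"})


def check_terminal_step(step):
    """Return a list of rule violations for a finalize or abstain step."""
    problems = []
    args = step.get("action_args") or {}
    step_type = step.get("step_type")

    if step_type not in ("finalize", "abstain"):
        return ["step_type is not terminal"]
    if "stop_reason" not in args:
        problems.append("action_args.stop_reason is required")
    if "selected_artifact_ids" not in step:
        problems.append("selected_artifact_ids is required")

    if step_type == "finalize":
        if args.get("decision_class") not in FINALIZE_DECISION_CLASSES:
            problems.append("decision_class must be finalize_signal or finalize_low_signal")
        # Assumption: the terminal retained set is this step's working_set_after.
        selected = set(step.get("selected_artifact_ids", []))
        if selected != set(step.get("working_set_after", [])):
            problems.append("selected_artifact_ids must equal the terminal retained set")
    elif args.get("decision_class") is not None:
        # abstain: decision_class must be omitted or null.
        problems.append("decision_class must be omitted or null")
    return problems
```

Returning a list of problems rather than raising lets an offline checker report every violation in a run at once.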
Working-set semantics
The working set is the model-visible evidence context, not the full environment.
Normative rules:
- every step records working_set_before and working_set_after
- a kept artifact stays active until it is dropped, pruned, or the episode ends
- artifacts outside the working set remain addressable by ID if they are still in the episode registry
- trajectory readers must be able to reconstruct the active context at every step from the log alone
This is the main behavioral difference from the current one-shot pipeline, which has no explicit working-set state.
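Reconstructing the active context from the log alone could look like the sketch below. It assumes the steps of one episode form a single linear sequence, so that each step's working_set_before matches the previous step's working_set_after; that continuity check is an assumption of this example, and replay_working_sets is a hypothetical name.

```python
def replay_working_sets(steps):
    """Return [(step_index, working set after the step)] in replay order."""
    contexts = []
    previous_after = None
    for step in sorted(steps, key=lambda s: s["step_index"]):
        before = list(step["working_set_before"])
        after = list(step["working_set_after"])
        # Assumed invariant: no working-set change happens between steps.
        if previous_after is not None and set(before) != set(previous_after):
            raise ValueError(
                f"working set discontinuity at step_index {step['step_index']}"
            )
        contexts.append((step["step_index"], after))
        previous_after = after
    return contexts
```

A reader that needs the context before a given step can take the preceding entry, or an empty list for the first step.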
Backward compatibility
Phase 0 does not require retrofitting historical one-shot runs into this schema.
Compatibility expectations for later implementation:
- existing one-shot search behavior should continue to work without harness trajectory output
- harness-specific step fields should be additive rather than breaking for existing consumers
- provenance remains the minimal audit surface for one-shot runs
Operational notes
- Logging is best-effort and should never break product behavior.
- Current CLI and MCP outputs remain compatible; trajectory logs are written to disk only.
- Multi-step workflow structure is intentionally deferred to the harness implementation slices.
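The best-effort property can be illustrated with a writer that swallows its own failures. This is a sketch, not HLQ's implementation: append_record_best_effort is a hypothetical helper, and creating the directory on demand is an assumption made for the example.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def append_record_best_effort(log_dir, file_name, record):
    """Append one JSON line; return False instead of raising on any failure."""
    if not log_dir:
        return False  # logging disabled
    try:
        # Serialize first so a bad record never leaves a partial line behind.
        line = json.dumps(record, sort_keys=True) + "\n"
        directory = Path(log_dir)
        directory.mkdir(parents=True, exist_ok=True)  # assumption: create on demand
        with (directory / file_name).open("a", encoding="utf-8") as fh:
            fh.write(line)
        return True
    except Exception:
        # Never let a logging failure reach the product code path.
        logger.debug("trajectory log write failed", exc_info=True)
        return False
```

The broad except clause is deliberate here: the caller's query path should behave the same whether or not the write succeeded.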