
Trajectory Logging

Optional trajectory-native JSONL logging for search requests, plus the additive v1 search-harness trajectory contract.

HLQ can optionally emit trajectory-native JSONL logs for search requests.

This document now serves two purposes:

  • describe the current shipped one-shot logging substrate
  • freeze the additive v1 search-harness trajectory contract

Current shipped substrate

Current logging remains intentionally narrow:

  • logging is disabled by default
  • current query behavior is unchanged
  • records are append-only side effects for offline analysis and training prep

Enable logging with:

export HLQ_TRAJECTORY_LOG_DIR=/tmp/hlq_trajectory_logs

When this environment variable is unset, HLQ does not write any trajectory logs.

When enabled, current one-shot logging appends to:

  • episodes.jsonl
  • steps.jsonl
  • artifacts.jsonl

under the directory pointed to by HLQ_TRAJECTORY_LOG_DIR.
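The enable/disable behavior can be sketched in Python. `append_trajectory_record` is a hypothetical helper, not the shipped HLQ code, but it follows the documented behavior: nothing is written unless HLQ_TRAJECTORY_LOG_DIR is set, records are append-only, and failures never propagate into product behavior.

```python
import json
import os

def append_trajectory_record(filename: str, record: dict) -> None:
    """Best-effort JSONL append; a no-op when logging is disabled."""
    log_dir = os.environ.get("HLQ_TRAJECTORY_LOG_DIR")
    if not log_dir:
        return  # logging is disabled by default
    try:
        os.makedirs(log_dir, exist_ok=True)
        path = os.path.join(log_dir, filename)
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
    except OSError:
        pass  # best-effort: never break the request path
```

A caller would append one record per event, e.g. `append_trajectory_record("episodes.jsonl", {...})`; the same helper serves steps.jsonl and artifacts.jsonl.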

The current substrate logs:

  • help/capability requests
  • single-query dry-run executions
  • single-query live executions
  • minimal error records for unexpected exceptions

The current substrate does not yet implement:

  • workflow graphs
  • explicit branch policies
  • route scoring
  • public API guarantees around trajectory payloads

Current record shape

Episode

One user-visible request. Includes:

  • episode_id
  • user_query
  • topic_family
  • query_type
  • query_mode
  • time_horizon
  • status
  • source_surface

Step

One coarse product-side action. Includes:

  • step_id
  • episode_id
  • step_type
  • parsed_slots
  • route_family
  • chosen_route
  • retrieval_stage
  • requested_slots
  • supported_slots_estimated
  • unsupported_slots_estimated
  • post_execution_coverage
  • provenance_snapshot

Artifact

Typed evidence or execution objects touched by the step. The current substrate emits artifacts such as:

  • capability_manifest
  • ann_shortlist
  • sql_render
  • sql_result_set
  • coverage_report
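As an illustration, the three record kinds might serialize like this. The field names come from the lists above; every value is invented for the example, and real records may carry additional fields.

```python
import json

# Illustrative one-shot substrate records (values are made up).
episode = {
    "episode_id": "ep-001",
    "user_query": "top movers last week",
    "topic_family": "markets",
    "query_type": "ranking",
    "query_mode": "live",
    "time_horizon": "7d",
    "status": "ok",
    "source_surface": "cli",
}
step = {
    "step_id": "st-001",
    "episode_id": "ep-001",
    "step_type": "single_query_live",
    "parsed_slots": {"metric": "price_change"},
    "route_family": "sql",
    "chosen_route": "sql_ranked",
    "retrieval_stage": "execute",
    "requested_slots": ["metric", "window"],
    "supported_slots_estimated": ["metric", "window"],
    "unsupported_slots_estimated": [],
    "post_execution_coverage": 1.0,
    "provenance_snapshot": {"window_id": "w-2024-18"},
}
artifact = {
    "artifact_id": "ar-001",
    "episode_id": "ep-001",
    "artifact_type": "sql_result_set",
}

# One JSONL line per record, appended to the matching file.
lines = [json.dumps(r) for r in (episode, step, artifact)]
```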

Additive v1 search-harness contract

The shipped one-shot path is still terminal-heavy:

  • one query
  • one route
  • one execution
  • one result payload with provenance

The planned harness path is multi-step:

  • each environment read is explicit
  • working-set edits are explicit
  • branching is explicit
  • the terminal outcome is only the last step in a visible search trace

The provenance contract in provenance.md remains valid. The harness trajectory adds stepwise state transitions on top of that rather than replacing provenance.

Harness log structure

A harness run should emit one episode record plus zero or more step records and exactly one terminal record.
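A reader-side structural check for that shape might look like the following sketch. The record layouts are assumptions from this contract, with finalize and abstain treated as the terminal step types.

```python
TERMINAL_STEP_TYPES = {"finalize", "abstain"}

def check_harness_run(episode_records: list, step_records: list) -> None:
    """Sketch: verify one episode record and exactly one terminal
    record per harness run, with the terminal outcome as the last
    step in the trace."""
    assert len(episode_records) == 1, "exactly one episode record"
    terminals = [s for s in step_records
                 if s["step_type"] in TERMINAL_STEP_TYPES]
    assert len(terminals) == 1, "exactly one terminal record"
    assert step_records[-1]["step_type"] in TERMINAL_STEP_TYPES, \
        "terminal outcome must be the last step"
```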

Required episode-level fields:

  • episode_id
  • query
  • anchor_market
  • window_id
  • policy_id
  • step_budget
  • step_count
  • terminal_action

Required per-step fields:

  • step_id
  • step_index
  • step_type
  • action_name
  • action_args
  • artifact_ids_read
  • working_set_before
  • working_set_after
  • context_pressure_class

Recommended per-step fields:

  • branch_parent_step_id
  • subquery_type
  • selected_artifact_ids
  • dropped_artifact_ids
  • stop_candidate
  • notes

For deterministic replay, step_index is the primary ordering field. Wall-clock timestamps are optional and non-normative.
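A replay reader ordering steps by step_index could be sketched as follows; this is a hypothetical reader, not the shipped one, and it deliberately ignores timestamps.

```python
import json

def replay_order(step_lines: list) -> list:
    """Order harness step records for replay by step_index, the
    primary ordering field; wall-clock timestamps are non-normative
    and ignored."""
    steps = [json.loads(line) for line in step_lines]
    return sorted(steps, key=lambda s: s["step_index"])

# Lines may land in the file out of order; replay still reconstructs
# the canonical sequence.
step_lines = [
    '{"step_id": "st-2", "step_index": 1, "step_type": "keep_artifact"}',
    '{"step_id": "st-1", "step_index": 0, "step_type": "env_read"}',
]
ordered = replay_order(step_lines)
# ordered[0]["step_id"] == "st-1"
```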

Frozen v1 harness step types

The following step types are frozen for the Phase 0 contract:

  • env_read: a deterministic environment read action returned one or more artifacts
  • branch_subquery: the policy requested a shallow child retrieval turn
  • keep_artifact: the policy added an artifact to the working set
  • drop_artifact: the policy removed an artifact from the working set
  • prune_working_set: the policy removed multiple inactive artifacts from the working set
  • decision_update: the policy revised a provisional stop or confidence state before termination
  • finalize: the episode ended with finalize_signal or finalize_low_signal
  • abstain: the episode ended with no bounded decision

action_name should match the corresponding harness action name from search_harness.md.

Step-type specific fields

env_read

Required:

  • action_name
  • action_args
  • artifact_ids_read
  • working_set_before
  • working_set_after

Rules:

  • reading does not imply keeping
  • working_set_after may be identical to working_set_before

branch_subquery

Required:

  • subquery_type
  • branch_parent_step_id

Rules:

  • the child turn writes any produced artifacts into the same episode registry
  • child artifacts must still be explicitly kept if the policy wants them in the working set

keep_artifact

Required:

  • selected_artifact_ids

Rules:

  • for v1, this should usually contain exactly one artifact ID
  • working_set_after must contain the selected artifact IDs

drop_artifact

Required:

  • dropped_artifact_ids

Rules:

  • dropping removes an artifact from active context only
  • the artifact remains in the replay environment for possible later revisit

prune_working_set

Required:

  • dropped_artifact_ids
  • action_args.reason

Rules:

  • use this for multi-artifact cleanup when context pressure is high
  • the reason should be short and schema-bounded

decision_update

Required:

  • stop_candidate

Rules:

  • this is non-terminal
  • use it to record provisional leaning before finalize or abstain

finalize

Required:

  • terminal_action=finalize
  • action_args.decision_class
  • selected_artifact_ids
  • action_args.stop_reason

Rules:

  • decision_class must be finalize_signal or finalize_low_signal
  • selected_artifact_ids must equal the terminal retained set

abstain

Required:

  • terminal_action=abstain
  • selected_artifact_ids
  • action_args.stop_reason

Rules:

  • decision_class must be omitted or null
  • selected_artifact_ids may be empty
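The per-type requirements above can be collected into a single validator sketch. The tables restate the rules in this section; they are not the shipped schema, and a real validator would also enforce the per-type rules (e.g. terminal retained-set equality).

```python
# Fields required on every step (from "Required per-step fields").
COMMON_REQUIRED = {"step_id", "step_index", "step_type", "action_name",
                   "action_args", "artifact_ids_read",
                   "working_set_before", "working_set_after",
                   "context_pressure_class"}

# Extra top-level fields required per step type.
TYPE_REQUIRED = {
    "env_read": set(),
    "branch_subquery": {"subquery_type", "branch_parent_step_id"},
    "keep_artifact": {"selected_artifact_ids"},
    "drop_artifact": {"dropped_artifact_ids"},
    "prune_working_set": {"dropped_artifact_ids"},
    "decision_update": {"stop_candidate"},
    "finalize": {"selected_artifact_ids"},
    "abstain": {"selected_artifact_ids"},
}

def missing_fields(step: dict) -> set:
    """Return the set of required fields a step record is missing."""
    required = COMMON_REQUIRED | TYPE_REQUIRED[step["step_type"]]
    missing = required - step.keys()
    # Step types that also require specific action_args entries.
    args = step.get("action_args", {})
    if step["step_type"] == "prune_working_set" and "reason" not in args:
        missing.add("action_args.reason")
    if step["step_type"] in ("finalize", "abstain") and "stop_reason" not in args:
        missing.add("action_args.stop_reason")
    if step["step_type"] == "finalize" and "decision_class" not in args:
        missing.add("action_args.decision_class")
    return missing
```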

Working-set semantics

The working set is the model-visible evidence context, not the full environment.

Normative rules:

  • every step records working_set_before and working_set_after
  • a kept artifact stays active until it is dropped, pruned, or the episode ends
  • artifacts outside the working set remain addressable by ID if they are still in the episode registry
  • trajectory readers must be able to reconstruct the active context at every step from the log alone

This is the main behavioral difference from the current one-shot pipeline, which has no explicit working-set state.
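The reconstruction requirement can be expressed as a reader-side sketch that replays working_set_before/working_set_after and checks stepwise continuity. This is hypothetical code written against the normative rules above, not the shipped reader.

```python
def reconstruct_working_sets(steps: list):
    """Yield (step_id, active_context) for each step, reconstructed
    from the log alone. Raises if a step's working_set_before does
    not match the previous step's working_set_after."""
    active = None
    for step in sorted(steps, key=lambda s: s["step_index"]):
        before = set(step["working_set_before"])
        if active is not None and before != active:
            raise ValueError(f"discontinuity at step {step['step_id']}")
        active = set(step["working_set_after"])
        yield step["step_id"], active
```

A trajectory that fails this check cannot satisfy the rule that the active context is reconstructible at every step.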

Backward compatibility

Phase 0 does not require retrofitting historical one-shot runs into this schema.

Compatibility expectations for later implementation:

  • existing one-shot search behavior should continue to work without harness trajectory output
  • harness-specific step fields should be additive rather than breaking for existing consumers
  • provenance remains the minimal audit surface for one-shot runs

Operational notes

  • Logging is best-effort and should never break product behavior.
  • Current CLI and MCP outputs remain compatible; trajectory logs are written to disk only.
  • Multi-step workflow structure is intentionally deferred to the harness implementation slices.