Docs › Contributors
Contributors
Trace Replay Testing
Trace Replay Testing
Trace replay tests compare the engine against a frame-by-frame recording from the original ROM running in BizHawk. They are the highest-signal tests in the repo for physics, object timing, spawn timing, and collision parity work.
Workflow skill. When investigating or fixing a
*TraceReplaytest failure, invoke thetrace-replay-bug-fixingskill (mirrored at.claude/skills/trace-replay-bug-fixing/skill.mdand.agents/skills/trace-replay-bug-fixing/SKILL.md). It codifies the comparison-only invariant, the four mission rules, the diagnose-fix-regen-loop workflow, recorder extension recipe, and pointers to disassembly and process skills.
The Comparison-Only Invariant
The engine must be able to play back any BK2 movie and produce ROM-correct behaviour natively, with no trace data on its inputs. Trace replay tests prove this — they are not state-syncing harnesses.
- The only input the engine receives during a trace replay test is the BK2 controller stream (Player 1 buttons each frame). Everything else — sidekick AI, oscillator phase, object state, audio, RNG, sub-pixel position — must evolve natively from the same starting conditions ROM had at frame 0.
physics.csvandaux_state.jsonl(or their.gzvariants) are read-only diagnostic data. They feed the divergence comparator and the divergence report. They are not allowed to write back into engine state in committed test code.- Pre-trace bootstrap is fine. Setting starting position, RNG seed, oscillation pre-advance, and frame counter once at frame 0 is “load a save state at the BK2 starting point”. The prohibition is on per-frame write-back during the comparison loop.
If a trace replay test passes only because engine state is snapped back to ROM-correct values each frame, the engine has not been verified — it has been masked. Honest tests force honest engine fixes.
Current Trace Coverage
| Game | Test class | Trace directory |
|---|---|---|
| Sonic 1 | TestS1Ghz1TraceReplay | src/test/resources/traces/s1/ghz1_fullrun/ |
| Sonic 1 | TestS1Mz1TraceReplay | src/test/resources/traces/s1/mz1_fullrun/ |
| Sonic 1 | TestS1Credits00..07*TraceReplay (×8) | src/test/resources/traces/s1/credits_*/ |
| Sonic 2 | TestS2Ehz1TraceReplay | src/test/resources/traces/s2/ehz1_fullrun/ |
| Sonic 3&K | TestS3kAizTraceReplay | src/test/resources/traces/s3k/aiz1_to_hcz_fullrun/ |
| Sonic 3&K | TestS3kCnzTraceReplay | src/test/resources/traces/s3k/cnz/ |
The Sonic 1 credits demos use the ROM’s built-in ending replays as the input source and do not require BK2 movies (the ROM owns the input stream).
Each trace directory contains:
metadata.json— trace metadata and start statephysics.csv(orphysics.csv.gz) — per-frame core stateaux_state.jsonl(oraux_state.jsonl.gz) — richer diagnostic events- a
.bk2movie used to drive the engine replay (except credits-demo traces)
Larger traces are stored gzip-compressed (.gz). The Java parser and the test harness load
both forms transparently. See Compressing trace payloads below
for the workflow.
Replay phase is derived from recorded ROM counters:
gameplay_frame_counterchanges only when the level main loop completedvblank_counterchanges on every VBlanklag_counteris diagnostic where the ROM exposes it
When To Use It
Use trace replay when:
- a bug is about ROM parity rather than just local correctness
- a fix affects movement, terrain collision, object timing, or hurt/bounce flow
- you need to prove that a change fixed the first real divergence instead of only masking symptoms
Prefer ordinary headless tests for small isolated behaviours. Use trace replay when the interaction between multiple systems is the thing under test.
Running Existing Trace Tests
Trace replay tests require:
- the matching ROM for the game(s) you want to replay, in the repo root or via system property:
- Sonic 1:
Sonic The Hedgehog (W) (REV01) [!].gen(or-Dsonic1.rom.path=...) - Sonic 2:
Sonic The Hedgehog 2 (W) (REV01) [!].gen(or-Dsonic2.rom.path=...) - Sonic 3 & Knuckles:
Sonic and Knuckles & Sonic 3 (W) [!].gen(or-Ds3k.rom.path=...)
- Sonic 1:
- the
.bk2file to be present in the trace directory (except for ROM-driven credits demo traces)
Run all trace tests:
mvn test -Dtest='*TraceReplay'
Run cross-game baseline set (the canonical “did I regress anything?” check):
mvn test -Dtest='TestS1Ghz1TraceReplay,TestS1Mz1TraceReplay,TestS2Ehz1TraceReplay,TestS3kAizTraceReplay,TestS3kCnzTraceReplay'
Run one trace:
mvn test -Dtest=TestS1Mz1TraceReplay
Note on
-Dtest=filtering. In some configurations the surefire-Dtest=filter does not restrict the run (the full test suite executes regardless). If you observepassed=NwhereN > 100for a focused selection, that’s the signal. Currently a known followup; track via thetarget/surefire-reports/TEST-*.xmlfiles for actual focused results.
Useful optional override while tuning S1 oscillation alignment:
mvn test -Dtest=TestS1Mz1TraceReplay -Dosc.override=0
Reports are written to target/trace-reports/. On divergence you will typically see:
<game>_<zone>_report.json(e.g.s3k_aiz1_report.json,s3k_cnz1_report.json)<game>_<zone>_context.txt(e.g.s3k_aiz1_context.txt)
The JSON groups divergences by field and frame range. The context file is the fastest way to find the first non-cascading failure — open it, scroll to the first error, look at the side-by-side ROM-vs-engine columns.
Diagnosing tip — “phantom landing” patterns. When a sidekick lands on a surface ROM keeps falling through (engine
tails_y_speed=0x0000andtails_g_speed=<prior_x_speed>while ROMtails_y_speedkeeps advancing by gravity), the fingerprint maps to one of two ROM paths: terrainSonic_HitFlooror solid-objectSolidObjectTopSloped2/SolidObject_Landed. Add asetGSpeedvelocity-probe inAbstractPlayableSpritethat prints the call site and check the stack; if it points intoObjectManager.SolidContacts.resolveContactInternal, the offender is a solid object whose state machine has drifted from ROM (e.g. AIZ F7127 was the AIZ collapsing platform stuck instate==2/solid-stay forever; ROM’sloc_205DEat sonic3k.asm:44850-44854 unconditionally promotes to the falling state when its post-fragment timer underflows, but the engine had been gating that promotion on a still-standing player). The correct fix touches the object’s state machine, not the collision sensor.
Recording Or Refreshing A Trace
The recorder workflow is game-specific at the BizHawk entrypoint. The emitted trace contract has
diverged per game as each game’s recorder evolves to capture more diagnostic state — schema
versions are tracked independently in metadata.json (trace_schema, csv_version,
lua_script_version).
| Game | Recorder script | Launcher | metadata.json["game"] | gameplay_frame_counter | vblank_counter | lag_counter | Current trace_schema / csv_version / lua_script_version |
|---|---|---|---|---|---|---|---|
| Sonic 1 | s1_trace_recorder.lua | record_trace.bat | s1 | 0xFE04 | 0xFE0E | placeholder 0 | 3 / 4 / 3.0 |
| Sonic 2 | s2_trace_recorder.lua | record_s2_trace.bat | s2 | 0xFE04 | 0xFE0E | placeholder 0 | 8 / 6 / 9.1-s2 |
| Sonic 3&K | s3k_trace_recorder.lua | record_s3k_trace.bat | s3k | 0xFE08 | 0xFE12 | 0xF628 diagnostic counter | 5 / 5 / 6.4-s3k |
Schema versioning rules:
trace_schema— bumps on CSV column changes only.csv_version— bumps independently fromtrace_schemawhen CSV semantics change but column count stays the same.lua_script_version— bumps on any recorder change (CSV, aux, behaviour, hooks). Per-game versions exist independently.
Prerequisites:
- BizHawk 2.11 installed at
docs/BizHawk-2.11-win-x64/EmuHawk.exe, or setBIZHAWK_EXE - the matching game ROM for the recorder you are using
- a BK2 movie for the act you want to record
Examples:
tools\bizhawk\record_trace.bat ^
"Sonic The Hedgehog (W) (REV01) [!].gen" ^
"docs\BizHawk-2.11-win-x64\Movies\s1-mz1.bk2"
tools\bizhawk\record_s2_trace.bat ^
"Sonic The Hedgehog 2 (W) (REV01) [!].gen" ^
"docs\BizHawk-2.11-win-x64\Movies\s2-ehz1.bk2"
tools\bizhawk\record_s2_trace.bat ^
"Sonic The Hedgehog 2 (W) (REV01) [!].gen" ^
"docs\BizHawk-2.11-win-x64\Movies\s2-lvl-select-CPZ.bk2" ^
level_gated_reset_aware
PowerShell -NoProfile -ExecutionPolicy Bypass -File tools\bizhawk\record_s2_level_select_traces.ps1 ^
-RomPath "Sonic The Hedgehog 2 (W) (REV01) [!].gen" ^
-Only cpz
tools\bizhawk\record_s3k_trace.bat ^
"Sonic and Knuckles & Sonic 3 (W) [!].gen" ^
"docs\BizHawk-2.11-win-x64\Movies\s3k-aiz1.bk2"
tools\bizhawk\record_s3k_trace.bat ^
"Sonic and Knuckles & Sonic 3 (W) [!].gen" ^
"src\test\resources\traces\s3k\aiz1_to_hcz_fullrun\s3k-aiz1-aiz2-sonictails.bk2" ^
aiz_end_to_end
The recorder writes output relative to the recorder script:
tools/bizhawk/trace_output/
After recording, copy:
metadata.jsonphysics.csvaux_state.jsonl
into the target trace directory under src/test/resources/traces/<game>/..., and place the .bk2
movie in the same directory if it is not already there.
Sanity-check metadata.json before trusting the trace:
gameshould bes1,s2, ors3kto match the recorderzoneandactshould match the intended recordingtrace_frame_countshould be in the expected range for that routestart_xandstart_yshould look plausible for the level starttrace_schema,csv_version,lua_script_versionshould match the current per-game values in the recorder source (see the table above)bizhawk_versionshould be2.11genesis_coreshould beGenplus-gx
Sonic 2 level-select traces use the level_gated_reset_aware profile. The profile discards the
shared setup section when debug mode exits EHZ back to the menu, then starts the persisted trace at
the selected gameplay route. For these traces, zone_id is the engine progression id used by the
Java replay/catalog code, rom_zone_id is the raw Sonic 2 ROM zone id, and act remains 1-based in
metadata. Frame 0 must include either zone_act_state or a gameplay_start checkpoint with
game_mode: 12 so the fixture cannot accidentally begin in the menu. The generator script
normalizes the fixture input column from the BK2 log, then validates that metadata, BK2 input
alignment, frame-zero marker, and gzip-only payload storage before copying fixtures into
src/test/resources/traces/s2/<route>/.
The Sonic 2 DEZ ending movie is catalog/parser-only until replay support for the ending sequence is proven; keep it out of replay test discovery unless that support is added deliberately.
If trace_schema is below the engine’s expected version for that game, the parser will load
the trace but warn about missing fields. Older fixtures may need regeneration.
S3K traces additionally include aux_schema_extras listing opt-in per-frame event types — see
Aux event types below.
Interpret the execution counters the same way across games:
gameplay_frame_counteradvances only when the level main loop completedvblank_counteradvances on every VBlanklag_counteris meaningful only for Sonic 3&K; Sonic 1 and Sonic 2 record0
Aux event types
aux_state.jsonl contains a stream of one-JSON-object-per-line events. Standard events are
present in every game’s trace:
zone_act_state— emitted on zone/act/apparent-act/game-mode changecheckpoint— semantic milestones (e.g.gameplay_start,aiz2_reload_resume)state_snapshot— periodic pose/state capturesmode_change— Sonic/Tails routine transitionsslot_dump— periodic OST dumpsobject_appeared/object_near/object_removed— object-window tracking
S3K traces opt in to additional per-frame diagnostic events declared in aux_schema_extras.
The current set (S3K recorder v6.4-s3k):
| Aux key | Event type | Purpose |
|---|---|---|
cpu_state_per_frame | cpu_state | Sidekick CPU routine + flight timer + cached Tails_CPU_interact per frame |
oscillation_state_per_frame | oscillation_state | ROM oscillator counters per frame |
object_state_per_frame | object_state | Per-slot OST first 32 bytes for nearby objects |
interact_state_per_frame | interact_state | Per-player status / status_secondary / object_control bytes |
cage_state_per_frame | cage_state | CNZ wire-cage status + per-player phase/state bytes |
cage_execution_per_frame | cage_execution | M68K execution-hook hits inside CNZ cage routines (BizHawk Lua event.onmemoryexecute) |
velocity_write_per_frame | velocity_write | Per-frame writer-PC trace for Tails x_vel / y_vel (BizHawk Lua event.onmemorywrite, frame-window-gated) |
position_write_per_frame | position_write | Per-frame writer-PC trace for Tails x_pos / y_pos with the captured (a1)/(a0) registers (v6.11-s3k recorder; BizHawk Lua event.onmemorywrite, frame-window-gated). Used to disambiguate Player_1 vs Player_2 targeting in routines like SolidObjectFull2_1P that loop both players |
solid_object_cont_entry_per_frame | solid_object_cont_entry | Per-frame snapshot of (a0)/(a1)/d1/d2 plus y_radius/default_y_radius and pixel x/y on entry to SolidObject_cont (sonic3k.asm:41394 / ROM 0x1DF90; v6.11-s3k recorder; BizHawk Lua event.onmemoryexecute, frame-window-gated). Used to reconstruct the loc_1DFD6+/loc_1E154 push-up branch d3/d4 conditional after the fact |
All event types are diagnostic only — they feed the divergence comparator and the divergence report’s context window. They are never written into engine state by the test harness in committed code (see Comparison-Only Invariant).
Adding a new event type:
- Lua: emit a JSONL line with a new
eventfield; bumplua_script_version; add the opt-in key toaux_schema_extras. - Java parser: add a new sealed-record subtype to
TraceEvent.java; parse it inparseJsonLine; addTraceMetadata.hasPerFrame<Feature>()+ a typed accessor onTraceData. Keep parsers tolerant: old traces without the new key must still load. - Wire the data into
DivergenceReport.getContextWindowrendering or a probe class. - Regenerate the affected trace(s). Commit the regen as a separate logical change from the recorder schema bump.
Compressing trace payloads
Larger aux_state.jsonl and physics.csv files are stored gzip-compressed in the repo to keep
under GitHub’s 100 MB hard limit (50 MB recommended max). The Java parser
(TraceData.load) loads either form
transparently. Use the helper script after regenerating a large trace:
powershell -File tools/traces/compress-traces.ps1 -Path tools/bizhawk/trace_output -Recurse
The script:
- Walks
aux_state*.jsonlandphysics*.csvfiles. - Skips files below 1 MB (override with
-ThresholdBytes <bytes>). - Compresses to
<file>.gz, verifies the round-trip via SHA-256, and removes the uncompressed source unless-KeepOriginalis set.
After regenerating a trace, copy the resulting .gz files (alongside metadata.json) into the
test resources tree. Never commit the uncompressed aux_state.jsonl or physics.csv for
the larger zones — GitHub will reject the push.
Recording The S3K End-To-End Fixture
Use the aiz_end_to_end profile when refreshing the full AIZ intro through HCZ handoff trace.
That profile starts at BK2 frame 0 and augments aux_state.jsonl with:
- edge-triggered
zone_act_stateevents whenever actual zone/act, apparent act, or game mode changes - one-shot semantic
checkpointevents forintro_begin,gameplay_start,aiz1_fire_transition_begin,aiz2_reload_resume,aiz2_main_gameplay,hcz_handoff_begin, andhcz_handoff_complete
Workflow:
- Keep the locked-on ROM at the repo root.
- Run:
tools\bizhawk\record_s3k_trace.bat ^
"Sonic and Knuckles & Sonic 3 (W) [!].gen" ^
"src\test\resources\traces\s3k\aiz1_to_hcz_fullrun\s3k-aiz1-aiz2-sonictails.bk2" ^
aiz_end_to_end
- Run the fixture sanity gate against the uncompressed
tools\bizhawk\trace_output\aux_state.jsonl(do this BEFORE compressing):
$events = Get-Content tools/bizhawk/trace_output/aux_state.jsonl | ForEach-Object { $_ | ConvertFrom-Json }
$required = 'intro_begin','gameplay_start','aiz1_fire_transition_begin','aiz2_reload_resume','aiz2_main_gameplay','hcz_handoff_begin','hcz_handoff_complete'
$frames = $events | Select-Object -ExpandProperty frame
if (($events | Where-Object event -eq 'checkpoint' | Where-Object name -eq 'intro_begin').Count -ne 1) { throw 'intro_begin count != 1' }
if ((($frames | Sort-Object) -join ',') -ne ($frames -join ',')) { throw 'frame order regressed' }
foreach ($name in $required) {
if (($events | Where-Object event -eq 'checkpoint' | Where-Object name -eq $name).Count -ne 1) {
throw "required checkpoint missing or duplicated: $name"
}
}
for ($i = 1; $i -lt $events.Count; $i++) {
if ($events[$i].frame -eq $events[$i-1].frame -and $events[$i].event -eq 'zone_act_state' -and $events[$i-1].event -eq 'checkpoint') {
throw "zone_act_state emitted after checkpoint on frame $($events[$i].frame)"
}
}
Expected result: no output and no exception.
- Compress the large payloads:
powershell -File tools\traces\compress-traces.ps1 -Path tools\bizhawk\trace_output -Recurse
- Copy
tools\bizhawk\trace_output\metadata.json,physics.csv.gz, andaux_state.jsonl.gzintosrc\test\resources\traces\s3k\aiz1_to_hcz_fullrun\.
Recording Sonic 1 Credits Demo Traces
Use the dedicated credits recorder when you want the ROM’s ending replays rather than a human-played
BK2 route. This path forces GM_Credits from the title screen, waits for each demo to become active,
and records each replay into its own trace directory under tools/bizhawk/trace_output/credits_demos/.
Command:
tools\bizhawk\record_s1_credits_traces.bat ^
"Sonic The Hedgehog (W) (REV01) [!].gen" ^
all
Record one replay only:
tools\bizhawk\record_s1_credits_traces.bat ^
"Sonic The Hedgehog (W) (REV01) [!].gen" ^
3
Supported replay indices:
0=ghz1_credits_demo_11=mz2_credits_demo2=syz3_credits_demo3=lz3_credits_demo4=slz3_credits_demo5=sbz1_credits_demo6=sbz2_credits_demo7=ghz1_credits_demo_2
Output layout:
tools/bizhawk/trace_output/credits_demos/manifest.jsontools/bizhawk/trace_output/credits_demos/00_ghz1_credits_demo_1/tools/bizhawk/trace_output/credits_demos/01_mz2_credits_demo/...
Each replay directory contains the same core files as a normal trace capture:
metadata.jsonphysics.csvaux_state.jsonl
Credits-demo metadata adds a few fields that are useful for future trace-framework work:
trace_type = "credits_demo"input_source = "rom_ending_demo"credits_demo_indexcredits_demo_slugemu_frame_start
Notes:
- No
.bk2is produced or required for this workflow. - The script exits after the last gameplay replay finishes; it does not wait through the final
text-only “PRESENTED BY SEGA” card or the
TRY AGAIN / ENDscreen. - The capture format is intentionally aligned with the existing S1
physics.csv/aux_state.jsonlschema so the replay-side test harness can be extended later without inventing a parallel parser.
Adding A New Trace Test
- Create a new directory under
src/test/resources/traces/<game>/<name>/(e.g.src/test/resources/traces/s3k/mgz/). - Record (see Recording Or Refreshing A Trace) and run
tools/traces/compress-traces.ps1if your payload sizes warrant it. Copymetadata.json,physics.csv(or.csv.gz),aux_state.jsonl(or.jsonl.gz), and the.bk2movie into the target directory. - Add a test class under
src/test/java/com/openggf/tests/trace/<game>/extendingAbstractTraceReplayTest. - Return the correct engine zone index, act index, and trace directory path.
- Run the new test once and inspect
target/trace-reports/before treating the baseline as valid.
Minimal example (S3K):
@RequiresRom(SonicGame.SONIC_3K)
public class TestS3kMgzTraceReplay extends AbstractTraceReplayTest {
@Override
protected SonicGame game() { return SonicGame.SONIC_3K; }
@Override
protected int zone() { return 7; } // engine zone index, not raw ROM v_zone
@Override
protected int act() { return 0; }
@Override
protected Path traceDirectory() {
return Path.of("src/test/resources/traces/s3k/mgz");
}
}
Important: the test’s zone() is the engine’s zone index, not necessarily the raw ROM v_zone
value. TestS1Mz1TraceReplay.java
shows the Marble Zone mapping nuance.
Reading Divergences
Focus on the first non-cascading error. Everything after that may be fallout.
The main sources are:
physics.csv(or.csv.gz) Core compared fields: position (x/y), velocities (x_speed/y_speed/g_speed), angle, air/rolling flags, ground mode, status byte. For S2 / S3K, also Tails columns.aux_state.jsonl(or.jsonl.gz) Event stream — see Aux event types for the full taxonomy.target/trace-reports/<game>_<zone>_context.txtSide-by-side ROM and engine diagnostic context generated byAbstractTraceReplayTest.
The current recorder/test pipeline also carries diagnostic-only fields that do not fail the test by themselves, but greatly narrow investigation:
- subpixel position
- routine (Sonic / Tails CPU)
- camera position
- ring count
- status byte (and per-game
status_secondary/object_control) - gameplay frame counter
- VBlank counter
- lag counter
- ridden object slot / latched solid object
- ObjPosLoad cursor state
- (S3K only) Tails CPU flight timer, oscillation phase, spring/cage execution path
Typical debugging pattern:
- Open the first error in the context file.
- Check whether the first mismatch is terrain, object contact, hurt/bounce, or spawn timing.
- Cross-reference the same frame in the aux event stream (use
TraceData.eventsForFrame(K)or open the JSONL/JSONL.gz directly withgzip -d -c | grep '"frame":K'). - Trace the divergence backward to the earliest divergent frame — that’s the real bug, not necessarily the first-error frame.
- If the divergence involves object control, participant selection, or object lifetime, map it
to
ObjectControlState,ObjectPlayerQuery/ObjectPlayerParticipationPolicy, orObjectLifetimeOpsbefore adding local trace-specific logic. - Only after the first error is understood should you trust later divergences.
The trace-replay-bug-fixing skill contains a full diagnose-fix-regen-loop workflow with
ROM-citation requirements and per-game parity rules.
Common Pitfalls
- Missing
.bk2file: the test will skip even ifmetadata.jsonandphysics.csvexist. - Wrong zone index in the test class: the metadata may be correct while the engine loads the wrong level.
- Regenerating only
physics.csvbut notaux_state.jsonlormetadata.json: diagnostics become misleading. - Treating warning-only 1px drift as the root cause when a later object timing error is the real break.
- Updating the trace baseline before understanding whether a behaviour change is a real fix or a regression.
- Committing uncompressed large traces. GitHub will reject any push containing files
100 MB. Always run
tools/traces/compress-traces.ps1after regenerating a large trace. - Adding per-frame trace-driven hydration to the test harness. Setting engine state from recorded values mid-replay violates the comparison-only invariant and masks engine bugs. If you find yourself wanting to “preserve” or “set” engine fields each frame from the trace, the engine probably has a real bug — fix it instead.
- Fixing object trace failures with one-off local control/lifetime logic. Prefer
ObjectControlStatefor native object-control bits,ObjectPlayerQueryplusObjectPlayerParticipationPolicyfor player/sidekick participation, andObjectLifetimeOpsfor destruction, respawn-latch, offscreen-expiry, and slot-transfer semantics. If the current engine path still needs alevel.objectscompatibility wrapper, use that wrapper and keep the trace fix behavior-neutral outside the diagnosed divergence. - Growing guard baselines to land a trace fix. Baselines are for known legacy violations and should ratchet downward as object/boss/badnik code migrates. A new trace fix should not add raw focused-player access, first-sidekick access, object-control setters, or direct lifecycle mutation unless the native-only reason is documented and covered by the guard.
- Branching on game id in shared physics/AI code. Per-game divergences must be gated via
PhysicsFeatureSetflags, neverif (gameId == GameId.S3K). When ROM uses a different semantic on a value that exists across games (e.g.cmp.w y_pos(a0),d0inPlayer_LevelBoundreads centre-Y while the engine’sgetY()returns top-left), prefer adding a feature flag (default-false on games whose trace baselines were calibrated against the engine’s prior behaviour) over flipping the global default in the same change. SeePhysicsFeatureSet.levelBoundaryUsesCentreYfor the canonical example: ROM-cited as correct for S1/S2/S3K but enabled only on S3K initially so S1 GHZ/MZ1 and S2 EHZ baselines stay green until they are re-validated. Another example isPhysicsFeatureSet.solidObjectTopBranchAlwaysLiftsOnUpwardVelocity, which enables ROMloc_1E154(sonic3k.asm:41606-41632) writing the top-branch position lift before thetst.w y_vel(a1) / bmi.s loc_1E198test only on S3K; S1Solid_Landed(s1disasm/_incObj/sub SolidObject.asm:278) and S2SolidObject_Landed(s2.asm:35379-35380) bail on upward y_vel BEFORE any lift, so the flag stays false there. - Edit-tool BOM/CRLF silent failure. Some files (e.g.
CnzCylinderInstance.java,AbstractTraceReplayTest.java) have UTF-8 BOM + CRLF endings. The Claude Code Edit tool has been observed to silently fail on these — it returns “successfully” but doesn’t write. Fall back to Pythonopen(path, 'rb').read().replace(...)-style edits if you suspect this. - Misreading aux trace
cpu_stateevent field semantics. S3K’scpu_stateevent reportscpu_routineas the rawTails_CPU_routinebyte —0x06is the NORMAL ground-following AI (loc_13D4A, sonic3k.asm:26656), NOT a “FLY” routine. Similarly,flight_timerissub_13EFC’sStatus_OnObjwatchdog used by NORMAL routine (sonic3k.asm:26816-26847), not an active-flight indicator; it ticks any time Tails is riding the sameTails_CPU_interactslot, regardless of whether Tails is flying. Andtarget_x/target_yare leader-history positions (the CPU’s follow target), not Tails’s own position — for Tails’s actual coordinates, use theobject_stateevent for slot 1 in the same frame. Always cross-referenceTails_CPU_Control_Index(sonic3k.asm:26368-26386) when interpretingcpu_routinevalues, and do not infer “Tails is flying” from a non-zeroflight_timeralone.
Related Files
Test harness:
Parser (production code, used by tests + diagnostics):
TraceData.javaTraceMetadata.javaTraceFrame.javaTraceEvent.javaTraceBinder.javaDivergenceReport.java
Recorders + launchers:
s1_trace_recorder.luas2_trace_recorder.luas3k_trace_recorder.luas1_credits_trace_recorder.luarecord_trace.batrecord_s2_trace.batrecord_s3k_trace.batrecord_s1_credits_traces.bat
Trace payload tooling:
Skills:
trace-replay-bug-fixing— the canonical workflow for any*TraceReplaytest failure.s1-trace-replay— Sonic 1 trace recording specifics.s1-retro-trace— recording S1 traces using stable-retro instead of BizHawk.
Next Steps
Diagnostic note: validating hypotheses against the recorded JSONL
When a trace replay test fails and the failure has been attributed to a
specific in-game object (e.g. “napalm projectile ride-bridge release”
in S3K AIZ F7552), always verify the hypothesis by inspecting the
recorded JSONL aux state directly before writing engine code. The
trace files live at
src/test/resources/traces/<game>/<run>/aux_state.jsonl.gz. Search
for the relevant per-frame events (sidekick_interact_object,
object_state with the suspect object code, aiz_transition_floor_solid,
etc.) across the suspect frame window and confirm the prerequisite
state matches the hypothesis.
For the AIZ F7552 case (round 2) this check disproved the prior-round
ride-bridge hypothesis: the suspected interact slot was
destroyed=true for ~50 frames before the divergent frame, ruling out
any Solid_Object_Detach-style release as the cause. Skipping this
check would have produced an engine fix targeted at the wrong root
cause — a hack by the repo policy in CLAUDE.md. Documented in
docs/S3K_KNOWN_BUGS.md under the AIZ F7552 entry.
For the AIZ F8927 case (round 1, diagnosis-only) the same check
disproved a “swing/vine peak handling” guess that fell out of the
prior round’s F7660 fix: at F8927 Sonic’s status=0x06 (Air|Roll)
with obj_control=0x00 and interact_slot=9 pointing to a static
sprite (object_code=0x0002BF5A) that has not moved since F7252, so
no swing or carry object is active. The actual signature in the
recorded data is a 5-frame 0/0x18/0x30/0x48/0x60/0 cycle on
x_speed with a sub-x snap pushback once per cycle — the canonical
ROM “rolling-air sliding into a flush right-side wall” pattern from
SonicKnux_DoLevelCollision’s CheckRightWallDist arm
(sonic3k.asm:24061-24065). This narrowed the next-round
investigation to the engine’s airborne right-wall probe vs the
player path’s fixed addi.w #$A,d3 (sonic3k.asm:20195) before any
engine code was changed.