admin 89c95a70a2 feat: loudness walk (U/I) keeps its own cursor, independent of J/K

The clip bar had one shared cursor, so pressing U/I re-derived the
loudness rank from wherever you currently were. A J/K detour to review
clips around an interesting spot therefore hijacked the loudness walk:
returning to U/I continued from the J/K position, not from where the
ranking left off.

Add a separate scoreCursor that only U/I (and "highlights only" auto-
advance) move; J/K never touches it. So: U/I to a loud moment, J/K to
review the time-adjacent clips, U/I again resumes the ranking exactly
where you left it. scoreCursor resets to -1 on every explicit jump /
queue re-arm (hideClipBar, chip clicks, day-highlights arm) so the next
U/I re-anchors on the selected section.

Also label the position count "by time" (J/K) or "by loudness" (U/I) so
the blind user can hear which dimension the count is in — the two looked
identical before and switched meaning silently when changing keys.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-13 10:28:52 +02:00


CLAUDE.md

Guidance for Claude Code when working in this repository.

Rules

Always update README.md when user-facing behaviour changes (flags, endpoints, Docker setup, features), and commit it in the same commit as the code change. README is the external reference; CLAUDE.md documents internals.
Run python -m pytest tests/ after changing isr.py or web.py (tests cover the recorder and the loud-section detector).

Files

| File | Purpose |
| --- | --- |
| `isr.py` | Recorder: streams (Icecast/HTTP) + ALSA soundcards, time-aligned file splits |
| `web.py` | Archive browser: HTTP server, file listing, RMS loudness analysis, cut/delete |
| `webui.html` | Single-page UI (HTML/CSS/JS), loaded by `web.py` at startup; must sit next to `web.py` and be copied in the Dockerfile |
| `config.ini` | Recording sources; copy from `config.example.ini`. `[general]` gives defaults, every other section is a source (`type = stream` or `type = soundcard`) |
| `asound.conf` | dsnoop device `shared_mic` so ISR and other ALSA apps can share a soundcard |

Commands

python isr.py [config.ini]        # recorder; --list-devices to list ALSA inputs
python web.py                     # web UI on :8080 (--dir, --port, --margin, --min-gap, --min-duration, --analyses-dir)
python -m pytest tests/           # test suite
docker compose up -d              # start; web UI mapped to host port 8050
docker compose down               # stop

Dependencies: requests (streams), numpy + soundfile (FLAC output and FLAC analysis/clips — both optional, code degrades gracefully).

Code map

web.py:

Detection: _compute_rms_windows_wav() / analyze_flac() produce 100 ms RMS windows → _noise_floor_db() estimates the rolling floor → _loud_sections() emits scored sections → _package_result() shapes the /api/analyze payload.
Clips: _api_clip() validates params, _clip_wav() / _clip_flac() stream the decoded slice, _wav_header() builds the 44-byte PCM header.
Filenames as a clock: _recording_start() parses the start time out of a filename stem; _cut_filename() turns a (stem, ext, start, end) into a wall-clock-named cut. Both the listing date field and _api_cut() use them.
Live headers: _live_wav_header(), _live_flac_header() (+ _flac_frame_samples(), CRC-8 verified).
Serving: _stream() (Range support), _copy_to_response(), _safe_path() (path traversal guard).
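To make the first step of the detection pipeline concrete, here is a minimal sketch of turning PCM samples into 100 ms RMS windows in dB. It is not the code in web.py: the function name `rms_windows_db`, the mono 16-bit input, the `min_rms` clamp value, and dropping a trailing partial window are all assumptions made for illustration.

```python
import math

def rms_windows_db(samples, sample_rate, window_s=0.1,
                   full_scale=32768.0, min_rms=1e-5):
    """Return one dBFS value per full window of mono 16-bit samples.

    Hypothetical helper; min_rms only guards log10(0) and is not the
    project's MIN_RMS value.
    """
    n = max(1, round(sample_rate * window_s))  # samples per window
    out = []
    # Step through whole windows only; a trailing partial window is dropped.
    for i in range(0, len(samples) - n + 1, n):
        chunk = samples[i:i + n]
        rms = math.sqrt(sum(s * s for s in chunk) / n) / full_scale
        out.append(20 * math.log10(max(rms, min_rms)))
    return out
```

A silent window lands on the clamp, and a constant half-scale signal sits about 6 dB below full scale, which is a quick sanity check for any real implementation.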

webui.html (one <script> block):

Clip review: clipQueue/clipCursor/scoreCursor globals, playClip(), playFileSection(), hideClipBar(); markup is the #clip-bar div. The clip label shows wall-clock occurrence time + dB prominence + position (17:09:56 to 17:09:57 · +30 dB (30 / 426 by time)): queue entries carry absStart (epoch s), derived from fileStartEpoch(f.date) — the filename clock — with in-file offsets as fallback for non-standard names; only the filename/in-file offset lives in the tooltip now. playClip() builds one text string used for both label.textContent and the announce() aria-live message — they must never diverge (a past bug had the announcement read Clip N of M: … while the label read the wall-clock form). Two cursors, deliberately independent: clipCursor is the time-order position and follows every play; scoreCursor is the loudness walk's own position in scoreOrder() and is moved only by U/I (and "highlights only" auto-advance) in stepClip's by-score branch — J/K must never touch it, so a J/K detour to review nearby clips leaves the loudness walk's place intact and the next U/I resumes the ranking where it left off. scoreCursor resets to -1 on every explicit jump / queue (re-)arm (hideClipBar, playFileSection, jumpToDaySection, both day-highlights arm points), so the next U/I re-anchors on clipCursor (the section just selected) else the loudest. playClip(i, byScore): when byScore the count shows scoreCursor (rank, suffix "by loudness"), else i (time-order index, suffix "by time") — the suffix tells the blind user which dimension the count is in. stepClip's by-score branch passes byScore=true; every other caller (chip click, J/K, Prev/Next) leaves it false.
Day review: dayHighlights() builds dayActiveSections (chronological); jumpToDaySection() arms the queue. Section absStart comes from fileStartEpoch(f.date) (filename clock), mtime−duration only as fallback. The user is blind and uses a screen reader — there is deliberately no day-timeline SVG (one existed and was removed on request as useless); the highlights panel is linear text/buttons: summary line → key-hint note → chips toggle → chips. Do not add decorative visualizations; any future graphic must be aria-hidden and must not be the only carrier of information. Chip lists longer than 12 are collapsed behind an aria-expanded toggle button (the .chips[hidden]{display:none} rule is required — the author-level display:flex on .chips would otherwise override the UA [hidden] rule). Group aria-labels stay short ("Day loud sections") — the J/K/U/I key explanation lives only in the visible note, per user feedback against repeating info text in labels. The Highlights button is a collapse/expand toggle (setHlExpanded() keeps arrow + aria-expanded in sync, also from the day-collapse path): a built panel is kept and re-armed from dayHlSections instead of recomputing, keyed by hlRow.dataset.loaded = hlParams() (margin|gap|minDur string) so changed params force a re-run. The #dayhls-<dayId> "· analysed" suffix appears when every file's cached_analysis passes cachedParamsMatch(); fetchAnalysis() updates f.cached_analysis client-side so the marker survives re-renders without refetching /api/files. aria-label on a button replaces its visible content for screen readers — any status text rendered inside a labelled button (like the analysed suffix) must be mirrored into the label (render path + the end of dayHighlights()), or the blind user never hears it. 
Likewise, the accessible text of a button built as aria-hidden arrow span + text node must not start the text node with a space (the hidden arrow drops out and the leading space reads as an indent) — keep separator spaces inside the aria-hidden span.
J/K/U/I/O: single document-level keydown listener — clip queue takes priority, in-player currentTime stepping is the fallback when no queue is armed; O calls openClipInFile() (shared with the "Open in file" button). J/K (and Prev/Next) always step in time order; U/I walk the loudest-first ranking from scoreOrder() — no top-N cutoff (the #clip-top input and #clip-hl-only checkbox were removed deliberately; J/K must never be affected by an auto-advance/highlights setting). Auto-advance is the input[name="clip-adv"] radio (off / next in time / next by loudness), read by advanceMode(); stepClip(dir, byScore) is the shared queue-stepping path. In-player U/I anchor the ranking on the section under the playhead, else start at the loudest.
Analysis: fetchAnalysis() (session analysisCache), analyse() (per-row render: meta line with section count + params, then chips — no waveform SVG, see day-review note on the blind user), cachedParamsMatch() (autoload guard).
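The two-cursor rule described under clip review is easier to keep straight with a small model. The real logic is JavaScript in webui.html; the Python class below is only an illustrative model, and `ClipWalk`, `select`, `step_time` and `step_score` are hypothetical names. It also assumes that the first U/I after a reset steps one rank away from the selected clip, which the notes above do not spell out.

```python
class ClipWalk:
    """Toy model of clipCursor / scoreCursor independence (not the real code)."""

    def __init__(self, scores):
        self.scores = list(scores)  # queue entries in time order
        self.clip_cursor = -1       # time-order position, follows every play
        self.score_cursor = -1      # rank in the loudness walk, -1 = not anchored

    def score_order(self):
        # Indices sorted loudest first; ties keep time order (sort is stable).
        return sorted(range(len(self.scores)), key=lambda i: -self.scores[i])

    def select(self, index):
        """Explicit jump (chip click, queue re-arm): resets the loudness walk."""
        self.score_cursor = -1
        self.clip_cursor = index
        return index

    def step_time(self, direction):
        """J/K: step in time order; score_cursor is never touched here."""
        last = len(self.scores) - 1
        self.clip_cursor = min(max(self.clip_cursor + direction, 0), last)
        return self.clip_cursor

    def step_score(self, direction):
        """U/I: step the ranking from score_cursor, re-anchoring if it was reset."""
        order = self.score_order()
        if self.score_cursor >= 0:
            rank = self.score_cursor + direction
        elif self.clip_cursor >= 0:
            rank = order.index(self.clip_cursor) + direction  # assumed behaviour
        else:
            rank = 0  # nothing selected: start at the loudest
        self.score_cursor = min(max(rank, 0), len(order) - 1)
        self.clip_cursor = order[self.score_cursor]
        return self.clip_cursor
```

With scores `[1, 9, 3, 7, 5]`, two U/I steps land on indices 1 and 3; a J/K detour back to index 1 leaves `score_cursor` at rank 1, so the next U/I plays index 4, the third loudest.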

Verifying changes

python -m pytest tests/ covers the recorder (test_isr.py) and the detector (test_web.py).
There is no JS toolchain and no node on the dev box. After editing webui.html, cross-check every getElementById('x') against an id="x" declaration, and smoke-test endpoints.
Endpoint smoke pattern: write a temp WAV/FLAC with a known loud burst, subclass web._Handler with recordings_dir/analyses_dir pointing at the temp dir, serve web._Server(('127.0.0.1', 0), H) in a daemon thread, then hit /api/analyze and /api/clip with urllib — assert section start/score and that Content-Length == len(body) == 44 + frames × channels × 2.
Dev box is Windows / PowerShell 5.1. Multi-line commit messages: use the Bash tool with git commit -F - <<'EOF' — PowerShell here-strings containing quotes get mangled into separate arguments.

Non-obvious internals

Recorder/web coupling is one file: RecorderManager atomically writes recordings/status.json every 2 s listing in-progress files; deleted on clean shutdown. web.py reads it to show REC badges and to refuse analyse/cut/delete on active files. In-progress WAV/FLAC headers are unfinalized, so durations are not read for active files.
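An atomic status write is usually done by writing a temp file in the same directory and renaming it over the target. The sketch below shows that pattern; it is an assumption about how such a write could look, not a copy of RecorderManager, and both the function name `write_status_atomically` and the payload shape are hypothetical.

```python
import json
import os
import tempfile

def write_status_atomically(path, payload):
    """Write JSON so a reader sees either the old file or the new one, never half."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".status-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic swap on POSIX and Windows
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

A reader such as web.py can then open the file at any moment without ever parsing a partially written document.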
Stream splits: OGG/Opus/FLAC codec headers are extracted from the first ~16 KB of each connection and prepended to every split file so each file plays standalone. A new file is always opened on reconnect (gap in stream). MP3/AAC need no headers.
Split timing: files split at clock-aligned boundaries (get_next_split_time()), e.g. split_minutes = 60 → on the hour.
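Clock-aligned splitting boils down to "the next multiple of the period since midnight". The following is a sketch of that arithmetic under the assumption that boundaries are counted from local midnight; `next_split_time` is a hypothetical name and not the actual get_next_split_time() in isr.py.

```python
from datetime import datetime, timedelta

def next_split_time(now: datetime, split_minutes: int) -> datetime:
    """Return the first clock-aligned boundary strictly after `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    period = timedelta(minutes=split_minutes)
    # Whole periods already completed today; the next boundary is one further.
    periods_done = (now - midnight) // period
    return midnight + (periods_done + 1) * period
```

For example, with `split_minutes = 60` a recording that starts at 22:31:30 splits at 23:00:00, and one that starts at 23:59:59 splits at midnight of the next day.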
Filename is the clock — fixed format, not configurable. Recordings are named %Y%m%d_%H%M%S.<ext> (the start time). This is hardcoded as FILENAME_FORMAT, defined in both isr.py (recorder writes it) and web.py (reads it back) — the two copies must stay in sync. There is no filename_pattern config option (removed; web.py can't see config.ini, so a configurable pattern would break parsing). web.py derives the displayed DATE column from the filename via _recording_start() (falling back to mtime only for non-standard names — mtime is the last write ≈ end, not the start). Cut downloads are named by the wall-clock span they cover via _cut_filename(): a 22:31:30→22:32:30 slice of 20260523_220000.flac becomes 20260523_22-31-30_22-32-30.flac; non-standard source names fall back to <stem>_cut_<start>s-<end>s.
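The filename-clock round trip can be sketched as below. Only FILENAME_FORMAT and the two output shapes come from the notes above; the function names, the `ext` argument carrying its leading dot, appending `ext` in the fallback, and the number formatting in the fallback are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

FILENAME_FORMAT = "%Y%m%d_%H%M%S"  # recording start time, as documented

def recording_start(stem: str) -> Optional[datetime]:
    """Parse the start time from a filename stem; None for non-standard names."""
    try:
        return datetime.strptime(stem, FILENAME_FORMAT)
    except ValueError:
        return None

def cut_filename(stem: str, ext: str, start: float, end: float) -> str:
    """Name a cut by the wall-clock span it covers (start/end are in-file seconds)."""
    began = recording_start(stem)
    if began is None:
        # Fallback shape for names that do not carry a clock.
        return f"{stem}_cut_{start:g}s-{end:g}s{ext}"
    a = began + timedelta(seconds=start)
    b = began + timedelta(seconds=end)
    return f"{a:%Y%m%d}_{a:%H-%M-%S}_{b:%H-%M-%S}{ext}"
```

Calling `cut_filename("20260523_220000", ".flac", 1890, 1950)` reproduces the documented example, because 1890 s after 22:00:00 is 22:31:30.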
ALSA: capture spawns arecord as a subprocess, raw PCM read in 100 ms chunks by a thread. Device spec resolution: default → exact hw:X,Y → partial name → fallback to any literal ALSA PCM name (so shared_mic from asound.conf works without appearing in arecord -l).
Shutdown: SIGTERM is converted to KeyboardInterrupt in main(); RecorderManager.stop() joins all threads against a single shared 25 s deadline to stay inside Docker's stop_grace_period: 30s.
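Joining against one shared deadline, rather than giving every thread its own timeout, is what keeps the total shutdown time bounded. A minimal sketch of that idea follows; `join_all` is a hypothetical helper, not RecorderManager.stop() itself.

```python
import threading
import time

def join_all(threads, total_timeout=25.0):
    """Join every thread against a single deadline; return those still alive."""
    deadline = time.monotonic() + total_timeout
    still_running = []
    for t in threads:
        # Each join only gets whatever is left of the shared budget.
        remaining = max(0.0, deadline - time.monotonic())
        t.join(timeout=remaining)
        if t.is_alive():
            still_running.append(t)
    return still_running
```

With per-thread timeouts, N stuck threads could take N × 25 s; with the shared deadline the whole call returns within roughly 25 s no matter how many threads hang, which leaves headroom inside a 30 s grace period.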
Loud-section detection is adaptive — do not regress it to an absolute threshold. Per-window dB is compared against a rolling noise floor (NOISE_PERCENTILE-th percentile per NOISE_BLOCK_SECONDS block, min-smoothed over ±2 blocks so events can't raise their own floor; clamped to ≥ MIN_RMS). A section needs margin dB of prominence and carries a score used for ranking: peak dB above floor, capped by the sharpest rise within ONSET_SECONDS (0.5 s) — so a short (~10 s) swell that outruns the 30 s floor blocks still flags but scores ≈ 0 and sinks in the U/I highlight ranking, while sharp events keep their full prominence. A section starting in the first 0.5 s of a file is scored against the floor instead (events cut off by a file split must not be punished as swells). Do not regress the scoring to raw peak, and do not fight swells with a higher margin. If flagging itself (not just ranking) ever needs improving, the next step is a spectral filter or optional Silero VAD over candidate sections. Sections shorter than min_duration (default 0.5 s, after min_gap merging) are discarded — without this, isolated 100 ms pops (clicks, single raindrops) produced thousands of zero-length sections per day. The original fixed RMS threshold flagged every ambience change (passing cars, rain) and produced ~600 useless sections/day — that is why it was replaced. Tests in tests/test_web.py.
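The rolling floor is the part most likely to be misread, so here is a sketch of it in plain Python: a per-block percentile, min-smoothed over ±2 blocks, then clamped. It is a simplified stand-in for _noise_floor_db(), not a copy; the function names, the nearest-rank percentile, and expressing the clamp as a dB value (`min_db`) are assumptions, and the actual NOISE_PERCENTILE value is not stated in these notes.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def noise_floor_db(window_db, windows_per_block, pct, min_db, smooth=2):
    """Return one floor value (dB) per block of per-window dB readings."""
    blocks = [window_db[i:i + windows_per_block]
              for i in range(0, len(window_db), windows_per_block)]
    per_block = [percentile(b, pct) for b in blocks]
    floors = []
    for i in range(len(per_block)):
        lo = max(0, i - smooth)
        hi = min(len(per_block), i + smooth + 1)
        # Taking the minimum over neighbours stops a loud block from
        # raising its own floor; the clamp guards digital silence.
        floors.append(max(min_db, min(per_block[lo:hi])))
    return floors
```

With 100 ms windows and 30 s blocks, `windows_per_block` would be 300. If one block is loud from end to end, its own percentile is high, but the smoothed floor still comes from its quiet neighbours, so the event keeps its full prominence.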
Analysis params are coupled in five places. CLI --margin/--min-gap/--min-duration → /api/config → UI inputs #margin-input/#min-gap-input/#min-duration-input → /api/analyze query params → cache JSON head keys. Renaming or adding a param means touching all five plus cachedParamsMatch() and the _cached_analysis_params() regex (see the threshold→margin change c84b7d8 and the min_duration addition).
Analysis cache: results stored as <analyses-dir>/<file>.analysis.json keyed by margin+min_gap+min_duration; orphans pruned at web startup. In Docker the recordings mount is read-only for the web container, so docker-compose layers a read-write ./recordings/analyses bind mount over it. The detector, margin, min_gap, and min_duration keys MUST stay first in the cache JSON — _cached_analysis_params() reads only the first 256 bytes to avoid parsing the large embedded result. detector is DETECTOR_VERSION: bump it whenever detection/scoring changes make old cached results wrong (e.g. v2 = onset-capped scores); caches with another version (or missing keys) never match and get overwritten on the next analyse.
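Why the key order matters becomes obvious once the head-only read is written down. The sketch below illustrates the idea behind _cached_analysis_params() without claiming to match it: the function name, the regex, and treating `detector` as a plain number are assumptions.

```python
import re

HEAD_BYTES = 256  # only this much of the cache file is inspected
_KEYS = ("detector", "margin", "min_gap", "min_duration")
_HEAD_KEY = re.compile(
    r'"(detector|margin|min_gap|min_duration)"\s*:\s*(-?\d+(?:\.\d+)?)'
)

def cached_params(raw: bytes):
    """Extract the parameter keys from the first bytes of a cache file.

    Returns a dict of floats, or None if any key is missing from the head,
    which is exactly what happens when the keys are not written first.
    """
    head = raw[:HEAD_BYTES].decode("utf-8", errors="replace")
    found = {key: float(value) for key, value in _HEAD_KEY.findall(head)}
    if set(found) != set(_KEYS):
        return None
    return found
```

If a writer ever emitted the large result before the parameter keys, they would fall outside the 256-byte window and every cache would look stale, so each analyse would silently recompute.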
Analyze responses: /api/analyze returns only sections, duration, window — no RMS data of any kind. rms_display (~800-point waveform preview) and the full per-window list were dropped when the waveform SVGs were removed (user is blind, see webui notes). Old caches that embed rms/rms_display are still valid; both keys are popped when serving from cache, so no DETECTOR_VERSION bump was needed.
Section playback uses clips, not seeks: /api/clip?file&start&end decodes the slice server-side (wave/soundfile) and returns a standalone 16-bit WAV with exact Content-Length (capped at CLIP_MAX_SECONDS), Cache-Control: private so re-listening is free. The UI plays chips/J-K through the bottom clip bar (clipQueue in webui.html); seeking the full file only happens via "Open in file". Rationale (finding): libsndfile writes FLAC without a SEEKTABLE, so a browser seek bisects the whole multi-hundred-MB file with Range requests — seeking big FLACs in <audio> is inherently slow and must not be reintroduced as the primary navigation. Server-side sf.SoundFile.seek() on local disk is fast and frame-accurate.
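The exact Content-Length for a clip follows directly from the canonical 44-byte PCM header. Below is a sketch of such a header built with `struct`; it shows the standard RIFF/WAVE layout and is not taken from _wav_header(), so the function name and signature are assumptions.

```python
import struct

def wav_header(frames: int, channels: int, sample_rate: int, bits: int = 16) -> bytes:
    """Build a canonical 44-byte RIFF/WAVE header for uncompressed PCM."""
    block_align = channels * bits // 8   # bytes per frame
    data_bytes = frames * block_align
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_bytes, b"WAVE",  # RIFF size excludes the first 8 bytes
        b"fmt ", 16, 1,                     # 16-byte fmt chunk, format 1 = PCM
        channels, sample_rate,
        sample_rate * block_align,          # byte rate
        block_align, bits,
        b"data", data_bytes,
    )
```

For 16-bit audio the response body is therefore `44 + frames * channels * 2` bytes, the same identity the endpoint smoke test asserts.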
HTTP/1.1 keep-alive: _Handler.protocol_version = 'HTTP/1.1'; every response path must set an accurate Content-Length. _copy_to_response() force-closes the connection if it under-delivers (file truncated mid-serve).
Live playback: for files listed in status.json, /stream/ patches the header on the fly so the browser sees the duration recorded so far and can seek; responses get Cache-Control: no-store. WAV: _live_wav_header derives sizes from the byte count. FLAC: _live_flac_header parses the sample count out of the last frame header in the file tail (CRC-8-verified to reject false sync matches) and rewrites STREAMINFO total_samples — duration is NOT derivable from byte size for FLAC.
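The CRC-8 check that rejects false sync matches can be sketched in a few lines. FLAC frame headers end with a CRC-8 (polynomial x^8 + x^2 + x + 1, initial value 0) over the preceding header bytes; the helper names below are hypothetical and the code is not taken from _live_flac_header().

```python
def crc8(data: bytes) -> int:
    """CRC-8 with polynomial 0x07, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def header_crc_ok(header: bytes) -> bool:
    """True if the last byte is the CRC-8 of all bytes before it."""
    return len(header) >= 2 and crc8(header[:-1]) == header[-1]
```

The sync pattern alone can appear by chance inside compressed audio data, so a candidate found while scanning the file tail is only trusted once its header CRC matches.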
Path safety: every file parameter in web.py goes through _safe_path(), which resolves and verifies the path stays inside the recordings dir.
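A resolve-then-verify guard of this kind can be sketched with `pathlib`. This is an illustration of the technique rather than the body of _safe_path(); the name `safe_path` and returning None on rejection are assumptions.

```python
from pathlib import Path
from typing import Optional

def safe_path(root: str, name: str) -> Optional[Path]:
    """Resolve `name` under `root`; None if it escapes the directory."""
    base = Path(root).resolve()
    # resolve() collapses ".." and follows symlinks before the check runs.
    candidate = (base / name).resolve()
    if base not in candidate.parents:
        return None
    return candidate
```

Checking the resolved path, not the raw string, is the point: `../` segments, absolute paths, and symlinks that point outside the recordings directory all fail the same test.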
dsnoop in Docker: sharing the soundcard requires asound.conf on the host and ipc: host in docker-compose (dsnoop uses shared memory across the container boundary).
