ISR/CLAUDE.md

# CLAUDE.md

Guidance for Claude Code when working in this repository.

## Rules

- **Always update `README.md`** when user-facing behaviour changes (flags, endpoints, Docker setup, features), and **commit it in the same commit** as the code change. README is the external reference; CLAUDE.md documents internals.
- Run `python -m pytest tests/` after changing `isr.py` or `web.py` (tests cover the recorder and the loud-section detector).

## Files

| File | Purpose |
|------|---------|
| `isr.py` | Recorder: streams (Icecast/HTTP) + ALSA soundcards, time-aligned file splits |
| `web.py` | Archive browser: HTTP server, file listing, RMS loudness analysis, cut/delete |
| `webui.html` | Single-page UI (HTML/CSS/JS), loaded by `web.py` at startup — must sit next to `web.py` and be copied in the Dockerfile |
| `config.ini` | Recording sources; copy from `config.example.ini`. `[general]` gives defaults, every other section is a source (`type = stream` or `type = soundcard`) |
| `asound.conf` | dsnoop device `shared_mic` so ISR and other ALSA apps can share a soundcard |

## Commands

```bash
python isr.py [config.ini]        # recorder; --list-devices to list ALSA inputs
python web.py                     # web UI on :8080 (--dir, --port, --margin, --min-gap, --analyses-dir)
python -m pytest tests/           # test suite
docker compose up -d / down       # web UI mapped to host port 8050
```

Dependencies: `requests` (streams), `numpy` + `soundfile` (FLAC output and FLAC/waveform analysis — both optional, code degrades gracefully).

## Non-obvious internals

- **Recorder/web coupling is one file:** `RecorderManager` atomically writes `recordings/status.json` every 2 s listing in-progress files; deleted on clean shutdown. `web.py` reads it to show REC badges and to refuse analyse/cut/delete on active files. In-progress WAV/FLAC headers are unfinalized, so durations are not read for active files.
- **Stream splits:** OGG/Opus/FLAC codec headers are extracted from the first ~16 KB of each connection and prepended to every split file so each file plays standalone. A new file is always opened on reconnect (gap in stream). MP3/AAC need no headers.
- **Split timing:** files split at clock-aligned boundaries (`get_next_split_time()`), e.g. `split_minutes = 60` → on the hour.
- **ALSA:** capture spawns `arecord` as a subprocess, raw PCM read in 100 ms chunks by a thread. Device spec resolution: `default` → exact `hw:X,Y` → partial name → fallback to any literal ALSA PCM name (so `shared_mic` from asound.conf works without appearing in `arecord -l`).
- **Shutdown:** SIGTERM is converted to KeyboardInterrupt in `main()`; `RecorderManager.stop()` joins all threads against a single shared 25 s deadline to stay inside Docker's `stop_grace_period: 30s`.
- **Loud-section detection is adaptive:** per-window dB is compared against a rolling noise floor (`NOISE_PERCENTILE`-th percentile per `NOISE_BLOCK_SECONDS` block, min-smoothed over ±2 blocks so events can't raise their own floor; clamped to ≥ `MIN_RMS`). A section needs `margin` dB of prominence and carries a `score` (peak dB above floor) used for ranking. Tests in `tests/test_web.py`.
- **Analysis cache:** results stored as `<analyses-dir>/<file>.analysis.json` keyed by margin+min_gap; orphans pruned at web startup. In Docker the recordings mount is **read-only** for the web container, so the cache uses a separate `./analyses` bind mount. The `margin` and `min_gap` keys MUST stay first in the cache JSON — `_cached_analysis_params()` reads only the first 256 bytes to avoid parsing the large embedded result. Old `threshold`-keyed caches never match and get overwritten on the next analyse.
- **Analyze responses:** `/api/analyze` returns `rms_display` (~800 points), never the full per-window RMS list — the UI doesn't use it and it is ~45x larger.
- **Section playback uses clips, not seeks:** `/api/clip?file&start&end` decodes the slice server-side (wave/soundfile) and returns a standalone 16-bit WAV with exact Content-Length (capped at `CLIP_MAX_SECONDS`). The UI plays chips/J-K through the bottom clip bar (`clipQueue` in webui.html); seeking the full file only happens via "Open in file". Rationale: our FLACs have no SEEKTABLE, so browser seeks bisect the whole file with Range requests.
- **HTTP/1.1 keep-alive:** `_Handler.protocol_version = 'HTTP/1.1'`; every response path must set an accurate `Content-Length`. `_copy_to_response()` force-closes the connection if it under-delivers (file truncated mid-serve).
- **Live playback:** for files listed in status.json, `/stream/` patches the header on the fly so the browser sees the duration recorded so far and can seek; responses get `Cache-Control: no-store`. WAV: `_live_wav_header` derives sizes from the byte count. FLAC: `_live_flac_header` parses the sample count out of the last frame header in the file tail (CRC-8-verified to reject false sync matches) and rewrites STREAMINFO total_samples — duration is NOT derivable from byte size for FLAC.
- **Path safety:** every file parameter in `web.py` goes through `_safe_path()`, which resolves and verifies the path stays inside the recordings dir.
- **dsnoop in Docker:** sharing the soundcard requires `asound.conf` on the host *and* `ipc: host` in docker-compose (dsnoop uses shared memory across the container boundary).