docs: slim CLAUDE.md down to rules, file map, and non-obvious internals

Drop class-by-class architecture listings that are derivable from the code; keep only constraints a model cannot infer (status.json coupling, header injection, read-only mount, shutdown deadline, dsnoop/ipc). Add webui.html to the file map.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 11:50:09 +02:00
parent 907fd90a5e
commit c445eb3e04
1 changed file with 25 additions and 78 deletions
@@ -1,93 +1,40 @@
 # CLAUDE.md
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+Guidance for Claude Code when working in this repository.
 ## Rules
- **Always update `README.md`** whenever user-facing behaviour changes (new flags, new endpoints, changed Docker setup, new features). The README is the primary external reference; CLAUDE.md documents internals.
+- **Always update `README.md`** when user-facing behaviour changes (flags, endpoints, Docker setup, features), and **commit it in the same commit** as the code change. README is the external reference; CLAUDE.md documents internals.
- **Always commit `README.md`** in the same commit as the code changes it documents — never let the README fall behind.
+- Run `python -m pytest tests/` after changing `isr.py` (tests cover the recorder only).
-## Project Overview
+## Files
-ISR is a Python audio recording application that captures from multiple simultaneous sources (Icecast/HTTP streams and ALSA soundcard devices) with time-based file splitting. All application code is in two files: `isr.py` (recorder) and `web.py` (archive browser UI).
+| File | Purpose |
+|------|---------|
+| `isr.py` | Recorder: streams (Icecast/HTTP) + ALSA soundcards, time-aligned file splits |
+| `web.py` | Archive browser: HTTP server, file listing, RMS loudness analysis, cut/delete |
+| `webui.html` | Single-page UI (HTML/CSS/JS), loaded by `web.py` at startup — must sit next to `web.py` and be copied in the Dockerfile |
+| `config.ini` | Recording sources; copy from `config.example.ini`. `[general]` gives defaults, every other section is a source (`type = stream` or `type = soundcard`) |
+| `asound.conf` | dsnoop device `shared_mic` so ISR and other ALSA apps can share a soundcard |
 ## Commands
 ```bash
-# Run the recorder
+python isr.py [config.ini]        # recorder; --list-devices to list ALSA inputs
-python isr.py                     # uses config.ini
+python web.py                     # web UI on :8080 (--dir, --port, --threshold, --min-gap, --analyses-dir)
-python isr.py myconfig.ini        # custom config file
+python -m pytest tests/           # test suite
-python isr.py --list-devices      # list available ALSA devices
+docker compose up -d / down       # web UI mapped to host port 8050
-# Run the web UI
-python web.py                          # http://localhost:8080
-python web.py --dir recordings         # custom recordings directory
-python web.py --port 8888              # custom port
-python web.py --threshold 0.03         # loudness threshold (0-1, default 0.05)
-# Stop: Ctrl+C (or docker compose down)
-# Install dependencies
-pip install requests              # for stream recording
-pip install numpy soundfile       # for FLAC output and web waveform analysis (optional)
-# Docker
-docker compose up -d
-docker compose logs -f
-docker compose down
 ```
-## Architecture
+Dependencies: `requests` (streams), `numpy` + `soundfile` (FLAC output and FLAC/waveform analysis — both optional, code degrades gracefully).
-### Audio Backend System
+## Non-obvious internals
-- **AudioDevice** — Dataclass: id, name, channels, sample_rate, backend type
-- **AudioBackend** (ABC) — Abstract base for audio capture backends
- - **ALSABackend** — Native ALSA support via `arecord` subprocess (the only backend)
-- **ALSAStream** — Context manager that wraps an `arecord` subprocess and reads PCM in a thread
-- **AudioSystem** — Discovers available backends, lists devices, resolves device specs
-### Recorder Classes
+- **Recorder/web coupling is one file:** `RecorderManager` atomically writes `recordings/status.json` every 2 s listing in-progress files; deleted on clean shutdown. `web.py` reads it to show REC badges and to refuse analyse/cut/delete on active files. In-progress WAV/FLAC headers are unfinalized, so durations are not read for active files.
- **BaseRecorder** (ABC) — Common settings, `get_next_split_time()`, `generate_filename()`, `record()` interface
+- **Stream splits:** OGG/Opus/FLAC codec headers are extracted from the first ~16 KB of each connection and prepended to every split file so each file plays standalone. A new file is always opened on reconnect (gap in stream). MP3/AAC need no headers.
- **StreamRecorder(BaseRecorder)** — Records HTTP/Icecast streams with format auto-detection and OGG/FLAC header injection
+- **Split timing:** files split at clock-aligned boundaries (`get_next_split_time()`), e.g. `split_minutes = 60` → on the hour.
- **SoundcardRecorder(BaseRecorder)** — Records from ALSA devices; outputs WAV or FLAC via `_AudioFileWriter`
+- **ALSA:** capture spawns `arecord` as a subprocess, raw PCM read in 100 ms chunks by a thread. Device spec resolution: `default` → exact `hw:X,Y` → partial name → fallback to any literal ALSA PCM name (so `shared_mic` from asound.conf works without appearing in `arecord -l`).
- **_AudioFileWriter** — Unified write/close interface for wave (WAV) and soundfile (FLAC)
+- **Shutdown:** SIGTERM is converted to KeyboardInterrupt in `main()`; `RecorderManager.stop()` joins all threads against a single shared 25 s deadline to stay inside Docker's `stop_grace_period: 30s`.
- **RecorderManager** — Loads config, creates recorders, manages threads, handles shutdown
+- **Analysis cache:** results stored as `<analyses-dir>/<file>.analysis.json` keyed by threshold+min_gap; orphans pruned at web startup. In Docker the recordings mount is **read-only** for the web container, so the cache uses a separate `./analyses` bind mount.
-
+- **Path safety:** every file parameter in `web.py` goes through `_safe_path()`, which resolves and verifies the path stays inside the recordings dir.
-### Key Implementation Details
+- **dsnoop in Docker:** sharing the soundcard requires `asound.conf` on the host *and* `ipc: host` in docker-compose (dsnoop uses shared memory across the container boundary).
-- ALSA backend spawns `arecord` as a subprocess; raw PCM is read in 100 ms chunks via a reader thread
-- Device selection: `default`, `monitor` (loopback), partial name match, or exact `hw:X,Y` ID
-- Thread-safe audio buffering with `threading.Lock()`
-- OGG/Opus/FLAC headers captured from first ~16 KB of stream and prepended to each split file
-- File splits aligned to time period boundaries (`get_next_split_time()`)
-- SIGTERM handled in `main()` so Docker `docker compose down` shuts down cleanly
-- `RecorderManager._write_status()` atomically writes `recordings/status.json` every 2 s while running; deleted on clean shutdown so the web UI shows no stale active-recording badges
-### Web UI (web.py)
-- **`GET /`** — Single-page archive table; lists all recordings sorted newest first
-- **`GET /api/files`** — JSON list of file metadata (name, size, date, duration, ext, recording flag)
-- **`GET /api/analyze?file=<path>`** — RMS loudness analysis for WAV and FLAC files; returns waveform data, loud sections, and duration. Requires `numpy` and `soundfile` for FLAC.
-- **`GET /api/status`** — Returns `{"active": [...]}` from `status.json`; used by the UI to animate the REC badge on in-progress files (polled every 5 s)
-- **`GET /stream/<path>`** — Serves audio for inline `<audio>` playback with full HTTP Range support (seekable). Responds 206 Partial Content for range requests. Files are served with `Content-Disposition: inline`.
-- **`GET /download/<path>`** — Serves audio as a file download (`Content-Disposition: attachment`)
-- All paths are validated against the recordings directory to prevent path traversal.
-## Configuration
-Copy `config.example.ini` to `config.ini`. Each section defines a recording source:
-- `type = stream` — HTTP/Icecast stream recording
-- `type = soundcard` — ALSA device recording
-The `output_directory` value is used as-is: a relative path like `recordings` resolves to `recordings/` next to `isr.py`. No Docker-specific config change is needed — the docker-compose.yml mounts `./recordings` at `/app/recordings` to match this default.
-## Docker
-Two services share a `./recordings` bind mount:
-- `recorder` — runs `isr.py`; volume at `/app/recordings`; mounts `asound.conf` as `/etc/asound.conf`; maps `/dev/snd`; `ipc: host` for dsnoop shared memory; `stop_grace_period: 30s`
-- `web` — runs `web.py`; same `./recordings` read-only at `/recordings`; exposes port 8080 internally (mapped to 8050 on the host)
-**Sharing the soundcard with darkice (or any other ALSA app):**
-ALSA `hw:` devices are exclusive. `asound.conf` defines a `dsnoop` virtual device `shared_mic` that both processes use instead:
-1. `sudo cp asound.conf /etc/asound.conf` on the host
-2. Change darkice config to `device = shared_mic`
-3. Set `device = shared_mic` in `config.ini`
-4. `ipc: host` in `docker-compose.yml` is already set — required for dsnoop shared memory to cross the container boundary
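
Review note on the new "Split timing" bullet: the clock-aligned behaviour it describes (`split_minutes = 60` splits on the hour) can be illustrated with a minimal sketch. This is not the repository's `get_next_split_time()`; the function name `next_split_time`, its signature, and the choice to align boundaries to local midnight are assumptions made only for illustration.

```python
from datetime import datetime, timedelta


def next_split_time(now: datetime, split_minutes: int) -> datetime:
    """Return the next clock-aligned split boundary strictly after `now`.

    Hypothetical helper: boundaries are assumed to be multiples of
    `split_minutes` counted from midnight of the current day.
    """
    if split_minutes <= 0:
        raise ValueError("split_minutes must be positive")
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Whole minutes elapsed since midnight (seconds are truncated).
    elapsed_minutes = int((now - midnight).total_seconds() // 60)
    # Number of split periods that have already started today.
    periods_started = elapsed_minutes // split_minutes
    # The boundary that closes the current period.
    return midnight + timedelta(minutes=(periods_started + 1) * split_minutes)


if __name__ == "__main__":
    # A recording that starts at 11:50:09 with split_minutes = 60
    # would be cut at the top of the next hour.
    print(next_split_time(datetime(2026, 6, 10, 11, 50, 9), 60))
```

Under these assumptions a recording that starts mid-period gets a short first file and every later file starts on a boundary, which matches the "on the hour" example in the bullet. A timestamp that falls exactly on a boundary is treated as the start of a new period, so the function returns the following boundary.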