feat: drop rms_display from /api/analyze

The UI no longer draws waveforms, so the server stops computing and
sending the ~800-point RMS preview; the payload is now just sections,
duration, window. Old analysis caches stay valid: rms/rms_display are
popped when serving a cache hit (same pattern as the earlier full-RMS
removal), so no DETECTOR_VERSION bump.

Verified: 63 tests pass; endpoint smoke test (fresh + cached analyze)
confirms no RMS keys and correct section detection.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 12:02:47 +02:00
parent 91701ce4d3
commit 9c58e35546
3 changed files with 7 additions and 12 deletions
+1 -1
@@ -61,7 +61,7 @@ Dependencies: `requests` (streams), `numpy` + `soundfile` (FLAC output and FLAC
 - **Loud-section detection is adaptive — do not regress it to an absolute threshold.** Per-window dB is compared against a rolling noise floor (`NOISE_PERCENTILE`-th percentile per `NOISE_BLOCK_SECONDS` block, min-smoothed over ±2 blocks so events can't raise their own floor; clamped to ≥ `MIN_RMS`). A section needs `margin` dB of prominence and carries a `score` used for ranking: peak dB above floor, **capped by the sharpest rise within `ONSET_SECONDS` (0.5 s)** — so a short (~10 s) swell that outruns the 30 s floor blocks still flags but scores ≈ 0 and sinks in the U/I highlight ranking, while sharp events keep their full prominence. A section starting in the first 0.5 s of a file is scored against the floor instead (events cut off by a file split must not be punished as swells). Do not regress the scoring to raw peak, and do not fight swells with a higher margin. If flagging itself (not just ranking) ever needs improving, the next step is a spectral filter or optional Silero VAD over candidate sections. Sections shorter than `min_duration` (default 0.5 s, after `min_gap` merging) are discarded — without this, isolated 100 ms pops (clicks, single raindrops) produced thousands of zero-length sections per day. The original fixed RMS threshold flagged every ambience change (passing cars, rain) and produced ~600 useless sections/day — that is why it was replaced. Tests in `tests/test_web.py`.
 - **Analysis params are coupled in five places.** CLI `--margin`/`--min-gap`/`--min-duration` → `/api/config` → UI inputs `#margin-input`/`#min-gap-input`/`#min-duration-input` → `/api/analyze` query params → cache JSON head keys. Renaming or adding a param means touching all five plus `cachedParamsMatch()` and the `_cached_analysis_params()` regex (see the threshold→margin change `c84b7d8` and the min_duration addition).
 - **Analysis cache:** results stored as `<analyses-dir>/<file>.analysis.json` keyed by margin+min_gap+min_duration; orphans pruned at web startup. In Docker the recordings mount is **read-only** for the web container, so docker-compose layers a read-write `./recordings/analyses` bind mount over it. The `detector`, `margin`, `min_gap`, and `min_duration` keys MUST stay first in the cache JSON — `_cached_analysis_params()` reads only the first 256 bytes to avoid parsing the large embedded result. `detector` is `DETECTOR_VERSION`: bump it whenever detection/scoring changes make old cached results wrong (e.g. v2 = onset-capped scores); caches with another version (or missing keys) never match and get overwritten on the next analyse.
-- **Analyze responses:** `/api/analyze` returns `rms_display` (~800 points), never the full per-window RMS list (~45x larger). Since the waveform SVG was removed (user is blind, see webui notes) the bundled UI no longer reads `rms_display` at all — it stays in the payload for API stability and because cached results embed it.
+- **Analyze responses:** `/api/analyze` returns only `sections`, `duration`, `window` — no RMS data of any kind. `rms_display` (~800-point waveform preview) and the full per-window list were dropped when the waveform SVGs were removed (user is blind, see webui notes). Old caches that embed `rms`/`rms_display` are still valid; both keys are popped when serving from cache, so no DETECTOR_VERSION bump was needed.
 - **Section playback uses clips, not seeks:** `/api/clip?file&start&end` decodes the slice server-side (wave/soundfile) and returns a standalone 16-bit WAV with exact Content-Length (capped at `CLIP_MAX_SECONDS`), `Cache-Control: private` so re-listening is free. The UI plays chips/J-K through the bottom clip bar (`clipQueue` in webui.html); seeking the full file only happens via "Open in file". Rationale (finding): libsndfile writes FLAC **without a SEEKTABLE**, so a browser seek bisects the whole multi-hundred-MB file with Range requests — seeking big FLACs in `<audio>` is inherently slow and must not be reintroduced as the primary navigation. Server-side `sf.SoundFile.seek()` on local disk is fast and frame-accurate.
 - **HTTP/1.1 keep-alive:** `_Handler.protocol_version = 'HTTP/1.1'`; every response path must set an accurate `Content-Length`. `_copy_to_response()` force-closes the connection if it under-delivers (file truncated mid-serve).
 - **Live playback:** for files listed in status.json, `/stream/` patches the header on the fly so the browser sees the duration recorded so far and can seek; responses get `Cache-Control: no-store`. WAV: `_live_wav_header` derives sizes from the byte count. FLAC: `_live_flac_header` parses the sample count out of the last frame header in the file tail (CRC-8-verified to reject false sync matches) and rewrites STREAMINFO total_samples — duration is NOT derivable from byte size for FLAC.
+1 -1
@@ -186,7 +186,7 @@ Everything the UI does goes through these endpoints, so they can also be scripte
 | Endpoint | Description |
 |----------|-------------|
 | `GET /api/files` | File listing with size, mtime, duration, recording state, cached-analysis params |
-| `GET /api/analyze?file=&margin=&min_gap=&min_duration=` | Loud-section analysis: scored `sections`, `duration` (plus `rms_display`, ~800 RMS points, unused by the bundled UI) |
+| `GET /api/analyze?file=&margin=&min_gap=&min_duration=` | Loud-section analysis: scored `sections`, `duration` |
 | `GET /api/clip?file=&start=&end=` | Section of a WAV/FLAC decoded server-side, returned as a standalone WAV (max 600 s) |
 | `GET /api/cut?file=&start=&end=` | ffmpeg-trimmed copy of the file as a download |
 | `GET /stream/<name>` | Inline playback with HTTP Range support; live files get an on-the-fly patched header |
+5 -10
@@ -402,16 +402,9 @@ def _package_result(rms_values: list, framerate: int, n_frames: int,
     window_dur = window_samples / framerate
     duration = n_frames / framerate
-    if len(rms_values) > 800:
-        step = len(rms_values) / 800
-        rms_display = [rms_values[int(i * step)] for i in range(800)]
-    else:
-        rms_display = rms_values
-    # Note: the full per-window RMS list is deliberately NOT returned — the UI
-    # only renders rms_display (~800 points), and the full list is ~45x larger.
+    # Note: no RMS data is returned — the UI is screen-reader oriented and
+    # draws no waveform, so sections + duration is all it needs.
     return {
-        'rms_display': rms_display,
         'sections': _loud_sections(rms_values, window_dur, duration, margin_db, min_gap, min_duration),
         'duration': round(duration, 2),
         'window': round(window_dur, 4),
@@ -690,7 +683,9 @@ class _Handler(BaseHTTPRequestHandler):
                     and cached.get('margin') == margin and cached.get('min_gap') == min_gap
                     and cached.get('min_duration') == min_duration):
                 payload = dict(cached['result'])
-                payload.pop('rms', None)  # caches written before the full-RMS field was dropped
+                # strip fields that older cache versions embedded
+                payload.pop('rms', None)          # full per-window RMS list
+                payload.pop('rms_display', None)  # ~800-point waveform preview
                 payload['cached'] = True
                 self._send(200, json.dumps(payload).encode('utf-8'), 'application/json')
                 return