feat: minimum section duration filter (--min-duration, default 0.5 s)

A single 100 ms RMS window above the noise floor used to become its own
section, so isolated pops (clicks, single raindrops) flooded a day with
thousands of sub-second clips like "21:18 to 21:18". Sections shorter
than min_duration (measured after min_gap merging, so a cluster of blips
spanning longer still flags) are now discarded.
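
The order of operations matters here: merge first, filter second. A minimal
sketch of that ordering, assuming loud windows have already been reduced to
sorted (start, end) spans in seconds. `merge_then_filter` is a hypothetical
helper for illustration, not the code in web.py, which works window by window:

```python
def merge_then_filter(spans, min_gap=2.0, min_duration=0.5):
    """Illustrative only: spans is a sorted list of (start, end) tuples in seconds."""
    merged = []
    for start, end in spans:
        # Join this span onto the previous one when the quiet gap is short enough
        if merged and start - merged[-1][1] <= min_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    # Filter only after merging; the small tolerance keeps spans of exactly
    # min_duration from being lost to float rounding
    return [(s, e) for s, e in merged if e - s >= min_duration - 1e-9]
```

Two 0.1 s pops 3 s apart stay separate and are both dropped, while two pops
1 s apart merge into a 1.1 s span that survives even with min_duration=1.0.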

Wired through all coupled places: CLI flag, /api/config, controls-bar
input, /api/analyze query param, and the analysis-cache head keys (old
two-key caches no longer match and are recomputed on next analyse).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 09:00:37 +02:00
parent e4d82483b5
commit f3716d3ff1
5 changed files with 114 additions and 42 deletions
+4 -4
@@ -21,7 +21,7 @@ Guidance for Claude Code when working in this repository.
```bash ```bash
python isr.py [config.ini] # recorder; --list-devices to list ALSA inputs python isr.py [config.ini] # recorder; --list-devices to list ALSA inputs
python web.py # web UI on :8080 (--dir, --port, --margin, --min-gap, --analyses-dir) python web.py # web UI on :8080 (--dir, --port, --margin, --min-gap, --min-duration, --analyses-dir)
python -m pytest tests/ # test suite python -m pytest tests/ # test suite
docker compose up -d / down # web UI mapped to host port 8050 docker compose up -d / down # web UI mapped to host port 8050
``` ```
@@ -56,9 +56,9 @@ Dependencies: `requests` (streams), `numpy` + `soundfile` (FLAC output and FLAC
- **Split timing:** files split at clock-aligned boundaries (`get_next_split_time()`), e.g. `split_minutes = 60` → on the hour. - **Split timing:** files split at clock-aligned boundaries (`get_next_split_time()`), e.g. `split_minutes = 60` → on the hour.
- **ALSA:** capture spawns `arecord` as a subprocess, raw PCM read in 100 ms chunks by a thread. Device spec resolution: `default` → exact `hw:X,Y` → partial name → fallback to any literal ALSA PCM name (so `shared_mic` from asound.conf works without appearing in `arecord -l`). - **ALSA:** capture spawns `arecord` as a subprocess, raw PCM read in 100 ms chunks by a thread. Device spec resolution: `default` → exact `hw:X,Y` → partial name → fallback to any literal ALSA PCM name (so `shared_mic` from asound.conf works without appearing in `arecord -l`).
- **Shutdown:** SIGTERM is converted to KeyboardInterrupt in `main()`; `RecorderManager.stop()` joins all threads against a single shared 25 s deadline to stay inside Docker's `stop_grace_period: 30s`. - **Shutdown:** SIGTERM is converted to KeyboardInterrupt in `main()`; `RecorderManager.stop()` joins all threads against a single shared 25 s deadline to stay inside Docker's `stop_grace_period: 30s`.
- **Loud-section detection is adaptive — do not regress it to an absolute threshold.** Per-window dB is compared against a rolling noise floor (`NOISE_PERCENTILE`-th percentile per `NOISE_BLOCK_SECONDS` block, min-smoothed over ±2 blocks so events can't raise their own floor; clamped to ≥ `MIN_RMS`). A section needs `margin` dB of prominence and carries a `score` (peak dB above floor) used for ranking. The original fixed RMS threshold flagged every ambience change (passing cars, rain) and produced ~600 useless sections/day — that is why it was replaced. Known limitation: a short (~10 s) swell on a quiet street still flags because the floor blocks are 30 s; the planned fix is an onset/spectral filter or optional Silero VAD, **not** a higher margin. Tests in `tests/test_web.py`. - **Loud-section detection is adaptive — do not regress it to an absolute threshold.** Per-window dB is compared against a rolling noise floor (`NOISE_PERCENTILE`-th percentile per `NOISE_BLOCK_SECONDS` block, min-smoothed over ±2 blocks so events can't raise their own floor; clamped to ≥ `MIN_RMS`). A section needs `margin` dB of prominence and carries a `score` (peak dB above floor) used for ranking. Sections shorter than `min_duration` (default 0.5 s, after `min_gap` merging) are discarded — without this, isolated 100 ms pops (clicks, single raindrops) produced thousands of zero-length sections per day. The original fixed RMS threshold flagged every ambience change (passing cars, rain) and produced ~600 useless sections/day — that is why it was replaced. Known limitation: a short (~10 s) swell on a quiet street still flags because the floor blocks are 30 s; the planned fix is an onset/spectral filter or optional Silero VAD, **not** a higher margin. Tests in `tests/test_web.py`.
- **Analysis params are coupled in five places.** CLI `--margin`/`--min-gap` → `/api/config` → UI inputs `#margin-input`/`#min-gap-input` → `/api/analyze` query params → cache JSON head keys. Renaming or adding a param means touching all five plus `cachedParamsMatch()` (see the threshold→margin change, commit `c84b7d8`). - **Analysis params are coupled in five places.** CLI `--margin`/`--min-gap`/`--min-duration` → `/api/config` → UI inputs `#margin-input`/`#min-gap-input`/`#min-duration-input` → `/api/analyze` query params → cache JSON head keys. Renaming or adding a param means touching all five plus `cachedParamsMatch()` and the `_cached_analysis_params()` regex (see the threshold→margin change `c84b7d8` and the min_duration addition).
- **Analysis cache:** results stored as `<analyses-dir>/<file>.analysis.json` keyed by margin+min_gap; orphans pruned at web startup. In Docker the recordings mount is **read-only** for the web container, so docker-compose layers a read-write `./recordings/analyses` bind mount over it. The `margin` and `min_gap` keys MUST stay first in the cache JSON — `_cached_analysis_params()` reads only the first 256 bytes to avoid parsing the large embedded result. Old `threshold`-keyed caches never match and get overwritten on the next analyse. - **Analysis cache:** results stored as `<analyses-dir>/<file>.analysis.json` keyed by margin+min_gap+min_duration; orphans pruned at web startup. In Docker the recordings mount is **read-only** for the web container, so docker-compose layers a read-write `./recordings/analyses` bind mount over it. The `margin`, `min_gap`, and `min_duration` keys MUST stay first in the cache JSON — `_cached_analysis_params()` reads only the first 256 bytes to avoid parsing the large embedded result. Caches written by older detector versions (missing a key) never match and get overwritten on the next analyse.
- **Analyze responses:** `/api/analyze` returns `rms_display` (~800 points), never the full per-window RMS list — the UI doesn't use it and it is ~45x larger. - **Analyze responses:** `/api/analyze` returns `rms_display` (~800 points), never the full per-window RMS list — the UI doesn't use it and it is ~45x larger.
- **Section playback uses clips, not seeks:** `/api/clip?file&start&end` decodes the slice server-side (wave/soundfile) and returns a standalone 16-bit WAV with exact Content-Length (capped at `CLIP_MAX_SECONDS`), `Cache-Control: private` so re-listening is free. The UI plays chips/J-K through the bottom clip bar (`clipQueue` in webui.html); seeking the full file only happens via "Open in file". Rationale (finding): libsndfile writes FLAC **without a SEEKTABLE**, so a browser seek bisects the whole multi-hundred-MB file with Range requests — seeking big FLACs in `<audio>` is inherently slow and must not be reintroduced as the primary navigation. Server-side `sf.SoundFile.seek()` on local disk is fast and frame-accurate. - **Section playback uses clips, not seeks:** `/api/clip?file&start&end` decodes the slice server-side (wave/soundfile) and returns a standalone 16-bit WAV with exact Content-Length (capped at `CLIP_MAX_SECONDS`), `Cache-Control: private` so re-listening is free. The UI plays chips/J-K through the bottom clip bar (`clipQueue` in webui.html); seeking the full file only happens via "Open in file". Rationale (finding): libsndfile writes FLAC **without a SEEKTABLE**, so a browser seek bisects the whole multi-hundred-MB file with Range requests — seeking big FLACs in `<audio>` is inherently slow and must not be reintroduced as the primary navigation. Server-side `sf.SoundFile.seek()` on local disk is fast and frame-accurate.
- **HTTP/1.1 keep-alive:** `_Handler.protocol_version = 'HTTP/1.1'`; every response path must set an accurate `Content-Length`. `_copy_to_response()` force-closes the connection if it under-delivers (file truncated mid-serve). - **HTTP/1.1 keep-alive:** `_Handler.protocol_version = 'HTTP/1.1'`; every response path must set an accurate `Content-Length`. `_copy_to_response()` force-closes the connection if it under-delivers (file truncated mid-serve).
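
The keys-first cache rule above can be illustrated with a small sketch. It assumes the cache is written with `json.dumps` from a dict whose parameter keys come first (Python dicts keep insertion order), and it reuses the regex shape from `_cached_analysis_params()`. `head_params` and the sample payloads are hypothetical names for illustration, and the sketch slices characters rather than reading 256 bytes from disk:

```python
import json
import re

# Same three-key pattern the cache reader relies on
HEAD_RE = re.compile(r'"margin":\s*([0-9.eE+-]+),\s*"min_gap":\s*([0-9.eE+-]+),'
                     r'\s*"min_duration":\s*([0-9.eE+-]+)')

def head_params(text, head_chars=256):
    """Return the three params from the head of a cache string, or None."""
    m = HEAD_RE.search(text[:head_chars])
    if not m:
        return None
    return dict(zip(('margin', 'min_gap', 'min_duration'), map(float, m.groups())))

# A cache in the new three-key layout, and one in the older two-key layout
new_cache = json.dumps({'margin': 12.0, 'min_gap': 2.0, 'min_duration': 0.5,
                        'result': {'sections': []}})
old_cache = json.dumps({'margin': 12.0, 'min_gap': 2.0,
                        'result': {'sections': []}})

print(head_params(new_cache))
print(head_params(old_cache))
```

The two-key payload yields `None`, which is why older caches never match and are recomputed on the next analyse.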
+5 -3
@@ -154,6 +154,7 @@ python web.py --dir /path/to/audio # custom recordings directory
python web.py --port 8888 # custom port python web.py --port 8888 # custom port
python web.py --margin 15 # dB above background noise for a section to count as loud (default 12) python web.py --margin 15 # dB above background noise for a section to count as loud (default 12)
python web.py --min-gap 15 # grace period in seconds for merging loud sections (default 2) python web.py --min-gap 15 # grace period in seconds for merging loud sections (default 2)
python web.py --min-duration 1 # discard loud sections shorter than this many seconds (default 0.5)
python web.py --analyses-dir /path/to/dir # where to store analysis cache files (default: <recordings>/analyses) python web.py --analyses-dir /path/to/dir # where to store analysis cache files (default: <recordings>/analyses)
``` ```
@@ -164,8 +165,9 @@ Shows recordings grouped by day with collapsible sections. Features:
- **Day groups** — recordings are grouped under a collapsible day heading showing date, file count, total duration, and total size. The most recent day is expanded by default; older days start collapsed. Expanded state is preserved across filter changes. - **Day groups** — recordings are grouped under a collapsible day heading showing date, file count, total duration, and total size. The most recent day is expanded by default; older days start collapsed. Expanded state is preserved across filter changes.
- **Day highlights** — click **★ Highlights** on any day heading to run loudness analysis across all WAV/FLAC files in that day and display a combined activity timeline SVG. Orange segments show when loud sections occurred relative to the day's time span; blue shows the file extents. Labels show the start, midpoint, and end times. When a day has more sections than fit as chips, the chips show the top 50 by score (loudest-above-background first) so the most promising events are reviewed first; J/K still steps through all sections in time order. - **Day highlights** — click **★ Highlights** on any day heading to run loudness analysis across all WAV/FLAC files in that day and display a combined activity timeline SVG. Orange segments show when loud sections occurred relative to the day's time span; blue shows the file extents. Labels show the start, midpoint, and end times. When a day has more sections than fit as chips, the chips show the top 50 by score (loudest-above-background first) so the most promising events are reviewed first; J/K still steps through all sections in time order.
- **Inline playback** — collapsible `▶ Play` button per row; audio loads lazily via a seekable `/stream/` endpoint with HTTP Range support. Metadata is fetched immediately so the duration is visible without pressing play. - **Inline playback** — collapsible `▶ Play` button per row; audio loads lazily via a seekable `/stream/` endpoint with HTTP Range support. Metadata is fetched immediately so the duration is visible without pressing play.
- **Waveform analysis** — on demand per file; computes RMS per 100 ms window and marks sections that stand out above the background. Detection is **adaptive**: a rolling noise floor (20th percentile per 30 s block) is estimated across the file, and a section is flagged when the level rises at least *margin* dB (default 12) above that floor. Slow ambience changes — rain setting in, day/night traffic hum — move the floor instead of producing false positives. Each section gets a **score** (its peak dB above the floor) used to rank sections by how much they stand out. Supported for WAV and FLAC (FLAC requires `numpy` + `soundfile`). Pure-Python fallback for WAV when numpy is absent. Results are cached in `recordings/analyses/<filename>.analysis.json`; subsequent requests at the same margin and min-gap settings return instantly without re-reading the audio. The cache file is deleted automatically when the audio file is deleted. Orphaned cache files (audio deleted outside the UI) are pruned on startup. - **Waveform analysis** — on demand per file; computes RMS per 100 ms window and marks sections that stand out above the background. Detection is **adaptive**: a rolling noise floor (20th percentile per 30 s block) is estimated across the file, and a section is flagged when the level rises at least *margin* dB (default 12) above that floor. Slow ambience changes — rain setting in, day/night traffic hum — move the floor instead of producing false positives. Each section gets a **score** (its peak dB above the floor) used to rank sections by how much they stand out. Supported for WAV and FLAC (FLAC requires `numpy` + `soundfile`). Pure-Python fallback for WAV when numpy is absent. Results are cached in `recordings/analyses/<filename>.analysis.json`; subsequent requests at the same margin, min-gap, and min-duration settings return instantly without re-reading the audio. The cache file is deleted automatically when the audio file is deleted. 
Orphaned cache files (audio deleted outside the UI) are pruned on startup.
- **Grace period** — configurable in the controls bar (default 2 s). Loud sections separated by less than this gap are merged into one. Raise this (e.g. to 15–30 s) when a single event generates many timestamps due to brief quiet gaps within it. - **Grace period** — configurable in the controls bar (default 2 s). Loud sections separated by less than this gap are merged into one. Raise this (e.g. to 15–30 s) when a single event generates many timestamps due to brief quiet gaps within it.
- **Min duration** — configurable in the controls bar (default 0.5 s). Loud sections shorter than this (after grace-period merging) are discarded, so isolated sub-second pops — a click, a single raindrop — don't flood a day with thousands of near-zero-length sections. Set to 0 to disable.
- **Clip playback** — clicking a loud-section chip plays a short server-rendered WAV clip (`/api/clip`, pre-roll included) in a player bar at the bottom of the page. Playback starts instantly even for sections deep inside multi-hundred-MB FLACs, because the browser never has to seek the full file. **J** / **K** (or ⏮ / ⏭) step through the queued sections — one file's, or a whole day's after ★ Highlights — and **Auto-advance** plays the next section when one ends, turning a day's detections into a continuous review reel. **⤴ Open in file** switches to the full recording at the same position for context; each chip click also pre-fills the cut panel. - **Clip playback** — clicking a loud-section chip plays a short server-rendered WAV clip (`/api/clip`, pre-roll included) in a player bar at the bottom of the page. Playback starts instantly even for sections deep inside multi-hundred-MB FLACs, because the browser never has to seek the full file. **J** / **K** (or ⏮ / ⏭) step through the queued sections — one file's, or a whole day's after ★ Highlights — and **Auto-advance** plays the next section when one ends, turning a day's detections into a continuous review reel. **⤴ Open in file** switches to the full recording at the same position for context; each chip click also pre-fills the cut panel.
- **Cut & download** — `✂ Cut` button opens the player row and reveals a cut panel. Enter start and end times in `m:ss` or `h:mm:ss` format and click **↓ Download cut** to receive an ffmpeg-trimmed copy without re-encoding. Requires ffmpeg (included in the Docker image). - **Cut & download** — `✂ Cut` button opens the player row and reveals a cut panel. Enter start and end times in `m:ss` or `h:mm:ss` format and click **↓ Download cut** to receive an ffmpeg-trimmed copy without re-encoding. Requires ffmpeg (included in the Docker image).
- **Filters** — live filename search and from/to date pickers above the table; applied client-side with no additional requests. Shows `N of M shown` when a filter is active. - **Filters** — live filename search and from/to date pickers above the table; applied client-side with no additional requests. Shows `N of M shown` when a filter is active.
@@ -182,14 +184,14 @@ Everything the UI does goes through these endpoints, so they can also be scripte
| Endpoint | Description | | Endpoint | Description |
|----------|-------------| |----------|-------------|
| `GET /api/files` | File listing with size, mtime, duration, recording state, cached-analysis params | | `GET /api/files` | File listing with size, mtime, duration, recording state, cached-analysis params |
| `GET /api/analyze?file=&margin=&min_gap=` | Loud-section analysis: `rms_display` (~800-point waveform), scored `sections`, `duration` | | `GET /api/analyze?file=&margin=&min_gap=&min_duration=` | Loud-section analysis: `rms_display` (~800-point waveform), scored `sections`, `duration` |
| `GET /api/clip?file=&start=&end=` | Section of a WAV/FLAC decoded server-side, returned as a standalone WAV (max 600 s) | | `GET /api/clip?file=&start=&end=` | Section of a WAV/FLAC decoded server-side, returned as a standalone WAV (max 600 s) |
| `GET /api/cut?file=&start=&end=` | ffmpeg-trimmed copy of the file as a download | | `GET /api/cut?file=&start=&end=` | ffmpeg-trimmed copy of the file as a download |
| `GET /stream/<name>` | Inline playback with HTTP Range support; live files get an on-the-fly patched header | | `GET /stream/<name>` | Inline playback with HTTP Range support; live files get an on-the-fly patched header |
| `GET /download/<name>` | Raw file download | | `GET /download/<name>` | Raw file download |
| `GET /api/status` | Currently recording files (`status.json` passthrough) | | `GET /api/status` | Currently recording files (`status.json` passthrough) |
| `GET /api/storage` | Disk free/total | | `GET /api/storage` | Disk free/total |
| `GET /api/config` | Server-side defaults for margin and min-gap (seeds the UI controls) | | `GET /api/config` | Server-side defaults for margin, min-gap, and min-duration (seeds the UI controls) |
| `DELETE /api/files/<name>` | Delete a recording and its analysis cache | | `DELETE /api/files/<name>` | Delete a recording and its analysis cache |
Analysis, clips, cut, and delete return `409` for files that are still being recorded. Analysis, clips, cut, and delete return `409` for files that are still being recorded.
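
Since the endpoints can be scripted, here is a minimal sketch of building an `/api/analyze` request with the new `min_duration` parameter. It assumes the web UI is reachable at `http://localhost:8080` (the default port; the host is an assumption), and `analyze_url` and the file name are hypothetical:

```python
from urllib.parse import urlencode

def analyze_url(filename, margin=12.0, min_gap=2.0, min_duration=0.5,
                base='http://localhost:8080'):
    """Build the query string for /api/analyze; base URL is an assumption."""
    query = urlencode({'file': filename, 'margin': margin,
                       'min_gap': min_gap, 'min_duration': min_duration})
    return f'{base}/api/analyze?{query}'

# Hypothetical file name; pass the result to urllib.request.urlopen to fetch it
print(analyze_url('a.flac', min_duration=1.0))
```

Requests that repeat the same margin, min-gap, and min-duration values should be served from the analysis cache, so scripted sweeps are cheapest when they reuse one parameter set.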
+34 -2
@@ -7,9 +7,9 @@ from web import _loud_sections, _noise_floor_db
WINDOW_DUR = 0.1 # 100 ms windows, as produced by WINDOW_SAMPLES at 48 kHz WINDOW_DUR = 0.1 # 100 ms windows, as produced by WINDOW_SAMPLES at 48 kHz
def _run(rms, margin_db=12.0, min_gap=2.0): def _run(rms, margin_db=12.0, min_gap=2.0, min_duration=0.5):
duration = len(rms) * WINDOW_DUR duration = len(rms) * WINDOW_DUR
return _loud_sections(rms, WINDOW_DUR, duration, margin_db, min_gap) return _loud_sections(rms, WINDOW_DUR, duration, margin_db, min_gap, min_duration)
def test_burst_above_quiet_floor_is_detected(): def test_burst_above_quiet_floor_is_detected():
@@ -60,6 +60,38 @@ def test_min_gap_merges_nearby_bursts():
assert sections[1]['start'] == 90.0 assert sections[1]['start'] == 90.0
def test_min_duration_drops_subsecond_blips():
# Isolated single-window pops (clicks, single raindrops) spaced wider than
# min_gap must not each become their own section — this is what used to
# produce thousands of zero-length sections per day.
rms = [0.002] * 1200
for i in range(600, 660, 30): # 0.1 s blips, 3 s apart (> min_gap)
rms[i] = 0.05
assert _run(rms) == []
# With the filter disabled they are all reported
assert len(_run(rms, min_duration=0.0)) == 2
def test_min_duration_keeps_sections_at_or_above_it():
rms = [0.002] * 1200
rms[600:605] = [0.05] * 5 # exactly 0.5 s
sections = _run(rms, min_duration=0.5)
assert len(sections) == 1
assert sections[0]['start'] == 60.0
def test_min_duration_applies_after_gap_merging():
# Two sub-min_duration blips within min_gap merge into one section whose
# loud span exceeds min_duration — the merged section must survive.
rms = [0.002] * 1200
rms[600] = 0.05
rms[610] = 0.05 # 1 s apart < 2 s min_gap → merged, 1.1 s span
sections = _run(rms, min_duration=1.0)
assert len(sections) == 1
assert sections[0]['start'] == 60.0
assert sections[0]['end'] >= 61.0
def test_noise_floor_tracks_blocks_and_ignores_short_events(): def test_noise_floor_tracks_blocks_and_ignores_short_events():
quiet_db = 20 * math.log10(0.002) quiet_db = 20 * math.log10(0.002)
db = [quiet_db] * 1200 db = [quiet_db] * 1200
+53 -25
@@ -49,6 +49,7 @@ AUDIO_EXTENSIONS = {'.wav', '.mp3', '.ogg', '.flac', '.aac', '.opus'}
WINDOW_SAMPLES = 4800 # 100 ms at 48 kHz WINDOW_SAMPLES = 4800 # 100 ms at 48 kHz
MARGIN_DB = 12.0 # sections must rise this many dB above the noise floor MARGIN_DB = 12.0 # sections must rise this many dB above the noise floor
MIN_GAP_SECONDS = 2.0 # merge loud sections separated by less than this MIN_GAP_SECONDS = 2.0 # merge loud sections separated by less than this
MIN_DURATION_SECONDS = 0.5 # discard loud sections shorter than this
NOISE_BLOCK_SECONDS = 30.0 # noise floor is estimated per block of this length NOISE_BLOCK_SECONDS = 30.0 # noise floor is estimated per block of this length
NOISE_PERCENTILE = 20 # percentile of windowed dB levels taken as the floor NOISE_PERCENTILE = 20 # percentile of windowed dB levels taken as the floor
@@ -292,10 +293,16 @@ def _noise_floor_db(db_values: list, window_dur: float) -> list:
def _loud_sections(rms_values: list, window_dur: float, duration: float, def _loud_sections(rms_values: list, window_dur: float, duration: float,
margin_db: float, min_gap: float = MIN_GAP_SECONDS) -> list: margin_db: float, min_gap: float = MIN_GAP_SECONDS,
min_duration: float = MIN_DURATION_SECONDS) -> list:
"""Sections whose level rises at least margin_db above the local noise """Sections whose level rises at least margin_db above the local noise
floor. Each section carries a 'score': its peak dB above the floor, used floor. Each section carries a 'score': its peak dB above the floor, used
by the UI to rank sections by how much they stand out.""" by the UI to rank sections by how much they stand out.
Sections shorter than min_duration (after min_gap merging) are discarded:
without this, every isolated 100 ms window that pops above the floor — a
click, a single raindrop — becomes its own section and a day can drown in
thousands of sub-second clips."""
db = [20 * math.log10(max(r, 1e-6)) for r in rms_values] db = [20 * math.log10(max(r, 1e-6)) for r in rms_values]
floor = _noise_floor_db(db, window_dur) floor = _noise_floor_db(db, window_dur)
min_db = 20 * math.log10(MIN_RMS) min_db = 20 * math.log10(MIN_RMS)
@@ -316,13 +323,15 @@ def _loud_sections(rms_values: list, window_dur: float, duration: float,
peak = max(peak, d - floor_eff) peak = max(peak, d - floor_eff)
else: else:
if start_t is not None and (t - last_loud_t) > min_gap: if start_t is not None and (t - last_loud_t) > min_gap:
sections.append({'start': round(start_t, 1), end_t = last_loud_t + window_dur
'end': round(last_loud_t + window_dur, 1), if end_t - start_t >= min_duration - 1e-9:
'score': round(peak, 1)}) sections.append({'start': round(start_t, 1),
'end': round(end_t, 1),
'score': round(peak, 1)})
start_t = None start_t = None
last_loud_t = None last_loud_t = None
if start_t is not None: if start_t is not None and (last_loud_t + window_dur - start_t) >= min_duration - 1e-9:
sections.append({'start': round(start_t, 1), 'end': round(duration, 1), sections.append({'start': round(start_t, 1), 'end': round(duration, 1),
'score': round(peak, 1)}) 'score': round(peak, 1)})
@@ -331,7 +340,8 @@ def _loud_sections(rms_values: list, window_dur: float, duration: float,
def _package_result(rms_values: list, framerate: int, n_frames: int, def _package_result(rms_values: list, framerate: int, n_frames: int,
window_samples: int, margin_db: float, window_samples: int, margin_db: float,
min_gap: float = MIN_GAP_SECONDS) -> dict: min_gap: float = MIN_GAP_SECONDS,
min_duration: float = MIN_DURATION_SECONDS) -> dict:
window_dur = window_samples / framerate window_dur = window_samples / framerate
duration = n_frames / framerate duration = n_frames / framerate
@@ -345,7 +355,7 @@ def _package_result(rms_values: list, framerate: int, n_frames: int,
# only renders rms_display (~800 points), and the full list is ~45x larger. # only renders rms_display (~800 points), and the full list is ~45x larger.
return { return {
'rms_display': rms_display, 'rms_display': rms_display,
'sections': _loud_sections(rms_values, window_dur, duration, margin_db, min_gap), 'sections': _loud_sections(rms_values, window_dur, duration, margin_db, min_gap, min_duration),
'duration': round(duration, 2), 'duration': round(duration, 2),
'window': round(window_dur, 4), 'window': round(window_dur, 4),
} }
@@ -353,7 +363,8 @@ def _package_result(rms_values: list, framerate: int, n_frames: int,
def analyze_wav(path: Path, window_samples: int = WINDOW_SAMPLES, def analyze_wav(path: Path, window_samples: int = WINDOW_SAMPLES,
margin_db: float = MARGIN_DB, margin_db: float = MARGIN_DB,
min_gap: float = MIN_GAP_SECONDS) -> dict: min_gap: float = MIN_GAP_SECONDS,
min_duration: float = MIN_DURATION_SECONDS) -> dict:
try: try:
with wave.open(str(path), 'rb') as wf: with wave.open(str(path), 'rb') as wf:
channels = wf.getnchannels() channels = wf.getnchannels()
@@ -365,12 +376,13 @@ def analyze_wav(path: Path, window_samples: int = WINDOW_SAMPLES,
except Exception as e: except Exception as e:
return {'error': str(e)} return {'error': str(e)}
return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap) return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap, min_duration)
def analyze_flac(path: Path, window_samples: int = WINDOW_SAMPLES, def analyze_flac(path: Path, window_samples: int = WINDOW_SAMPLES,
margin_db: float = MARGIN_DB, margin_db: float = MARGIN_DB,
min_gap: float = MIN_GAP_SECONDS) -> dict: min_gap: float = MIN_GAP_SECONDS,
min_duration: float = MIN_DURATION_SECONDS) -> dict:
"""Analyse a FLAC file for loudness. Requires numpy and soundfile.""" """Analyse a FLAC file for loudness. Requires numpy and soundfile."""
if not NUMPY_AVAILABLE or not SOUNDFILE_AVAILABLE: if not NUMPY_AVAILABLE or not SOUNDFILE_AVAILABLE:
return {'error': 'FLAC analysis requires: pip install numpy soundfile'} return {'error': 'FLAC analysis requires: pip install numpy soundfile'}
@@ -392,7 +404,7 @@ def analyze_flac(path: Path, window_samples: int = WINDOW_SAMPLES,
except Exception as e: except Exception as e:
return {'error': str(e)} return {'error': str(e)}
return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap) return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap, min_duration)
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -405,19 +417,21 @@ def _analysis_cache_path(analyses_base: Path, recordings_base: Path, audio_path:
def _cached_analysis_params(cache_path: Path): def _cached_analysis_params(cache_path: Path):
"""Read just margin/min_gap from a cache file without parsing the whole """Read just margin/min_gap/min_duration from a cache file without parsing
JSON (the embedded result can be hundreds of KB). Relies on the writer in the whole JSON (the embedded result can be hundreds of KB). Relies on the
_api_analyze putting these two keys first. Caches written by the old writer in _api_analyze putting these three keys first. Caches written by
fixed-threshold detector have no margin key and simply never match.""" older detector versions lack one of the keys and simply never match."""
try: try:
with open(cache_path, 'r', encoding='utf-8') as fh: with open(cache_path, 'r', encoding='utf-8') as fh:
head = fh.read(256) head = fh.read(256)
except OSError: except OSError:
return None return None
m = re.search(r'"margin":\s*([0-9.eE+-]+),\s*"min_gap":\s*([0-9.eE+-]+)', head) m = re.search(r'"margin":\s*([0-9.eE+-]+),\s*"min_gap":\s*([0-9.eE+-]+),'
r'\s*"min_duration":\s*([0-9.eE+-]+)', head)
if not m: if not m:
return None return None
return {'margin': float(m.group(1)), 'min_gap': float(m.group(2))} return {'margin': float(m.group(1)), 'min_gap': float(m.group(2)),
'min_duration': float(m.group(3))}
def prune_orphan_analyses(analyses_base: Path, recordings_base: Path): def prune_orphan_analyses(analyses_base: Path, recordings_base: Path):
@@ -518,6 +532,7 @@ class _Handler(BaseHTTPRequestHandler):
analyses_dir: str = 'recordings/analyses' analyses_dir: str = 'recordings/analyses'
margin_db: float = MARGIN_DB margin_db: float = MARGIN_DB
min_gap: float = MIN_GAP_SECONDS min_gap: float = MIN_GAP_SECONDS
min_duration: float = MIN_DURATION_SECONDS
def do_DELETE(self): def do_DELETE(self):
parsed = urlparse(self.path) parsed = urlparse(self.path)
@@ -593,6 +608,12 @@ class _Handler(BaseHTTPRequestHandler):
except (ValueError, TypeError): except (ValueError, TypeError):
min_gap = self.min_gap min_gap = self.min_gap
try:
min_duration = float(qs.get('min_duration', [self.min_duration])[0])
min_duration = max(0.0, min(60.0, min_duration))
except (ValueError, TypeError):
min_duration = self.min_duration
if self._is_active(filename): if self._is_active(filename):
self._json_err(409, 'File is currently being recorded — analysis unavailable until recording stops') self._json_err(409, 'File is currently being recorded — analysis unavailable until recording stops')
return return
@@ -602,7 +623,8 @@ class _Handler(BaseHTTPRequestHandler):
cache_path = _analysis_cache_path(analyses_base, recordings_base, path) cache_path = _analysis_cache_path(analyses_base, recordings_base, path)
try: try:
cached = json.loads(cache_path.read_text('utf-8')) cached = json.loads(cache_path.read_text('utf-8'))
if cached.get('margin') == margin and cached.get('min_gap') == min_gap: if (cached.get('margin') == margin and cached.get('min_gap') == min_gap
and cached.get('min_duration') == min_duration):
payload = dict(cached['result']) payload = dict(cached['result'])
payload.pop('rms', None) # caches written before the full-RMS field was dropped payload.pop('rms', None) # caches written before the full-RMS field was dropped
payload['cached'] = True payload['cached'] = True
@@ -613,12 +635,12 @@ class _Handler(BaseHTTPRequestHandler):
ext = path.suffix.lower() ext = path.suffix.lower()
if ext == '.wav': if ext == '.wav':
result = analyze_wav(path, margin_db=margin, min_gap=min_gap) result = analyze_wav(path, margin_db=margin, min_gap=min_gap, min_duration=min_duration)
elif ext == '.flac': elif ext == '.flac':
if not (NUMPY_AVAILABLE and SOUNDFILE_AVAILABLE): if not (NUMPY_AVAILABLE and SOUNDFILE_AVAILABLE):
self._json_err(400, 'FLAC analysis requires: pip install numpy soundfile') self._json_err(400, 'FLAC analysis requires: pip install numpy soundfile')
return return
result = analyze_flac(path, margin_db=margin, min_gap=min_gap) result = analyze_flac(path, margin_db=margin, min_gap=min_gap, min_duration=min_duration)
else: else:
self._json_err(400, f'Loudness analysis is not available for {ext} files') self._json_err(400, f'Loudness analysis is not available for {ext} files')
return return
@@ -626,9 +648,10 @@ class _Handler(BaseHTTPRequestHandler):
 try:
     cache_path.parent.mkdir(parents=True, exist_ok=True)
     tmp = cache_path.with_suffix('.tmp')
-    # margin and min_gap MUST stay first: _cached_analysis_params reads
-    # only the first 256 bytes of this file
-    tmp.write_text(json.dumps({'margin': margin, 'min_gap': min_gap, 'result': result}), 'utf-8')
+    # margin, min_gap and min_duration MUST stay first:
+    # _cached_analysis_params reads only the first 256 bytes of this file
+    tmp.write_text(json.dumps({'margin': margin, 'min_gap': min_gap,
+                               'min_duration': min_duration, 'result': result}), 'utf-8')
     os.replace(tmp, cache_path)
 except Exception as e:
     print(f'Warning: could not write analysis cache {cache_path}: {e}', flush=True)
@@ -745,7 +768,8 @@ class _Handler(BaseHTTPRequestHandler):
     self._send(200, data.encode(), 'application/json')
 def _api_config(self):
-    data = json.dumps({'margin': self.margin_db, 'min_gap': self.min_gap})
+    data = json.dumps({'margin': self.margin_db, 'min_gap': self.min_gap,
+                       'min_duration': self.min_duration})
     self._send(200, data.encode(), 'application/json')
 def _api_delete(self, filename: str):
@@ -1041,6 +1065,9 @@ def main():
                     f'to count as loud (default: {MARGIN_DB})')
 parser.add_argument('--min-gap', type=float, default=MIN_GAP_SECONDS,
                     help=f'Seconds gap for merging loud sections (default: {MIN_GAP_SECONDS})')
+parser.add_argument('--min-duration', type=float, default=MIN_DURATION_SECONDS,
+                    help=f'Discard loud sections shorter than this many seconds '
+                         f'(default: {MIN_DURATION_SECONDS})')
 parser.add_argument('--analyses-dir', default=None,
                     help='Directory for analysis cache files (default: <recordings-dir>/analyses)')
 args = parser.parse_args()
@@ -1059,6 +1086,7 @@ def main():
 analyses_dir = str(_analyses_dir)
 margin_db = args.margin
 min_gap = args.min_gap
+min_duration = args.min_duration
 server = _Server((args.host, args.port), Handler)
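The "MUST stay first" comment in the cache-writing hunk relies on the parameter keys sitting at the very start of the file, so that a reader can check them without parsing the whole cached result. A minimal sketch of that idea follows. The helper names `write_cache` and `read_head_params` and the regex-based parsing are assumptions made for illustration; the diff does not show how `_cached_analysis_params` actually parses the head.

```python
import json
import re
import tempfile
from pathlib import Path

HEAD_BYTES = 256  # the diff comment states that only the first 256 bytes are read

# Matches a JSON number as written by json.dumps (int, float, or exponent form).
_NUM = r'(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)'
# Hypothetical head pattern: the three parameter keys, in this exact order.
_HEAD_RE = re.compile(
    r'^\{"margin": ' + _NUM + r', "min_gap": ' + _NUM + r', "min_duration": ' + _NUM + r','
)


def write_cache(path, margin, min_gap, min_duration, result):
    """Write the cache with the parameter keys first (dicts keep insertion order)."""
    payload = {'margin': margin, 'min_gap': min_gap,
               'min_duration': min_duration, 'result': result}
    path.write_text(json.dumps(payload), 'utf-8')


def read_head_params(path):
    """Return the three parameters from the file head, or None if they are not there."""
    with open(path, 'rb') as fh:
        head = fh.read(HEAD_BYTES).decode('utf-8', errors='replace')
    match = _HEAD_RE.match(head)
    if match is None:
        return None  # e.g. an older two-key cache without min_duration
    margin, min_gap, min_duration = (float(g) for g in match.groups())
    return {'margin': margin, 'min_gap': min_gap, 'min_duration': min_duration}


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp_dir:
        cache_file = Path(tmp_dir) / 'example.json'
        write_cache(cache_file, 12.0, 2.0, 0.5, {'sections': []})
        print(read_head_params(cache_file))
```

Under these assumptions, a cache written before this commit has no `min_duration` key in its head, so the head check fails and the file is treated as a miss, which is the behaviour the commit message describes for old two-key caches.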
+18 -8
@@ -155,6 +155,10 @@ body.clip-open{padding-bottom:70px}
 <input type="number" id="min-gap-input" min="0" max="300" step="0.5" value="2"
        aria-describedby="min-gap-hint">
 <span id="min-gap-hint" class="controls-hint">seconds — merge loud sections closer than this</span>
+<label for="min-duration-input" style="margin-left:16px">Min duration:</label>
+<input type="number" id="min-duration-input" min="0" max="60" step="0.1" value="0.5"
+       aria-describedby="min-duration-hint">
+<span id="min-duration-hint" class="controls-hint">seconds — ignore loud sections shorter than this</span>
 </div>
 <div class="filter-bar" role="search" aria-label="Filter recordings">
 <label for="filter-name">Search:</label>
@@ -403,16 +407,17 @@ document.getElementById('clip-context').addEventListener('click', () => {
   seekToSection(c.fileIdx, c.filename, c.start, c.end, null);
 });
-// filename|margin|gap -> analysis result, so re-renders (filtering,
+// filename|margin|gap|minDur -> analysis result, so re-renders (filtering,
 // refresh) never refetch what this session already has
 const analysisCache = new Map();
-async function fetchAnalysis(filename, margin, minGap, force = false) {
-  const key = `${filename}|${margin}|${minGap}`;
+async function fetchAnalysis(filename, margin, minGap, minDur, force = false) {
+  const key = `${filename}|${margin}|${minGap}|${minDur}`;
   if (!force && analysisCache.has(key)) return analysisCache.get(key);
   const r = await fetch('/api/analyze?file='+encodeURIComponent(filename)
     +'&margin='+encodeURIComponent(margin)
-    +'&min_gap='+encodeURIComponent(minGap));
+    +'&min_gap='+encodeURIComponent(minGap)
+    +'&min_duration='+encodeURIComponent(minDur));
   const d = await r.json();
   if (!d.error) analysisCache.set(key, d);
   return d;
@@ -424,13 +429,14 @@ async function analyse(idx, filename, cell, btn, force = false) {
 cell.innerHTML = '<div class="spin" aria-live="polite" aria-busy="true">Analysing…</div>';
 const margin = document.getElementById('margin-input').value || '12';
 const minGap = document.getElementById('min-gap-input').value || '2';
+const minDur = document.getElementById('min-duration-input').value || '0.5';
 const restoreBtn = () => {
   btn.textContent = 'Analyse'; btn.disabled = false;
   btn.onclick = () => analyse(idx, filename, cell, btn);
   if (!cell.contains(btn)) cell.appendChild(btn);
 };
 try {
-  const d = await fetchAnalysis(filename, margin, minGap, force);
+  const d = await fetchAnalysis(filename, margin, minGap, minDur, force);
   if (d.error) {
     cell.innerHTML = `<div class="spin" role="alert">Error: ${esc(d.error)}</div>`;
     restoreBtn(); return;
@@ -439,7 +445,7 @@ async function analyse(idx, filename, cell, btn, force = false) {
 box.appendChild(drawWave(d.rms_display||[], d.sections||[], d.duration||0, filename));
 const meta = document.createElement('div'); meta.className='analysis-meta';
-meta.textContent = `margin: ${margin} dB · gap: ${minGap}s${d.cached ? ' · cached' : ''}`;
+meta.textContent = `margin: ${margin} dB · gap: ${minGap}s · min: ${minDur}s${d.cached ? ' · cached' : ''}`;
 box.appendChild(meta);
 const chips = document.createElement('div');
@@ -576,7 +582,8 @@ async function updateStorage() {
 function cachedParamsMatch(ca) {
   return ca != null
     && Number(ca.margin) === parseFloat(document.getElementById('margin-input').value)
-    && Number(ca.min_gap) === parseFloat(document.getElementById('min-gap-input').value);
+    && Number(ca.min_gap) === parseFloat(document.getElementById('min-gap-input').value)
+    && Number(ca.min_duration) === parseFloat(document.getElementById('min-duration-input').value);
 }
 // Run the deferred analyses of a freshly expanded day
@@ -852,6 +859,7 @@ async function dayHighlights(dayId, analyzableFiles) {
 const margin = document.getElementById('margin-input').value || '12';
 const minGap = document.getElementById('min-gap-input').value || '2';
+const minDur = document.getElementById('min-duration-input').value || '0.5';
 const results = [];
 let nCached = 0, nLive = 0;
@@ -860,7 +868,7 @@ async function dayHighlights(dayId, analyzableFiles) {
   progFile.textContent = `${i + 1} / ${n}${f.name}`;
   progFill.style.width = `${(i / n) * 100}%`;
   try {
-    const d = await fetchAnalysis(f.name, margin, minGap);
+    const d = await fetchAnalysis(f.name, margin, minGap, minDur);
     if (!d.error) { results.push({ f, data: d }); d.cached ? nCached++ : nLive++; }
   } catch(e) {}
 }
@@ -1099,6 +1107,8 @@ fetch('/api/config').then(r => r.json()).then(cfg => {
 document.getElementById('margin-input').value = cfg.margin;
 if (cfg.min_gap != null)
   document.getElementById('min-gap-input').value = cfg.min_gap;
+if (cfg.min_duration != null)
+  document.getElementById('min-duration-input').value = cfg.min_duration;
 }).catch(() => {}).finally(() => load().then(() => setInterval(pollStatus, 5000)));
 </script>
 </body>
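The commit message says the duration check runs after `min_gap` merging, so a cluster of short blips that spans long enough is still flagged. The hunks above only pass `min_duration` through to `analyze_wav` and `analyze_flac` and do not show the filter itself, so the following is a rough sketch of that ordering rather than the repository's code. The function name `merge_and_filter` and the `(start, end)` tuple representation are hypothetical, and the strict "closer than" and "shorter than" comparisons are taken from the controls-bar hint texts.

```python
def merge_and_filter(sections, min_gap, min_duration):
    """Merge nearby loud sections, then drop the ones that are still too short.

    sections: list of (start, end) tuples in seconds, sorted by start time.
    """
    merged = []
    for start, end in sections:
        # Assumption: sections separated by less than min_gap are joined.
        if merged and start - merged[-1][1] < min_gap:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    # The duration filter runs on the merged sections, not on the raw windows.
    return [(s, e) for s, e in merged if e - s >= min_duration]


if __name__ == '__main__':
    raw = [(0.0, 0.1), (10.0, 10.1), (10.5, 10.6), (11.0, 11.2)]
    print(merge_and_filter(raw, min_gap=2.0, min_duration=0.5))
```

In this example the isolated 100 ms pop at the start is discarded, while the three blips between 10.0 s and 11.2 s merge into one section that is long enough to survive the filter.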