feat: minimum section duration filter (--min-duration, default 0.5 s)

A single 100 ms RMS window above the noise floor used to become its own section, so isolated pops (clicks, single raindrops) flooded a day with thousands of sub-second clips like "21:18 to 21:18". Sections shorter than min_duration are now discarded. The length is measured after min_gap merging, so a cluster of blips whose merged span reaches min_duration is still flagged.

Wired through all coupled places: CLI flag, /api/config, controls-bar input, /api/analyze query param, and the analysis-cache head keys. Old two-key caches no longer match and are recomputed on the next analyse.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 09:00:37 +02:00
parent e4d82483b5
commit f3716d3ff1
5 changed files with 114 additions and 42 deletions
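
The merge-then-filter order described in the commit message can be sketched in isolation. This is a simplified illustration, not the code in `web.py`: the function name `merge_and_filter` and its input (a list of start times of loud windows, in seconds) are assumptions made for the example, and the noise-floor and score logic is left out.

```python
def merge_and_filter(loud_times, window_dur=0.1, min_gap=2.0, min_duration=0.5):
    """Hypothetical helper: merge loud windows separated by at most min_gap,
    then drop merged sections whose loud span is shorter than min_duration."""
    sections = []
    start = last = None
    for t in sorted(loud_times):
        if start is None:
            start = last = t
        elif t - last > min_gap:
            # Gap too wide: close the current section and open a new one
            sections.append((start, last + window_dur))
            start = last = t
        else:
            last = t
    if start is not None:
        sections.append((start, last + window_dur))
    # The small epsilon keeps a section of exactly min_duration despite
    # floating-point error, mirroring the `- 1e-9` in the diff below
    return [(round(s, 1), round(e, 1)) for s, e in sections
            if e - s >= min_duration - 1e-9]
```

With the defaults, two single-window blips 3 s apart produce no sections, while passing `min_duration=0.0` reports both. Two blips 1 s apart merge into one 1.1 s section and therefore survive even `min_duration=1.0`.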
@@ -21,7 +21,7 @@ Guidance for Claude Code when working in this repository.

 ```bash
 python isr.py [config.ini]        # recorder; --list-devices to list ALSA inputs
-python web.py                     # web UI on :8080 (--dir, --port, --margin, --min-gap, --analyses-dir)
+python web.py                     # web UI on :8080 (--dir, --port, --margin, --min-gap, --min-duration, --analyses-dir)
 python -m pytest tests/           # test suite
 docker compose up -d / down       # web UI mapped to host port 8050
 ```
@@ -56,9 +56,9 @@ Dependencies: `requests` (streams), `numpy` + `soundfile` (FLAC output and FLAC
 - **Split timing:** files split at clock-aligned boundaries (`get_next_split_time()`), e.g. `split_minutes = 60` → on the hour.
 - **ALSA:** capture spawns `arecord` as a subprocess, raw PCM read in 100 ms chunks by a thread. Device spec resolution: `default` → exact `hw:X,Y` → partial name → fallback to any literal ALSA PCM name (so `shared_mic` from asound.conf works without appearing in `arecord -l`).
 - **Shutdown:** SIGTERM is converted to KeyboardInterrupt in `main()`; `RecorderManager.stop()` joins all threads against a single shared 25 s deadline to stay inside Docker's `stop_grace_period: 30s`.
- **Loud-section detection is adaptive — do not regress it to an absolute threshold.** Per-window dB is compared against a rolling noise floor (`NOISE_PERCENTILE`-th percentile per `NOISE_BLOCK_SECONDS` block, min-smoothed over ±2 blocks so events can't raise their own floor; clamped to ≥ `MIN_RMS`). A section needs `margin` dB of prominence and carries a `score` (peak dB above floor) used for ranking. The original fixed RMS threshold flagged every ambience change (passing cars, rain) and produced ~600 useless sections/day — that is why it was replaced. Known limitation: a short (~10 s) swell on a quiet street still flags because the floor blocks are 30 s; the planned fix is an onset/spectral filter or optional Silero VAD, **not** a higher margin. Tests in `tests/test_web.py`.
- **Analysis params are coupled in five places.** CLI `--margin`/`--min-gap` → `/api/config` → UI inputs `#margin-input`/`#min-gap-input` → `/api/analyze` query params → cache JSON head keys. Renaming or adding a param means touching all five plus `cachedParamsMatch()` (see the threshold→margin change, commit `c84b7d8`).
- **Analysis cache:** results stored as `<analyses-dir>/<file>.analysis.json` keyed by margin+min_gap; orphans pruned at web startup. In Docker the recordings mount is **read-only** for the web container, so docker-compose layers a read-write `./recordings/analyses` bind mount over it. The `margin` and `min_gap` keys MUST stay first in the cache JSON — `_cached_analysis_params()` reads only the first 256 bytes to avoid parsing the large embedded result. Old `threshold`-keyed caches never match and get overwritten on the next analyse.
+- **Loud-section detection is adaptive — do not regress it to an absolute threshold.** Per-window dB is compared against a rolling noise floor (`NOISE_PERCENTILE`-th percentile per `NOISE_BLOCK_SECONDS` block, min-smoothed over ±2 blocks so events can't raise their own floor; clamped to ≥ `MIN_RMS`). A section needs `margin` dB of prominence and carries a `score` (peak dB above floor) used for ranking. Sections shorter than `min_duration` (default 0.5 s, measured after `min_gap` merging) are discarded; without this, isolated 100 ms pops (clicks, single raindrops) produced thousands of sub-second sections per day. The original fixed RMS threshold flagged every ambience change (passing cars, rain) and produced ~600 useless sections/day — that is why it was replaced. Known limitation: a short (~10 s) swell on a quiet street still flags because the floor blocks are 30 s; the planned fix is an onset/spectral filter or optional Silero VAD, **not** a higher margin. Tests in `tests/test_web.py`.
+- **Analysis params are coupled in five places.** CLI `--margin`/`--min-gap`/`--min-duration` → `/api/config` → UI inputs `#margin-input`/`#min-gap-input`/`#min-duration-input` → `/api/analyze` query params → cache JSON head keys. Renaming or adding a param means touching all five plus `cachedParamsMatch()` and the `_cached_analysis_params()` regex (see the threshold→margin change `c84b7d8` and the min_duration addition).
+- **Analysis cache:** results stored as `<analyses-dir>/<file>.analysis.json` keyed by margin+min_gap+min_duration; orphans pruned at web startup. In Docker the recordings mount is **read-only** for the web container, so docker-compose layers a read-write `./recordings/analyses` bind mount over it. The `margin`, `min_gap`, and `min_duration` keys MUST stay first in the cache JSON — `_cached_analysis_params()` reads only the first 256 bytes to avoid parsing the large embedded result. Caches written by older detector versions (missing a key) never match and get overwritten on the next analyse.
 - **Analyze responses:** `/api/analyze` returns `rms_display` (~800 points), never the full per-window RMS list — the UI doesn't use it and it is ~45x larger.
 - **Section playback uses clips, not seeks:** `/api/clip?file&start&end` decodes the slice server-side (wave/soundfile) and returns a standalone 16-bit WAV with exact Content-Length (capped at `CLIP_MAX_SECONDS`), `Cache-Control: private` so re-listening is free. The UI plays chips/J-K through the bottom clip bar (`clipQueue` in webui.html); seeking the full file only happens via "Open in file". Rationale (finding): libsndfile writes FLAC **without a SEEKTABLE**, so a browser seek bisects the whole multi-hundred-MB file with Range requests — seeking big FLACs in `<audio>` is inherently slow and must not be reintroduced as the primary navigation. Server-side `sf.SoundFile.seek()` on local disk is fast and frame-accurate.
 - **HTTP/1.1 keep-alive:** `_Handler.protocol_version = 'HTTP/1.1'`; every response path must set an accurate `Content-Length`. `_copy_to_response()` force-closes the connection if it under-delivers (file truncated mid-serve).
@@ -154,6 +154,7 @@ python web.py --dir /path/to/audio         # custom recordings directory
 python web.py --port 8888                  # custom port
 python web.py --margin 15                  # dB above background noise for a section to count as loud (default 12)
 python web.py --min-gap 15                 # grace period in seconds for merging loud sections (default 2)
+python web.py --min-duration 1             # discard loud sections shorter than this many seconds (default 0.5)
 python web.py --analyses-dir /path/to/dir  # where to store analysis cache files (default: <recordings>/analyses)
 ```

@@ -164,8 +165,9 @@ Shows recordings grouped by day with collapsible sections. Features:
 - **Day groups** — recordings are grouped under a collapsible day heading showing date, file count, total duration, and total size. The most recent day is expanded by default; older days start collapsed. Expanded state is preserved across filter changes.
 - **Day highlights** — click **★ Highlights** on any day heading to run loudness analysis across all WAV/FLAC files in that day and display a combined activity timeline SVG. Orange segments show when loud sections occurred relative to the day's time span; blue shows the file extents. Labels show the start, midpoint, and end times. When a day has more sections than fit as chips, the chips show the top 50 by score (loudest-above-background first) so the most promising events are reviewed first; J/K still steps through all sections in time order.
 - **Inline playback** — collapsible `▶ Play` button per row; audio loads lazily via a seekable `/stream/` endpoint with HTTP Range support. Metadata is fetched immediately so the duration is visible without pressing play.
- **Waveform analysis** — on demand per file; computes RMS per 100 ms window and marks sections that stand out above the background. Detection is **adaptive**: a rolling noise floor (20th percentile per 30 s block) is estimated across the file, and a section is flagged when the level rises at least *margin* dB (default 12) above that floor. Slow ambience changes — rain setting in, day/night traffic hum — move the floor instead of producing false positives. Each section gets a **score** (its peak dB above the floor) used to rank sections by how much they stand out. Supported for WAV and FLAC (FLAC requires `numpy` + `soundfile`). Pure-Python fallback for WAV when numpy is absent. Results are cached in `recordings/analyses/<filename>.analysis.json`; subsequent requests at the same margin and min-gap settings return instantly without re-reading the audio. The cache file is deleted automatically when the audio file is deleted. Orphaned cache files (audio deleted outside the UI) are pruned on startup.
+- **Waveform analysis** — on demand per file; computes RMS per 100 ms window and marks sections that stand out above the background. Detection is **adaptive**: a rolling noise floor (20th percentile per 30 s block) is estimated across the file, and a section is flagged when the level rises at least *margin* dB (default 12) above that floor. Slow ambience changes — rain setting in, day/night traffic hum — move the floor instead of producing false positives. Each section gets a **score** (its peak dB above the floor) used to rank sections by how much they stand out. Supported for WAV and FLAC (FLAC requires `numpy` + `soundfile`). Pure-Python fallback for WAV when numpy is absent. Results are cached in `recordings/analyses/<filename>.analysis.json`; subsequent requests at the same margin, min-gap, and min-duration settings return instantly without re-reading the audio. The cache file is deleted automatically when the audio file is deleted. Orphaned cache files (audio deleted outside the UI) are pruned on startup.
 - **Grace period** — configurable in the controls bar (default 2 s). Loud sections separated by less than this gap are merged into one. Raise this (e.g. to 15–30 s) when a single event generates many timestamps due to brief quiet gaps within it.
+- **Min duration** — configurable in the controls bar (default 0.5 s). Loud sections shorter than this (after grace-period merging) are discarded, so isolated sub-second pops — a click, a single raindrop — don't flood a day with thousands of near-zero-length sections. Set to 0 to disable.
 - **Clip playback** — clicking a loud-section chip plays a short server-rendered WAV clip (`/api/clip`, pre-roll included) in a player bar at the bottom of the page. Playback starts instantly even for sections deep inside multi-hundred-MB FLACs, because the browser never has to seek the full file. **J** / **K** (or ⏮ / ⏭) step through the queued sections — one file's, or a whole day's after ★ Highlights — and **Auto-advance** plays the next section when one ends, turning a day's detections into a continuous review reel. **⤴ Open in file** switches to the full recording at the same position for context; each chip click also pre-fills the cut panel.
 - **Cut & download** — `✂ Cut` button opens the player row and reveals a cut panel. Enter start and end times in `m:ss` or `h:mm:ss` format and click **↓ Download cut** to receive an ffmpeg-trimmed copy without re-encoding. Requires ffmpeg (included in the Docker image).
 - **Filters** — live filename search and from/to date pickers above the table; applied client-side with no additional requests. Shows `N of M shown` when a filter is active.
@@ -182,14 +184,14 @@ Everything the UI does goes through these endpoints, so they can also be scripte
 | Endpoint | Description |
 |----------|-------------|
 | `GET /api/files` | File listing with size, mtime, duration, recording state, cached-analysis params |
-| `GET /api/analyze?file=&margin=&min_gap=` | Loud-section analysis: `rms_display` (~800-point waveform), scored `sections`, `duration` |
+| `GET /api/analyze?file=&margin=&min_gap=&min_duration=` | Loud-section analysis: `rms_display` (~800-point waveform), scored `sections`, `duration` |
 | `GET /api/clip?file=&start=&end=` | Section of a WAV/FLAC decoded server-side, returned as a standalone WAV (max 600 s) |
 | `GET /api/cut?file=&start=&end=` | ffmpeg-trimmed copy of the file as a download |
 | `GET /stream/<name>` | Inline playback with HTTP Range support; live files get an on-the-fly patched header |
 | `GET /download/<name>` | Raw file download |
 | `GET /api/status` | Currently recording files (`status.json` passthrough) |
 | `GET /api/storage` | Disk free/total |
-| `GET /api/config` | Server-side defaults for margin and min-gap (seeds the UI controls) |
+| `GET /api/config` | Server-side defaults for margin, min-gap, and min-duration (seeds the UI controls) |
 | `DELETE /api/files/<name>` | Delete a recording and its analysis cache |

 Analysis, clips, cut, and delete return `409` for files that are still being recorded.
@@ -7,9 +7,9 @@ from web import _loud_sections, _noise_floor_db
 WINDOW_DUR = 0.1  # 100 ms windows, as produced by WINDOW_SAMPLES at 48 kHz


-def _run(rms, margin_db=12.0, min_gap=2.0):
+def _run(rms, margin_db=12.0, min_gap=2.0, min_duration=0.5):
    duration = len(rms) * WINDOW_DUR
-    return _loud_sections(rms, WINDOW_DUR, duration, margin_db, min_gap)
+    return _loud_sections(rms, WINDOW_DUR, duration, margin_db, min_gap, min_duration)


 def test_burst_above_quiet_floor_is_detected():
@@ -60,6 +60,38 @@ def test_min_gap_merges_nearby_bursts():
    assert sections[1]['start'] == 90.0


+def test_min_duration_drops_subsecond_blips():
+    # Isolated single-window pops (clicks, single raindrops) spaced wider than
+    # min_gap must not each become their own section; this is what used to
+    # produce thousands of sub-second (0.1 s) sections per day.
+    rms = [0.002] * 1200
+    for i in range(600, 660, 30):   # 0.1 s blips, 3 s apart (> min_gap)
+        rms[i] = 0.05
+    assert _run(rms) == []
+    # With the filter disabled they are all reported
+    assert len(_run(rms, min_duration=0.0)) == 2
+
+
+def test_min_duration_keeps_sections_at_or_above_it():
+    rms = [0.002] * 1200
+    rms[600:605] = [0.05] * 5     # exactly 0.5 s
+    sections = _run(rms, min_duration=0.5)
+    assert len(sections) == 1
+    assert sections[0]['start'] == 60.0
+
+
+def test_min_duration_applies_after_gap_merging():
+    # Two sub-min_duration blips within min_gap merge into one section whose
+    # loud span exceeds min_duration — the merged section must survive.
+    rms = [0.002] * 1200
+    rms[600] = 0.05
+    rms[610] = 0.05               # 1 s apart < 2 s min_gap → merged, 1.1 s span
+    sections = _run(rms, min_duration=1.0)
+    assert len(sections) == 1
+    assert sections[0]['start'] == 60.0
+    assert sections[0]['end'] >= 61.0
+
+
 def test_noise_floor_tracks_blocks_and_ignores_short_events():
    quiet_db = 20 * math.log10(0.002)
    db = [quiet_db] * 1200
@@ -49,6 +49,7 @@ AUDIO_EXTENSIONS = {'.wav', '.mp3', '.ogg', '.flac', '.aac', '.opus'}
 WINDOW_SAMPLES   = 4800    # 100 ms at 48 kHz
 MARGIN_DB        = 12.0    # sections must rise this many dB above the noise floor
 MIN_GAP_SECONDS  = 2.0     # merge loud sections separated by less than this
+MIN_DURATION_SECONDS = 0.5 # discard loud sections shorter than this

 NOISE_BLOCK_SECONDS = 30.0 # noise floor is estimated per block of this length
 NOISE_PERCENTILE    = 20   # percentile of windowed dB levels taken as the floor
@@ -292,10 +293,16 @@ def _noise_floor_db(db_values: list, window_dur: float) -> list:


 def _loud_sections(rms_values: list, window_dur: float, duration: float,
-                   margin_db: float, min_gap: float = MIN_GAP_SECONDS) -> list:
+                   margin_db: float, min_gap: float = MIN_GAP_SECONDS,
+                   min_duration: float = MIN_DURATION_SECONDS) -> list:
    """Sections whose level rises at least margin_db above the local noise
    floor. Each section carries a 'score': its peak dB above the floor, used
-    by the UI to rank sections by how much they stand out."""
+    by the UI to rank sections by how much they stand out.
+
+    Sections shorter than min_duration (after min_gap merging) are discarded:
+    without this, every isolated 100 ms window that pops above the floor — a
+    click, a single raindrop — becomes its own section and a day can drown in
+    thousands of sub-second clips."""
    db = [20 * math.log10(max(r, 1e-6)) for r in rms_values]
    floor = _noise_floor_db(db, window_dur)
    min_db = 20 * math.log10(MIN_RMS)
@@ -316,13 +323,15 @@ def _loud_sections(rms_values: list, window_dur: float, duration: float,
            peak = max(peak, d - floor_eff)
        else:
            if start_t is not None and (t - last_loud_t) > min_gap:
-                sections.append({'start': round(start_t, 1),
-                                 'end':   round(last_loud_t + window_dur, 1),
-                                 'score': round(peak, 1)})
+                end_t = last_loud_t + window_dur
+                if end_t - start_t >= min_duration - 1e-9:
+                    sections.append({'start': round(start_t, 1),
+                                     'end':   round(end_t, 1),
+                                     'score': round(peak, 1)})
                start_t = None
                last_loud_t = None

-    if start_t is not None:
+    if start_t is not None and (last_loud_t + window_dur - start_t) >= min_duration - 1e-9:
        sections.append({'start': round(start_t, 1), 'end': round(duration, 1),
                         'score': round(peak, 1)})

@@ -331,7 +340,8 @@ def _loud_sections(rms_values: list, window_dur: float, duration: float,

 def _package_result(rms_values: list, framerate: int, n_frames: int,
                    window_samples: int, margin_db: float,
-                    min_gap: float = MIN_GAP_SECONDS) -> dict:
+                    min_gap: float = MIN_GAP_SECONDS,
+                    min_duration: float = MIN_DURATION_SECONDS) -> dict:
    window_dur = window_samples / framerate
    duration   = n_frames / framerate

@@ -345,7 +355,7 @@ def _package_result(rms_values: list, framerate: int, n_frames: int,
    # only renders rms_display (~800 points), and the full list is ~45x larger.
    return {
        'rms_display': rms_display,
-        'sections':    _loud_sections(rms_values, window_dur, duration, margin_db, min_gap),
+        'sections':    _loud_sections(rms_values, window_dur, duration, margin_db, min_gap, min_duration),
        'duration':    round(duration, 2),
        'window':      round(window_dur, 4),
    }
@@ -353,7 +363,8 @@ def _package_result(rms_values: list, framerate: int, n_frames: int,

 def analyze_wav(path: Path, window_samples: int = WINDOW_SAMPLES,
                margin_db: float = MARGIN_DB,
-                min_gap: float = MIN_GAP_SECONDS) -> dict:
+                min_gap: float = MIN_GAP_SECONDS,
+                min_duration: float = MIN_DURATION_SECONDS) -> dict:
    try:
        with wave.open(str(path), 'rb') as wf:
            channels  = wf.getnchannels()
@@ -365,12 +376,13 @@ def analyze_wav(path: Path, window_samples: int = WINDOW_SAMPLES,
    except Exception as e:
        return {'error': str(e)}

-    return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap)
+    return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap, min_duration)


 def analyze_flac(path: Path, window_samples: int = WINDOW_SAMPLES,
                 margin_db: float = MARGIN_DB,
-                 min_gap: float = MIN_GAP_SECONDS) -> dict:
+                 min_gap: float = MIN_GAP_SECONDS,
+                 min_duration: float = MIN_DURATION_SECONDS) -> dict:
    """Analyse a FLAC file for loudness. Requires numpy and soundfile."""
    if not NUMPY_AVAILABLE or not SOUNDFILE_AVAILABLE:
        return {'error': 'FLAC analysis requires: pip install numpy soundfile'}
@@ -392,7 +404,7 @@ def analyze_flac(path: Path, window_samples: int = WINDOW_SAMPLES,
    except Exception as e:
        return {'error': str(e)}

-    return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap)
+    return _package_result(rms_values, framerate, n_frames, window_samples, margin_db, min_gap, min_duration)


 # ---------------------------------------------------------------------------
@@ -405,19 +417,21 @@ def _analysis_cache_path(analyses_base: Path, recordings_base: Path, audio_path:


 def _cached_analysis_params(cache_path: Path):
-    """Read just margin/min_gap from a cache file without parsing the whole
-    JSON (the embedded result can be hundreds of KB). Relies on the writer in
-    _api_analyze putting these two keys first. Caches written by the old
-    fixed-threshold detector have no margin key and simply never match."""
+    """Read just margin/min_gap/min_duration from a cache file without parsing
+    the whole JSON (the embedded result can be hundreds of KB). Relies on the
+    writer in _api_analyze putting these three keys first. Caches written by
+    older detector versions lack one of the keys and simply never match."""
    try:
        with open(cache_path, 'r', encoding='utf-8') as fh:
            head = fh.read(256)
    except OSError:
        return None
-    m = re.search(r'"margin":\s*([0-9.eE+-]+),\s*"min_gap":\s*([0-9.eE+-]+)', head)
+    m = re.search(r'"margin":\s*([0-9.eE+-]+),\s*"min_gap":\s*([0-9.eE+-]+),'
+                  r'\s*"min_duration":\s*([0-9.eE+-]+)', head)
    if not m:
        return None
-    return {'margin': float(m.group(1)), 'min_gap': float(m.group(2))}
+    return {'margin': float(m.group(1)), 'min_gap': float(m.group(2)),
+            'min_duration': float(m.group(3))}


 def prune_orphan_analyses(analyses_base: Path, recordings_base: Path):
@@ -518,6 +532,7 @@ class _Handler(BaseHTTPRequestHandler):
    analyses_dir:   str = 'recordings/analyses'
    margin_db: float    = MARGIN_DB
    min_gap: float      = MIN_GAP_SECONDS
+    min_duration: float = MIN_DURATION_SECONDS

    def do_DELETE(self):
        parsed = urlparse(self.path)
@@ -593,6 +608,12 @@ class _Handler(BaseHTTPRequestHandler):
        except (ValueError, TypeError):
            min_gap = self.min_gap

+        try:
+            min_duration = float(qs.get('min_duration', [self.min_duration])[0])
+            min_duration = max(0.0, min(60.0, min_duration))
+        except (ValueError, TypeError):
+            min_duration = self.min_duration
+
        if self._is_active(filename):
            self._json_err(409, 'File is currently being recorded — analysis unavailable until recording stops')
            return
@@ -602,7 +623,8 @@ class _Handler(BaseHTTPRequestHandler):
        cache_path = _analysis_cache_path(analyses_base, recordings_base, path)
        try:
            cached = json.loads(cache_path.read_text('utf-8'))
-            if cached.get('margin') == margin and cached.get('min_gap') == min_gap:
+            if (cached.get('margin') == margin and cached.get('min_gap') == min_gap
+                    and cached.get('min_duration') == min_duration):
                payload = dict(cached['result'])
                payload.pop('rms', None)  # caches written before the full-RMS field was dropped
                payload['cached'] = True
@@ -613,12 +635,12 @@ class _Handler(BaseHTTPRequestHandler):

        ext = path.suffix.lower()
        if ext == '.wav':
-            result = analyze_wav(path, margin_db=margin, min_gap=min_gap)
+            result = analyze_wav(path, margin_db=margin, min_gap=min_gap, min_duration=min_duration)
        elif ext == '.flac':
            if not (NUMPY_AVAILABLE and SOUNDFILE_AVAILABLE):
                self._json_err(400, 'FLAC analysis requires: pip install numpy soundfile')
                return
-            result = analyze_flac(path, margin_db=margin, min_gap=min_gap)
+            result = analyze_flac(path, margin_db=margin, min_gap=min_gap, min_duration=min_duration)
        else:
            self._json_err(400, f'Loudness analysis is not available for {ext} files')
            return
@@ -626,9 +648,10 @@ class _Handler(BaseHTTPRequestHandler):
        try:
            cache_path.parent.mkdir(parents=True, exist_ok=True)
            tmp = cache_path.with_suffix('.tmp')
-            # margin and min_gap MUST stay first: _cached_analysis_params reads
-            # only the first 256 bytes of this file
-            tmp.write_text(json.dumps({'margin': margin, 'min_gap': min_gap, 'result': result}), 'utf-8')
+            # margin, min_gap and min_duration MUST stay first:
+            # _cached_analysis_params reads only the first 256 bytes of this file
+            tmp.write_text(json.dumps({'margin': margin, 'min_gap': min_gap,
+                                       'min_duration': min_duration, 'result': result}), 'utf-8')
            os.replace(tmp, cache_path)
        except Exception as e:
            print(f'Warning: could not write analysis cache {cache_path}: {e}', flush=True)
@@ -745,7 +768,8 @@ class _Handler(BaseHTTPRequestHandler):
        self._send(200, data.encode(), 'application/json')

    def _api_config(self):
-        data = json.dumps({'margin': self.margin_db, 'min_gap': self.min_gap})
+        data = json.dumps({'margin': self.margin_db, 'min_gap': self.min_gap,
+                           'min_duration': self.min_duration})
        self._send(200, data.encode(), 'application/json')

    def _api_delete(self, filename: str):
@@ -1041,6 +1065,9 @@ def main():
                             f'to count as loud (default: {MARGIN_DB})')
    parser.add_argument('--min-gap',     type=float, default=MIN_GAP_SECONDS,
                        help=f'Seconds gap for merging loud sections (default: {MIN_GAP_SECONDS})')
+    parser.add_argument('--min-duration', type=float, default=MIN_DURATION_SECONDS,
+                        help='Discard loud sections shorter than this many seconds '
+                             f'(default: {MIN_DURATION_SECONDS})')
    parser.add_argument('--analyses-dir', default=None,
                        help='Directory for analysis cache files (default: <recordings-dir>/analyses)')
    args = parser.parse_args()
@@ -1059,6 +1086,7 @@ def main():
        analyses_dir   = str(_analyses_dir)
        margin_db      = args.margin
        min_gap        = args.min_gap
+        min_duration   = args.min_duration

    server = _Server((args.host, args.port), Handler)

@@ -155,6 +155,10 @@ body.clip-open{padding-bottom:70px}
  <input type="number" id="min-gap-input" min="0" max="300" step="0.5" value="2"
    aria-describedby="min-gap-hint">
  <span id="min-gap-hint" class="controls-hint">seconds — merge loud sections closer than this</span>
+  <label for="min-duration-input" style="margin-left:16px">Min duration:</label>
+  <input type="number" id="min-duration-input" min="0" max="60" step="0.1" value="0.5"
+    aria-describedby="min-duration-hint">
+  <span id="min-duration-hint" class="controls-hint">seconds — ignore loud sections shorter than this</span>
 </div>
 <div class="filter-bar" role="search" aria-label="Filter recordings">
  <label for="filter-name">Search:</label>
@@ -403,16 +407,17 @@ document.getElementById('clip-context').addEventListener('click', () => {
  seekToSection(c.fileIdx, c.filename, c.start, c.end, null);
 });

-// filename|margin|gap -> analysis result, so re-renders (filtering,
+// filename|margin|gap|minDur -> analysis result, so re-renders (filtering,
 // refresh) never refetch what this session already has
 const analysisCache = new Map();

-async function fetchAnalysis(filename, margin, minGap, force = false) {
-  const key = `${filename}|${margin}|${minGap}`;
+async function fetchAnalysis(filename, margin, minGap, minDur, force = false) {
+  const key = `${filename}|${margin}|${minGap}|${minDur}`;
  if (!force && analysisCache.has(key)) return analysisCache.get(key);
  const r = await fetch('/api/analyze?file='+encodeURIComponent(filename)
    +'&margin='+encodeURIComponent(margin)
-    +'&min_gap='+encodeURIComponent(minGap));
+    +'&min_gap='+encodeURIComponent(minGap)
+    +'&min_duration='+encodeURIComponent(minDur));
  const d = await r.json();
  if (!d.error) analysisCache.set(key, d);
  return d;
@@ -424,13 +429,14 @@ async function analyse(idx, filename, cell, btn, force = false) {
  cell.innerHTML = '<div class="spin" aria-live="polite" aria-busy="true">Analysing…</div>';
  const margin = document.getElementById('margin-input').value || '12';
  const minGap = document.getElementById('min-gap-input').value || '2';
+  const minDur = document.getElementById('min-duration-input').value || '0.5';
  const restoreBtn = () => {
    btn.textContent = 'Analyse'; btn.disabled = false;
    btn.onclick = () => analyse(idx, filename, cell, btn);
    if (!cell.contains(btn)) cell.appendChild(btn);
  };
  try {
-    const d = await fetchAnalysis(filename, margin, minGap, force);
+    const d = await fetchAnalysis(filename, margin, minGap, minDur, force);
    if (d.error) {
      cell.innerHTML = `<div class="spin" role="alert">Error: ${esc(d.error)}</div>`;
      restoreBtn(); return;
@@ -439,7 +445,7 @@ async function analyse(idx, filename, cell, btn, force = false) {
    box.appendChild(drawWave(d.rms_display||[], d.sections||[], d.duration||0, filename));

    const meta = document.createElement('div'); meta.className='analysis-meta';
-    meta.textContent = `margin: ${margin} dB · gap: ${minGap}s${d.cached ? ' · cached' : ''}`;
+    meta.textContent = `margin: ${margin} dB · gap: ${minGap}s · min: ${minDur}s${d.cached ? ' · cached' : ''}`;
    box.appendChild(meta);

    const chips = document.createElement('div');
@@ -576,7 +582,8 @@ async function updateStorage() {
 function cachedParamsMatch(ca) {
  return ca != null
    && Number(ca.margin)  === parseFloat(document.getElementById('margin-input').value)
-    && Number(ca.min_gap) === parseFloat(document.getElementById('min-gap-input').value);
+    && Number(ca.min_gap) === parseFloat(document.getElementById('min-gap-input').value)
+    && Number(ca.min_duration) === parseFloat(document.getElementById('min-duration-input').value);
 }

 // Run the deferred analyses of a freshly expanded day
@@ -852,6 +859,7 @@ async function dayHighlights(dayId, analyzableFiles) {

  const margin = document.getElementById('margin-input').value || '12';
  const minGap = document.getElementById('min-gap-input').value || '2';
+  const minDur = document.getElementById('min-duration-input').value || '0.5';

  const results = [];
  let nCached = 0, nLive = 0;
@@ -860,7 +868,7 @@ async function dayHighlights(dayId, analyzableFiles) {
    progFile.textContent  = `${i + 1} / ${n} — ${f.name}`;
    progFill.style.width  = `${(i / n) * 100}%`;
    try {
-      const d = await fetchAnalysis(f.name, margin, minGap);
+      const d = await fetchAnalysis(f.name, margin, minGap, minDur);
      if (!d.error) { results.push({ f, data: d }); d.cached ? nCached++ : nLive++; }
    } catch(e) {}
  }
@@ -1099,6 +1107,8 @@ fetch('/api/config').then(r => r.json()).then(cfg => {
    document.getElementById('margin-input').value = cfg.margin;
  if (cfg.min_gap != null)
    document.getElementById('min-gap-input').value = cfg.min_gap;
+  if (cfg.min_duration != null)
+    document.getElementById('min-duration-input').value = cfg.min_duration;
 }).catch(() => {}).finally(() => load().then(() => setInterval(pollStatus, 5000)));
 </script>
 </body>
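
The reason old two-key caches stop matching can be checked with a small standalone sketch. The regular expression is the one added to `_cached_analysis_params()` in the diff above; the helper name `head_params` and the sample payloads are assumptions made for illustration, and the sketch works on strings rather than cache files.

```python
import json
import re

# Same pattern as in the diff: the three keys must appear first, in order
HEAD_RE = re.compile(r'"margin":\s*([0-9.eE+-]+),\s*"min_gap":\s*([0-9.eE+-]+),'
                     r'\s*"min_duration":\s*([0-9.eE+-]+)')

def head_params(text, head_chars=256):
    """Hypothetical helper: read the analysis params from the head of a
    cache JSON string without parsing the embedded result."""
    m = HEAD_RE.search(text[:head_chars])
    if not m:
        return None
    return {'margin': float(m.group(1)), 'min_gap': float(m.group(2)),
            'min_duration': float(m.group(3))}

# A cache written after this commit: three keys first, then the result
new_cache = json.dumps({'margin': 12.0, 'min_gap': 2.0, 'min_duration': 0.5,
                        'result': {'sections': []}})
# A cache written before this commit: no min_duration key
old_cache = json.dumps({'margin': 12.0, 'min_gap': 2.0,
                        'result': {'sections': []}})
```

Here `head_params(new_cache)` returns all three values, while `head_params(old_cache)` returns `None`, which is why such files are treated as a miss and overwritten on the next analyse.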