# ISR — Audio Recorder

> AI-generated code. Run at your own risk. MIT licence.

Records from multiple simultaneous sources — Icecast/HTTP streams and ALSA soundcards — with time-based file splitting.

## Features

- Multiple sources recorded in parallel (each in its own thread)
- **Stream recording** — HTTP/Icecast, auto-detects MP3 / OGG / AAC / FLAC / Opus from `Content-Type`
- **Soundcard recording**: ALSA (`arecord`), works on Linux systems with ALSA, including Raspberry Pi
- Time-aligned file splits (e.g. every hour, on the hour)
- OGG / Opus / FLAC header injection so every split file is independently playable
- Auto-reconnect on stream drops or device errors
- WAV and FLAC output for soundcard sources
- Web UI to browse, download, and analyse recordings

---

## Quick start — bare-metal

```bash
pip install requests              # stream recording
pip install numpy soundfile       # FLAC output + web waveform analysis (optional)

cp config.example.ini config.ini
# edit config.ini to add your sources
python isr.py                     # start recorder  (Ctrl+C to stop)
python web.py                     # start web UI at http://localhost:8080
```

## Quick start — Docker

```bash
cp config.example.ini config.ini
# In config.ini set: output_directory = /recordings
# Optionally set:    log_file = /recordings/recorder.log

docker compose up -d
# recorder starts immediately; web UI at http://<host>:8080
docker compose logs -f            # tail logs from both services
docker compose down               # graceful stop
```

Recordings land in `./recordings/` on the host (bind-mounted into both containers).

If you only record streams (no soundcard), comment out the `devices` block in `docker-compose.yml`.

---

## Configuration

`config.ini` uses standard INI format. `[general]` provides defaults; every other section is a recording source. Source sections inherit all general settings and can override any of them.
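The inheritance rule can be sketched with the standard library's `configparser`. This is a minimal illustration, not ISR's actual loader: the function name `load_sources` and the sample text are hypothetical. It assumes interpolation is disabled, since the `%` codes in `filename_pattern` would otherwise be rejected by `configparser`'s default interpolation.

```python
import configparser

# Sample config text for illustration only.
SAMPLE = """
[general]
output_directory = recordings
split_minutes = 60
filename_pattern = %Y%m%d_%H%M%S

[radio1]
type = stream
url = http://radio.example.com:8000/stream1
split_minutes = 30
"""


def load_sources(text):
    """Return {source_name: settings}, with [general] values as defaults."""
    # interpolation=None keeps strftime codes such as %Y literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    general = dict(parser["general"]) if parser.has_section("general") else {}
    sources = {}
    for name in parser.sections():
        if name == "general":
            continue
        merged = dict(general)       # start from the shared defaults
        merged.update(parser[name])  # source-level keys override them
        sources[name] = merged
    return sources


sources = load_sources(SAMPLE)
print(sources["radio1"]["split_minutes"])     # overridden in [radio1]
print(sources["radio1"]["output_directory"])  # inherited from [general]
```

All values come back as strings, so numeric settings such as `split_minutes` still need converting with `int()` before use.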

### `[general]`

| Key | Default | Description |
|-----|---------|-------------|
| `output_directory` | `recordings` | Output path. Use `/recordings` in Docker. |
| `split_minutes` | `60` | Split into a new file every N minutes, aligned to clock boundaries (e.g. 60 → files start at :00, 30 → at :00 and :30). |
| `filename_pattern` | `%Y%m%d_%H%M%S` | strftime pattern; file extension is appended automatically. |
| `max_retries` | `10` | Give up after this many consecutive failures per source. |
| `retry_delay_seconds` | `5` | Wait between retries. |
| `log_level` | `INFO` | `DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL` |
| `log_file` | `recorder.log` | Log file path. Use `/recordings/recorder.log` in Docker. |
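
To make the clock alignment of `split_minutes` concrete, here is a small sketch of how the next split boundary could be computed. The function `next_split` is hypothetical, and it assumes boundaries are counted from local midnight; ISR's own calculation may differ.

```python
from datetime import datetime, timedelta


def next_split(now, split_minutes):
    """Return the first clock-aligned boundary strictly after `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    period = timedelta(minutes=split_minutes)
    periods_done = (now - midnight) // period  # whole periods since midnight
    return midnight + (periods_done + 1) * period


now = datetime(2024, 12, 25, 14, 17, 5)
print(next_split(now, 60))  # 2024-12-25 15:00:00
print(next_split(now, 30))  # 2024-12-25 14:30:00
```

Under this assumption, values that divide evenly into 24 hours (15, 30, 60, 120, ...) give the same boundaries every day; other values would produce a shorter final file before midnight.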

### `type = stream`

```ini
[my_stream]
type     = stream
url      = http://icecast.example.com:8000/live
username =          # leave blank for public streams
password =
format   = auto     # auto | mp3 | ogg | aac | flac | opus
```

`format = auto` detects from the `Content-Type` response header. For OGG/Opus/FLAC the first ~16 KB of each connection is buffered to extract codec headers, which are then prepended to every split file — all files are independently playable.

A new file is always opened on (re)connect so gaps between connections are never silently merged.
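
As an illustration of `format = auto`, the sketch below maps a `Content-Type` header to one of the format names listed above. The mapping table and the function `detect_format` are assumptions made for this example; the table ISR actually uses is not shown in this README.

```python
# Hypothetical mapping from media type to ISR format name.
CONTENT_TYPES = {
    "audio/mpeg": "mp3",
    "audio/ogg": "ogg",
    "application/ogg": "ogg",
    "audio/aac": "aac",
    "audio/aacp": "aac",
    "audio/flac": "flac",
    "audio/opus": "opus",
}


def detect_format(content_type, configured="auto"):
    """Return the format name, or None if the media type is not recognised."""
    if configured != "auto":
        return configured  # an explicit setting always wins
    # Drop parameters such as "; charset=..." and normalise case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    return CONTENT_TYPES.get(media_type)


print(detect_format("audio/mpeg"))                 # mp3
print(detect_format("Audio/OGG; codecs=opus"))     # ogg
print(detect_format("text/html", configured="mp3"))  # mp3
```

Servers do not always send an accurate `Content-Type`, which is one reason to set `format` explicitly when auto-detection picks the wrong extension.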

### `type = soundcard`

```ini
[mic_in]
type        = soundcard
device      = default   # see device selection below
sample_rate = 44100
channels    = 2
format      = wav       # wav | flac
```

**Device selection:**

| Value | Behaviour |
|-------|-----------|
| `default` | System default input |
| `monitor` | First loopback/monitor source (capture system audio) |
| `<partial name>` | Case-insensitive substring match against device name |
| `hw:X,Y` | Exact ALSA hardware ID |

Run `python isr.py --list-devices` (or `arecord -l`) to see available devices and their IDs.
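
The selection rules in the table can be sketched as follows. This is only an illustration: `resolve_device` and the `(alsa_id, name)` device list are hypothetical, and the keywords used to recognise a monitor source (`monitor`, `loopback`) are an assumption.

```python
def resolve_device(value, devices):
    """Resolve a config `device` value to an ALSA device string.

    `devices` is a list of (alsa_id, name) tuples, for example built from
    the output of `arecord -l`. Returns None when nothing matches.
    """
    wanted = value.strip()
    lowered = wanted.lower()
    if lowered == "default" or lowered.startswith("hw:"):
        return wanted  # passed through to ALSA unchanged
    # "monitor" picks the first loopback/monitor-like device;
    # anything else is treated as a case-insensitive substring.
    needles = ("monitor", "loopback") if lowered == "monitor" else (lowered,)
    for alsa_id, name in devices:
        if any(needle in name.lower() for needle in needles):
            return alsa_id
    return None


devices = [
    ("hw:0,0", "HDA Intel PCH"),
    ("hw:1,0", "USB Audio Device"),
    ("hw:2,0", "Loopback PCM"),
]
print(resolve_device("usb", devices))      # hw:1,0
print(resolve_device("monitor", devices))  # hw:2,0
```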

FLAC output requires `pip install soundfile numpy`.

### Multiple sources

Every section except `[general]` is a source — they all record simultaneously:

```ini
[general]
output_directory = recordings
split_minutes    = 60

[radio1]
type             = stream
url              = http://radio.example.com:8000/stream1
filename_pattern = radio1_%Y%m%d_%H%M%S

[system_audio]
type             = soundcard
device           = hw:0,0
filename_pattern = system_%Y%m%d_%H%M%S
```

---

## Filename patterns

strftime codes are substituted at split time. The file extension is added automatically.

| Pattern | Example |
|---------|---------|
| `%Y%m%d_%H%M%S` | `20241225_143000.mp3` |
| `radio_%Y-%m-%d_%H%M` | `radio_2024-12-25_1430.mp3` |
| `%Y/%m/%d/rec_%H%M%S` | `2024/12/25/rec_143000.mp3` *(subdirs created automatically)* |
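
A pattern is expanded roughly as in the sketch below, which also shows how subdirectories in the pattern can be created on demand. The helper `build_path` is hypothetical and only illustrates the behaviour described in the table.

```python
from datetime import datetime
from pathlib import Path


def build_path(output_directory, pattern, extension, when):
    """Expand a strftime pattern into a file path and create its parent dirs."""
    relative = when.strftime(pattern) + "." + extension
    path = Path(output_directory) / relative
    # Patterns containing "/" produce nested directories.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


when = datetime(2024, 12, 25, 14, 30, 0)
print(build_path("recordings", "%Y/%m/%d/rec_%H%M%S", "mp3", when).as_posix())
# recordings/2024/12/25/rec_143000.mp3
```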

---

## Web UI (`web.py`)

```bash
python web.py                        # serves ./recordings on port 8080
python web.py --dir /path/to/audio   # custom recordings directory
python web.py --port 8888            # custom port
python web.py --threshold 0.03       # loudness threshold 0–1 (default 0.05)
```

Shows a table of all recordings sorted newest-first with file size, duration (WAV only), and a waveform analysis button. Analysis computes RMS per 100 ms window and highlights contiguous sections above the loudness threshold.

Waveform analysis is WAV-only; numpy speeds it up significantly (pure-Python fallback available without it).
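
The analysis step can be pictured with the pure-Python sketch below. It is not the code in `web.py`: the function `loud_sections` is hypothetical, and it assumes mono samples already normalised to the range -1 to 1 so that the threshold is on the same 0–1 scale.

```python
import math


def loud_sections(samples, sample_rate, threshold=0.05, window_ms=100):
    """Return [(start_seconds, end_seconds), ...] for runs of loud windows."""
    window = max(1, sample_rate * window_ms // 1000)  # samples per window
    sections = []
    start = None  # sample offset where the current loud run began
    for offset in range(0, len(samples), window):
        chunk = samples[offset:offset + window]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        if rms > threshold:
            if start is None:
                start = offset
        elif start is not None:
            sections.append((start / sample_rate, offset / sample_rate))
            start = None
    if start is not None:  # recording ended while still loud
        sections.append((start / sample_rate, len(samples) / sample_rate))
    return sections


# 0.2 s silence, 0.3 s tone-level signal, 0.1 s silence at 1 kHz sample rate.
samples = [0.0] * 200 + [0.5] * 300 + [0.0] * 100
print(loud_sections(samples, sample_rate=1000))  # [(0.2, 0.5)]
```

With numpy the per-window loop can be replaced by a reshape and a vectorised mean of squares, which is where the speed-up comes from.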

---

## How it works

**Streams:** Connect via HTTP → detect format from `Content-Type` → buffer first ~16 KB to extract OGG/Opus/FLAC codec headers → stream raw bytes to disk → at each split boundary open a new file and prepend the saved headers. No transcoding, no decoding: raw bytes in, raw bytes out.
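
To illustrate the header-extraction step for Ogg-based streams, here is a simplified sketch that walks the buffered bytes page by page and keeps the leading pages whose granule position is 0, which is where Vorbis and Opus place their codec headers. The function `ogg_header_pages` is hypothetical, and the granule-position heuristic is an assumption about one workable approach, not a description of ISR's parser. It does not handle header packets that continue across a page marked with granule position -1.

```python
import struct


def ogg_header_pages(buffer):
    """Return the leading Ogg pages of `buffer` whose granule position is 0."""
    pos = 0
    end_of_headers = 0
    # An Ogg page header is 27 bytes, followed by a segment table.
    while buffer[pos:pos + 4] == b"OggS" and pos + 27 <= len(buffer):
        granule = struct.unpack_from("<q", buffer, pos + 6)[0]
        segment_count = buffer[pos + 26]
        table_end = pos + 27 + segment_count
        if table_end > len(buffer):
            break  # segment table is truncated
        # Payload length is the sum of the segment table entries.
        page_end = table_end + sum(buffer[pos + 27:table_end])
        if page_end > len(buffer) or granule != 0:
            break  # truncated page, or first page carrying audio
        end_of_headers = page_end
        pos = page_end
    return bytes(buffer[:end_of_headers])
```

A caller would run this once on the first buffered bytes of a connection, keep the result, and write it at the start of every new split file before the raw stream bytes.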

**Soundcard:** Spawn `arecord` as a subprocess (raw PCM output) → read 100 ms chunks via a thread → write 16-bit PCM to WAV or FLAC → split at configured boundaries.
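
The soundcard path can be sketched like this. The helper names are hypothetical and the exact `arecord` arguments ISR passes are an assumption; the flags shown (`-D`, `-f`, `-r`, `-c`, `-t raw`, `-q`) are standard `arecord` options. For brevity the sketch reads in a generator rather than a dedicated thread.

```python
import subprocess


def arecord_command(device, sample_rate, channels):
    """Build an arecord command that writes raw 16-bit PCM to stdout."""
    return [
        "arecord", "-q",
        "-D", device,
        "-f", "S16_LE",          # signed 16-bit little-endian samples
        "-r", str(sample_rate),
        "-c", str(channels),
        "-t", "raw",             # no WAV header, just PCM
    ]


def chunk_bytes(sample_rate, channels, chunk_ms=100, bytes_per_sample=2):
    """Number of bytes in one chunk of audio of the given duration."""
    return sample_rate * chunk_ms // 1000 * channels * bytes_per_sample


def read_chunks(device, sample_rate, channels):
    """Yield 100 ms PCM chunks until arecord exits (requires ALSA)."""
    proc = subprocess.Popen(
        arecord_command(device, sample_rate, channels),
        stdout=subprocess.PIPE,
    )
    size = chunk_bytes(sample_rate, channels)
    try:
        while True:
            chunk = proc.stdout.read(size)
            if not chunk:
                break  # arecord exited or the device failed
            yield chunk
    finally:
        proc.terminate()
        proc.wait()


print(chunk_bytes(44100, 2))  # 17640 bytes per 100 ms of 16-bit stereo
```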

Each source runs in its own thread and retries independently, giving up after `max_retries` consecutive failures.
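
The retry behaviour can be sketched as a loop around a single recording attempt. The names `run_with_retries` and `record_once` are hypothetical, and treating "an attempt that recorded some data" as the event that resets the failure count is an assumption about what "consecutive" means here.

```python
import time


def run_with_retries(record_once, max_retries=10, retry_delay_seconds=5,
                     sleep=time.sleep):
    """Call record_once() repeatedly until it fails max_retries times in a row.

    record_once() should return True if it recorded any data before the
    connection or device ended, and return False or raise otherwise.
    """
    failures = 0
    while failures < max_retries:
        try:
            made_progress = bool(record_once())
        except Exception:
            made_progress = False
        if made_progress:
            failures = 0       # a working session resets the count
        else:
            failures += 1
        if failures < max_retries:
            sleep(retry_delay_seconds)  # wait before reconnecting
    return failures
```

In a threaded recorder, each source would run this loop in its own `threading.Thread`, so one failing source does not stop the others.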

---

## Docker notes

**ALSA device access:** The `recorder` container needs `/dev/snd` mapped. The container runs as root, so no group configuration is needed — the device mapping alone is sufficient.

If ALSA still fails to find the device inside the container, verify the device exists on the host:
```bash
arecord -l           # list capture hardware
ls -la /dev/snd      # check device nodes
```

**Stream-only deployments:** If you don't use soundcard recording, remove the `devices` block from `docker-compose.yml` — the image works fine without it.

**Log file in Docker:** Set `log_file = /recordings/recorder.log` in `config.ini` so logs survive container restarts. Alternatively, use `docker compose logs` (the recorder always logs to stdout as well).

**File retention:** ISR never deletes recordings. Add a cron job on the host if needed:
```bash
# Delete recordings older than 30 days
find recordings/ -type f -mtime +30 -delete
```

---

## Licence

MIT