# Local TTS Setup Reference

## Piper Wrapper Script (`~/bin/tts_briefing.sh`)

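This wrapper is the recommended way to call Piper from cron. It resolves the model and config paths itself (override the directory with `PIPER_MODEL_DIR`), takes the text as an argument or on stdin, and prints the path of the generated WAV file.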

```bash
#!/usr/bin/env bash
set -euo pipefail

MODEL_DIR="${PIPER_MODEL_DIR:-$HOME/.cache/huggingface/hub/models--csukuangfj--vits-piper-zh_CN-huayan-medium/snapshots/8baec97632a57257b09073b7986f4d319efdc29a}"
MODEL="$MODEL_DIR/zh_CN-huayan-medium.onnx"
CONFIG="$MODEL_DIR/zh_CN-huayan-medium.onnx.json"
CACHE_DIR="${AUDIO_CACHE_DIR:-$HOME/.hermes/audio_cache}"

OUT_FILE=""
TEXT=""

while [[ $# -gt 0 ]]; do
    case "$1" in
        --out)
            # ${2:?...} aborts with a clear message when the path is missing.
            OUT_FILE="${2:?--out requires a file path}"
            shift 2
            ;;
        *)
            TEXT="$1"
            shift
            ;;
    esac
done

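# No text argument: read stdin, but only when it is piped (avoids hanging on a terminal).
[[ -z "$TEXT" && ! -t 0 ]] && TEXT="$(cat)"
[[ -z "$TEXT" ]] && { echo "Usage: tts_briefing.sh \"text\" [--out output.wav]" >&2; exit 1; }

mkdir -p "$CACHE_DIR"
# Default output: a timestamped WAV in the cache directory.
[[ -z "$OUT_FILE" ]] && OUT_FILE="$CACHE_DIR/tts_$(date +%Y%m%d_%H%M%S).wav"

# printf avoids echo's option and escape handling on arbitrary text.
printf '%s\n' "$TEXT" | piper -m "$MODEL" -c "$CONFIG" -f "$OUT_FILE" --sentence-silence 0.3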
echo "$OUT_FILE"
```

Usage:
```bash
echo "播报文本" | ~/bin/tts_briefing.sh --out /path/to/output.wav
```

## Pitfalls

### Cron has no access to shell environment variables
Hermes cron runs in a **fresh environment**: it does NOT inherit `MINIMAX_API_KEY` or other env vars set in `.env` or the shell profile. This is why MiniMax TTS fails in cron with "MINIMAX_API_KEY not configured" even when the key is set in the running session.

**Solution:** Always use local Piper (`~/bin/tts_briefing.sh`) for cron TTS. It needs no API key and has no required env vars; `PIPER_MODEL_DIR` and `AUDIO_CACHE_DIR` are optional overrides.

### Numbers read as digits by TTS
MiniMax TTS (and most Chinese TTS engines) reads strings such as `18°C`, `2026年5月18日`, and `85%` digit by digit. **ALL numbers must be pre-converted to Chinese characters:**

| Original | For TTS |
|----------|---------|
| 18°C | 十八度 |
| 27°C | 二十七度 |
| 2026年5月18日 | 二零二六年五月十八日 |
| 85% | 百分之八十五 |
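
The conversion in the table can be automated before the text reaches the TTS engine. The following is a minimal sketch, not part of the existing setup: `normalize_for_tts` and its helpers are hypothetical names, and the rules (years read digit by digit, `°C` spoken as 度, a leading minus on a temperature spoken as 零下, plain numbers below 10^8 read as quantities) are assumptions inferred from the rows above. Ranges such as `18-27°C` and other formats are not handled specially.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ("", "十", "百", "千")
NUM = r"\d+(?:\.\d+)?"  # integer with an optional decimal part


def read_digits(digits: str) -> str:
    """Read a digit string one character at a time (years, long IDs)."""
    return "".join(DIGITS[int(ch)] for ch in digits)


def _group(n: int) -> str:
    """Read one four-digit group, 1 <= n <= 9999, e.g. 1005 -> 一千零五."""
    out, gap = "", False
    for power in (3, 2, 1, 0):
        d = n // 10**power % 10
        if d == 0:
            gap = bool(out)  # remember interior zeros, ignore leading ones
        else:
            out += ("零" if gap else "") + DIGITS[d] + UNITS[power]
            gap = False
    return out


def int_to_chinese(n: int) -> str:
    """Read 0 <= n < 10**8 as a quantity, e.g. 18 -> 十八."""
    if n == 0:
        return "零"
    high, low = divmod(n, 10_000)
    out = (_group(high) + "万") if high else ""
    if high and 0 < low < 1000:
        out += "零"  # e.g. 10005 -> 一万零五
    out += _group(low) if low else ""
    # 10..19 are spoken 十, 十一, ... rather than 一十, 一十一, ...
    return out[1:] if out.startswith("一十") else out


def number_to_chinese(token: str) -> str:
    """Convert a token such as '85' or '3.5'; very long integers are read digit by digit."""
    whole, _, frac = token.partition(".")
    spoken = read_digits(whole) if len(whole) > 8 else int_to_chinese(int(whole))
    return spoken + ("点" + read_digits(frac) if frac else "")


def normalize_for_tts(text: str) -> str:
    """Replace years, temperatures, percentages, and remaining numbers with Chinese characters."""
    # Years are read digit by digit: 2026年 -> 二零二六年.
    text = re.sub(r"(?<!\d)(\d{4})年", lambda m: read_digits(m.group(1)) + "年", text)
    # Temperatures: 18°C -> 十八度; a leading minus becomes 零下.
    text = re.sub(
        rf"(?<!\d)(-?)({NUM})\s*(?:°C|℃)",
        lambda m: ("零下" if m.group(1) else "") + number_to_chinese(m.group(2)) + "度",
        text,
    )
    # Percentages: 85% -> 百分之八十五.
    text = re.sub(rf"({NUM})%", lambda m: "百分之" + number_to_chinese(m.group(1)), text)
    # Everything else (months, days, counts) is read as a quantity.
    return re.sub(NUM, lambda m: number_to_chinese(m.group(0)), text)


if __name__ == "__main__":
    for sample in ("18°C", "27°C", "2026年5月18日", "85%"):
        print(sample, "->", normalize_for_tts(sample))
```

Running the normalizer over the brief before piping it into `~/bin/tts_briefing.sh` keeps the conversion in one place instead of relying on hand-edited text.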

## Piper (recommended — lightweight, offline, good Chinese)

### Model
- **Repo:** `csukuangfj/vits-piper-zh_CN-huayan-medium`
- **Files needed:**
  - `zh_CN-huayan-medium.onnx` (the model)
  - `zh_CN-huayan-medium.onnx.json` (config)

### Download via Python API
```bash
# Use python3.10: huggingface_hub is installed under the python3.10 site-packages
# (the hermes-agent session uses python3.11, but pip packages go to 3.10).
python3.10 -c "
from huggingface_hub import hf_hub_download
model = hf_hub_download(repo_id='csukuangfj/vits-piper-zh_CN-huayan-medium', filename='zh_CN-huayan-medium.onnx')
config = hf_hub_download(repo_id='csukuangfj/vits-piper-zh_CN-huayan-medium', filename='zh_CN-huayan-medium.onnx.json')
print(model)
"
# Prints a path of the form: ~/.cache/huggingface/hub/models--csukuangfj--vits-piper-zh_CN-huayan-medium/snapshots/.../zh_CN-huayan-medium.onnx
```
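
Because the downloaded files live under a snapshot directory named after a commit hash, a hard-coded path breaks when the snapshot changes. The sketch below shows one way to locate the files with the standard library only. It assumes the default cache layout shown in the output comment above; `find_model` is a hypothetical helper, not part of `huggingface_hub`.

```python
from pathlib import Path

REPO_DIR = "models--csukuangfj--vits-piper-zh_CN-huayan-medium"
MODEL_NAME = "zh_CN-huayan-medium.onnx"


def find_model(hub_cache: Path) -> tuple[Path, Path]:
    """Return (model, config) from the newest snapshot that contains both files."""
    snapshots = hub_cache / REPO_DIR / "snapshots"
    candidates = []
    if snapshots.is_dir():
        for model in snapshots.glob(f"*/{MODEL_NAME}"):
            # The config sits next to the model as <model>.onnx.json.
            if model.with_name(model.name + ".json").exists():
                candidates.append(model)
    if not candidates:
        raise FileNotFoundError(f"no complete snapshot found under {snapshots}")
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return newest, newest.with_name(newest.name + ".json")


# Example call against the default cache location (assumed, see lead-in):
# model, config = find_model(Path.home() / ".cache" / "huggingface" / "hub")
```

The directory containing the returned model path could then be exported as `PIPER_MODEL_DIR` for the wrapper script.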

### Test generation
```bash
MODEL=$(python3.10 -c "from huggingface_hub import hf_hub_download; print(hf_hub_download(repo_id='csukuangfj/vits-piper-zh_CN-huayan-medium', filename='zh_CN-huayan-medium.onnx'))")
# The config is named <model>.onnx.json and must already be downloaded (see the previous step).
CONFIG="$MODEL.json"
echo "测试文本" | piper -m "$MODEL" -c "$CONFIG" -f /tmp/test.wav
```

### Cron integration
```bash
# piper reads the text from stdin; -f sets the output WAV (same flags as the wrapper script).
piper -m "$MODEL" -c "$CONFIG" -f /tmp/brief.wav < /tmp/brief_raw.txt
ffmpeg -i /tmp/brief.wav -acodec libopus /tmp/brief.ogg -y 2>/dev/null
# Then deliver MEDIA:/tmp/brief.ogg
```
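
The shell version discards ffmpeg's error output with `2>/dev/null`, so a failed conversion is easy to miss in cron. As a hedged alternative, the same two steps can be driven from Python with errors surfaced as exceptions. This sketch assumes the wrapper is installed at `~/bin/tts_briefing.sh` and that `ffmpeg` is on the `PATH`; `render_brief` and `ffmpeg_command` are hypothetical names.

```python
import subprocess
from pathlib import Path

WRAPPER = Path.home() / "bin" / "tts_briefing.sh"  # wrapper script from this document


def ffmpeg_command(wav: Path, ogg: Path) -> list[str]:
    """Build the WAV -> Opus/OGG conversion command."""
    return ["ffmpeg", "-y", "-i", str(wav), "-acodec", "libopus", str(ogg)]


def render_brief(text: str, wav: Path, ogg: Path) -> Path:
    """Synthesize text with the wrapper, convert it to OGG, and return the OGG path."""
    # check=True raises CalledProcessError instead of failing quietly;
    # encoding is set explicitly so Chinese text survives a minimal cron locale.
    subprocess.run(
        [str(WRAPPER), "--out", str(wav)],
        input=text,
        encoding="utf-8",
        check=True,
    )
    # capture_output keeps ffmpeg's log out of the cron output but attaches it
    # to the raised exception when the conversion fails.
    subprocess.run(ffmpeg_command(wav, ogg), check=True, capture_output=True)
    return ogg


# Example: render_brief(Path("/tmp/brief_raw.txt").read_text(encoding="utf-8"),
#                       Path("/tmp/brief.wav"), Path("/tmp/brief.ogg"))
```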

## Coqui TTS (alternative — higher quality, much heavier)

⚠️ **Not viable on this machine.** Coqui TTS requires PyTorch (CPU ~2GB download, very slow). Do not attempt.

## MiniMax TTS API

- **Endpoint:** `api.minimaxi.com/v1/t2a_v2`
- **Key env var:** `MINIMAX_API_KEY` (NOT `MINIMAX_CN_API_KEY` — that's for chat)
- **Cron environment:** Cron runs in a fresh environment. If `MINIMAX_API_KEY` is not in `.env`, TTS fails with "MINIMAX_API_KEY not configured"
- **Fix:** Set `MINIMAX_API_KEY` in the hermes `.env` file, or use Piper fallback
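
The Piper fallback mentioned in the fix can be made explicit with a small check at the start of the cron job. This is only a sketch of the idea: `choose_tts_backend` is a hypothetical helper, and the backend names `"minimax"` and `"piper"` are labels chosen for illustration.

```python
import os
import sys
from collections.abc import Mapping


def choose_tts_backend(env: Mapping[str, str] = os.environ) -> str:
    """Return "minimax" only when MINIMAX_API_KEY is set and non-empty, else "piper"."""
    # MINIMAX_CN_API_KEY is deliberately ignored: it is the chat key, not the TTS key.
    return "minimax" if env.get("MINIMAX_API_KEY", "").strip() else "piper"


if __name__ == "__main__":
    backend = choose_tts_backend()
    if backend == "piper":
        # Log the fallback so a missing key is visible in the cron output.
        print("MINIMAX_API_KEY not set, falling back to Piper", file=sys.stderr)
    print(backend)
```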
