Deliveries
What's currently in the corpus, what's missing, and what changed in the latest batch. One row per data delivery in the log at the bottom.
Current delivery — 003
provisional · 2026-05-12 #5 · slug: multi-project-multi-voice · supersedes delivery-002.
What's in it

| | |
|---|---|
| Clips | 20 |
| Total duration | ~41.6 min |
| Projects | she_proves (12) + elephant_in_the_room (8) |
| Tiers | A (12 clean) + B (8 room-augmented) |
| TTS backends | Azure (18) + Google Chirp 3 HD (2) |
| Unique speaker personas | 6 (4 in She-Proves, 2 in Elephant) |
| Validation failures | 0 / 20 |
| Pipeline | SynthBanshee 0.1.0 @ 1ea48f3 |

Authoritative records: metadata.yaml · notes.md · qa-report.json.
Known limitations
- All clips are `split: train`. With only 6 unique speaker personas across 20 clips (4 in She-Proves, 2 in Elephant), speaker-disjoint partitioning isn't feasible at this scale.
- One room type for Elephant. All 8 Tier-B clips use `clinic_office`. `welfare_office` and `open_office` are in the pipeline but not exercised yet.
- One device profile for She-Proves. No device augmentation such as `phone_in_pocket` has been applied yet; Tier-A clips are clean, not phone-captured.
- Voice diversity is low. 2 voice families per gender; the QA threshold for "diverse" is ≥3.
- Toy-batch scale. 20 clips is enough to wire up consumer plumbing. Not enough to train a production model.
Open QA flags

| Flag | Detail | What to do about it |
|---|---|---|
| `low_voice_diversity_male` | 2 male voice families across the corpus (threshold ≥3) | Track per-voice eval separately; expect feature overfit to AvriNeural until more voices land |
| `low_voice_diversity_female` | Same, for female voices | Same |
| `vic_f0_high` (per-clip × 2) | `sp_sv_a_0003_00`, `sp_it_a_0003_00`: Google Chirp HD female F0 above Azure baseline | Nothing. Don't exclude the clips. Calibrate F0 features per backend if you compute them. See Audio Format. |
| `quality_flagged_clips: 15` | Mostly `emotion_downgrade` from prosody cap activations at I3+ | Don't reflexively filter these out; they pass validation. See Common mistakes #7. |

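
A minimal sketch of the per-backend F0 calibration suggested for `vic_f0_high`, assuming you already have one mean F0 value per clip and know which backend produced the voice. The function name, the tuple layout, and the sample values are illustrative assumptions, not part of the delivery or of SynthBanshee; z-scoring within each backend is one possible calibration, not a prescribed one.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_f0_per_backend(rows):
    """Z-score F0 within each TTS backend.

    rows: list of (clip_id, backend, f0_hz) tuples (hypothetical layout).
    Returns {clip_id: z_score}.
    """
    # Group F0 values by backend so each backend gets its own baseline.
    by_backend = defaultdict(list)
    for _clip_id, backend, f0_hz in rows:
        by_backend[backend].append(f0_hz)

    # Population mean and standard deviation per backend.
    stats = {b: (mean(v), pstdev(v)) for b, v in by_backend.items()}

    calibrated = {}
    for clip_id, backend, f0_hz in rows:
        mu, sigma = stats[backend]
        # A backend with one clip (or constant F0) has sigma == 0.
        calibrated[clip_id] = 0.0 if sigma == 0 else (f0_hz - mu) / sigma
    return calibrated


# Invented values for illustration only; not measurements from this delivery.
rows = [
    ("clip_a", "azure", 200.0),
    ("clip_b", "azure", 220.0),
    ("clip_c", "google", 250.0),
    ("clip_d", "google", 270.0),
]
calibrated = zscore_f0_per_backend(rows)
```

After calibration, a clip is compared only against other clips from the same backend, so a systematically higher-pitched backend no longer looks like an outlier. With only 2 Google clips in this delivery, any per-backend statistic will be very noisy; treat the result as plumbing validation rather than a usable feature.
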
Distribution

| Typology | Tier A (She-Proves) | Tier B (Elephant) | Total |
|---|---|---|---|
| SV | 3 | 2 | 5 |
| IT | 3 | 2 | 5 |
| NEG | 3 | 2 | 5 |
| NEU | 3 | 2 | 5 |

max_intensity across the 20 clips: I5 = 10 clips · I3 = 4 clips · I2 = 6 clips.
What this delivery exercises
Use these to check your consumer code on the schema features the delivery was designed to cover:
- Full `ClipMetadata` schema, including the `generation_metadata` block and (for Tier B) populated `acoustic_scene`.
- Per-surface casing rules: UPPERCASE `speaker_id`, lowercase paths and clip IDs.
- `has_violence` derivation from events: NEG clips are correctly `false` even at `max_intensity` ≥ 3.
- Multi-project layout under a single `data/he/` root.
- Multi-backend provenance: `generation_metadata.tts_backend` differs per speaker.
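
The casing and provenance items in this list could be wired into a consumer test roughly as follows. This is a sketch only: the function names, the speaker IDs, and the sample path are hypothetical, and it assumes `generation_metadata.tts_backend` is a mapping from speaker to backend name, which this page implies but does not spell out.

```python
def casing_problems(clip_id, speaker_ids, paths):
    """Return a list of casing violations (empty if the clip is clean)."""
    problems = []
    # Clip IDs are expected to be lowercase.
    if clip_id != clip_id.lower():
        problems.append(f"clip ID not lowercase: {clip_id}")
    # speaker_id values are expected to be UPPERCASE.
    for speaker_id in speaker_ids:
        if speaker_id != speaker_id.upper():
            problems.append(f"speaker_id not uppercase: {speaker_id}")
    # Paths are expected to be lowercase.
    for path in paths:
        if path != path.lower():
            problems.append(f"path not lowercase: {path}")
    return problems


def backends_in_clip(generation_metadata):
    """Distinct TTS backends in one clip (assumes a speaker -> backend dict)."""
    return set(generation_metadata["tts_backend"].values())


# Hypothetical values; the real speaker IDs and paths may look different.
clean = casing_problems(
    "sp_sv_a_0003_00",
    ["SPK_A", "SPK_B"],
    ["data/he/she_proves/sp_sv_a_0003_00.wav"],
)
dirty = casing_problems("SP_sv_a_0003_00", ["spk_a"], ["data/he/Clip.wav"])
mixed = backends_in_clip({"tts_backend": {"SPK_A": "azure", "SPK_B": "google"}})
```

Returning a list of problems instead of raising on the first one makes it easier to report every violation for a clip in a single test failure.
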
What changed vs delivery-002
Closed QA findings (vs. delivery-002)

| Finding | Delivery-002 | Delivery-003 |
|---|---|---|
| `agg_no_escalation` | 3 clips | 0 (AGG RMS now escalates with intensity) |
| `warn_no_overlap` | 4 clips | 0 (turn-overlap fires on I4+ clips) |
| `warn_emotion_downgrade` | 4 clips | 0 |
| `generation_metadata` absent | 0 of 8 clips had it | 20 of 20 carry the full block |
| `dirty_file_path` null | 7 of 8 clips | 20 of 20 retain dirty files |
| `normalized_dbfs` hardcoded -1.0 | all 8 clips | Records the measured peak |

Closed by the 2026-05-12 schema-shift regen
Three SynthBanshee PRs landed alongside the regen (#110 / #111 / #112):

| Finding | Resolution |
|---|---|
| `single_backend` false positive | `qa.py` derives backend diversity from `generation_metadata.tts_backend.values()`; reports `clips_by_tts_backend: {azure: 18, google: 2}` |
| Absolute paths in clip JSON | `dirty_file_path` and `transcript_path` are now repo-relative POSIX |
| Leaked pytest `tmp_path` on `sp_neu_a_0001_00` | Regen overwrote with canonical path; autouse env-var strip fixture prevents future leaks |

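
A consumer that wants to guard against a regression of the absolute-path finding could check each path field with something like the sketch below. The helper name and the example paths are hypothetical. The rules encoded (no leading slash, no drive letter, no backslashes, no `..` segments) are one reasonable reading of "repo-relative POSIX", not the check that `qa.py` actually runs.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_repo_relative_posix(path: str) -> bool:
    """True if `path` looks like a repo-relative POSIX path."""
    if not path or "\\" in path:
        return False  # empty, or uses Windows separators
    if PurePosixPath(path).is_absolute():
        return False  # starts with "/"
    if PureWindowsPath(path).drive:
        return False  # has a drive letter such as "C:"
    if ".." in PurePosixPath(path).parts:
        return False  # could escape the repository root
    return True


# Hypothetical example paths, not taken from the delivery.
good = is_repo_relative_posix("data/he/she_proves/clip.wav")
leaked_tmp = is_repo_relative_posix("/tmp/pytest-of-user/clip.wav")
windows_abs = is_repo_relative_posix("C:\\data\\clip.wav")
```

Applying the same check to both `dirty_file_path` and `transcript_path` would also catch a leaked temporary directory like the one reported for `sp_neu_a_0001_00`, since such paths are absolute.
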
Delivery log

| # | Date | Slug | Project | Tier | Clips | Duration | Status |
|---|---|---|---|---|---|---|---|
| 003 | 2026-05-12 | multi-project-multi-voice | she_proves + elephant | A + B | 20 | ~42m | provisional |
| 002 | 2026-04-15 | m2a-wettest | she_proves | A | 8 | ~17m | superseded |
| 001 | 2026-04-15 | debug-run-1 | she_proves | A | 1 | 2m 36s | superseded |

Status definitions

| Status | Meaning |
|---|---|
| provisional | Preview batch; consumer-integration only, not approved for training |
| approved | QA passed; cleared for training use |
| superseded | Replaced by a later delivery covering the same scenes at higher quality |

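
Consumer code can encode these definitions as a small guard so that a provisional or superseded delivery is never picked up by a training job by accident. The sketch below is an assumption about how a consumer might do this; the function and constant names are hypothetical and not part of the delivery tooling.

```python
# Status values taken from the table above.
KNOWN_STATUSES = {"provisional", "approved", "superseded"}
TRAINING_STATUSES = {"approved"}


def cleared_for_training(status: str) -> bool:
    """True only for deliveries whose status permits training use."""
    if status not in KNOWN_STATUSES:
        # Fail loudly on typos or statuses added after this was written.
        raise ValueError(f"unknown delivery status: {status!r}")
    return status in TRAINING_STATUSES


current_ok = cleared_for_training("provisional")  # status of delivery 003
```

Rejecting unknown statuses outright, instead of treating them as "not approved", surfaces schema drift early if a new status is introduced later.
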