Three prompt-search methods: Bin | Clap | Dasheng. Each combines batch_outputs_* and generated_noises_* from the dataset.
How to read the IDs
- Numeric IDs (e.g. 00_000357) come from batch_outputs (SONYC/UrbanSound8k).
- Long prompt-like IDs (e.g. a_bulldozer_moving_gravel_...) come from generated_noises.
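The ID rule above can be sketched as a small helper. This is only an illustration: the function name id_source and the regular expression are hypothetical, and it assumes that numeric IDs consist solely of digit groups joined by underscores (as in 00_000357) while every other ID is prompt-like.

```python
import re

# Assumption: a numeric ID is made only of digit groups joined by underscores,
# matching the shape of the example 00_000357.
_NUMERIC_ID = re.compile(r"\d+(?:_\d+)*")


def id_source(sample_id: str) -> str:
    """Guess which subset an ID belongs to from its shape alone (hypothetical helper)."""
    if _NUMERIC_ID.fullmatch(sample_id):
        return "batch_outputs"
    # Anything that is not purely numeric is treated as a prompt-like ID.
    return "generated_noises"


if __name__ == "__main__":
    # The second ID is truncated here exactly as it is in the text above.
    for sample_id in ("00_000357", "a_bulldozer_moving_gravel_..."):
        print(sample_id, "->", id_source(sample_id))
```

If the dataset contains IDs that break this pattern, the regular expression would need to be adjusted accordingly.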
Audio labels: BG = background noise | FG = generated foreground | Mix = BG + FG