NearestNeighbor Audio Demo

Data from AE-W/batch_outputs

Three prompt-search methods are available: Bin | Clap | Dasheng. Each one combines batch_outputs_* and generated_noises_* from the dataset.

How to read the IDs

  • Numeric IDs (e.g. 00_000357) come from batch_outputs (SONYC/UrbanSound8K).
  • Long prompt-like IDs (e.g. a_bulldozer_moving_gravel_...) come from generated_noises.
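
The ID convention above could be checked programmatically. The following is a minimal sketch, not the demo's own code: the function name id_source and the regular expression are hypothetical, and it assumes that numeric IDs always look like digits, an underscore, then digits (as in 00_000357) and that any other ID is prompt-like.

```python
import re

# Assumption: numeric IDs are "<digits>_<digits>", as in the example 00_000357.
NUMERIC_ID = re.compile(r"\d+_\d+")


def id_source(noise_id: str) -> str:
    """Guess which part of the dataset a noise ID comes from (hypothetical helper)."""
    if NUMERIC_ID.fullmatch(noise_id):
        return "batch_outputs"      # SONYC/UrbanSound8K clips
    return "generated_noises"       # long prompt-like IDs


if __name__ == "__main__":
    print(id_source("00_000357"))
    # Shortened, hypothetical stand-in for a prompt-like ID.
    print(id_source("a_bulldozer_moving_gravel"))
```

Anything that does not match the numeric pattern falls through to generated_noises, so an unexpected ID format would be misclassified rather than rejected.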

Audio labels: BG = background noise | FG = generated foreground | Mix = BG + FG
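
As an illustration of the Mix = BG + FG label, the sketch below adds two mono signals sample by sample. It is only one possible reading: the page does not say how the mix is produced, so the plain sum, the zero-padding of the shorter signal, and the clipping to [-1.0, 1.0] are all assumptions, and the function name mix is hypothetical.

```python
def mix(bg, fg):
    """Sum two mono signals (lists of floats at the same sample rate).

    Assumptions: the shorter signal is zero-padded, and the result is
    clipped to the range [-1.0, 1.0].
    """
    n = max(len(bg), len(fg))
    out = []
    for i in range(n):
        b = bg[i] if i < len(bg) else 0.0
        f = fg[i] if i < len(fg) else 0.0
        out.append(max(-1.0, min(1.0, b + f)))
    return out


if __name__ == "__main__":
    background = [0.5, -0.5]
    foreground = [0.25, -0.75, 0.1]
    print(mix(background, foreground))
```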

Nearest Neighbor (Clap): Baseline outputs (top 10 prompts)