In March 2025, I (Jakob Abesser), working for Fraunhofer IDMT, was fortunate to participate in the 51st Annual Meeting on Acoustics (DAS | DAGA 2025), which took place at the Bella Center in Copenhagen, Denmark. It gave me the opportunity to present some of my veraAI research work and to gain deeper insights into the work of others. Here's a short recap of the event.
The conference was a joint event hosted by the German and Danish national acoustic societies DEGA and DAS and included sessions in various fields such as room acoustics, virtual acoustics, and building acoustics. Alongside the scientific program, there was plenty of time for networking, capped off with a reception at the Copenhagen Town Hall and a social evening at the famous VEGA music club.
In addition to co-organizing a structured session entitled "Acoustic Scene Analysis using AI", I presented the first results from the veraAI project on "Automatic Retrieval of Indicator Sounds for Acoustic Geo-Tagging" as a late-breaking poster on the final conference day.
As the main idea, we propose the term "indicator sounds" to describe sounds which are characteristic of (but not unique to) a specific geo-entity. Such an entity can be a specific acoustic scene, such as a park or pedestrian zone, or even a particular city, such as Helsinki or Barcelona.
Indicator sounds, when recognized in the background of, for instance, an outdoor speech recording, can provide valuable clues for verifying the geographic origin of an audio recording.
In the poster (available here, low-res image below), we present two data-driven strategies to identify such indicator sounds based on a dataset of ambient recordings covering ten different classes of acoustic scenes and ten European cities.
The findings so far show that both methods allow us to identify plausible sounds such as "rumble" and "engine starting" for the acoustic scene class "bus", or "cricket" and "bird" for the scene class "park". However, the indicator sounds found for cities show the strong influence of (and potential bias introduced by) the recording location within a city. For instance, we found "siren" sounds in Vienna, "water noises" in Helsinki, and "street musicians" in Barcelona.
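To make the notion of "characteristic but not unique" more concrete, here is a minimal Python sketch. It is not one of the two strategies from the poster. It only assumes that each recording already comes with a geo-entity label and a set of detected sound tags, and it scores a tag by how much more often it occurs within one entity than across the whole dataset (a simple "lift" ratio). The function names `indicator_scores` and `top_indicators` and the four toy recordings are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def indicator_scores(recordings):
    """Score each sound tag per geo-entity with a simple lift ratio.

    recordings: list of (entity_label, set_of_sound_tags) pairs.
    Returns {entity: {tag: lift}}, where
    lift = P(tag | entity) / P(tag over all recordings).
    A lift above 1 means the tag is over-represented in that entity.
    """
    n_total = len(recordings)
    entity_sizes = Counter()            # number of recordings per entity
    per_entity = defaultdict(Counter)   # tag counts within each entity
    overall = Counter()                 # tag counts over all recordings

    for entity, tags in recordings:
        entity_sizes[entity] += 1
        for tag in set(tags):           # count each tag once per recording
            per_entity[entity][tag] += 1
            overall[tag] += 1

    return {
        entity: {
            tag: (count / entity_sizes[entity]) / (overall[tag] / n_total)
            for tag, count in counts.items()
        }
        for entity, counts in per_entity.items()
    }


def top_indicators(scores, entity, k=2):
    """Return the k highest-scoring tags for an entity (ties broken by name)."""
    ranked = sorted(scores[entity].items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


# Hypothetical toy data, not taken from the dataset used in the poster.
toy_recordings = [
    ("park", {"bird", "cricket", "speech"}),
    ("park", {"bird", "speech"}),
    ("bus", {"rumble", "engine starting", "speech"}),
    ("bus", {"rumble", "speech"}),
]

scores = indicator_scores(toy_recordings)
print(top_indicators(scores, "park"))
print(top_indicators(scores, "bus"))
```

In this toy example, "speech" occurs in every recording and therefore gets a lift of 1 for both scenes, so it does not rank as an indicator, while tags that are concentrated in one scene rank higher. The same kind of score would also pick up tags that are merely tied to a recording location, which is the sort of bias described above for the city-level results.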
As a next step, we want to extend this study to include multiple datasets of recorded acoustic scenes and soundscapes in order to generate an extensive list of indicator sounds for content verification. Stay tuned!
Author: Jakob Abesser (Fraunhofer IDMT)
Editor: Jochen Spangenberg (DW)