A mismatch between training and test data statistics can severely degrade the performance of machine learning systems. For sound recognition in acoustic sensor networks (ASNs) this issue is particularly acute because of the large number and variability of sounds and acoustic environments, and because of the wide variety of sensor locations and geometric configurations that can be encountered. Existing databases for sound recognition will therefore almost never be a perfect fit for any concrete target application in an ASN.
The main objective of this project is to devise techniques that exploit available resources for the development of high-performance acoustic event and scene classifiers for a specific target application in an ASN. These resources are, on the one hand, weakly labeled data (data annotated only with the event class, but not with temporal onset/offset information) that stem from a different domain than the target domain for which an application is to be developed, and, on the other hand, a large amount of unlabeled audio recordings from the target domain. We will develop techniques to compute strong labels (event category plus onset/offset times), to compute domain-invariant features, and to carry out domain adaptation. The main methodology applied will be deep generative models.
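To illustrate the weak-to-strong labeling step, the following sketch (hypothetical helper names, not code from this project) shows one common approach: a multiple-instance-learning aggregator that pools frame-level event probabilities into a clip-level score matching the weak (clip-only) annotation, and a post-processing step that derives strong labels, i.e. onset/offset times, by thresholding the frame probabilities:

```python
def frame_to_clip_scores(frame_scores):
    """Aggregate per-frame event probabilities into a single clip-level
    score via linear softmax pooling, a common multiple-instance-learning
    aggregator when training from weak (clip-level) labels."""
    total = sum(frame_scores)
    if total == 0.0:
        return 0.0
    return sum(s * s for s in frame_scores) / total


def strong_labels(frame_scores, threshold=0.5, hop_s=0.02):
    """Derive strong labels, i.e. (onset, offset) times in seconds,
    by thresholding frame probabilities and merging consecutive
    active frames into segments (hop_s = frame hop in seconds)."""
    segments, start = [], None
    for i, s in enumerate(frame_scores):
        if s >= threshold and start is None:
            start = i                      # segment onset frame
        elif s < threshold and start is not None:
            segments.append((start * hop_s, i * hop_s))
            start = None
    if start is not None:                  # event runs to the clip end
        segments.append((start * hop_s, len(frame_scores) * hop_s))
    return segments
```

In practice the frame probabilities would come from a neural network trained against the clip-level score, but the pooling/thresholding structure is the same.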
- Project duration: 01/2017 - 12/2023
- Funded by: