The ODSS dataset can be used to develop and benchmark synthetic speech detection methods. It provides tailored data distributions ready for training, as well as multiple dimensions for evaluating and analyzing generalizability. The utterances are uncompressed and contain no background noise, so audio augmentation techniques can also be applied to improve or test robustness to various transformations.
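As one illustration of such an augmentation, the following minimal Python sketch adds white Gaussian noise to a clean signal at a chosen signal-to-noise ratio. It is not part of the ODSS release: the function names, the 20 dB setting, and the synthetic tone standing in for an utterance are all assumptions made for this example.

```python
import math
import random


def add_noise_at_snr(samples, snr_db, seed=0):
    """Return a copy of `samples` with white Gaussian noise added at `snr_db` dB.

    Hypothetical helper for illustration; `samples` is a list of floats.
    """
    if not samples:
        return []
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(s * s for s in samples) / len(samples)
    # Noise power that yields the requested signal-to-noise ratio.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB of `noisy` relative to `clean`."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Stand-in for a clean utterance (assumption): one second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
clean = [0.5 * math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noisy = add_noise_at_snr(clean, snr_db=20)
```

In practice, the list `clean` would be replaced by samples decoded from a dataset file. A detector can then be trained on the noisy copies to improve robustness, or evaluated on them at several SNR levels to measure how its performance degrades.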
Call to action: Would you like to contribute your own speech samples and help us expose disinformation? Then do not miss the chance: contact us via mail (preferred) to discuss the details, or use the IDMT contact form. We also encourage you to design additional splits for training and evaluation beyond the ones we suggest as a starting point.
The dataset is available at https://zenodo.org/records/8370668, which always includes the latest version.
Reference, including a detailed description of the dataset:
A. Yaroshchuk et al., "An Open Dataset of Synthetic Speech" in IEEE International Workshop on Information Forensics and Security (WIFS), Nürnberg, Germany, 2023, pp. 1-6, doi: 10.1109/WIFS58808.2023.10374863.
Author: Artem Yaroshchuk (IDMT)
Editor: Jochen Spangenberg (DW)