Curated list of open datasets launched

Toni Heittola
DCASE

For over eight years, I maintained a list of audio datasets suitable for environmental audio research on my homepage. Recently, I helped launch a better-structured dataset listing as a community project under the DCASE Community.

The DCASE Datalist is a DCASE Community effort to collect curated meta-information about DCASE-related datasets into a uniform structure. The list focuses specifically on pre-packaged datasets rather than online data repositories. Datasets included in the list are well documented, packaged for easy use, and released under a free or open license. At the top level, datasets are grouped into a few data collections according to the type of audio content analysis they mainly target.

The data listing is maintained through a GitHub repository. If you notice missing datasets or errors, or want to contribute to the listing in some other way, you can raise an issue in the repository, or fork it and make a pull request with your edits. Proposals for new data collections are welcome as well.

A curated list of open datasets for DCASE-related research has been launched!