VoxCeleb-PT
From HLT@INESC-ID
Revision as of 14:49, 7 June 2022
VoxCeleb-PT is a small dataset of voices of Portuguese celebrities that can be used as a language-specific extension of the widely used VoxCeleb corpus.
Statistics
The dataset contains 51 celebrities, of whom 23 are female. Altogether, VoxCeleb-PT contains 26,663 automatically transcribed utterances (.wav, 16 kHz, pcm_s16le). The total duration is 17:55:14, with an average of about 20 min per speaker and utterance durations of 2-5 s.
The dataset can be obtained either in its original form or split into development (train/val) and test sets, each containing the full speaker cohort. All sets contain speaker id, age, gender and manually corrected transcriptions. As such, VoxCeleb-PT should prove useful for a variety of tasks, namely ASR, Speaker Verification and Age/Gender Recognition.
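As a quick sanity check, the totals quoted in this section can be turned into per-speaker and per-utterance averages. The sketch below uses only the figures stated above (51 speakers, 26,663 utterances, 17:55:14 total); the variable names are illustrative. The result is roughly 21 minutes per speaker, in line with the rounded figure of 20 min/spk, and a mean utterance length inside the 2-5 s range.

```python
# Figures taken from the Statistics section above.
NUM_SPEAKERS = 51
NUM_UTTERANCES = 26_663
TOTAL_DURATION = "17:55:14"  # hours:minutes:seconds

# Convert the hh:mm:ss string into a number of seconds.
hours, minutes, seconds = (int(part) for part in TOTAL_DURATION.split(":"))
total_seconds = hours * 3600 + minutes * 60 + seconds

# Average speech per speaker (in minutes) and per utterance (in seconds).
minutes_per_speaker = total_seconds / NUM_SPEAKERS / 60
seconds_per_utterance = total_seconds / NUM_UTTERANCES

print(f"total: {total_seconds} s")
print(f"per speaker: {minutes_per_speaker:.1f} min")
print(f"per utterance: {seconds_per_utterance:.2f} s")
```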
Download
- Original. The dataset follows the Kaldi file layout: each folder contains the speech files of a given speaker together with the files text, utt2spk and wav.scp. The spk_info.csv file maps each speaker id to the celebrity's age and gender.
- With splits
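A minimal sketch of how one speaker folder of the original release might be read is shown below. It assumes the standard Kaldi conventions, in which each line of text, utt2spk and wav.scp starts with an utterance id followed by whitespace and then the value (transcription, speaker id, or audio path respectively). The function names and the example folder name are hypothetical, and spk_info.csv is not parsed here because its column names are not documented on this page.

```python
from pathlib import Path


def read_kaldi_table(path):
    """Read a Kaldi-style table: '<key> <value...>' on each line."""
    table = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.strip().split(maxsplit=1)
            if not parts:
                continue  # skip blank lines
            key = parts[0]
            value = parts[1] if len(parts) > 1 else ""
            table[key] = value
    return table


def load_speaker_dir(speaker_dir):
    """Join text, utt2spk and wav.scp of one speaker folder by utterance id."""
    speaker_dir = Path(speaker_dir)
    text = read_kaldi_table(speaker_dir / "text")
    utt2spk = read_kaldi_table(speaker_dir / "utt2spk")
    wav_scp = read_kaldi_table(speaker_dir / "wav.scp")

    utterances = []
    for utt_id, wav in wav_scp.items():
        utterances.append(
            {
                "utt_id": utt_id,
                "speaker": utt2spk.get(utt_id),  # None if the id is missing
                "wav": wav,
                "text": text.get(utt_id, ""),
            }
        )
    return utterances
```

Calling `load_speaker_dir("path/to/speaker_folder")` (a placeholder path) returns one dictionary per utterance, which can then be filtered or passed on to a feature-extraction step.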