Difference between revisions of "VoxCeleb-PT"

From HLT@INESC-ID

Revision as of 14:48, 7 June 2022

VoxCeleb-PT is a small dataset of voices of Portuguese celebrities that can be used as a language-specific extension of the widely used VoxCeleb corpus.

Statistics

The dataset contains 51 celebrities, of which 23 are female. Altogether, VoxCeleb-PT contains 26,663 automatically transcribed utterances (16 kHz .wav, pcm_s16le). The total duration is 17:55:14 (hh:mm:ss), with an average of about 20 minutes per speaker and utterance durations of 2-5 s.
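
As a quick sanity check, the averages can be recomputed from the totals quoted above. This is only a sketch, assuming the averages are simple ratios of total duration to the number of speakers and utterances; the function name `hms_to_seconds` is illustrative and not part of the dataset.

```python
def hms_to_seconds(hms):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures quoted in the Statistics section
total_seconds = hms_to_seconds("17:55:14")
num_speakers = 51
num_utterances = 26663

# Assumption: averages are plain ratios of the totals
minutes_per_speaker = total_seconds / num_speakers / 60
seconds_per_utterance = total_seconds / num_utterances

print(f"{minutes_per_speaker:.1f} min per speaker")
print(f"{seconds_per_utterance:.2f} s per utterance")
```

Under these assumptions the result is roughly 21 minutes per speaker and about 2.4 s per utterance, in line with the rounded figures above.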


The dataset can be obtained in its original form and with development (train/val) and test splits, each containing the full speaker cohort. All sets include speaker ID, age, gender and manually corrected transcriptions. As such, VoxCeleb-PT should prove useful for a variety of tasks, namely automatic speech recognition (ASR), speaker verification and age/gender recognition.

Download

  • Original. The dataset follows the Kaldi data directory layout: each folder contains the speech files of one speaker together with the files `text`, `utt2spk` and `wav.scp`. The `spk_info.csv` file maps each speaker ID to the celebrity's age and gender.
  • With splits
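
The per-speaker folders described above could be read with a few lines of Python. This is a minimal sketch, not an official loader. It relies on the standard Kaldi convention that each line of `text`, `utt2spk` and `wav.scp` is an utterance ID followed by whitespace and a value. The layout of `spk_info.csv` is not documented here, so the code assumes a header row whose first column is the speaker ID. The function names are illustrative.

```python
import csv
from pathlib import Path


def read_kaldi_table(path):
    """Read a Kaldi-style table with one '<key> <value...>' entry per line."""
    table = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            # Split on the first whitespace run only: values may contain spaces
            parts = line.split(maxsplit=1)
            table[parts[0]] = parts[1] if len(parts) > 1 else ""
    return table


def load_speaker_dir(speaker_dir):
    """Join text, utt2spk and wav.scp of one speaker folder by utterance ID."""
    speaker_dir = Path(speaker_dir)
    text = read_kaldi_table(speaker_dir / "text")
    utt2spk = read_kaldi_table(speaker_dir / "utt2spk")
    wav_scp = read_kaldi_table(speaker_dir / "wav.scp")
    return [
        {
            "utt_id": utt_id,
            "speaker": utt2spk.get(utt_id),
            "wav": wav_scp[utt_id],
            "transcript": text.get(utt_id, ""),
        }
        for utt_id in sorted(wav_scp)
    ]


def load_speaker_info(csv_path):
    """Map speaker ID to its metadata row.

    Assumption: the CSV has a header row and its first column is the speaker ID.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        id_column = reader.fieldnames[0]
        return {row[id_column]: row for row in reader}
```

Calling `load_speaker_dir` on one speaker folder returns a list of dictionaries, one per utterance, which can then be joined with the output of `load_speaker_info` through the `speaker` field.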