Revision as of 16:07, 21 June 2022 by Jmen

VoxCeleb-PT is a small dataset of voices of Portuguese celebrities that can be used as a language-specific extension of the widely used VoxCeleb corpus.


The dataset contains 51 celebrities, of whom 23 are female. Altogether, VoxCeleb-PT contains 26,663 automatically transcribed utterances (16 kHz WAV, pcm_s16le). The total duration is 17:55:14, an average of roughly 20 minutes per speaker, with utterance durations of 2-5 s.
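
As a quick sanity check, the per-speaker and per-utterance averages can be recomputed from the figures above. This is only a sketch: it assumes that 17:55:14 is in hours:minutes:seconds, and the variable names are illustrative.

```python
# Figures quoted in the dataset description.
total_seconds = 17 * 3600 + 55 * 60 + 14  # 17:55:14, assumed to be h:mm:ss
speakers = 51
utterances = 26_663

# Mean audio per speaker (in minutes) and mean utterance length (in seconds).
minutes_per_speaker = total_seconds / speakers / 60
seconds_per_utterance = total_seconds / utterances

print(f"{minutes_per_speaker:.1f} min per speaker")
print(f"{seconds_per_utterance:.2f} s per utterance")
```

Under that assumption the mean comes out at about 21 minutes per speaker and about 2.4 s per utterance, which is consistent with the rounded values quoted above.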

The dataset can be obtained either in its original form or split into development (train/val) and test sets, each of which contains the full speaker cohort. All sets include speaker id, age, gender and manually corrected transcriptions. As such, VoxCeleb-PT should prove useful for a variety of tasks, such as automatic speech recognition (ASR), speaker verification and age/gender recognition.


  • Original. The dataset follows the Kaldi data directory convention: each folder holds the speech files of one speaker together with the files text, utt2spk and wav.scp. The spk_info.csv file maps each speaker id to the celebrity's age and gender.
  • With splits
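
The Kaldi files in a speaker folder can be read with a few lines of Python. The sketch below relies on the standard Kaldi convention that each line of text, utt2spk and wav.scp starts with an utterance id followed by whitespace and the value. The function names and the example path are hypothetical, and spk_info.csv is left out because its column names are not documented here.

```python
from pathlib import Path


def read_kaldi_table(path):
    """Read a Kaldi-style table with one 'key value...' entry per line."""
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the first whitespace run only: the value may contain spaces.
            parts = line.split(maxsplit=1)
            table[parts[0]] = parts[1] if len(parts) > 1 else ""
    return table


def load_speaker_dir(speaker_dir):
    """Join text, utt2spk and wav.scp from one speaker folder by utterance id."""
    speaker_dir = Path(speaker_dir)
    text = read_kaldi_table(speaker_dir / "text")
    utt2spk = read_kaldi_table(speaker_dir / "utt2spk")
    wav_scp = read_kaldi_table(speaker_dir / "wav.scp")
    return [
        {
            "utt_id": utt,
            "speaker": utt2spk[utt],
            "wav": wav_scp[utt],
            "text": text.get(utt, ""),
        }
        for utt in sorted(utt2spk)
        if utt in wav_scp  # skip utterances with no audio entry
    ]
```

Calling `load_speaker_dir("path/to/speaker_folder")` (a placeholder path) returns one dictionary per utterance, which is a convenient starting point for building ASR or speaker verification manifests.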


This dataset is available to download for research purposes under a Creative Commons Attribution 4.0 International License[1]. The copyright remains with the original owners of the video.

The views and opinions expressed by speakers in the dataset are those of the individual speakers and do not necessarily reflect the positions of the University of Lisbon, INESC-ID, or the authors.