A. Abad, L. J. Rodriguez Fuentes, M. Penagarikano, A. Varona, M. Diez, and G. Bordel. On the Calibration and Fusion of Heterogeneous Spoken Term Detection Systems. In Interspeech 2013, August 25-29, 2013.
README.txt - Contains essentially the same information as this wiki page
./bin/ - Contains the scripts needed for calibration and fusion
./bin/PrepareForFusion.sh - Main script that normalizes, aligns, hypothesizes missing scores and creates the ground truth for the input systems
./bin/align_score_files.pl - Used by ./bin/PrepareForFusion.sh
./bin/create_groundtruth_centerdistance.pl - Used by ./bin/PrepareForFusion.sh
./bin/heuristicScoring.pl - Used by ./bin/PrepareForFusion.sh
./bin/fusion.sh - Takes the output of ./bin/PrepareForFusion.sh to learn the fusion parameters and apply them to the evaluation scores
./bin/raw2stdslist.sh - Converts scores from a raw (internal) format to the stdlist XML format of the SWS evaluation
./scoring_atwv_sws2013/ - Contains the Mediaeval 2013 ATWV scoring package
./scores/ - Contains sample dev scores (./scores/dev/akws_br-devterms.stdlist.xml, ./scores/dev/dtw_br-devterms.stdlist.xml) and eval scores (./scores/eval/akws_br-evalterms.stdlist.xml, ./scores/eval/dtw_br-evalterms.stdlist.xml)
./sample_results/ - Contains the sample (intermediate) results obtained by running the instructions below
STEP1 Prepare the development scores for learning the calibration/fusion
./bin/PrepareForFusion.sh \
    -q ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.tlist.xml \
    -r ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.rttm \
    -o DEV_SCORES_4FUSION.txt \
    -t queries_in_data.ref \
    -g -z qnorm -m 1 -n 1 \
    ./scores/dev/akws_br-devterms.stdlist.xml \
    ./scores/dev/dtw_br-devterms.stdlist.xml
Each line of the output file (DEV_SCORES_4FUSION.txt) has the following format:
<query_id> <file_id> <start_time> <duration> <sc1> <sc2> ... <scN> <label>
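A quick way to sanity-check a prepared file is to count how many per-system score columns it holds and how many detections carry each label. The sketch below assumes whitespace-separated fields and a file produced with -g, so that the last field is <label>. The label values themselves are not documented here, so the helper simply tallies whatever it finds. summarize_scores is a hypothetical helper name, not part of the toolkit, and the sample rows are made up.

```shell
# Hypothetical helper (not part of the toolkit): summarize a file whose lines are
#   <query_id> <file_id> <start_time> <duration> <sc1> ... <scN> <label>
# Assumes whitespace-separated fields and that -g was used (label is the last field).
summarize_scores() {
  awk '
    NF >= 6 {
      nsys = NF - 5            # 4 leading fields + 1 trailing label
      count[$NF]++             # tally detections per label value
    }
    END {
      print "systems", nsys
      for (l in count) print "label", l, count[l]
    }' "$1" | sort
}

# Tiny made-up example with two systems (values are illustrative only).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
q001 file01 1.20 0.50 0.83 0.10 1
q001 file02 3.40 0.45 -0.25 -1.10 0
q002 file01 7.00 0.60 0.05 -0.40 0
EOF
summarize_scores "$tmp"
rm -f "$tmp"
```

On a real run, the reported number of systems should match the number of stdlist files passed to PrepareForFusion.sh.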
Usage: PrepareForFusion.sh -q <tlistxml> -r <rttm> -o <outputfile> [opts] <stdlistfile1> [stdlistfile2] [stdlistfile3] ... [stdlistfileN]
    <stdlistfile*>   input score stdlist file in the SWS2012 format (*.stdlist.xml) | - Required argument (at least 1)
    -q <tlistxml>    termlist file in the SWS2012 format (*.tlist.xml) | - Required parameter
    -r <rttm>        rttm file in the SWS2012 format (*.rttm) | - Required parameter
    -o <outputfile>  output file name | - Required parameter
    -z <value>       score z-norm type (none, qnorm, fnorm, qfnorm, fqnorm) | - Default: none
    -g               add ground-truth information to the outputfile
    -t <filename>    saves the number of true terms in the reference per query (implies -g)
    -m <value>       apply majority voting fusion with <value> minimum number of votes | - Default: 1
    -n <value>       method for creating default scores (0: average of the other detections (MV approach), 1: min per query, 2: global min, 3: histogram based) | - Default: 0
    -d               debug mode, do not remove auxiliary files stored in /tmp/tmpdir.$$
    -h               help
NOTE: Requires Matlab (or Octave) and Perl. It also depends on the Perl scripts align_score_files.pl, heuristicScoring.pl and create_groundtruth_centerdistance.pl, which should be located in the same folder as the main script.
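The name of the -z qnorm option suggests a per-query z-normalization, but the script's actual implementation is not shown on this page. The following is therefore only a sketch of per-query z-norm in general: each score is shifted by the mean and divided by the standard deviation of all scores that share its query id. The helper name qnorm_column, the choice of column, the population (divide-by-N) standard deviation and the guard for constant scores are all assumptions made for illustration.

```shell
# Hypothetical helper: z-normalize one score column per query.
# Usage: qnorm_column <file> <column_index>
# Assumes the query id is field 1 and fields are whitespace-separated.
qnorm_column() {
  awk -v c="$2" '
    NR == FNR {                      # pass 1: accumulate per-query statistics
      n[$1]++; s[$1] += $c; ss[$1] += $c * $c
      next
    }
    {                                # pass 2: rewrite the chosen column
      mean = s[$1] / n[$1]
      var  = ss[$1] / n[$1] - mean * mean
      sd   = (var > 0) ? sqrt(var) : 1   # guard: constant scores keep sd = 1
      $c   = sprintf("%.4f", ($c - mean) / sd)
      print
    }' "$1" "$1"
}

# Made-up example: normalize the score in field 5.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
q001 file01 1.20 0.50 1
q001 file02 3.40 0.45 3
q002 file01 7.00 0.60 5
q002 file02 9.10 0.55 5
EOF
qnorm_column "$tmp" 5
rm -f "$tmp"
```

The file is read twice so that the statistics of a query are complete before any of its scores is rewritten.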
STEP2 Prepare the evaluation scores for fusion
./bin/PrepareForFusion.sh \
    -q ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.tlist.xml \
    -r ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.rttm \
    -o EVAL_SCORES_4FUSION.txt \
    -z qnorm -m 1 -n 1 \
    ./scores/eval/akws_br-evalterms.stdlist.xml \
    ./scores/eval/dtw_br-evalterms.stdlist.xml
STEP3 Train the fusion on the development scores and apply it to the eval scores by typing the following command:
./bin/fusion.sh DEV_SCORES_4FUSION.txt EVAL_SCORES_4FUSION.txt queries_in_data.ref
Before running this script, you need to change the value of the MATLAB_BIN variable, download the Bosaris toolkit, and set the BOSARIS variable to the path that contains the toolkit.
Notice also that the script contains four hard-coded variables that are specific to the Mediaeval SWS2013 evaluation: the cost parameters P_TARGET, C_MISS and C_FA, and the total duration of the collection data in seconds, TSPEECH.
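For intuition on how the cost parameters enter the decision, the sketch below computes the weight beta that trades false alarms against misses, and the corresponding threshold log(beta) for well-calibrated log-likelihood-ratio scores. This is the textbook Bayes decision rule, offered as an assumption about what the script computes rather than a copy of it. bayes_threshold is a hypothetical helper, and the numeric inputs are placeholders, not the SWS2013 settings.

```shell
# Hypothetical helper: Bayes threshold for calibrated log-likelihood-ratio scores.
#   beta      = (C_FA / C_MISS) * (1 - P_TARGET) / P_TARGET
#   threshold = log(beta)
# Usage: bayes_threshold <P_TARGET> <C_MISS> <C_FA>
bayes_threshold() {
  awk -v p="$1" -v cm="$2" -v cf="$3" 'BEGIN {
    beta = (cf / cm) * (1 - p) / p
    printf "%.4f\n", log(beta)
  }'
}

# Placeholder values for illustration only (not the SWS2013 settings).
bayes_threshold 0.2 1 1    # beta = 4, prints 1.3863
```

A smaller P_TARGET or a larger C_FA raises the threshold, so fewer detections are accepted. A larger C_MISS lowers it.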
This fusion script uses DEV_SCORES_4FUSION.txt (with ground-truth information) to learn the calibration and fusion parameters, which are then applied to both the development set and the evaluation set.
As mentioned previously, queries_in_data.ref contains the statistics about the true number of instances of each query in the data. It is used for hypothesizing missing scores and for computing ATWV.
This call, which can take several minutes, produces one output file for the dev scores (dev_fusion.scores) and one output file for the eval scores (eval_fusion.scores). The format of these output files is as follows:
<query_id> <file_id> <start_time> <duration> <fusion_score> <decision>
...
<query_id> <file_id> <start_time> <duration> <fusion_score> <decision>
The <decision> field is 0 if the fusion score is lower than the minimum-cost Bayes optimal threshold and 1 if it is greater (see the Interspeech paper for details).
Additionally, the fusion parameters are stored in fuse_params.txt.
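To explore a different operating point, the decision column can be recomputed directly from the fusion score. The sketch below assumes the six whitespace-separated fields described above, with the fusion score in field 5 and the decision in field 6. redecide is a hypothetical helper, and mapping a score exactly equal to the threshold to 0 is an arbitrary choice.

```shell
# Hypothetical helper: recompute <decision> (field 6) from <fusion_score> (field 5).
# Usage: redecide <scores_file> <threshold>
redecide() {
  awk -v thr="$2" '{ $6 = ($5 + 0 > thr + 0) ? 1 : 0; print }' "$1"
}

# Made-up example: with a threshold of 3.0 neither detection is accepted.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
q001 file01 1.20 0.50 2.75 1
q001 file02 3.40 0.45 -0.30 0
EOF
redecide "$tmp" 3.0
rm -f "$tmp"
```

The original file is left untouched, and the rewritten lines go to standard output.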
The final step consists of converting these result files to the format used in the SWS2013 challenge:
./bin/raw2stdslist.sh dev_fusion.scores ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.tlist.xml > fusion-devterms.stdlist.xml
./bin/raw2stdslist.sh eval_fusion.scores ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.tlist.xml > fusion-evalterms.stdlist.xml

Notice that this script can take a third parameter: a threshold value that is applied as the decision threshold instead of the optimal Bayes one.
Reference TWV results in Mediaeval SWS2013 task:
           dev               eval
           mtwv    atwv      mtwv    atwv
akws_br    0.1571  0.1408
dtw_br
fusion