* Unfortunately, we did not have time to consolidate and clean up the code, so what we are making available is essentially the same set of scripts that we developed during the first experiments. The code is therefore split into several pieces written in different languages (bash, perl, matlab, etc.) and has some external dependencies. We nevertheless expect it to be useful for researchers who want to try fusing their STD systems.
* The package can be downloaded from [[Media:STDfusion.tgz | here]].
== Package contents ==
./scores/dev/dtw_br-devterms.stdlist.xml
:* '''Warning''' Before running the script, set the $MATLAB_BIN variable to the path where your Matlab binary is installed.
:* This process will generate 2 output files:
./bin/fusion.sh dev_scores_4fusion.txt eval_scores_4fusion.txt queries_in_data.ref
:* '''Warning''' Before running the script, set the $MATLAB_BIN variable to the path where your Matlab binary is installed.
:* '''Warning''' The Bosaris toolkit needs to be installed. Set the $BOSARIS variable to the path where the Bosaris toolkit is installed.
:* '''Warning''' The script contains 4 internal variables that are specific to the MediaEval SWS2013 evaluation:
::* The cost parameters $P_TARGET, $C_MISS, $C_FA
::* $TSPEECH - The total duration (in seconds) of the data collection
:* '''Warning''' The call to the fusion script can take several minutes.
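To make the role of the four evaluation-specific variables concrete, the following minimal sketch computes a term-weighted value (TWV) from per-query counts. It assumes the NIST STD 2006 style definition (one non-target trial per second of speech) and is not taken from the package scripts. The names `QueryCounts` and `twv` are hypothetical, and the numbers in the usage example are illustrative, not the SWS2013 settings.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class QueryCounts:
    n_true: int     # occurrences of the query in the reference
    n_correct: int  # detections that match a reference occurrence
    n_false: int    # detections that match no reference occurrence


def twv(counts: Iterable[QueryCounts], p_target: float, c_miss: float,
        c_fa: float, t_speech: float) -> float:
    """Term-weighted value, assuming the NIST STD 2006 style definition."""
    # beta weights false alarms against misses.
    beta = (c_fa / c_miss) * (1.0 / p_target - 1.0)
    losses = []
    for c in counts:
        if c.n_true == 0:
            continue  # P_miss is undefined for queries with no true occurrence
        p_miss = 1.0 - c.n_correct / c.n_true
        # Assumption: one non-target trial per second of speech.
        p_fa = c.n_false / (t_speech - c.n_true)
        losses.append(p_miss + beta * p_fa)
    if not losses:
        raise ValueError("no query has a true occurrence in the reference")
    return 1.0 - sum(losses) / len(losses)


if __name__ == "__main__":
    # Illustrative values only.
    example = [QueryCounts(n_true=4, n_correct=3, n_false=0),
               QueryCounts(n_true=2, n_correct=2, n_false=0)]
    print(twv(example, p_target=0.001, c_miss=100.0, c_fa=1.0, t_speech=3600.0))
```

Under this definition, $TSPEECH only enters through the false-alarm probability, which is why it has to match the duration of the data collection being scored.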
A. Abad, L. J. Rodriguez Fuentes, M. Penagarikano, A. Varona, M. Diez, and G. Bordel. On the Calibration and Fusion of Heterogeneous Spoken Term Detection Systems. In Interspeech 2013, August 25-29 2013
README.txt - Contains essentially the same information as this wiki page
./bin/ - Contains the scripts needed for calibration and fusion
./bin/PrepareForFusion.sh - Main script that normalizes, aligns, hypothesizes missing scores and creates the ground truth for the input systems
./bin/align_score_files.pl - Used by ./bin/PrepareForFusion.sh
./bin/create_groundtruth_centerdistance.pl - Used by ./bin/PrepareForFusion.sh
./bin/heuristicScoring.pl - Used by ./bin/PrepareForFusion.sh
./bin/fusion.sh - Takes the output of ./bin/PrepareForFusion.sh to learn the fusion parameters and apply them to the evaluation scores
./bin/raw2stdslist.sh - Converts scores from a raw (internal) format to the stdlist xml format of the SWS evaluation
./scoring_atwv_sws2013/ - Contains the MediaEval 2013 atwv scoring package
./scores/ - Contains sample dev scores (./scores/dev/akws_br-devterms.stdlist.xml, ./scores/dev/dtw_br-devterms.stdlist.xml) and eval scores (./scores/eval/akws_br-evalterms.stdlist.xml, ./scores/eval/dtw_br-evalterms.stdlist.xml)
./sample_results/ - Contains the sample (intermediate) results that are obtained if you run the instructions below
STEP1 Prepare the development scores for learning the calibration/fusion
./bin/PrepareForFusion.sh \
  -q ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.tlist.xml \
  -r ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.rttm \
  -o dev_scores_4fusion.txt \
  -t num_queries_in_data.ref \
  -g -z qnorm -m 1 -n 1 \
  ./scores/dev/akws_br-devterms.stdlist.xml \
  ./scores/dev/dtw_br-devterms.stdlist.xml
<query_id> <file_id> <start_time> <duration> <sc1> <sc2> ... <scN> <label>
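The record layout above can be read back with a few lines of code. The sketch below assumes that fields are separated by whitespace and that the trailing `<label>` is present only when the file was produced with `-g`. Neither point is confirmed by the package, and the names `Detection` and `parse_line` are hypothetical. The label is kept as a string because its encoding is not stated here.

```python
from typing import List, NamedTuple, Optional


class Detection(NamedTuple):
    query_id: str
    file_id: str
    start_time: float
    duration: float
    scores: List[float]   # one score per input system: sc1 ... scN
    label: Optional[str]  # ground-truth field, or None when absent


def parse_line(line: str, has_label: bool) -> Detection:
    """Parse one whitespace-separated record of the assumed layout."""
    fields = line.split()
    min_fields = 5 + (1 if has_label else 0)  # 4 fixed fields + >= 1 score
    if len(fields) < min_fields:
        raise ValueError("too few fields: %r" % line)
    label = fields.pop() if has_label else None
    return Detection(
        query_id=fields[0],
        file_id=fields[1],
        start_time=float(fields[2]),
        duration=float(fields[3]),
        scores=[float(x) for x in fields[4:]],
        label=label,
    )


if __name__ == "__main__":
    # Made-up record with two system scores and a label.
    print(parse_line("query_01 file_07 1.25 0.40 -0.3 1.7 1", has_label=True))
```

Passing `has_label` explicitly avoids guessing whether the last column is a score or a label, which cannot be decided from a single line.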
Usage: PrepareForFusion.sh -q <tlistxml> -r <rttm> -o <outputfile> [opts] <stdlistfile1> [stdlistfile2] [stdlistfile3] ... [stdlistfileN]
<stdlistfile*>    input score stdlist file in the SWS2012 format (*.stdlist.xml) | - Required argument (at least 1)
-q <tlistxml>     termlist file in the SWS2012 format (*.tlist.xml) | - Required parameter
-r <rttm>         rttm file in the SWS2012 format (*.rttm) | - Required parameter
-o <outputfile>   output file name | - Required parameter
-z <value>        score z-norm type (none, qnorm, fnorm, qfnorm, fqnorm) | - Default: none
-g                add ground-truth information to the outputfile
-t <filename>     saves the number of true terms in the reference per query (implies -g)
-m <value>        apply majority voting fusion with <value> minimum number of votes | - Default: 1
-n <value>        method for creating default scores (0: average of the other detections (MV approach); 1: min per query; 2: global min; 3: histogram based) | - Default: 0
-d                debug mode, do not remove auxiliary files stored in /tmp/tmpdir.$$
-h                help
NOTE: Requires Matlab (or octave) and perl. It also depends on the perl scripts align_score_files.pl, heuristicScoring.pl and create_groundtruth_centerdistance.pl, which should be located in the same folder as the main script
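As an illustration of the options used in the sample commands (`-z qnorm` and `-n 1`), here is a minimal sketch for the scores of a single system. It assumes that `qnorm` means zero-mean, unit-variance normalization computed separately for each query, and that method 1 replaces a missing score with the lowest score that system gave to the same query. Both readings are assumptions based on the option descriptions, not on the script itself. The function names are hypothetical, and the use of the population standard deviation is an arbitrary choice.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, Optional[float]]  # (query_id, score or None when missing)


def qnorm(rows: List[Row]) -> List[Row]:
    """Per-query z-normalization; missing scores (None) are left untouched."""
    by_query: Dict[str, List[float]] = defaultdict(list)
    for query_id, score in rows:
        if score is not None:
            by_query[query_id].append(score)
    stats = {q: (mean(s), pstdev(s)) for q, s in by_query.items()}
    out: List[Row] = []
    for query_id, score in rows:
        if score is None:
            out.append((query_id, None))
            continue
        mu, sigma = stats[query_id]
        # A query with a single score (or constant scores) has sigma == 0.
        out.append((query_id, (score - mu) / sigma if sigma > 0 else 0.0))
    return out


def fill_min_per_query(rows: List[Row]) -> List[Row]:
    """Replace each missing score with the minimum score of the same query."""
    mins: Dict[str, float] = {}
    for query_id, score in rows:
        if score is not None:
            mins[query_id] = min(score, mins.get(query_id, score))
    # Stays None when the system produced no score at all for that query.
    return [(q, mins.get(q) if s is None else s) for q, s in rows]


if __name__ == "__main__":
    rows = [("q1", 1.0), ("q1", 3.0), ("q1", None), ("q2", 5.0)]
    print(fill_min_per_query(qnorm(rows)))
```

Normalizing before filling follows the order in which the main script is described (normalize, align, hypothesize missing scores), so the default score is expressed on the normalized scale.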
STEP2 Prepare the evaluation scores for fusion
./bin/PrepareForFusion.sh \
  -q ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.tlist.xml \
  -r ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.rttm \
  -o eval_scores_4fusion.txt \
  -z qnorm -m 1 -n 1 \
  ./scores/eval/akws_br-evalterms.stdlist.xml \
  ./scores/eval/dtw_br-evalterms.stdlist.xml
STEP3 Train the fusion parameters using the development scores and apply them to the evaluation scores
./bin/fusion.sh dev_scores_4fusion.txt eval_scores_4fusion.txt queries_in_data.ref
<query_id> <file_id> <start_time> <duration> <score> <decision>
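One plausible reading of the `<score>` and `<decision>` columns is sketched below. It assumes that the fused score is a linear combination of the per-system scores calibrated as a log-likelihood ratio, and that the decision compares it with the minimum-expected-cost Bayes threshold derived from the cost parameters. This is standard detection theory rather than a description of what `fusion.sh` does internally. The function names are hypothetical, and the weights, bias and cost values are made up for illustration.

```python
import math
from typing import Sequence


def bayes_threshold(p_target: float, c_miss: float, c_fa: float) -> float:
    """Minimum-expected-cost threshold for a log-likelihood-ratio score."""
    return math.log((c_fa * (1.0 - p_target)) / (c_miss * p_target))


def fuse(scores: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear fusion: bias plus a weighted sum of the per-system scores."""
    if len(scores) != len(weights):
        raise ValueError("one weight per system score is required")
    return bias + sum(w * s for w, s in zip(weights, scores))


def decide(llr: float, threshold: float) -> bool:
    """Accept the detection when the fused score exceeds the threshold."""
    return llr > threshold


if __name__ == "__main__":
    # Illustrative numbers only; real weights would be learned on dev scores.
    theta = bayes_threshold(p_target=0.001, c_miss=100.0, c_fa=1.0)
    llr = fuse([1.2, 0.4], weights=[2.0, 1.5], bias=-1.0)
    print(llr, theta, decide(llr, theta))
```

Under this reading the threshold depends only on the cost parameters, so the same decision rule applies to development and evaluation scores once the scores are calibrated.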
./bin/raw2stdslist.sh dev_fusion.scores \
  ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.tlist.xml > fusion-devterms.stdlist.xml
./bin/raw2stdslist.sh eval_fusion.scores \
  ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.tlist.xml > fusion-evalterms.stdlist.xml
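The commands of the three steps and the final conversion can be collected in one place, for example to re-run the whole recipe on other score files. The sketch below only reuses the scripts, flags and file names that appear in the commands above. It assumes that the file written with `-t` in STEP1 is the one `fusion.sh` expects as its third argument, so a single name is used for both. The helper names (`prepare_cmd`, `build_pipeline`, `run_pipeline`) are hypothetical.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple

DEV = "./scoring_atwv_sws2013/sws2013_dev/sws2013_dev"
EVAL = "./scoring_atwv_sws2013/sws2013_eval/sws2013_eval"

Step = Tuple[List[str], Optional[str]]  # (argv, file receiving stdout or None)


def prepare_cmd(prefix: str, out: str, stdlists: Sequence[str],
                counts_file: Optional[str] = None) -> List[str]:
    """Argument list for one ./bin/PrepareForFusion.sh call."""
    cmd = ["./bin/PrepareForFusion.sh",
           "-q", prefix + ".tlist.xml", "-r", prefix + ".rttm", "-o", out]
    if counts_file is not None:
        # Development data only: -t saves per-query counts, -g adds labels.
        cmd += ["-t", counts_file, "-g"]
    return cmd + ["-z", "qnorm", "-m", "1", "-n", "1"] + list(stdlists)


def build_pipeline(counts_file: str = "queries_in_data.ref") -> List[Step]:
    """Build the command sequence without executing anything."""
    dev_in = ["./scores/dev/akws_br-devterms.stdlist.xml",
              "./scores/dev/dtw_br-devterms.stdlist.xml"]
    eval_in = ["./scores/eval/akws_br-evalterms.stdlist.xml",
               "./scores/eval/dtw_br-evalterms.stdlist.xml"]
    return [
        (prepare_cmd(DEV, "dev_scores_4fusion.txt", dev_in, counts_file), None),
        (prepare_cmd(EVAL, "eval_scores_4fusion.txt", eval_in), None),
        (["./bin/fusion.sh", "dev_scores_4fusion.txt",
          "eval_scores_4fusion.txt", counts_file], None),
        (["./bin/raw2stdslist.sh", "dev_fusion.scores", DEV + ".tlist.xml"],
         "fusion-devterms.stdlist.xml"),
        (["./bin/raw2stdslist.sh", "eval_fusion.scores", EVAL + ".tlist.xml"],
         "fusion-evalterms.stdlist.xml"),
    ]


def run_pipeline(steps: Sequence[Step]) -> None:
    """Run each step in order and stop at the first failure."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            # raw2stdslist.sh prints to stdout, so redirect it to a file.
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    for argv, stdout_path in build_pipeline():
        print(" ".join(argv) + (" > " + stdout_path if stdout_path else ""))
```

Running the module as shown only prints the commands. Calling `run_pipeline(build_pipeline())` from the package root would execute them, which requires Matlab (or octave), perl and the Bosaris toolkit to be configured as described in the warnings.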
Reference results
system | dev (mtwv/atwv) | eval (mtwv/atwv) |
---|---|---|
akws-br | 0.1571 / 0.1408 | 0.1441 / 0.1271 |
dtw-br | 0.2066 / 0.2012 | 0.1654 / 0.1581 |
fusion | 0.2731 / 0.2713 |