STDfusion

Introduction

  • In this wiki we make available the necessary code and the basic instructions to carry out discriminative calibration and fusion of heterogeneous spoken term detection (STD) systems as described in:
          A. Abad, L. J. Rodriguez Fuentes, M. Penagarikano, A. Varona, M. Diez, and G. Bordel. 
          On the Calibration and Fusion of Heterogeneous Spoken Term Detection Systems. 
          In Interspeech 2013, August 25-29 2013
  • The proposed method is the result of the collaboration between the L2F (https://www.l2f.inesc-id.pt) and GTTS (http://gtts.ehu.es/gtts/) groups in the context of their research activities on query-by-example STD. The method has been successfully tested in the SWS2012 and SWS2013 tasks of the Mediaeval evaluations (http://www.multimediaeval.org/mediaeval2013/sws2013/).
  • Unfortunately, we did not have time to consolidate (and clean up) the code, and what we are making available is essentially the same set of scripts that we developed during the first experiments. Thus, the code is split into several pieces written in different languages (bash, perl, matlab, etc.) and has some external dependencies. Nevertheless, we expect it can still be useful for researchers who want to fuse their own STD systems.
  • The package can be downloaded from http://www.l2f.inesc-id.pt/~alberto/STDfusion/STDfusion.v1.tgz.
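For reference, a minimal sketch of fetching and unpacking the package on a typical Unix system (wget is assumed to be available; the name of the extracted directory may differ):
         # Illustrative download/unpack commands (not part of the package itself)
         wget http://www.l2f.inesc-id.pt/~alberto/STDfusion/STDfusion.v1.tgz
         tar xzf STDfusion.v1.tgz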

Package contents

  • The package contains the following files and directories:
        README.txt - Contains essentially the same information as this wiki page
        ./bin/ - Contains the different scripts necessary for calibration and fusion
        ./bin/PrepareForFusion.sh  - Main script that normalizes, aligns and hypothesizes missing scores, and creates the ground truth for the input systems
        ./bin/align_score_files.pl - Used by ./bin/PrepareForFusion.sh
        ./bin/create_groundtruth_centerdistance.pl - Used by ./bin/PrepareForFusion.sh
        ./bin/heuristicScoring.pl - Used by ./bin/PrepareForFusion.sh 
        ./bin/fusion.sh  - This script takes the output of ./bin/PrepareForFusion.sh to learn the fusion parameters and apply them to the evaluation scores
        ./bin/raw2stdslist.sh - Converts scores from a raw (internal) format to the stdlist xml format of the SWS evaluation
        ./scoring_atwv_sws2013/ - Contains the Mediaeval 2013 atwv scoring package
        ./scores/ - Contains sample dev (./scores/dev/akws_br-devterms.stdlist.xml, ./scores/dev/dtw_br-devterms.stdlist.xml) and eval scores (./scores/eval/akws_br-evalterms.stdlist.xml, ./scores/eval/dtw_br-evalterms.stdlist.xml)
        ./sample_results/ - Contains the sample (intermediate) results that are obtained if you run the instructions below
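As an optional, illustrative sanity check (not part of the package scripts), you can verify from the package root that the top-level entries listed above are present after unpacking:
         ls -d README.txt bin scoring_atwv_sws2013 scores sample_results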

Example of use

STEP1 Prepare the development scores for learning the calibration/fusion

  • Type the following command:
         ./bin/PrepareForFusion.sh \
         -q ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.tlist.xml \
         -r ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.rttm \
         -o dev_scores_4fusion.txt \
         -t num_queries_in_data.ref \
         -g -z qnorm -m 1 -n 1 \
         ./scores/dev/akws_br-devterms.stdlist.xml \
         ./scores/dev/dtw_br-devterms.stdlist.xml
  • Warning Before running the script, you will have to set the $MATLAB_BIN variable to the path where your Matlab binary is installed.
  • This process will generate 2 output files:
1. dev_scores_4fusion.txt - Contains the scores ready to be used in the subsequent fusion stage, in the following general format (one row per detection candidate; an illustrative row and a quick sanity check are shown at the end of this step):
                 <query_id> <file_id> <start_time> <duration> <sc1> <sc2> ... <scN> <label>
  • Notice that now all systems produce a score for all candidate detections (the score matrix is full).
  • If the -g option is selected (as in this case), the <label> column contains 0s and 1s for the false and true trials respectively (derived from the rttm file). If the -g option is not selected, the last column simply contains 1s.
2. num_queries_in_data.ref - This second (optional) output file contains the number of times each query appears in the collection data and it is used later in the fusion stage.
  • You can see the general usage of this script by calling it without arguments:
      Usage: PrepareForFusion.sh -q <tlistxml> -r <rttm> -o <outputfile> [opts] <stdlistfile1> [stdlistfile2] [stdlistfile3]  ... [stdlistfileN] 
       <stdlistfile*> input score stdlist file in the SWS2012 format (*.stdlist.xml)        | - Required argument (at least 1)
       -q <tlistxml> termlist file in the SWS2012 format (*.tlist.xml)                      | - Required parameter
       -r <rttm> rttm file in the SWS2012 format (*.rttm)                                   | - Required parameter
       -o <outputfile> output file name                                                     | - Required parameter
       -z <value> score z-norm type (none, qnorm, fnorm, qfnorm, fqnorm)                    | - Default: none
       -g add ground-truth information to the outputfile
       -t <filename> saves the number of true terms in the reference per query (implies -g)
       -m <value> apply majority voting fusion with <value> minimum number of votes         | - Default: 1
       -n <value> method for creating default scores (0: average of the other detections (MV approach); 1: min per query; 2: global min; 3: histogram based)  | - Default: 0
       -d debug mode, do not remove auxiliary files stored in /tmp/tmpdir.$$
       -h help 
       NOTE: Requires Matlab (or octave) and perl. It also depends on the perl scripts align_score_files.pl, heuristicScoring.pl and create_groundtruth_centerdistance.pl 
          that should be located in the same folder of the main script
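To make the dev_scores_4fusion.txt layout concrete, here is a purely hypothetical row for the two sample systems, followed by a simple sanity check (the identifiers and score values are made up for illustration only):
         # Hypothetical row (2 systems => 4 metadata fields + 2 scores + 1 label = 7 columns):
         #   <query_id> <file_id> <start_time> <duration> <sc1> <sc2> <label>
         #   query_0001 file_0042 12.34 0.56 0.87 0.91 1
         # Sanity check: every row should report the same number of columns.
         awk '{print NF}' dev_scores_4fusion.txt | sort -u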

STEP2 Prepare the evaluation scores for fusion

  • Type the following command:
         ./bin/PrepareForFusion.sh \
         -q ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.tlist.xml \
         -r ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.rttm \
         -o eval_scores_4fusion.txt \
         -z qnorm -m 1 -n 1 \
         ./scores/eval/akws_br-evalterms.stdlist.xml  \
         ./scores/eval/dtw_br-evalterms.stdlist.xml
  • Warning It is essential to provide the stdlist.xml input files in the same order as in the previous step.
  • Notice that, in contrast to the development scores, we did not use the -g option, since we do not need the ground truth in this case.
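As a quick illustrative check (assuming the command above completed), the last column of eval_scores_4fusion.txt should therefore contain only 1s:
         awk '{print $NF}' eval_scores_4fusion.txt | sort -u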

STEP3 Train the fusion parameters using the development scores and apply them to the evaluation scores

  • Type the following command:
         ./bin/fusion.sh dev_scores_4fusion.txt eval_scores_4fusion.txt num_queries_in_data.ref
  • This call learns the calibration and fusion parameters from dev_scores_4fusion.txt (which carries the ground-truth information) and applies them both to the development and to the evaluation scores. It produces 3 output files:
1. dev_fusion.scores - file containing the development well-calibrated fusion scores
2. eval_fusion.scores - file containing the evaluation well-calibrated fusion scores
3. fuse_params.txt - file containing the fusion parameters
The format of the score output files is as follows (one row per candidate detection):
             <query_id> <file_id> <start_time> <duration> <score> <decision>
The <decision> field is 0 or 1 depending on whether the score is lower or greater than the minimum-cost Bayes optimal threshold (see the Interspeech paper for details).
  • Warning Before running the script, you will have to set the $MATLAB_BIN variable to the path where your Matlab binary is installed.
  • Warning The Bosaris toolkit (https://sites.google.com/site/bosaristoolkit/) needs to be installed. Change the variable $BOSARIS to the path where the Bosaris toolkit is installed.
  • Warning The script contains 4 internal variables that are specific to the Mediaeval SWS2013 evaluation:
  • The cost parameters $P_TARGET, $C_MISS, $C_FA
  • $TSPEECH - The total duration (in seconds) of the data collection
  • Warning The call to the fusion script can take several minutes.
  • To convert the score files to the SWS2013 stdlist.xml format, type the following commands:
         ./bin/raw2stdslist.sh dev_fusion.scores \
         ./scoring_atwv_sws2013/sws2013_dev/sws2013_dev.tlist.xml > fusion-devterms.stdlist.xml
         ./bin/raw2stdslist.sh eval_fusion.scores \
         ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.tlist.xml > fusion-evalterms.stdlist.xml
This script can take a third parameter, a threshold value, to apply a decision threshold different from the optimal Bayes one (as illustrated below).
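As a side note (not part of the package documentation): in the standard Bayes decision framework used by the Bosaris toolkit, if the calibrated scores are log-likelihood ratios, the minimum-cost threshold is log(C_FA * (1 - P_TARGET) / (C_MISS * P_TARGET)); see the Interspeech paper for the exact criterion used here. A purely hypothetical example of overriding the default decision with the optional third parameter (the value 2.5 and the output file name are arbitrary):
         ./bin/raw2stdslist.sh eval_fusion.scores \
         ./scoring_atwv_sws2013/sws2013_eval/sws2013_eval.tlist.xml 2.5 > fusion-evalterms.thr.stdlist.xml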

Reference results

  • In the folder ./sample_results/ you can find each of the intermediate files produced in this example.
  • The TWV results obtained with the sample score files provided for the Mediaeval SWS2013 task, for each of the individual systems and for the fusion system obtained by following this example, are:
Reference results for Mediaeval SWS 2013

  System     dev (mtwv/atwv)     eval (mtwv/atwv)
  akws-br    0.1571 / 0.1408     0.1441 / 0.1271
  dtw-br     0.2066 / 0.2012     0.1654 / 0.1581
  fusion     0.2731 / 0.2713     0.2355 / 0.2320