Thursday, October 16, 2014

Converting HTK Binary MFCC Values into ASCII MFCC

HTK (the Hidden Markov Model Toolkit) is a toolkit for building hidden Markov models, developed at Cambridge University, whose source code is freely available. It is mostly used for speech recognition and speech synthesis.



HTK stores most of its data in binary files, which are not human-readable. The HCopy command converts WAV files into MFCC (Mel-frequency cepstral coefficient) files; MFCCs are features extracted from the speech signal. For any further processing outside HTK, however, we need the MFCC values themselves. The HList command can read binary MFCC files, but it only prints the values to the screen and does not write an ASCII file.

We can therefore change the HList source code (HList.c) so that the output goes to a file.
Add the following line at the beginning of the main function:
 
freopen("output.txt", "w", stdout);

and add this line at the end of the main function:
 
fclose (stdout);


Your main function will then look like this:
int main(int argc, char *argv[])
{
   char *s,buf[MAXSTRLEN];
   void ListSpeech(char *src);
   freopen("output.txt", "w", stdout);
   if(InitShell(argc,argv,hlist_version,hlist_vc_id)< SUCCESS)
      HError(1100,"HList: InitShell failed");
   InitMem();
   InitMath();  InitSigP();
   InitWave();  InitAudio();
   InitVQ(); InitLabel();
   InitModel();
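   if(InitParm()<SUCCESS)
      HError(1100,"HList: InitParm failed");
   /* ... configuration and command-line switch handling, unchanged from the original HList.c ... */
   while (NumArgs() > 0 ) {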
         if (NextArg() != STRINGARG)
            HError(1119,"HList: List file name expected");
         ListSpeech(GetStrArg());
      }
   fclose (stdout);
   Exit(0);
   return (0);          /* never reached -- make compiler happy */
}
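
The redirection trick can be tried on its own, outside HTK. The following is a minimal sketch, not part of HList: the function names run_with_stdout_in_file and print_demo_frame and the sample numbers are made up for illustration. It also checks the return value of freopen, which the snippet above skips for brevity.

```c
#include <stdio.h>

/* Hypothetical stand-in for the code that normally prints to the screen
   (ListSpeech in HList). The numbers are made up. */
void print_demo_frame(void)
{
    printf("0: -12.345 3.210 0.456\n");
}

/* Redirect stdout to the file `path`, run `body`, then close the stream.
   Returns 0 on success and -1 if the file could not be opened or closed. */
int run_with_stdout_in_file(const char *path, void (*body)(void))
{
    /* freopen returns NULL on failure, e.g. for a read-only directory */
    if (freopen(path, "w", stdout) == NULL) {
        perror(path);
        return -1;
    }
    body();                      /* everything printed here lands in the file */
    if (fclose(stdout) != 0)     /* flush and close, as at the end of main */
        return -1;
    return 0;
}
```

Calling run_with_stdout_in_file("output.txt", print_demo_frame) from a main function leaves the printed line in output.txt. Nothing more can be printed to the screen after the fclose call, which is why it belongs at the very end of the program.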


Compile it and run HList with tracing; it will then write the ASCII MFCC values to output.txt in the current directory. You can compile it with gcc or Visual Studio.

Here is my compiled HList; it works on Windows.

You need to be able to read the MFCC values for hybrid approaches to speech recognition, such as HMM + ANN or HMM + SVM. I hope this blog post makes your research a little easier.
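
If rebuilding HList is not convenient, the binary file can also be read directly. This is a different technique from the HList patch described in this post, shown only as a rough sketch. It assumes an uncompressed HTK parameter file in HTK's default big-endian byte order, with the 12-byte header (number of frames, sample period, bytes per frame, parameter kind) followed by 4-byte floats, and a machine that uses 32-bit IEEE floats. The function name htk_to_ascii and the output layout (one frame per line) are my own choices, not something HTK defines.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode big-endian values byte by byte, so the host byte order does not matter. */
static int32_t be32(const unsigned char *b)
{
    return (int32_t)(((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                     ((uint32_t)b[2] << 8) | (uint32_t)b[3]);
}

static int16_t be16(const unsigned char *b)
{
    return (int16_t)(((unsigned)b[0] << 8) | b[1]);
}

static float be_float(const unsigned char *b)
{
    uint32_t bits = (uint32_t)be32(b);
    float f;
    memcpy(&f, &bits, sizeof f);   /* assumes 32-bit IEEE floats on the host */
    return f;
}

/* Write each frame of an uncompressed HTK parameter file as one line of text.
   Returns the number of frames written, or -1 on any error. */
long htk_to_ascii(const char *in_path, const char *out_path)
{
    unsigned char hdr[12], raw[4];
    long frames = -1;
    FILE *in = fopen(in_path, "rb");
    FILE *out = in ? fopen(out_path, "w") : NULL;

    if (in && out && fread(hdr, 1, sizeof hdr, in) == sizeof hdr) {
        int32_t n_samples = be32(hdr);       /* bytes 0-3: number of frames   */
        int16_t samp_size = be16(hdr + 8);   /* bytes 8-9: bytes per frame    */
        int16_t parm_kind = be16(hdr + 10);  /* bytes 10-11: parameter kind   */
        int compressed = (parm_kind & 02000) != 0;   /* _C qualifier bit */

        if (n_samples >= 0 && samp_size > 0 && samp_size % 4 == 0 && !compressed) {
            int n_coefs = samp_size / 4;     /* 4-byte floats per frame */
            int ok = 1;
            for (frames = 0; ok && frames < n_samples; frames++) {
                for (int i = 0; ok && i < n_coefs; i++) {
                    ok = fread(raw, 1, 4, in) == 4;
                    if (ok)
                        fprintf(out, i ? " %.6f" : "%.6f", be_float(raw));
                }
                if (ok)
                    fputc('\n', out);
            }
            if (!ok)
                frames = -1;                 /* file ended too early */
        }
    }
    if (in) fclose(in);
    if (out) fclose(out);
    return frames;
}
```

A call such as htk_to_ascii("speech.mfc", "speech.txt"), where both file names are placeholders, would write one line per frame. Compressed files (the _C qualifier) are rejected rather than decoded, so for those the patched HList remains the safer route.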
