Comments so far (30/07/2006) on VT_DTW
=======================================

For the examples below, it helps to keep the following in mind:
 1. The input speech comes from the file fl1
 2. The MFCCs from that file are projected into the space of V
 3. Using the aligned data, the AR coefficients are taken from flAR
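
 One way to read the three steps above is as a nearest-neighbour lookup.
 The following is only a sketch of that reading, not the actual code of
 LPCas_byexample. The function name lookup_ar_by_example, the Euclidean
 distance, and the layout of AR (one column of AR coefficients per column
 of V) are all assumptions made for illustration.

```matlab
function [ARout, idx] = lookup_ar_by_example(X, V, AR)
% Hypothetical sketch (NOT the real LPCas_byexample).
% X  : OrderMfcc x N   MFCC frames of the input speech (from fl1)
% V  : OrderMfcc x M   aligned MFCC frames (the "space of V")
% AR : P x M           AR coefficients, column m paired with V(:,m) (assumed layout)
N = size(X, 2);
idx = zeros(1, N);
for n = 1:N
    % squared Euclidean distance from input frame n to every column of V
    d = sum((V - repmat(X(:, n), 1, size(V, 2))).^2, 1);
    [~, idx(n)] = min(d);          % index of the closest example
end
ARout = AR(:, idx);                % AR coefficients of the chosen examples
end
```

 Under these assumptions, ARout(:,n) would be the spectral envelope used
 for input frame n, while the excitation still comes from fl1.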
 
 
 Example for testing it:
 ---------------------
 Use the same speaker as both target and source (Thierry)
 and a file already analyzed (from the training data). If everything works well,
 this should reproduce the original signal.

 OrderMffc = 20;                                 % MFCC order (rows of V)
 fp=fopen('../DTW/source_td.aligned.mfcc','rb');
 V = fread(fp,'float');                          % raw 32-bit floats, one long column
 V = reshape(V,OrderMffc,length(V)/OrderMffc);   % one MFCC frame per column
 fclose(fp);
 flAR = '../DTW/source_td.aligned.ar';           % AR coefficients from the aligned data
 fl1 = '../DataSet/eNTERFACE06_us_td_arctic/Training/wav/arctic_a0030.wav';

 LPCas_byexample(fl1, V, flAR);
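
 The same read-and-reshape lines are repeated in every example of these
 notes. As a convenience, they could be wrapped in a small helper. This is
 only a sketch: the name load_aligned_mfcc is hypothetical, and the two
 error checks are additions that the snippets in these notes do not perform.

```matlab
function V = load_aligned_mfcc(fname, order)
% Hypothetical helper: read a raw float32 file into an order x nFrames matrix.
fp = fopen(fname, 'rb');
if fp < 0
    error('Cannot open %s', fname);
end
v = fread(fp, 'float');            % 32-bit floats, returned as a column
fclose(fp);
if mod(length(v), order) ~= 0
    % reshape would fail anyway; report the cause explicitly
    error('%s: %d values is not a multiple of order %d', ...
          fname, length(v), order);
end
V = reshape(v, order, length(v) / order);   % one frame per column
end
```

 With it, the setup above would reduce to
 V = load_aligned_mfcc('../DTW/source_td.aligned.mfcc', 20);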
 
 Check that the file ../DataSet/eNTERFACE06_us_td_arctic/Training/wav/arctic_a0030.wav
 sounds the SAME as the generated output speech file test.wav (later renamed to test_td_a0030.wav).

 RESULTS:
 To me (Yannis) it sounds OK! => the Matlab program works fine
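
 The listening check above could be complemented by a rough numeric one.
 The sketch below computes a signal-to-noise ratio between the original
 and the regenerated samples. It is not part of the original experiment:
 the name snr_between is hypothetical, and it assumes the two signals are
 time-aligned and share the same sampling rate, which may not hold for the
 output of LPCas_byexample.

```matlab
function snr_db = snr_between(x, y)
% Hypothetical helper: SNR in dB of y relative to the reference x.
% Assumes x and y are time-aligned sample vectors at the same sampling rate.
x = x(:);
y = y(:);
n = min(length(x), length(y));     % compare only the common part
x = x(1:n);
y = y(1:n);
noise = sum((x - y).^2);           % energy of the difference signal
snr_db = 10 * log10(sum(x.^2) / noise);   % Inf if identical and x is nonzero
end
```

 The samples could be read with audioread (wavread in older MATLAB
 versions). A high value would support the "sounds the same" impression,
 but a low value alone proves little, since a small delay or phase change
 can be inaudible and still lower the SNR considerably.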

 A second check:
 ------------------
 This one reproduces Thierry's signal for a file from the Validation directory:
 % V is as before
 flAR = '../DTW/source_td.aligned.ar';   % as before
 fl1 = '../DataSet/eNTERFACE06_us_td_arctic/Validation/wav/arctic_a0005.wav';
 LPCas_byexample(fl1, V, flAR);

 Check that the file ../DataSet/eNTERFACE06_us_td_arctic/Validation/wav/arctic_a0005.wav
 sounds SIMILAR to the generated output speech file test.wav (later renamed to test_td_a0005.wav).
 
 RESULTS:
 NOT BAD! (Yannis: I am really surprised!) => it seems we have enough data (at least for this file)
 to generate 'unknown' Source spectral envelopes (i.e., using the excitation from Thierry)

 Let's do the same for Alan's data, for an 'unseen' speech file:
 ---------------------------------------------------------------
 OrderMffc = 20;
 fp=fopen('../DTW/target_awb.aligned.mfcc','rb');
 V = fread(fp,'float');
 V = reshape(V,OrderMffc,length(V)/OrderMffc);
 fclose(fp);
 flAR = '../DTW/target_awb.aligned.ar';
 % 'Unseen' sentence:
 fl1 = '../DataSet/cmu_us_awb_arctic/wav/arctic_a0005.wav';
 LPCas_byexample(fl1, V, flAR);

 Check that the file ../DataSet/cmu_us_awb_arctic/wav/arctic_a0005.wav
 sounds SIMILAR to the generated output speech file test.wav (later renamed to test_awb_a0005.wav).
 
 RESULTS:
 NOT BAD! (Yannis: again, I am really surprised!) => it seems we have enough data (at least for this file)
 to generate 'unknown' Target spectral envelopes (i.e., using the excitation from AWB)
 
 
 A REAL example of running it (it sounds terrible):
 ------------------------------------------------

 OrderMffc = 20;
 fp=fopen('../DTW/source_td.aligned.mfcc','rb');
 V = fread(fp,'float');
 V = reshape(V,OrderMffc,length(V)/OrderMffc);
 fclose(fp);

 flAR = '../DTW/target_awb.aligned.ar';
 fl1 = '../DataSet/eNTERFACE06_us_td_arctic/Validation/wav/arctic_a0005.wav';
 LPCas_byexample(fl1, V, flAR);

 Check whether the file ../DataSet/cmu_us_awb_arctic/wav/arctic_a0005.wav
 sounds SOMEWHAT SIMILAR to the generated output speech file test.wav (later renamed to test_td2awb_a0005.wav).
 
 RESULTS:
 EXTREMELY BAD! We can even recognize Thierry, despite the Target (AWB) spectral envelopes!!
 
 OVERALL IMPRESSION: I think this comes from the alignment. The alignment worked fine on the test data,
 so it seems that here it faces a much harder task, which it does not manage to complete. Work should
 be done on improving the alignment.
 Alternatively, the problem may lie in the excitation (and not in the alignment).
 We should check further!!
 
 
 Yannis Stylianou, 30-31/07/2006
 yannis@csd.uoc.gr


