A First Try: Building a "Yes"/"No" Recognition System, Following "HTK (v.3.1): Basic Tutorial"

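I read through "HTK (v.3.1): Basic Tutorial" and am summarizing it here, with reference to this article. There is a lot I still don't understand, and I need to go back and review Signals and Linear Systems; this graduation project is quite a challenge for me.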

Steps:
1. Build the training corpus
2. Acoustical analysis
3. Define the models
4. Train the models
5. Define the task
6. Recognize an unknown signal
7. Evaluation

Procedure
1. Build the training corpus
a. Record the audio
HSLab name.sig
b. Label the signal
Mark the signal regions in HSLab
2. Acoustical analysis
a. Configuration parameters (analysis.conf)
#
# Example of an acoustical analysis configuration file
#
SOURCEFORMAT = HTK # Gives the format of the speech files
TARGETKIND = MFCC_0_D_A # Identifier of the coefficients to use
# Unit = 0.1 micro-second :
WINDOWSIZE = 250000.0 # = 25 ms = length of a time frame
TARGETRATE = 100000.0 # = 10 ms = frame periodicity
NUMCEPS = 12 # Number of MFCC coeffs (here from c1 to c12)
USEHAMMING = T # Use of Hamming function for windowing frames
PREEMCOEF = 0.97 # Pre-emphasis coefficient
NUMCHANS = 26 # Number of filterbank channels
CEPLIFTER = 22 # Length of cepstral liftering
# The End
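The numbers in this configuration are easier to check with a little arithmetic. The following is a minimal sketch, not part of HTK: the helper names are made up for illustration, and the frame count is the usual sliding-window estimate, which may differ slightly from what HCopy actually writes.

```python
# HTK expresses durations in units of 100 ns (0.1 microsecond).
HTK_UNITS_PER_MS = 10_000


def htk_units_to_ms(units: float) -> float:
    """Convert an HTK time value (100 ns units) to milliseconds."""
    return units / HTK_UNITS_PER_MS


def feature_dim(num_ceps: int, use_c0: bool, use_delta: bool, use_accel: bool) -> int:
    """Vector size implied by a target kind such as MFCC_0_D_A."""
    static = num_ceps + (1 if use_c0 else 0)      # c1..c12 plus c0
    blocks = 1 + int(use_delta) + int(use_accel)  # static, delta, acceleration
    return static * blocks


def frame_count(duration_ms: float, window_ms: float, shift_ms: float) -> int:
    """Approximate number of full analysis windows that fit in the signal."""
    if duration_ms < window_ms:
        return 0
    return int((duration_ms - window_ms) // shift_ms) + 1


window_ms = htk_units_to_ms(250000.0)  # WINDOWSIZE
shift_ms = htk_units_to_ms(100000.0)   # TARGETRATE
print(window_ms, shift_ms)
print(feature_dim(12, True, True, True))
print(frame_count(1000.0, window_ms, shift_ms))
```

The vector size of 39 is the value that has to appear after <VecSize> in the model prototype.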

b. Source/target list (targetlist.txt), one "source file, output file" pair per line
data/train/sig/yes01.sig data/train/mfcc/yes01.mfcc
data/train/sig/yes02.sig data/train/mfcc/yes02.mfcc
etc...
data/train/sig/no01.sig data/train/mfcc/no01.mfcc
data/train/sig/no02.sig data/train/mfcc/no02.mfcc
etc...
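Writing this list by hand gets tedious once there are many recordings. As a rough sketch, assuming the directory layout shown above (data/train/sig for recordings, data/train/mfcc for output), the list could be generated like this; build_target_list is a hypothetical helper, not an HTK tool.

```python
from pathlib import Path


def build_target_list(sig_dir: str, mfcc_dir: str) -> list[str]:
    """Return one 'source target' line per .sig file found in sig_dir."""
    lines = []
    for sig in sorted(Path(sig_dir).glob("*.sig")):
        # Same base name, different directory and extension.
        mfcc = Path(mfcc_dir) / (sig.stem + ".mfcc")
        lines.append(f"{sig.as_posix()} {mfcc.as_posix()}")
    return lines


if __name__ == "__main__":
    entries = build_target_list("data/train/sig", "data/train/mfcc")
    Path("targetlist.txt").write_text("\n".join(entries) + "\n")
```

The output directory still has to exist before HCopy runs; this sketch only writes the list.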

c. Run the acoustical analysis with HCopy
HCopy -C analysis.conf -S targetlist.txt
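3. Define the models
One prototype file per model (hmm_yes, hmm_no, hmm_sil); replace "label" below with the model name (yes, no, sil).
~o <VecSize> 39 <MFCC_0_D_A>
~h "label"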
<BeginHMM>
<NumStates> 6
<State> 2
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 3
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 4
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<State> 5
<Mean> 39
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<Variance> 39
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
<TransP> 6
0.0 0.5 0.5 0.0 0.0 0.0
0.0 0.4 0.3 0.3 0.0 0.0
0.0 0.0 0.4 0.3 0.3 0.0
0.0 0.0 0.0 0.4 0.3 0.3
0.0 0.0 0.0 0.0 0.5 0.5
0.0 0.0 0.0 0.0 0.0 0.0
<EndHMM>
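The three prototypes differ only in the name after ~h, so they can be generated rather than typed. Below is a minimal sketch under that assumption; make_proto is a made-up helper that reproduces the layout shown above and also checks that every row of the transition matrix, except the exit state's, sums to 1.

```python
# Transition matrix copied from the prototype above (6 states, 4 emitting).
TRANS = [
    [0.0, 0.5, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.4, 0.3, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.4, 0.3, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]


def vector_lines(value: float, size: int, per_line: int = 13) -> list[str]:
    """Format `size` copies of `value`, wrapped at `per_line` numbers."""
    tokens = [f"{value:.1f}"] * size
    return [" ".join(tokens[i:i + per_line]) for i in range(0, size, per_line)]


def make_proto(name: str, vec_size: int = 39, kind: str = "MFCC_0_D_A") -> str:
    """Return the text of a prototype HMM definition for model `name`."""
    for row in TRANS[:-1]:
        # Outgoing probabilities of every non-exit state must sum to 1.
        assert abs(sum(row) - 1.0) < 1e-9, row
    n = len(TRANS)
    lines = [f"~o <VecSize> {vec_size} <{kind}>", f'~h "{name}"',
             "<BeginHMM>", f"<NumStates> {n}"]
    for state in range(2, n):  # emitting states are 2 .. n-1
        lines += [f"<State> {state}", f"<Mean> {vec_size}"]
        lines += vector_lines(0.0, vec_size)
        lines += [f"<Variance> {vec_size}"]
        lines += vector_lines(1.0, vec_size)
    lines.append(f"<TransP> {n}")
    lines += [" ".join(f"{p:.1f}" for p in row) for row in TRANS]
    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_proto("yes"))
```

Writing make_proto("yes"), make_proto("no") and make_proto("sil") to model/proto/hmm_yes, hmm_no and hmm_sil would give the three files used in the training step.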

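4. Train the models
a. Initialization (each command below is shown for one model; HInit is repeated for yes, no and sil)
HInit -S trainlist.txt -M model\hmm0 -H model\proto\hmm_sil -l sil -L data\train\lab sil
HCompV -S trainlist.txt -M model/hmm0flat -H model/proto/hmm_no -f 0.01 no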

b. Training (re-estimation)
HRest -S trainlist.txt -M model/hmm1 -H model/hmm0flat/vFloors -H model/hmm0/hmm_yes -l yes -L data/train/lab yes
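This command kept failing with an error and I don't know why. After removing -H model/hmm0flat/vFloors it ran normally, so the commands below omit it: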

HRest -S trainlist.txt -M model/hmm1 -H model/hmm0/hmm_yes -l yes -L data\train\lab yes
HRest -S trainlist.txt -M model/hmm2 -H model/hmm1/hmm_yes -l yes -L data\train\lab yes
HRest -S trainlist.txt -M model/hmm3 -H model/hmm2/hmm_yes -l yes -L data\train\lab yes
HRest -S trainlist.txt -M model/hmm1 -H model/hmm0/hmm_no -l no -L data\train\lab no
HRest -S trainlist.txt -M model/hmm2 -H model/hmm1/hmm_no -l no -L data\train\lab no
HRest -S trainlist.txt -M model/hmm3 -H model/hmm2/hmm_no -l no -L data\train\lab no
HRest -S trainlist.txt -M model/hmm1 -H model/hmm0/hmm_sil -l sil -L data\train\lab sil
HRest -S trainlist.txt -M model/hmm2 -H model/hmm1/hmm_sil -l sil -L data\train\lab sil
HRest -S trainlist.txt -M model/hmm3 -H model/hmm2/hmm_sil -l sil -L data\train\lab sil
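The nine HRest calls follow one pattern: iteration i reads the model from model/hmm(i-1) and writes it to model/hmm(i). A small sketch that builds the same command lines is shown below; hrest_commands is a hypothetical helper, and it uses forward slashes throughout on the assumption that they are accepted wherever the backslash form is.

```python
def hrest_commands(models=("yes", "no", "sil"), iterations=3):
    """Build the HRest argument lists for every model and iteration."""
    commands = []
    for name in models:
        for i in range(1, iterations + 1):
            commands.append([
                "HRest",
                "-S", "trainlist.txt",
                "-M", f"model/hmm{i}",                  # output directory
                "-H", f"model/hmm{i - 1}/hmm_{name}",   # previous estimate
                "-l", name,                             # label to train on
                "-L", "data/train/lab",                 # label directory
                name,
            ])
    return commands


if __name__ == "__main__":
    for cmd in hrest_commands():
        print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) to execute it, provided HTK is on the PATH and the model/hmm1 to model/hmm3 directories exist.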

5. Define the task
a. Grammar (gram.txt)
/*
* Task grammar
*/
$WORD = YES | NO;
( { START_SIL } [ $WORD ] { END_SIL } )
b. Dictionary (dict.txt)
YES [yes] yes
NO [no] no
START_SIL [sil] sil
END_SIL [sil] sil

c. Build the word network with HParse, then check it by generating sample sentences with HSGen
HParse -A -D -T 1 def/gram.txt def/net.slf
HSGen -A -D -n 10 -s def/net.slf def/dict.txt
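To see what the grammar in gram.txt accepts, recall that { } means zero or more repetitions and [ ] means optional. The sketch below is not HSGen, only an illustration: it lists every sentence of the grammar when each silence is repeated at most max_sil times, a bound introduced here to keep the list finite.

```python
from itertools import product


def sentences(max_sil: int = 1) -> list[str]:
    """Enumerate ( { START_SIL } [ $WORD ] { END_SIL } ) with bounded repeats."""
    words = (None, "YES", "NO")  # None = the optional $WORD is skipped
    result = []
    for n_start, word, n_end in product(range(max_sil + 1), words,
                                        range(max_sil + 1)):
        tokens = ["START_SIL"] * n_start
        if word is not None:
            tokens.append(word)
        tokens += ["END_SIL"] * n_end
        result.append(" ".join(tokens))
    return result


if __name__ == "__main__":
    for s in sentences(1):
        print(repr(s))
```

With max_sil=1 this gives 12 sentences, including the empty one and "START_SIL YES END_SIL", which is the typical shape of a recorded utterance.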

6. Recognize an unknown signal with HVite
HVite -A -D -T 1 -H model/hmm3/hmm_yes -H model/hmm3/hmm_no -H model/hmm3/hmm_sil -i reco.mlf -w def/net.slf def/dict.txt hmmlist.txt input.mfcc
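hmmlist.txt is the list of HMM names, not an HMM definition file. I confused the two and was stuck on this for a long time.
7. Evaluation (omitted)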
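Since hmmlist.txt only holds model names, one per line, a quick consistency check against dict.txt can catch this kind of mix-up. The following is a minimal sketch; dict_model_names and missing_models are illustrative helpers, and the parser only handles the simple "WORD [output] model ..." lines used in this post.

```python
DICT_TXT = """\
YES [yes] yes
NO [no] no
START_SIL [sil] sil
END_SIL [sil] sil
"""


def dict_model_names(dict_text: str) -> set[str]:
    """Collect the model names used as pronunciations in the dictionary."""
    names = set()
    for line in dict_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        rest = parts[1:]                   # drop the word itself
        if rest and rest[0].startswith("["):
            rest = rest[1:]                # drop the optional [output] symbol
        names.update(rest)
    return names


def missing_models(dict_text: str, hmm_list: list[str]) -> set[str]:
    """Model names the dictionary needs but the HMM list does not contain."""
    return dict_model_names(dict_text) - set(hmm_list)


if __name__ == "__main__":
    hmm_list = ["yes", "no", "sil"]        # intended contents of hmmlist.txt
    print(sorted(dict_model_names(DICT_TXT)))
    print(missing_models(DICT_TXT, hmm_list))
```

An empty result from missing_models means every pronunciation in the dictionary has a matching entry in the list.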

3 thoughts on 'A First Try: Building a "Yes"/"No" Recognition System, Following "HTK (v.3.1): Basic Tutorial"'

  1. zoe

    Hi, why do I always get this error when I type the command HSLab name.sig to record yes/no?
    D:\htk>HSLab name.sig
    ERROR [+6870] MakeXGraf: Not compiled with X11 support: use HGraf.X.c
    FATAL ERROR – Terminating program HSLab

  2. 匿名

    How do I create a configuration file? I don't know how to create a ".conf" file. Thanks.
