Building an Online Continuous Speech Recognition System with nginx, PHP, Redis, and HTK

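Preface
The previous article described how to build a continuous speech recognition system with HTK. This article describes how to put that system online.
nginx and PHP accept the WAV audio file that a user uploads for recognition, HTK performs the recognition, and Redis acts as the communication bridge between the two parts.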

Common commands:

/etc/init.d/nginx restart
tail -f /var/log/nginx/error.log
vi /etc/nginx/nginx.conf
/etc/init.d/php5-fpm start
php5-fpm -i

Installing nginx

apt-get update
apt-get upgrade
apt-get install nginx

Starting nginx

service nginx start # start the nginx service
ps -elf | grep nginx # check whether nginx is running
/etc/init.d/nginx restart # restart nginx

Viewing the nginx logs

vi /etc/nginx/nginx.conf  # find the log file locations
tail -f /var/log/nginx/error.log # follow the error log
tail -f /var/log/nginx/access.log # follow the access log

Testing that nginx is running

root@24aa86ae75b9:/home/vrgroup/Desktop# curl localhost # if curl is missing: apt-get install curl
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@24aa86ae75b9:/home/vrgroup/Desktop#

Installing PHP

apt-get install php5-fpm
/etc/init.d/php5-fpm start

Configuring PHP support in nginx

vim /etc/nginx/sites-available/default
------ # uncomment part of the file so that the following remains
location ~ \.php$ {
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    # NOTE: You should have "cgi.fix_pathinfo = 0;" in php.ini

    # With php5-cgi alone:
    # fastcgi_pass 127.0.0.1:9000;
    # With php5-fpm:
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    fastcgi_index index.php;
    include fastcgi_params;
}
-------

Create upload.php in the root directory set in /etc/nginx/sites-available/default, i.e. /usr/share/nginx/html:

touch /usr/share/nginx/html/upload.php
Add the following content:
<?php
echo "test";
?>

Testing the configuration

/etc/init.d/nginx restart
root@24aa86ae75b9:/home/vrgroup/Desktop# curl localhost/upload.php
test
root@24aa86ae75b9:/home/vrgroup/Desktop#

Configuring PHP

If the response is not what you expect, check the nginx error log.

tail -f /var/log/nginx/error.log

If the log reports that /var/run/php5-fpm.sock cannot be found, configure php5-fpm as follows:
vi /etc/php5/fpm/pool.d/www.conf
add "listen = /var/run/php5-fpm.sock" under "listen = 127.0.0.1:9000"
/etc/init.d/php5-fpm restart
chmod 666 /var/run/php5-fpm.sock

ifconfig # show this machine's IP address; if ifconfig is missing: apt-get install net-tools
curl http://172.17.0.2/upload.php # test that the site works
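
One common cause of the missing-socket error is that the fastcgi_pass value in nginx does not match the listen value in the php5-fpm pool file. The sketch below prints the listen value in effect in a pool file so it can be compared with fastcgi_pass. The function name fpm_listen is hypothetical, and the sketch assumes that when listen appears more than once, the last occurrence takes effect.

```shell
# Hypothetical helper: print the active listen value from a php-fpm pool file.
# Commented-out lines (starting with ';') and keys such as listen.owner are ignored.
# Assumption: if listen is set more than once, the last occurrence wins.
fpm_listen() {
    sed -n 's/^[[:space:]]*listen[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}
```

Calling fpm_listen /etc/php5/fpm/pool.d/www.conf after the edit above should print the socket path that fastcgi_pass refers to (without the unix: prefix).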

Installing Redis

apt-get install redis-server # install Redis
/etc/init.d/redis-server start # start Redis
redis-cli ping # a reply of PONG means the installation succeeded

Installing the Redis extension for PHP

apt-get install php5-dev
root@24aa86ae75b9:/home/vrgroup/Desktop# git clone https://github.com/phpredis/phpredis.git # if git is missing: apt-get install git
root@24aa86ae75b9:/home/vrgroup/Desktop# cd phpredis/
root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis# phpize
Configuring for:
PHP Api Version: 20121113
Zend Module Api No: 20121212
Zend Extension Api No: 220121212
root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis#

root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis# ./configure
root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis# make && make install
Build complete.
Don't forget to run 'make test'.

Installing shared extensions: /usr/lib/php5/20121212/
root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis#

Configuring php.ini

php5-fpm -i | grep ini # find the location of php.ini
vi /etc/php5/fpm/php.ini
;--- append the following line at the end of the file
extension="/usr/lib/php5/20121212/redis.so"
;---

/etc/init.d/php5-fpm restart

Checking that the PHP Redis extension is installed

php5-fpm -i | grep redis # check that the extension is loaded
root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis# php5-fpm -i |grep redis
redis
Registered save handlers => files user redis rediscluster redis rediscluster
PWD => /home/vrgroup/Desktop/phpredis
_SERVER["PWD"] => /home/vrgroup/Desktop/phpredis
This program is free software; you can redistribute it and/or modify
root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis#

Building the Redis C client library (hiredis)

root@24aa86ae75b9:/home/vrgroup/Desktop/phpredis# cd ..
root@24aa86ae75b9:/home/vrgroup/Desktop# git clone https://github.com/redis/hiredis.git
root@24aa86ae75b9:/home/vrgroup/Desktop# cd hiredis/
root@24aa86ae75b9:/home/vrgroup/Desktop/hiredis# make 32bit
root@24aa86ae75b9:/home/vrgroup/Desktop/hiredis# make install
mkdir -p /usr/local/include/hiredis /usr/local/lib
cp -a hiredis.h async.h read.h sds.h adapters /usr/local/include/hiredis
cp -a libhiredis.so /usr/local/lib/libhiredis.so.0.13
cd /usr/local/lib && ln -sf libhiredis.so.0.13 libhiredis.so
cp -a libhiredis.a /usr/local/lib
mkdir -p /usr/local/lib/pkgconfig
cp -a hiredis.pc /usr/local/lib/pkgconfig
root@24aa86ae75b9:/home/vrgroup/Desktop/hiredis#

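Compiling Redis support into HDecode

Modify the HDecode source so that it recognizes the utterances whose feature-file paths are passed in through Redis.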

cd /home/vrgroup/Desktop/htk/HTKLVRec
root@24aa86ae75b9:/home/vrgroup/Desktop/htk/HTKLVRec# cp HDecode.c RedisToHDecode.c
Edit RedisToHDecode.c.
Add the hiredis.h header:
#include "hiredis.h"
Adapt the following code from hiredis/examples/example.c and copy it into the main function of RedisToHDecode.c:
char name[512];
redisContext *c;
redisReply *reply;
const char *hostname = "127.0.0.1";
int port = 6379;

struct timeval timeout = { 1, 500000 };
c = redisConnectWithTimeout(hostname, port, timeout);
if (c == NULL || c->err) {
    if (c) {
        printf("Connection error: %s\n", c->errstr);
        redisFree(c);
    } else {
        printf("Connection error: can't allocate redis context\n");
    }
    exit(1);
}

Replace the DoRecognition (dec, datafn) call in main with the following code:
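while (1) {
    char *s;  /* scratch pointer used to cut off the file extension */

    printf("BRPOP filesQueue 0\n");
    printf("block for redis filesQueue. to wakeup it use 'LPUSH filesQueue plpfile' --plpfile was the absolute path of php file to be recognitioned\n");
    /* Block until the PHP side pushes the absolute path of a .plp feature file. */
    reply = redisCommand(c, "BRPOP filesQueue 0");
    if (reply == NULL || reply->type != REDIS_REPLY_ARRAY || reply->elements < 2) {
        printf("BRPOP failed\n");
        if (reply) freeReplyObject(reply);
        exit(1);
    }
    printf("GET file: %s\n", reply->element[1]->str);
    DoRecognition (dec, reply->element[1]->str);

    /* Cut the path at the first '.', then skip the 30-character prefix
       "/home/vrgroup/Desktop/htk/plp/" to obtain the bare file name. */
    s = reply->element[1]->str;
    while (*s != '.' && *s != '\0') s++;
    *s = '\0';
    sprintf(name, "%s", reply->element[1]->str + 30);
    printf("%s", name);

    /* Signal completion on the list named after the file; upload.php waits on it. */
    freeReplyObject(redisCommand(c, "LPUSH %s %s", name, "1"));
    freeReplyObject(reply);
}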

Edit the Makefile.
Add RedisToHDecode to the all: target,
and add the following rules:
# binaries
RedisToHDecode: RedisToHDecode.orig.o HLVNet.orig.o HLVRec.orig.o HLVLM.orig.o HLVModel.orig.o $(HTKLIB)
	$(CC) $(CFLAGS) -o RedisToHDecode RedisToHDecode.orig.o HLVNet.orig.o HLVRec.orig.o HLVLM.orig.o \
	HLVModel.orig.o $(HTKLIB) $(LDFLAGS) -L. -lhiredis


RedisToHDecode.orig.o: RedisToHDecode.c $(HEADER)
	$(CC) -c $(CFLAGS) $<
	mv RedisToHDecode.o $@

Copy the hiredis header files and shared library into the HTKLVRec directory:
root@24aa86ae75b9:/home/vrgroup/Desktop/hiredis# ls|grep "\.h" | xargs -i cp {} /home/vrgroup/Desktop/htk/HTKLVRec/
root@24aa86ae75b9:/home/vrgroup/Desktop/hiredis# cp libhiredis.so /home/vrgroup/Desktop/htk/HTKLVRec/

Building the RedisToHDecode tool from the modified HDecode source

root@24aa86ae75b9:/home/vrgroup/Desktop/htk/HTKLVRec# make
gcc -c -DNO_LAT_LM -m32 -ansi -D_SVID_SOURCE -DOSS_AUDIO -D'ARCH="x86_64"' -Wall -Wno-switch -g -O2 -I../HTKLib RedisToHDecode.c
In file included from hiredis.h:40:0,
from RedisToHDecode.c:60:
sds.h:50:1: error: unknown type name 'inline'

As the error message indicates, delete the two occurrences of inline in sds.h.

root@24aa86ae75b9:/home/vrgroup/Desktop/htk/HTKLVRec# make
gcc -c -DNO_LAT_LM -m32 -ansi -D_SVID_SOURCE -DOSS_AUDIO -D'ARCH="x86_64"' -Wall -Wno-switch -g -O2 -I../HTKLib RedisToHDecode.c
RedisToHDecode.c: In function 'CreateBestInfo':
RedisToHDecode.c:833:10: warning: variable 'lnLabId' set but not used [-Wunused-but-set-variable]
LabId lnLabId;
^
RedisToHDecode.c: In function 'DoRecognition':
RedisToHDecode.c:954:15: warning: variable 'changed' set but not used [-Wunused-but-set-variable]
Boolean changed;
^
In file included from hiredis.h:40:0,
from RedisToHDecode.c:60:
RedisToHDecode.c: At top level:
sds.h:50:15: warning: 'sdslen' defined but not used [-Wunused-function]
static size_t sdslen(const sds s) {
^
sds.h:55:15: warning: 'sdsavail' defined but not used [-Wunused-function]
static size_t sdsavail(const sds s) {
^
mv RedisToHDecode.o RedisToHDecode.orig.o
gcc -DNO_LAT_LM -m32 -ansi -D_SVID_SOURCE -DOSS_AUDIO -D'ARCH="x86_64"' -Wall -Wno-switch -g -O2 -I../HTKLib -o RedisToHDecode RedisToHDecode.orig.o HLVNet.orig.o HLVRec.orig.o HLVLM.orig.o \
HLVModel.orig.o ../HTKLib/HTKLiblv.a -L/usr/X11R6/lib -lm -L. -lhiredis
root@24aa86ae75b9:/home/vrgroup/Desktop/htk/HTKLVRec#

Running RedisToHDecode

root@24aa86ae75b9:/home/vrgroup/Desktop/htk/HTKLVRec# cd ../model/
root@24aa86ae75b9:/home/vrgroup/Desktop/htk/model# cp ../HTKLVRec/RedisToHDecode .
root@24aa86ae75b9:/home/vrgroup/Desktop/htk/model# ./RedisToHDecode -A -D -V -T 1 -C hdecode.hlda.cfg -H S2.hlda.MMF -y rec -t 250.0 250.0 -u 3500 -v 125.0 -s 12.0 -p -10.0 -w MedArch-3gram -i out 64k.decode.dct xwrd.clustered.mlist 1501.plp
Reading dictionary from 64k.decode.dct
Reading acoustic models...Read 30782 physical / 1587926 logical HMMs
lmla count 86584
nprons 63904
Reading language model from MedArch-3gram
n1 . .
n2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
n3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . .
File: 1501.plp
BRPOP filesQueue 0
block for redis filesQueue. to wakeup it use 'LPUSH filesQueue plpfile' --plpfile was the absolute path of php file to be recognitioned
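
With the decoder blocked on BRPOP, the handshake can be exercised by hand from a second terminal, without going through PHP. The following is a minimal sketch, assuming redis-cli is on the PATH and the feature file already exists under /home/vrgroup/Desktop/htk/plp/ (the decoder skips that 30-character prefix) with no dot in its name other than the .plp suffix. The function names completion_key and submit_plp are hypothetical.

```shell
# Hypothetical helpers for driving RedisToHDecode by hand through redis-cli.

# Derive the completion key the decoder pushes to:
# the file name without its directory and without the .plp suffix.
completion_key() {
    basename "$1" .plp
}

# Queue a feature file, then wait until the decoder signals that it is done.
submit_plp() {
    local plp="$1" key
    key="$(completion_key "$plp")"
    redis-cli LPUSH filesQueue "$plp" > /dev/null   # wakes the decoder's BRPOP
    redis-cli BLPOP "$key" 0 > /dev/null            # blocks until the decoder runs LPUSH <key> 1
    echo "decoded: $key"
}
```

For example, submit_plp /home/vrgroup/Desktop/htk/plp/1501.plp queues the file and returns once the decoder has written its result to the out file.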

Passing the uploaded audio file from PHP to RedisToHDecode through Redis

vim /usr/share/nginx/html/upload.php
Edit upload.php so that it contains the following:
<?php
// Connect to the same local Redis server that RedisToHDecode uses.
$redis = new Redis();
$redis->pconnect('127.0.0.1', 6379);

// Strip the 4-character extension (".wav") from the uploaded file name.
$name = substr(basename($_FILES['file']['name']), 0, -4);
$target_path = "./uploads/"; // directory for received files
$target_path = $target_path . basename($_FILES['file']['name']);

$plp_path = "/home/vrgroup/Desktop/htk/plp/";
$plp_name = $plp_path . $name . ".plp";

if (move_uploaded_file($_FILES['file']['tmp_name'], $target_path)) {
    // Extract PLP features with HCopy; escapeshellarg() quotes both paths for the shell.
    exec("/usr/local/bin/HCopy -C /home/vrgroup/Desktop/htk/model/config-plp " . escapeshellarg($target_path) . " " . escapeshellarg($plp_name));
    $redis->lPush('filesQueue', $plp_name); // hand the feature file to RedisToHDecode
    $redis->blPop($name, 0);                // block until RedisToHDecode signals completion
    $getcmd="iconv -f GBK -t utf-8 /home/vrgroup/Desktop/htk/model/out | awk '/$name/,/SENT_END/'|awk '{if (NF==4){ printf $3}}'";
    exec($getcmd, $result);
    if (!empty($result[0])) echo substr($result[0], 11, -9);
} else {
    echo "There was an error uploading the file, please try again!" . $_FILES['file']['error'];
}
?>

Creating the directory for uploaded files

mkdir /usr/share/nginx/html/uploads
chmod 777 /usr/share/nginx/html/uploads

Creating the directory for feature files

mkdir /home/vrgroup/Desktop/htk/plp
chmod 777 /home/vrgroup/Desktop/htk/plp

Testing the online system

hucd@ubuntu:/media/hucd/disk$ curl -F file=@./1501.wav -F filename=1501.plp http://172.17.0.2/upload.php
患者觉气喘有所改善咳嗽少见少许白色粘痰无夜间阵发性呼吸困难hucd@ubuntu:/media/hucd/disk$
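
The curl call above can be wrapped in a small function for repeated testing. This is only a sketch: the function names upload_url and recognize_wav are hypothetical, the default host is the container address used in this article, and only the file form field is sent because that is the only field upload.php reads.

```shell
# Hypothetical helper: build the upload URL for a given host.
upload_url() {
    printf 'http://%s/upload.php\n' "$1"
}

# Hypothetical helper: upload a WAV file and print the text returned by upload.php.
recognize_wav() {
    local wav="$1" host="${2:-172.17.0.2}"
    # --fail makes curl return a non-zero status on HTTP errors;
    # the trailing echo adds the newline that upload.php does not emit.
    curl --silent --show-error --fail -F "file=@${wav}" "$(upload_url "$host")" && echo
}
```

A call such as recognize_wav ./1501.wav then prints the recognized sentence on its own line instead of running it into the shell prompt.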