Improving Speaker Identification Performance by Combining Vocal Tract Features

S. Selva Nidhyananthan; R. Shantha Selva Kumari; G. Jaffino

International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 3 - Number 1
Year of Publication: 2012
Authors: S. Selva Nidhyananthan, R. Shantha Selva Kumari, G. Jaffino
http:/ijais12-450433

S. Selva Nidhyananthan, R. Shantha Selva Kumari, and G. Jaffino. 2012. Improving Speaker Identification Performance by Combining Vocal Tract Features. International Journal of Applied Information Systems. 3, 1 (July 2012), 27-33. DOI=http://dx.doi.org/10.5120/ijais450433

@article{10.5120/ijais2017451568,
author = {S. Selva Nidhyananthan and R. Shantha Selva Kumari and G. Jaffino},
title = {Improving Speaker Identification Performance by Combining Vocal Tract Features},
journal = {International Journal of Applied Information Systems},
issue_date = {July 2012},
volume = {3},
number = {1},
month = {July},
year = {2012},
issn = {},
pages = {27-33},
numpages = {},
url = {/archives/volume3/number1/197-0433},
doi = { http:/ijais12-450433},
publisher = {© 2010 by IJAIS Journal},
address = {}
}

%1 450433
%A S. Selva Nidhyananthan
%A R. Shantha Selva Kumari
%A G. Jaffino
%T Improving Speaker Identification Performance by Combining Vocal Tract Features
%J International Journal of Applied Information Systems
%@ 
%V 3
%N 1
%P 27-33
%D 2012
%I © 2010 by IJAIS Journal

Abstract

This paper proposes fusion and addition techniques for combining vocal tract features, namely Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Mel Frequency Cepstral Coefficients (DMFCC), in speaker identification. Feature extraction plays an important role as the front-end processing block of the Speaker Identification (SI) process. Mel frequency features capture spectral characteristics of speech such as formant frequencies and formant bandwidths, and this feature estimation method leads to robust recognition performance. Dynamic Mel frequency features capture the dynamic behavior of the human vocal tract using pitch frequency. The work aims to increase identification accuracy on databases containing short speech signals. Experimental evaluation is carried out on the TIMIT database with 630 speakers using a Gaussian Mixture Model (GMM).
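
The pipeline outlined in the abstract (static features, dynamic features, fusion, GMM scoring) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the DMFCC formula, so standard regression-based delta coefficients stand in for the dynamic features (the paper's pitch-based formulation is not reproduced), "fusion" is taken to mean per-frame concatenation, and the function names (`delta`, `fuse`, `gmm_log_likelihood`, `identify`) and the toy model parameters are hypothetical. GMM training (for example by EM) and MFCC extraction from audio are omitted.

```python
import math

def delta(frames, width=2):
    """Regression-based delta coefficients (an assumed stand-in for DMFCC)."""
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]  # clamp index at the last frame
                prv = frames[max(t - k, 0)][d]      # clamp index at the first frame
                num += k * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out

def fuse(static, dynamic):
    """Feature-level fusion, assumed here to be per-frame concatenation."""
    return [s + d for s, d in zip(static, dynamic)]

def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            comp.append(ll)
        peak = max(comp)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total / len(frames)

def identify(frames, speaker_models):
    """Return the speaker whose GMM scores the frames highest."""
    return max(speaker_models,
               key=lambda name: gmm_log_likelihood(frames, *speaker_models[name]))

# Toy data: four 2-dimensional "static" frames (hypothetical values, not MFCCs).
static = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
fused = fuse(static, delta(static))  # each fused frame has 4 dimensions

# Hypothetical single-component models: (weights, means, variances).
models = {
    "spk_a": ([1.0], [[1.5, 1.0, 0.6, 0.0]], [[1.0, 1.0, 1.0, 1.0]]),
    "spk_b": ([1.0], [[-5.0, 1.0, 0.0, 0.0]], [[1.0, 1.0, 1.0, 1.0]]),
}
print(identify(fused, models))  # prints "spk_a"
```

Concatenation doubles the feature dimension, so each speaker model must be built on the fused vectors rather than on the static features alone. Scoring by average log-likelihood keeps utterances of different lengths comparable, which matters for the short speech signals the abstract targets.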

Keywords

DMFCC, MFCC, GMM, Feature Extraction, Speaker Identification

Index Terms

Computer Science

Information Sciences
