Exploiting Speaker Recognition for Home Security Applications
Department of Engineering Science
Privacy is now regarded as an important issue for every individual. The conventional approach of authenticating with a combination of username and password is becoming less secure. Biometrics, by contrast, are emerging as authentication features because they are inherently unique and measurable characteristics that can be used to identify a person. Among the several types of biometrics, voice data is generally among the easier ones to obtain from a person. To improve home security, our research uses speaker recognition technology, which identifies a person from the characteristics of the human voice. Speaker recognition techniques can effectively extract a person's vocal tract features. Because vocal tract shapes, larynx sizes, and other parts of the voice production organs differ from person to person, no two individuals sound identical. In prior works, Mel-Frequency Cepstral Coefficients (MFCCs) have been shown to describe and capture vocal tract characteristics effectively. The extracted features are then processed by a vector quantization approach to create a voice model. During identification, a speech sample or utterance is compared against a previously created voice model. Our prototype system has two main functions: speaker recognition and history review. Experimental results show that our system can instantly and accurately identify family members in the home environment. Moreover, strangers can be detected so that family members are actively alerted.
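The enrollment and identification steps described above can be illustrated with a minimal sketch. It is not the system reported in this work: it assumes plain k-means as the vector quantizer (rather than the GQSOM studied in Chapter 3), uses synthetic 12-dimensional vectors in place of real MFCC frames, and the function names, speaker labels, and rejection threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def train_codebook(features, k=4, iters=20, seed=0):
    """Build a k-entry VQ codebook with plain k-means (an assumed quantizer)."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords from k distinct training vectors.
    codebook = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign every feature vector to its nearest codeword.
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for j in range(k):
            members = features[nearest == j]
            if len(members) > 0:  # keep the old codeword if its cell is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average distance from each feature vector to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

def identify(features, models, threshold):
    """Return the best-matching enrolled speaker, or None for a stranger."""
    scores = {name: distortion(features, cb) for name, cb in models.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= threshold else None

# Synthetic vectors stand in for MFCC frames (assumption, not real speech data).
rng = np.random.default_rng(42)

def fake_frames(center, n=200):
    return center + rng.normal(scale=0.5, size=(n, 12))

center_a = np.full(12, 0.0)   # hypothetical family member A
center_b = np.full(12, 5.0)   # hypothetical family member B
models = {
    "member_a": train_codebook(fake_frames(center_a)),
    "member_b": train_codebook(fake_frames(center_b)),
}

# The threshold of 3.0 is an illustrative value chosen for this synthetic data.
who_known = identify(fake_frames(center_b, n=50), models, threshold=3.0)
who_stranger = identify(fake_frames(np.full(12, -8.0), n=50), models, threshold=3.0)
print(who_known, who_stranger)
```

In this sketch an utterance is attributed to the enrolled model with the lowest average quantization distortion, and it is treated as coming from a stranger when even the best model exceeds the threshold. In practice the threshold would have to be tuned on real recordings.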
Chapter 1 Introduction
1.1 Motivation and Overview
1.2 Contributions of this Work
Chapter 2 Preliminaries
2.1 Biometrics Authentication for Home Security
2.2 Overview of a Speaker Recognition System
2.3 Variants of Spoken Input in Speaker Recognition Systems
2.4 Preprocessing of Voice Data
Chapter 3 Utilizing GQSOM for Voice Modeling
3.1 Voice Modeling
3.1.1 Usage of K-means
3.1.2 Usage of the Self-Organizing Map
3.1.3 Usage of the Growing Quadtree Self-Organizing Map
3.2 Exceptional Enrollment and Identification for Family Members
Chapter 4 Empirical Studies
4.1 Proposed Scheme
4.2 Experimental Results
4.3 Proposed Speaker Recognition System
Chapter 5 Conclusions and Future Works