Speaker Verification System using Wavelet Transform and Neural Network for short utterances

  • Krishna Sarma
  • Fidalizia Pyrtuh
  • Debarun Chakraborty
Keywords: Speaker verification system, short utterances, Discrete Wavelet Transform (DWT), Discrete Cosine Transform (DCT), Feedforward Neural Network (FFNN), Hilbert Transform.

Abstract

 In this paper, a wavelet transform technique and a neural network are used to develop a Speaker Verification System for short utterances. The sampled data undergo four-level wavelet decomposition. The Discrete Cosine Transform (DCT) is then applied to the decomposed data to improve the feature extraction process. The Hilbert Transform is also explored to analyze the contribution of magnitude and phase to speaker classification, and their performance is reported. The resulting features are fed to a feed-forward back-propagation neural network for classification. The proposed technique is evaluated on the fixed-phrase portion of the RedDots dataset and on a self-recorded numerical dataset, and it achieves a recognition rate of up to 95%.
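
The sketch below is a rough illustration of the pipeline described above: four-level DWT, DCT on the sub-band coefficients, Hilbert magnitude/phase features, and a feed-forward classifier. The mother wavelet (db4), the number of coefficients kept per band, the exact ordering of the DCT and Hilbert steps, the synthetic data, and the use of scikit-learn's MLPClassifier in place of the paper's back-propagation network are all assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the described feature-extraction and classification pipeline.
# Assumptions: 'db4' wavelet, 20 coefficients kept per sub-band, MLPClassifier
# standing in for the feed-forward back-propagation network, synthetic signals
# standing in for the RedDots / recorded-digit utterances.
import numpy as np
import pywt                              # discrete wavelet transform
from scipy.fft import dct                # discrete cosine transform
from scipy.signal import hilbert         # analytic signal (Hilbert transform)
from sklearn.neural_network import MLPClassifier

def extract_features(signal, wavelet="db4", levels=4, n_keep=20):
    """Four-level DWT, then DCT and Hilbert magnitude/phase per sub-band."""
    coeffs = pywt.wavedec(signal, wavelet, level=levels)      # [cA4, cD4, cD3, cD2, cD1]
    feats = []
    for band in coeffs:
        feats.append(dct(band, norm="ortho")[:n_keep])         # energy-compacting DCT
        analytic = hilbert(band)                               # analytic signal of the sub-band
        feats.append(np.abs(analytic)[:n_keep])                # Hilbert magnitude
        feats.append(np.unwrap(np.angle(analytic))[:n_keep])   # Hilbert (unwrapped) phase
    return np.concatenate(feats)

# Illustrative training on synthetic "utterances" from two hypothetical speakers.
rng = np.random.default_rng(0)
X = np.stack([extract_features(rng.standard_normal(16000)) for _ in range(40)])
y = np.repeat([0, 1], 20)
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```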

Published
2020-04-15
How to Cite
Sarma, K., Pyrtuh, F., & Chakraborty, D. (2020). Speaker Verification System using Wavelet Transform and Neural Network for short utterances. Asian Journal For Convergence In Technology (AJCT) ISSN -2350-1146, 6(1), 30-35. https://doi.org/10.33130/AJCT.2020v06i01.006
