A study on Speaker Recognition System

dc.Location: 2015 T 58.6 B35
dc.Supervisor: Professor Khaled Shalaan
dc.contributor.author: Bakkar, Hazem Wa'il Mohammed
dc.date.accessioned: 2020-11-15T13:19:10Z
dc.date.available: 2020-11-15T13:19:10Z
dc.date.issued: 2015-05
dc.description.abstract: "The rapid development of information technology has exposed a growing number of security gaps in everyday systems such as email accounts. Developers and manufacturers of security systems are trying hard to cope with the increasing number of security breaches. The need to overcome this challenge has led many researchers and manufacturers to consider adding extra levels of security to protect information and resources; these extra levels mainly revolve around using human biometrics to verify the real identity of the user. Speaker recognition methods are considered a leading approach in biometric security systems. In this thesis we aimed to develop a unique speaker recognition system with a user-friendly interface. The proposed system was developed mainly in Python (Python.org, 2015) and was used to implement and study several methods and techniques in the speaker recognition domain. Another main goal of this research was to make a scientific comparison between tools and methods related to speaker recognition. The following techniques were studied: 1) an energy-based tool and Long-Term Spectral Divergence (LTSD) in the preprocessing module of the system; 2) Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Cepstral Coefficients (LPCC) in the feature extraction module; and 3) the scikit-learn Gaussian Mixture Model (GMM), Universal Background Model (UBM), Continuous Restricted Boltzmann Machine (CRBM) and Joint Factor Analysis (JFA) in the recognition module. Finally, we proposed a new GMM, which was compared with the widely used scikit-learn GMM. All of these tools and methods were tested experimentally in this thesis.
The experiments showed that: 1) LTSD for voice activity detection is faster and more practical than the energy-based tool; 2) MFCC is computationally more expensive than LPCC, but MFCC is faster and more accurate, and LPCC needs an utterance twice as long to reach the accuracy MFCC achieves; 3) the new GMM is five times faster than the scikit-learn GMM and outperforms all other techniques studied in this thesis. As a result, to build a user-friendly speaker recognition system, it is better to use LTSD for preprocessing, MFCC for feature extraction, and our enhanced GMM for speaker testing and recognition." [en_US]
dc.identifier.other: 2013128078
dc.identifier.uri: https://bspace.buid.ac.ae/handle/1234/1699
dc.language.iso: en [en_US]
dc.publisher: The British University in Dubai (BUiD) [en_US]
dc.subject: information technology [en_US]
dc.subject: speaker recognition system [en_US]
dc.subject: security systems [en_US]
dc.subject: human biometrics [en_US]
dc.title: A study on Speaker Recognition System [en_US]
dc.type: Dissertation [en_US]
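
Illustrative note (not part of the repository record): the abstract names an energy-based tool as one of the two voice activity detection options in the preprocessing module. The sketch below shows the general idea only, assuming non-overlapping frames and a threshold set relative to the loudest frame. The function names, the frame length of 160 samples, and the 0.1 ratio are hypothetical choices made for this example and are not taken from the thesis.

```python
import math


def frame_energies(samples, frame_len):
    """Return the mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def energy_vad(samples, frame_len=160, ratio=0.1):
    """Flag a frame as speech when its energy exceeds ratio * max frame energy.

    frame_len and ratio are assumed values for illustration only.
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        return []
    threshold = ratio * max(energies)
    return [e > threshold for e in energies]


# Synthetic test signal at an assumed 8 kHz rate: silence, a 440 Hz tone, silence.
rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(800)]
signal = silence + tone + silence

flags = energy_vad(signal)
print(flags)
```

On this synthetic signal only the five frames covering the tone are flagged. A fixed relative threshold like this is sensitive to background noise, which is one common reason spectral methods such as LTSD are considered as alternatives.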
Files
Original bundle
Name: 2013128078.pdf
Size: 2.15 MB
Format: Adobe Portable Document Format