Repository logo
  • English
  • Català
  • Čeština
  • Deutsch
  • Español
  • Français
  • Gàidhlig
  • Italiano
  • Latviešu
  • Magyar
  • Nederlands
  • Polski
  • Português
  • Português do Brasil
  • Suomi
  • Svenska
  • Türkçe
  • Tiếng Việt
  • Қазақ
  • বাংলা
  • हिंदी
  • Ελληνικά
  • Yкраї́нська
  • Log In
    New user? Click here to register.Have you forgotten your password?
Repository logo
  • Communities & Collections
  • All of BSpace
  • English
  • Català
  • Čeština
  • Deutsch
  • Español
  • Français
  • Gàidhlig
  • Italiano
  • Latviešu
  • Magyar
  • Nederlands
  • Polski
  • Português
  • Português do Brasil
  • Suomi
  • Svenska
  • Türkçe
  • Tiếng Việt
  • Қазақ
  • বাংলা
  • हिंदी
  • Ελληνικά
  • Yкраї́нська
  • Log In
    New user? Click here to register.Have you forgotten your password?
  1. Home
  2. Browse by Author

Browsing by Author "Hifny, Yasser"

Now showing 1 - 1 of 1
Results Per Page
Sort Options
  • No Thumbnail Available
    Item
    A Multimodal Approach to Improve Performance Evaluation of Call Center Agent
    (MDPI, 2021) Ahmed , Abdelrahman; Shaalan, Khaled; Toral, Sergio; Hifny, Yasser
    The paper proposes three modeling techniques to improve the performance evaluation of the call center agent. The first technique is speech processing supported by an attention layer for the agent’s recorded calls. The speech comprises 65 features for the ultimate determination of the context of the call using the Open-Smile toolkit. The second technique uses the Max Weights Similarity (MWS) approach instead of the Softmax function in the attention layer to improve the classification accuracy. MWS function replaces the Softmax function for fine-tuning the output of the attention layer for processing text. It is formed by determining the similarity in the distance of input weights of the attention layer to the weights of the max vectors. The third technique combines the agent’s recorded call speech with the corresponding transcribed text for binary classification. The speech modeling and text modeling are based on combinations of the Convolutional Neural Networks (CNNs) and Bi-directional Long-Short Term Memory (BiLSTMs). In this paper, the classification results for each model (text versus speech) are proposed and compared with the multimodal approach’s results. The multimodal classification provided an improvement of (0.22%) compared with acoustic model and (1.7%) compared with text model.
  • Library Website
  • University Website
The British University in Dubai (BUiD)

PO Box 345015 | 1st & 2nd Floors, Block 11, Dubai International Academic City (DIAC)
United Arab Emirates, Phone: +971 4 279 1471, Email: library@buid.ac.ae

DSpace software copyright © 2002-2025 LYRASIS

  • Cookie settings
  • Privacy policy
  • End User Agreement
  • Send Feedback