Please use this identifier to cite or link to this item: http://bspace.buid.ac.ae/handle/1234/800
Title: Improving performance of collaborative question answering systems by using semantic resources
Authors: Javed, Muhammad Arshad
Keywords: semantic resources; information retrieval system; question answering system; Collaborative Question Answering (CQA)
Issue Date: Jun-2015
Publisher: The British University in Dubai (BUiD)
Abstract: The World Wide Web (WWW) provides a platform for sharing information. People use many kinds of web applications, such as online forums and blogs, question answering portals, e-mail, and instant messaging tools, to collect and share information and to build online communities. The information shared on the web forms a huge collection of data that grows day by day. Online social networks gather data from individual users and let them connect with other users in the same network who share their interests. In this way, social networks have evolved into platforms for establishing and maintaining social relationships as well as for sharing knowledge and information. Managing such a large volume of information requires efficient use of Information Retrieval (IR) techniques. An IR system retrieves, in real time, the text related to a user's query from a massive collection of documents. A document may be any collection of text, such as a web page or an article. An IR system attempts to satisfy the user's requirements effectively. Usually, it takes the user's query in natural language and returns the documents containing information pertinent to the question. One typical example of an IR system is a question answering system, which usually comprises three phases: question analysis, document retrieval, and answer analysis. The question analysis phase takes the user's question and applies processes such as question classification and query expansion to increase the probability of finding relevant documents. The document retrieval phase takes the processed question and retrieves the documents containing possible answers. The answer analysis phase identifies the relevant passages or sets of sentences containing the possible answers and presents them to the user.
Thus, question answering systems are very useful for retrieving relevant documents from a large collection. To take full advantage of the data generated by users of social networks, a special class of question answering systems was designed: Collaborative Question Answering (CQA) systems, also called Community Question Answering systems. Dozens of CQA systems are available on the internet. This dissertation focuses mainly on CQA systems and proposes methods to improve their performance. One major problem with existing CQA systems is the mismatch between the user's question and the questions already stored in the system: even when a CQA system contains a question that is semantically similar to the user's question, it fails to return the answers. This dissertation proposes methods to address this issue, so its scope is limited to the question analysis phase, on which the overall performance of a CQA system depends heavily. The proposed question analysis phase attempts to improve question matching in two steps. In the first step, question classification, questions are assigned to coarse-grained and fine-grained classes using a set of rules. Based on the predicted class of the question, the entity type (person, location, time, etc.) expected in the answers is determined. Question classification uses Wikipedia and WordNet. In the second step, query expansion, irrelevant words are removed and semantically equivalent words are added. The semantically equivalent words are found using the Collaborative International Dictionary of English (CIDE), a freely available open-source thesaurus. The proposed methods are tested on a number of questions collected from existing CQA systems, and the results are presented in the thesis.
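Illustrative sketch (not taken from the dissertation): the two-step question analysis described in the abstract could look roughly like the Python below. The rule table, class labels, stopword list, and synonym table are all hypothetical stand-ins; in particular, the small SYNONYMS dictionary only imitates a thesaurus lookup and is not the CIDE, Wikipedia, or WordNet resources the author used.

```python
import re

# Hypothetical rules: leading question word -> (coarse class, expected entity type).
RULES = [
    (r"^who\b", ("HUMAN", "person")),
    (r"^where\b", ("LOCATION", "location")),
    (r"^when\b", ("NUMERIC", "time")),
]

# Hypothetical list of words treated as irrelevant for matching.
STOPWORDS = {"the", "a", "an", "is", "was", "of", "in",
             "who", "where", "when", "what", "did", "do"}

# Toy stand-in for a thesaurus; a real system would query a resource such as CIDE.
SYNONYMS = {"invented": ["created", "devised"], "car": ["automobile"]}


def classify(question):
    """Step 1: return (coarse class, expected entity type) from simple rules."""
    q = question.strip().lower()
    for pattern, label in RULES:
        if re.search(pattern, q):
            return label
    return ("OTHER", "unknown")


def expand(question):
    """Step 2: drop stopwords, then add synonyms of each remaining word."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    terms = []
    for tok in tokens:
        if tok in STOPWORDS:
            continue
        for term in [tok] + SYNONYMS.get(tok, []):
            if term not in terms:  # keep first occurrence, preserve order
                terms.append(term)
    return terms


if __name__ == "__main__":
    question = "Who invented the car?"
    print(classify(question))
    print(expand(question))
```

Under these assumptions, the expanded term list rather than the raw question would be matched against the questions stored in a CQA system, so that a stored question using "automobile" can still match a user question using "car".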
URI: http://bspace.buid.ac.ae/handle/1234/800
Appears in Collections: Dissertations for Informatics (Knowledge and Data Management)

Files in This Item:
File          Description   Size      Format
120166.pdf                  1.92 MB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.