Improving performance of collaborative question answering systems by using semantic resources

Javed, Muhammad Arshad

dc.Location	2015 T 58.6 J38
dc.Supervisor	Professor Khaled Shaalan
dc.contributor.author	Javed, Muhammad Arshad
dc.date.accessioned	2016-05-19T12:54:23Z
dc.date.available	2016-05-19T12:54:23Z
dc.date.issued	2015-06
dc.description.abstract	In the modern age of technology, the World Wide Web (WWW) provides a platform for sharing information. People use many types of web applications, for example online forums and blogs, question answering portals, e-mail, and instant messaging tools, to collect and share information and to build online communities. All of this shared information creates a huge collection of data, which grows day by day. Online social networks gather data from individual users and let them create links with other users in the same network who share their interests. In this fashion, social networks have evolved into platforms for establishing and maintaining social relationships as well as for sharing knowledge and information. To manage such a large volume of information, Information Retrieval (IR) techniques must be used efficiently. An IR system retrieves text related to the user's query from a massive collection of documents in real time. A document may be any collection of text, such as a web page or an article. An IR system attempts to satisfy the user's requirements effectively. Usually, an IR system takes the user query in natural language and returns the documents containing information pertinent to the question. One typical example of an IR system is a question answering system. A question answering system usually has three phases: question analysis, document retrieval, and answer analysis. The question analysis phase takes the user's question and applies processes such as question classification and query expansion to increase the probability of finding relevant documents. The document retrieval phase takes the processed question and retrieves the documents containing possible answers. The answer analysis phase identifies the relevant passages or sets of sentences containing the possible answers and presents them to the user.
Thus, question answering systems are very useful for retrieving documents from a collection. To take full advantage of the data generated by users of social networks, a special class of question answering systems was designed: Collaborative Question Answering (CQA) systems, also called Community Question Answering systems. Dozens of CQA systems are available on the internet. The research in this dissertation focuses on CQA systems and proposes methods to improve their performance. One major problem with existing CQA systems is the mismatch between the user's question and the questions stored in the system. Even when a CQA system contains a question that is semantically similar to the user's question, it fails to return the answers. This dissertation proposes methods to address this issue; its scope is therefore limited to the question analysis phase of CQA systems, on which overall CQA performance depends heavily. The proposed question analysis phase attempts to improve question matching in two steps. In the first step, question classification, questions are assigned to coarse-grained and fine-grained classes based on rules. Based on the predicted class of the question, the entity type (person, location, time, etc.) expected in the answers is determined. Question classification uses Wikipedia and WordNet. In the second step, query expansion, irrelevant words are removed and semantically equivalent words are added. A freely available open source thesaurus, the Collaborative International Dictionary of English (CIDE), is used to find the semantically equivalent words. The proposed methods are tested on a number of questions collected from existing CQA systems, and the results are presented in the thesis.	
en_US
dc.identifier.other	120166
dc.identifier.uri	http://bspace.buid.ac.ae/handle/1234/800
dc.language.iso	en	en_US
dc.publisher	The British University in Dubai (BUiD)	en_US
dc.subject	semantic resources	en_US
dc.subject	information retrieval system	en_US
dc.subject	question answering system	en_US
dc.subject	Collaborative Question Answering (CQA)	en_US
dc.title	Improving performance of collaborative question answering systems by using semantic resources	en_US
dc.type	Dissertation	en_US
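
The two-step question analysis described in the abstract (rule-based question classification, then query expansion) might be sketched roughly as follows. This is an illustrative sketch only, not the dissertation's implementation: the rules, class labels, stopword list, and synonym table are all hypothetical stand-ins, with the small dictionary taking the place of lookups in resources such as WordNet or CIDE.

```python
import re

# Hypothetical rules: (pattern, coarse class, expected answer entity type).
# The dissertation's actual rule set is not given in this record.
RULES = [
    (re.compile(r"^who\b", re.IGNORECASE), "HUMAN", "person"),
    (re.compile(r"^where\b", re.IGNORECASE), "LOCATION", "location"),
    (re.compile(r"^when\b", re.IGNORECASE), "NUMERIC", "time"),
]

# Hypothetical list of words treated as irrelevant for matching.
STOPWORDS = {"a", "an", "the", "is", "was", "of", "in",
             "who", "where", "when", "did", "do"}

# Toy synonym table standing in for a thesaurus lookup (assumption).
SYNONYMS = {
    "invented": ["created", "devised"],
    "car": ["automobile"],
}


def classify(question):
    """Return (question class, expected entity type) from the first matching rule."""
    text = question.strip()
    for pattern, question_class, entity_type in RULES:
        if pattern.match(text):
            return question_class, entity_type
    return "UNKNOWN", None


def expand(question):
    """Drop stopwords, then add synonyms for each remaining term, keeping order."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    terms = []
    for token in tokens:
        if token in STOPWORDS:
            continue
        for term in [token] + SYNONYMS.get(token, []):
            if term not in terms:
                terms.append(term)
    return terms


if __name__ == "__main__":
    q = "Who invented the car?"
    print(classify(q))
    print(expand(q))
```

Under these assumptions, a stored question such as "Who created the automobile?" shares expanded terms with "Who invented the car?" even though the two have no content words in common, which is the kind of mismatch the abstract describes.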

Files

Original bundle

Name: 120166.pdf
Size: 1.87 MB
Format: Adobe Portable Document Format

Collections

Dissertations for Informatics (Knowledge and Data Management)