Coupled intrinsic and extrinsic human language resource-based query expansion


Selvaretnam, Bhawani and Belkhatir, Mohammed (2018) Coupled intrinsic and extrinsic human language resource-based query expansion. Knowledge and Information Systems (1). pp. 1397-1426. ISSN 0219-1377



Poor information retrieval performance has often been attributed to the query-document vocabulary mismatch problem, defined as the difficulty human users have in formulating precise natural language queries that match the vocabulary of the documents deemed relevant to a specific search goal. To alleviate this problem, query expansion processes are applied to generate additional terms and integrate them into an initial query. This requires accurate identification of the main query concepts, so that the intended search goal is duly emphasized and relevant expansion concepts are extracted and included in the enriched query. Natural language queries have intrinsic linguistic properties, such as parts-of-speech labels and grammatical relations, which can be used to determine the intended search goal. Additionally, extrinsic language-based resources such as ontologies are needed to suggest expansion concepts that are semantically coherent with the query content. We present here a query expansion framework which capitalizes on both the linguistic characteristics of user queries and ontology resources for query constituent encoding, expansion concept extraction and concept weighting. A thorough empirical evaluation on real-world datasets validates our approach against a unigram language model, a relevance model and a sequential dependence-based technique.
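
The general idea of combining intrinsic query properties with an extrinsic resource can be illustrated with a minimal sketch. It is not the authors' framework: the hand-tagged query, the toy ontology, the relatedness scores, the set of content-bearing tags and the weighting scheme (`original_weight`, `expansion_scale`) are all hypothetical and chosen only for illustration. Part-of-speech labels select the key query concepts, a small dictionary standing in for an ontology suggests related concepts, and each expansion concept receives a weight lower than that of the original terms.

```python
from collections import defaultdict

# Hypothetical hand-tagged query: (token, part-of-speech label).
TAGGED_QUERY = [
    ("effects", "NOUN"),
    ("of", "ADP"),
    ("caffeine", "NOUN"),
    ("on", "ADP"),
    ("sleep", "NOUN"),
]

# Hypothetical toy ontology: concept -> {related concept: relatedness score}.
TOY_ONTOLOGY = {
    "caffeine": {"coffee": 0.8, "stimulant": 0.6},
    "sleep": {"insomnia": 0.7, "rest": 0.5},
}

# Assumed set of tags treated as content-bearing.
CONTENT_TAGS = {"NOUN", "PROPN", "VERB", "ADJ"}


def key_concepts(tagged_query):
    """Keep the tokens whose POS label marks them as content-bearing."""
    return [token for token, tag in tagged_query if tag in CONTENT_TAGS]


def expand_query(tagged_query, ontology, original_weight=1.0, expansion_scale=0.5):
    """Return a {term: weight} map holding original and expansion concepts.

    Original concepts get `original_weight`; each related concept found in
    the ontology gets `expansion_scale` times its relatedness score.
    """
    weights = defaultdict(float)
    for concept in key_concepts(tagged_query):
        weights[concept] = max(weights[concept], original_weight)
        for related, score in ontology.get(concept, {}).items():
            weights[related] = max(weights[related], expansion_scale * score)
    return dict(weights)


expanded = expand_query(TAGGED_QUERY, TOY_ONTOLOGY)
for term, weight in sorted(expanded.items(), key=lambda item: -item[1]):
    print(f"{term}: {weight:.2f}")
```

In this sketch the function words "of" and "on" are dropped before expansion, and the expansion concepts are down-weighted so that the original search goal remains dominant in the enriched query.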

Item Type: Article
Uncontrolled Keywords: Query languages (Computer science), Human language processing, Ontology processing, Concept weighting
Subjects: Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science > QA76.75-76.765 Computer software
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Suzilawati Abu Samah
Date Deposited: 09 Nov 2020 19:21
Last Modified: 09 Nov 2020 19:21

