Sponsor
Portland State University. Department of Computer Science
First Advisor
Lois Delcambre
Date of Publication
3-2008
Document Type
Dissertation
Degree Name
Doctor of Philosophy (Ph.D.) in Computer Science
Department
Computer Science
Language
English
Subjects
Semantic differential technique, Indexing, Semantic integration (Computer systems)
DOI
10.15760/etd.2670
Physical Description
1 online resource (ix, 363 pages)
Abstract
Despite the success of general Internet search engines, information retrieval remains an incompletely solved problem. Our research focuses on supporting domain experts when they search domain-specific libraries to satisfy targeted information needs. The semantic components model introduces a schema specific to a particular document collection. A semantic component schema consists of a two-level hierarchy: document classes and semantic components. A document class represents a document grouping, such as topic type or document purpose. A semantic component is a characteristic type of information that occurs in a particular document class and represents an important aspect of the document’s main topic. Semantic component indexing identifies the location and extent of semantic component instances within a document and can supplement traditional full-text and keyword indexing techniques. Semantic component searching allows a user to refine a topical search by indicating a preference for documents containing specific semantic components or by indicating terms that should appear in specific semantic components.
We investigate four aspects of semantic components in this research. First, we describe lessons learned from using two methods for developing schemas in two domains. Second, we demonstrate use of semantic components to express domain-specific concepts and relationships by mapping a published taxonomy of questions asked by family practice physicians to the semantic component schemas for two document collections about medical care. Third, we report the results of a user study, showing that manual semantic component indexing is comparable to manual keyword indexing with respect to time and perceived difficulty and suggesting that semantic component indexing may be more accurate and consistent than manual keyword indexing. Fourth, we report the results of an interactive searching study, demonstrating the ability of semantic components to enhance search results compared to a baseline system without semantic components.
In addition, we contribute a formal description of the semantic components model, a prototype implementation of semantic component indexing software, and a prototype implementation adding semantic components to an existing commercial search engine. Finally, we analyze metrics for evaluating instances of semantic component indexing and keyword indexing and illustrate use of a session-based metric for evaluating multiple-query search sessions.
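The schema, indexing, and searching ideas described in the abstract can be illustrated with a minimal Python sketch. This is not the dissertation's formal model or its prototype software. Every name here (`ComponentInstance`, `IndexedDocument`, `SCHEMA`, `search`), the example document classes and component names, and the simple "preferred matches first" ordering are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentInstance:
    """One indexed instance: a component name plus its character span."""
    component: str
    start: int  # offset of the first character of the instance
    end: int    # offset one past the last character of the instance


@dataclass
class IndexedDocument:
    doc_id: str
    doc_class: str
    text: str
    instances: list[ComponentInstance] = field(default_factory=list)

    def component_text(self, component: str) -> str:
        """Concatenate the text of all instances of one component."""
        return " ".join(
            self.text[i.start:i.end]
            for i in self.instances
            if i.component == component
        )


# Hypothetical two-level schema: document class -> semantic components.
# These class and component names are invented for this example.
SCHEMA = {
    "disease": ["symptoms", "diagnosis", "treatment"],
    "drug": ["dosage", "side effects"],
}


def search(docs, term, component=None):
    """Return ids of documents whose text contains `term`.

    If `component` is given, documents in which the term occurs inside
    an instance of that component are listed first (an assumed, very
    simple way to express a preference rather than a hard filter).
    """
    term = term.lower()
    hits = [d for d in docs if term in d.text.lower()]
    if component is None:
        return [d.doc_id for d in hits]
    preferred = [d for d in hits if term in d.component_text(component).lower()]
    preferred_ids = {d.doc_id for d in preferred}
    others = [d for d in hits if d.doc_id not in preferred_ids]
    return [d.doc_id for d in preferred + others]
```

In this sketch, a query for a term with `component="treatment"` still returns every document that mentions the term, but documents where the term falls inside a treatment span move to the front of the list.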
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Persistent Identifier
http://archives.pdx.edu/ds/psu/16547
Recommended Citation
Price, Susan Loucette, "Semantic Components: A Model for Enhancing Retrieval of Domain-Specific Information" (2008). Dissertations and Theses. Paper 2673.
https://doi.org/10.15760/etd.2670
Comments
If you are the rightful copyright holder of this dissertation or thesis and wish to have it removed from the Open Access Collection, please submit a request to pdxscholar@pdx.edu and include clear identification of the work, preferably with URL.