Advisor

Lois Delcambre

Date of Award

3-2008

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science

Department

Computer Science

Physical Description

1 online resource (ix, 363 pages)

Subjects

Semantic differential technique, Indexing, Semantic integration (Computer systems)

DOI

10.15760/etd.2670

Abstract

Despite the success of general Internet search engines, information retrieval remains an incompletely solved problem. Our research focuses on supporting domain experts when they search domain-specific libraries to satisfy targeted information needs. The semantic components model introduces a schema specific to a particular document collection. A semantic component schema consists of a two-level hierarchy, document classes and semantic components. A document class represents a document grouping, such as topic type or document purpose. A semantic component is a characteristic type of information that occurs in a particular document class and represents an important aspect of the document’s main topic. Semantic component indexing identifies the location and extent of semantic component instances within a document and can supplement traditional full text and keyword indexing techniques. Semantic component searching allows a user to refine a topical search by indicating a preference for documents containing specific semantic components or by indicating terms that should appear in specific semantic components.

We investigate four aspects of semantic components in this research. First, we describe lessons learned from using two methods for developing schemas in two domains. Second, we demonstrate use of semantic components to express domainspecific concepts and relationships by mapping a published taxonomy of questions asked by family practice physicians to the semantic component schemas for two document collections about medical care. Third, we report the results of a user study, showing that manual semantic component indexing is comparable to manual keyword indexing with respect to time and perceived difficulty and suggesting that semantic component indexing may be more accurate and consistent than manual keyword indexing. Fourth, we report the results of an interactive searching study, demonstrating the ability of semantic components to enhance search results compared to a baseline system without semantic components.

In addition, we contribute a formal description of the semantic components model, a prototype implementation of semantic component indexing software, and a prototype implementation adding semantic components to an existing commercial search engine. Finally, we analyze metrics for evaluating instances of semantic component indexing and keyword indexing and illustrate use of a session-based metric for evaluating multiple-query search sessions.

Description

If you are the rightful copyright holder of this dissertation or thesis and wish to have it removed from the Open Access Collection, please submit a request to pdxscholar@pdx.edu and include clear identification of the work, preferably with URL

Persistent Identifier

http://archives.pdx.edu/ds/psu/16547

Share

COinS