First Advisor

David Maier

Term of Graduation

Winter 2020

Date of Publication

3-26-2020

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science

Department

Computer Science

Language

English

Subjects

Computer algorithms, Web search engines, User interfaces (Computer systems), Internet searching

DOI

10.15760/etd.7299

Physical Description

1 online resource (xvii, 227 pages)

Abstract

Domain novices learning about a new subject can struggle to find their way in large collections. Typical searching and browsing tools are better utilized if users know what to search for or browse to. In this dissertation, we present Multiple Diagram Navigation (MDN) to assist domain novices by providing multiple overviews of the content matter using multiple diagrams. Rather than relying on specific types of visualizations, MDN superimposes any type of diagram or map over a collection of documents, allowing content providers to reveal interesting perspectives of their content. Domain novices can navigate through the content in an exploratory way using three types of queries (navigation): diagram to content (D2C), diagram to diagram (D2D), and content to diagram (C2D).

To evaluate the MDN user interface, we conducted a user study, which showed that users found MDN useful and easy to use in exploratory-navigation scenarios. Encouraged by these positive results, we extended the functionality of MDN to provide a ranking of collection documents for D2C queries (expressed by a selected diagram concept). We studied different elements of the ranking process. As a case study, we targeted our research towards the Wikipedia collection. With the goal of studying ranking in different types of diagrams, we introduced two diagram models: the Items-and-Attributes model and the Universal model. We also studied two ranking algorithms: Personalized PageRank (PPR), an algorithm used in similar applications; and Greedy Energy Spreading (GES), an algorithm that we designed. We also studied different approaches to computing rankings for C2D queries.

Our results show encouraging performance on the ranking of D2C and C2D queries. For example, in an experiment targeting diagrams conforming to the Items-and-Attributes model, results showed reasonably high similarity between a diagram concept selected by the user and the top-ten-ranked pages. We also found that diagrams had a strong influence on D2C ranking, which yielded a ranking reflecting the aspect presented by the diagram. In C2D-query ranking, GES was able to rank the most related concept in the diagram to a Wikipedia page selected by the user in the top five or six positions on average (in diagrams with 50 elements).

In our tuning for the two studied algorithms, we observed that GES performs slightly better than PPR in some aspects of D2C and C2D ranking. We also noticed differences between MDN and similar applications on the optimal settings for Personalized PageRank. Our configuration of the Wikipedia graph revealed that including (or excluding) Wikipedia categories and article-template hyperlinks have a high influence on the tested algorithms. Also, compared to related work that included only reciprocal links (where a pair of pages link to each other) in the graph, we report better performance when also including low-weight non-reciprocal links.

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

https://archives.pdx.edu/ds/psu/32616

Available for download on Friday, March 26, 2021

Share

COinS