Advisor

Mitch Cruzan

Date of Award

6-2-2017

Document Type

Thesis

Degree Name

Master of Science (M.S.) in Biology

Department

Biology

Physical Description

1 online resource (vi, 102 pages)

DOI

10.15760/etd.5891

Abstract

Tracking seed dispersal using traditional, direct measurement approaches is difficult and generally underestimates dispersal distances. Variation in chloroplast DNA (cpDNA) haplotypes offers a way to trace past seed dispersal and to make inferences about the factors contributing to present patterns of dispersal. Although cpDNA generally has low levels of intraspecific variation, this limitation can be overcome by assaying the whole chloroplast genome. Whole-genome sequencing is more expensive, but resources can be conserved by pooling samples. Unfortunately, haplotype associations among SNPs are lost in pooled samples, and treating SNP frequencies as independent estimates of variation yields biased estimates of genetic distance. I have developed an application, CallHap, that uses a least-squares algorithm to evaluate the fit between observed and predicted SNP frequencies from pooled samples based on network topology, thus enabling pooled chloroplast sequencing for large-scale studies of chloroplast genomic variation. This method was tested using artificially constructed test networks and pools, and pooled samples of Lasthenia californica (California goldfields) from Whetstone Prairie in southern Oregon, USA. In test networks, CallHap reliably recovered network topologies and haplotype frequencies. Overall, the CallHap pipeline allows for the efficient use of resources when estimating genetic distance in studies using non-recombining, whole-genome haplotypes, such as intraspecific variation in chloroplast, mitochondrial, bacterial, or viral DNA.
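The least-squares fit mentioned in the abstract can be illustrated with a minimal sketch. This is not CallHap's code or interface: the function names, the 0/1 haplotype encoding, and the example numbers are all assumptions made for illustration. The sketch scores candidate haplotype frequencies for a single pool by the residual sum of squares between observed and predicted SNP frequencies.

```python
# Illustrative sketch only; names and data are hypothetical, not taken from CallHap.

def predicted_snp_frequencies(haplotypes, hap_freqs):
    """Predict pooled SNP frequencies from haplotypes and their frequencies.

    haplotypes: list of tuples, one per haplotype, with 0 (reference) or
                1 (alternate) at each SNP position.
    hap_freqs:  frequency of each haplotype in the pool (should sum to 1).
    """
    n_snps = len(haplotypes[0])
    return [
        sum(freq * hap[i] for hap, freq in zip(haplotypes, hap_freqs))
        for i in range(n_snps)
    ]


def residual_sum_of_squares(observed, predicted):
    """Least-squares misfit between observed and predicted SNP frequencies."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical pool: three haplotypes scored at four SNPs.
haplotypes = [
    (0, 0, 0, 0),
    (1, 0, 0, 0),
    (1, 1, 0, 1),
]
observed = [0.5, 0.2, 0.0, 0.2]  # made-up observed SNP frequencies

# Two made-up candidate sets of haplotype frequencies to compare.
candidates = {
    "A": [0.5, 0.3, 0.2],
    "B": [0.4, 0.4, 0.2],
}

scores = {
    name: residual_sum_of_squares(
        observed, predicted_snp_frequencies(haplotypes, freqs)
    )
    for name, freqs in candidates.items()
}
best = min(scores, key=scores.get)  # candidate with the smallest misfit
print(best, scores)
```

A lower residual means the candidate haplotype frequencies explain the pooled SNP frequencies better. Per the abstract, the fit in CallHap is also based on network topology, which this sketch does not model.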

Persistent Identifier

http://archives.pdx.edu/ds/psu/22719

Included in

Biology Commons
