Date of Award

2006

Degree Type

Thesis

Degree Name

Master of Science

Program

Computer Science

Supervisor

Robert Mercer

Abstract

Knowledge in the field of biomedicine is booming. To stay at the frontiers of their fields, researchers need automated tools to manage the ever-increasing quantity of scholarly literature. As an aid to developing these tools, we suggest that the rhetoric found in the biomedical genre needs to be studied. This study is enabled by the availability of full-text repositories such as BioMed Central and PubMed Central. To this end, this thesis proposes a model of rhetoric that captures the descriptive purpose of each sentence within a biomedical research article. The model comprises five Rhetorical Zones: Domain Knowledge, Experimental Procedure, Experimental Results, Meta-Experimental Results, and Meta-Experimental Procedure. It was validated on a sample corpus of biomedical research articles with two machine-learning paradigms, using a feature set of rhetorical cues extracted from the corpus. In addition, a system was developed to extract the feature set automatically.
