Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Master of Arts

Program

Education

Supervisor

Faez, Farahnaz

2nd Supervisor

Boers, Frank

Co-Supervisor

Abstract

This research describes the development of an Ontario High School Science Corpus and, from it, an Ontario High School Science Word List (OHSWL). The OHSWL is a list of the most frequent technical words in the Ontario high school science curriculum. The science corpus was compiled from Ontario science textbooks and public written lecture material. A total of 803 lemmas were identified as part of the OHSWL. The OHSWL covers 7.79% of the science corpus and 1.52% of the non-science corpus. Together, the high-frequency vocabulary (top 3,000 words) of the Corpus of Contemporary American English (COCA) and the OHSWL cover 85.44% of the science corpus, compared with 75.67% of the non-science corpus. With a difference of approximately 10 percentage points in coverage, the OHSWL proves to be a significant source of vocabulary for an Ontario science learner. In the science corpus, the first and second 1,000 words of COCA provided greater coverage than the OHSWL, but the third 1,000 words provided only marginally greater coverage. Therefore, past the top 3,000 words of COCA, someone learning the Ontario science curriculum gains the greatest value from knowing the OHSWL. This corpus-based study has the potential to help students in Ontario, whether or not English is their first language.

Summary for Lay Audience

This research describes the development of an Ontario High School Science Corpus and, from it, an Ontario High School Science Word List (OHSWL). A corpus is “a collection of texts that is designed to be representative of some aspect of language” (Webb & Nation, 2017). The OHSWL is a list of the most frequent technical words in the Ontario high school science curriculum. The science corpus was compiled from Ontario science textbooks and public written lecture material. A total of 803 lemmas were identified as part of the OHSWL. A lemma consists of a headword and its inflections. For example, the headword “add” has the inflections “adds”, “adding” and “added” (Webb & Nation, 2017). The OHSWL covers 7.79% of the science corpus and 1.52% of the non-science corpus. Together, the high-frequency vocabulary (top 3,000 words) of the Corpus of Contemporary American English (COCA) and the OHSWL cover 85.44% of the science corpus, compared with 75.67% of the non-science corpus. With a difference of approximately 10 percentage points in coverage, the OHSWL proves to be a significant source of vocabulary for an Ontario science learner. In the science corpus, the first and second 1,000 words of COCA (words 1 to 1,000 and 1,001 to 2,000) provided greater coverage than the OHSWL, but the third 1,000 words provided only marginally greater coverage. Therefore, past the top 3,000 words of COCA, someone learning the Ontario science curriculum gains the greatest value from knowing the OHSWL. This corpus-based study has the potential to help students in Ontario, whether or not English is their first language. When teachers use the OHSWL in their classrooms, from grade 7 through grade 12, scientific jargon will no longer be as difficult for students to understand. Students will be able to focus on applying their knowledge rather than memorizing terminology.
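As a rough illustration of the two terms used above, the sketch below shows one way lexical coverage could be computed from a lemma-based word list: the share of running words (tokens) in a text that match a headword or one of its inflections. It is not the procedure used in the thesis. The function names, the dictionary layout for lemmas, the simple alphabetic tokenizer, and the toy sentence are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def coverage(tokens, lemmas):
    """Return the percentage of tokens that match any lemma in `lemmas`.

    `lemmas` maps each headword to a list of its inflected forms
    (a hypothetical layout chosen for this sketch).
    """
    known = set()
    for headword, inflections in lemmas.items():
        known.add(headword)
        known.update(inflections)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in known)
    return 100.0 * hits / len(tokens)


# Toy data, invented for illustration only (not taken from the OHSWL).
toy_lemmas = {"add": ["adds", "adding", "added"], "cell": ["cells"]}
toy_text = "She adds salt to the cells and then added water"

toy_tokens = tokenize(toy_text)
toy_coverage = coverage(toy_tokens, toy_lemmas)
print(f"{toy_coverage:.1f}%")  # 3 of 10 tokens match, so this prints 30.0%
```

Under the same assumptions, running this function over a science corpus and a non-science corpus with the same word list would give two percentages that can be compared in the way the figures above are.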

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
