Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Master of Engineering Science

Program

Electrical and Computer Engineering

Supervisor

Essex, Aleksander E.

Abstract

Additively homomorphic encryption is a public-key primitive allowing a sum to be computed on encrypted values. Although limited in functionality, additive schemes have been an essential tool in the private function evaluation toolbox for decades. They are typically faster and more straightforward to implement relative to their fully homomorphic counterparts, and more efficient than garbled circuits in certain applications. This thesis presents a novel method for extending the functionality of additively homomorphic encryption to allow the private evaluation of functions of restricted domain. Provided the encrypted sum falls within the restricted domain, the function can be homomorphically evaluated “for free” in a single public-key operation. We will describe an algorithm for encoding private functions into the public-keys of two well-known additive cryptosystems. We extend this scheme to an application in the field of pharmacogenomics called Similar Patient Query. With the advent of human genome project, there is a tremendous availability of genomic data opening the door for a possibility of many advances in the field of medicine. Precision medicine is one such application where a patient is administered drugs based on their genetic makeup. If the genomic data is not kept private, it can lead to several information frauds, so it needs to be encrypted. To tap the full potential of the encrypted genomic data, we need to perform computations on it without compromising its security. For SPQ, we pick a query genome and compare it across a hospital data base, to find patients similar to that of the query and use the information to apply precision medicine, all of this is carried out under privacy preserving settings in the presence of a semi-honest adversary in a single transaction.

Summary for Lay Audience

In an increasingly data driven world, information availability and its distribution are exploding exponentially. The access to personal information has also invited a lot of threats such as hacking, denial of service attack, injecting malicious code or gaining illegal access to confidential information. As these threats are becoming a matter-of-course with ever increasing sophistication- planning, implementing and maintaining information security systems at any organization is becoming a challenging task. There are a lot of systems in place to protect data at various levels such as physical, application, network and cloud. There are also myriad legal regulations and compliances that help the organizations to improve their security strategy by providing guidelines depending on the type of data these organizations are handling with. Violating these standards can result in severe penalties such as fines and law suits, or even worse, personal information breach. The most challenging aspect of information security thus lies in making the best possible use of available data while ensuring privacy.

One of the popular approaches of storing information securely is to encrypt it or convert it into random looking numbers so when hackers have access to it, they will not be able to figure out what the data is about. Now, it is hard to perform computations on this random looking numbers. If we need to make use of the full potential of the data, we need to decode those random looking numbers. Nonetheless, there are methods in the field of computer science and pure math that enable us to perform analysis without breaking this data. To make the best use of data in an encrypted form, we use concepts from math and design functions in such a way that, we can perform computations on the encrypted data without having to decode. Our thesis presents one of such algorithms which help us perform computations without decoding the data. Once we perform computations and decode this data, we get the results in the same way as if we would get if the computations on plain text data.

We use these algorithms for applications in the field of medicine to conduct search across genomic data bases. Genomic databases are stored in a secure way to avoid leaking any sensitive information. Using our algorithms, we can search across these databases without decoding the data. We perform some mathematical functions and retrieve similarity scores and obtain similar records to that of our search query. This is used in the field of precision medicine where, we administer medication or therapy to a patient based on their genetic make up. Since not all databases are available openly, the medical practitioners cannot tap the full potential of genomic data. Using our algorithm, we help the medical practitioners to access across all the databases even if encrypted, giving them tremendous potential to make advances in the field of precision medicine.

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.

Share

COinS