Genetic privacy explained

Genetic privacy is the concept of personal privacy as it applies to the storing, repurposing, provision to third parties, and displaying of one's genetic information.^[1] ^[2] The concept also covers the ability to identify specific individuals by their genetic sequence, and the potential to learn specific characteristics of a person from portions of their genetic information, such as their propensity for specific diseases or their immediate or distant ancestry.^[3]

With the public release of genome sequence information of participants in large-scale research studies, questions regarding participant privacy have been raised. In some cases, it has been shown that it is possible to identify previously anonymous participants from large-scale genetic studies that released gene sequence information.^[4]

Genetic privacy concerns also arise in the context of criminal law because the government can sometimes overcome criminal suspects' genetic privacy interests and obtain their DNA sample.^[5] Because family members share genetic information, this raises privacy concerns for relatives as well.^[6]

In response to these concerns, regulations and policies have been developed in the United States at both the federal and state levels.

Significance of genetic information

In the majority of cases, an individual's genetic sequence is considered unique to that individual. One notable exception in humans is identical twins, who have nearly identical genome sequences at birth.^[7] Otherwise, one's genetic fingerprint is considered specific to a particular person and is regularly used to identify individuals, for example to establish innocence or guilt in legal proceedings via DNA profiling.^[8] Specific gene variants in one's genetic code, known as alleles, have been shown to have strong predictive effects on the occurrence of diseases, such as the BRCA1 and BRCA2 mutant genes in breast cancer and ovarian cancer, or the PSEN1, PSEN2, and APP genes in early-onset Alzheimer's disease.^[9] ^[10] ^[11] Additionally, gene sequences are passed down with a regular pattern of inheritance between generations and can therefore reveal one's ancestry via genealogical DNA testing. With knowledge of the sequences of one's biological relatives, traits can also be compared to determine relationships between individuals, or the lack thereof, as is often done in DNA paternity testing. As such, one's genetic code can be used to infer many characteristics about an individual, including potentially sensitive ones such as:^[12]

Parentage / Non-paternity
Consanguinity
Adoptive Status
Ancestry
Propensity for Disease
Predicted Physical Characteristics
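
The relationship inference mentioned above, as used in paternity testing, can be illustrated with a minimal sketch. It assumes each person's profile is a pair of alleles per locus and simply counts the loci where the child's alleles cannot be explained by one allele from the mother and one from the alleged father. The locus names, allele values, and function names are hypothetical; real testing uses many more loci and a statistical likelihood that allows for mutation.

```python
from typing import Dict, Tuple

Profile = Dict[str, Tuple[int, int]]  # locus name -> the two alleles carried

def consistent_at_locus(child, mother, alleged_father) -> bool:
    """True if one child allele can come from the mother and the other from the father."""
    c1, c2 = child
    return (c1 in mother and c2 in alleged_father) or (
        c2 in mother and c1 in alleged_father
    )

def count_exclusions(child: Profile, mother: Profile, alleged_father: Profile) -> int:
    """Count loci that are incompatible with the alleged father being the parent."""
    return sum(
        not consistent_at_locus(child[locus], mother[locus], alleged_father[locus])
        for locus in child
    )

# Made-up profiles for illustration only.
child = {"locus_1": (12, 14), "locus_2": (8, 9), "locus_3": (15, 17)}
mother = {"locus_1": (12, 13), "locus_2": (9, 9), "locus_3": (15, 16)}
candidate_a = {"locus_1": (14, 15), "locus_2": (8, 10), "locus_3": (17, 18)}
candidate_b = {"locus_1": (11, 13), "locus_2": (8, 10), "locus_3": (16, 18)}

exclusions_a = count_exclusions(child, mother, candidate_a)
exclusions_b = count_exclusions(child, mother, candidate_b)
print("candidate A exclusions:", exclusions_a)
print("candidate B exclusions:", exclusions_b)
```

A candidate with no exclusions is not thereby proven to be the parent; the sketch only shows why shared inheritance makes a profile informative about relatives as well as about the person tested.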

Sources of genetic information

Risks

Privacy Breaches

Studies have shown that genomic data is not immune to adversarial attacks.^[16] ^[17] A study conducted in 2013 revealed vulnerabilities in the security of public databases that contain genetic data. As a result, research subjects could sometimes be identified by their DNA alone.^[18] Although reports of premeditated breaches outside of experimental research are disputed, researchers suggest the vulnerability is still important to study.^[19]

While accessible genomic data has been pivotal in advancing biomedical research, it also increases the risk of exposing sensitive information.^[20] A common practice in genomic medicine to protect patient anonymity is to remove patient identifiers.^[21] However, de-identified data is not subject to the same protections as the research subjects themselves. Furthermore, there is an increasing ability to re-identify patients and their genetic relatives from their genetic data.

One study demonstrated re-identification by piecing together genomic data from short tandem repeats (e.g. CODIS), SNP allele frequencies (e.g. ancestry testing), and whole-genome sequencing. The authors also hypothesized that a patient's genetic information, ancestry testing, and social media could be combined to identify relatives. Other studies have echoed the risks of linking genomic information with public data, such as social media, voter registries, web searches, and personal demographics, or with controlled data, such as personal medical records.

There is also controversy regarding the responsibility a DNA testing company has to ensure that leaks and breaches do not happen.^[22] Determining who legally owns the genomic data, the company or the individual, is of legal concern. There have been published examples of personal genome information being exploited, as well as indirect identification of family members.^[23] Additional privacy concerns, related to, e.g., genetic discrimination, loss of anonymity, and psychological impacts, have been increasingly pointed out by the academic community^[23] ^[24] as well as government agencies.^[15]

Law Enforcement

For criminal justice and privacy advocates, the use of genetic information to identify suspects in criminal investigations is worrisome under the Fourth Amendment to the United States Constitution, especially when an indirect genetic link connects an individual to crime scene evidence.^[25] Since 2018, law enforcement officials have used genetic data to revisit cold cases with DNA evidence.^[26] Suspects discovered through this process are not directly identified by the input of their DNA into established criminal databases, like CODIS. Instead, they are identified through familial genetic sleuthing: law enforcement submits crime scene DNA evidence to genetic database services that link users whose DNA similarity indicates a family connection.^[27] Officers can then track the newly identified suspect in person, waiting to collect discarded trash that might carry DNA in order to confirm the match.

Despite the privacy concerns of suspects and their relatives, this procedure is likely to survive Fourth Amendment scrutiny.^[6] Much like donors of biological samples in cases of genetic research,^[28] ^[29] criminal suspects do not retain property rights in abandoned waste; they can no longer assert an expectation of privacy in the discarded DNA used to confirm law enforcement suspicions, thereby eliminating their Fourth Amendment protection in that DNA. Additionally, the genetic privacy of relatives is likely irrelevant under current caselaw since Fourth Amendment protection is “personal” to criminal defendants.

Psychological Impact

In a systematic review of perspectives toward genetic privacy, researchers highlight some of the concerns individuals hold regarding their genetic information, such as the potential dangers and effects on themselves and family members. Academics note that participating in biomedical research or genetic testing has implications beyond the participant; it can also reveal information about genetic relatives.^[23] The review also found that people expressed concerns about who controls their information and whether their genetic information could be used against them.

Additionally, the American Society of Human Genetics has raised concerns about genetic testing in children.^[30] It suggests that testing could lead to negative consequences for the child. For example, if a child's likelihood of adoption were influenced by genetic testing, the child might suffer from self-esteem issues. A child's well-being might also suffer due to paternity testing or custody battles that require this type of information.^[12]

Regulations

When access to genetic information is regulated, insurance companies and employers can be prevented from obtaining such data. This could avoid discrimination, which often leaves an individual whose information has been breached without a job or without insurance.^[12]

In the United States

Federal Regulation

In the United States, biomedical research involving human subjects is governed by a baseline standard of ethics known as The Common Rule, which aims to protect a subject's privacy by requiring "identifiers" such as name or address to be removed from collected data.^[31] A 2012 report by the Presidential Commission for the Study of Bioethical Issues stated, however, that "what constitutes 'identifiable' and 'de-identified' data is fluid and that evolving technologies and the increasing accessibility of data could allow de-identified data to become re-identified". In fact, research has already shown that it is "possible to discover a study participant's identity by cross-referencing research data about him and his DNA sequence … [with] genetic genealogy and public-records databases".^[32] This has led to calls for policy-makers to establish consistent guidelines and best practices for the accessibility and usage of individual genomic data collected by researchers.^[33]

Privacy protections for genetic research participants were strengthened by provisions of the 21st Century Cures Act (H.R.34), passed on 7 December 2016, for which the American Society of Human Genetics (ASHG) commended Congress, Senator Warren, and Senator Enzi.^[34] ^[35] ^[36]

The Genetic Information Nondiscrimination Act of 2008 (GINA) protects the genetic privacy of the public, including research participants. GINA makes it illegal for health insurers or employers to request or require genetic information of an individual or of family members, and further prohibits the discriminatory use of such information.^[37] This protection does not extend to other forms of insurance such as life insurance.

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) also provides some genetic privacy protections. HIPAA defines health information to include genetic information,^[38] which restricts with whom health providers can share the information.^[39]

State Regulation

Three kinds of laws are frequently associated with genetic privacy: those relating to informed consent and property rights, those preventing insurance discrimination, and those prohibiting employment discrimination.^[40] ^[41] According to the National Human Genome Research Institute, forty-one states have enacted genetic privacy laws as of January 2020. However, those privacy laws vary in the scope of protection offered; while some laws "apply broadly to any person" others apply "narrowly to certain entities such as insurers, employers, or researchers."

Arizona, for example, falls in the former category and offers broad protection. Currently, Arizona's genetic privacy statutes focus on the need for informed consent to create, store, or release genetic testing results,^[42] ^[43] but a pending bill would amend the state genetic privacy law framework to grant exclusive property rights in genetic information derived from genetic testing to all persons tested.^[44] In expanding privacy rights by including property rights, the bill would grant persons who undergo genetic testing greater control over their genetic information. Arizona also prohibits insurance and employment discrimination on the basis of genetic testing results.^[45] ^[46]

New York State also has strong legislative measures protecting individuals from genetic discrimination. Section 79-I of the New York Civil Rights Law places strict restrictions on the usage of genetic data. The statute also outlines the proper conditions for consenting to genetic data collection or usage.^[47]

California similarly offers a broad range of protection for genetic privacy, but it stops short of granting individuals property rights in their genetic information. While currently enacted legislation focuses on prohibiting genetic discrimination in employment^[48] and insurance,^[49] a piece of pending legislation would extend genetic privacy rights to provide individuals with greater control over genetic information obtained through direct-to-consumer testing services like 23andMe.^[50]

Florida passed House Bill 1189, a DNA privacy law that prohibits insurers from using genetic data, in July 2020.^[51]

On the other hand, Mississippi offers few genetic privacy protections beyond those required by the federal government. In the Mississippi Employment Fairness Act, the legislature recognized the applicability of the Genetic Information Nondiscrimination Act,^[52] which "prohibit[s] discrimination on the basis of genetic information with respect to health insurance and employment."^[53] ^[54]

Other

To balance data sharing with the need to protect the privacy of research subjects, geneticists are considering moving more data behind controlled-access barriers, authorizing trusted users to access the data from many studies, rather than "having to obtain it piecemeal from different studies".

In October 2005, IBM became the world's first major corporation to establish a genetics privacy policy. Its policy prohibits using employees' genetic information in employment decisions.^[55]

Breaching techniques

According to a 2014 study by Yaniv Erlich and Arvind Narayanan, genetic privacy breaching techniques fall into three categories:^[56]

Identity Tracing

Here the aim is to link an unknown genome to the concealed identity of the data originator by accumulating quasi-identifiers (residual pieces of information embedded in the dataset) and gradually narrowing down the individuals who match the combination of these quasi-identifiers.
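
The narrowing process can be sketched in a few lines of Python. This is a toy illustration rather than the method of any cited study: the public records, field names, and the idea that a surname has been guessed from the genome are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical public records; every name and value here is invented.
PUBLIC_RECORDS: List[Dict[str, object]] = [
    {"name": "Person A", "birth_year": 1970, "state": "UT", "surname": "Smith"},
    {"name": "Person B", "birth_year": 1970, "state": "UT", "surname": "Jones"},
    {"name": "Person C", "birth_year": 1982, "state": "UT", "surname": "Smith"},
    {"name": "Person D", "birth_year": 1970, "state": "CA", "surname": "Smith"},
]

def trace_identity(
    records: List[Dict[str, object]], quasi_identifiers: Dict[str, object]
) -> List[Tuple[str, List[str]]]:
    """Apply each quasi-identifier in turn and record the surviving candidates."""
    candidates = list(records)
    history = []
    for field, value in quasi_identifiers.items():
        candidates = [r for r in candidates if r.get(field) == value]
        history.append((field, [str(r["name"]) for r in candidates]))
    return history

# Quasi-identifiers assumed to accompany an "anonymous" genome: study metadata
# (birth year, state) plus a surname assumed to have been inferred from the DNA.
clues = {"birth_year": 1970, "state": "UT", "surname": "Smith"}
steps = trace_identity(PUBLIC_RECORDS, clues)
for field, names in steps:
    print(f"after filtering on {field}: {names}")
```

No single clue identifies the person, but each one shrinks the candidate set, which is why removing explicit identifiers alone does not guarantee anonymity.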

Attribute Disclosure Attacks via DNA (ADAD)

Here the adversary already has access to the identified DNA sample of the target and to a database that links DNA-derived data to sensitive attributes without explicit identifiers, for example a public database from a genetic study of drug abuse. ADAD techniques match the DNA data and associate the identity of the target with the sensitive attribute.
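
A minimal sketch of the matching step follows. It assumes genotypes are stored as minor-allele counts (0, 1, or 2) per SNP and that a record is accepted only if nearly all shared SNPs agree. The SNP labels, pseudonyms, attribute values, and the 0.95 threshold are hypothetical.

```python
from typing import Dict, Optional, Tuple

Genotype = Dict[str, int]  # SNP label -> minor-allele count (0, 1 or 2)

# Genotype of an already identified target (made-up values).
target_genotype: Genotype = {"snp1": 0, "snp2": 2, "snp3": 1, "snp4": 1, "snp5": 0}

# Hypothetical de-identified study database: pseudonym -> (genotype, attribute).
study_db: Dict[str, Tuple[Genotype, str]] = {
    "P001": ({"snp1": 1, "snp2": 2, "snp3": 0, "snp4": 1, "snp5": 0}, "control"),
    "P002": ({"snp1": 0, "snp2": 2, "snp3": 1, "snp4": 1, "snp5": 0}, "case"),
    "P003": ({"snp1": 2, "snp2": 0, "snp3": 1, "snp4": 0, "snp5": 1}, "case"),
}

def match_fraction(a: Genotype, b: Genotype) -> float:
    """Fraction of shared SNPs at which the two genotypes agree."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[s] == b[s] for s in shared) / len(shared)

def disclose_attribute(
    target: Genotype, database: Dict[str, Tuple[Genotype, str]], threshold: float = 0.95
) -> Optional[str]:
    """Return the attribute of the best-matching record, or None if no close match."""
    best_id, best_score = None, 0.0
    for pseudonym, (genotype, _attribute) in database.items():
        score = match_fraction(target, genotype)
        if score > best_score:
            best_id, best_score = pseudonym, score
    if best_id is not None and best_score >= threshold:
        return database[best_id][1]
    return None

leaked = disclose_attribute(target_genotype, study_db)
print("attribute linked to the target:", leaked)
```

The database never stores a name, yet the attribute becomes attached to a named person as soon as someone holding that person's DNA finds the matching record.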

Completion Techniques

Here the adversary also knows the identity of a genomic dataset but has access only to a sanitized version without sensitive loci. The aim is to infer the sensitive loci that were withheld from the released data.
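
One way such inference could work is to exploit correlation between a released marker and the withheld locus. The sketch below assumes a reference panel in which both positions were observed and guesses the most frequent hidden genotype for each visible genotype. The panel values are invented, and real imputation methods use many markers and more elaborate statistical models.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical reference panel of (visible marker genotype, sensitive locus genotype).
reference_panel: List[Tuple[int, int]] = [
    (0, 0), (0, 0), (0, 0), (0, 1),
    (1, 1), (1, 1), (1, 0),
    (2, 2), (2, 2), (2, 1),
]

def build_imputation_table(
    panel: List[Tuple[int, int]],
) -> Dict[int, Tuple[int, float]]:
    """Map each visible genotype to the most common hidden genotype and its share."""
    by_marker: Dict[int, Counter] = defaultdict(Counter)
    for marker, sensitive in panel:
        by_marker[marker][sensitive] += 1
    table = {}
    for marker, counts in by_marker.items():
        genotype, count = counts.most_common(1)[0]
        table[marker] = (genotype, count / sum(counts.values()))
    return table

table = build_imputation_table(reference_panel)

# The sanitized record shows genotype 2 at the visible marker; the sensitive
# locus was removed, but the panel suggests what it probably was.
guess, confidence = table[2]
print(f"guessed hidden genotype: {guess} (panel frequency {confidence:.2f})")
```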

However, more recent studies have indicated new avenues for breaching genetic privacy:

Phenotype Inferences

Here, the goal is to use readily available phenotype information about an individual, such as physical features (or some combination thereof), to make genetic inferences. As genetic databases grow at unprecedented rates, providing larger and more comprehensive aggregates, the ability to make inferences with more probabilistic certainty greatly increases. Furthermore, the scope of potential inferences grows with expanding datasets.^[57] ^[58]

Safeguards

According to a 2022 study by Zhiyu Wan et al., safeguards for genetic privacy fall into two categories:^[59]

Legal Safeguards

Legal safeguards include the Genetic Information Nondiscrimination Act of 2008, the Health Insurance Portability and Accountability Act of 1996, the Common Rule, the US National Institutes of Health (NIH) data sharing policy, the European Union's General Data Protection Regulation (GDPR), US state privacy laws (e.g., the California Consumer Privacy Act, the California Privacy Rights Act, or the Virginia Consumer Data Protection Act), self-regulation (e.g., data use agreements, privacy policies, or terms of service), and informed consent.

Technical Safeguards

Technical safeguards include cryptographic tools, access control, and data perturbation approaches. Specifically, cryptographic approaches include homomorphic encryption, secure multiparty computation, trusted execution environments, and blockchain, whereas data perturbation approaches include k-anonymity, Beacon services,^[60] differential privacy, and synthetic data generation.^[61]
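
Two of the perturbation approaches named above can be sketched briefly. The first function checks k-anonymity over a set of quasi-identifier columns; the second adds Laplace noise to a count, which is a standard way to make a counting query differentially private. The column names, records, and parameter values are assumptions for illustration, not taken from the cited study.

```python
import random
from collections import Counter
from typing import Dict, List

def is_k_anonymous(rows: List[Dict[str, str]], quasi_identifiers: List[str], k: int) -> bool:
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(size >= k for size in groups.values())

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise; a count has sensitivity 1, so scale = 1/epsilon."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical released records (quasi-identifiers only).
rows = [
    {"age_band": "30-39", "zip3": "841"},
    {"age_band": "30-39", "zip3": "841"},
    {"age_band": "40-49", "zip3": "841"},
]
print("2-anonymous:", is_k_anonymous(rows, ["age_band", "zip3"], k=2))

# Noisy answer to "how many participants carry variant X?" (true count assumed to be 100).
rng = random.Random(0)
print("noisy carrier count:", round(private_count(100, epsilon=1.0, rng=rng), 2))
```

A smaller epsilon gives stronger privacy but noisier answers, so the value chosen reflects a trade-off between protecting participants and keeping the data useful.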

Notes and References

"Genetic Privacy (definition)". reference.md. 29 December 2016.
"From genetic privacy to open consent". Nature Reviews Genetics. 8 November 2019.
Shi, Xinghua; Wu, Xintao (January 2017). "An overview of human genetic privacy". Annals of the New York Academy of Sciences. 1387 (1): 61–72. doi:10.1111/nyas.13211. PMC 5697154. PMID 27626905.
"Genetic privacy". Nature. 493 (7433): 451. 24 January 2013. doi:10.1038/493451a. PMID 23350074.
Maryland v. King, 569 U.S. 435 (2013).
Ram, Natalie (July 2018). "Incidental Informants: Police Can Use Genealogy Databases to Help Identify Criminal Relatives--But Should They?". Maryland Bar Journal.
Phillips, Theresa. "The Importance of DNA Fingerprinting and How It Is Used". The Balance. 2019-09-28.
Murphy, Erin (2018-01-13). "Forensic DNA Typing". Annual Review of Criminology. 1 (1): 497–515. doi:10.1146/annurev-criminol-032317-092127. ISSN 2572-4568.
"BRCA Mutations: Cancer Risk and Genetic Testing Fact Sheet". National Cancer Institute. 2018-02-05. Retrieved 2019-09-28.
Campion, Dominique; Dumanchin, Cécile; Hannequin, Didier; Dubois, Bruno; Belliard, Serge; Puel, Michèle; Thomas-Anterion, Catherine; Michon, Agnès; Martin, Cosette; Charbonnier, Françoise; Raux, Grégory (September 1999). "Early-Onset Autosomal Dominant Alzheimer Disease: Prevalence, Genetic Heterogeneity, and Mutation Spectrum". The American Journal of Human Genetics. 65 (3): 664–670. doi:10.1086/302553. PMC 1377972. PMID 10441572.
Lanoiselée, Hélène-Marie; Nicolas, Gaël; Wallon, David; Rovelet-Lecrux, Anne; Lacour, Morgane; Rousseau, Stéphane; Richard, Anne-Claire; Pasquier, Florence; Rollin-Sillaire, Adeline; Martinaud, Olivier; Quillard-Muraine, Muriel (March 2017). "APP, PSEN1, and PSEN2 mutations in early-onset Alzheimer disease: A genetic screening study of familial and sporadic cases". PLOS Medicine. 14 (3): e1002270. doi:10.1371/journal.pmed.1002270. ISSN 1549-1676. PMC 5370101. PMID 28350801.
Anderlik, Mary R.; Rothstein, Mark A. (2001). "Privacy and Confidentiality of Genetic Information: What Rules for the New Science?". Annual Review of Genomics and Human Genetics. 2: 401–433. doi:10.1146/annurev.genom.2.1.401. PMID 11701656.
"How is direct-to-consumer genetic testing done?: MedlinePlus Genetics". medlineplus.gov. 2020-10-30.
"The Best DNA Test in 2020: Ancestry, 23andMe, MyHeritage & More Reviewed". Smarter Hobby. 2017-11-21. Retrieved 2020-08-11.
Center for Drug Evaluation and Research (2018-11-03). "Direct-to-Consumer Tests". FDA.
Schwab, Abraham P.; Luu, Hung S.; Wang, Jason; Park, Jason Y. (December 2018). "Genomic Privacy". Clinical Chemistry. 64 (12): 1696–1703. doi:10.1373/clinchem.2018.289512. ISSN 1530-8561. PMID 29991478.
Wang, Shuang; Jiang, Xiaoqian; Singh, Siddharth; Marmor, Rebecca; Bonomi, Luca; Fox, Dov; Dow, Michelle; Ohno-Machado, Lucila (January 2017). "Genome privacy: challenges, technical approaches to mitigate risk, and ethical considerations in the United States". Annals of the New York Academy of Sciences. 1387 (1): 73–83. Bibcode:2017NYASA1387...73W. doi:10.1111/nyas.13259. ISSN 0077-8923. PMC 5266631. PMID 27681358.
Kolata, Gina (16 June 2013). "Poking Holes in Genetic Privacy". The New York Times. Retrieved 29 December 2016.
Clayton, Ellen W.; Halverson, Colin M.; Sathe, Nila A.; Malin, Bradley A. (2018-10-31). "A systematic literature review of individuals' perspectives on privacy and genetic information in the United States". PLOS ONE. 13 (10): e0204417. Bibcode:2018PLoSO..1304417C. doi:10.1371/journal.pone.0204417. ISSN 1932-6203. PMC 6209148. PMID 30379944.
Shringarpure, Suyash; Bustamante, Carlos (November 2015). "Privacy Risks from Genomic Data-Sharing Beacons". American Journal of Human Genetics. 97 (5): 631–646. doi:10.1016/j.ajhg.2015.09.010. ISSN 0002-9297. PMC 4667107. PMID 26522470.
Office for Civil Rights (OCR) (2012-09-07). "Methods for De-identification of PHI". HHS.gov. Retrieved 2020-10-31.
"Privacy in Genomics". Genome.gov. 2021-09-12.
De Cristofaro, Emiliano (2012-10-17). "Whole Genome Sequencing: Innovation Dream or Privacy Nightmare?". arXiv:1210.4820 [cs.CR].
Curtis, Caitlin; Hereward, James (December 4, 2017). "It's time to talk about who can access your digital genomic data". The Conversation. Retrieved May 21, 2018.
Ram, Natalie (2015). "DNA by the Entirety". Columbia Law Review. 115: 923.
Murphy, Heather (2019-04-25). "Sooner or Later Your Cousin's DNA Is Going to Solve a Murder". The New York Times. ISSN 0362-4331. Retrieved 2020-05-19.
Hill, Kashmir; Murphy, Heather (2019-11-05). "Your DNA Profile is Private? A Florida Judge Just Said Otherwise". The New York Times. ISSN 0362-4331. Retrieved 2020-05-19.
Moore v. Regents of the Univ. of Cal., 793 P.2d 479 (Cal. 1990).
Greenberg v. Miami Children’s Hosp. Research Inst., Inc., 264 F.Supp.2d 1064 (S.D. Fla. 2003).
"ASHG". ASHG. 2015-07-02. Retrieved 2020-11-07.
"Privacy and Progress in Whole Genome Sequencing". Presidential Commission for the Study of Bioethical Issues. Archived at https://web.archive.org/web/20161122235132/http://bioethics.gov/node/764 on 22 November 2016. Retrieved 30 November 2016.
Check Hayden, Erika (2013). "Privacy loophole found in genetic databases". Nature. doi:10.1038/nature.2013.12237. S2CID 211729032.
Gutmann, Amy; Wagner, James W. (2013-05-01). "Found Your DNA on the Web: Reconciling Privacy and Progress". Hastings Center Report. 43 (3): 15–18. doi:10.1002/hast.162. PMID 23650063.
"ASHG supports Genetic Privacy Provisions in 21st Century Cures Act". EurekAlert!. 2 January 2017.
"Congress passes 21st Century Cures Act with billions for new research, treatments". CBS News. 8 December 2016. Retrieved 2 January 2017.
"Congress acts to protect the most personal data – genetic information". Pine Bluffs Post. 2 January 2017.
"Privacy in Genomics". National Human Genome Research Institute (NHGRI). 29 December 2016.
https://www.govinfo.gov/content/pkg/CFR-2019-title45-vol2/pdf/CFR-2019-title45-vol2-sec160-103.pdf 45 CFR § 160.103
https://www.hhs.gov/sites/default/files/ocr/privacy/hipaa/understanding/consumers/consumer_rights.pdf "Your Health Information Privacy Rights"
"Table of State Statutes Related to Genomics". National Human Genome Research Institute. January 23, 2020. Retrieved May 18, 2020.
Miller, Amalia; Tucker, Catherine. "Privacy Protection, Personalized Medicine and Genetic Testing". Federal Trade Commission. May 18, 2020.
"Ariz. Rev. Stat. Ann. § 12-2802(A) (2019)". Arizona State Legislature. May 18, 2020.
"Ariz. Rev. Stat. Ann. § 12-2803 (2019)". Arizona State Legislature. May 18, 2020.
"Arizona Senate Bill 1309". LegiScan. May 18, 2020.
"Ariz. Rev. Stat. Ann. § 20-448(F) (2019)". Arizona State Legislature. May 18, 2020.
"Ariz. Rev. Stat. Ann. § 41-1463(B)(3) (2019)". Arizona State Legislature. May 18, 2020.
"Confidentiality of records of genetic tests". NY State Senate. 2021-01-09. Retrieved 2021-03-19.
https://leginfo.legislature.ca.gov/faces/codes_displayText.xhtml?lawCode=GOV&division=3.&title=2.&part=2.8.&chapter=6.&article=1. Cal. Gov't Code § 12940(a)
https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml?lawCode=GOV&sectionNum=11135 Cal. Gov't Code § 11135
http://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=201920200SB980 Cal. 2020 AB 2301
"Florida becomes first state to enact DNA privacy law, blocking insurers from genetic data". Washington Examiner. 2020-07-02. Retrieved 2020-07-06.
"Miss. Code Ann. 71-15-3 (2018)". Justia. May 18, 2020.
https://www.govinfo.gov/content/pkg/PLAW-110publ233/pdf/PLAW-110publ233.pdf Genetic Information Nondiscrimination Act of 2008
153 Cong. Rec. H4083 (Apr. 25, 2007) (statement of Rep. Miller).
"IBM100 - Pioneering Genetic Privacy". 7 March 2012. Retrieved 2 January 2017.
Erlich, Yaniv; Narayanan, Arvind (2013). "Routes for breaching and protecting genetic privacy". arXiv:1310.3197 [q-bio.GN].
Bonomi, Luca; Huang, Yingxiang; Ohno-Machado, Lucila (July 2020). "Privacy challenges and research opportunities for genomic data sharing". Nature Genetics. 52 (7): 646–654. doi:10.1038/s41588-020-0651-0. ISSN 1061-4036. PMC 7761157. PMID 32601475.
Ayday, Erman; Humbert, Mathias (2017). "Inference Attacks against Kin Genomic Privacy". IEEE Security & Privacy. 15 (5): 29–37. doi:10.1109/MSP.2017.3681052. hdl:11693/37113. ISSN 1540-7993. S2CID 7357416.
Wan, Zhiyu; Hazel, James W.; Clayton, Ellen Wright; Vorobeychik, Yevgeniy; Kantarcioglu, Murat; Malin, Bradley A. (2022-03-04). "Sociotechnical safeguards for genomic data privacy". Nature Reviews Genetics. 23 (7): 429–445. doi:10.1038/s41576-022-00455-y. ISSN 1471-0064. PMC 8896074. PMID 35246669.
Fiume, Marc; Cupak, Miroslav; Keenan, Stephen; Rambla, Jordi; de la Torre, Sabela; Dyke, Stephanie O. M.; Brookes, Anthony J.; Carey, Knox; Lloyd, David; Goodhand, Peter; Haeussler, Maximilian (March 2019). "Federated discovery and sharing of genomic data using Beacons". Nature Biotechnology. 37 (3): 220–224. doi:10.1038/s41587-019-0046-x. ISSN 1546-1696. PMC 6728157. PMID 30833764.
Yelmen, Burak; Decelle, Aurélien; Ongaro, Linda; Marnetto, Davide; Tallec, Corentin; Montinaro, Francesco; Furtlehner, Cyril; Pagani, Luca; Jay, Flora (2021-02-04). "Creating artificial human genomes using generative neural networks". PLOS Genetics. 17 (2): e1009303. doi:10.1371/journal.pgen.1009303. ISSN 1553-7404. PMC 7861435. PMID 33539374.