Viral Risk Assessment Through Deep Machine Learning of Genetic Testing

Daniel A. Brue, PhD

Abstract 

In this document, we outline the goals and purpose of General Genomics and its methodology for providing metrics of viral disease propagation and of individual susceptibility and response. We have developed a process to quantitatively define an individual’s risk during the COVID-19 pandemic, though the methods apply equally to other diseases. In this way, we are able to provide insights into disease propagation, population susceptibility, and personal risk of infection. With this information, we will help businesses, governments, and individuals make better choices regarding public and personal safety.

______________________________________
1 https://curo46.com 

Introduction 

General Genomics LLC is an endeavor to answer some of the questions regarding virus susceptibility, spread, and individual response to illnesses. The COVID-19 global pandemic is still in effect and will likely continue for several more months. This provides both a strong motivation and an unprecedented opportunity to study virus dispersion and human reaction to a specific viral infection. At no time in history have we had more information with which to work. Indeed, the most common comparison we use is the influenza outbreak of 1918. Today, modern technology provides a far better understanding of how COVID-19 has spread across the world and far more accurate medical tools for detecting and treating the disease.

General Genomics has developed a process for using available data to provide risk assessments. The result of this process is called a Risk Under Normalcy (RUN) score.

The RUN score is a metric that gives a quantitative measure of an individual’s disease susceptibility. It combines multiple factors, including tests for genetic markers that may make a person more or less prone to infection. It has been shown that genetic predispositions exist that affect one’s susceptibility and resilience to COVID-19 as well as other diseases [2].
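As an illustration only, a multivariate score of this kind could be formed by weighting each factor and mapping the weighted sum onto a bounded scale. The sketch below assumes a logistic form; the factor names, weights, and intercept are hypothetical placeholders and are not the model used by General Genomics.

```python
import math
from typing import Mapping

# Hypothetical weights and intercept, chosen only to illustrate the idea.
# They are NOT fitted values and NOT the actual RUN model.
HYPOTHETICAL_WEIGHTS = {
    "risk_marker_present": 0.8,
    "age_over_65": 1.1,
    "smoker": 0.5,
}
HYPOTHETICAL_INTERCEPT = -2.0


def run_score(factors: Mapping[str, float]) -> float:
    """Map factor values to a score between 0 and 100 via a logistic function.

    Factors that have no entry in the weight table are ignored.
    """
    z = HYPOTHETICAL_INTERCEPT + sum(
        HYPOTHETICAL_WEIGHTS[name] * value
        for name, value in factors.items()
        if name in HYPOTHETICAL_WEIGHTS
    )
    return 100.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    baseline = run_score({})
    with_factors = run_score({"risk_marker_present": 1, "smoker": 1})
    print(f"baseline: {baseline:.1f}, with two factors: {with_factors:.1f}")
```

In a form like this, each additional positive-weight factor raises the score, while the logistic mapping keeps the result within a fixed range.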

Data Collection 

General Genomics has developed a survey and an application interface that allow individuals or medical institutions to provide a person’s genetic information and demographic factors. Using either the app available on Google Play or the Apple Store, or through survey.curo46.com, anyone can upload their genetic information and any other factors they wish to supply, and they will receive a RUN score with a report of their most significant risk factors.

Data Management 

All data collected will be stored and managed in compliance with HIPAA and the user license agreements. Only data in aggregate will be shared or used for analysis, and individual identification will not be used for tracking.
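A minimal sketch of aggregate-only reporting follows. It assumes records carry hypothetical `region` and `score` fields, and it drops groups smaller than an assumed minimum size so that a summary cannot single out one person. The field names and the threshold are illustrative assumptions, and a step like this does not by itself establish HIPAA compliance.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical minimum group size; a real threshold would be a policy decision.
MIN_GROUP_SIZE = 10


def aggregate_scores(records, min_group_size=MIN_GROUP_SIZE):
    """Return the count and mean score per region.

    Individual identifiers are never copied into the output, and regions
    with fewer than min_group_size records are suppressed entirely.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["region"]].append(record["score"])
    return {
        region: {"count": len(scores), "mean_score": mean(scores)}
        for region, scores in groups.items()
        if len(scores) >= min_group_size
    }
```

Only the grouped summaries leave this function; the per-person rows stay with the data store.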

Several methodologies are available for managing COVID-19 data [3]. Initially, a simple queried database will be sufficient, but we will migrate to a more reliable cloud system as the need arises.

______________________________________
2 https://www.nature.com/articles/s41467-020-16256-y
3 https://arxiv.org/abs/2005.05036 

Analytics

Some work has been published on tracking the propagation of COVID-19 based on statistical inference [4, 5], including work from Johns Hopkins [6] and the CDC [7]. Research has also shown that certain genetic markers are correlated with COVID-19. With sufficient data, we will be able to confirm and/or refine these conclusions and will publish our findings.

Many options already exist for machine learning and artificial intelligence (ML/AI) codes, and the methodology deployed by General Genomics will be chosen based on available data types and the specific questions to be answered. The chosen methods will be compared, weighted, and tuned based on empirical field data. 
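One way such a comparison could be carried out, sketched here under assumptions, is to score each candidate model's predicted probabilities against observed outcomes on held-out data. The example uses the Brier score (mean squared error between predicted probability and the 0/1 outcome); the choice of metric and the model names are illustrative and are not stated in this document.

```python
def brier_score(predicted, observed):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better; 0.0 means every prediction was exactly right.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)


def rank_models(predictions_by_model, observed):
    """Return (model_name, brier_score) pairs sorted from best to worst."""
    scores = {
        name: brier_score(predictions, observed)
        for name, predictions in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

A ranking like this gives a common yardstick for tuning: a model that always predicts 0.5 scores 0.25, so any useful candidate should fall below that value.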

Questions: 

Initial analysis shows which factors are most significant in answering the following questions: 
1. Which genetic markers are most significant to an individual’s risk of contracting COVID-19?
2. Which environmental factors, including workplace exposure, family interaction, and general exposure to the public, are most influential in a person’s general risk?
3. What independent factors, such as smoking, prescription medication, etc. should be considered in diagnosis and treatment plans?
4. Can we increase the accuracy of models tracking and predicting the spread of the pandemic by having a much more accurate understanding of human response and resilience? 

By identifying significant factors, the RUN score becomes far more than just a number: it allows a person to weigh their own risks and take measures to mitigate them. For example, we can provide a list of the most significant factors adding to someone’s risk, thereby allowing the person to make better-informed choices for self-protection and care.
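The list of most significant factors could be produced along the following lines. This sketch assumes each factor contributes a weight multiplied by its value (a linear contribution, which is an assumption about the model) and reports only the factors that increase risk. All names and weights shown in the usage are hypothetical.

```python
def top_risk_factors(weights, factors, limit=3):
    """Return up to `limit` (factor, contribution) pairs that add most to risk.

    A contribution is weight * value. Factors with no known weight, and
    factors whose contribution is zero or protective, are left out.
    """
    contributions = {
        name: weights[name] * value
        for name, value in factors.items()
        if name in weights
    }
    increasing = [(name, c) for name, c in contributions.items() if c > 0]
    return sorted(increasing, key=lambda item: item[1], reverse=True)[:limit]


if __name__ == "__main__":
    # Hypothetical weights and answers, for illustration only.
    weights = {"smoker": 0.5, "age_over_65": 1.1, "regular_exercise": -0.3}
    answers = {"smoker": 1, "age_over_65": 1, "regular_exercise": 1}
    for name, contribution in top_risk_factors(weights, answers, limit=2):
        print(f"{name}: +{contribution:.1f}")
```

Reporting contributions rather than only the total lets a person see which factors are within their control.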

Expected Results: 

1. Provide an individual and their medical care provider with information on the individual’s risk and primary risk factors.
2. Inform businesses that track RUN scores of their aggregate risk, allowing them to set policy for high- or low-risk customers, especially in situations of high population density and personal interaction.
3. Provide data on similar cases and which treatments have been most effective in combating the disease.
4. Provide an assessment of how the disease spreads, including, but not limited to, factors such as social distancing and isolation.

Based on results, the model will be continuously updated and refined. As new factors present themselves, we will be able to develop improved, increasingly accurate products and services to better inform the population, businesses, and the scientific and medical communities.

______________________________________
4 https://arxiv.org/abs/2005.05086
5 https://arxiv.org/abs/2005.04937
6 https://arxiv.org/abs/2005.05060
7 https://www.cdc.gov/coronavirus/2019-ncov/index.html

Paper: Do Your Genes Predispose You to COVID-19?

Individual differences in genetic makeup may explain our susceptibility to the new coronavirus and the severity of the disease it causes

By Loïc Mangin on April 30, 2020


Since the start of the COVID-19 pandemic several months ago, scientists have been puzzling over the different ways the disease manifests itself. They range from cases with no symptoms at all to severe ones that involve acute respiratory distress syndrome, which can be fatal. What accounts for this variability? Might the answer lie in our genes?

Coronaviruses have raised such questions for more than 15 years. In researching the 2003 outbreak of severe acute respiratory syndrome (SARS), Ralph Baric and his colleagues at the University of North Carolina at Chapel Hill identified a gene that, when silenced by a mutation, makes mice highly susceptible to SARS-CoV, the coronavirus that causes the disease. Called TICAM2, the gene codes for a protein that helps activate a family of receptors, called toll-like receptors (TLRs), that are involved in innate immunity, the first line of defense against pathogens.

Attention has now shifted to SARS-CoV-2, the new coronavirus that causes COVID-19. And TLRs have once again drawn researchers’ interest—this time to help explain the excess number of men who suffer from severe infections.

Men made up 73 percent of severe cases of COVID-19 in intensive care in France, according to a national survey published April 23. Behavioral and hormonal differences may be partially responsible. But genes may also factor into the mix. Unlike men, women have two X chromosomes and so carry double the copies of the gene TLR7, a key detector of viral activity that helps boost immunity.

The genetics of blood groups may offer some insight into whether you are liable to be infected with the virus. In late March Peng George Wang of the Southern University of Science and Technology in China and his colleagues released the results of a preprint study—not yet peer-reviewed—that compared the distribution of blood types among 2,173 COVID-19 patients in three hospitals in the Chinese cities of Wuhan and Shenzhen with that of uninfected people in the same areas. Blood type A appears to be associated with a higher risk of contracting the virus, whereas type O offers the most protection for reasons that have yet to be determined.

<See the link below for the rest of the article>

https://www.scientificamerican.com/article/do-your-genes-predispose-you-to-covid-19/?amp=true

Paper: Relationship between the ABO Blood Group and the COVID-19 Susceptibility

Abstract

The novel coronavirus disease-2019 (COVID-19) has been spreading around the world rapidly and declared as a pandemic by WHO. Here, we compared the ABO blood group distribution in 2,173 patients with COVID-19 confirmed by SARS-CoV-2 test from three hospitals in Wuhan and Shenzhen, China with that in normal people from the corresponding regions. The results showed that blood group A was associated with a higher risk for acquiring COVID-19 compared with non-A blood groups, whereas blood group O was associated with a lower risk for the infection compared with non-O blood groups. This is the first observation of an association between the ABO blood type and COVID-19. It should be emphasized, however, that this is an early study with limitations. It would be premature to use this study to guide clinical practice at this time, but it should encourage further investigation of the relationship between the ABO blood group and the COVID-19 susceptibility.

https://doi.org/10.1101/2020.03.11.20031096

Paper: Genome-wide association and HLA region fine-mapping studies identify susceptibility loci for multiple common infections

Abstract

Infectious diseases have a profound impact on our health and many studies suggest that host genetics play a major role in the pathogenesis of most of them. We perform 23 genome-wide association studies for common infections and infection-associated procedures, including chickenpox, shingles, cold sores, mononucleosis, mumps, hepatitis B, plantar warts, positive tuberculosis test results, strep throat, scarlet fever, pneumonia, bacterial meningitis, yeast infections, urinary tract infections, tonsillectomy, childhood ear infections, myringotomy, measles, hepatitis A, rheumatic fever, common colds, rubella and chronic sinus infection, in over 200,000 individuals of European ancestry. We detect 59 genome-wide significant (P < 5 × 10⁻⁸) associations in genes with key roles in immunity and embryonic development. We apply fine-mapping analysis to dissect associations in the human leukocyte antigen region, which suggests important roles of specific amino acid polymorphisms in the antigen-binding clefts. Our findings provide an important step toward dissecting the host genetic architecture of response to common infections. Susceptibility to infectious diseases is, among others, influenced by the genetic landscape of the host. Here, Tian and colleagues perform genome-wide association studies for 23 common infections and find 59 risk loci for 17 of these, both within the HLA region and non-HLA loci.

https://pubmed.ncbi.nlm.nih.gov/28928442/

Paper: Human genetic susceptibility to infectious disease

Abstract

Recent genome-wide studies have reported novel associations between common polymorphisms and susceptibility to many major infectious diseases in humans. In parallel, an increasing number of rare mutations underlying susceptibility to specific phenotypes of infectious disease have been described. Together, these developments have highlighted a key role for host genetic variation in determining the susceptibility to infectious disease. They have also provided insights into the genetic architecture of infectious disease susceptibility and identified immune molecules and pathways that are directly relevant to the human host defence.

https://pubmed.ncbi.nlm.nih.gov/22310894/