Viral Risk Assessment Through Deep Machine Learning of Genetic Testing


Daniel A. Brue, PhD


In this document, we outline the goals and purpose of General Genomics and its methodology for providing metrics of viral disease propagation and of individual susceptibility and response. We have developed a process to quantitatively define an individual’s risk during the COVID-19 pandemic, though the methods apply equally to other diseases. In this way, we are able to provide insights into disease propagation, population susceptibility, and personal risk of infection. With this information, we will help businesses, governments, and individuals make better choices regarding public and personal safety.



General Genomics LLC is an endeavor to answer some of the questions regarding virus susceptibility, spread, and individual response to illness. The COVID-19 global pandemic is ongoing and will likely continue for several more months. This provides both a strong motivation and an unprecedented opportunity to study viral dispersion and the human response to a specific viral infection. At no time in history have we had more information with which to work. The most common point of comparison is the influenza outbreak of 1918. Today, modern technology provides a far better understanding of how COVID-19 has spread across the world and far more accurate medical tools for detecting and treating the disease.

General Genomics has developed a process for using available data to provide risk assessments. The result is called a Risk Under Normalcy (RUN) score.

The RUN score is a metric that gives a quantitative measure of an individual’s susceptibility to disease. It combines multiple factors, including tests for genetic markers that may make a person more or less prone to infection. It has been shown that genetic predispositions exist that affect one’s susceptibility and resilience to COVID-19 as well as other diseases [2].
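As an illustration only, a score that combines several factors into a single number could be sketched as follows. The factor names, the weights, the baseline, and the logistic form are all assumptions made for this example; they are not the actual RUN model, which is not specified here.

```python
import math
from typing import Mapping

# Hypothetical factor weights (log-odds contributions). These values are
# invented for illustration and do not come from General Genomics.
WEIGHTS = {
    "risk_marker_present": 0.8,
    "age_over_65": 1.1,
    "smoker": 0.5,
    "high_public_exposure": 0.7,
}
BIAS = -2.0  # assumed baseline log-odds when no factor is present


def run_score(factors: Mapping[str, float]) -> float:
    """Combine factor values into a score between 0 and 100.

    Each factor value (0.0 = absent, 1.0 = present) is multiplied by its
    weight, the results are summed with the baseline, and a logistic
    function maps the sum to a probability-like value.
    """
    log_odds = BIAS + sum(
        weight * factors.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return 100.0 * probability
```

Under these assumed weights, adding any factor raises the score, so `run_score({"smoker": 1.0})` is higher than `run_score({})`.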

Data Collection 

General Genomics has developed a survey and an application interface that allows individuals or medical institutions to provide a person’s genetic information and demographic factors. Using either the app available on Google Play or the Apple Store, or through, anyone can upload their genetic information and any other factors they might supply, and they will receive a RUN score with a report of their most significant risk factors. 

Data Management 

All data collected will be stored and managed in compliance with HIPAA and the user license agreements. Only aggregate data will be shared or used for analysis, and individual identities will not be used for tracking.
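One way to picture aggregate-only reporting is the sketch below. It groups records by a shared attribute and withholds any group that is too small to report safely. The record fields (`region`, `run_score`) and the minimum group size of 5 are assumptions for this example. Small-group suppression is only one common safeguard and is not, on its own, the data management process described above.

```python
from collections import defaultdict
from statistics import mean

MIN_GROUP_SIZE = 5  # assumed threshold; not taken from the source


def aggregate_scores(records, group_key, min_group_size=MIN_GROUP_SIZE):
    """Return per-group counts and mean scores, omitting small groups.

    `records` is an iterable of dicts that each contain `group_key` and a
    numeric "run_score" entry (hypothetical field names). No individual
    record or identifier appears in the output.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record["run_score"])
    return {
        key: {"count": len(scores), "mean_run_score": mean(scores)}
        for key, scores in groups.items()
        if len(scores) >= min_group_size
    }
```

Groups below the threshold are dropped entirely rather than reported with a small count, since a small count can itself point to an individual.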

Several methodologies are available for managing COVID-19 data [3]. Initially, a simple queryable database will be sufficient, but we will migrate to a more reliable cloud system as the need arises.



Some work has been published on tracking the propagation of COVID-19 based on statistical inference [4, 5], including work by Johns Hopkins [6] and the CDC [7]. Research has also shown that certain genetic markers are correlated with COVID-19. With sufficient data, we will be able to confirm or refine these conclusions, and we will publish our findings.

Many options already exist for machine learning and artificial intelligence (ML/AI) codes, and the methodology deployed by General Genomics will be chosen based on available data types and the specific questions to be answered. The chosen methods will be compared, weighted, and tuned based on empirical field data. 
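Comparing and weighting candidate methods against field data could look roughly like the sketch below. It scores each candidate on labelled held-out samples and turns the accuracies into normalized weights. The candidate models, the sample format, and the choice of accuracy-proportional weighting are assumptions for this example, not the method General Genomics has selected.

```python
def accuracy(model, samples):
    """Fraction of (features, label) samples the model predicts correctly."""
    correct = sum(1 for features, label in samples if model(features) == label)
    return correct / len(samples)


def ensemble_weights(models, samples):
    """Weight each candidate model in proportion to its held-out accuracy.

    `models` maps a name to a callable that takes features and returns a
    predicted label. Falls back to equal weights if every model scores zero.
    """
    scores = {name: accuracy(model, samples) for name, model in models.items()}
    total = sum(scores.values())
    if total == 0:
        return {name: 1.0 / len(models) for name in models}
    return {name: score / total for name, score in scores.items()}


# Toy held-out data: the feature is a single 0/1 value and the label equals it.
held_out = [(1, 1), (0, 0), (1, 1), (0, 0)]
candidates = {
    "threshold": lambda x: x,   # predicts the feature value itself
    "always_one": lambda x: 1,  # naive baseline
}
weights = ensemble_weights(candidates, held_out)
```

In this toy case the threshold model is always right and the baseline is right half the time, so the threshold model receives twice the baseline's weight.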


Initial analysis shows which factors are most significant in answering the following questions: 
1. Which genetic markers are most significant to an individual’s risk of contracting COVID-19?
2. Which environmental factors, including workplace exposure, family interaction, and general exposure to the public, are most influential in a person’s general risk?
3. Which independent factors, such as smoking and prescription medication, should be considered in diagnosis and treatment plans?
4. Can we increase the accuracy of models tracking and predicting the spread of the pandemic by having a much more accurate understanding of human response and resilience? 

By identifying significant factors, the resultant RUN score becomes far more than just a number: it allows a person to weigh their own risks and take measures to mitigate them. For example, we can provide a list of the most significant factors adding to someone’s risk, thereby allowing the person to make better-informed choices for self-protection and care.
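A list of the most significant factors for one person could be produced along the lines of the sketch below. It assumes a simple additive model in which each factor contributes its weight times its value; the factor names and weights are hypothetical and stand in for whatever the fitted model would supply.

```python
def top_risk_factors(weights, factors, n=3):
    """Return up to `n` (name, contribution) pairs, largest contribution first.

    A contribution is weight * value under an assumed additive model.
    Factors that add nothing to the person's risk are left out.
    """
    contributions = {
        name: weight * factors.get(name, 0.0) for name, weight in weights.items()
    }
    positive = [(name, value) for name, value in contributions.items() if value > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)[:n]


# Hypothetical weights and one person's factor values (1.0 = present).
example_weights = {"smoker": 0.5, "age_over_65": 1.0, "high_public_exposure": 0.2}
example_person = {"smoker": 1.0, "age_over_65": 1.0, "high_public_exposure": 0.0}
report = top_risk_factors(example_weights, example_person)
```

Here `report` lists age first and smoking second; public exposure is omitted because it does not apply to this person.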

Expected Results: 

1. Inform an individual and their medical care provider of the individual’s risk and primary risk factors.
2. Inform businesses that track RUN scores of their aggregate risk, allowing them to set policy for high- or low-risk customers, especially in situations of high population density and personal interaction.
3. Provide data on similar cases and on which treatments have been most effective in combating the disease.
4. Provide an assessment of how the disease spreads, including, but not limited to, factors such as social distancing and isolation.

Based on results, the model will be continuously updated and refined. As new factors present themselves, we will be able to develop improved and increasingly accurate products and services to better inform the population, businesses, and the scientific and medical communities.
