Viral Risk Assessment Through Deep Machine Learning of Genetic Testing

Daniel A. Brue, PhD


In this document, we outline the goals and purpose of General Genomics and its methodology for providing metrics of viral disease propagation and of individual susceptibility and response. We have developed a process to quantitatively define an individual’s risk during the COVID-19 pandemic, though the methods apply equally to other diseases. In this way, we are able to provide insights into disease propagation, population susceptibility, and personal risk of infection. With this information, we will help businesses, governments, and individuals make better choices regarding public and personal safety.



General Genomics LLC is an endeavor to answer some of the questions regarding virus susceptibility, spread, and individual response to illnesses. The COVID-19 global pandemic is still under way and will likely continue for several more months. This provides both a strong motivation and an unprecedented opportunity to study viral dispersion and human reaction to specific viral infections. At no time in history have we had more information with which to work. Indeed, the most common comparison we use is the influenza outbreak of 1918. Today, modern technology provides a far better understanding of how COVID-19 has spread across the world and far more accurate medical tools for detecting and treating the disease.

General Genomics has developed a process for using available data to provide risk assessments. The result of this process is called a Risk Under Normalcy (RUN) score.

The RUN score is a metric that gives a quantitative measure of an individual’s risk of disease susceptibility. The RUN score combines multiple factors, including testing for genetic markers that may make one more or less prone to infection. It has been shown [2] that genetic predispositions exist that affect one’s susceptibility and resilience to COVID-19 as well as other diseases.
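The way several factors might be combined into a single score can be sketched as follows. This is only an illustrative sketch, not the General Genomics method: the logistic form, the factor names, the weights, the bias, and the 0 to 100 scale are all assumptions made for the example.

```python
import math

# Hypothetical factor weights, invented purely for illustration.
# Positive values raise the score, negative values lower it.
EXAMPLE_WEIGHTS = {
    "marker_a_present": 0.8,
    "marker_b_present": -0.5,
    "smoker": 0.6,
    "high_public_exposure": 0.9,
}
EXAMPLE_BIAS = -1.0  # hypothetical baseline term


def run_score(factors, weights=EXAMPLE_WEIGHTS, bias=EXAMPLE_BIAS):
    """Map 0/1 factor values to a score between 0 and 100.

    Factors missing from `factors` are treated as absent (0).
    """
    z = bias + sum(w * float(factors.get(name, 0)) for name, w in weights.items())
    # The logistic function squashes the weighted sum into (0, 1).
    return 100.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    person = {"marker_a_present": 1, "smoker": 1}
    print(f"Illustrative score: {run_score(person):.1f}")
```

In practice the weights would have to be learned from data rather than set by hand; the fixed values here only show how genetic and non-genetic factors could enter the same calculation.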

Data Collection 

General Genomics has developed a survey and an application interface that allows individuals or medical institutions to provide a person’s genetic information and demographic factors. Using either the app available on Google Play or the Apple Store, or through, anyone can upload their genetic information and any other factors they might supply, and they will receive a RUN score with a report of their most significant risk factors. 

Data Management 

All data collected will be stored and managed according to HIPAA compliance and user license agreements. Only data in aggregate will be shared or used for analysis, and individual identification will not be used for tracking. 
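One common way to share only aggregate data is to report group-level summaries and suppress groups that are too small. The sketch below assumes hypothetical record fields (`region`, `run_score`) and a hypothetical minimum group size; it illustrates the idea only and is not a statement of what HIPAA requires or of how General Genomics stores data.

```python
from collections import defaultdict


def aggregate_run_scores(records, group_key="region", min_group_size=5):
    """Return per-group counts and mean scores, omitting small groups.

    `records` is a list of dicts with hypothetical keys `group_key`
    and "run_score". No individual record appears in the output.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record["run_score"])

    summary = {}
    for group, scores in groups.items():
        # Suppress groups small enough that a mean could hint at individuals.
        if len(scores) >= min_group_size:
            summary[group] = {
                "count": len(scores),
                "mean_run_score": sum(scores) / len(scores),
            }
    return summary
```

The threshold of five is an arbitrary placeholder; an appropriate value would depend on the applicable privacy policy.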

Several methodologies are available for managing COVID-19 data [3]. Initially, a simple queried database will be sufficient, but we will migrate to a more reliable cloud system as the need arises.



Some work has been published on tracking the propagation of COVID-19 based on statistical inference [4, 5], including work by Johns Hopkins [6] and the CDC [7]. Research has also shown that certain genetic markers are correlated with COVID-19. With sufficient data, we will be able to confirm and/or refine these conclusions and will publish our findings.

Many options already exist for machine learning and artificial intelligence (ML/AI) codes, and the methodology deployed by General Genomics will be chosen based on available data types and the specific questions to be answered. The chosen methods will be compared, weighted, and tuned based on empirical field data. 
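As a rough sketch of what comparing and weighting candidate methods on empirical data could look like, the example below scores each candidate with the Brier score (mean squared error between a predicted probability and the observed 0/1 outcome) and weights candidates by inverse error. The choice of metric, the inverse-error weighting, and the function names are assumptions for illustration; the source does not specify which methods or metrics will be used.

```python
def brier_score(predict, samples):
    """Mean squared error between predicted probability and 0/1 outcome.

    `predict` maps an input to a probability; `samples` is a list of
    (input, outcome) pairs with outcome equal to 0 or 1.
    """
    return sum((predict(x) - y) ** 2 for x, y in samples) / len(samples)


def ensemble_weights(models, samples):
    """Weight each model by its inverse Brier score, normalized to sum to 1."""
    eps = 1e-9  # avoids division by zero for a perfect model
    inverse = {
        name: 1.0 / (brier_score(fn, samples) + eps)
        for name, fn in models.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


if __name__ == "__main__":
    # Toy validation data: the input is a single 0/1 feature.
    validation = [(0, 0), (1, 1), (0, 0), (1, 1)]
    candidates = {
        "uninformed": lambda x: 0.5,
        "feature_based": lambda x: 0.9 if x else 0.1,
    }
    print(ensemble_weights(candidates, validation))
```

Lower Brier scores indicate better-calibrated predictions, so the better candidate receives the larger weight.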


Initial analysis shows which factors are most significant in answering the following questions: 
1. Which genetic markers are most significant to an individual’s risk of contracting COVID-19?
2. Which environmental factors, including workplace exposure, family interaction, and general exposure to the public, are most influential in a person’s general risk?
3. What independent factors, such as smoking, prescription medication, etc. should be considered in diagnosis and treatment plans?
4. Can we increase the accuracy of models tracking and predicting the spread of the pandemic by having a much more accurate understanding of human response and resilience? 

By identifying significant factors, the resultant RUN score becomes far more than just a number: it allows a person to weigh their own risks and take measures to mitigate them. For example, we can provide a list of the most significant factors adding to someone’s risk, thereby allowing the person to make better informed choices for self-protection and care.
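Listing the most significant factors for one person could, under a simple weighted model, amount to ranking each factor’s contribution. The sketch below assumes hypothetical factor names and weights and a linear contribution (weight times factor value); it is not the actual RUN score computation.

```python
def top_risk_factors(factors, weights, n=3):
    """Return up to `n` (name, contribution) pairs that add most to risk.

    `factors` maps hypothetical factor names to values (e.g. 0 or 1);
    `weights` maps the same names to hypothetical model weights.
    Only factors with a positive contribution are reported.
    """
    contributions = {
        name: weights[name] * float(value)
        for name, value in factors.items()
        if name in weights
    }
    positive = [(name, c) for name, c in contributions.items() if c > 0]
    # Largest contribution first.
    positive.sort(key=lambda item: item[1], reverse=True)
    return positive[:n]


if __name__ == "__main__":
    example_weights = {"smoker": 0.6, "high_public_exposure": 0.9, "marker_b_present": -0.5}
    person = {"smoker": 1, "high_public_exposure": 1, "marker_b_present": 1}
    for name, contribution in top_risk_factors(person, example_weights):
        print(name, contribution)
```

Factors with negative contributions are left out of the list because they lower the score rather than add to it.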

Expected Results: 

1. Provide an individual and their medical care provider with information on the individual’s risk and primary risk factors.
2. Inform businesses that track RUN scores of their aggregate risk, allowing them to set policy for high- or low-risk customers, especially in situations of high population density and personal interaction.
3. Provide data on similar cases and which treatments have been most effective in combating the disease.
4. Provide an assessment of how the disease spreads, including, but not limited to, factors such as social distancing and isolation.

Based on results, the model will be continuously updated and refined. As new factors present themselves, we will be able to develop improved products to better inform the population, business, and the scientific medical community of the results. We will also be able to provide increasingly accurate products and services.

Edmond Sun: Investigator unraveling mystery of COVID-19 genetic markers and virus susceptibility

Local genome researcher Daniel Brue investigates why some people are more susceptible to COVID-19 while others are not. As an inventor and the founder of General Genomics, he has established a group of people in an attempt to find more information and correlations between genetic markers and virus susceptibility of COVID-19.

The findings could potentially reveal effective methods of treatment against the virus.

“What we do know now is that there is a significant part of the population asymptomatic to COVID-19,” said Brue, Ph.D. “So they are carriers, but they don’t know that they’re ill.”

Brue is part of a group whose focus is to improve the prevention and treatment of illnesses by helping people understand what they may be susceptible to, based on their genetic information.

Brue said a large population of participants in companies such as 23andMe and have been receiving reports about their genetic information.

“What I would like to track is how a disease affects people of different genetic dispositions,” Brue said.

A clearer picture of genetic markers linked to disease is forming from incoming information and volunteer participants. Brue correlates the effectiveness of treatments participants have received based on their genetic bands.

COVID-19 is becoming one of the best documented cases of a pandemic, and it is Brue’s hope that the group’s findings will apply to a bigger picture, triggering further scientific research of other disease processes as well.

“What I would want people to know is we have greater capacity to understand what is happening than we have ever had before,” Brue said. “If we didn’t take advantage of learning as much as we possibly can, we would be horribly remiss in not using data that we have on hand to try to improve people’s health care, and understand on the onset, what is the most effective treatment for those who are ill.”

The three inventors of the new program combine expertise in several disciplines. Ultimately they want to save lives.

Brue has an extensive background in physics, artificial intelligence/machine learning, and medical image processing. He earned his doctorate at the University of Oklahoma. Brue said he understands how sensors work and how to get the best information from them.

“What I know very well is how to extract information from measuring apparatuses that we’re using,” he said. 

Warren Gieck, of Calgary, Alberta, is an entrepreneur and industrial engineer, with experience in software development, artificial intelligence, robotics, mechatronics, and product development. 

“Our motivation is the suffering of our friends and society around us. And just as importantly, we are dads whose kids just want to go back to school,” Gieck said. “With extensive scientific and engineering expertise, we have built solutions using similar technologies for industrial applications, and we saw how we could help solve the uncertainty around the Covid-19 virus.

“Ultimately our goal is to allow people who are low risk to get back to their lives.”

A.J. Rosenthal of Midland, Texas, has a background in multi-disciplinary engineering solutions, nuclear engineering technology, and finance. Kyrie Cameron, attorney at Patterson + Sheridan, has assisted these inventors in filing their patent applications.   

“I want to figure out a way that we can better identify what people should be looking for in their own health care,” Brue said.

The goal is to provide people a better understanding of how to take care of their personal health. By understanding their individual risks, people would be able to give care providers a better understanding of how they should be treated should they be in poor health, Brue said. As a result, physicians would have more concrete information to work with in patient care.

Brue said one of the worst aspects of what anyone goes through when they become sick is their uncertainty. A lot of people are concerned and scared of COVID-19.

“I have lived through enough personal losses to see how much the damage is on not just the person who’s ill, but their entire family around them,” Brue said.

His goal is to reduce anxiety by educating people about disease processes.

“It’s personally important to me,” Brue said.

Houston Chronicle: Midlander creates algorithm to predict likelihood of infection

Determination would be made using person’s genetic make-up, medical history

By Caitlin Randle, Reporter-Telegram Published 9:11 pm CDT, Thursday, April 16, 2020

A Midland data scientist and his two partners have created an algorithm that uses a person’s genetic markers and medical history to predict someone’s likelihood of becoming infected with the coronavirus and suffering complications from it.

Midlander A.J. Rosenthal and his partners, Dan Brue of Oklahoma and Warren Gieck of Alberta, Canada, filed patents this week related to the algorithm.

Rosenthal said it could use a person’s genetic make-up in combination with various factors, such as their medical history and types of exposure they’ve had (i.e. a miner exposed to coal dust), to determine someone’s risk factor and assign them a correlating score.

“We’re describing potentially where a person would fall, give them a score, and that score allows them to either start going back to the workplace because they’re not going to succumb to the disease, or they won’t even be susceptible to it,” he said.

The algorithm would use the medical histories of those who have been hospitalized with COVID-19 to determine what markers could put a person at risk, Rosenthal said. He described inputting the data from past patients as “training the algorithm.”

The goal of this project is for the information to be widely accessible, Rosenthal said. He said the algorithm could potentially be on a website where a person could enter their medical information after signing a HIPAA privacy release.

“What we’re trying to do is if people want this – and we’re hoping they do – is to make it easier for them to feel comfortable and safe going back out,” he said. “Because they’ve now been locked in their houses for weeks … they don’t know if they’re going to get sick. They don’t know if they’re even susceptible to it.”

The algorithm could also be applied to other viruses and diseases, Rosenthal said, but the trio has chosen to focus on COVID-19 because there’s an immediate need.

The project’s success is contingent on partnerships with other entities – primarily, with medical providers who would give access to the medical histories of past COVID-19 patients. HIPAA laws prevent that data from being publicly available.

Rosenthal pointed to studies linking ACE2 receptors in the lungs to COVID-19 as evidence that a person’s DNA could be used to predict their risk of being infected. Some studies have found the coronavirus uses these receptors to infiltrate cells in the body.

“When the coronavirus attaches, it has a certain type of envelope that it attaches to,” Rosenthal said. “Your receptor on your lung, a lot of the coronavirus sticks to it … and from there, it propagates an infection.”

Some health entities worldwide have advised against using ibuprofen to treat COVID-19 because it’s thought to increase the number of ACE2 receptors in the body, but there’s no clear consensus among the scientific community about whether more of these receptors create a higher risk of contracting or having complications from the coronavirus.

Rosenthal said the algorithm could determine if certain combinations of medications and genetics were frequently present in those infected with the virus and serve as a guide to those with similar DNA who are also on those medications.

A former multi-disciplinary engineer in the U.S. Navy and at General Electric, Rosenthal currently works for an oil and gas company in Midland. He said he and his partners, who met working at GE, were inspired to take up this enterprise by their kids, who want to “go back to school and go to the mall and play baseball.”

“We’re just three dads. We just want our kids to have a normal life again,” Rosenthal said.

“Maybe these three dads can help the world,” he said. “The only thing we’ve got left to lose are our jobs or the economy.”