A year into the COVID-19 pandemic, mass vaccinations have begun to raise the tantalizing prospect of herd immunity that eventually curtails or halts the spread of SARS-CoV-2. But what if herd immunity is never fully achieved, or if the mutating virus gives rise to hyper-virulent variants that diminish the benefits of vaccination?

Those questions underscore the need for effective treatments for people who continue to fall ill with the coronavirus. While a few existing drugs show some benefit, there's a pressing need to find new therapeutics.

Led by The University of New Mexico's Tudor Oprea, MD, PhD, scientists have created a unique tool to help drug researchers quickly identify molecules capable of disarming the virus before it invades human cells or disabling it in the early stages of the infection.

In a paper published this week in Nature Machine Intelligence, the researchers introduced REDIAL-2020, an open-source online suite of computational models that will help scientists rapidly screen small molecules for their potential COVID-fighting properties.

"To some extent this replaces (laboratory) experiments," says Oprea, chief of the Translational Informatics Division in the UNM School of Medicine. "It narrows the field of what people need to focus on. That's why we placed it online for everyone to use."

Oprea's team at UNM and another group at the University of Texas at El Paso led by Suman Sirimulla, PhD, started work on the REDIAL-2020 tool last spring after scientists at the National Center for Advancing Translational Sciences (NCATS) released data from their own COVID drug repurposing studies.

"Becoming aware of this, I was like, 'Wait a minute, there's enough data here for us to build solid machine learning models,'" Oprea says. The NCATS laboratory assays gauged each molecule's ability to inhibit viral entry, infectivity and reproduction. One of them, the cytopathic effect assay, measures a molecule's ability to protect a cell from being killed by the virus.

Biomedical researchers tend to focus on the positive findings from their studies, but in this case, the NCATS scientists also reported which molecules had no virus-fighting effects. Including that negative data improves the accuracy of machine learning, Oprea says.

"The idea was that we identify molecules that fit the perfect profile," he says. "You want to find molecules that do all these things and don't do the things that we don't want them to do."

The coronavirus is a wily adversary, Oprea says. "I don't think there is a drug that will fit everything to a T." Instead, researchers will likely devise a multi-drug cocktail that attacks the virus on multiple fronts. "It goes back to the one-two punch," he says.
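The "perfect profile" idea, and the caveat that no single drug is likely to fit it exactly, can be illustrated with a minimal Python sketch. It is not REDIAL-2020's code: the assay names, the compound names and the predicted values are all hypothetical, chosen only to show how candidates might be checked against a desired profile and ranked by how close they come.

```python
from dataclasses import dataclass

# Hypothetical profile: True means the predicted effect is wanted,
# False means it is unwanted. These assay names are illustrative only.
DESIRED_PROFILE = {
    "blocks_viral_entry": True,
    "protects_cells_from_virus": True,
    "toxic_to_host_cells": False,
}


@dataclass
class Molecule:
    name: str
    predictions: dict  # assay name -> predicted active (True) or inactive (False)


def profile_score(molecule, profile=DESIRED_PROFILE):
    """Count how many assays in the profile the molecule satisfies."""
    return sum(
        molecule.predictions.get(assay) == wanted
        for assay, wanted in profile.items()
    )


def matches_profile(molecule, profile=DESIRED_PROFILE):
    """True only if every assay in the profile is satisfied."""
    return profile_score(molecule, profile) == len(profile)


# Toy candidates with made-up predictions (not real screening results).
candidates = [
    Molecule("compound_A", {"blocks_viral_entry": True,
                            "protects_cells_from_virus": True,
                            "toxic_to_host_cells": False}),
    Molecule("compound_B", {"blocks_viral_entry": True,
                            "protects_cells_from_virus": True,
                            "toxic_to_host_cells": True}),
    Molecule("compound_C", {"blocks_viral_entry": False,
                            "protects_cells_from_virus": True,
                            "toxic_to_host_cells": False}),
]

# Full matches, plus a ranking that keeps partial matches visible.
perfect = [m.name for m in candidates if matches_profile(m)]
ranked = sorted(candidates, key=profile_score, reverse=True)
```

Ranking by score rather than keeping only full matches reflects the point that partial fits may still matter, for example as components of a multi-drug combination.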

REDIAL-2020 is based on machine learning algorithms capable of rapidly processing huge amounts of data and teasing out hidden patterns that might not be perceivable by a human researcher. Oprea's team validated the machine learning predictions based on the NCATS data by comparing them against the known effects of approved drugs in UNM's DrugCentral database.
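As a rough illustration of what comparing predictions against known drug effects can look like, the Python sketch below tallies agreement between predicted and known activity labels. The article does not say which metrics the team used, so sensitivity and specificity are assumptions here, and the drug names and labels are toy values rather than DrugCentral or NCATS data.

```python
def confusion_counts(predicted, known):
    """Tally agreement between predicted and known activity labels.

    Both arguments map a compound name to True (active) or False (inactive).
    Compounds without a known label are skipped.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for name, known_active in known.items():
        if name not in predicted:
            continue
        predicted_active = predicted[name]
        if predicted_active and known_active:
            counts["tp"] += 1
        elif predicted_active and not known_active:
            counts["fp"] += 1
        elif not predicted_active and not known_active:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def sensitivity(counts):
    """Fraction of known actives that the model also predicted active."""
    total = counts["tp"] + counts["fn"]
    return counts["tp"] / total if total else None


def specificity(counts):
    """Fraction of known inactives that the model also predicted inactive."""
    total = counts["tn"] + counts["fp"]
    return counts["tn"] / total if total else None


# Toy labels for illustration only.
known = {"drug_1": True, "drug_2": True, "drug_3": False,
         "drug_4": False, "drug_5": True}
predicted = {"drug_1": True, "drug_2": False, "drug_3": False,
             "drug_4": True, "drug_5": True, "drug_6": True}

counts = confusion_counts(predicted, known)
sens = sensitivity(counts)
spec = specificity(counts)
```

Counting true negatives alongside true positives is only possible when inactive compounds are reported, which is one reason negative results are useful for this kind of check.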

In principle, this computational workflow is flexible and could be trained to evaluate compounds against other pathogens, as well as evaluate chemicals that have not yet been approved for human use, Oprea says.

"Our main intent remains drug repurposing, but we're actually focusing on any small molecule," he says. "It doesn't have to be an approved drug. Anyone who tests their molecule could come up with something important."

KC GB, Bocci G, Verma S, et al.
A machine learning platform to estimate anti-SARS-CoV-2 activities.
Nat Mach Intell, 2021. doi: 10.1038/s42256-021-00335-w