Molecular dynamics simulations are one of the best methods for quickly understanding the mechanisms of SARS-CoV-2. A project led by Modesto Orozco of the Spanish Institute for Research in Biomedicine is investigating the evolutionary path of the virus from bats to humans, forecasting human sensitivity to infection, and looking at the impact of viral mutations on infectivity.
Modesto Orozco is a computational chemist who uses simulations to study biological molecules. His group has done extensive work on developing methods for studying the dynamics of proteins, nucleic acids, enzyme catalysis and more. With the onset of the COVID-19 pandemic, they have now turned their considerable knowledge towards studying various aspects of the virus. PRACE is supporting this work through an allocation of six million core hours on Joliot-Curie Rome, hosted by GENCI at CEA, France.
One of the main focuses of the project is the interaction between human angiotensin-converting enzyme 2 (ACE2) and a part of the spike protein of the virus known as the receptor-binding domain (RBD). ACE2 is an enzyme that sits in the cell membrane, connecting the inside of our cells to the outside. Normally, it plays a role in stabilising blood pressure in the human body, but the SARS-CoV-2 virus is able to exploit its machinery to gain access to our cells by binding to it via the RBD.
However, looking at RBD in isolation cannot give the whole picture of what is going on, as Orozco explains. “We have to look at the entire spike protein, because for RBD to interact with ACE2 it has to somehow become exposed, meaning the spike protein has to be very flexible and change its structural conformation,” he says. “Nobody really understands this mechanism at the moment. Experimentalists have huge amounts of structural data, but theory is needed to understand this transition. We are providing this by carrying out huge simulations that comprise nearly one million atoms.”
With a 93% shared genetic makeup, the SARS-CoV-2 RNA sequence is very similar to a virus found in horseshoe bats known as RaTG13. Understanding how a virus that originated in bats may have come to infect humans can help to avoid the emergence of new infections and has therefore been a significant focus of Orozco’s research.
“We wanted to see how this bat virus may have come to infect humans and then continued to evolve,” he says. “In practical terms, this required doing large simulations of the experimentally known complex between the human enzyme ACE2 and the SARS-CoV-2 RBD, and then seeing if it was possible for this to have mutated from the RaTG13 virus. It requires a huge number of mutations, but we have shown it to be possible and have had some experimental validation from colleagues in Northern Italy.”
Following on from this study, the group is also looking at where the virus may jump to next. Even if SARS-CoV-2 is eliminated from humans, it is possible that the virus will end up in another species. “It seems possible that bats can act as a reservoir, meaning that they can be re-infected by the virus from humans,” says Orozco. “What we are trying to find out is whether or not other animals, for instance cats or birds, might also be able to catch it. These kinds of simulations are less complicated but no less important.”
Figure: Overview of the computational approach to investigating the impact of mutations in SARS-CoV-2. Workflows built with the BioExcel Building Blocks library (BioBB), combining the GROMACS MD package and pmx free energy tools, make efficient use of thousands of cores on PRACE HPC resources.
The final strand of research on ACE2 and RBD concerns human polymorphisms of ACE2. Variations in this enzyme throughout the global human population mean that some people may be more susceptible to catching the virus than others. The team has collated data from all around the world on these genetic variations and carried out simulations to see how the different variants of ACE2 interact with RBD. Their findings show that one variant of ACE2 does not bind easily to RBD, meaning people with that variant are more resistant to the virus. They have also shown that some people have a variant which makes them much more susceptible to the virus. However, both of these groups are very small.
Many genomics projects around the world have been sequencing the virus as it has spread and mutated, providing an excellent pool of data for computational studies. Orozco’s team has been using this data to analyse the impact of viral changes on the binding of ACE2 and RBD. “There is a huge amount of information in this area, so we are only looking at the mutations that affect the ACE2-RBD binding, and then analysing the impact that these viral mutations have on the interaction. Our work on this is still unpublished, but what we have found is an excellent example of Darwinian evolution in action.
“When a viral mutation negatively affects the binding, the virus does not progress. We see one sequence of it and then it disappears. However, when a mutation improves the binding, it becomes imprinted in the next generation of viruses and becomes dominant very quickly. For instance, you can see that there is one mutation on the spike protein, originally observed in Spain, that somehow helps to expose the RBD and thus helps it bind to ACE2. This mutation has now colonised the whole of Europe. Of course, this particular area of research does not have a definitive end, as the virus is still spreading and mutating, and data continues to be gathered through genome sequencing.”
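The effect of a mutation on ACE2-RBD binding is commonly expressed as a relative binding free energy, ΔΔG, obtained from a thermodynamic cycle: the free energy of mutating the protein inside the complex minus the free energy of the same mutation in the unbound protein. The short Python sketch below illustrates that bookkeeping only. It is not the team's workflow, the function names are hypothetical, and the input numbers are invented placeholders rather than results from the project.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 0.0019872041


def binding_ddg(dg_mut_complex, dg_mut_free):
    """Relative binding free energy from a thermodynamic cycle.

    dg_mut_complex: free energy (kcal/mol) of the mutation in the bound complex
    dg_mut_free:    free energy (kcal/mol) of the same mutation in the unbound protein
    A negative result means the mutant binds more strongly than the wild type.
    """
    return dg_mut_complex - dg_mut_free


def affinity_fold_change(ddg, temperature=298.15):
    """Ratio Kd(mutant) / Kd(wild type) implied by ddg, via exp(ddG / RT).

    Values below 1 indicate tighter binding for the mutant.
    """
    return math.exp(ddg / (R_KCAL * temperature))


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT data from the study.
    examples = {
        "mutation_A": (12.4, 13.5),
        "mutation_B": (8.0, 7.1),
    }
    for name, (dg_complex, dg_free) in examples.items():
        ddg = binding_ddg(dg_complex, dg_free)
        ratio = affinity_fold_change(ddg)
        trend = "stronger" if ddg < 0 else "weaker"
        print(f"{name}: ddG = {ddg:+.2f} kcal/mol, Kd ratio = {ratio:.2f} ({trend} binding)")
```

In a real study the two input free energies would come from alchemical simulations with their own statistical uncertainties, which this sketch leaves out.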
Every aspect of this project on COVID-19 has been helped by a new technology developed by Orozco and others at the BioExcel Centre of Excellence for Computational Biomolecular Research. The technology, known as BioExcel Building Blocks, enables HPC users to harness the full power of large parallel supercomputers by ensuring that all processors are being used as much as possible, allowing for simulations to be carried out more quickly and more efficiently.
With thousands of groups now carrying out molecular dynamics simulations of molecules involved in COVID-19, the amount of information available is overwhelming. To help researchers deal with this deluge of data, Orozco has helped to create a platform, BioExcel-CV19, which collates all of it in a useful way. “The platform is based on an infrastructure that was developed for the Human Brain Project, and allows researchers to quickly access data and tailored analysis on the protein family that they are interested in. We are hoping that others working on PRACE projects will be uploading their data on to the platform so that the whole world can benefit from their research.”