The PC gaming market inflected upward in 1999, sparking geometric growth in parallel-processing chips designed to accelerate games.
More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world.
Fast forward to October 2020 and NVIDIA is emerging as a major player in the chip space against rivals that have dominated the industry for decades. Medical research and bioinformatics could be among the drivers of the next major opportunity in chip design. According to Allied Market Research, the global AI chip market will reach $91 billion by 2025, growing at about 45% a year until then.
Some areas of interest include image analysis and diagnosis improvements in radiology, drug discovery, and deep learning in genomic datasets.
Last year, NVIDIA announced a plan to build the UK's most powerful supercomputer in Cambridge. Named Cambridge-1, the system will employ AI to support healthcare research, at a cost of approximately £40 million ($55 million). According to Jensen Huang, NVIDIA’s CEO, “The Cambridge-1 supercomputer will serve as a hub of innovation for the UK, and further the groundbreaking work being done by the nation’s researchers in critical healthcare and drug discovery.”
More recently, NVIDIA announced a collaboration with biopharmaceutical company AstraZeneca and the University of Florida’s academic health center, UF Health, on new AI research projects using breakthrough transformer neural networks.
Transformer-based neural network architectures — which have become available only in the last several years — allow researchers to leverage massive datasets using self-supervised training methods, avoiding the need for manually labeled examples during pre-training. These models, equally adept at learning the syntactic rules to describe chemistry as they are at learning the grammar of languages, are finding applications across research domains and modalities.
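To make the self-supervised idea concrete, here is a minimal sketch of how training pairs can be derived from unlabeled text by hiding tokens and asking a model to recover them. It is an illustration only, not the procedure used by Megatron or any model named in this article; the `MASK` placeholder, the `make_training_pair` helper, and the example sentence are hypothetical.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, chosen for illustration


def make_training_pair(tokens, mask_rate=0.15, seed=None):
    """Hide a random subset of tokens; the hidden originals become the targets.

    No human labeling is involved: the "answers" come from the text itself.
    """
    rng = random.Random(seed)
    inputs, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            inputs.append(MASK)   # the model sees the placeholder
            targets[i] = tok      # and is trained to predict the original
        else:
            inputs.append(tok)
    return inputs, targets


# Made-up example sentence, split on whitespace for simplicity.
sentence = "the patient was given aspirin for chest pain".split()
inputs, targets = make_training_pair(sentence, mask_rate=0.3, seed=0)
print(inputs)
print(targets)
```

Because every target is just a token that was already in the input, the same recipe applies to any sequence data, which is why it scales to corpora far too large to annotate by hand.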
Separately, UF Health is harnessing NVIDIA’s state-of-the-art Megatron framework and BioMegatron pre-trained model — available on NGC — to develop GatorTron, the largest clinical language model to date.
Megatron Model for Molecular Insights
The MegaMolBART drug discovery model being developed by NVIDIA and AstraZeneca is slated for use in reaction prediction, molecular optimization and de novo molecular generation. It’s based on AstraZeneca’s MolBART transformer model and is being trained on the ZINC chemical compound database — using NVIDIA’s Megatron framework to enable massively scaled-out training on supercomputing infrastructure.
The large ZINC database allows researchers to pretrain a model that understands chemical structure, bypassing the need for hand-labeled data. Armed with a statistical understanding of chemistry, the model will be specialized for a number of downstream tasks, including predicting how chemicals will react with each other and generating new molecular structures.
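One way to picture how a language-style model can "read" chemistry is to treat each molecule as a string of tokens. The sketch below assumes molecules are written in SMILES, a common text notation for chemical structures; the article does not say which notation or tokenizer MegaMolBART uses. The simplified regular expression and the `tokenize_smiles` helper are hypothetical and cover only a subset of SMILES syntax.

```python
import re

# Simplified pattern (an assumption, not a full SMILES grammar):
# bracketed atoms, two-letter halogens, two-digit ring closures,
# then any remaining single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles):
    """Split a SMILES string into atom, bond, branch and ring tokens."""
    return SMILES_TOKEN.findall(smiles)


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"  # SMILES string for aspirin
tokens = tokenize_smiles(aspirin)
print(tokens)
```

Once molecules are token sequences, the same mask-and-reconstruct style of pretraining used for sentences can be applied to them without hand-labeled data.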
“Just as AI language models can learn the relationships between words in a sentence, our aim is that neural networks trained on molecular structure data will be able to learn the relationships between atoms in real-world molecules,” said Ola Engkvist, head of molecular AI, discovery sciences, and R&D at AstraZeneca. “Once developed, this NLP model will be open source, giving the scientific community a powerful tool for faster drug discovery.”
The model, trained using NVIDIA DGX SuperPOD, gives researchers ideas for molecules that don’t exist in databases but could be potential drug candidates. Computational methods, known as in-silico techniques, allow drug developers to search through more of the vast chemical space and optimize pharmacological properties before shifting to expensive and time-consuming lab testing.
This collaboration will use the NVIDIA DGX A100-powered Cambridge-1 and Selene supercomputers to run large workloads at scale. Cambridge-1 is the largest supercomputer in the U.K., ranking No. 3 on the Green500 and No. 29 on the TOP500 list of the world’s most powerful systems. NVIDIA’s Selene supercomputer topped the most recent Green500 and ranks fifth on the TOP500.
Language Models Speed Up Medical Innovation
UF Health’s GatorTron model — trained on records from more than 50 million interactions with 2 million patients — is a breakthrough that can help identify patients for lifesaving clinical trials, predict and alert health teams about life-threatening conditions, and provide clinical decision support to doctors.
“GatorTron leveraged over a decade of electronic medical records to develop a state-of-the-art model,” said Joseph Glover, provost at the University of Florida, which recently boosted its supercomputing facilities with NVIDIA DGX SuperPOD. “A tool of this scale will enable healthcare researchers to unlock insights and reveal previously inaccessible trends from clinical notes.”
Beyond clinical medicine, the model also accelerates drug discovery by making it easier to rapidly create patient cohorts for clinical trials and for studying the effect of a certain drug, treatment or vaccine.
GatorTron was created using BioMegatron, the largest biomedical transformer model ever trained, developed by NVIDIA’s applied deep learning research team using data from the PubMed corpus. BioMegatron is available on NGC through Clara NLP, a collection of NVIDIA Clara Discovery models pretrained on biomedical and clinical text.
“The GatorTron project is an exceptional example of the discoveries that happen when experts in academia and industry collaborate using leading-edge artificial intelligence and world-class computing resources,” said David R. Nelson, M.D., senior vice president for health affairs at UF and president of UF Health. “Our partnership with NVIDIA is crucial to UF emerging as a destination for artificial intelligence expertise and development.”
Powering Drug Discovery Platforms
NVIDIA Clara Discovery libraries and NVIDIA DGX systems have been adopted by computational drug discovery platforms, too, boosting pharmaceutical research.
- Schrödinger, a leader in chemical simulation software development, today announced a strategic partnership with NVIDIA that includes research in scientific computing and machine learning, optimization of Schrödinger applications on NVIDIA platforms, and a joint solution around NVIDIA DGX SuperPOD to evaluate billions of potential drug compounds within minutes.
- Biotechnology company Recursion has installed BioHive-1, a supercomputer based on the NVIDIA DGX SuperPOD reference architecture that, as of January, is estimated to rank No. 58 on the TOP500 list of the world’s most powerful computer systems. BioHive-1 will allow Recursion to complete in a day deep learning projects that previously took a week on its existing cluster.
- Insilico Medicine, a partner in the NVIDIA Inception accelerator program, recently announced the discovery of a novel preclinical candidate to treat idiopathic pulmonary fibrosis — the first example of an AI-designed molecule for a new disease target nominated for clinical trials. Compounds were generated on a system powered by NVIDIA Tensor Core GPUs, taking less than 18 months and under $2 million from target hypothesis to preclinical candidate selection.
- Vyasa Analytics, a member of the NVIDIA Inception accelerator program, is using Clara NLP and NVIDIA DGX systems to give its users access to pretrained models for biomedical research. The company’s GPU-accelerated Vyasa Layar Data Fabric is powering solutions for multi-institutional cancer research, clinical trial analytics and biomedical data harmonization.