High-performance computing team powers scientific discovery at St. Jude

The St. Jude Data Center provides high performance computing to the institution’s scientists to accelerate their research.

Up until 1922, diabetes mellitus was a terminal diagnosis. In 1889, researchers began theorizing that a chemical produced by the pancreas must control blood sugar, but it wasn’t until more than 30 years later that insulin was discovered. It has been saving the lives of people living with Type 1 diabetes ever since.

An urgent clock always ticks for biomedical research. In 1889, scientists had part of the puzzle: the importance of the pancreas in diabetes. Still, it took decades of refining their techniques to find insulin. 

Today, biomedical researchers have access to a wealth of genetic and chemical information, with billions of potentially lifesaving data points. Their task is to find ways to process that data into meaningful findings as quickly as possible. In this age of data-driven discovery, high-performance computing (HPC) is helping to accelerate this process, serving as a resource for every researcher at St. Jude.

“High-performance computing enables a whole different scale of discovery,” explained Ed Suh, DSc, St. Jude Research Informatics vice president, who helps run the HPC cluster. “We are here to help researchers take calculations that would take weeks or months and do them in minutes to hours.”

HPC speeds up analysis through parallel processing, splitting even the largest jobs into many small parts. The parts are assigned to individual processors or groups of processors (nodes) that are networked together. Because the nodes work simultaneously, the cluster can, in the ideal case, divide a task’s run time by the number of assigned nodes.

“For example,” Suh said, “if you have a task that might take 30 days of computer power but assign it to 30 nodes, it will take just over a single day to complete.” 
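The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not a description of how the St. Jude cluster schedules work: the function names and the 0.1-day coordination overhead are hypothetical, chosen to show why the result is "just over" a day rather than exactly one.

```python
def split_into_chunks(n_items: int, n_nodes: int) -> list[int]:
    """Divide n_items units of work as evenly as possible across n_nodes."""
    base, extra = divmod(n_items, n_nodes)
    # The first `extra` nodes each take one additional unit of work.
    return [base + 1 if i < extra else base for i in range(n_nodes)]


def estimated_wall_time(serial_days: float, n_nodes: int,
                        overhead_days: float = 0.0) -> float:
    """Idealized run time: an even share of the work plus a fixed overhead.

    overhead_days is a hypothetical stand-in for time spent distributing
    the work and collecting results.
    """
    return serial_days / n_nodes + overhead_days


# Suh's example: 30 days of computing spread across 30 nodes.
ideal_days = estimated_wall_time(30, 30)
realistic_days = estimated_wall_time(30, 30, overhead_days=0.1)  # assumed overhead

# A job of 100 independent pieces divided among the same 30 nodes.
chunks = split_into_chunks(100, 30)

print(f"Ideal: {ideal_days:.1f} days; with assumed overhead: {realistic_days:.1f} days")
print(f"Largest share per node: {max(chunks)} of {sum(chunks)} pieces")
```

In this simplified model, the ideal case takes exactly one day, and any coordination cost pushes the total slightly higher, which matches the "just over a single day" in the quote.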

While essential, computational power alone isn’t enough to guarantee success. At St. Jude, the HPC team’s computational engineering expertise is just as critical to driving discovery.

St. Jude high-performance computing team maintains research momentum

The last thing any fast-moving researcher wants is to be stymied by a technicality in software or hardware. To minimize those challenges, the St. Jude HPC team serves as an approachable guide to the system. 

“The HPC specialists seem excited to help us,” said Xueying Liu, PhD, a research scientist in the St. Jude Department of Computational Biology. “When I have problems, they not only know why a specific job failed, but they also always know how we can fix it. They are extremely available, often answering questions the same day I submit a ticket.”

While knowledgeable, the HPC team works best when collaborating with the researchers using the system. “They know the intricacies of the HPC cluster like I know the intricacies of the biology I’m studying,” said Charlie Wright, PhD, St. Jude Department of Computational Biology, bioinformatics research scientist. “We come together to make a job work. Even if one of us doesn’t know the exact terms the other uses, we can communicate effectively to find problems and address them quickly.”

One major problem that could cost researchers time is addressing data security. Fan Wang, PhD, St. Jude Department of Epidemiology and Cancer Control, lead bioinformatics research scientist, often handles sensitive genome sequencing data. However, he isn’t worried. 

“Data privacy is a major concern, but the fantastic team behind the HPC cluster addresses its computer security aspect excellently,” he explained. “They maintain a safe computing environment for us, so we researchers can focus on performing our analysis.”

High-performance computing accelerates biomedical research at St. Jude

St. Jude scientists use the HPC cluster and its team to get results faster and to ask questions that would be impossible to answer without them:

  • “I primarily use the HPC cluster for drug combination analysis,” Wright said. “I compare many different drug combinations that could take days or weeks on a local machine but might take just 40 minutes on the HPC.” His work focuses on finding better drug treatments for pediatric neuroblastoma, a cancer with a poor prognosis. With support from the HPC cluster, he developed a CRISPR-based approach for identifying novel drug combinations for neuroblastoma, published in Nature Communications.

  • “I use HPC for single-cell sequencing data analysis as part of my daily routine,” Liu said. “I’ve had to process single-cell count matrices of over two million cells at once, which is only possible because of the HPC cluster’s memory and parallel processing power between nodes.” Her HPC-assisted work created computational methods to analyze and compare preclinical and clinical RNA-sequencing data in neuroblastoma, published in Genome Biology and Cell Genomics.

  • “I use the HPC to analyze and process whole genome and whole exome sequencing data,” Wang said. “Our goal is to identify the genetic influence on late effects of childhood cancer and its treatment. Given that each sample includes millions of genetic variants, only a parallel computational method, like HPC, can overcome these computational challenges and enhance our ability for genetic discovery.” His HPC-based analysis has enabled him to develop a genome-wide haplotype association analysis framework of cancer risk published in Cancer Research.

  • “The St. Jude HPC cluster has made a large impact on our ability to predict and understand shapes of biological molecules and infer the effect of mutations on protein function,” said M. Madan Babu, PhD, St. Jude Data Science senior vice president, chief data scientist and Center of Excellence for Data-Driven Discovery director. “It is now democratizing protein structure prediction across the institution, pushing research forward in multiple fields, including the design of chimeric antigen receptors (CAR).” HPC-generated predictions have been used to improve CAR T–cell immunotherapy in work published in Nature Biomedical Engineering and Cell Reports Medicine.

Their examples highlight how the St. Jude HPC is speeding the process from data collection to discovery, taking a vast wealth of biological data and turning it into actionable information. 

Biomedical researchers today don’t want to wait 30 years to go from idea to discovery, like their predecessors looking for insulin. The technological revolution has provided massive modern chemical and sequencing data collections brimming with potential discoveries. Today, with the St. Jude HPC team there to help, the hospital’s scientists are using that data to advance research and treatment like never before. 

“We provide powerful HPC infrastructure and computational tools so that our researchers can make the next big breakthroughs as quickly as possible,” Suh said. “That’s our mission.”

About the author

Senior Scientific Writer

Alex Generous, PhD, is a Senior Scientific Writer in the Strategic Communications, Education and Outreach Department at St. Jude.
