The Treehouse Childhood Cancer Initiative at the UCSC Genomics Institute, under the direction of Distinguished Professor David Haussler, Assistant Professor Olena Vaske and Research Scientist Sofie Salama, uses shared data to analyze a child’s tumor against both child and adult patient cancer tumors. We are incredibly grateful to St. Baldrick’s Foundation for funding our groundbreaking RNA work to further discovery of treatments for kids with hard-to-treat cancers. Thanks to St. Baldrick’s, Treehouse is able to realize continuous innovation in research that results in direct impact. 

What follows are updates on our research funded by St. Baldrick’s, with a brief summary of potential impact. The research presented here comes from preliminary reports that have not yet been peer-reviewed. These reports should not be regarded as conclusive, used to guide health-related behavior, or reported as established information. We are excited about the promise of our research and proud to report on our ongoing work and on the team dedicated to supporting it.

Treehouse continues to collaborate with several hospitals, including Lucile Packard Children’s Hospital, UCSF Benioff Children’s Hospital, Alberta Children’s Hospital, Children’s Hospital of Orange County, Nationwide Children’s Hospital and UCLA Mattel Children’s Hospital. 

Yvonne Vasquez (top), Graduate Student Researcher, and Geoff Lyle (bottom), Research Data Analyst

 

Myoepithelial carcinoma case study 

For many childhood cancers there are few treatment options. The standard-of-care therapies that are used often have long-term effects on a child’s health. The goal of our genomics research is to help identify less toxic and more effective treatments for kids with a rare or difficult-to-treat cancer. In a recent study, we partnered with Stanford to analyze tumor samples from 33 young patients. Each of these patients had a tumor that either came back after treatment (relapsed) or was unresponsive to standard therapies, making it difficult to treat. Yvonne is writing up the case study.

One of those patients, a 4-year-old boy, was diagnosed at age 1 with myoepithelial carcinoma, a rare cancer that has no known effective therapies. He received standard chemotherapy; however, after treatment his cancer, which was originally found in the liver, returned and spread to other parts of his body. When a cancer metastasizes, or spreads to another body site, it becomes harder to treat.

We were called in to analyze RNA sequencing data from his tumor. Our Treehouse analysis compared this child’s tumor to data from thousands of pediatric and adult tumors and identified several genes and pathways that appeared to be unusually highly expressed in this patient’s tumor. Geoff Lyle dedicated his time to this analysis. After a comprehensive review of the data, Treehouse suggested two drugs that could possibly target those pathways. The Stanford clinicians considered our Treehouse analysis alongside all the other information available and made an informed decision on how to proceed with treating this patient.

After the boy had received Stanford’s selected treatment for over a year, CT scans showed that his cancer had stopped growing and spreading. He had surgery to remove the remaining cancer and will continue to receive the treatment for another year to try to ensure the cancer does not return.
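The comparison step in this case study, checking one tumor's gene expression against a large reference set and flagging genes that are unusually high, can be illustrated with a minimal sketch. This is not the Treehouse pipeline: the gene names, the expression values, and the choice of an upper Tukey fence (third quartile plus 1.5 times the interquartile range) as the cutoff are all assumptions made for illustration.

```python
import statistics


def high_outlier_genes(patient, cohort, k=1.5):
    """Flag genes whose expression in one tumor is unusually high.

    patient: dict mapping gene name -> expression value for one tumor.
    cohort:  dict mapping gene name -> list of values from reference tumors.
    k:       fence multiplier (1.5 is a common convention, assumed here).
    Returns a dict of gene -> (patient value, upper fence) for flagged genes.
    """
    flagged = {}
    for gene, value in patient.items():
        reference = cohort.get(gene)
        # Skip genes with too few reference values to estimate quartiles.
        if not reference or len(reference) < 4:
            continue
        q1, _, q3 = statistics.quantiles(reference, n=4)
        upper_fence = q3 + k * (q3 - q1)
        if value > upper_fence:
            flagged[gene] = (value, upper_fence)
    return flagged


# Hypothetical data: two made-up genes and eight made-up reference tumors.
cohort = {
    "GENE_A": [1, 2, 3, 4, 5, 6, 7, 8],
    "GENE_B": [1, 2, 3, 4, 5, 6, 7, 8],
}
patient = {"GENE_A": 20, "GENE_B": 5}

result = high_outlier_genes(patient, cohort)
for gene, (value, fence) in result.items():
    print(f"{gene}: value {value} exceeds fence {fence}")
```

In this toy example only GENE_A is flagged, because its value sits far above the range seen in the reference tumors. A real analysis would also need normalized expression values and an appropriate choice of reference tumors, which this sketch does not address.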

Holly Beale, Ph.D., Lead Computational Biologist

 

MEND 

We are data experts, and sometimes the information we get from the data raises more questions than answers. It is our job to work through these questions and provide the answers that kids with cancer need.

Low-quality data can lead to inaccurate results, which is unacceptable. Five years ago, Treehouse embarked on a journey to test data quality, and Holly dedicated her time to this effort. Treehouse applied the resulting quality assessment to the thousands of data sets that we looked at. In March 2021, we published a peer-reviewed article (10.1093/gigascience/giab011) describing a method for assessing data quality. This study involved RNA sequencing data from more than 2,000 tumors from over 40 projects that have generously shared their data with us.

Some projects we analyzed had many low-quality samples, and many projects had a few. We want people to know the quality of the data they’re using, and we believe scientific findings should provide as much benefit as possible for as many people as possible. So, we didn’t just share our quality scores for the samples we analyzed; we also shared our software that does the scoring so investigators can apply it to their own data. 

Our software can be downloaded for free and run by anyone who uses RNA sequencing data (https://github.com/UCSC-Treehouse/mend_qc). If someone wants to improve it, they can suggest a change for us to incorporate, or make a copy of the software and make their own changes. We are gratified that, since we published and shared our data-quality software, other researchers have downloaded it and are using it to assess their data quality.

Drew Thompson, Undergraduate Researcher

 

ProTECT 

The human body has many ways of revealing that something is wrong inside it, including ways of indicating that cancer is present. Cells communicate with the immune system by presenting small pieces of protein on their surface. These small pieces are called epitopes. When the epitopes are changed by a mutation in a tumor, they make the tumor cells look different from all the other cells in the body, and they are called neoepitopes. Even if the immune system is not picking up on these neoepitopes, we can use immunotherapy to encourage it to start responding to them.

ProTECT (Prediction of T-Cell Epitopes for Cancer Therapy), originally developed by Treehouse alumnus Arjun Rao (https://doi.org/10.3389/fimmu.2020.483296), is a computer program (https://github.com/BD2KGenomics/protect) that predicts these neoepitopes and ranks them based on how likely they are to be useful in immunotherapy. Our collaborator from Alberta Children’s Hospital, Dr. Narendren, states that the “ProTect analysis identified tumor-specific neoepitopes, which are helpful for the development of personalized immunotherapies for leukemia patients.” ProTECT requires sequence data from DNA, which encodes the genes, as well as from RNA, which tells us how much different genes are being used. Unfortunately, DNA sequencing can be unaffordable and is not done routinely for all patients, so we’re working on adapting ProTECT to run when we only have RNA sequencing data, which is much more accessible and less costly.

One challenge is that ProTECT’s results based on both DNA and RNA analysis differ from its results based on RNA alone. There are many potential reasons for this. For example, RNA sequencing has different error sources than DNA sequencing, and working from RNA alone requires ProTECT to use different methods to get its results.

Our research is focused on getting the RNA-only pipeline to better predict neoepitopes, so that RNA alone will enable us to assess with confidence how useful neoepitopes would be in immunotherapy.

Many thanks to St. Baldrick’s from our whole team!