
Researchers at Oregon Health & Science University’s Knight Cancer Institute have developed OmicsTweezer, a computational tool that estimates the cell type composition of human tissue samples. Knowing which cells are present in a tissue, and in what proportions, is central to understanding diseases such as cancer. The findings were published today in Cell Genomics.
OmicsTweezer uses machine learning to analyze large-scale biological data and estimate the proportions of cell types in tissue samples, such as those obtained from biopsies. This allows scientists to map the cellular makeup of tumors and of the surrounding tumor microenvironment, a key area of interest in cancer research.
“The tumor microenvironment, made up of diverse cell types that shape tumor development and patient outcomes, has been a longstanding research priority at the Knight Cancer Institute,” stated Zheng Xia, Ph.D., senior author and associate professor of biomedical engineering at OHSU.
Addressing the Batch Effect Challenge
Comparing bulk tissue data with single-cell data has traditionally been difficult because of the “batch effect”: systematic technical differences between datasets produced by different methods or sources, which can be mistaken for biological differences. OmicsTweezer addresses this by aligning known patterns from single-cell data with complex bulk data in a shared space, which reduces errors and makes the resulting estimates more reliable.
“It’s still very expensive to profile a large clinical sample size using single-cell technology,” Xia explained. “But there is an abundance of bulk data — and by integrating single-cell and bulk data together, we can build a much clearer picture.”
Innovative Techniques in Data Alignment
OmicsTweezer combines deep learning, a branch of machine learning that can capture non-linear patterns in complex data, with a method called optimal transport, which it uses to align different types of data. This approach goes beyond the traditional linear models used to estimate cell type proportions from gene expression.
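For readers unfamiliar with the traditional linear models mentioned here, the idea can be sketched in a few lines. The example below is an illustrative baseline only, not OmicsTweezer's code: it assumes a bulk expression profile is a weighted sum of per-cell-type signature profiles and recovers the weights by non-negative least squares, solved with projected gradient descent. The function name `deconvolve_linear`, the signature matrix, and the fractions are all hypothetical toy values.

```python
def deconvolve_linear(signature, bulk, steps=2000):
    """Estimate cell type fractions from one bulk profile (illustrative sketch).

    signature: list of rows, one per gene, each holding that gene's
               expression in every cell type (genes x cell types).
    bulk:      list of bulk expression values, one per gene.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # Safe step size: the squared Frobenius norm is an upper bound on the
    # largest eigenvalue of signature^T signature.
    step = 1.0 / sum(v * v for row in signature for v in row)
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual of the linear model: signature @ weights - bulk
        residual = [
            sum(signature[g][k] * weights[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        # Gradient of 0.5 * ||residual||^2 with respect to the weights
        gradient = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto non-negative weights
        weights = [max(0.0, weights[k] - step * gradient[k]) for k in range(n_types)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all estimated weights are zero")
    # Normalize so the weights read as fractions summing to 1
    return [w / total for w in weights]


# Hypothetical toy data: 4 genes, 3 cell types, no noise.
toy_signature = [
    [10.0, 1.0, 0.0],
    [1.0, 8.0, 1.0],
    [0.0, 1.0, 9.0],
    [2.0, 2.0, 2.0],
]
toy_fractions = [0.5, 0.3, 0.2]
toy_bulk = [sum(s * f for s, f in zip(row, toy_fractions)) for row in toy_signature]

# On this noiseless example the estimate lands close to toy_fractions.
print(deconvolve_linear(toy_signature, toy_bulk))
```

A model like this treats the single-cell signatures and the bulk measurements as directly comparable, which is the assumption that batch effects break.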
“We use optimal transport to align two different distributions — single-cell and bulk data — in the same space,” Xia said. “In this way, we can reduce the batch effect, which has long been a challenge when working with data from different sources.”
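The alignment Xia describes can be illustrated with a small, self-contained sketch of entropy-regularized optimal transport (the Sinkhorn algorithm). This is a generic textbook version under stated assumptions, not the method as implemented in OmicsTweezer, which the article says combines optimal transport with deep learning. The function names, the regularization value, and the toy points standing in for single-cell and bulk embeddings are all hypothetical.

```python
import math


def sinkhorn_plan(source, target, reg=1.0, iters=500):
    """Entropy-regularized transport plan between two point sets.

    Both sets get uniform weights; the cost is squared Euclidean distance.
    Very large costs relative to `reg` can underflow in this simple version.
    """
    n, m = len(source), len(target)
    a = [1.0 / n] * n  # mass on each source point
    b = [1.0 / m] * m  # mass on each target point
    kernel = [
        [math.exp(-sum((s - t) ** 2 for s, t in zip(x, y)) / reg) for y in target]
        for x in source
    ]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def map_to_target(plan, target):
    """Move each source point to the plan-weighted average of target points."""
    dim = len(target[0])
    mapped = []
    for row in plan:
        mass = sum(row)
        mapped.append(
            [sum(p * y[d] for p, y in zip(row, target)) / mass for d in range(dim)]
        )
    return mapped


# Hypothetical toy data: the "bulk" points are the "single-cell" points
# shifted by a constant offset, standing in for a batch effect.
single_cell_like = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bulk_like = [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0]]

plan = sinkhorn_plan(single_cell_like, bulk_like)
aligned = map_to_target(plan, bulk_like)
print(aligned)
```

After mapping, the source points sit inside the target distribution and share its mean, which is the sense in which the two datasets end up "in the same space." The regularization strength controls how sharp the matching is: smaller values give a plan closer to a one-to-one assignment but are numerically more fragile.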
New Horizons in Cancer Research
The tool was tested on both simulated datasets and real tissue samples from patients with prostate and colon cancer. It successfully identified subtle cell subtypes and estimated cell population changes between patient groups, potentially aiding in the identification of therapeutic targets.
“With this tool, we can now estimate the fractions of those populations defined by single-cell data in bulk data from patient groups,” Xia noted. “That could help us understand which cell populations are changing during disease progression and guide treatment decisions.”
Collaborative Efforts and Future Implications
OmicsTweezer was developed as part of a multidisciplinary collaboration at the OHSU Knight Cancer Institute, in partnership with experts such as Lisa Coussens, Ph.D., FAACR, FAIO, Gordon Mills, M.D., Ph.D., and the SMMART project. SMMART, which stands for Serial Measurements of Molecular and Architectural Responses to Therapy, is the flagship project of the Knight Cancer Institute’s precision oncology program.
“This kind of work wouldn’t be possible without collaboration,” Xia emphasized. “It really reflects the strength of the team at the Knight Cancer Institute.”
By giving researchers a clearer picture of the tumor microenvironment, tools like OmicsTweezer could help them identify new treatments that improve patient outcomes and quality of life.
The development of OmicsTweezer also shows how machine learning can be integrated into biomedical research to draw more information out of existing bulk and single-cell data.