Biomarker discovery

Precision environmental health monitoring by longitudinal exposome and multi-omics profiling

Conventional environmental health studies have primarily focused on limited environmental stressors at the population level, which lacks the power to dissect the complexity and heterogeneity of individualized environmental exposures. Here, as a pilot case study, we integrated deep-profiled longitudinal personal exposome and internal multi-omics to systematically investigate how the exposome shapes a single individual's phenome. We annotated thousands of chemical and biological components in the personal exposome cloud and found they were significantly correlated with thousands of internal biomolecules, which was further cross-validated using corresponding clinical data. Our results showed that agrochemicals and fungi predominated in the highly diverse and dynamic personal exposome, and the biomolecules and pathways related to the individual's immune system, kidney, and liver were highly associated with the personal external exposome. Overall, this data-driven longitudinal monitoring study shows the potential dynamic interactions between the personal exposome and internal multi-omics, as well as the impact of the exposome on precision health by producing abundant testable hypotheses.

TidyMass an object-oriented reproducible analysis framework for LC–MS data profiling

Reproducibility, traceability, and transparency have been long-standing issues for metabolomics data analysis. Multiple tools have been developed, but limitations still exist. Here, we present the tidyMass project, a comprehensive R-based computational framework that can achieve the traceable, shareable, and reproducible workflow needs of data processing and analysis for LC-MS-based untargeted metabolomics. TidyMass is an ecosystem of R packages that share an underlying design philosophy, grammar, and data structure, which provides a comprehensive, reproducible, and object-oriented computational framework. The modular architecture makes tidyMass a highly flexible and extensible tool, which other users can improve and integrate with other tools to customize their own pipeline.

Metabolic Dynamics and Prediction of Gestational Age and Time to Delivery in Pregnant Women

Metabolism during pregnancy is a dynamic and precisely programmed process, the failure of which can bring devastating consequences to the mother and fetus. To define a high-resolution temporal profile of metabolites during healthy pregnancy, we analyzed the untargeted metabolome of 784 weekly blood samples from 30 pregnant women. Using linear models, we built a metabolic clock with five metabolites that time gestational age in high accordance with ultrasound (R = 0.92). Furthermore, two to three metabolites can identify when labor occurs (time to delivery within two, four, and eight weeks, AUROC ≥ 0.85). Our study represents a weekly characterization of the human pregnancy metabolome, providing a high-resolution landscape for understanding pregnancy with potential clinical utilities.
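
The idea of a linear "metabolic clock" can be illustrated with a minimal sketch. It is not the study's pipeline: the data are synthetic, the five feature slopes and the noise level are invented for illustration, and the fit is plain ordinary least squares. It only shows how a five-feature linear model can be fitted and then scored against a reference timing with Pearson's R on held-out samples.

```python
import random


def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented system [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coef, X):
    """Apply the fitted intercept and slopes to each sample."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def simulate(n, rng):
    """Synthetic data only: five made-up intensities that drift with week."""
    weeks = [rng.uniform(5, 40) for _ in range(n)]
    slopes = [0.8, -0.5, 0.3, 0.6, -0.4]  # hypothetical, not from the study
    X = [[s * w + rng.gauss(0, 3) for s in slopes] for w in weeks]
    return X, weeks


rng = random.Random(0)
X_train, y_train = simulate(200, rng)
X_test, y_test = simulate(100, rng)

coef = fit_linear(X_train, y_train)
r = pearson_r(predict(coef, X_test), y_test)
print(f"held-out Pearson R = {r:.2f}")
```

Scoring on samples that were not used for fitting is the design choice that matters here, since a correlation computed on the training samples would overstate how well such a clock generalizes.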

Development of a Correlative Strategy To Discover Colorectal Tumor Tissue Derived Metabolite Biomarkers in Plasma Using Untargeted Metabolomics

The metabolic profiling of biofluids using untargeted metabolomics provides a promising choice to discover metabolite biomarkers for clinical cancer diagnosis. However, metabolite biomarkers discovered in biofluids may not necessarily reflect the pathological status of tumor tissue, which makes these biomarkers difficult to reproduce. In this study, we developed a new analysis strategy by integrating the univariate and multivariate correlation analysis approach to discover tumor tissue derived (TTD) metabolites in plasma samples. Specifically, untargeted metabolomics was first used to profile a set of paired tissue and plasma samples from 34 colorectal cancer (CRC) patients. Next, univariate correlation analysis was used to select correlative metabolite pairs between tissue and plasma, and a random forest regression model was utilized to define 243 TTD metabolites in plasma samples. The TTD metabolites in CRC plasma were demonstrated to accurately reflect the pathological status of tumor tissue and have great potential for metabolite biomarker discovery. Accordingly, we conducted a clinical study using a set of 146 plasma samples from CRC patients and gender-matched polyp controls to discover metabolite biomarkers from TTD metabolites. As a result, eight metabolites were selected as potential biomarkers for CRC diagnosis with high sensitivity and specificity. For CRC patients after surgery, the survival risk score defined by metabolite biomarkers also performed well in predicting overall survival time (p = 0.022) and progression-free survival time (p = 0.002). In conclusion, we developed a new analysis strategy which effectively discovers tumor tissue related metabolite biomarkers in plasma for cancer diagnosis and prognosis.
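
The univariate screening step described above can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: Spearman correlation, the 0.5 cutoff, the metabolite names, and the synthetic intensities are all hypothetical, and the subsequent random forest regression step is omitted.

```python
import random


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def screen_pairs(tissue, plasma, min_rho=0.5):
    """Keep metabolites whose tissue and plasma levels co-vary across patients.

    `min_rho` is a hypothetical cutoff chosen for illustration.
    """
    selected = {}
    for name in tissue.keys() & plasma.keys():
        rho = spearman(tissue[name], plasma[name])
        if rho >= min_rho:
            selected[name] = rho
    return selected


# Synthetic paired measurements for 34 patients (values are made up).
rng = random.Random(1)
n_patients = 34
tissue = {
    "met_A": [rng.gauss(10, 2) for _ in range(n_patients)],
    "met_B": [rng.gauss(10, 2) for _ in range(n_patients)],
}
plasma = {
    # met_A in plasma tracks its tissue level.
    "met_A": [0.5 * v + rng.gauss(0, 0.2) for v in tissue["met_A"]],
    # met_B in plasma moves opposite to its tissue level.
    "met_B": [20 - v + rng.gauss(0, 0.2) for v in tissue["met_B"]],
}

hits = screen_pairs(tissue, plasma)
print(sorted(hits))
```

In this toy setup only the metabolite whose plasma level rises with its tissue level passes the screen; a real analysis would also need multiple-testing control across thousands of features.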

Predicting the pathological response to neoadjuvant chemoradiation using untargeted metabolomics in locally advanced rectal cancer

A panel of metabolites has been identified to facilitate the prediction of tumor response to neoadjuvant chemoradiation therapy (NCRT) in locally advanced rectal cancer (LARC), which is promising for the generation of personalized treatment strategies for LARC patients.