Abstract: Personalized medicine will improve health outcomes and patient satisfaction. However, implementing personalized medicine based on individuals’ biological information is far from simple: it requires genetic biomarkers, which are mainly developed and used by pharmaceutical companies to select patients who benefit more from a particular drug or have a lower risk of adverse drug reactions. Genome-wide association studies (GWAS) aim to identify genetic variants across the human genome that might be used as genetic biomarkers for diagnosis and prognosis. Over the last several years, high-density SNP genotyping arrays have facilitated GWAS that successfully identified common genetic variants associated with a variety of phenotypes. However, each identified variant explains only a very small fraction of the underlying genetic contribution to the studied phenotypic trait. Replication studies have demonstrated that only a small portion of the loci associated in the initial GWAS can be replicated, even within the same populations. Given the complexity of GWAS, multiple sources of Type I (false positive) and Type II (false negative) errors exist. Inconsistency in genotypes, caused either by the genotyping experiment or by the genotype-calling process, is a major source of false GWAS findings. Accurate and reproducible genotypes are paramount, as inconsistency in genotypes can inflate the number of false associations. This article reviews the sources of inconsistency in genotypes and discusses its effect on GWAS findings.
Funding: the National Natural Science Foundation of China (T2425013, 32370701, 32470692, 32170657); the National Key R&D Project of China (2023YFC3402501); Shanghai Municipal Science and Technology Major Project; the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA12040104); the 111 Project (B13016).
Abstract: The treatment efficacy of anti-diabetic therapies is highly heterogeneous among patients with type 2 diabetes (T2D) (Ahmad et al. 2022). Predictive biomarkers can be used to stratify patients into subgroups with varying efficacy before they receive treatment, and can help advance the understanding of disease and treatment (Ahmad et al. 2022). Thus, identifying predictive biomarkers is important for precision medicine in patients with T2D. Approved in China in October 2021 as an adjunct to diet and exercise for improving glycemic control in adult patients with T2D, chiglitazar is a non-thiazolidinedione agonist of the α, δ and γ subtypes of the peroxisome proliferator-activated receptors (PPARs) (Deeks 2022).
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFF1202203) and the NIFDC Fund for Key Technology Research, China (Grant No. GJJS-2022-2-1).
Abstract: Homologous recombination deficiency (HRD) has emerged as a critical prognostic and predictive biomarker in oncology. However, current testing methods, especially those reliant on targeted panels, are plagued by inconsistent results from the same samples. This highlights the urgent need for standardized benchmarks to evaluate HRD assay performance. In phases IIa and IIb of the Chinese HRD Harmonization Project, we developed ten pairs of well-characterized DNA reference materials derived from lung, breast, and melanoma cancer cell lines and their matched normal cell lines, with each pair covering seven cancer-to-normal mass ratios. Reference datasets for allele-specific copy number variations (AsCNVs) and HRD scores were established and validated using three sequencing methods and nine analytical pipelines. The genomic instability scores (GISs) of the reference materials ranged from 11 to 96, enabling validation across various thresholds. The AsCNV reference datasets covered a genomic span of 2340 to 2749 Mb, equivalent to 81.2% to 95.4% of the autosomes in the 37d5 reference genome. These benchmarks were subsequently used to assess the accuracy and reproducibility of four HRD panel assays, revealing significant variability in both AsCNV detection and HRD scores. The concordance between panel-detected GISs and reference GISs ranged from 0.81 to 0.94, with only two assays exhibiting high overall agreement with Myriad MyChoice CDx for HRD classification. This study also identified specific challenges in AsCNV detection in HRD-related regions and the profound impact of high ploidy on consistency. The established HRD reference materials and datasets provide a robust toolkit for objective evaluation of HRD testing.
Funding: Supported by the National High Technology Research and Development Program of China (2015AA020104); the China Human Proteome Project (2014DFB30010); the National Science Foundation of China (31471239, to Leming Shi); the 111 Project (B13016).
Abstract: Bioinformatics methods for RNA-seq data analysis are evolving rapidly as sequencing technologies improve. However, many challenges remain in processing RNA-seq data efficiently to obtain accurate and comprehensive results. Here we review strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and mapping them to the transcriptome are two distinct methods for quantifying the expression of genes/transcripts. Besides the known genes annotated in current databases, many novel genes/transcripts (especially long noncoding RNAs) can still be identified on the reference genome using RNA-seq. Moreover, because current reference genomes are incomplete, some novel genes are missing from them. Genome-guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying novel genes/transcripts on or beyond the reference genome. In addition, integrating the genes of distinct databases when conducting transcriptomic and genetic studies can improve the results of the corresponding analyses.
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFC2409902); the National Natural Science Foundation of China (Grant No. 82172876); the Beijing Nova Program of Science and Technology (Grant No. Z191100001119095); the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (Grant No. 2021-I2M-1-066); the Beijing Hope Run Special Fund of Cancer Foundation of China (Grant No. LC2019L04).
Abstract: Defects in genes involved in the DNA damage response cause homologous recombination repair deficiency (HRD). HRD is found in a subgroup of cancer patients across several tumor types, and it is clinically relevant to cancer prevention and therapy. Accumulating evidence has identified HRD as a biomarker for assessing the therapeutic response of tumor cells to poly(ADP-ribose) polymerase inhibitors and platinum-based chemotherapies. Nevertheless, the biology of HRD is complex, and its applications and the benefits of different HRD biomarker assays are controversial. This is primarily due to inconsistencies in HRD assessments and definitions (gene-level tests, genomic scars, mutational signatures, or a combination of these methods) and to difficulties in assessing the contribution of each genomic event. Therefore, we aim to review the biological rationale and clinical evidence for HRD as a biomarker. This review provides a blueprint for the standardization and harmonization of HRD assessments.
Funding: Supported by the FDA Project (E0765001) and the National Key Research and Development Program of China (2016YFC0902100, to Geng Chen).
Abstract: High-throughput next-generation sequencing (NGS) is a shotgun approach applied in parallel: the genome is fragmented and sequenced in small pieces, which are then analyzed either by alignment to a known reference genome or by de novo assembly without a reference genome. This technology has led to an explosion of sequencing-related projects across multidisciplinary fields of science. However, owing to the limitations of sequencing chemistry, the length of sequencing reads, and the complexity of genes, it is difficult to determine the sequences of some portions of the human genome, leaving gaps in genomic data that frustrate further analysis. In particular, some complex genes are difficult to sequence or map accurately because they contain high-GC-content and/or low-complexity regions and have complicated pseudogenes; examples include the genes encoding xenobiotic metabolizing enzymes and transporters (XMETs). Genetic variants in XMET genes are critical for predicting interindividual variability in drug efficacy, drug safety, and susceptibility to environmental toxicity. We summarize and discuss the challenges, wet-lab methods, and bioinformatics algorithms involved in sequencing "complex" XMET genes, which may provide useful information for applying NGS technology in toxicogenomics and pharmacogenomics.
Funding: Supported by the China Human Proteomics Project (2014DFB30010); the National High Technology Research and Development Program of China (2015AA020104); the National Natural Science Foundation of China (31071162); the Graduate School of East China Normal University.
Abstract: Human and mouse orthologs are expected to have similar biological functions; however, many discrepancies have also been reported. We systematically compared human and mouse orthologs in terms of alternative splicing patterns and expression profiles. Human-mouse orthologs are divergent in alternative splicing, as human orthologs can generally encode more isoforms than their mouse counterparts. In early embryos, exon skipping is far more common in human orthologs, whereas constitutive exons are more prevalent in mouse orthologs. This may correlate with divergence in the expression of splicing regulators. Orthologous expression similarity differs between embryonic stages and is highest in the morula. Expression differences for orthologous transcription factor genes could play an important role in orthologous expression discordance. We further detected extensive orthologous divergence in differential expression between distinct embryonic stages. Collectively, our study uncovers significant orthologous divergence in multiple aspects, which may result in functional differences and dynamics between human-mouse orthologs during embryonic development.
Funding: Supported in part by the National High Technology Research and Development Program of China (2015AA020104, 2015AA020108); the China Human Proteomics Project (2014DF30030); the National Science Foundation of China (31471239).
Abstract: Bifunctional RNAs, which possess both protein-coding and noncoding functional properties, have been little explored and remain poorly understood. Here we systematically explored the characteristics and functions of such human bifunctional RNAs by integrating tandem mass spectrometry and RNA-seq data. We first constructed a pipeline to identify and annotate bifunctional RNAs, leading to the characterization of 132 high-confidence bifunctional RNAs. Our analyses indicate that bifunctional RNAs may be involved in human embryonic development and can be functional in diverse tissues. Moreover, bifunctional RNAs could interact with multiple miRNAs and RNA-binding proteins to exert their corresponding roles. Bifunctional RNAs may also function as competing endogenous RNAs, regulating the expression of many genes by competing for shared targeting miRNAs. Finally, somatic mutations in diverse carcinomas may have harmful effects on the corresponding bifunctional RNAs. Collectively, our study not only provides a pipeline for identifying and annotating bifunctional RNAs but also reveals their important gene-regulatory functions.
Funding: Supported in part by Shanghai Sailing Program (22YF1403500); the National Natural Science Foundation of China (32300536, 31720103909 and 32170657); the National Key R&D Project of China (2018YFE0201603 and 2018YFE0201600); State Key Laboratory of Genetic Engineering (SKLGE-2117); the 111 Project (B13016).
Abstract: High-throughput technologies for multiomics or molecular phenomics profiling have been extensively adopted in biomedical research and clinical applications, offering a more comprehensive understanding of biological processes and diseases. Omics reference materials play a pivotal role in ensuring the accuracy, reliability, and comparability of laboratory measurements and analyses. However, the current application of omics reference materials has revealed several issues, including inappropriate selection and underutilization, leading to inconsistencies across laboratories. This review aims to address these concerns by emphasizing the importance of well-characterized reference materials at each level of omics, encompassing (epi-)genomics, transcriptomics, proteomics, and metabolomics. By summarizing their characteristics, advantages, and limitations along with appropriate performance metrics pertinent to study purposes, we provide an overview of how omics reference materials can enhance data quality and data integration, thus fostering robust scientific investigations with omics technologies.
Funding: Supported in part by the National Natural Science Foundation of China (31720103909 and 32170657); the National Key R&D Project of China (2018YFE0201603, 2018YFE0201600, and 2021YFF1201305); Shanghai Municipal Science and Technology Major Project (2017SHZDZX01); State Key Laboratory of Genetic Engineering (SKLGE-2117); the 111 Project (B13016).
Abstract: RNA sequencing (RNAseq) technology has become increasingly important in precision medicine and clinical diagnostics, and has emerged as a powerful tool for identifying protein-coding genes, performing differential gene analysis, and inferring immune cell composition. Human peripheral blood samples are widely used for RNAseq, providing valuable insights into individual biomolecular information. Blood samples can be classified as whole blood (WB), plasma, serum, and remaining sediment samples; the latter include plasma-free blood (PFB) and serum-free blood (SFB) samples, which are generally considered less useful byproducts of plasma and serum separation, respectively. However, the feasibility of using PFB and SFB samples for transcriptome analysis remains unclear. In this study, we aimed to assess the suitability of PFB or SFB samples as an alternative RNA source in transcriptomic analysis. We performed a comparative analysis of WB, PFB, and SFB samples for different applications. Our results revealed that PFB samples are more similar to WB samples than SFB samples are in terms of protein-coding gene expression patterns, detection of differentially expressed genes, and immunological characterizations, suggesting that PFB can serve as a viable alternative to WB for transcriptomic analysis. Our study contributes to the optimization of blood sample utilization and the advancement of precision medicine research.