Objective: Challenges remain in current colorectal cancer (CRC) screening practice, such as low compliance, low specificity, and high cost. This study aimed to identify high-risk groups for CRC in the general population using regular health examination data. Methods: The study population consisted of more than 7,000 CRC cases and more than 140,000 controls. Using regular health examination data, a model for detecting CRC cases was derived with the classification and regression trees (CART) algorithm. Receiver operating characteristic (ROC) curves were used to evaluate model performance. The robustness and generalizability of the CART model were validated on independent datasets. In addition, the effectiveness of CART-based screening was compared with that of stool-based screening. Results: After data quality control, 4,647 CRC cases and 133,898 controls free of colorectal neoplasms were used for downstream analysis. The final CART model was built on four biomarkers (age, albumin, hematocrit, and percent lymphocytes). In the test set, the area under the ROC curve (AUC) of the CART model was 0.88 [95% confidence interval (95% CI), 0.87-0.90] for detecting CRC. At the cutoff yielding 99.0% specificity, the model's sensitivity was 62.2% (95% CI, 58.1%-66.2%), thereby achieving a 63-fold enrichment of CRC cases. We validated the robustness of the method across subsets of the test set with diverse CRC incidences, aging rates, sex ratios, distributions of tumor stages and locations, and data sources. Importantly, CART-based screening had a higher positive predictive value (1.6%) than the fecal immunochemical test (0.3%). Conclusions: As an alternative approach to the early detection of CRC, this study provides a low-cost method that uses regular health examination data to identify individuals at high risk of CRC for further examination. The approach can promote early detection of CRC, especially in developing countries such as China, where annual health examinations are popular but regular CRC-specific screening is rare.
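The relation between sensitivity, specificity, and positive predictive value (PPV) in the abstract above can be illustrated with Bayes' rule. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the prevalence is an assumed placeholder chosen for illustration, since the abstract does not report one.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule: P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


# Sensitivity and specificity are the values reported in the abstract.
SENSITIVITY = 0.622
SPECIFICITY = 0.990
# ASSUMPTION: a prevalence of 26 per 100,000 is a placeholder for illustration,
# not a figure taken from the study.
ASSUMED_PREVALENCE = 0.00026

ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, ASSUMED_PREVALENCE)
lr_plus = positive_likelihood_ratio(SENSITIVITY, SPECIFICITY)
print(f"PPV = {ppv:.4f}, LR+ = {lr_plus:.1f}")
```

Under this assumed prevalence the PPV comes out at roughly 1.6%, which shows how a test with 99% specificity can still yield a low PPV when the condition is rare in the screened population.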
The value of a statistical life (VSL) is a crucial tool for monetizing health impacts. To explore the VSL in China, this study examines people’s willingness to pay (WTP) to reduce the risk of death from air pollution in six representative cities in China, based on face-to-face contingent valuation interviews (n = 3936) conducted from March 7, 2019 to September 30, 2019. The results reveal that the WTP varied from CNY 455 to 763 in 2019 (USD 66-111), corresponding to a VSL range of CNY 3.79-6.36 million (USD 549,395-921,940). The VSL in China in 2019 is estimated to be CNY 4.76 million (USD 689,659). The statistics indicate that monthly expenditure levels, environmental concerns, risk attitudes, and assumed market acceptance, which have seldom been discussed in previous studies, significantly affect WTP and VSL. These findings will serve as a reference for analyzing the benefits of mortality risk reduction in future research and for policymaking.
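The standard relation between WTP and VSL (VSL equals WTP divided by the mortality risk reduction being purchased) can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the risk reduction of 1.2 in 10,000 is an assumption inferred here from the ratio of the reported WTP and VSL figures; the abstract does not state it.

```python
def value_of_statistical_life(wtp, risk_reduction):
    """VSL = willingness to pay divided by the mortality risk reduction it buys."""
    if risk_reduction <= 0:
        raise ValueError("risk_reduction must be positive")
    return wtp / risk_reduction


# ASSUMPTION: inferred from the reported WTP/VSL ratio, not stated in the abstract.
ASSUMED_RISK_REDUCTION = 1.2e-4

# WTP bounds (CNY) as reported in the abstract.
for wtp_cny in (455, 763):
    vsl = value_of_statistical_life(wtp_cny, ASSUMED_RISK_REDUCTION)
    print(f"WTP CNY {wtp_cny} -> VSL CNY {vsl / 1e6:.2f} million")
```

With this assumed risk reduction, the two WTP bounds map to about CNY 3.79 million and CNY 6.36 million, matching the reported VSL range.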
Tumor research is a fundamental focus of medical science, yet the intrinsic heterogeneity and complexity of tumors make it difficult to understand the biological mechanisms of their initiation, progression, and metastasis. Recent advances in single-cell transcriptomic sequencing have revolutionized the way researchers explore tumor biology by providing unprecedented resolution. However, a key limitation of single-cell sequencing is the loss of spatial information during single-cell preparation. Spatial transcriptomics (ST) has emerged as a cutting-edge technology in tumor research that preserves the spatial information of RNA transcripts, thereby facilitating a deeper understanding of tumor heterogeneity and of the intricate interplay between tumor cells and the tumor microenvironment. This review systematically introduces ST technologies and summarizes their latest applications in tumor research. Furthermore, we provide a thorough overview of the bioinformatics analysis workflow for ST data and offer an online tutorial (https://github.com/SiyuanHuang1/ST_Analysis_Handbook). Lastly, we discuss potential future directions for ST. We believe that ST will become a powerful tool for unraveling tumor biology and will offer new insights for effective treatment and precision medicine in oncology.
This paper reviews concentration inequalities that are widely employed in non-asymptotic analyses in mathematical statistics across a wide range of settings: from distribution-free to distribution-dependent, from sub-Gaussian to sub-exponential, sub-Gamma, and sub-Weibull random variables, and from concentration of the mean to concentration of the maximum. The review presents results in these settings, including some new ones. Given the increasing popularity of high-dimensional data and inference, results in the context of high-dimensional linear and Poisson regressions are also provided. We aim to illustrate the concentration inequalities with known constants and to improve existing bounds with sharper constants.
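As a concrete instance of a distribution-free concentration inequality with explicit constants, Hoeffding's inequality bounds the tail of the sample mean of n independent [0, 1]-valued variables by 2·exp(-2nt²). The sketch below is an illustration chosen here, not a result taken from the review; it compares the bound with a Monte Carlo estimate for fair coin flips, and all names and parameter values are hypothetical.

```python
import math
import random


def hoeffding_bound(n, t):
    """Two-sided Hoeffding bound for the mean of n independent [0, 1]-valued variables."""
    return 2.0 * math.exp(-2.0 * n * t * t)


def empirical_tail(n, t, trials, rng):
    """Monte Carlo estimate of P(|sample mean - 0.5| >= t) for fair coin flips."""
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() < 0.5 for _ in range(n)) / n
        if abs(mean - 0.5) >= t:
            hits += 1
    return hits / trials


rng = random.Random(0)  # fixed seed so the run is reproducible
n, t = 100, 0.125
bound = hoeffding_bound(n, t)
estimate = empirical_tail(n, t, trials=2000, rng=rng)
print(f"empirical tail = {estimate:.3f}, Hoeffding bound = {bound:.3f}")
```

The empirical tail should fall below the bound; the gap reflects that Hoeffding's inequality makes no use of the variance, which is the kind of slack that sharper, distribution-dependent bounds aim to reduce.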
Background: To systematically summarize and categorize the Chinese herbal medicines reported in the domestic traditional Chinese medicine (TCM) literature on type 2 diabetes mellitus (T2DM), we mine TCM data for relationships and provide a reference for future practitioners and researchers. Methods: Taking randomized controlled trials on the treatment of T2DM in TCM as the research theme, we searched for full-text literature published between 1990 and 2020 in three major clinical databases: CNKI, Wan Fang, and VIP. We then conducted frequency statistics, cluster analysis, association rule extraction, and principal component analysis on a corpus of medical academic words extracted from 1116 research articles. Results: The most frequently used herb was Astragali Radix, and the most commonly used two-herb combination in T2DM treatment consisted of Coptidis Rhizoma and Moutan Cortex. Moutan Cortex, Alismatis Rhizoma, and Dioscoreae Rhizoma formed the most frequently used three-herb combination. We found a “lung”, “liver”, and “kidney” model and confirmed the value of classical meridian tropism theory and pattern identification. The treatment is mainly to fill deficiency and clear heat and consider water infiltration, dampness, blood circulation, and silt. Conclusion: This study provides an in-depth perspective on the TCM medication rules for T2DM and offers practitioners and researchers valuable information about the current status and frontier trends of TCM research on T2DM in terms of diagnosis and treatment.
Human primed pluripotent stem cells are capable of generating all the embryonic lineages. However, their extraembryonic trophectoderm potential is limited, and it remains unclear how to expand their developmental potential to trophectoderm lineages. Here we show that transient treatment with a cocktail of small-molecule epigenetic modulators imparts trophectoderm lineage potential to human primed pluripotent stem cells while preserving their embryonic potential. These chemically treated cells can generate trophectoderm-like cells and downstream trophoblast stem cells, which diverge into syncytiotrophoblast and extravillous trophoblast lineages. Transcriptomic and CUT&Tag analyses reveal that these induced cells share transcriptional profiles with in vivo trophectoderm and cytotrophoblast, and exhibit reduced H3K27me3 modification at gene loci specific to trophoblast lineages compared with primed pluripotent cells. Mechanistic exploration highlighted the critical roles of the epigenetic modulators HDAC2, EZH1/2, and KDM5s in the activation of trophoblast lineage potential. Our findings demonstrate that transient epigenetic resetting activates unrestricted lineage potential in human primed pluripotent stem cells, and offer new mechanistic insights into human trophoblast lineage specification as well as in vitro models for studying placental development and related disorders.
In this paper, we provide a pathwise spine decomposition for multitype superdiffusions with nonlocal branching mechanisms under a martingale change of measure. As an application of this decomposition, we obtain a necessary and sufficient condition (called the L log L criterion) for the limit of the fundamental martingale to be non-degenerate. This result complements the related results obtained in Kyprianou et al. (2012), Kyprianou and Murillo-Salas (2013), and Liu et al. (2009) for superprocesses with purely local branching mechanisms and in Kyprianou and Palau (2018) for super Markov chains.
Eukaryotic genomes are densely packaged into hierarchical three-dimensional (3D) structures that contain information about gene regulation and many other biological processes. With the development of imaging and sequencing-based technologies, 3D genome studies have revealed that the high-order chromatin structure is composed of hierarchical levels, including chromosome territories, A/B compartments, topologically associated domains, and chromatin loops. However, how this chromatin architecture is formed and maintained is not completely clear. In this review, we introduce experimental methods to investigate the 3D genome, review major architectural proteins that regulate 3D chromatin organization in mammalian cells, such as CTCF (CCCTC-binding factor), cohesin, lamins, and transcription factors, and discuss relevant mechanisms such as phase separation.
The eukaryotic genome is folded into a higher-order conformation accompanied by constrained dynamics for coordinated genome functions. However, the molecular machinery underlying this hierarchically organized three-dimensional (3D) chromatin architecture and its dynamics remains poorly understood. Here, by combining imaging and sequencing, we studied the role of lamin B1 in chromatin architecture and dynamics. We found that lamin B1 depletion leads to detachment of lamina-associated domains (LADs) from the nuclear periphery, accompanied by global chromatin redistribution and decompaction. Consequently, inter-chromosomal as well as inter-compartment interactions are increased, but the structure of topologically associating domains (TADs) is not affected. Using live-cell tracking of genomic loci, we further showed that depletion of lamin B1 leads to increased chromatin dynamics, owing to chromatin decompaction and redistribution toward the nucleoplasm. Taken together, our data suggest that interactions between lamin B1 and chromatin at the nuclear periphery promote LAD maintenance, chromatin compaction, and genomic compartmentalization into chromosome territories and A/B compartments, and confine chromatin dynamics, supporting their crucial roles in higher-order chromatin structure and chromatin dynamics.
We study the law of the iterated logarithm (LIL) for maximum likelihood estimation of the parameters (viewed as a convex optimization problem) in generalized linear models with independent or weakly dependent (ρ-mixing) responses under mild conditions. The LIL is useful for deriving asymptotic bounds on the discrepancy between the empirical process of the log-likelihood function and the true log-likelihood. The strong consistency of some penalized likelihood-based model selection criteria can be shown as an application of the LIL. Under some regularity conditions, the model selection criterion selects the simplest correct model almost surely when the penalty term increases with the model dimension and has an order higher than O(log log n) but lower than O(n). Simulation studies are carried out to verify the selection consistency of the Bayesian information criterion.
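The Bayesian information criterion uses a penalty of log n per parameter, which grows faster than log log n and slower than n, so it falls inside the range described in the abstract above. The sketch below illustrates this with the usual BIC formula; the function names are hypothetical and the log-likelihood values are made-up numbers for illustration, not simulation results from the paper.

```python
import math


def bic(log_likelihood, num_params, n):
    """Bayesian information criterion: -2 * log-likelihood + k * log(n)."""
    return -2.0 * log_likelihood + num_params * math.log(n)


def penalty_orders(n):
    """Return (log log n, log n, n) to compare growth rates of penalty terms."""
    return math.log(math.log(n)), math.log(n), float(n)


# The BIC per-parameter penalty log n sits strictly between log log n and n.
for n in (10, 1000, 10**6):
    low, mid, high = penalty_orders(n)
    print(f"n={n}: log log n={low:.3f} < log n={mid:.3f} < n={high:.0f}")

# ASSUMPTION: hypothetical fitted log-likelihoods for two nested models.
n_obs = 200
bic_small = bic(log_likelihood=-120.0, num_params=2, n=n_obs)
bic_large = bic(log_likelihood=-119.5, num_params=3, n=n_obs)
chosen = "small" if bic_small < bic_large else "large"
print(f"BIC small={bic_small:.2f}, large={bic_large:.2f}, chosen={chosen}")
```

In this toy comparison the extra parameter improves the log-likelihood only slightly, so the log n penalty outweighs the gain and the smaller model is preferred.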
Demographic estimation becomes a problem of small area estimation when detailed disaggregation leads to small cell counts. The usual difficulties of small area estimation are compounded when the available data sources contain measurement errors. We present a Bayesian approach to the problem of small area estimation with imperfect data sources. The overall model contains separate submodels for the underlying demographic processes and for the measurement processes. All unknown quantities in the model, including coverage ratios and demographic rates, are estimated jointly via Markov chain Monte Carlo methods. The approach is illustrated using the example of provincial fertility rates in Cambodia.
Consider a supercritical superprocess X = {X_t, t ≥ 0} on a locally compact separable metric space (E, m). Suppose that the spatial motion of X is a Hunt process satisfying certain conditions and that the branching mechanism is of the form
We consider the small value probability of supercritical continuous-state branching processes with immigration. From Pinsky (1972) it is known that, under regularity conditions on the branching mechanism and the immigration mechanism, the normalized population size converges to a non-degenerate, finite, and positive limit W as t tends to infinity. We provide sharp estimates of the asymptotic behavior of P(W ≤ ε) as ε → 0+ by studying the Laplace transform of W. Without immigration, we also give a simpler proof of the small value probability in the non-subordinator case via the prolific backbone decomposition.
A new dimension-reduction graphical method for testing high-dimensional normality is developed using the theory of spherical distributions and the idea of principal component analysis. The dimension reduction is achieved by projecting high-dimensional data onto selected eigenvector directions. The asymptotic statistical independence of the plotting functions on the selected eigenvector directions provides the principle for the new plot. A departure of the raw data from multivariate normality can be captured by at least one plot on a selected eigenvector direction. Acceptance regions associated with the plots are provided to make the plots easier to interpret. Monte Carlo studies and an illustrative example show that the proposed graphical method has competitive power and significantly improves on the existing graphical method for testing high-dimensional normality.
Although substantial work has shown that the coronavirus disease 2019 (COVID-19) resurgence in Beijing, China, was initiated by contaminated frozen products transported via the cold chain, international travelers with asymptomatic infections or false-negative nucleic acid tests are another possible route by which the virus could have spread to Beijing. One of the key differences between these two hypotheses is whether the virus was actively replicating, since no report so far has shown that viruses can stop evolving in living hosts. We studied severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences from this outbreak using a modified leaf-dating method with the Bayes factor. The numbers of single nucleotide variants (SNVs) found in the SARS-CoV-2 sequences were significantly lower than those called from B.1.1 records collected worldwide at the matching time (P = 0.047). In addition, the leaf-dating results showed that the estimated ages of viruses sampled from this outbreak were earlier than their recorded collection dates (Bayes factors > 10), whereas control sequences (selected randomly with ten replicates) showed no difference from their collection dates (Bayes factors < 10). Our results, which indicate that the re-emergence of SARS-CoV-2 in Beijing in June 2020 was caused by a virus that exhibited a lack of evolutionary change compared with viruses collected at the corresponding time, provide evolutionary evidence that contaminated imported frozen food was responsible for the reappearance of COVID-19 cases in Beijing. The method developed here might also help provide the very first clues to potential sources of COVID-19 cases in the future.
Expected shortfall (ES) is a popular risk measure and plays an important role in risk and portfolio management. Recently, change-point detection for risk measures has attracted much attention in finance. Based on the self-normalized CUSUM statistic of Fan, Glynn and Pelger (2018) and the Wild Binary Segmentation (WBS) algorithm of Fryzlewicz (2014), this paper proposes a variant WBS procedure to detect and estimate change points of ES in time series. The strengthened Schwarz information criterion is also introduced to determine the number of change points. Monte Carlo simulation studies are conducted to assess the finite-sample performance of the variant WBS procedure for ES in time series. An empirical application illustrates the usefulness of the procedure.
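To give a feel for CUSUM-based change-point detection, the sketch below implements the plain CUSUM statistic for a single shift in the mean. This is a deliberately simpler stand-in: it is not the self-normalized CUSUM statistic for ES, and it does not implement Wild Binary Segmentation or the information criterion named in the abstract. The function names and the toy series are hypothetical.

```python
import math


def cusum_statistics(xs):
    """Plain CUSUM statistics |S_k - (k/n) S_n| / sqrt(n) for k = 1, ..., n-1."""
    n = len(xs)
    total = sum(xs)
    stats = []
    running = 0.0
    for k in range(1, n):
        running += xs[k - 1]  # running is now the partial sum S_k
        stats.append(abs(running - k / n * total) / math.sqrt(n))
    return stats


def estimate_change_point(xs):
    """Return (k, statistic), where k maximizes the CUSUM statistic.

    k is the number of observations in the first segment.
    """
    stats = cusum_statistics(xs)
    best = max(range(len(stats)), key=stats.__getitem__)
    return best + 1, stats[best]


# Toy series with a single mean shift after the 50th observation.
series = [0.0] * 50 + [1.0] * 50
k_hat, stat = estimate_change_point(series)
print(f"estimated change point after observation {k_hat}, statistic {stat:.2f}")
```

A procedure in the spirit of WBS would apply a statistic like this on many randomly drawn subintervals and recurse on either side of each detected change point; that step is omitted here.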
This paper examines the theoretical and empirical properties of a supervised factor model based on combining forecasts using principal components (CFPC), in comparison with two other supervised factor models (partial least squares regression, PLS, and principal covariate regression, PCovR) and with the unsupervised principal component regression, PCR. Supervision refers to training the predictors for the variable to be forecast. We compare the performance of the three supervised factor models and the unsupervised factor model in forecasting U.S. CPI inflation. The main finding is that the predictive ability of the supervised factor models is much better than that of the unsupervised factor model. The computation of the factors can be doubly supervised by combining it with variable selection, which can further improve the forecasting performance of the supervised factor models. Among the three supervised factor models, CFPC performs best and is also the most stable. While PCovR also performs well and is stable, the performance of PLS is less stable across different out-of-sample forecasting periods. The effect of supervision becomes even larger as the forecast horizon increases. Supervision helps to reduce the number of factors and lags needed in modelling economic structure, achieving greater parsimony.
The past two decades have witnessed the rapid growth of digital technologies and of data collected via countless channels. They are combined in numerous ways in different fields, including epidemiology, mHealth, and modeling of health systems, with the intention of improving human health (e.g., clinical decision support, electronic medical record management) [1-6]. However, this is a new interdisciplinary area in which no single scientific discipline knows how to take full advantage of these data and technologies to solve health problems [1].
We formulate a Lagrange method for continuous-time stochastic optimization in an appropriate normed space by using a proper stochastic process as the Lagrange multiplier. The resulting optimality conditions are applied to different types of problems. Some examples selected from control theory and economic theory are studied to test and illustrate the potential applications of the method.
Relative-risk models are often used to characterize the relationship between survival time and time-dependent covariates. When the covariates are fully observed, estimation methods and asymptotic theory for the parameters of interest are available; challenges remain when missingness occurs. A popular approach is to jointly model the survival data and the longitudinal data. This seems efficient, since it makes use of more information, but rigorous theoretical studies have long been lacking. For both additive risk models and relative-risk models, we treat the missing data as nonignorable. Under general regularity conditions, we prove asymptotic normality of the nonparametric maximum likelihood estimators.
Funding (CRC screening study): Supported by the Beijing Municipal Science & Technology Commission, Clinical Application and Development of Capital Characteristic (No. Z161100000516003), and the National Natural Science Foundation of China (No. 31871266).
Funding (VSL study): Supported by the National Natural Science Foundation of China [Grant No. 71773061].
Funding (spatial transcriptomics review): Supported by the National Key R&D Program of China [2020YFE0204200 to R.X.], the National Natural Science Foundation of China [12371286, 11971039 to R.X., 12201219 to J.M.], the Sino-Russian Mathematics Center, the Foundation of Qinglonghu Laboratory, the Shanghai Sailing Program (No. 21YF1410600 to J.M.), and the Shanghai Key Program of Computational Biology (No. 23JS1400500, 23JS1400800 to J.M.).
Funding (concentration inequalities review): Funded by the National Natural Science Foundation of China (Grants 92046021, 12071013, 12026607, 71973005) and LMEQF at Peking University.
Funding (TCM and T2DM study): Supported by China’s National Key R&D Program (No. 2019YFC1709801).
Funding (pluripotent stem cell study): Supported by the National Key Research and Development Program of China (2021YFA1100300) and the National Natural Science Foundation of China (32288102, 32370843, and 32025006). Part of the data analysis was performed on the High Performance Computing Platform of the Center for Life Sciences, Peking University.
Funding: Supported by the Simons Foundation (Grant No. 520542), a Victor Klee Faculty Fellowship, and the National Natural Science Foundation of China (Grant No. 11731009); supported by the National Natural Science Foundation of China (Grant Nos. 11671017 and 11731009) and the Key Laboratory of Mathematical Economics and Quantitative Finance (LMEQF) (Peking University), Ministry of Education; supported by the Simons Foundation (Grant No. #429343).
Abstract: In this paper, we provide a pathwise spine decomposition for multitype superdiffusions with nonlocal branching mechanisms under a martingale change of measure. As an application of this decomposition, we obtain a necessary and sufficient condition (called the L log L criterion) for the limit of the fundamental martingale to be non-degenerate. This result complements the related results obtained in Kyprianou et al. (2012), Kyprianou and Murillo-Salas (2013) and Liu et al. (2009) for superprocesses with purely local branching mechanisms and in Kyprianou and Palau (2018) for super Markov chains.
Funding: The National Natural Science Foundation of China (NSFC) (31871266 for C.L.; 21573013 and 21825401 for Y.S.); the National Key Research and Development Program of China (2016YFA0100103 for C.L.; 2017YFA0505302 for Y.S.); NSFC Key Research Grant 71532001 for C.L.
Abstract: Eukaryotic genomes are densely packaged into hierarchical three-dimensional (3D) structures that contain information about gene regulation and many other biological processes. With the development of imaging and sequencing-based technologies, 3D genome studies have revealed that the high-order chromatin structure is composed of hierarchical levels, including chromosome territories, A/B compartments, topologically associated domains, and chromatin loops. However, how this chromatin architecture is formed and maintained is not completely clear. In this review, we introduce experimental methods to investigate the 3D genome, review major architectural proteins that regulate 3D chromatin organization in mammalian cells, such as CTCF (CCCTC-binding factor), cohesin, lamins, and transcription factors, and discuss relevant mechanisms such as phase separation.
Funding: This work is supported by grants from the National Key R&D Program of China, No. 2017YFA0505302; the National Science Foundation of China, 21573013 and 21825401 for Y.S.; Chinese National Key Projects of Research and Development, No. 2016YFA0100103; the Peking-Tsinghua Center for Life Sciences; and National Natural Science Foundation of China Key Research Grant 31871266 for C.L.
Abstract: The eukaryotic genome is folded into a higher-order conformation accompanied by constrained dynamics for coordinated genome functions. However, the molecular machinery underlying this hierarchically organized three-dimensional (3D) chromatin architecture and its dynamics remains poorly understood. Here, by combining imaging and sequencing, we studied the role of lamin B1 in chromatin architecture and dynamics. We found that lamin B1 depletion leads to detachment of lamina-associated domains (LADs) from the nuclear periphery, accompanied by global chromatin redistribution and decompaction. Consequently, interchromosomal as well as inter-compartment interactions are increased, but the structure of topologically associating domains (TADs) is not affected. Using live-cell genomic loci tracking, we further showed that depletion of lamin B1 leads to increased chromatin dynamics, owing to chromatin decompaction and redistribution toward the nucleoplasm. Taken together, our data suggest that lamin B1 and chromatin interactions at the nuclear periphery promote LAD maintenance, chromatin compaction, and genomic compartmentalization into chromosome territories and A/B compartments, and confine chromatin dynamics, supporting their crucial roles in chromatin higher-order structure and chromatin dynamics.
Abstract: We study the law of the iterated logarithm (LIL) for the maximum likelihood estimation of the parameters (as a convex optimization problem) in generalized linear models with independent or weakly dependent (ρ-mixing) responses under mild conditions. The LIL is useful for deriving asymptotic bounds on the discrepancy between the empirical process of the log-likelihood function and the true log-likelihood. The strong consistency of some penalized likelihood-based model selection criteria can be shown as an application of the LIL. Under some regularity conditions, the model selection criterion selects the simplest correct model almost surely when the penalty term increases with the model dimension and has an order higher than O(log log n) but lower than O(n). Simulation studies are implemented to verify the selection consistency of the Bayesian information criterion.
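The penalty condition stated in this abstract can be made concrete with a small sketch of a penalized likelihood criterion of the form -2·loglik + λ_n·dim. The function name and its arguments are assumptions for illustration; the abstract does not give the paper's exact criterion. The standard BIC penalty λ_n = log n lies strictly between log log n and n in order, whereas the AIC penalty λ_n = 2 is constant and so does not exceed O(log log n).

```python
import math


def penalized_criterion(loglik, dim, n, penalty="bic"):
    """Penalized likelihood criterion: -2 * loglik + lambda_n * dim.

    loglik : maximized log-likelihood of the candidate model
    dim    : number of free parameters in the candidate model
    n      : sample size
    penalty: "bic" uses lambda_n = log(n); "aic" uses lambda_n = 2
    """
    if penalty == "bic":
        lam = math.log(n)
    elif penalty == "aic":
        lam = 2.0
    else:
        raise ValueError("penalty must be 'bic' or 'aic'")
    return -2.0 * loglik + lam * dim


def select_model(candidates, n, penalty="bic"):
    """Pick the candidate with the smallest criterion value.

    `candidates` maps a model label to a (loglik, dim) pair.
    """
    return min(
        candidates,
        key=lambda name: penalized_criterion(*candidates[name], n, penalty),
    )
```

With a small gain in log-likelihood from an extra parameter, the two penalties can disagree: the log n penalty favors the smaller model once n is large, while the constant penalty may still pick the larger one.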
Abstract: Demographic estimation becomes a problem of small area estimation when detailed disaggregation leads to small cell counts. The usual difficulties of small area estimation are compounded when the available data sources contain measurement errors. We present a Bayesian approach to the problem of small area estimation with imperfect data sources. The overall model contains separate submodels for underlying demographic processes and for measurement processes. All unknown quantities in the model, including coverage ratios and demographic rates, are estimated jointly via Markov chain Monte Carlo methods. The approach is illustrated using the example of provincial fertility rates in Cambodia.
Abstract: Consider a supercritical superprocess X = {X_t, t ≥ 0} on a locally compact separable metric space (E, m). Suppose that the spatial motion of X is a Hunt process satisfying certain conditions and that the branching mechanism is of the form
Funding: Supported by the National Science Foundation of the US (Grant Nos. DMS-0805929 and DMS-1106938); the National Natural Science Foundation of China (Grant Nos. 10928103, 10971003 and 11128101); the Specialized Research Fund for the Doctoral Program of Higher Education of China; and the Fundamental Research Funds for the Central Universities.
Abstract: We consider the small value probability of supercritical continuous state branching processes with immigration. From Pinsky (1972) it is known that, under regularity conditions on the branching mechanism and the immigration mechanism, the normalized population size converges to a non-degenerate, finite and positive limit W as t tends to infinity. We provide a sharp estimate of the asymptotic behavior of P(W ≤ ε) as ε → 0+ by studying the Laplace transform of W. Without immigration, we also give a simpler proof for the small value probability in the non-subordinator case via the prolific backbone decomposition.
Abstract: A new dimension-reduction graphical method for testing high-dimensional normality is developed by using the theory of spherical distributions and the idea of principal component analysis. The dimension reduction is realized by projecting high-dimensional data onto some selected eigenvector directions. The asymptotic statistical independence of the plotting functions on the selected eigenvector directions provides the principle for the new plot. A departure from multivariate normality of the raw data can be captured by at least one plot on the selected eigenvector directions. Acceptance regions associated with the plots are provided to enhance their interpretability. Monte Carlo studies and an illustrative example show that the proposed graphical method has competitive power and improves significantly on the existing graphical method in testing high-dimensional normality.
Funding: This work was supported by the National Natural Science Foundation of China (Grant number: 82041023), the Bill & Melinda Gates Foundation (Grant number: INV-016826), and China Mega-Projects for Infectious Disease (2018ZX10711001, 2017ZX10104001).
Abstract: Although considerable work has shown that the coronavirus disease 2019 (COVID-19) resurgence in Beijing, China, was initiated by contaminated frozen products transported via the cold chain, international travelers with asymptomatic infection or false-negative nucleic acid tests are another possible route by which the virus could have spread to Beijing. One of the key differences between these two hypotheses is whether the virus was actively replicating, since, so far, no reports have shown that viruses can stop evolving in living hosts. We studied severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences from this outbreak using a modified leaf-dating method with the Bayes factor. The numbers of single nucleotide variants (SNVs) found in these SARS-CoV-2 sequences were significantly lower than those called from B.1.1 records collected at the matching time worldwide (P=0.047). In addition, the leaf-dating method showed that the estimated dates of viruses sampled from this outbreak were earlier than their recorded dates of collection (Bayes factors>10), while control sequences (selected randomly with ten replicates) showed no differences from their collection dates (Bayes factors<10). Our results indicate that the re-emergence of SARS-CoV-2 in Beijing in June 2020 was caused by a virus that exhibited a lack of evolutionary changes compared with viruses collected at the corresponding time, providing evolutionary evidence that contaminated imported frozen food was responsible for the reappearance of COVID-19 cases in Beijing. The method developed here might also help to provide the very first clues about potential sources of COVID-19 cases in the future.
Funding: Supported in part by the NSFC (Nos. 71973077 and 11771239) and the Tsinghua University Initiative Scientific Research Program (No. 2019Z07L01009).
Abstract: Expected shortfall (ES) is a popular risk measure and plays an important role in risk and portfolio management. Recently, change-point detection for risk measures has been attracting much attention in finance. Based on the self-normalized CUSUM statistic in Fan, Glynn and Pelger (2018) and the Wild Binary Segmentation (WBS) algorithm in Fryzlewicz (2014), this paper proposes a variant WBS procedure to detect and estimate change points of ES in time series. The strengthened Schwarz information criterion is also introduced to determine the number of change points. Monte Carlo simulation studies are conducted to assess the finite-sample performance of the variant WBS procedure for ES in time series. An empirical application is given to illustrate the usefulness of the procedure.
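For readers unfamiliar with the risk measure, the following is a minimal sketch of an empirical expected shortfall estimate, taken here as the average of the worst `alpha` share of returns. The function name, the lower-tail convention, and the default level are assumptions for illustration; the sketch does not implement the self-normalized CUSUM statistic or the WBS procedure described in the abstract.

```python
import numpy as np


def expected_shortfall(returns, alpha=0.05):
    """Empirical expected shortfall of a return series.

    Sorts the returns in ascending order and averages the lowest
    floor(alpha * n) of them (at least one observation).
    """
    r = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.floor(alpha * r.size)))  # number of tail observations
    return float(r[:k].mean())


# Eight returns at level 0.25: the two worst are -4 and -3.
print(expected_shortfall([-4, -3, -2, -1, 1, 2, 3, 4], alpha=0.25))  # -3.5
```

A change point in ES would show up as a shift in this tail average between two stretches of the series even when the overall mean stays the same, which is why a mean-based detector alone may miss it.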
Funding: National Natural Science Foundation of China (Grants 71301004, 71472007, 71532001, 71671002); China's National Key Research Special Program (2016YFC0207705); the Center for Statistical Science at Peking University; and the Key Laboratory of Mathematical Economics and Quantitative Finance, Ministry of Education.
Abstract: This paper examines the theoretical and empirical properties of a supervised factor model based on combining forecasts using principal components (CFPC), in comparison with two other supervised factor models (partial least squares regression, PLS, and principal covariate regression, PCovR) and with the unsupervised principal component regression, PCR. Supervision refers to training the predictors for the variable to be forecast. We compare the performance of the three supervised factor models and the unsupervised factor model in forecasting U.S. CPI inflation. The main finding is that the predictive ability of the supervised factor models is much better than that of the unsupervised factor model. The computation of the factors can be doubly supervised together with variable selection, which can further improve the forecasting performance of the supervised factor models. Among the three supervised factor models, CFPC performs best and is also the most stable. While PCovR also performs well and is stable, the performance of PLS is less stable over different out-of-sample forecasting periods. The effect of supervision becomes even larger as the forecast horizon increases. Supervision helps to reduce the number of factors and lags needed in modelling economic structure, achieving more parsimony.
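The unsupervised baseline in this comparison, principal component regression, can be sketched in a few lines with NumPy. This is a generic textbook version, not the paper's implementation; the function name and argument layout are assumptions. The point of the sketch is that the factors are computed from the predictors alone, without reference to the target, which is what "unsupervised" means here.

```python
import numpy as np


def pcr_fit_predict(X_train, y_train, X_test, k):
    """Principal component regression with k factors.

    1. Center the training predictors.
    2. Extract the first k principal directions via SVD (y is not used).
    3. Regress y on the k factors by ordinary least squares.
    4. Project the test predictors on the same directions and predict.
    """
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    # Rows of Vt are the principal directions, ordered by explained variance
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                      # (p, k) loading matrix
    F = Xc @ V                        # (n, k) training factors
    A = np.column_stack([np.ones(len(F)), F])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    F_test = (X_test - mu) @ V
    return coef[0] + F_test @ coef[1:]
```

A supervised alternative such as PLS would instead choose the directions using the covariance between the predictors and the target, so fewer factors may be needed to reach the same forecast accuracy.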
Funding: Funded in part by research grants from the US-based China Medical Board (16-262); the United Nations International Children's Emergency Fund (2018-Nutrition-2.1.2.3); the National Natural Science Foundation of China (11771240, 91746205, 71673199); and funding from Xi'an Jiaotong University and the University of Twente.
Abstract: The past two decades have witnessed a burgeoning of digital technologies and of data collected via countless channels. They are combined in numerous ways in different fields, including epidemiology, mHealth and modeling of health systems, with the intention of improving human health (e.g., clinical decision support, electronic medical record management) [1-6]. However, this is a new interdisciplinary area in which no single scientific discipline knows how to take full advantage of these data and technologies to solve health problems [1].
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11001029); the National Basic Research Program of China (973 Program) (Grant No. 2007CB814902); the Science Fund for Creative Research Groups (Grant No. 11021161); and the Key Laboratory of Random Complex Structures and Data Science (Grant No. 2008DP173182).
Abstract: We formulate a Lagrange method for continuous-time stochastic optimization in an appropriate normed space by using a proper stochastic process as the Lagrange multiplier. The resulting optimality conditions are applied to different types of problems. Some examples selected from control theory and economic theory are studied to test and illustrate the potential applications of the method.
Funding: Funded by the National Natural Science Foundation of China (NSFC No. 11771241) and the Natural Science Foundation of Anhui Province (No. 1708085QA14).
Abstract: Relative-risk models are often used to characterize the relationship between survival time and time-dependent covariates. When the covariates are fully observed, estimation and asymptotic theory for the parameters of interest are available; challenges remain when missingness occurs. A popular approach is to jointly model survival data and longitudinal data. This seems efficient in that it makes use of more information, but rigorous theoretical study has long been neglected. For both additive risk models and relative-risk models, we treat the missing data as nonignorable. Under general regularity conditions, we prove asymptotic normality for the nonparametric maximum likelihood estimators.