Funding: Supported by the National Natural Science Foundation of China (Grant No. 72571150) and the Beijing Natural Science Foundation (Grant No. 9182015).
Abstract: Attributed graph clustering plays a vital role in uncovering hidden network structures, but it presents significant challenges. In recent years, various models have been proposed to identify meaningful clusters by integrating both structural and attribute-based information. However, these models often emphasize node proximities without adequately balancing the efficiency of clustering based on both structural and attribute data. Furthermore, they tend to neglect the critical fuzzy information inherent in attributed graph clusters. To address these issues, we introduce a new framework, Markov lumpability optimization, for efficient clustering of large-scale attributed graphs. Specifically, we define a lumped Markov chain on an attribute-augmented graph and introduce a new metric, Markov lumpability, to quantify the differences between the original and lumped Markov transition probability matrices. To minimize this measure, we propose a conjugate gradient projection-based approach that ensures the partitioning closely aligns with the intrinsic structure of fuzzy clusters through conditional optimization. Extensive experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed framework compared to existing clustering algorithms. This framework has many potential applications, including dynamic community analysis of social networks, user profiling in recommendation systems, functional module identification in biological molecular networks, and financial risk control, offering a new paradigm for mining complex patterns in high-dimensional attributed graph data.
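To make the idea of comparing an original and a lumped transition matrix concrete, the following is a minimal sketch under stated assumptions. It uses a hard partition, averages rows uniformly within each block, and measures the mismatch with a squared deviation. The function names (`block_mass`, `lumped_chain`, `lumpability_gap`) and the deviation itself are hypothetical stand-ins; the abstract does not give the paper's Markov lumpability metric, its fuzzy memberships, or its attribute augmentation, so none of those are reproduced here.

```python
def block_mass(P, blocks):
    """R[i][b] = probability of moving from state i into block b."""
    return [[sum(P[i][j] for j in block) for block in blocks]
            for i in range(len(P))]


def lumped_chain(P, blocks):
    """Lumped transition matrix: uniform average of block masses per block.

    Uniform averaging is an assumption made for this sketch.
    """
    R = block_mass(P, blocks)
    return [[sum(R[i][b] for i in block) / len(block)
             for b in range(len(blocks))]
            for block in blocks]


def lumpability_gap(P, blocks):
    """Squared deviation between each state's block masses and the lumped row.

    The gap is zero exactly when all states in a block send the same
    probability mass to every block (strong lumpability).
    """
    R = block_mass(P, blocks)
    Q = lumped_chain(P, blocks)
    gap = 0.0
    for a, block in enumerate(blocks):
        for i in block:
            for b in range(len(blocks)):
                gap += (R[i][b] - Q[a][b]) ** 2
    return gap


# A 3-state chain in which states 1 and 2 behave identically at block level.
P = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.3, 0.5],
]
good_partition = [[0], [1, 2]]   # lumpable: gap is (numerically) zero
bad_partition = [[0, 1], [2]]    # not lumpable: gap is positive
```

In this toy chain, grouping states 1 and 2 gives a gap of zero, while grouping states 0 and 1 does not, which is the kind of signal a partition search could minimize.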
Funding: Partially supported by NSFC (No. 12001296), Fundamental Research Funds for the Central Universities, Nankai University (No. 63201163). Shi is partially supported by NSFC (No. 11922112), Natural Science Foundation of Tianjin, Nankai University (No. 63206034).
Funding: Supported by the National Natural Science Foundation of China (11601367, 11771219 and 11771220), the National Ten Thousand Talents Program, the Tianjin Development Program for Innovation and Entrepreneurship, and the Tianjin "131" Talents Program.
Abstract: In this paper, we propose a new method, called the level-collapsing method, to construct branching Latin hypercube designs (BLHDs). The obtained design has a sliced structure in the third part, that is, the part for the shared factors, which is desirable for the qualitative branching factors. The construction method is easy to implement, and (near) orthogonality can be achieved in the obtained BLHDs. A simulation example is provided to illustrate the effectiveness of the new designs.
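As background for the terminology, here is a small sketch of an ordinary random Latin hypercube design and of what collapsing its levels does. It is not the paper's level-collapsing construction for BLHDs, which the abstract does not spell out. The function names `latin_hypercube` and `collapse_levels` are hypothetical, and the sketch assumes the group size divides the run size.

```python
import random


def latin_hypercube(n, k, seed=0):
    """Random n-run, k-factor Latin hypercube with levels 0..n-1.

    Each column is an independent random permutation of the n levels,
    so every level appears exactly once per factor.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(k):
        perm = list(range(n))
        rng.shuffle(perm)
        columns.append(perm)
    # Transpose so that each row is one run of the design.
    return [[columns[j][i] for j in range(k)] for i in range(n)]


def collapse_levels(design, group_size):
    """Map level x to x // group_size, merging consecutive levels.

    Assumes group_size divides the number of runs; each coarse level
    then appears exactly group_size times in every column.
    """
    return [[level // group_size for level in row] for row in design]


design = latin_hypercube(6, 3, seed=1)
coarse = collapse_levels(design, 2)  # 6 levels collapse to 3 balanced levels
```

The point of the sketch is only the balance property: collapsing a Latin hypercube column yields a coarser column in which every level is equally replicated.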
Funding: Supported by the National Natural Science Foundation of China (Nos. 22077068, 82103984) and the Fundamental Research Funds for the Central Universities.
Abstract: The development of novel adjuvants constitutes a new strategy for the research of tumor vaccines. Immunomodulatory molecular adjuvants are one class of novel adjuvants that can effectively stimulate pattern recognition receptors to activate the downstream pathways of immune cells. However, there are few studies on immunomodulatory molecular adjuvants associated with C-type lectin. It has been reported that GlcC14C18 is a Mincle ligand with a relatively simple structure and strong adjuvant activity in vivo. Herein, we coupled GlcC14C18 with MUC1 glycopeptide and evaluated its immune effect. In addition, we synthesized α-GlcC14C18-MUC1 and β-GlcC14C18-MUC1 based on the two configurations of GlcC14C18 and compared their immune effects. The results show that both configurations of the vaccine have a good immune effect, but to a certain extent the immune effect of β-GlcC14C18-MUC1 is better than that of α-GlcC14C18-MUC1.
Funding: Supported by the National Natural Science Foundation of China (No. 22077068), the National Key R&D Program of China (No. 2018YFA0507204), the NCC Fund (No. NCC2020FH12), the Natural Science Foundation of Tianjin (No. 19JCQNJC05300), and the Fundamental Research Funds for the Central Universities.
Abstract: We have developed a MUC1 antigen-based antitumor vaccine loaded on alum colloid encapsulated inside β-glucan particles (GP-Al). The constructed vaccine induced strong MUC1 antigen-specific IgG antibody titers and enhanced the cytotoxic effect of CD8+ T cells in killing tumor cells. These results indicate that GP-Al can serve as an efficient delivery system and adjuvant for the development of cancer vaccines, especially cancer vaccines based on small-molecule antigens.
Funding: Supported by the Key Laboratory for Environmental Factors Control of Agro-product Quality Safety, Ministry of Agriculture and Rural Affairs (No. 2018hjyzkfkt-002), the Qian Xuesen Laboratory of Space Technology, CAST (No. GZZKFJJ2020002), and the National Research Program for Key Issues in Air Pollution Control (No. DQGG-05-30).
Abstract: Atmospheric particulate matter pollution has attracted increasing attention globally. In recent years, the development of atmospheric particle collection techniques has placed new demands on real-time source apportionment techniques. In this paper, these demands are summarized as how to set up new constraints in apportionment and how to develop a non-linear regression model that handles complicated circumstances, such as the existence of secondary sources and similar sources. In this study, we first analyze the possible constraints in single-particle source apportionment, then propose a novel three-step self-feedback long short-term memory (SF-LSTM) network for approximating the source contribution. The proposed deep learning network comprises three modules: generation, scoring and refining, and regeneration. Through the scoring module, SF-LSTM incorporates four loss functions representing the four constraints to be followed in apportionment, while the regeneration module calculates the source contribution in a non-linear way. The results show that the model outperforms conventional regression methods in overall performance on the four evaluation indicators for the constraints (residual sum of squares, stability, sparsity, negativity). Additionally, in short time-resolution analysis, SF-LSTM provides better results under the stability constraint.
Funding: Supported by the National Natural Science Foundation of China (No. 22077068), the National Key R&D Program of China (No. 2018YFA0507204), the NCC Fund (No. NCC2020FH12), the Natural Science Foundation of Tianjin (No. 19JCQNJC05300), and the Fundamental Research Funds for the Central Universities.
Abstract: Vaccine adjuvants have been widely used to enhance the immunogenicity of antigens and elicit long-lasting immune responses. However, only a few vaccine adjuvants have been approved by the FDA for human use so far. Therefore, there is still an urgent need to develop novel adjuvants for potential applications in clinical trials. Herein, the non-nucleotide small-molecule STING agonist diABZI was employed to construct glycopeptide antigen-based vaccines for the first time. Immunological evaluation indicated that, in glycopeptide-based subunit vaccines, diABZI not only enhanced the production of antibodies and T cell immune responses but also inhibited tumor growth in tumor-bearing mice. These results indicate that diABZI has high potential as an adjuvant for the development of cancer vaccines.
Funding: Supported by the Natural Science Foundation of China (Grant No. 72571150).
Abstract: Detecting overlapping communities in attributed networks remains a significant challenge due to the complexity of jointly modeling topological structure and node attributes, the unknown number of communities, and the need to capture nodes with multiple memberships. To address these issues, we propose a novel framework named density peaks clustering with neutrosophic C-means. First, we construct a consensus embedding by aligning structure-based and attribute-based representations using spectral decomposition and canonical correlation analysis. Then, an improved density peaks algorithm automatically estimates the number of communities and selects initial cluster centers based on a newly designed cluster strength metric. Finally, a neutrosophic C-means algorithm refines the community assignments, modeling uncertainty and overlap explicitly. Experimental results on synthetic and real-world networks demonstrate that the proposed method achieves superior performance in terms of detection accuracy, stability, and its ability to identify overlapping structures.
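For orientation, the sketch below shows the classic density peaks idea for choosing cluster centers: each point gets a local density and a distance to the nearest denser point, and points where both are large are taken as centers. It is the baseline algorithm only. The improved variant and the cluster strength metric mentioned in the abstract are not described there, so the product score, the `cutoff` parameter, the fixed `n_centers`, and the function name `density_peaks` are all assumptions of this illustration.

```python
import math


def density_peaks(points, cutoff, n_centers):
    """Classic density peaks center selection on a list of coordinate tuples.

    rho[i]   = number of other points closer than `cutoff` to point i
    delta[i] = distance to the nearest point of strictly higher density
               (or the largest distance from i if no denser point exists)
    Centers are the n_centers points with the largest rho * delta.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    score = [rho[i] * delta[i] for i in range(n)]
    ranked = sorted(range(n), key=lambda i: score[i], reverse=True)
    return ranked[:n_centers], rho, delta


# Two well-separated groups; the middle point of each group is the densest.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.5, 0.5),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (10.5, 10.5),
]
centers, rho, delta = density_peaks(points, cutoff=0.9, n_centers=2)
```

In the method summarized above, the number of centers is estimated automatically; here it is passed in by hand to keep the sketch short.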
Funding: Supported by the Fundamental Research Funds for the Central Universities and the National Natural Science Foundation of China (Grant No. 12271272).
Abstract: With the advancement of modern scientific research, multimodal data is increasingly being collected from multiple sources or types. For outcomes derived from generalized linear models with high-dimensional and multimodal covariates, we develop two distinct factor-adjusted tests to assess the significance of high-dimensional modality data and specific low-dimensional linear combinations of predictors from one or more modalities, respectively. First, we propose a factor-adjusted decorrelated score test to evaluate the significance of a single modality. This approach simultaneously transforms a high-dimensional test into a fixed low-dimensional one while addressing the impact of high-dimensional nuisance parameters. Second, we construct a factor-adjusted Wald test based on partial penalized estimation to assess the significance of certain low-dimensional combinations of variables from one or more modalities. The limiting distributions of these two proposed tests are analyzed under both the null hypothesis and local alternatives to characterize the asymptotic type-I errors and powers. The finite-sample performance of our proposed tests is evaluated through simulations and further demonstrated with a breast cancer dataset.
Funding: Supported by the Simons Foundation (Grant No. 520542); supported by the National Natural Science Foundation of China (Grant Nos. 11801283 and 12171252).
Abstract: In this paper, we study the inverse local times at 0 of one-dimensional reflected diffusions on [0, ∞) and establish a comparison principle for these inverse local times. We also provide applications to Green function estimates for non-local operators.
Funding: Supported by the National Key R&D Program of China (Grant Nos. 2022YFA1003703, 2022YFA1003800) and the National Natural Science Foundation of China (Grant Nos. 11925106, 12231011, 11931001, 12226007, 12326325); supported by the National Natural Science Foundation of China (Grant No. 12301380); supported by the National Key R&D Program of China (Grant Nos. 2021YFA1000100, 2021YFA1000101, 2022YFA1003800) and the Natural Science Foundation of Shanghai (Grant No. 23ZR1419400).
Abstract: We consider the problem of multi-task regression with time-varying low-rank patterns, where the collected data may be contaminated by heavy-tailed distributions and/or outliers. Our approach is based on a piecewise robust multi-task learning formulation, in which a robust loss function (not necessarily convex, but with a bounded derivative) is used, and each piecewise low-rank pattern is induced by a nuclear norm regularization term. We propose using the composite gradient descent algorithm to obtain stationary points within a data segment and employing the dynamic programming algorithm to determine the optimal segmentation. The theoretical properties of the detected number and time points of pattern shifts are studied under mild conditions. Numerical results confirm the effectiveness of our method.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 12301364, 12322112, 12071038, 11925106, 12231011, 11931001, 12226007, 12326325, and 12131006, the National Key R&D Program of China under Grant Nos. 2022YFA1003703 and 2022YFA1003800, the Natural Science Foundation of Anhui Province under Grant No. 2308085QA09, the Fundamental Research Funds for the Central Universities under Grant No. 2243200006, the Scientific and Technological Innovation Project of China Academy of Chinese Medical Sciences under Grant No. CI2023C063YLL, and the University Grant Council of Hong Kong.
Abstract: Simultaneously finding active predictors and controlling the false discovery rate (FDR) for high-dimensional survival data is an important but challenging statistical problem. In this paper, the authors propose a novel variable selection procedure with error rate control for the high-dimensional Cox model. By adopting a data-splitting strategy, the authors construct a series of symmetric statistics and then utilize the symmetry property to derive a data-driven threshold to achieve error rate control. The authors establish finite-sample and asymptotic FDR control results under some mild conditions. Simulation results as well as a real data application show that the proposed approach successfully controls FDR and is often more powerful than the competing approaches.
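The role of symmetry in deriving a data-driven threshold can be sketched as follows. Suppose each predictor j has a statistic W[j] that is symmetric about zero when the predictor is inactive and tends to be large and positive when it is active. The count of statistics below -t then estimates the number of false selections above t. The rule used here, the smallest t with #{W <= -t} / max(#{W >= t}, 1) <= q, is a commonly used form and is an assumption of this sketch. The abstract does not give the paper's exact statistics or threshold, and the names `mirror_threshold` and `select` are hypothetical.

```python
def mirror_threshold(W, q):
    """Smallest t > 0 whose estimated false discovery proportion is <= q.

    Estimated FDP at t: #{j : W[j] <= -t} / max(#{j : W[j] >= t}, 1).
    Returns infinity when no candidate threshold meets the target level.
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        false_est = sum(1 for w in W if w <= -t)   # mirror-image count
        selected = sum(1 for w in W if w >= t)     # would-be selections
        if false_est / max(selected, 1) <= q:
            return t
    return float("inf")


def select(W, q):
    """Indices of predictors whose statistic reaches the threshold."""
    t = mirror_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]


# Four clearly positive statistics and three small ones of mixed sign.
W = [5.0, 4.0, 3.0, -0.5, 0.4, -0.3, 2.5]
chosen = select(W, q=0.1)
```

With these toy values the small statistics of either sign are screened out, and only the four large positive ones are selected.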
Funding: Supported by the Fundamental Research Funds for the Central Universities and the National Natural Science Foundation of China under Grant No. 12271272.
Abstract: In this paper, the authors propose a two-stage online debiased lasso estimation and statistical inference method for high-dimensional quantile regression (QR) models in the presence of streaming data. In the first stage, the authors modify the QR score function based on kernel smoothing and obtain the online lasso smoothed QR estimator through iterative algorithms. The estimation process involves only the current data batch and specific historical summary statistics, which suits the special structure of streaming data. In the second stage, an online debiasing procedure is carried out to eliminate biases caused by the lasso penalty as well as the accumulated approximation error, so that the asymptotic normality of the resulting estimator can be established. The authors conduct extensive numerical experiments to evaluate the performance of the proposed method. These experiments demonstrate the effectiveness of the proposed method and support the theoretical results. An application to the Beijing PM2.5 Dataset is also presented.
Funding: Supported by the National Key R&D Program of China (No. 2021YFA1000403), the National Natural Science Foundation of China (Nos. 11731013, 12101334 and U19B2040), the Natural Science Foundation of Tianjin (No. 21JCQNJC00030), and the Fundamental Research Funds for the Central Universities.
Abstract: Numerous intriguing optimization problems arise as a result of the advancement of machine learning. The stochastic first-order method is the predominant choice for those problems due to its high efficiency. However, the negative effects of noisy gradient estimates and high nonlinearity of the loss function result in a slow convergence rate. Second-order algorithms have advantages in dealing with highly nonlinear and ill-conditioned problems. This paper provides a review of recent developments in stochastic variants of quasi-Newton methods, which construct Hessian approximations using only gradient information. We concentrate on BFGS-based methods in stochastic settings and highlight the algorithmic improvements that enable the algorithm to work in various scenarios. Future research on stochastic quasi-Newton methods should focus on enhancing their applicability, lowering the computational and storage costs, and improving the convergence rate.
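To illustrate how a quasi-Newton method builds curvature information from gradients alone, the sketch below implements the standard BFGS update of the inverse Hessian approximation, H+ = (I - ρ s yᵀ) H (I - ρ y sᵀ) + ρ s sᵀ with ρ = 1/(yᵀs), where s is the step and y the change in gradient. Skipping the update when yᵀs is not sufficiently positive is one simple safeguard, included here as an assumption rather than as a technique taken from the review. The function name `bfgs_inverse_update` and the `eps` parameter are hypothetical.

```python
def bfgs_inverse_update(H, s, y, eps=1e-10):
    """One BFGS update of a symmetric inverse-Hessian approximation H.

    s = x_new - x_old, y = grad_new - grad_old. The update is skipped
    (H returned unchanged) when the curvature y^T s is not positive enough,
    which can happen when gradients are noisy.
    """
    n = len(s)
    ys = sum(y[i] * s[i] for i in range(n))
    if ys <= eps:
        return [row[:] for row in H]
    rho = 1.0 / ys
    Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    yHy = sum(y[i] * Hy[i] for i in range(n))
    # Expanded form of (I - rho s y^T) H (I - rho y s^T) + rho s s^T,
    # valid because H is symmetric.
    return [[H[i][j]
             - rho * (s[i] * Hy[j] + Hy[i] * s[j])
             + (rho * rho * yHy + rho) * s[i] * s[j]
             for j in range(n)]
            for i in range(n)]


H0 = [[1.0, 0.0], [0.0, 1.0]]   # start from the identity
s = [1.0, 0.5]                  # step taken
y = [2.0, 0.5]                  # observed change in gradient
H1 = bfgs_inverse_update(H0, s, y)
```

After the update, H1 satisfies the secant condition H1 y = s, which is how the approximation absorbs curvature along the most recent step using only gradient differences.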