Purpose: This paper examines African Journals Online (AJOL) as a bibliometric resource, providing a structured dataset of journal and publication metadata. In addition, it integrates AJOL data with OpenAlex to enhance metadata coverage and improve interoperability with other bibliometric sources. Design/methodology/approach: The journal list and publications indexed in AJOL were retrieved using web scraping techniques. This paper details the database construction process, highlighting its strengths and limitations, and presents a descriptive analysis of AJOL's indexed journals and publications. Findings: The publication analysis demonstrates steady growth in the number of publications over time but reveals significant disparities in their distribution across African countries. This paper demonstrates the feasibility of integrating the two sources using author country data from OpenAlex. The analysis of author contributions reveals that African journals serve as both regional and international venues, confirming their dual role in fostering regional and global research engagement. Research limitations: While AJOL contains relevant information for identifying and providing insights about African publications and journals, its metadata are limited, so the kinds of analysis that can be performed with the database presented here are also limited. The integration with OpenAlex aims to overcome some of these limitations. Finally, although some automatic citation procedures have been performed, the metadata have not been manually curated; therefore, any errors or inaccuracies present in AJOL may be reproduced in this database. Practical implications: The database introduced in this article contributes to the accessibility of African scholarly publications by providing structured, accessible metadata derived from AJOL. It facilitates bibliometric analyses that are more representative of African research activities. This contribution complements ongoing efforts to develop alternative data sources and infrastructure that better reflect the diversity of global knowledge production. Originality/value: This paper presents a novel database for bibliometric analysis and offers a detailed report of the retrieval and construction procedures. The inclusion of matched data with OpenAlex further enhances the database's utility. By showcasing AJOL's potential, this study contributes to the broader goal of fostering inclusivity and improving the representation of African research in global bibliometric analyses.
In this paper, we use the Riemann-Hilbert (RH) method to investigate the Cauchy problem of the reverse space-time nonlocal Hirota equation with step-like initial data: q(z,0) = o(1) as z → -∞ and q(z,0) = δ + o(1) as z → ∞, where δ is an arbitrary positive constant. We show that the solution of the Cauchy problem can be determined by the solution of the corresponding matrix RH problem established on the plane of the complex spectral parameter λ. As an example, we construct an exact solution of the reverse space-time nonlocal Hirota equation in a special case via this RH problem.
With the rapid growth of biomedical data, particularly multi-omics data including genomics, transcriptomics, proteomics, metabolomics, and epigenomics, medical research and clinical decision-making confront both new opportunities and obstacles. The huge and diversified nature of these datasets cannot always be managed using traditional data analysis methods. As a consequence, deep learning has emerged as a strong tool for analysing multi-omics data due to its ability to handle complex and non-linear relationships. This paper explores the fundamental concepts of deep learning and how they are used in multi-omics medical data mining. We demonstrate how autoencoders, variational autoencoders, multimodal models, attention mechanisms, transformers, and graph neural networks enable pattern analysis and recognition across omics data. Deep learning has been found to be effective in illness classification, biomarker identification, gene network learning, and therapeutic efficacy prediction. We also consider critical problems such as data quality, model explainability, reproducibility of findings, and computational power requirements. We then consider future directions, including combining omics with clinical and imaging data, explainable AI, federated learning, and real-time diagnostics. Overall, this study emphasises the need for cross-disciplinary collaboration to advance deep learning-based multi-omics research for precision medicine and for understanding complicated disorders.
High-throughput transcriptomics has evolved from bulk RNA-seq to single-cell and spatial profiling, yet its clinical translation still depends on effective integration across diverse omics and data modalities. Emerging foundation models and multimodal learning frameworks are enabling scalable and transferable representations of cellular states, while advances in interpretability and real-world data integration are bridging the gap between discovery and clinical application. This paper outlines a concise roadmap for AI-driven, transcriptome-centered multi-omics integration in precision medicine (Figure 1).
Gastrointestinal tumors require personalized treatment strategies due to their heterogeneity and complexity. Multimodal artificial intelligence (AI) addresses this challenge by integrating diverse data sources, including computed tomography (CT), magnetic resonance imaging (MRI), endoscopic imaging, and genomic profiles, to enable intelligent decision-making for individualized therapy. This approach leverages AI algorithms to fuse imaging, endoscopic, and omics data, facilitating comprehensive characterization of tumor biology, prediction of treatment response, and optimization of therapeutic strategies. By combining CT and MRI for structural assessment, endoscopic data for real-time visual inspection, and genomic information for molecular profiling, multimodal AI enhances the accuracy of patient stratification and treatment personalization. The clinical implementation of this technology demonstrates potential for improving patient outcomes, advancing precision oncology, and supporting individualized care in gastrointestinal cancers. Ultimately, multimodal AI serves as a transformative tool in oncology, bridging data integration with clinical application to effectively tailor therapies.
Many high-quality studies have emerged from public databases such as Surveillance, Epidemiology, and End Results (SEER); the National Health and Nutrition Examination Survey (NHANES); The Cancer Genome Atlas (TCGA); and the Medical Information Mart for Intensive Care (MIMIC). However, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, and irregularity, so their value has not been fully utilized. Data-mining technology has become a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making when building disease-prediction models. Therefore, data mining has unique advantages in clinical big-data research, especially in large-scale medical public databases. This article introduces the main medical public databases and describes the steps, tasks, and models of data mining in simple language, along with practical applications of data-mining methods. The goal of this work is to give clinical researchers a clear and intuitive understanding of how data-mining technology applies to clinical big data, in order to promote research results that benefit doctors and patients.
Vector quantization (VQ) is an important data compression method. The key step in VQ encoding is to find the closest vector among N codebook vectors for a given feature vector. Many classical linear search algorithms take O(N) steps of distance computation between vectors. This paper presents a quantum VQ iteration and a corresponding quantum VQ encoding algorithm that takes O(√N) steps. The unitary operation of distance computation can be performed on a number of vectors simultaneously because a quantum state exists in a superposition of states. The quantum VQ iteration comprises three oracles, whereas many quantum algorithms, such as Shor's factorization algorithm and Grover's algorithm, have only one oracle. An entangled state is generated and used, whereas the state in Grover's algorithm is not entangled. The quantum VQ iteration is a rotation over a subspace, whereas the Grover iteration is a rotation over the global space. The quantum VQ iteration thus extends the Grover iteration to more complex searches that require multiple oracles, and the method is universal.
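The O(√N) speed-up described above rests on Grover-style amplitude amplification. As a rough illustration (not the paper's three-oracle construction), the following classical simulation tracks the amplitude vector of a single-oracle Grover search for the index of the closest codeword; the oracle here is simply handed the answer, which a real quantum implementation would compute in superposition:

```python
import math
import random

def grover_min_index(distances):
    """Classical simulation of single-oracle Grover amplitude amplification
    locating the minimum-distance codeword. Illustrative only: the oracle
    is handed the answer, and the paper's scheme uses three oracles."""
    N = len(distances)
    target = min(range(N), key=distances.__getitem__)   # index the oracle marks
    amp = [1.0 / math.sqrt(N)] * N                      # uniform superposition
    iters = round(math.pi / 4 * math.sqrt(N))           # ~O(sqrt(N)) iterations
    for _ in range(iters):
        amp[target] = -amp[target]                      # oracle: phase flip
        mean = sum(amp) / N
        amp = [2 * mean - a for a in amp]               # inversion about the mean
    probs = [a * a for a in amp]
    return max(range(N), key=probs.__getitem__), iters  # likeliest measurement

random.seed(0)
d = [random.random() for _ in range(64)]                # distances to 64 codewords
found, steps = grover_min_index(d)
```

For N = 64 this uses only round(π√N/4) = 6 amplification rounds, versus the 64 distance evaluations a linear scan would need.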
Under Type-Ⅱ progressively hybrid censoring, this paper discusses statistical inference and optimal design for a step-stress partially accelerated life test for hybrid systems in the presence of masked data. It is assumed that the lifetimes of the components in hybrid systems follow independent and identical modified Weibull distributions. The maximum likelihood estimates (MLEs) of the unknown parameters, the acceleration factor, and reliability indexes are derived using the Newton-Raphson algorithm. The asymptotic variance-covariance matrix and approximate confidence intervals are obtained from the normal approximation to the asymptotic distribution of the MLEs of the model parameters. Moreover, two bootstrap confidence intervals are constructed using the parametric bootstrap method. The optimal time for changing stress levels is determined under the D-optimality and A-optimality criteria. Finally, a Monte Carlo simulation study is carried out to illustrate the proposed procedures.
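To illustrate the parametric-bootstrap step, here is a minimal sketch using an exponential lifetime model as a stand-in for the paper's modified Weibull distribution (the exponential MLE is closed-form, so the Newton-Raphson iteration is not needed in this toy version); the percentile method then yields the confidence interval:

```python
import random
import statistics

def exp_rate_mle(sample):
    # MLE of the exponential rate is 1 / sample mean (closed form)
    return 1.0 / statistics.fmean(sample)

def parametric_bootstrap_ci(sample, B=2000, alpha=0.05, seed=1):
    """Percentile parametric bootstrap: refit the MLE on B datasets simulated
    from the fitted model. Exponential lifetimes stand in for the paper's
    modified Weibull distribution."""
    rng = random.Random(seed)
    lam_hat = exp_rate_mle(sample)
    boots = sorted(
        exp_rate_mle([rng.expovariate(lam_hat) for _ in sample])
        for _ in range(B)
    )
    lo = boots[int(B * alpha / 2)]
    hi = boots[int(B * (1 - alpha / 2)) - 1]
    return lam_hat, lo, hi

rng = random.Random(0)
data = [rng.expovariate(2.0) for _ in range(200)]   # simulated lifetimes, true rate 2
lam, lo, hi = parametric_bootstrap_ci(data)
```

The same recipe carries over to any parametric lifetime model: replace the closed-form MLE with a Newton-Raphson solver and simulate from the fitted distribution.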
Single-step genomic best linear unbiased prediction (ssGBLUP) is now intensively investigated and widely used in livestock breeding due to its beneficial feature of combining information from both genotyped and ungenotyped individuals in a single model. With the increasing accessibility of whole-genome sequence (WGS) data at the population level, more attention is being paid to the use of WGS data in ssGBLUP. The predictive ability of ssGBLUP using WGS data might be improved by incorporating biological knowledge from public databases. Thus, we extended ssGBLUP by incorporating genomic annotation information into the model and evaluated the resulting models using a yellow-feathered chicken population as an example. The chicken population consisted of 1,338 birds with 23 traits, with imputed WGS data comprising 5,127,612 single nucleotide polymorphisms (SNPs) available for 895 birds. Considering different combinations of annotation information and models, we evaluated the original ssGBLUP, a haplotype-based ssGHBLUP, and four extended ssGBLUP models incorporating genomic annotation. Based on the genomic annotation (GRCg6a) of chickens, 3,155,524 and 94,837 SNPs were mapped to genic and exonic regions, respectively. Extended ssGBLUP using genic/exonic SNPs outperformed the other models in predictive ability for 15 of the 23 traits, with advantages ranging from 2.5% to 6.1% over the original ssGBLUP. In addition, to further enhance the performance of genomic prediction with imputed WGS data, we investigated genotyping strategies for the reference population in ssGBLUP. Comparing two strategies for selecting individuals for genotyping in the reference population, evenly selecting by family (SBF) performed slightly better than random selection in most situations. Overall, we extended genomic prediction models to comprehensively utilize WGS data and genomic annotation information within the ssGBLUP framework, and validated the idea that properly handling genomic annotation information and WGS data increases the predictive ability of ssGBLUP. Moreover, when using WGS data, a genotyping strategy that maximizes the expected genetic relationship between the reference and candidate populations can further improve the predictive ability of ssGBLUP. These results shed light on the comprehensive use of genomic annotation information in WGS-based single-step genomic prediction.
In this paper, in order to implement the sharing and exchange of ship product data, a new kind of global function model is established. By researching the development and trends in the application of ship STEP (standard for the exchange of product model data) standards, the AIM (application interpreted model) of AP216 is developed and improved as an example, targeting the characteristics and practical engineering of the ship industry in our country. Data exchange interfaces based on STEP are formed in ship CAD/CAM by all function modules and shared databases under the global function model. The sharing and exchange of all information and data are thus addressed in the design, manufacture, and whole life-cycle of ship products among different computer application systems. This research work lays a foundation for ship industry informatization.
Direct soil temperature (ST) measurement is time-consuming and costly; thus, the use of simple and cost-effective machine learning (ML) tools is helpful. In this study, ML approaches, including KStar, instance-based K-nearest learning (IBK), and locally weighted learning (LWL), coupled with the resampling algorithms bagging (BA) and dagging (DA) (BA-IBK, BA-KStar, BA-LWL, DA-IBK, DA-KStar, and DA-LWL), were developed and tested for multi-step-ahead (3, 6, and 9 d ahead) ST forecasting. In addition, a linear regression (LR) model was used as a benchmark to evaluate the results. A dataset was established with daily ST time series at 5 and 50 cm soil depths in farmland as the models' output and meteorological data as the models' input, including mean (T_mean), minimum (T_min), and maximum (T_max) air temperatures, evaporation (Eva), sunshine hours (SSH), and solar radiation (SR), collected at Isfahan Synoptic Station (Iran) for 13 years (1992-2005). Six different input combination scenarios were selected based on Pearson's correlation coefficients between inputs and outputs and fed into the models. We used 70% of the data to train the models, with the remaining 30% used for model evaluation via multiple visual and quantitative metrics. Our findings showed that T_mean was the most effective input variable for ST forecasting in most of the developed models, while in some cases combinations of variables, including T_mean and T_max, and T_mean, T_max, T_min, Eva, and SSH, proved to be the best inputs. Among the evaluated models, BA-KStar showed greater compatibility, while in most cases BA-IBK and BA-LWL provided more accurate results, depending on soil depth. For the 5 cm soil depth, BA-KStar had superior performance (Nash-Sutcliffe efficiency (NSE) = 0.90, 0.87, and 0.85 for 3, 6, and 9 d ahead forecasting, respectively); for the 50 cm soil depth, DA-KStar outperformed the other models (NSE = 0.88, 0.89, and 0.89 for 3, 6, and 9 d ahead forecasting, respectively). The results confirmed that all hybrid models had higher prediction capabilities than the LR model.
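A minimal sketch of the bagging idea behind models such as BA-IBK: bootstrap-resample the training set, fit a k-nearest-neighbour base learner (a simple stand-in for IBK/KStar) on each resample, and average the predictions. The data below are synthetic; the real models use the meteorological inputs described above:

```python
import random

def knn_predict(train_X, train_y, x, k=3):
    # plain k-nearest-neighbour regression: a simple stand-in for IBK/KStar
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return sum(train_y[i] for i in order[:k]) / k

def bagged_forecast(train_X, train_y, x, n_bags=15, seed=0):
    """Bagging (bootstrap aggregation, the BA- prefix above): fit the base
    learner on bootstrap resamples and average the predictions."""
    rng = random.Random(seed)
    n = len(train_X)
    preds = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap resample
        Xb = [train_X[i] for i in idx]
        yb = [train_y[i] for i in idx]
        preds.append(knn_predict(Xb, yb, x))
    return sum(preds) / n_bags

# synthetic link between mean air temperature and soil temperature some days ahead
train_X = [[float(t)] for t in range(30)]
train_y = [0.8 * t + 2.0 for t in range(30)]
pred = bagged_forecast(train_X, train_y, [15.0])
```

Dagging differs only in step one: the training set is split into disjoint folds instead of bootstrap resamples.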
We present a new least-mean-square algorithm of adaptive filtering to improve the signal-to-noise ratio of magnetocardiography data collected with high-temperature SQUID-based magnetometers. By frequently adjusting the adaptive parameter a to systematically optimal values in the course of the programmed procedure, convergence is accelerated to the highest speed while the minimum steady-state error is obtained simultaneously. This algorithm may also be applied to eliminate other non-stationary correlated noise.
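For reference, a textbook LMS adaptive noise canceller looks like the sketch below; the paper's contribution is re-tuning the adaptation parameter a on the fly, whereas here the step size mu is fixed. The signals are synthetic (a slow "cardiac" wave plus 50 Hz interference):

```python
import math

def lms_filter(desired, reference, mu=0.05, taps=4):
    """Textbook LMS adaptive noise canceller: the reference input models the
    interference; the error output e is the cleaned signal."""
    w = [0.0] * taps
    cleaned = []
    for n in range(len(desired)):
        x = [reference[n - i] if n - i >= 0 else 0.0 for i in range(taps)]
        y = sum(wi * xi for wi, xi in zip(w, x))            # noise estimate
        e = desired[n] - y                                  # signal estimate
        w = [wi + 2 * mu * e * xi for wi, xi in zip(w, x)]  # LMS weight update
        cleaned.append(e)
    return cleaned

fs = 500                                                    # samples per second
n_samp = 2000
sig = [math.sin(2 * math.pi * 1.2 * n / fs) for n in range(n_samp)]
noise = [0.5 * math.sin(2 * math.pi * 50 * n / fs) for n in range(n_samp)]
ref = [0.5 * math.sin(2 * math.pi * 50 * n / fs + 0.3) for n in range(n_samp)]
cleaned = lms_filter([s + v for s, v in zip(sig, noise)], ref)
# mean squared residual after the filter has converged
tail_mse = sum((cleaned[n] - sig[n]) ** 2 for n in range(1500, 2000)) / 500
```

After convergence, the residual error is far below the raw interference power (0.125 here); a larger mu converges faster but raises the steady-state error, which is exactly the trade-off the paper's adaptive tuning of a addresses.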
This study focuses on the resource block allocation issue in downlink transmission systems of Long Term Evolution (LTE). In existing LTE standards, all Allocation Units (AUs) allocated to a user must adopt the same Modulation and Coding Scheme (MCS), which is determined by the AU with the worst channel condition. Despite its simplicity, this strategy incurs significant performance degradation, since the achievable system throughput is limited by the AUs with the worst channel quality. To address this issue, a two-step resource block allocation algorithm is proposed in this paper. The algorithm first allocates AUs to each user according to the users' priorities and the number of AUs they require. Then, a re-allocation mechanism is introduced: for any given user, the AUs with the worst channel condition are removed, so the user may adopt a higher MCS level and the achievable data rate can be increased. Finally, all unallocated AUs are assigned among users without changing the chosen MCSs, further enhancing the total system throughput. Simulation results show that, thanks to the proposed algorithm, the system gains higher throughput without adding much complexity.
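The re-allocation step can be sketched with a toy model in which each AU is labelled by the highest MCS rate it supports and a user's rate is (number of AUs) × (worst AU's rate). This is an assumed simplification for illustration, not the paper's actual metric or numbers:

```python
def user_rate(aus):
    # all AUs of a user share one MCS, fixed by the worst AU's quality
    return len(aus) * min(aus) if aus else 0

def reallocate(user_aus):
    """Step 2 of the sketch: drop a user's worst AUs while that raises the
    user's rate; step 3: hand freed AUs to any user whose chosen MCS they
    still support (otherwise the AU stays unassigned)."""
    freed = []
    for u, aus in user_aus.items():
        aus = sorted(aus)
        while len(aus) > 1 and user_rate(aus[1:]) > user_rate(aus):
            freed.append(aus.pop(0))           # shed the worst AU
        user_aus[u] = aus
    for au in sorted(freed, reverse=True):
        for aus in user_aus.values():
            if au >= min(aus):                 # does not lower that user's MCS
                aus.append(au)
                break
    return user_aus

# hypothetical per-AU supported rates (bits/symbol)
alloc = {"u1": [6, 6, 1, 5], "u2": [4, 4, 4]}
before = sum(user_rate(a) for a in alloc.values())
alloc = reallocate(alloc)
after = sum(user_rate(a) for a in alloc.values())
```

Dropping u1's single bad AU lifts its rate from 4×1 to 3×5, raising the toy system throughput from 16 to 27.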
Iced transmission line galloping poses a significant threat to the safety and reliability of power systems, leading directly to line tripping, disconnections, and power outages. Existing early warning methods for iced transmission line galloping suffer from reliance on a single data source, neglect of irregular time series, and a lack of attention-based closed-loop feedback, resulting in high rates of missed and false alarms. To address these challenges, we propose an Internet of Things (IoT) empowered early warning method for transmission line galloping that integrates time series data from optical fiber sensing and weather forecasts. The method first applies a primary adaptive weighted fusion to the IoT-empowered optical fiber real-time sensing data and weather forecast data, followed by a secondary fusion based on a Back Propagation (BP) neural network, and uses the K-medoids algorithm to cluster the fused data. Furthermore, an adaptive irregular time series perception adjustment module is introduced into the traditional Gated Recurrent Unit (GRU) network, and closed-loop feedback based on an attention mechanism is employed to update network parameters through gradient feedback of the loss function, enabling closed-loop training and time series prediction by the GRU network model. Subsequently, considering the various types of prediction data and the duration of icing, an iced transmission line galloping risk coefficient is established, and warnings are categorized based on this coefficient. Finally, using an IoT-driven realistic dataset of iced transmission line galloping, the effectiveness of the proposed method is validated through multi-dimensional simulation scenarios.
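One common form of adaptive weighted fusion, shown here as an assumed simplification of the primary-fusion step (the paper does not spell out its weighting rule), weights each stream by the inverse of its recent variance so that the steadier source dominates:

```python
import random

def window_var(xs):
    m = sum(xs) / len(xs)
    return max(sum((x - m) ** 2 for x in xs) / len(xs), 1e-9)

def fuse(sensor, forecast, win=5):
    """Inverse-variance weighted fusion over a sliding window: the stream
    with the steadier recent history gets the larger weight."""
    fused = []
    for t in range(win, len(sensor)):
        ws = 1.0 / window_var(sensor[t - win:t])
        wf = 1.0 / window_var(forecast[t - win:t])
        fused.append((ws * sensor[t] + wf * forecast[t]) / (ws + wf))
    return fused

rng = random.Random(0)
truth = 10.0                                                  # toy ground truth
sensor = [truth + rng.gauss(0.0, 0.2) for _ in range(100)]    # precise fiber sensing
forecast = [truth + rng.gauss(0.0, 2.0) for _ in range(100)]  # coarse forecast
fused = fuse(sensor, forecast)
fused_err = sum(abs(f - truth) for f in fused) / len(fused)
forecast_err = sum(abs(f - truth) for f in forecast[5:]) / len(forecast[5:])
```

The fused series tracks the truth much more closely than the noisier stream alone, which is the point of combining the two sources before the BP-network stage.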
The performance degradation rates of a missile tank are generally time-varying functions that are difficult to evaluate with general classical methods. This paper develops a segmented nonlinear accelerated degradation model (SNADM) based on the equivalent method of accumulative damage theory, which tackles the problem that product life is difficult to determine when the degradation rate is a function of time. A segmented expression of the population accumulative degradation function is derived and, combined with a nonlinear function, an accelerated degradation function, i.e., the SNADM, is obtained. The parameters of the SNADM are identified by numerical iteration, and the statistical function of the degradation track is extrapolated. The reliability function is determined by the type of random process of the degradation distribution. An evaluation of product storage life is then undertaken by combining the statistical function of the degradation track, the reliability function, and the failure threshold. In an example, a missile tank undergoes a step-down stress accelerated degradation test (SDSADT), and the results from the SNADM and the classical method are evaluated and compared. The introduced technique is validated by the agreement between the evaluated and field storage lives.
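The idea of extrapolating a fitted degradation track to a failure threshold can be sketched with a simple power-law track y = a·t^b, an assumed stand-in for the SNADM, fitted log-linearly on noise-free synthetic data:

```python
import math

def fit_power_track(times, ys):
    """Log-linear least-squares fit of a degradation track y = a * t**b,
    an assumed simple stand-in for the paper's segmented nonlinear model."""
    lx = [math.log(t) for t in times]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = sum((x - mx) * (v - my) for x, v in zip(lx, ly)) / \
        sum((x - mx) ** 2 for x in lx)
    a = math.exp(my - b * mx)
    return a, b

def pseudo_life(a, b, threshold):
    # time at which the fitted track crosses the failure threshold
    return (threshold / a) ** (1.0 / b)

times = [10.0, 20.0, 40.0, 80.0, 160.0]
ys = [0.5 * t ** 0.6 for t in times]        # noise-free synthetic track
a, b = fit_power_track(times, ys)
life = pseudo_life(a, b, threshold=12.0)
```

The SNADM additionally segments the track over stress steps and fits by numerical iteration, but the final life estimate comes from the same threshold-crossing logic.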
The Intelligent Internet of Things (IIoT) involves real-world things that communicate or interact with each other through networking technologies, collecting data from these "things" and using intelligent approaches, such as Artificial Intelligence (AI) and machine learning, to make accurate decisions. Data science is the science of dealing with data and its relationships through intelligent approaches. Most state-of-the-art research focuses independently on either data science or the IIoT, rather than exploring their integration. To address this gap, this article provides a comprehensive survey of the advances in, and integration of, data science with the IIoT by classifying existing IoT-based data science techniques and summarizing their various characteristics. The paper analyzes security and privacy features of data science and big data, including network architecture, data protection, and continuous monitoring of data, which face challenges in various IoT-based systems. Extensive insights into IoT data security, privacy, and challenges are visualized in the context of data science for IoT. In addition, this study reveals current opportunities to enhance data science and IoT market development. The current gaps and challenges in integrating data science and IoT are comprehensively presented, followed by a future outlook and possible solutions.
Viral infectious diseases, characterized by their intricate nature and wide-ranging diversity, pose substantial challenges in the domain of data management. The vast volume of data generated by these diseases, spanning from molecular mechanisms within cells to large-scale epidemiological patterns, has surpassed the capabilities of traditional analytical methods. In the era of artificial intelligence (AI) and big data, there is an urgent need to optimize these analytical methods to handle and utilize the information more effectively. Despite the rapid accumulation of data associated with viral infections, the lack of a comprehensive framework for integrating, selecting, and analyzing these datasets has left many researchers uncertain about which data to select, how to access them, and how to utilize them most effectively in their research. This review endeavors to fill these gaps by exploring the multifaceted nature of viral infectious diseases and summarizing relevant data across multiple levels, from the molecular details of pathogens to broad epidemiological trends. The scope extends from the micro-scale to the macro-scale, encompassing pathogens, hosts, and vectors. In addition to data summarization, this review thoroughly investigates various dataset sources and traces the historical evolution of data collection in the field of viral infectious diseases, highlighting the progress achieved over time. Simultaneously, it evaluates the current limitations that impede data utilization. Furthermore, we propose strategies to surmount these challenges, focusing on the development and application of advanced computational techniques, AI-driven models, and enhanced data integration practices. By providing a comprehensive synthesis of existing knowledge, this review is designed to guide future research and contribute to more informed approaches to the surveillance, prevention, and control of viral infectious diseases, particularly within the context of the expanding big-data landscape.
Survival data with a multi-state structure are frequently observed in follow-up studies. An analytic approach based on a multi-state model (MSM) should be used in longitudinal health studies in which a patient experiences a sequence of clinical progression events. One main objective in the MSM framework is variable selection, which aims to identify the risk factors associated with the transition hazard rates or probabilities of disease progression. The usual variable selection methods, including stepwise and penalized methods, do not provide information about the importance of variables. In this context, we present a two-step algorithm to evaluate the importance of variables for multi-state data. Three different machine learning approaches (random forest, gradient boosting, and neural network), as the most widely used methods, are considered to estimate variable importance, in order to identify the factors affecting disease progression and rank these factors by importance. The performance of the proposed methods is validated by simulation and applied to a COVID-19 data set. The results revealed that the proposed two-step method has promising performance for estimating variable importance.
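The abstract does not specify how importance is computed; a common model-agnostic choice is permutation importance (the accuracy drop after shuffling one feature), sketched here with a nearest-centroid classifier standing in for the random forest/gradient boosting/neural network learners:

```python
import random

def centroid_fit(X, y):
    # nearest-centroid classifier: a tiny stand-in for RF/GBM/NN learners
    cents = {}
    for cls in set(y):
        rows = [x for x, c in zip(X, y) if c == cls]
        cents[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def centroid_predict(cents, x):
    return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])))

def accuracy(cents, X, y):
    return sum(centroid_predict(cents, x) == c for x, c in zip(X, y)) / len(y)

def permutation_importance(cents, X, y, seed=0):
    """Importance of feature j = accuracy drop after shuffling column j."""
    rng = random.Random(seed)
    base = accuracy(cents, X, y)
    imps = []
    for j in range(len(X[0])):
        col = [x[j] for x in X]
        rng.shuffle(col)
        Xp = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
        imps.append(base - accuracy(cents, Xp, y))
    return imps

rng = random.Random(1)
# feature 0 separates the two classes; feature 1 is pure noise
X = [[rng.gauss(c * 3.0, 1.0), rng.gauss(0.0, 1.0)] for c in (0, 1) for _ in range(100)]
y = [c for c in (0, 1) for _ in range(100)]
model = centroid_fit(X, y)
imp = permutation_importance(model, X, y)
```

Shuffling the informative feature costs the model substantial accuracy while shuffling the noise feature barely matters, which is how the resulting scores rank risk factors.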
Funding: supported by a PIPF contract of the Madrid Education, Science and Universities Office (grant number: PIPF-2022/PH-HUM-25403).
Funding: supported by the National Natural Science Foundation of China under Grant No. 12147115; the Discipline (Subject) Leader Cultivation Project of Universities in Anhui Province under Grant Nos. DTR2023052 and DTR2024046; the Natural Science Research Project of Universities in Anhui Province under Grant No. 2024AH040202; the Young Top-Notch Talents and Young Scholars of the High-End Talent Introduction and Cultivation Action Project in Anhui Province; and the Scientific Research Foundation Funded Project of Chuzhou University under Grant Nos. 2022qd022 and 2022qd038.
Abstract: With the rapid growth of biomedical data, particularly multi-omics data including genomics, transcriptomics, proteomics, metabolomics, and epigenomics, medical research and clinical decision-making confront both new opportunities and obstacles. The huge and diversified nature of these datasets cannot always be managed using traditional data analysis methods. As a consequence, deep learning has emerged as a strong tool for analysing multi-omics data due to its ability to handle complex and non-linear relationships. This paper explores the fundamental concepts of deep learning and how they are used in multi-omics medical data mining. We demonstrate how autoencoders, variational autoencoders, multimodal models, attention mechanisms, transformers, and graph neural networks enable pattern analysis and recognition across all omics data. Deep learning has been found to be effective in illness classification, biomarker identification, gene network learning, and therapeutic efficacy prediction. We also consider critical problems such as data quality, model explainability, reproducibility of findings, and computational power requirements. We then consider future directions, including combining omics with clinical and imaging data, explainable AI, federated learning, and real-time diagnostics. Overall, this study emphasises the need for collaboration across disciplines to advance deep learning-based multi-omics research for precision medicine and for understanding complicated disorders.
Abstract: High-throughput transcriptomics has evolved from bulk RNA-seq to single-cell and spatial profiling, yet its clinical translation still depends on effective integration across diverse omics and data modalities. Emerging foundation models and multimodal learning frameworks are enabling scalable and transferable representations of cellular states, while advances in interpretability and real-world data integration are bridging the gap between discovery and clinical application. This paper outlines a concise roadmap for AI-driven, transcriptome-centered multi-omics integration in precision medicine (Figure 1).
Funding: Supported by the Xuhui District Health Commission, No. SHXH202214.
Abstract: Gastrointestinal tumors require personalized treatment strategies due to their heterogeneity and complexity. Multimodal artificial intelligence (AI) addresses this challenge by integrating diverse data sources, including computed tomography (CT), magnetic resonance imaging (MRI), endoscopic imaging, and genomic profiles, to enable intelligent decision-making for individualized therapy. This approach leverages AI algorithms to fuse imaging, endoscopic, and omics data, facilitating comprehensive characterization of tumor biology, prediction of treatment response, and optimization of therapeutic strategies. By combining CT and MRI for structural assessment, endoscopic data for real-time visual inspection, and genomic information for molecular profiling, multimodal AI enhances the accuracy of patient stratification and treatment personalization. The clinical implementation of this technology demonstrates potential for improving patient outcomes, advancing precision oncology, and supporting individualized care in gastrointestinal cancers. Ultimately, multimodal AI serves as a transformative tool in oncology, bridging data integration with clinical application to tailor therapies effectively.
Funding: Supported by the National Social Science Foundation of China (No. 16BGL183).
Abstract: Many high-quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), the National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and the Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, and irregularity, among other characteristics, so their value is not fully utilized. Data-mining technology has been a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making in building disease-prediction models. Therefore, data mining has unique advantages in clinical big-data research, especially in large-scale medical public databases. This article introduces the main medical public databases and describes the steps, tasks, and models of data mining in simple language. Additionally, we describe data-mining methods along with their practical applications. The goal of this work is to help clinical researchers gain a clear and intuitive understanding of the application of data-mining technology to clinical big data, in order to promote the production of research results that benefit doctors and patients.
Abstract: Vector quantization (VQ) is an important data compression method. The key step in VQ encoding is finding, for a given feature vector, the closest vector among N codebook vectors. Classical linear search algorithms take O(N) distance computations between two vectors. This paper presents a quantum VQ iteration and a corresponding quantum VQ encoding algorithm that takes O(√N) steps. The unitary operation of distance computing can be performed on a number of vectors simultaneously because a quantum state exists in a superposition of states. The quantum VQ iteration comprises three oracles; by contrast, many quantum algorithms, such as Shor's factorization algorithm and Grover's algorithm, have only one oracle. An entangled state is generated and used, whereas the state in Grover's algorithm is not entangled. The quantum VQ iteration is a rotation over a subspace, whereas the Grover iteration is a rotation over the global space. The quantum VQ iteration thus extends the Grover iteration to more complex searches that require more oracles, and the method is universal.
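For readers unfamiliar with the rotation picture the abstract generalizes, the standard Grover geometry (well-established background, not taken from this paper) explains the O(√N) step count:

```latex
% With one marked item among N, the uniform start state decomposes as
\[
  |\psi\rangle = \sin\theta\,|\mathrm{good}\rangle + \cos\theta\,|\mathrm{bad}\rangle,
  \qquad \sin\theta = \frac{1}{\sqrt{N}} .
\]
% Each Grover iteration G rotates the state by 2\theta toward |good>:
\[
  G^{k}|\psi\rangle = \sin\bigl((2k+1)\theta\bigr)\,|\mathrm{good}\rangle
                    + \cos\bigl((2k+1)\theta\bigr)\,|\mathrm{bad}\rangle ,
\]
% so choosing (2k+1)\theta \approx \pi/2 gives k = O(\sqrt{N}) iterations.
```

The quantum VQ iteration replaces this two-dimensional rotation with a rotation over a subspace and uses three oracles instead of one, but the square-root query count has the same geometric origin.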
Funding: Supported by the National Natural Science Foundation of China (71401134, 71571144, 71171164) and the Program of International Cooperation and Exchanges in Science and Technology Funded by Shaanxi Province (2016KW-033).
Abstract: Under Type-II progressively hybrid censoring, this paper discusses statistical inference and optimal design of a step-stress partially accelerated life test for hybrid systems in the presence of masked data. It is assumed that the lifetimes of the components in hybrid systems follow independent and identical modified Weibull distributions. The maximum likelihood estimations (MLEs) of the unknown parameters, acceleration factor, and reliability indexes are derived using the Newton-Raphson algorithm. The asymptotic variance-covariance matrix and the approximate confidence intervals are obtained based on the normal approximation to the asymptotic distribution of the MLEs of the model parameters. Moreover, two bootstrap confidence intervals are constructed using the parametric bootstrap method. The optimal time for changing stress levels is determined under the D-optimality and A-optimality criteria. Finally, a Monte Carlo simulation study is carried out to illustrate the proposed procedures.
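The parametric bootstrap step mentioned above follows a generic recipe: fit the model, resample from the fitted model, re-estimate, and read the interval off the empirical quantiles. The sketch below illustrates the percentile variant with a plain exponential lifetime model as a stand-in; the paper's modified Weibull model, censoring scheme, and masking are not reproduced here.

```python
import random
import statistics

def exp_mle(sample):
    """MLE of the rate of an exponential lifetime model: lambda_hat = 1/mean."""
    return 1.0 / statistics.mean(sample)

def bootstrap_ci(sample, level=0.95, reps=2000, seed=0):
    """Percentile parametric bootstrap CI for the exponential rate.

    Resamples are drawn from the *fitted* model (parametric bootstrap),
    and the CI is read off the empirical quantiles of the re-estimates.
    """
    rng = random.Random(seed)
    rate_hat = exp_mle(sample)
    n = len(sample)
    estimates = sorted(
        exp_mle([rng.expovariate(rate_hat) for _ in range(n)])
        for _ in range(reps)
    )
    lo = estimates[int(reps * (1 - level) / 2)]
    hi = estimates[int(reps * (1 + level) / 2) - 1]
    return lo, hi
```

Swapping in a different lifetime distribution only changes `exp_mle` and the resampling draw; the quantile logic is unchanged, which is why the method generalizes to the paper's setting.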
Funding: Supported by the National Natural Science Foundation of China (32022078), the Local Innovative and Research Teams Project of Guangdong Province, China (2019BT02N630), and the National Supercomputer Center in Guangzhou, China.
Abstract: Single-step genomic best linear unbiased prediction (ssGBLUP) is now intensively investigated and widely used in livestock breeding due to its beneficial feature of combining information from both genotyped and ungenotyped individuals in a single model. With the increasing accessibility of whole-genome sequence (WGS) data at the population level, more attention is being paid to the use of WGS data in ssGBLUP. The predictive ability of ssGBLUP using WGS data might be improved by incorporating biological knowledge from public databases. Thus, we extended ssGBLUP by incorporating genomic annotation information into the model and evaluated the extensions using a yellow-feathered chicken population as an example. The chicken population consisted of 1,338 birds with 23 traits, where imputed WGS data including 5,127,612 single nucleotide polymorphisms (SNPs) are available for 895 birds. Considering different combinations of annotation information and models, the original ssGBLUP, haplotype-based ssGHBLUP, and four extended ssGBLUP models incorporating genomic annotation were evaluated. Based on the genomic annotation (GRCg6a) of chickens, 3,155,524 and 94,837 SNPs were mapped to genic and exonic regions, respectively. Extended ssGBLUP using genic/exonic SNPs outperformed the other models with respect to predictive ability in 15 out of 23 traits, with advantages ranging from 2.5 to 6.1% compared with the original ssGBLUP. In addition, to further enhance the performance of genomic prediction with imputed WGS data, we investigated genotyping strategies for the reference population in ssGBLUP in the chicken population. Comparing two strategies of selecting individuals for genotyping in the reference population, even selection by family (SBF) performed slightly better than random selection in most situations. Overall, we extended genomic prediction models that can comprehensively utilize WGS data and genomic annotation information in the framework of ssGBLUP, and validated the idea that properly handling genomic annotation information and WGS data increases the predictive ability of ssGBLUP. Moreover, when using WGS data, a genotyping strategy that maximizes the expected genetic relationship between the reference and candidate populations can further improve the predictive ability of ssGBLUP. The results from this study shed light on the comprehensive use of genomic annotation information in WGS-based single-step genomic prediction.
Funding: Supported by the Commission of Basic Research Science and Technology for National Defence (No. B192001C001).
Abstract: In this paper, in order to implement the sharing and exchange of ship product data, a new kind of global function model is established. Based on research into the development and trends of the application of ship STEP (standard for the exchange of product model data) standards, the AIM (application interpreted model) of AP216 is developed and improved as an example, aimed at the characteristics and practical engineering of the ship industry in our country. Data exchange interfaces based on STEP are formed in ship CAD/CAM by all function modules and shared databases under the global function model. The sharing and exchange of all information and data are achieved across the design, manufacture, and whole life-cycle of ship products among different computer application systems. This research work lays a foundation for ship industry informatization.
Abstract: Direct soil temperature (ST) measurement is time-consuming and costly; thus, the use of simple and cost-effective machine learning (ML) tools is helpful. In this study, ML approaches, including KStar, instance-based K-nearest learning (IBK), and locally weighted learning (LWL), coupled with the resampling algorithms bagging (BA) and dagging (DA) (BA-IBK, BA-KStar, BA-LWL, DA-IBK, DA-KStar, and DA-LWL), were developed and tested for multi-step-ahead (3, 6, and 9 d ahead) ST forecasting. In addition, a linear regression (LR) model was used as a benchmark to evaluate the results. A dataset was established, with daily ST time-series at 5 and 50 cm soil depths in a farmland as the models' output and meteorological data as the models' input, including mean (T_mean), minimum (T_min), and maximum (T_max) air temperatures, evaporation (Eva), sunshine hours (SSH), and solar radiation (SR), collected at Isfahan Synoptic Station (Iran) for 13 years (1992-2005). Six different input combination scenarios were selected based on Pearson's correlation coefficients between inputs and outputs and fed into the models. We used 70% of the data to train the models, with the remaining 30% used for model evaluation via multiple visual and quantitative metrics. Our findings showed that T_mean was the most effective input variable for ST forecasting in most of the developed models, while in some cases the combinations of T_mean and T_max, and of T_mean, T_max, T_min, Eva, and SSH, proved to be the best input combinations. Among the evaluated models, BA-KStar showed greater compatibility, while in most cases BA-IBK and BA-LWL provided more accurate results, depending on soil depth. For the 5 cm soil depth, BA-KStar had superior performance (i.e., Nash-Sutcliffe efficiency (NSE) = 0.90, 0.87, and 0.85 for 3, 6, and 9 d ahead forecasting, respectively); for the 50 cm soil depth, DA-KStar outperformed the other models (i.e., NSE = 0.88, 0.89, and 0.89 for 3, 6, and 9 d ahead forecasting, respectively). The results confirmed that all hybrid models had higher prediction capabilities than the LR model.
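The Nash-Sutcliffe efficiency used above to compare models has a simple standard definition, sketched here for reference (this is the textbook formula, not code from the study):

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - SSE/SST; 1.0 is a perfect fit, 0.0 matches the mean-only model."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst
```

Values below zero indicate a forecast worse than simply predicting the observed mean, which is why NSE is a convenient benchmark-relative metric for the hybrid models versus LR.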
Abstract: We present a new least-mean-square (LMS) algorithm for adaptive filtering to improve the signal-to-noise ratio of magnetocardiography data collected with high-temperature SQUID-based magnetometers. By frequently adjusting the adaptive parameter to its systematic optimum value in the course of the programmed procedure, convergence is accelerated to the highest speed and the minimum steady-state error is obtained simultaneously. This algorithm may be applied to eliminate other relevant non-stationary noise as well.
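The core LMS update the abstract builds on is standard and can be sketched as follows. Note the assumptions: a fixed step size `mu` is used here, whereas the paper's contribution is precisely to adjust that parameter on the fly; the system-identification setup below is an illustration, not the magnetocardiography pipeline.

```python
def lms_identify(x, d, n_taps=2, mu=0.05, passes=200):
    """Identify an unknown FIR filter with the standard LMS update.

    x: input samples; d: desired (reference) samples. The weight vector w
    converges toward the unknown filter's taps when mu is small enough.
    """
    w = [0.0] * n_taps
    for _ in range(passes):
        for n in range(n_taps - 1, len(x)):
            window = x[n - n_taps + 1:n + 1][::-1]    # x[n], x[n-1], ...
            y = sum(wi * xi for wi, xi in zip(w, window))
            e = d[n] - y                              # instantaneous error
            w = [wi + 2 * mu * e * xi for wi, xi in zip(w, window)]
    return w
```

The trade-off the paper targets is visible in `mu`: a larger step converges faster but raises the steady-state error, which is why scheduling the parameter toward its optimum gives both fast convergence and a small residual.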
Abstract: This study focuses on the resource block allocation issue in the downlink transmission systems of Long Term Evolution (LTE). In existing LTE standards, all Allocation Units (AUs) allocated to any user must adopt the same Modulation and Coding Scheme (MCS), which is determined by the AU with the worst channel condition. Despite its simplicity, this strategy incurs significant performance degradation, since the achievable system throughput is limited by the AUs having the worst channel quality. To address this issue, a two-step resource block allocation algorithm is proposed in this paper. The algorithm first allocates AUs to each user according to the users' priorities and the number of their required AUs. Then, a re-allocation mechanism is introduced: for any given user, the AUs with the worst channel condition are removed, so the user may adopt a higher MCS level and the achievable data rate can be increased. Finally, all the unallocated AUs are assigned among users without changing the chosen MCSs, and the total throughput of the system is further enhanced. Simulation results show that, thanks to the proposed algorithm, the system gains higher throughput without adding too much complexity.
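The re-allocation idea can be sketched with a deliberately simplified model (an assumption for illustration, not the LTE rate tables): if one user's throughput is the number of kept AUs times the rate of its worst AU, dropping worst-channel AUs pays off whenever the MCS gain outweighs the lost AUs.

```python
def reallocate(au_rates):
    """Drop worst-channel AUs while doing so raises a user's throughput.

    Simplified model: with a single MCS per user, throughput is
    len(kept) * min(kept). Returns (kept_aus, dropped_aus).
    """
    kept = sorted(au_rates)                 # ascending: worst AU first
    dropped = []
    while len(kept) > 1:
        candidate = kept[1:]                # tentatively remove worst AU
        if len(candidate) * min(candidate) > len(kept) * min(kept):
            dropped.append(kept[0])
            kept = candidate
        else:
            break
    return kept, dropped
```

The dropped AUs then return to the unallocated pool for the final assignment step, which is how the algorithm recovers throughput without changing any user's chosen MCS.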
Funding: This research was funded by the Science and Technology Project of State Grid Corporation of China under grant number 5200-202319382A-2-3-XG.
Abstract: Iced transmission line galloping poses a significant threat to the safety and reliability of power systems, leading directly to line tripping, disconnections, and power outages. Existing early-warning methods for iced transmission line galloping suffer from issues such as reliance on a single data source, neglect of irregular time series, and lack of attention-based closed-loop feedback, resulting in high rates of missed and false alarms. To address these challenges, we propose an Internet of Things (IoT)-empowered early-warning method for transmission line galloping that integrates time-series data from optical fiber sensing and weather forecasts. Initially, the method applies a primary adaptive weighted fusion to the IoT-empowered optical fiber real-time sensing data and weather forecast data, followed by a secondary fusion based on a Back Propagation (BP) neural network, and uses the K-medoids algorithm to cluster the fused data. Furthermore, an adaptive irregular time-series perception adjustment module is introduced into the traditional Gated Recurrent Unit (GRU) network, and closed-loop feedback based on an attention mechanism is employed to update network parameters through gradient feedback of the loss function, enabling closed-loop training and time-series prediction with the GRU network model. Subsequently, considering the various types of prediction data and the duration of icing, an iced transmission line galloping risk coefficient is established, and warnings are categorized based on this coefficient. Finally, using an IoT-driven realistic dataset of iced transmission line galloping, the effectiveness of the proposed method is validated through multi-dimensional simulation scenarios.
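A primary "adaptive weighted fusion" of a sensing stream and a forecast stream can, for example, weight each source inversely to its recent error variance. The paper's exact weighting rule is not reproduced here; the inverse-variance rule below is a common stand-in and should be read as an assumption.

```python
def adaptive_weighted_fusion(sensor, forecast, sensor_err, forecast_err):
    """Fuse two readings with weights inversely proportional to error variance.

    sensor_err / forecast_err are recent error-variance estimates for each
    source; the more reliable source dominates the fused value.
    """
    ws = 1.0 / sensor_err
    wf = 1.0 / forecast_err
    return (ws * sensor + wf * forecast) / (ws + wf)
```

Making the weights track the error estimates over time is what makes the fusion "adaptive": when the forecast degrades, its weight shrinks automatically before the secondary BP-network fusion stage.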
Abstract: The performance degradation rates of missile tanks are generally time-varying functions that are not easily evaluated by general classical evaluation methods. This paper develops a segmented nonlinear accelerated degradation model (SNADM) based on the equivalent method of accumulative damage theory, which tackles the problem that product life is difficult to determine when the degradation rate is a function of time. A segmented expression of the population accumulative degradation function is derived and, combined with a nonlinear function, an accelerated degradation function, i.e., the SNADM, is obtained. The parameters of the SNADM are identified by numerical iteration, and the statistical function of the degradation track is extrapolated. The reliability function is determined according to the type of random process of the degradation distribution. An evaluation of product storage life is then undertaken by combining the statistical function of the degradation track, the reliability function, and the threshold. In an example, a missile tank undergoes a step-down stress accelerated degradation test (SDSADT), in which the results of the SNADM and the classical method are evaluated and compared. The technology introduced is validated by the coincidence of the evaluated and field storage lives.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62371181; in part by the Changzhou Science and Technology International Cooperation Program under Grant CZ20230029; by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (2021R1A2B5B02087169); and under the framework of the international cooperation program managed by the National Research Foundation of Korea (2022K2A9A1A01098051).
Abstract: The Intelligent Internet of Things (IIoT) involves real-world things that communicate or interact with each other through networking technologies, collecting data from these "things" and using intelligent approaches, such as Artificial Intelligence (AI) and machine learning, to make accurate decisions. Data science is the science of dealing with data and its relationships through intelligent approaches. Most state-of-the-art research focuses independently on either data science or the IIoT, rather than exploring their integration. Therefore, to address this gap, this article provides a comprehensive survey of the advances in, and integration of, data science with the Intelligent IoT (IIoT) by classifying existing IoT-based data science techniques and presenting a summary of their various characteristics. The paper analyzes data science and big-data security and privacy features, including network architecture, data protection, and continuous monitoring of data, which face challenges in various IoT-based systems. Extensive insights into IoT data security, privacy, and challenges are visualized in the context of data science for the IoT. In addition, this study reveals current opportunities to enhance data science and IoT market development. The current gaps and challenges faced in the integration of data science and the IoT are comprehensively presented, followed by the future outlook and possible solutions.
Funding: Supported by the National Natural Science Foundation of China (32370703), the CAMS Innovation Fund for Medical Sciences (CIFMS) (2022-I2M-1-021, 2021-I2M-1-061), and the Major Project of Guangzhou National Laboratory (GZNL2024A01015).
Abstract: Viral infectious diseases, characterized by their intricate nature and wide-ranging diversity, pose substantial challenges in the domain of data management. The vast volume of data generated by these diseases, spanning from the molecular mechanisms within cells to large-scale epidemiological patterns, has surpassed the capabilities of traditional analytical methods. In the era of artificial intelligence (AI) and big data, there is an urgent need to optimize these analytical methods to handle and utilize the information more effectively. Despite the rapid accumulation of data associated with viral infections, the lack of a comprehensive framework for integrating, selecting, and analyzing these datasets has left numerous researchers uncertain about which data to select, how to access them, and how to utilize them most effectively in their research. This review endeavors to fill these gaps by exploring the multifaceted nature of viral infectious diseases and summarizing relevant data across multiple levels, from the molecular details of pathogens to broad epidemiological trends. The scope extends from the micro-scale to the macro-scale, encompassing pathogens, hosts, and vectors. In addition to data summarization, this review thoroughly investigates various dataset sources. It also traces the historical evolution of data collection in the field of viral infectious diseases, highlighting the progress achieved over time. Simultaneously, it evaluates the current limitations that impede data utilization. Furthermore, we propose strategies to surmount these challenges, focusing on the development and application of advanced computational techniques, AI-driven models, and enhanced data integration practices. By providing a comprehensive synthesis of existing knowledge, this review is designed to guide future research and contribute to more informed approaches to the surveillance, prevention, and control of viral infectious diseases, particularly within the context of the expanding big-data landscape.
Abstract: Survival data with a multi-state structure are frequently observed in follow-up studies. An analytic approach based on a multi-state model (MSM) should be used in longitudinal health studies in which a patient experiences a sequence of clinical progression events. One main objective in the MSM framework is variable selection, where attempts are made to identify the risk factors associated with the transition hazard rates or probabilities of disease progression. The usual variable selection methods, including stepwise and penalized methods, do not provide information about the importance of variables. In this context, we present a two-step algorithm to evaluate the importance of variables for multi-state data. Three different machine learning approaches (random forest, gradient boosting, and neural network), as the most widely used methods, are considered to estimate variable importance in order to identify the factors affecting disease progression and rank these factors according to their importance. The performance of the proposed methods is validated by simulation and applied to a COVID-19 data set. The results revealed that the proposed two-step method has promising performance for estimating variable importance.
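One widely used, model-agnostic way to score variable importance, sketched below for illustration, is permutation importance: shuffle one feature column and measure how much the model's error grows. The paper estimates importance through its three fitted learners; the generic recipe here, with a toy callable standing in for a fitted model, is an assumption-level illustration of the same idea.

```python
import random

def permutation_importance(model, X, y, feature_idx, n_repeats=30, seed=0):
    """Mean increase in squared error after shuffling one feature column.

    'model' is any callable mapping a feature row (list) to a prediction;
    larger return values mean the model relies more on that feature.
    """
    rng = random.Random(seed)

    def mse(rows):
        return sum((model(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    base = mse(X)
    increases = []
    for _ in range(n_repeats):
        column = [row[feature_idx] for row in X]
        rng.shuffle(column)
        shuffled = [
            row[:feature_idx] + [v] + row[feature_idx + 1:]
            for row, v in zip(X, column)
        ]
        increases.append(mse(shuffled) - base)
    return sum(increases) / n_repeats
```

Because the score depends only on predictions, the same routine ranks features for a random forest, a gradient-boosting model, or a neural network alike, which is exactly the ranking use case the abstract describes.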