Clustering is used to gain an intuition of the structures in the data. Most of the current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure on the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce a meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
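The abstract above does not give the formula for the paper's density-based measure. A related and widely used clusterability check, the Hopkins statistic, can serve as a minimal sketch of the same idea (values near 1 suggest real clustering structure, values near 0.5 suggest uniform, unclusterable data); the function and data here are illustrative, not the paper's method:

```python
import numpy as np

def hopkins_statistic(X, n_samples=50, rng=None):
    """Clusterability check: compare nearest-neighbor distances of
    uniform random points vs. real data points. Near 1 = clustered,
    near 0.5 = uniform (no clustering structure)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    m = min(n_samples, n - 1)
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))          # uniform points in bounding box
    S = X[rng.choice(n, size=m, replace=False)]   # random sample of real points

    def nn_dist(points, exclude_self):
        out = []
        for p in points:
            dist = np.linalg.norm(X - p, axis=1)
            if exclude_self:
                dist = dist[dist > 0]
            out.append(dist.min())
        return np.array(out)

    u = nn_dist(U, exclude_self=False)   # uniform point -> nearest data point
    w = nn_dist(S, exclude_self=True)    # data point -> nearest other data point
    return u.sum() / (u.sum() + w.sum())

# Two tight, well-separated blobs should score clearly above 0.5
rng = np.random.default_rng(0)
blob1 = rng.normal(0, 0.1, size=(100, 2))
blob2 = rng.normal(5, 0.1, size=(100, 2))
X = np.vstack([blob1, blob2])
h = hopkins_statistic(X, rng=1)
print(round(h, 3))
```

On genuinely uniform data the same statistic would hover around 0.5, which is the sense in which such a measure flags "forced" versus discovered structure.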
In this study, we developed software for vehicle big data analysis to analyze the time-series data of connected vehicles. We designed two software modules: the first to derive Pearson correlation coefficients to analyze the collected data, and the second to conduct exploratory data analysis of the collected vehicle data. In particular, we analyzed the dangerous driving patterns of motorists based on the safety standards of the Korea Transportation Safety Authority. We also analyzed seasonal fuel efficiency (four seasons) and mileage of vehicles, and identified rapid acceleration, rapid deceleration, sudden stopping (harsh braking), quick starting, sudden left turn, sudden right turn, and sudden U-turn driving patterns. We implemented the density-based spatial clustering of applications with noise (DBSCAN) algorithm for trajectory analysis based on GPS (Global Positioning System) data, and designed a long short-term memory algorithm and an auto-regressive integrated moving average model for time-series data analysis. In this paper, we mainly describe the development environment of the analysis software, the structure and data flow of the overall analysis platform, the configuration of the collected vehicle data, and the various algorithms used in the analysis. Finally, we present illustrative results of our analysis, such as the dangerous driving patterns that were detected.
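The first module above derives Pearson correlation coefficients between vehicle signals. A minimal sketch of that computation (the column names and numbers are illustrative, not the study's actual schema or data):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

# Illustrative per-trip vehicle signals (hypothetical, not the study's data):
speed_std    = [2.1, 5.4, 3.3, 8.0, 6.7, 1.2]  # per-trip speed variability
harsh_brakes = [0,   3,   1,   6,   4,   0]    # harsh-braking events per trip
r = pearson_r(speed_std, harsh_brakes)
print(round(r, 3))
```

A coefficient near 1 here would indicate that trips with more erratic speed also tend to contain more harsh-braking events, which is the kind of relationship such a module screens for.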
The detection and characterization of non-metallic inclusions are essential for clean steel production. Recently, imaging analysis combined with high-dimensional data processing of metallic materials using artificial intelligence (AI)-based machine learning (ML) has developed rapidly. This technique has achieved impressive results in the field of inclusion classification in process metallurgy. The present study surveys ML modeling of inclusion prediction in advanced steels, including the detection, classification, and feature prediction of inclusions in different steel grades. Studies on clean steel with different features based on data and image analysis via ML are summarized. Regarding data analysis, the ML-based inclusion prediction methodology establishes a connection between the experimental parameters and inclusion characteristics and analyzes the importance of the experimental parameters. Regarding image analysis, the focus is placed on the classification of different types of inclusions via deep learning, in comparison with data analysis. Finally, further development of inclusion analyses using ML-based methods is recommended. This work paves the way for the application of AI-based methodologies to ultra-clean-steel studies from a sustainable metallurgy perspective.
Rowlands et al. [1] present an analysis of accelerometer data from the UK Biobank cohort, examining variations in the duration, intensity, and accumulation of moderate-intensity physical activity (MPA) and vigorous-intensity physical activity (VPA) sufficient to reduce the risk of all-cause mortality. In this study, the authors questioned whether shorter durations (i.e., 1, 2, 3, 4, 5, 10, 15, and 20 min/day) of MPA and VPA, performed continuously or accumulated throughout the day, would reduce the risk of all-cause mortality as much as the longer-duration MPA and VPA recommended in the physical activity (PA) guidelines.
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes from the white mouse gene expression dataset under two experimental conditions were chosen, setting up two conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
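The abstract above mentions permutation tests on expression data from infected versus uninfected conditions. A minimal sketch of a two-sided permutation test on the difference of group means, using synthetic expression values (not the study's data or its exact test statistic):

```python
import numpy as np

def permutation_test(group_a, group_b, n_perm=5000, rng=0):
    """Two-sided permutation test on the difference of group means.
    Returns the fraction of label shuffles whose mean difference is
    at least as extreme as the observed one (the p-value estimate)."""
    rng = np.random.default_rng(rng)
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                      # shuffle group labels
        diff = abs(pooled[:len(a)].mean() - pooled[len(a):].mean())
        if diff >= observed:
            count += 1
    return count / n_perm

# Synthetic expression values: the infected group is shifted upward
infected   = [5.1, 5.8, 6.0, 5.5, 6.2, 5.9]
uninfected = [4.0, 4.3, 3.9, 4.1, 4.5, 4.2]
p = permutation_test(infected, uninfected)
print(p)
```

Because the two synthetic groups barely overlap, almost no label shuffle reproduces the observed mean difference, so the estimated p-value is small.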
With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study aims to explore development strategies for real-time data analysis and decision-support systems, and to analyze their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it clarifies that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, it is necessary to strengthen data collection management, enhance data processing capabilities, and innovate big data application models, thereby providing references for bank big data practices and promoting the transformation and upgrading of the banking industry in the context of legal digital currencies.
With the accelerating intelligent transformation of energy systems, the monitoring of equipment operation status and the optimization of production processes in thermal power plants face the challenge of multi-source heterogeneous data integration. In view of the heterogeneous characteristics of physical sensor data (including temperature, vibration, and pressure signals generated by boilers, steam turbines, and other key equipment) and the real-time working-condition data of the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize the collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines Kalman filtering, the wavelet transform, and Bayesian estimation to solve the problems of time-series alignment and dimension differences in the data. Simulation results show that the data fusion accuracy can be improved to more than 98%, and the computation delay can be controlled within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant diffusion model, and supports cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 s, and the data-consistency verification accuracy reaches 99.5%.
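Of the three fusion techniques named above, the Kalman filter is the simplest to illustrate. A minimal scalar Kalman filter smoothing a noisy sensor reading (synthetic data with hypothetical units, not plant data; the platform's actual filter would be multivariate):

```python
import numpy as np

def kalman_1d(measurements, q=1e-4, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a constant-state model: process
    noise variance q, measurement noise variance r. Returns the
    filtered state estimate at each step."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: state unchanged, uncertainty grows
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # update toward the measurement z
        p = (1 - k) * p
        estimates.append(x)
    return np.array(estimates)

# Noisy readings around a true value of 540 (illustrative sensor units)
rng = np.random.default_rng(0)
true_temp = 540.0
readings = true_temp + rng.normal(0, 0.5, size=200)
filtered = kalman_1d(readings, x0=readings[0])

# The filtered estimate should track the true value more closely
err_raw  = abs(readings[-50:] - true_temp).mean()
err_filt = abs(filtered[-50:] - true_temp).mean()
print(err_filt < err_raw)
```

The small process-noise setting makes the filter average heavily over past samples, which is appropriate for a slowly varying quantity like a boiler temperature.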
The fracture volume gradually changes with the depletion of fracture pressure during the production process. However, few flowback models are available so far that can estimate the fracture-volume loss using pressure-transient and rate-transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before fracturing of the child wells. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material-balance time (MBT) diagnostic plots using both the initial and second flowback data (FB1 and FB2, respectively) of six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FB2 period is higher than that during the FB1 period, indicating a potential loss of fracture volume from the FB1 to the FB2 period. We estimate the changes in effective fracture volume (Vef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. Vef during FB2 is in general 3%-45% lower than that during FB1. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is compaction drive, but its contribution is reduced by 16% in the FB2 period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to the pressure depletion in fractures, which occurs during the production period and can extend over 800 days.
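The diagnostic above plots rate-normalized pressure, RNP = (p_i − p_wf)/q, against material-balance time, MBT = cumulative production / current rate, and reads fracture volume off the slope. A hedged numerical sketch of that slope extraction on synthetic flowback data (the decline and depletion curves are invented for illustration):

```python
import numpy as np

def rnp_mbt_slope(p_i, p_wf, q, dt):
    """Slope of rate-normalized pressure vs. material-balance time.
    p_wf, q: flowing pressure and rate per timestep; dt: step length.
    A steeper slope corresponds to a smaller effective fracture volume."""
    q = np.asarray(q, float)
    p_wf = np.asarray(p_wf, float)
    rnp = (p_i - p_wf) / q              # rate-normalized pressure
    mbt = np.cumsum(q * dt) / q         # cumulative production / rate
    slope, _ = np.polyfit(mbt, rnp, 1)  # linear fit to the diagnostic plot
    return slope

# Synthetic flowback: declining rate, depleting flowing pressure
t = np.arange(1, 31, dtype=float)
q = 100.0 * np.exp(-0.05 * t)           # declining flowback rate
p_wf = 5000.0 - 40.0 * t                # depleting flowing pressure
slope = rnp_mbt_slope(p_i=5000.0, p_wf=p_wf, q=q, dt=1.0)
print(slope > 0)
```

In the paper's comparison, computing this slope separately for the FB1 and FB2 periods and finding a steeper FB2 slope is what signals the loss of effective fracture volume.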
With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. The emergence of big data technology provides a powerful means for in-depth analysis of these data and insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology in the analysis of user behavior on e-commerce platforms, including the technical methods of data collection, storage, processing, and analysis, as well as specific applications in the construction of user profiles, precision marketing, personalized recommendation, and user retention and churn analysis, and discusses the challenges faced in application and the corresponding countermeasures. Through the study of actual cases, it demonstrates the remarkable effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and the user experience.
Objective To identify core acupoint patterns and elucidate the molecular mechanisms of acupuncture for primary depressive disorder (PDD) through data mining and network analysis. Methods A comprehensive literature search was conducted across PubMed, Embase, Ovid Technologies (OVID), Web of Science, the Cochrane Library, the China National Knowledge Infrastructure (CNKI), the VIP Database, Wanfang Data, and the SinoMed Database, from database inception to January 31, 2025, for clinical studies on acupuncture treatment of PDD. Descriptive statistics, high-frequency acupoint analysis, degree and betweenness centrality evaluation, and core acupoint prescription mining identified the predominant therapeutic combinations for PDD. Network acupuncture was used to predict therapeutic targets for the core acupoint prescription. Subsequent protein-protein interaction (PPI) network and molecular complex detection (MCODE) analyses were conducted to identify the key targets and functional modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses explored the underlying biological mechanisms of the core acupoint prescription in treating PDD. Results A total of 57 acupoint prescriptions underwent systematic analysis. The core therapeutic combinations comprised Baihui (GV20), Yintang (GV29), Neiguan (PC6), Hegu (LI4), and Shenmen (HT7). Network acupuncture analysis identified 88 potential therapeutic targets (79 overlapping with PDD), while PPI network analysis revealed central regulatory nodes, including interleukin (IL)-6, IL-1β, tumor necrosis factor (TNF)-α, toll-like receptor 4 (TLR4), IL-10, brain-derived neurotrophic factor (BDNF), transforming growth factor (TGF)-β1, C-X-C motif chemokine ligand 10 (CXCL10), mitogen-activated protein kinase 3 (MAPK3), and nitric oxide synthase 1 (NOS1). MCODE-based modular analysis further elucidated three functionally coherent clusters: inflammation-homeostasis (score = 6.571), plasticity-neurotransmission (score = 3.143), and oxidative stress (score = 3.000). GO and KEGG analyses demonstrated significant enrichment of the MAPK, phosphoinositide 3-kinase/protein kinase B (PI3K/Akt), and hypoxia-inducible factor (HIF)-1 signaling pathways. These mechanistic insights suggest that the antidepressant effects are mediated through neuroinflammatory regulation, neuroplasticity restoration, and immune-oxidative-stress homeostasis. Conclusion This study reveals that acupuncture alleviates depression through a multi-level mechanism, primarily involving neuroinflammation suppression, neuroplasticity enhancement, and oxidative-stress regulation. These findings systematically clarify the underlying mechanisms of acupuncture's antidepressant effects and identify novel therapeutic targets for further mechanistic research.
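Degree centrality, one of the metrics used above to rank acupoints, is straightforward to compute on a co-occurrence network. A minimal pure-Python sketch on a toy edge list (the edges below are invented for illustration, not the study's actual co-occurrence data):

```python
from collections import defaultdict

def degree_centrality(edges):
    """Normalized degree centrality on an undirected graph:
    degree / (n - 1) for each node, where n is the node count."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

# Toy acupoint co-occurrence edges (hypothetical, not from the paper)
edges = [("GV20", "GV29"), ("GV20", "PC6"), ("GV20", "HT7"),
         ("GV20", "LI4"), ("PC6", "HT7")]
dc = degree_centrality(edges)
top = max(dc, key=dc.get)
print(top, round(dc[top], 2))
```

In a real prescription-mining workflow, the edges would come from acupoints co-occurring in the same prescription, and high-centrality nodes like GV20 would be the candidates for the core combination.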
Semantic communication (SemCom) aims to achieve high-fidelity information delivery with low communication consumption by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can sufficiently utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. Therefore, there is a strong incentive to revolutionize the CRC mechanism and thus more effectively reap the benefits of both SemCom and HARQ. In this paper, built on top of Swin-transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, different from the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which is capable of digging out the inner topological and geometric information of images, to capture semantic information and determine the necessity of re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited-bandwidth conditions, and manifest the superiority of the TDA-based error detection method in image transmission.
In the section ‘Track decoding’ of this article, one paragraph was inadvertently omitted after the text ‘…shows the flow diagram of the Tr2-1121 track mode.’ The missing paragraph is provided below.
Dissolved oxygen (DO) is an important indicator in aquaculture, and its accurate forecasting can effectively improve the quality of aquatic products. In this paper, a new DO hybrid forecasting model is proposed that includes three stages: multi-factor analysis, adaptive decomposition, and an optimization-based ensemble. First, considering the complex factors affecting DO, the grey relational (GR) degree method is used to screen out the environmental factors most closely related to DO. The consideration of multiple factors makes model fusion more effective. Second, the series of DO, water temperature, salinity, and oxygen saturation are decomposed adaptively into sub-series by means of the empirical wavelet transform (EWT) method. Then, five benchmark models are utilized to forecast the sub-series of the EWT decomposition. The ensemble weights of these five sub-forecasting models are calculated by the hybrid particle swarm optimization and gravitational search algorithm (PSOGSA). Finally, a multi-factor ensemble model for DO is obtained by weighted allocation. The performance of the proposed model is verified with time-series data collected by the Pacific Islands Ocean Observing System (PacIOOS) from the WQB04 station at Hilo. The evaluation indicators involved in the experiment include the Nash–Sutcliffe efficiency (NSE), Kling–Gupta efficiency (KGE), mean absolute percent error (MAPE), standard deviation of error (SDE), and coefficient of determination (R²). Example analysis demonstrates that: ① the proposed model can obtain excellent DO forecasting results; ② the proposed model is superior to the comparison models; and ③ the forecasting model can be used to analyze the trend of DO and enable managers to make better management decisions.
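The grey relational degree used in the first stage above has a standard textbook form: normalize the series, take pointwise absolute differences from the reference, and average the relational coefficients. A minimal sketch with a common variant of that formula (the series below are illustrative, not PacIOOS data, and the paper's exact normalization may differ):

```python
import numpy as np

def grey_relational_degree(reference, factor, rho=0.5):
    """Grey relational degree of a factor series with respect to a
    reference series; rho is the distinguishing coefficient
    (commonly 0.5). Both series are min-max normalized first."""
    def norm(s):
        s = np.asarray(s, float)
        return (s - s.min()) / (s.max() - s.min())
    r, f = norm(reference), norm(factor)
    delta = np.abs(r - f)
    dmin, dmax = delta.min(), delta.max()
    xi = (dmin + rho * dmax) / (delta + rho * dmax)  # relational coefficients
    return float(xi.mean())

# Illustrative series: oxygen saturation tracks DO closely,
# so its grey relational degree with DO should be high
do  = [8.1, 7.9, 7.5, 7.2, 7.8, 8.0]
sat = [95.0, 93.0, 90.0, 87.0, 92.0, 94.0]
g_sat = grey_relational_degree(do, sat)
print(round(g_sat, 3))
```

Screening then amounts to computing this degree for each candidate environmental factor against DO and keeping the factors with the highest values.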
Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
The analysis of ancient genomics provides opportunities to explore human population history across both temporal and geographic dimensions (Haak et al., 2015; Wang et al., 2021, 2024). To enhance the accessibility and utility of these ancient genomic datasets, a range of databases and advanced statistical models have been developed, including the Allen Ancient DNA Resource (AADR) (Mallick et al., 2024) and AdmixTools (Patterson et al., 2012). While upstream processes such as sequencing and raw-data processing have been streamlined by resources like the AADR, the downstream analysis of these datasets, encompassing population-genetics inference and spatiotemporal interpretation, remains a significant challenge. The AADR provides a unified collection of published ancient DNA (aDNA) data, yet its file-based format and reliance on command-line tools, such as those in AdmixTools (Patterson et al., 2012), require advanced computational expertise for effective exploration and analysis. These requirements can present significant challenges for researchers lacking advanced computational expertise, limiting the accessibility and broader application of these valuable genomic resources.
The application of time-series modeling and forecasting methods to the spectral analysis of lubricating oil from mechanical equipment is discussed. An AR model is used to perform time-series modeling and forecasting analysis on the spectral analysis data collected from aero-engines. In the oil condition monitoring field of mechanical equipment, the use of time-series analysis has rarely been reported. As indicated by a satisfactory example, a practical method for condition monitoring and fault forecasting of mechanical equipment has been achieved.
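An AR(p) model of the kind described above can be fitted by ordinary least squares on lagged values and then iterated to forecast. A minimal sketch on a synthetic AR(1) series (the abstract does not give the model order or data, so everything below is illustrative):

```python
import numpy as np

def fit_ar(series, p):
    """Fit AR(p) coefficients by ordinary least squares:
    x[t] ~ a1*x[t-1] + ... + ap*x[t-p]."""
    x = np.asarray(series, float)
    X = np.array([x[i:i + p][::-1] for i in range(len(x) - p)])  # lagged regressors
    y = x[p:]                                                    # one-step targets
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def forecast(history, coeffs, steps):
    """Iterative multi-step AR forecast from the end of `history`."""
    h = list(history)
    preds = []
    for _ in range(steps):
        nxt = sum(c * h[-1 - i] for i, c in enumerate(coeffs))
        h.append(nxt)
        preds.append(nxt)
    return preds

# Synthetic AR(1) series with true coefficient 0.8; the fit should recover ~0.8
rng = np.random.default_rng(0)
x = [1.0]
for _ in range(500):
    x.append(0.8 * x[-1] + rng.normal(0, 0.1))
a = fit_ar(x, p=1)
preds = forecast(x, a, steps=3)
print(round(float(a[0]), 2))
```

In an oil-monitoring setting, the series would be a wear-metal concentration from successive spectral analyses, and the forecast would flag when it is trending toward an alarm threshold.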
The negative-pressure conical fluidized bed is widely used in the pharmaceutical industry. In this study, experiments based on the negative-pressure conical fluidized bed are carried out by changing the material mass and particle size. The pressure fluctuation signals are analyzed by time-domain and frequency-domain methods. A method for absolutely characterizing the degree of energy concentration at the main frequency is proposed, in which the original power spectrum is divided by the average signal power. A phenomenon in which the gas-velocity curve temporarily stops growing is observed when the material mass is light and the particle size is small. The standard deviation and kurtosis both change rapidly at the minimum fluidization velocity and thus can be used to determine the flow regime, and the variation rule of the kurtosis is independent of both the material mass and the particle size. In the initial fluidization stage, the dominant pressure signal comes from the material movement; with the increase in gas velocity, the power of a 2.5 Hz signal continues to increase. A method of dividing the main frequency by the average cycle frequency can conveniently determine the fluidized state, and from it a novel concept proposed in this paper, the stable fluidized zone, can be obtained. Controlling the gas velocity within the stable fluidized zone ensures that the fluidized bed consistently remains in a stable fluidized state.
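The "energy concentration at the main frequency" idea above, i.e., the ratio of power-spectrum peak to average signal power, can be sketched with an FFT on a synthetic pressure signal containing a 2.5 Hz component (the sampling rate, amplitudes, and noise level are assumptions, not the experiment's values):

```python
import numpy as np

def energy_concentration(signal, fs):
    """Return (main frequency, ratio of the power-spectrum peak to the
    mean spectral power). High ratios mean the signal's energy is
    concentrated at one dominant frequency."""
    x = np.asarray(signal, float) - np.mean(signal)  # remove DC offset
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)       # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    main = int(np.argmax(psd))
    return freqs[main], psd[main] / psd.mean()

# Synthetic pressure fluctuation: a 2.5 Hz tone buried in noise
fs = 100.0                                  # assumed sampling rate, Hz
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * 2.5 * t) + 0.2 * rng.normal(size=t.size)
f_main, ratio = energy_concentration(sig, fs)
print(f_main, ratio > 10)
```

As the gas velocity rises and the 2.5 Hz component strengthens, this ratio grows, which is what makes it usable as an absolute indicator of the fluidized state.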
There are some limitations when we apply conventional methods to analyze the massive amounts of seismic data acquired with high-density spatial sampling, since processors usually obtain the properties of raw data from common shot gathers or other datasets located at certain points or along lines. We propose a novel method in this paper to observe seismic data on time slices from spatial subsets. The composition of a spatial subset and the unique character of orthogonal or oblique subsets are described, and pre-stack subsets are shown by 3D visualization. In seismic data processing, spatial subsets can be used for the following: (1) to check trace-distribution uniformity and regularity; (2) to observe the main features of ground roll and linear noise; (3) to find abnormal traces from slices of datasets; and (4) to QC the results of pre-stack noise attenuation. The field-data application shows that seismic data analysis in spatial subsets is an effective method that may lead to better discrimination among various wavefields and help us obtain more information.
Objective To evaluate the environmental and technical efficiencies of China's industrial sectors and provide appropriate advice for policy makers in the context of rapid economic growth and the concurrent serious environmental damage caused by industrial pollutants. Methods A data envelopment analysis (DEA) framework crediting both the reduction of pollution outputs and the expansion of good outputs was designed as a model to compute the environmental efficiency of China's regional industrial systems. Results As shown by the geometric mean of environmental efficiency, if other inputs were held constant and good outputs were not improved, the air-pollution outputs could potentially be decreased by about 60% across China. Conclusion Both environmental and technical efficiencies have the potential to be greatly improved in China, which may provide some advice for policy makers.
文摘The detection and characterization of non-metallic inclusions are essential for clean steel production.Recently,imaging analysis combined with high-dimensional data processing of metallic materials using artificial intelligence(AI)-based machine learning(ML)has developed rapidly.This technique has achieved impressive results in the field of inclusion classification in process metallurgy.The present study surveys the ML modeling of inclusion prediction in advanced steels,including the detection,classification,and feature prediction of inclusions in different steel grades.Studies on clean steel with different features based on data and image analysis via ML are summarized.Regarding the data analysis,the inclusion prediction methodology based on ML establishes a connection between the experimental parameters and inclusion characteristics and analyzes the importance of the experimental parameters.Regarding the image analysis,the focus is placed on the classification of different types of inclusions via deep learning,in comparison with data analysis.Finally,further development of inclusion analyses using ML-based methods is recommended.This work paves the way for the application of AIbased methodologies for ultraclean-steel studies from a sustainable metallurgy perspective.
文摘Rowlands et al.1present an analysis of accelerometer data from the UK Biobank cohort,examining variations in the duration,intensity,and accumulation of moderate-intensity physical activity(MPA)and vigorous-intensity physical activity(VPA)sufficient to reduce the risk of all-cause mortality.In this study,the authors questioned if shorter durations(i.e.,1,2,3,4,5,10,15,and 20 min/day)of MPA and VPA performed continuously or accumulated throughout the day would equally reduce the risks of all-cause mortality as longer duration MPA and VPA recommended in the physical activity(PA)guidelines.
文摘DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. In terms of data selection, 1176 genes from the white mouse gene expression dataset under two experimental conditions were chosen, setting up two conditions: pneumococcal infection and no infection, and constructing a mixed-effects model. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
Abstract: With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study explores development strategies for real-time data analysis and decision-support systems and analyzes their application status and future development trends across industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
Abstract: This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it clarifies that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, it is necessary to strengthen data collection management, enhance data processing capabilities, and innovate big data application models, providing references for bank big data practices and promoting the transformation and upgrading of the banking industry in the context of legal digital currencies.
Abstract: With the accelerating intelligent transformation of energy systems, the monitoring of equipment operation status and the optimization of the production process in thermal power plants face the challenge of multi-source heterogeneous data integration. In view of the heterogeneous characteristics of physical sensor data, including the temperature, vibration, and pressure data generated by boilers, steam turbines, and other key equipment, together with the real-time operating-condition data of the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines Kalman filtering, the wavelet transform, and Bayesian estimation to solve the problems of time-series alignment and dimensional differences in the data. Simulation results show that data fusion accuracy can be improved to more than 98%, and computation latency can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant dispersion model, and supports cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; system response time is less than 2 s, and data consistency verification accuracy reaches 99.5%.
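As a minimal illustration of the Kalman-filter ingredient of the fusion module, here is a one-dimensional filter with a random-walk process model. All parameter values and names are assumptions for the sketch, not the plant's actual configuration, which fuses far richer multi-sensor streams.

```python
def kalman_1d(measurements, process_var=1e-4, meas_var=0.25, x0=0.0, p0=1.0):
    """One-dimensional Kalman filter: fuse a noisy sensor stream into
    a smoothed state estimate, assuming a random-walk process model."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: state unchanged, uncertainty grows by the process noise.
        p = p + process_var
        # Update: blend prediction and measurement by the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates
```

Feeding in noisy temperature readings around a true value yields estimates that settle close to that value, which is the behavior the fusion module relies on.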
Abstract: The fracture volume gradually changes with the depletion of fracture pressure during production. However, few flowback models are available so far that can estimate fracture volume loss using pressure-transient and rate-transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before fracturing of the child wells. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material balance time (MBT) diagnostic plots using both initial and second flowback data (FB1 and FB2, respectively) of six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FB2 period is higher than that during the FB1 period, indicating a potential loss of fracture volume from the FB1 to the FB2 period. We estimate the changes in effective fracture volume (V_ef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. V_ef during FB2 is in general 3%-45% lower than that during FB1. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is compaction drive, but its contribution is reduced by 16% in the FB2 period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to pressure depletion in the fractures, which occurs during the production period and can extend to 800 days.
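The RNP-versus-MBT diagnostic can be sketched as follows: RNP(t) = (p_i − p_wf(t)) / q(t) and MBT(t) = Q(t) / q(t), with Q the cumulative produced volume. The synthetic inputs and helper names are hypothetical, and the slope here is only a simple least-squares slope of the kind used to compare the two flowback periods.

```python
def rnp_mbt(p_init, pwf, rate, dt=1.0):
    """Rate-normalized pressure and material-balance time from
    flowback pressure (pwf) and rate series sampled every dt.
    A steeper RNP-vs-MBT slope suggests smaller effective fracture
    storage (total compressibility times fracture volume)."""
    cum = 0.0
    rnp, mbt = [], []
    for p, q in zip(pwf, rate):
        cum += q * dt                # cumulative produced volume Q(t)
        rnp.append((p_init - p) / q) # RNP(t)
        mbt.append(cum / q)          # MBT(t)
    return rnp, mbt

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

Comparing `slope(mbt, rnp)` between the FB1 and FB2 datasets is the essence of the diagnostic comparison described above.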
Abstract: With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. The emergence of big data technology provides a powerful means for in-depth analysis of these data and insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology in the analysis of user behavior on e-commerce platforms, including the technical methods of data collection, storage, processing, and analysis, as well as specific applications in the construction of user profiles, precision marketing, personalized recommendation, and user retention and churn analysis, and discusses the challenges faced in application and the corresponding countermeasures. Through the study of actual cases, it demonstrates the remarkable effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and the user experience.
Abstract: Objective To identify core acupoint patterns and elucidate the molecular mechanisms of acupuncture for primary depressive disorder (PDD) through data mining and network analysis. Methods A comprehensive literature search was conducted across PubMed, Embase, Ovid Technologies (OVID), Web of Science, Cochrane Library, China National Knowledge Infrastructure (CNKI), the VIP Database (VIP), Wanfang Data, and SinoMed, from database inception to January 31, 2025, for clinical studies on acupuncture treatment of PDD. Descriptive statistics, high-frequency acupoint analysis, degree and betweenness centrality evaluation, and core acupoint prescription mining identified the predominant therapeutic combinations for PDD. Network acupuncture was used to predict therapeutic targets for the core acupoint prescription. Subsequent protein-protein interaction (PPI) network and molecular complex detection (MCODE) analyses were conducted to identify key targets and functional modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses explored the underlying biological mechanisms of the core acupoint prescription in treating PDD. Results A total of 57 acupoint prescriptions underwent systematic analysis. The core therapeutic combination comprised Baihui (GV20), Yintang (GV29), Neiguan (PC6), Hegu (LI4), and Shenmen (HT7). Network acupuncture analysis identified 88 potential therapeutic targets (79 overlapping with PDD), while PPI network analysis revealed central regulatory nodes, including interleukin (IL)-6, IL-1β, tumor necrosis factor (TNF)-α, toll-like receptor 4 (TLR4), IL-10, brain-derived neurotrophic factor (BDNF), transforming growth factor (TGF)-β1, C-X-C motif chemokine ligand 10 (CXCL10), mitogen-activated protein kinase 3 (MAPK3), and nitric oxide synthase 1 (NOS1). MCODE-based modular analysis further elucidated three functionally coherent clusters: inflammation-homeostasis (score = 6.571), plasticity-neurotransmission (score = 3.143), and oxidative stress (score = 3.000). GO and KEGG analyses demonstrated significant enrichment of the MAPK, phosphoinositide 3-kinase/protein kinase B (PI3K/Akt), and hypoxia-inducible factor (HIF)-1 signaling pathways. These mechanistic insights suggest that the antidepressant effects are mediated through neuroinflammatory regulation, neuroplasticity restoration, and immune-oxidative stress homeostasis. Conclusion This study reveals that acupuncture alleviates depression through a multi-level mechanism, primarily involving neuroinflammation suppression, neuroplasticity enhancement, and oxidative stress regulation. These findings systematically clarify the underlying mechanisms of acupuncture's antidepressant effects and identify novel therapeutic targets for further mechanistic research.
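The high-frequency acupoint and degree-centrality counting can be sketched with a toy stand-in for the network analysis. The prescriptions below are illustrative examples, not data from the 57 analyzed studies, and degree here simply counts the distinct acupoints each point is combined with.

```python
from collections import Counter
from itertools import combinations

def acupoint_stats(prescriptions):
    """Frequency and co-occurrence degree of acupoints across
    prescriptions; degree counts distinct partners of each point,
    a simple proxy for network degree centrality."""
    freq = Counter()
    partners = {}
    for points in prescriptions:
        freq.update(points)
        # Every unordered pair within one prescription is an edge.
        for a, b in combinations(sorted(set(points)), 2):
            partners.setdefault(a, set()).add(b)
            partners.setdefault(b, set()).add(a)
    degree = {p: len(s) for p, s in partners.items()}
    return freq, degree
```

Sorting `freq` and `degree` then surfaces the high-frequency, high-centrality points that form a core prescription.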
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2024YFE0200600; in part by the National Natural Science Foundation of China under Grant 62071425; in part by the Zhejiang Key Research and Development Plan under Grant 2022C01093; in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LR23F010005; in part by the National Key Laboratory of Wireless Communications Foundation under Grant 2023KP01601; and in part by the Big Data and Intelligent Computing Key Lab of CQUPT under Grant BDIC-2023-B-001.
Abstract: Semantic communication (SemCom) aims to achieve high-fidelity information delivery under low communication consumption by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can sufficiently utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. Therefore, there emerges a strong incentive to revolutionize the CRC mechanism and thus more effectively reap the benefits of both SemCom and HARQ. In this paper, built on top of Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, different from the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which capably digs out the inner topological and geometric information of images, to capture semantic information and determine the necessity of re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and manifest the superiority of the TDA-based error detection method in image transmission.
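For contrast with the TDA-based check, the conventional bit-level CRC decision it replaces can be sketched as follows. This is a generic illustration of the mechanism the paper argues against relying on, not the authors' implementation; the frame layout (payload plus a 4-byte CRC-32 trailer) is an assumption.

```python
import zlib

def make_frame(payload: bytes) -> bytes:
    """Append a CRC-32 trailer to a payload, as a conventional
    bit-level integrity check would."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def needs_retransmission(frame: bytes) -> bool:
    """Bit-level check: request a retransmit on any CRC mismatch,
    even when the corruption would not change decoded semantics."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload) != int.from_bytes(trailer, "big")
```

The point of SC-TDA-HARQ is precisely that this check fires on any bit flip, whereas a semantic (TDA-based) check can tolerate perturbations that leave the image's meaning intact.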
Abstract: In the section 'Track decoding' of this article, one of the paragraphs was inadvertently omitted after the text '…shows the flow diagram of the Tr2-1121 track mode.' The missing paragraph is provided below.
Funding: Supported by the National Natural Science Foundation of China (61873283), the Changsha Science & Technology Project (KQ1707017), and the Innovation-Driven Project of Central South University (2019CX005).
Abstract: Dissolved oxygen (DO) is an important indicator in aquaculture, and its accurate forecasting can effectively improve the quality of aquatic products. In this paper, a new DO hybrid forecasting model is proposed that includes three stages: multi-factor analysis, adaptive decomposition, and an optimization-based ensemble. First, considering the complex factors affecting DO, the grey relational (GR) degree method is used to screen out the environmental factors most closely related to DO. The consideration of multiple factors makes model fusion more effective. Second, the series of DO, water temperature, salinity, and oxygen saturation are decomposed adaptively into sub-series by means of the empirical wavelet transform (EWT) method. Then, five benchmark models are utilized to forecast the sub-series of the EWT decomposition. The ensemble weights of these five sub-forecasting models are calculated by particle swarm optimization and the gravitational search algorithm (PSOGSA). Finally, a multi-factor ensemble model for DO is obtained by weighted allocation. The performance of the proposed model is verified on time-series data collected by the Pacific Islands Ocean Observing System (PacIOOS) from the WQB04 station at Hilo. The evaluation indicators involved in the experiment include the Nash-Sutcliffe efficiency (NSE), Kling-Gupta efficiency (KGE), mean absolute percent error (MAPE), standard deviation of error (SDE), and coefficient of determination (R²). Example analysis demonstrates that: ① the proposed model can obtain excellent DO forecasting results; ② the proposed model is superior to other comparison models; and ③ the forecasting model can be used to analyze the trend of DO and enable managers to make better management decisions.
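The final weighted-allocation ensemble step can be sketched as below. The weights here are fixed by hand for illustration, whereas the paper tunes them with PSOGSA; the forecast values are toy numbers, not WQB04 data.

```python
def ensemble_forecast(forecasts, weights):
    """Combine sub-model forecast series by weighted allocation.
    forecasts: equal-length forecast lists, one per sub-model.
    weights: non-negative weights, normalized here to sum to one."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    return [sum(wi * f[t] for wi, f in zip(w, forecasts))
            for t in range(len(forecasts[0]))]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)
```

With well-chosen weights, the combined series scores a lower MAPE than the weaker individual sub-model, which is what the optimization-based ensemble stage exploits.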
Funding: Supported by a project funded by the Hebei Provincial Central Guidance Local Science and Technology Development Fund (236Z7714G).
Abstract: Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
Funding: Supported by the National Key Research and Development Program of China (2023YFC3303701-02 and 2024YFC3306701), the National Natural Science Foundation of China (T2425014 and 32270667), the Natural Science Foundation of Fujian Province of China (2023J06013), the Major Project of the National Social Science Foundation of China granted to Chuan-Chao Wang (21&ZD285), the Open Research Fund of the State Key Laboratory of Genetic Engineering at Fudan University (SKLGE-2310), and the Open Research Fund of the Forensic Genetics Key Laboratory of the Ministry of Public Security (2023FGKFKT07).
Abstract: The analysis of ancient genomics provides opportunities to explore human population history across both temporal and geographic dimensions (Haak et al., 2015; Wang et al., 2021, 2024). To enhance the accessibility and utility of these ancient genomic datasets, a range of databases and advanced statistical models have been developed, including the Allen Ancient DNA Resource (AADR) (Mallick et al., 2024) and AdmixTools (Patterson et al., 2012). While upstream processes such as sequencing and raw data processing have been streamlined by resources like the AADR, the downstream analysis of these datasets, encompassing population genetics inference and spatiotemporal interpretation, remains a significant challenge. The AADR provides a unified collection of published ancient DNA (aDNA) data, yet its file-based format and reliance on command-line tools, such as those in AdmixTools (Patterson et al., 2012), require advanced computational expertise for effective exploration and analysis. These requirements can present significant challenges for researchers lacking advanced computational expertise, limiting the accessibility and broader application of these valuable genomic resources.
Abstract: The application of time-series modeling and forecasting methods to the spectral analysis of lubricating oil from mechanical equipment is discussed. The AR model is used to perform a time-series modeling and forecasting analysis of spectral analysis data collected from aero-engines. In the oil condition monitoring field of mechanical equipment, the use of time-series analysis has rarely been reported. As indicated by a satisfactory example, a practical method for condition monitoring and fault forecasting of mechanical equipment has been achieved.
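An AR(1) fit-and-forecast sketch in the spirit of the AR modeling described above (the paper does not specify the model order; the least-squares estimator, function names, and toy series below are illustrative assumptions):

```python
def fit_ar1(series):
    """Estimate the AR(1) model x[t] = c + phi * x[t-1] + noise
    by ordinary least squares on lagged pairs."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    phi = num / den
    c = my - phi * mx
    return c, phi

def forecast_ar1(series, steps, c, phi):
    """Iterate the fitted AR(1) recursion forward from the last value."""
    out, x = [], series[-1]
    for _ in range(steps):
        x = c + phi * x
        out.append(x)
    return out
```

On a series generated exactly by x[t] = 2 + 0.5·x[t−1], the fit recovers the coefficients and the one-step forecast continues the recursion.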
Funding: Supported by the National Standardization Project of TCM (ZYBZH-C-TJ-55) and the National Science and Technology Major Project (2018ZX09201011-002).
Abstract: The negative-pressure conical fluidized bed is widely used in the pharmaceutical industry. In this study, experiments based on the negative-pressure conical fluidized bed are carried out by varying the material mass and particle size. The pressure fluctuation signals are analyzed by time-domain and frequency-domain methods. A method for absolutely characterizing the degree of energy concentration at the main frequency is proposed, in which the original power spectrum is divided by the average signal power. A phenomenon in which the gas velocity curve temporarily stops growing is observed when the material mass is light and the particle size is small. The standard deviation and kurtosis both change rapidly at the minimum fluidization velocity and thus can be used to determine the flow regime, and the variation rule of the kurtosis is independent of both the material mass and particle size. In the initial fluidization stage, the dominant pressure signal comes from the material movement; with the increase in gas velocity, the power of a 2.5 Hz signal continues to increase. A method of dividing the main frequency by the average cycle frequency can conveniently determine the fluidized state, and a novel concept called the stable fluidized zone is proposed in this paper. Controlling the gas velocity within the stable fluidized zone ensures that the fluidized bed consistently remains in a stable fluidized state.
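The proposed energy-concentration measure, dividing the power at the main frequency by the average signal power, can be sketched with a direct DFT. This is a simplified stand-in, not the authors' exact computation; the direct DFT is adequate only for the short windows used here.

```python
import cmath

def power_spectrum(signal):
    """One-sided power spectrum via a direct DFT (O(n^2), fine for
    short pressure-fluctuation windows)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum

def main_freq_concentration(signal):
    """Peak power divided by the average power over the non-zero
    frequencies: the degree of energy concentration at the main
    frequency, in the spirit of the measure proposed above."""
    spec = power_spectrum(signal)[1:]  # drop the DC component
    avg = sum(spec) / len(spec)
    return max(spec) / avg
```

A single-tone signal scores high on this measure; spreading the same window's energy over more frequencies lowers the score, which is the behavior used to track the fluidized state.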
Abstract: There are some limitations when we apply conventional methods to analyze the massive amounts of seismic data acquired with high-density spatial sampling, since processors usually obtain the properties of raw data from common shot gathers or other datasets located at certain points or along lines. We propose a novel method in this paper to observe seismic data on time slices from spatial subsets. The composition of a spatial subset and the unique character of orthogonal or oblique subsets are described, and pre-stack subsets are shown by 3D visualization. In seismic data processing, spatial subsets can be used for the following aspects: (1) to check the trace distribution uniformity and regularity; (2) to observe the main features of ground-roll and linear noise; (3) to find abnormal traces from slices of datasets; and (4) to QC the results of pre-stack noise attenuation. The field data application shows that seismic data analysis in spatial subsets is an effective method that may lead to a better discrimination among various wavefields and help us obtain more information.
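Extracting a time slice from a spatial subset reduces to simple indexing over a trace volume. The `volume[inline][crossline][time]` layout below is an assumption for illustration, not the paper's data format:

```python
def time_slice(volume, t):
    """Extract a constant-time slice from a pre-stack subset organized
    as volume[inline][crossline][time]: one amplitude per trace."""
    return [[trace[t] for trace in line] for line in volume]
```

Viewing such slices at successive times is how anomalous traces and coherent noise stand out across the whole subset at once.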
Abstract: Objective To evaluate the environmental and technical efficiencies of China's industrial sectors and provide appropriate advice for policy makers in the context of rapid economic growth and the concurrent serious environmental damage caused by industrial pollutants. Methods A data envelopment analysis (DEA) framework crediting both the reduction of pollution outputs and the expansion of good outputs was designed as a model to compute the environmental efficiency of China's regional industrial systems. Results As shown by the geometric mean of environmental efficiency, if other inputs were held constant and good outputs were not improved, air pollution outputs could potentially be decreased by about 60% across China. Conclusion Both environmental and technical efficiencies have the potential to be greatly improved in China, which may provide some advice for policy makers.