As global climate change intensifies, the power industry, a major source of carbon emissions, plays a pivotal role in achieving carbon peaking and neutrality goals through its low-carbon transition. Traditional power plants' carbon management systems can no longer meet the demands of high-precision, real-time monitoring. Smart power plants now offer innovative solutions for carbon emission tracking and intelligent analysis by integrating IoT, big data, and AI technologies. Current research predominantly focuses on optimizing individual processes and lacks systematic exploration of comprehensive dynamic monitoring and intelligent decision-making across the entire workflow. To address this gap, we propose a smart carbon emission monitoring and analysis platform for power plants that integrates IoT sensing, multimodal data analytics, and AI-driven decision-making. The platform establishes a multi-source sensor network to collect emissions data throughout the fuel combustion, auxiliary equipment operation, and waste treatment processes. It combines carbon emission factor analysis with machine learning models to calculate emissions in real time, and uses long short-term memory (LSTM) networks to predict future emission trends.
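The emission-factor step named in the abstract above can be illustrated with a minimal sketch. It assumes the common activity-data form (emissions = fuel consumed × emission factor × oxidation fraction); the class name, function names, and all numeric values are hypothetical placeholders and are not taken from the platform described.

```python
from dataclasses import dataclass


@dataclass
class FuelReading:
    """One sampling interval; all fields are illustrative assumptions."""
    fuel_tonnes: float               # fuel burned during the interval
    emission_factor: float           # tonnes CO2 per tonne of fuel (placeholder)
    oxidation_fraction: float = 1.0  # share of carbon actually oxidized


def co2_tonnes(reading: FuelReading) -> float:
    """Emission-factor estimate: activity data x factor x oxidation fraction."""
    return reading.fuel_tonnes * reading.emission_factor * reading.oxidation_fraction


def total_co2(readings: list[FuelReading]) -> float:
    """Sum the interval estimates to get emissions over a longer period."""
    return sum(co2_tonnes(r) for r in readings)


if __name__ == "__main__":
    samples = [FuelReading(10.0, 2.0), FuelReading(5.0, 2.0, 0.5)]
    print(total_co2(samples))
```

In a real deployment the factors would come from measured fuel properties or published inventories, and the machine learning and LSTM components mentioned in the abstract would sit on top of this calculation; neither is sketched here.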
The application and development of wide-area measurement systems (WAMS) has enabled many applications and led to several requirements based on dynamic measurement data. Such data are transmitted as a big data information flow. To ensure effective transmission of wide-frequency electrical information by the communication protocol of a WAMS, this study performs real-time traffic monitoring and analysis of the data network of a power information system, and establishes corresponding network optimization strategies to solve existing transmission problems. The study uses the traffic analysis results obtained from the current real-time dynamic monitoring system to design an optimization strategy that covers three progressive levels: the underlying communication protocol, the source data, and the transmission process. Tests validate that the optimization of the system structure and the scheduling optimization of data information are feasible and practical.
Opinion (sentiment) analysis on big data streams, from the constantly generated text streams on social media networks to hundreds of millions of online consumer reviews, provides organizations in every field with opportunities to discover valuable intelligence from massive user-generated text streams. However, traditional content analysis frameworks cannot efficiently handle the unprecedented volume of unstructured text streams or the complexity of the text analysis tasks required for real-time opinion analysis on big data streams. In this paper, we propose a parallel real-time sentiment analysis system, the Social Media Data Stream Sentiment Analysis Service (SMDSSAS), which performs multiple phases of sentiment analysis of social media text streams in real time with two fully analytic opinion mining models, in order to cope with the scale of text data streams and the complexity of sentiment analysis on unstructured text. We propose two aspect-based opinion mining models, a Deterministic and a Probabilistic sentiment model, for real-time sentiment analysis on data streams related to a user-given topic. Experiments on Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election, for real-time analysis of public opinion toward two presidential candidates, showed that the proposed system correctly predicted Donald Trump as the winner of the election. The cross-validation results showed that the proposed sentiment models, together with the real-time streaming components of our framework, effectively analyzed the opinions on the two presidential candidates, with an average accuracy of 81% for the Deterministic model and 80% for the Probabilistic model, which are 1%-22% improvements over results in the existing literature.
In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations' enterprise systems. This phenomenon provides organizations with unprecedented opportunities to tap into big data to mine valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is the illustration of the development of a novel big data stream analytics framework named BDSASA that leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research work is that organizations can apply our big data stream analytics framework to analyze consumers' product preferences, and hence develop more effective marketing and production strategies.
With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study explores the development strategies of real-time data analysis and decision-support systems, and analyzes their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail the key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes were chosen from a white mouse gene expression dataset under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using gene set enrichment analysis (GSEA). The final dataset consists of 20 groups of gene expression data from pneumococcal infection; functionally related genes are categorized based on the similarity of their expression profiles, which facilitates the study of genes with unknown functions.
The exponential expansion of the Internet of Things (IoT), Industrial Internet of Things (IIoT), and Transportation Management of Things (TMoT) produces vast amounts of real-time streaming data. Ensuring system dependability, operational efficiency, and security depends on identifying anomalies in these dynamic and resource-constrained systems. Because of their high computational requirements and their inability to process continuous data streams efficiently, traditional anomaly detection techniques often fail in IoT systems. This work presents a resource-efficient adaptive anomaly detection model for real-time streaming data in IoT systems. Extensive experiments on multiple real-world datasets achieved an average accuracy of 96.06% with an execution time close to 7.5 milliseconds per streaming data point, demonstrating the model's potential for real-time, resource-constrained applications. The model uses Principal Component Analysis (PCA) for dimensionality reduction and a Z-score technique for anomaly detection. It maintains a low computational footprint through a sliding window mechanism, which enables incremental data processing and the identification of both transient and sustained anomalies without storing historical data. The system also uses a Multivariate Linear Regression (MLR) based imputation technique that estimates missing or corrupted sensor values, preserving data integrity prior to anomaly detection. The proposed solution is appropriate for many uses in smart cities, industrial automation, environmental monitoring, IoT security, and intelligent transportation systems, and is particularly well suited to resource-constrained edge devices.
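The sliding-window Z-score idea in the abstract above can be sketched as follows. This is only an illustration of that one step on a univariate stream: the PCA reduction and the MLR imputation are omitted, and the class name, window size, threshold, and `min_points` warm-up are hypothetical choices, not the authors' settings.

```python
from collections import deque
from math import sqrt


class SlidingZScoreDetector:
    """Flags a point when its Z-score against the most recent
    `window` points exceeds `threshold` (all defaults are assumptions)."""

    def __init__(self, window: int = 30, threshold: float = 3.0, min_points: int = 5):
        self.buffer = deque(maxlen=window)  # only the current window is retained
        self.threshold = threshold
        self.min_points = min_points        # warm-up before scoring starts

    def update(self, x: float) -> bool:
        """Score x against the current window, then add it to the window."""
        is_anomaly = False
        n = len(self.buffer)
        if n >= self.min_points:
            mean = sum(self.buffer) / n
            std = sqrt(sum((v - mean) ** 2 for v in self.buffer) / n)
            if std > 0:
                is_anomaly = abs(x - mean) / std > self.threshold
        self.buffer.append(x)
        return is_anomaly


if __name__ == "__main__":
    detector = SlidingZScoreDetector(window=10)
    stream = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 10.0]
    print([detector.update(x) for x in stream])
```

Because the deque has a fixed maximum length, memory use stays constant however long the stream runs, which matches the low-footprint goal described in the abstract. Note that in this sketch a flagged point still enters the window; whether to exclude it is a design choice the abstract does not specify.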
This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it clarifies that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. In response to future development needs, it is necessary to strengthen data collection management, enhance data processing capabilities, and innovate big data application models. The paper aims to provide references for bank big data practices and to promote the transformation and upgrading of the banking industry in the context of legal digital currencies.
With the accelerating intelligent transformation of energy systems, the monitoring of equipment operation status and the optimization of production processes in thermal power plants face the challenge of integrating multi-source heterogeneous data. In view of the heterogeneous characteristics of physical sensor data (including temperature, vibration, and pressure) generated by boilers, steam turbines, and other key equipment, and of real-time working condition data from the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize the collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines the Kalman filter, the wavelet transform, and Bayesian estimation to solve the problems of time series alignment and dimension differences in the data. Simulation results show that the data fusion accuracy can be improved to more than 98% and the calculation delay can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant diffusion model, and supports cascaded analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 seconds, and the data consistency verification accuracy reaches 99.5%.
Strong noise has increasingly become a bottleneck restricting the precision and application space of electromagnetic exploration methods. Suppressing noise and extracting effective electromagnetic response information against a strong noise background is a crucial scientific task. To solve the noise suppression problem of the controlled-source electromagnetic method in strong interference areas, we propose a data processing approach based on complex-plane 2D k-means clustering. Based on the stability of the controlled-source signal response, clustering analysis is applied to classify the spectra of different sources and noises in multiple time segments. Identifying the power spectra with controlled-source characteristics helps to improve the quality of the extracted controlled-source response. This paper presents the principle and workflow of the proposed algorithm, and demonstrates its feasibility and effectiveness through synthetic and real data examples. The results show that, compared with the conventional robust denoising method, the clustering algorithm suppresses common noise more strongly, can identify high-quality signals, and improves the quality of preprocessed controlled-source electromagnetic data.
The fracture volume changes gradually as fracture pressure is depleted during production. However, few flowback models available so far can estimate the fracture volume loss using pressure transient and rate transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before fracturing of child wells. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material balance time (MBT) diagnostic plots using both initial and second flowback data (FB1 and FB2, respectively) of six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FB2 period is higher than that during the FB1 period, indicating a potential loss of fracture volume from the FB1 to the FB2 period. We estimate the changes in effective fracture volume (Vef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. Vef during FB2 is in general 3%-45% lower than that during FB1. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is compaction drive, but its contribution is reduced by 16% in the FB2 period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to the pressure depletion in fractures, which occurs during the production period and can extend to 800 days.
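The diagnostic quantities in the abstract above can be sketched in a few lines. The definitions used here are the standard ones (RNP as pressure drawdown divided by rate, MBT as cumulative volume divided by rate). The volume comparison additionally assumes a simple tank model in which the RNP-versus-MBT slope equals 1/(c_t·V), so that V2/V1 = (m1·c_t1)/(m2·c_t2); that relation is an assumption for illustration, since the abstract does not state the exact formulation. All function names are hypothetical.

```python
def rnp(p_initial: float, p_flowing: float, rate: float) -> float:
    """Rate-normalized pressure: drawdown divided by flow rate."""
    return (p_initial - p_flowing) / rate


def mbt(cumulative_volume: float, rate: float) -> float:
    """Material balance time: cumulative produced volume divided by rate."""
    return cumulative_volume / rate


def ls_slope(xs: list[float], ys: list[float]) -> float:
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def volume_ratio(slope_1: float, ct_1: float, slope_2: float, ct_2: float) -> float:
    """V2/V1 under the assumed tank model slope = 1 / (c_t * V)."""
    return (slope_1 * ct_1) / (slope_2 * ct_2)


if __name__ == "__main__":
    # Hypothetical numbers: the second flowback has twice the RNP slope
    # at equal total compressibility, so the volume ratio is one half.
    print(volume_ratio(slope_1=1.0, ct_1=2.0, slope_2=2.0, ct_2=2.0))
```

Under this assumption a steeper RNP slope in the second flowback maps directly to a smaller effective volume, which is the qualitative reading given in the abstract.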
With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. Big data technology provides a powerful means of analyzing these data in depth and gaining insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology to the analysis of user behavior on e-commerce platforms, including the technical methods of data collection, storage, processing, and analysis, as well as specific applications in user profile construction, precision marketing, personalized recommendation, and user retention and churn analysis. It also discusses the challenges faced in these applications and the corresponding countermeasures. Through the study of actual cases, it demonstrates the effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and improving user experience.
Objective To identify core acupoint patterns and elucidate the molecular mechanisms of acupuncture for primary depressive disorder (PDD) through data mining and network analysis. Methods A comprehensive literature search was conducted across PubMed, Embase, Ovid Technologies (OVID), Web of Science, Cochrane Library, China National Knowledge Infrastructure (CNKI), China Science and Technology Journal Database (VIP), Wanfang Data, and SinoMed Database from database inception to January 31, 2025, for clinical studies on acupuncture treatment of PDD. Descriptive statistics, high-frequency acupoint analysis, degree and betweenness centrality evaluation, and core acupoint prescription mining were used to identify the predominant therapeutic combinations for PDD. Network acupuncture was used to predict therapeutic targets for the core acupoint prescription. Subsequent protein-protein interaction (PPI) network and molecular complex detection (MCODE) analyses were conducted to identify the key targets and functional modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses explored the underlying biological mechanisms of the core acupoint prescription in treating PDD. Results A total of 57 acupoint prescriptions underwent systematic analysis. The core therapeutic combination comprised Baihui (GV20), Yintang (GV29), Neiguan (PC6), Hegu (LI4), and Shenmen (HT7). Network acupuncture analysis identified 88 potential therapeutic targets (79 overlapping with PDD), while PPI network analysis revealed central regulatory nodes, including interleukin (IL)-6, IL-1β, tumor necrosis factor (TNF)-α, toll-like receptor 4 (TLR4), IL-10, brain-derived neurotrophic factor (BDNF), transforming growth factor (TGF)-β1, C-X-C motif chemokine ligand 10 (CXCL10), mitogen-activated protein kinase 3 (MAPK3), and nitric oxide synthase 1 (NOS1). MCODE-based modular analysis further elucidated three functionally coherent clusters: inflammation-homeostasis (score = 6.571), plasticity-neurotransmission (score = 3.143), and oxidative stress (score = 3.000). GO and KEGG analyses demonstrated significant enrichment of the MAPK, phosphoinositide 3-kinase/protein kinase B (PI3K/Akt), and hypoxia-inducible factor (HIF)-1 signaling pathways. These mechanistic insights suggested that the antidepressant effects are mediated through neuroinflammatory regulation, neuroplasticity restoration, and immune-oxidative stress homeostasis. Conclusion This study reveals that acupuncture alleviates depression through a multi-level mechanism, primarily involving neuroinflammation suppression, neuroplasticity enhancement, and oxidative stress regulation. These findings systematically clarify the underlying mechanisms of acupuncture's antidepressant effects and identify novel therapeutic targets for further mechanistic research.
Semantic communication (SemCom) aims to achieve high-fidelity information delivery at low communication cost by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can fully utilize the information semantics. However, given the possible semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to redesign the CRC mechanism so as to reap the benefits of both SemCom and HARQ more effectively. In this paper, building on swin transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method that extracts the inner topological and geometric information of images to capture semantic information and determine whether re-transmission is necessary. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and demonstrate the superiority of the TDA-based error detection method in image transmission.
In the section 'Track decoding' of this article, one of the paragraphs was inadvertently omitted after the text '…shows the flow diagram of the Tr2-1121 track mode.' The missing paragraph is provided below.
Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
The analysis of ancient genomics provides opportunities to explore human population history across both temporal and geographic dimensions (Haak et al., 2015; Wang et al., 2021, 2024). To enhance the accessibility and utility of these ancient genomic datasets, a range of databases and advanced statistical models have been developed, including the Allen Ancient DNA Resource (AADR) (Mallick et al., 2024) and AdmixTools (Patterson et al., 2012). While upstream processes such as sequencing and raw data processing have been streamlined by resources like the AADR, the downstream analysis of these datasets, which encompasses population genetics inference and spatiotemporal interpretation, remains a significant challenge. The AADR provides a unified collection of published ancient DNA (aDNA) data, yet its file-based format and reliance on command-line tools, such as those in AdmixTools (Patterson et al., 2012), require advanced computational expertise for effective exploration and analysis. These requirements can present significant challenges for researchers lacking such expertise, limiting the accessibility and broader application of these valuable genomic resources.
As the complexity of flight missions continues to increase, sending a timely warning or providing assistance to pilots helps to reduce the probability of operational errors and flight accidents. Real-time evaluation of mission load by monitoring pilots' physiological data is a feasible technical way to achieve this. In this paper, a set of flight tasks including aircraft control, human-computer interaction, and mental arithmetic tests is designed to simulate five mission loads at different flight difficulty levels. A sensitivity analysis method based on a comprehensive test is proposed to select a set of sensitive physiological factors. Then, based on an SVM hierarchical combination classification method, a real-time evaluation model of pilot mission load is established. The test results show significant differences in EMG, respiration rate (abdomen), heart rate, blood oxygen saturation, pupil area, fixation duration, number of fixations, and saccades. The high accuracy obtained in the experiments indicates that the proposed real-time evaluation model can meet the requirements of real working environments. The findings can provide methodological references for mission load evaluation research in other fields.
Chlorine, chlorine dioxide, and ozone are widely used as disinfectants in drinking water treatment. However, the combined use of different disinfectants can result in the formation of various organic and inorganic disinfection byproducts (DBPs). The toxic interactions among these complex DBPs, including synergism, addition, and antagonism, are still unclear. In this study, we established and verified a real-time cell analysis (RTCA) method for cytotoxicity measurement on Chinese hamster ovary (CHO) cells. Using this convenient and accurate method, we assessed the cytotoxicity of a series of binary combinations, each consisting of one of 3 inorganic DBPs (chlorite, chlorate, and bromate) and one of 32 regulated and emerging organic DBPs. The combination index (CI) of each combination was calculated and evaluated by isobolographic analysis to reflect the toxic interactions. The results confirmed a synergistic effect on cytotoxicity in the binary combinations of chlorite with one of 5 organic DBPs (2 iodinated DBPs (I-DBPs) and 3 brominated DBPs (Br-DBPs)), chlorate with one of 4 organic DBPs (3 aromatic DBPs and dibromoacetonitrile), and bromate with one of 3 organic DBPs (2 I-DBPs and dibromoacetic acid). The possible synergism mechanism of organic DBPs on the inorganic ones may be attributed to the influence of organic DBPs on the cell membrane and the cell antioxidant system. This study revealed the toxic interactions among organic and inorganic DBPs, and emphasized the latent adverse outcomes of the combined use of different disinfectants.
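The combination index used in the abstract above can be illustrated with a small sketch. It assumes the widely used two-component form CI = d1/D1 + d2/D2, where d1 and d2 are the doses of each component in the mixture and D1 and D2 are the doses of each component alone that produce the same effect; CI below 1 is read as synergism, about 1 as addition, and above 1 as antagonism. The tolerance band around 1 and all function names are hypothetical, and the study's exact calculation is not given in the abstract.

```python
def combination_index(d1: float, D1: float, d2: float, D2: float) -> float:
    """CI for a binary mixture: d1/D1 + d2/D2 (assumed two-component form).

    d1, d2: doses of each component in the mixture at a given effect level.
    D1, D2: doses of each component alone giving the same effect level.
    """
    return d1 / D1 + d2 / D2


def classify_interaction(ci: float, tolerance: float = 0.1) -> str:
    """Label the interaction; the tolerance band is an illustrative choice."""
    if ci < 1.0 - tolerance:
        return "synergism"
    if ci > 1.0 + tolerance:
        return "antagonism"
    return "addition"


if __name__ == "__main__":
    # Hypothetical doses: each component at a quarter of its single-agent dose.
    ci = combination_index(d1=1.0, D1=4.0, d2=1.0, D2=4.0)
    print(ci, classify_interaction(ci))
```

On an isobologram the same comparison is geometric: a mixture point lying below the straight line joining D1 and D2 corresponds to CI below 1.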
In order to optimize the embedded system implementation for Ethernet-based computer numerical control (CNC) systems, it is necessary to establish a performance analysis model and to adopt a codesign method from the control, communication, and computing perspectives. Based on an analysis of real-time Ethernet, the system architecture, and the time characteristic parameters of the control loop, a performance analysis model for real-time Ethernet-based CNC systems is proposed, which can include in the simulation the timing effects caused by the implementation platform. The key to establishing the model is the design of the error analysis module and the controller nodes. Under constraints on CPU resources and communication bandwidth, a case-study experiment was conducted; the results show that when the deadline miss ratio of data packets is 0.2%, the percentage error is 1.105%. The proposed model can be used at several stages of CNC system development.
文摘As global climate change intensifies,the power industry-a major source of carbon emissions-plays a pivotal role in achieving carbon peaking and neutrality goals through its low-carbon transition.Traditional power plants’carbon management systems can no longer meet the demands of high-precision,real-time monitoring.Smart power plants now offer innovative solutions for carbon emission tracking and intelligent analysis by integrating IoT,big data,and AI technologies.Current research predominantly focuses on optimizing individual processes,lacking systematic exploration of comprehensive dynamic monitoring and intelligent decision-making across the entire workflow.To address this gap,we propose a smart carbon emission monitoring and analysis platform for power plants that integrates IoT sensing,multimodal data analytics,and AI-driven decision-making.The platform establishes a multi-source sensor network to collect emissions data throughout the fuel combustion,auxiliary equipment operation,and waste treatment processes.Combining carbon emission factor analysis with machine learning models enables real-time emission calculations and utilizes long short-term memory networks to predict future emission trends.
文摘The application and development of a wide-area measurement system(WAMS)has enabled many applications and led to several requirements based on dynamic measurement data.Such data are transmitted as big data information flow.To ensure effective transmission of wide-frequency electrical information by the communication protocol of a WAMS,this study performs real-time traffic monitoring and analysis of the data network of a power information system,and establishes corresponding network optimization strategies to solve existing transmission problems.This study utilizes the traffic analysis results obtained using the current real-time dynamic monitoring system to design an optimization strategy,covering the optimization in three progressive levels:the underlying communication protocol,source data,and transmission process.Optimization of the system structure and scheduling optimization of data information are validated to be feasible and practical via tests.
文摘Opinion (sentiment) analysis on big data streams from the constantly generated text streams on social media networks to hundreds of millions of online consumer reviews provides many organizations in every field with opportunities to discover valuable intelligence from the massive user generated text streams. However, the traditional content analysis frameworks are inefficient to handle the unprecedentedly big volume of unstructured text streams and the complexity of text analysis tasks for the real time opinion analysis on the big data streams. In this paper, we propose a parallel real time sentiment analysis system: Social Media Data Stream Sentiment Analysis Service (SMDSSAS) that performs multiple phases of sentiment analysis of social media text streams effectively in real time with two fully analytic opinion mining models to combat the scale of text data streams and the complexity of sentiment analysis processing on unstructured text streams. We propose two aspect based opinion mining models: Deterministic and Probabilistic sentiment models for a real time sentiment analysis on the user given topic related data streams. Experiments on the social media Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election for real-time analysis of public opinions toward two presidential candidates showed that the proposed system was able to predict correctly Donald Trump as the winner of the 2016 Presidential election. The cross validation results showed that the proposed sentiment models with the real-time streaming components in our proposed framework delivered effectively the analysis of the opinions on two presidential candidates with average 81% accuracy for the Deterministic model and 80% for the Probabilistic model, which are 1% - 22% improvements from the results of the existing literature.
Abstract: In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations’ enterprise systems. This phenomenon provides organizations with unprecedented opportunities to tap into big data to mine valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is the illustration of the development of a novel big data stream analytics framework named BDSASA that leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research work is that organizations can apply our big data stream analytics framework to analyze consumers’ product preferences, and hence develop more effective marketing and production strategies.
Abstract: With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study explores the development strategies of real-time data analysis and decision-support systems, and analyzes their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
Abstract: DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data it generates. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes were chosen from a white mouse gene expression dataset covering two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After the gene chip information was preprocessed, the data were imported into the model, preliminary results were calculated, and permutation tests were performed; the preliminary results were then biologically validated using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
Funding: Funded by the Ongoing Research Funding Program (ORF-2025-890), King Saud University, Riyadh, Saudi Arabia, and supported by the Competitive Research Fund of the University of Aizu, Japan.
Abstract: The exponential expansion of the Internet of Things (IoT), Industrial Internet of Things (IIoT), and Transportation Management of Things (TMoT) produces vast amounts of real-time streaming data. Ensuring system dependability, operational efficiency, and security depends on identifying anomalies in these dynamic and resource-constrained systems. Because of their high computational requirements and inability to process continuous data streams efficiently, traditional anomaly detection techniques often fail in IoT systems. This work presents a resource-efficient adaptive anomaly detection model for real-time streaming data in IoT systems. Extensive experiments on multiple real-world datasets achieved an average accuracy of 96.06% with an execution time close to 7.5 milliseconds per streaming data point, demonstrating the model's potential for real-time, resource-constrained applications. The model uses Principal Component Analysis (PCA) for dimensionality reduction and a Z-score technique for anomaly detection. It maintains a low computational footprint through a sliding window mechanism, which enables incremental data processing and identification of both transient and sustained anomalies without storing historical data. The system uses a Multivariate Linear Regression (MLR) based imputation technique that estimates missing or corrupted sensor values, preserving data integrity prior to anomaly detection. The proposed solution is suitable for many uses in smart cities, industrial automation, environmental monitoring, IoT security, and intelligent transportation systems, and is particularly well suited to resource-constrained edge devices.
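The sliding-window Z-score step described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so the class name `SlidingZScoreDetector`, the parameters `window_size`, `min_samples`, and `threshold`, and the sample stream are all assumptions made for illustration. The PCA and MLR imputation stages are omitted, and the sketch handles a single univariate stream only.

```python
from collections import deque
import math


class SlidingZScoreDetector:
    """Hypothetical univariate detector: flags a point whose Z-score,
    computed against a bounded window of recent points, exceeds a threshold."""

    def __init__(self, window_size=30, min_samples=5, threshold=3.0):
        # deque(maxlen=...) discards the oldest point automatically,
        # so no history beyond the window is ever stored.
        self.window = deque(maxlen=window_size)
        self.min_samples = min_samples  # warm-up length (assumed value)
        self.threshold = threshold      # Z-score cut-off (assumed value)

    def update(self, value):
        """Score one incoming point, then add it to the window."""
        is_anomaly = False
        if len(self.window) >= self.min_samples:
            mean = sum(self.window) / len(self.window)
            variance = sum((v - mean) ** 2 for v in self.window) / len(self.window)
            std = math.sqrt(variance)
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        # The point is appended even when flagged; a flagged value therefore
        # inflates the window's spread until it ages out.
        self.window.append(value)
        return is_anomaly


stream = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9, 50.0, 10.0]
detector = SlidingZScoreDetector(window_size=30, min_samples=5, threshold=3.0)
flags = [detector.update(x) for x in stream]
```

In this toy stream only the spike of 50.0 is flagged. Whether flagged points should be kept out of the window is a design choice that the abstract does not settle.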
Abstract: This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it clarifies that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, banks should strengthen data collection management, enhance data processing capabilities, and innovate big data application models. These findings provide a reference for bank big data practices and promote the transformation and upgrading of the banking industry in the context of legal digital currencies.
Abstract: With the accelerating intelligent transformation of energy systems, the monitoring of equipment operating status and the optimization of production processes in thermal power plants face the challenge of integrating multi-source heterogeneous data. Given the heterogeneous characteristics of physical sensor data (including temperature, vibration, and pressure generated by boilers, steam turbines, and other key equipment) and the real-time working condition data of the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines the Kalman filter, wavelet transform, and Bayesian estimation to solve the problems of time-series alignment and dimension differences. Simulation results show that the data fusion accuracy can be improved to more than 98% and the calculation delay can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant diffusion model and supports cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 seconds, and the data consistency verification accuracy reaches 99.5%.
Funding: Supported by the National Key Research and Development Program Project of China (Grant No. 2023YFF0718003) and the Key Research and Development Plan Project of Yunnan Province (Grant No. 202303AA080006).
Abstract: Strong noise has increasingly become a bottleneck restricting the precision and application range of electromagnetic exploration methods. Suppressing noise and extracting effective electromagnetic response information against a strong noise background is a crucial scientific task. To solve the noise suppression problem of the controlled-source electromagnetic method in areas of strong interference, we propose a data processing approach based on complex-plane 2D k-means clustering. Exploiting the stability of the controlled-source signal response, clustering analysis is applied to classify the spectra of different sources and noises in multiple time segments. Identifying the power spectra with controlled-source characteristics helps to improve the quality of the extracted controlled-source response. This paper presents the principle and workflow of the proposed algorithm and demonstrates its feasibility and effectiveness through synthetic and real data examples. The results show that, compared with the conventional robust denoising method, the clustering algorithm suppresses common noise more strongly, can identify high-quality signals, and improves the quality of preprocessed data for the controlled-source electromagnetic method.
Abstract: The fracture volume changes gradually as fracture pressure depletes during production. However, few flowback models are available so far that can estimate the loss of fracture volume using pressure-transient and rate-transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before the child wells are fractured. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct diagnostic plots of rate-normalized pressure (RNP) versus material balance time (MBT) using both initial and second flowback data (FB1 and FB2, respectively) from six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FB2 period is higher than that during the FB1 period, indicating a potential loss of fracture volume from the FB1 to the FB2 period. We estimate the changes in effective fracture volume (Vef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. Vef during FB2 is in general 3% to 45% lower than that during FB1. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is compaction drive (CDI), but its contribution is reduced by 16% in the FB2 period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to pressure depletion in the fractures, which occurs during the production period and can extend to 800 days.
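As a rough illustration of the RNP-versus-MBT diagnostic mentioned in the abstract above, the sketch below uses the textbook rate-transient definitions, RNP = (p_i − p_wf) / q and MBT = Q / q, where Q is cumulative produced volume. The abstract itself does not state these formulas, so they are an assumption here, as are the function names `rnp_mbt_points` and `least_squares_slope`, the fixed time step `dt`, and the synthetic input values.

```python
def rnp_mbt_points(initial_pressure, pressures, rates, dt):
    """Return (MBT, RNP) pairs for equally spaced flowback samples.

    Assumed definitions (not given in the abstract):
      RNP = (initial_pressure - p) / q
      MBT = cumulative_volume / q
    """
    cumulative = 0.0
    points = []
    for p, q in zip(pressures, rates):
        cumulative += q * dt  # volume produced during this step
        if q <= 0:
            continue  # both quantities are undefined at zero rate
        points.append((cumulative / q, (initial_pressure - p) / q))
    return points


def least_squares_slope(points):
    """Ordinary least-squares slope of RNP against MBT."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Synthetic data: constant rate with linearly declining pressure.
points = rnp_mbt_points(
    initial_pressure=5000.0,
    pressures=[4990.0, 4980.0, 4970.0],
    rates=[100.0, 100.0, 100.0],
    dt=1.0,
)
slope = least_squares_slope(points)
```

Comparing the slope fitted to the FB1 data with the slope fitted to the FB2 data is the kind of contrast the abstract describes. Converting a slope change into a change in effective fracture volume also requires the total compressibility, which this sketch does not model.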
Abstract: With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. The emergence of big data technology provides a powerful means for in-depth analysis of these data and insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology to the analysis of user behavior on e-commerce platforms, including the technical methods of data collection, storage, processing, and analysis, as well as specific applications in user profile construction, precision marketing, personalized recommendation, and user retention and churn analysis, and discusses the challenges faced in application and the corresponding countermeasures. Through the study of actual cases, it demonstrates the effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and improving user experience.
Abstract: Objective: To identify core acupoint patterns and elucidate the molecular mechanisms of acupuncture for primary depressive disorder (PDD) through data mining and network analysis. Methods: A comprehensive literature search was conducted across PubMed, Embase, Ovid Technologies (OVID), Web of Science, Cochrane Library, China National Knowledge Infrastructure (CNKI), China National Knowledge Infrastructure Database (VIP), Wanfang Data, and SinoMed Database from database inception to January 31, 2025, for clinical studies on acupuncture treatment of PDD. Descriptive statistics, high-frequency acupoint analysis, degree and betweenness centrality evaluation, and core acupoint prescription mining identified the predominant therapeutic combinations for PDD. Network acupuncture was used to predict therapeutic targets for the core acupoint prescription. Subsequent protein-protein interaction (PPI) network and molecular complex detection (MCODE) analyses were conducted to identify the key targets and functional modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses explored the underlying biological mechanisms of the core acupoint prescription in treating PDD. Results: A total of 57 acupoint prescriptions underwent systematic analysis. The core therapeutic combinations comprised Baihui (GV20), Yintang (GV29), Neiguan (PC6), Hegu (LI4), and Shenmen (HT7). Network acupuncture analysis identified 88 potential therapeutic targets (79 overlapping with PDD), while PPI network analysis revealed central regulatory nodes, including interleukin (IL)-6, IL-1β, tumor necrosis factor (TNF)-α, toll-like receptor 4 (TLR4), IL-10, brain-derived neurotrophic factor (BDNF), transforming growth factor (TGF)-β1, C-X-C motif chemokine ligand 10 (CXCL10), mitogen-activated protein kinase 3 (MAPK3), and nitric oxide synthase 1 (NOS1). MCODE-based modular analysis further elucidated three functionally coherent clusters: inflammation-homeostasis (score = 6.571), plasticity-neurotransmission (score = 3.143), and oxidative stress (score = 3.000). GO and KEGG analyses demonstrated significant enrichment of the MAPK, phosphoinositide 3-kinase/protein kinase B (PI3K/Akt), and hypoxia-inducible factor (HIF)-1 signaling pathways. These mechanistic insights suggest that the antidepressant effects are mediated through neuroinflammatory regulation, neuroplasticity restoration, and immune-oxidative stress homeostasis. Conclusion: This study reveals that acupuncture alleviates depression through a multi-level mechanism, primarily involving suppression of neuroinflammation, enhancement of neuroplasticity, and regulation of oxidative stress. These findings systematically clarify the mechanisms underlying acupuncture’s antidepressant effects and identify novel therapeutic targets for further mechanistic research.
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2024YFE0200600; in part by the National Natural Science Foundation of China under Grant 62071425; in part by the Zhejiang Key Research and Development Plan under Grant 2022C01093; in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LR23F010005; in part by the National Key Laboratory of Wireless Communications Foundation under Grant 2023KP01601; and in part by the Big Data and Intelligent Computing Key Lab of CQUPT under Grant BDIC-2023-B-001.
Abstract: Semantic communication (SemCom) aims to achieve high-fidelity information delivery at low communication cost by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is regarded as a more effective mechanism that can make full use of the information semantics. However, given the possible semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to rethink the CRC mechanism so as to reap the benefits of both SemCom and HARQ more effectively. In this paper, building on Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which extracts the inner topological and geometric information of images to capture semantic information and determine whether re-transmission is necessary. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and show the superiority of the TDA-based error detection method in image transmission.
Abstract: In the section ‘Track decoding’ of this article, one paragraph was inadvertently omitted after the text ‘…shows the flow diagram of the Tr2-1121 track mode.’ The missing paragraph is provided below.
Funding: Supported by a project funded by the Hebei Provincial Central Guidance Local Science and Technology Development Fund (236Z7714G).
Abstract: Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
Funding: Supported by the National Key Research and Development Program of China (2023YFC3303701-02 and 2024YFC3306701); the National Natural Science Foundation of China (T2425014 and 32270667); the Natural Science Foundation of Fujian Province of China (2023J06013); the Major Project of the National Social Science Foundation of China granted to Chuan-Chao Wang (21&ZD285); the Open Research Fund of the State Key Laboratory of Genetic Engineering at Fudan University (SKLGE-2310); and the Open Research Fund of the Forensic Genetics Key Laboratory of the Ministry of Public Security (2023FGKFKT07).
Abstract: The analysis of ancient genomics provides opportunities to explore human population history across both temporal and geographic dimensions (Haak et al., 2015; Wang et al., 2021, 2024). To enhance the accessibility and utility of these ancient genomic datasets, a range of databases and advanced statistical models have been developed, including the Allen Ancient DNA Resource (AADR) (Mallick et al., 2024) and AdmixTools (Patterson et al., 2012). While upstream processes such as sequencing and raw data processing have been streamlined by resources like the AADR, the downstream analysis of these datasets, which encompasses population genetics inference and spatiotemporal interpretation, remains a significant challenge. The AADR provides a unified collection of published ancient DNA (aDNA) data, yet its file-based format and reliance on command-line tools, such as those in AdmixTools (Patterson et al., 2012), require advanced computational expertise for effective exploration and analysis. These requirements can present significant challenges for researchers lacking such expertise, limiting the accessibility and broader application of these valuable genomic resources.
Funding: Co-supported by the Aeronautical Science Foundation of China (No. 2020Z023053002) and the National Natural Science Foundation of China (No. 61305133).
Abstract: As the complexity of flight missions continues to increase, sending timely warnings or providing assistance to pilots helps to reduce the probability of operational errors and flight accidents. Real-time evaluation of mission load by monitoring pilots’ physiological data is a feasible technical way to achieve this. In this paper, a set of flight tasks including aircraft control, human-computer interaction, and mental arithmetic tests is designed to simulate five mission loads at different flight difficulty levels. A sensitivity analysis method based on a comprehensive test is proposed to select a set of sensitive physiological factors. Then, based on an SVM hierarchical combination classification method, a real-time evaluation model of pilot mission load is established. The test results show significant differences in EMG, respiration rate (abdomen), heart rate, blood oxygen saturation, pupil area, fixation duration, number of fixations, and saccades. The high accuracy obtained in the experiments indicates that the proposed real-time evaluation model can meet the requirements of real working environments. The findings can provide methodological references for mission load evaluation research in other fields.
Funding: Supported by the National Natural Science Foundation of China (No. 21876210).
Abstract: Chlorine, chlorine dioxide, and ozone are widely used as disinfectants in drinking water treatment. However, the combined use of different disinfectants can result in the formation of various organic and inorganic disinfection byproducts (DBPs). The toxic interactions among these complex DBPs, including synergism, addition, and antagonism, are still unclear. In this study, we established and verified a real-time cell analysis (RTCA) method for measuring cytotoxicity in Chinese hamster ovary (CHO) cells. Using this convenient and accurate method, we assessed the cytotoxicity of a series of binary combinations, each consisting of one of 3 inorganic DBPs (chlorite, chlorate, and bromate) and one of 32 regulated and emerging organic DBPs. The combination index (CI) of each combination was calculated and evaluated by isobolographic analysis to reflect the toxic interactions. The results confirmed a synergistic effect on cytotoxicity in the binary combinations of chlorite with one of 5 organic DBPs (2 iodinated DBPs (I-DBPs) and 3 brominated DBPs (Br-DBPs)), chlorate with one of 4 organic DBPs (3 aromatic DBPs and dibromoacetonitrile), and bromate with one of 3 organic DBPs (2 I-DBPs and dibromoacetic acid). The possible mechanism of the synergism of organic DBPs with the inorganic ones may be the influence of organic DBPs on the cell membrane and the cell antioxidant system. This study revealed the toxic interactions among organic and inorganic DBPs and emphasized the latent adverse outcomes of the combined use of different disinfectants.
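The combination index named in the abstract above is commonly computed in its Loewe-additivity form, CI = d1/Dx1 + d2/Dx2, where d1 and d2 are the doses of the two agents in the mixture that together produce a given effect and Dx1 and Dx2 are the doses of each agent alone that produce the same effect. The abstract does not state which formulation or cut-offs the authors used, so the sketch below is an assumption throughout: the function names, the tolerance band `tol`, and the example doses are hypothetical.

```python
def combination_index(d1, d2, dx1, dx2):
    """Loewe-additivity combination index for a binary mixture.

    d1, d2   : doses of agents 1 and 2 in the mixture at a given effect level
    dx1, dx2 : doses of each agent alone giving the same effect level
    """
    return d1 / dx1 + d2 / dx2


def classify_interaction(ci, tol=0.1):
    """Label the interaction; the tolerance band around 1 is an assumed
    convention, not a value taken from the abstract."""
    if ci < 1.0 - tol:
        return "synergism"
    if ci > 1.0 + tol:
        return "antagonism"
    return "addition"


# Hypothetical doses in arbitrary units.
ci = combination_index(d1=2.0, d2=5.0, dx1=8.0, dx2=20.0)
label = classify_interaction(ci)
```

With these made-up doses each agent contributes a quarter of its stand-alone effective dose, so the index is 0.5 and the pair is labelled synergistic.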
Funding: Projects (50875090, 50905063) supported by the National Natural Science Foundation of China; Project (2009AA04Z111) supported by the National High Technology Research and Development Program of China; Project (20090460769) supported by the China Postdoctoral Science Foundation; Project (2011ZM0070) supported by the Fundamental Research Funds for the Central Universities in China; Project (S2011010001155) supported by the Natural Science Foundation of Guangdong Province, China.
Abstract: To optimize the embedded system implementation of an Ethernet-based computer numerical control (CNC) system, it is necessary to establish a performance analysis model and adopt a codesign method spanning the control, communication, and computing perspectives. Based on an analysis of real-time Ethernet, the system architecture, and the timing parameters of the control loop, a performance analysis model for a real-time Ethernet-based CNC system is proposed, which can include in the simulation the timing effects caused by the implementation platform. The key to establishing the model is the design of the error analysis module and the controller nodes. Under constraints on CPU resources and communication bandwidth, a case-study experiment was conducted, and the results show that if the deadline miss ratio of data packets is 0.2%, the percentage error is 1.105%. The proposed model can be used at several stages of CNC system development.