Earthquakes are highly destructive spatio-temporal phenomena whose analysis is essential for disaster preparedness and risk mitigation. Modern seismological research produces vast volumes of heterogeneous data from seismic networks, satellite observations, and geospatial repositories, creating the need for scalable infrastructures capable of integrating and analyzing such data to support intelligent decision-making. Data warehousing technologies provide a robust foundation for this purpose; however, existing earthquake-oriented data warehouses remain limited, often relying on simplified schemas, domain-specific analytics, or cataloguing efforts. This paper presents the design and implementation of a spatio-temporal data warehouse for seismic activity. The framework integrates spatial and temporal dimensions in a unified schema and introduces a novel array-based approach for managing many-to-many relationships between facts and dimensions without intermediate bridge tables. A comparative evaluation against a conventional bridge-table schema demonstrates that the array-based design improves fact-centric query performance, while the bridge-table schema remains advantageous for dimension-centric queries. To reconcile these trade-offs, a hybrid schema is proposed that retains both representations, ensuring balanced efficiency across heterogeneous workloads. The proposed framework demonstrates how spatio-temporal data warehousing can address schema complexity, improve query performance, and support multidimensional visualization. In doing so, it provides a foundation for integrating seismic analysis into broader big data-driven intelligent decision systems for disaster resilience, risk mitigation, and emergency management.
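The trade-off between the array-based and bridge-table representations in the abstract above can be illustrated with a small in-memory sketch. This is not the paper's schema: the event and region identifiers, the field names, and the helper functions are all hypothetical, and plain Python containers stand in for warehouse tables.

```python
# Hypothetical data: earthquake events (facts) linked many-to-many to regions (a dimension).

# Array-based design: each fact row carries its dimension keys directly.
facts_array = {
    1: {"magnitude": 6.1, "region_ids": [10, 11]},
    2: {"magnitude": 5.4, "region_ids": [11]},
}

# Bridge-table design: one (event_id, region_id) row per relationship.
bridge = [(1, 10), (1, 11), (2, 11)]


def regions_of_event_array(event_id):
    """Fact-centric query: a single lookup, no join needed."""
    return facts_array[event_id]["region_ids"]


def regions_of_event_bridge(event_id):
    """Fact-centric query via the bridge: requires scanning or joining bridge rows."""
    return [r for e, r in bridge if e == event_id]


def events_in_region_array(region_id):
    """Dimension-centric query: every fact's array must be inspected."""
    return [e for e, f in facts_array.items() if region_id in f["region_ids"]]


def events_in_region_bridge(region_id):
    """Dimension-centric query: a filter on one bridge column (indexable in a real database)."""
    return [e for e, r in bridge if r == region_id]
```

A hybrid schema in this sketch would simply keep both `facts_array` and `bridge` and route each query to the representation that answers it with less work.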
Small-angle X-ray scattering (SAXS) is an advanced technique for characterizing the particle size distribution (PSD) of nanoparticles. However, the ill-posed nature of the inverse problems in SAXS data analysis often reduces the accuracy of conventional methods. This article presents GranuSAS, a user-friendly software package for PSD analysis that employs an algorithm integrating truncated singular value decomposition (TSVD) with the Chahine method. The approach uses TSVD for data preprocessing, generating a set of noise-suppressed initial solutions. A high-quality initial solution is then selected via the L-curve method. This candidate solution is iteratively refined by the Chahine algorithm, which enforces constraints such as non-negativity and improves physical interpretability. Most importantly, GranuSAS employs a parallel architecture that simultaneously yields inversion results for multiple shape models and, by evaluating the accuracy of each model's reconstructed scattering curve, suggests a model for the material system under study. To systematically validate the accuracy and efficiency of the software, verification was performed using both simulated and experimental datasets. The results demonstrate that the software delivers both satisfactory accuracy and reliable computational efficiency. It provides an easy-to-use and reliable tool for researchers in materials science, helping them fully exploit the potential of SAXS in nanoparticle characterization.
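The refinement stage described in the abstract above can be sketched with a Chahine-type multiplicative update. This is a minimal illustration, not GranuSAS code: the 3x3 kernel, the data, and the function names are hypothetical, the kernel is assumed square with positive predictions, and the initial guess is simply given here rather than produced by TSVD and the L-curve method.

```python
def chahine_refine(kernel, observed, initial, iterations=50):
    """Chahine-type multiplicative refinement for a square kernel (illustrative only)."""
    x = list(initial)
    n = len(x)
    for _ in range(iterations):
        # Forward model: the curve predicted by the current size distribution.
        predicted = [sum(kernel[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Scale each bin by observed / predicted; a positive start stays non-negative.
        x = [x[i] * observed[i] / predicted[i] for i in range(n)]
    return x


def max_residual(kernel, observed, x):
    """Largest absolute misfit between the reconstructed and observed curves."""
    return max(
        abs(sum(k * v for k, v in zip(row, x)) - y)
        for row, y in zip(kernel, observed)
    )


# Hypothetical, well-conditioned toy problem.
kernel = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.2], [0.0, 0.2, 1.0]]
true_psd = [1.0, 2.0, 1.0]
observed = [sum(k * v for k, v in zip(row, true_psd)) for row in kernel]
initial = [1.0, 1.0, 1.0]  # stands in for the TSVD-selected starting solution
refined = chahine_refine(kernel, observed, initial)
```

Because every update is a multiplication by a positive ratio, non-negativity is preserved without an explicit projection step, which is the property the abstract attributes to the Chahine stage.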
As global climate change intensifies, the power industry, a major source of carbon emissions, plays a pivotal role in achieving carbon peaking and neutrality goals through its low-carbon transition. The carbon management systems of traditional power plants can no longer meet the demands of high-precision, real-time monitoring. Smart power plants now offer innovative solutions for carbon emission tracking and intelligent analysis by integrating IoT, big data, and AI technologies. Current research predominantly focuses on optimizing individual processes and lacks a systematic exploration of comprehensive dynamic monitoring and intelligent decision-making across the entire workflow. To address this gap, we propose a smart carbon emission monitoring and analysis platform for power plants that integrates IoT sensing, multimodal data analytics, and AI-driven decision-making. The platform establishes a multi-source sensor network to collect emissions data throughout the fuel combustion, auxiliary equipment operation, and waste treatment processes. It combines carbon emission factor analysis with machine learning models to calculate emissions in real time and uses long short-term memory networks to predict future emission trends.
The application and development of wide-area measurement systems (WAMS) has enabled many applications and led to several requirements based on dynamic measurement data. Such data are transmitted as a big data information flow. To ensure effective transmission of wide-frequency electrical information by the communication protocol of a WAMS, this study performs real-time traffic monitoring and analysis of the data network of a power information system and establishes corresponding network optimization strategies to solve existing transmission problems. The study uses the traffic analysis results obtained from the current real-time dynamic monitoring system to design an optimization strategy covering three progressive levels: the underlying communication protocol, the source data, and the transmission process. Tests validate that the optimization of the system structure and the scheduling optimization of data information are feasible and practical.
Opinion (sentiment) analysis on big data streams, from the constantly generated text streams on social media networks to hundreds of millions of online consumer reviews, gives organizations in every field opportunities to discover valuable intelligence in massive user-generated text streams. However, traditional content analysis frameworks cannot efficiently handle the unprecedented volume of unstructured text streams or the complexity of the text analysis tasks required for real-time opinion analysis on big data streams. In this paper, we propose a parallel real-time sentiment analysis system, Social Media Data Stream Sentiment Analysis Service (SMDSSAS), that performs multiple phases of sentiment analysis on social media text streams effectively in real time, using two fully analytic opinion mining models to cope with the scale of text data streams and the complexity of sentiment analysis on unstructured text. We propose two aspect-based opinion mining models, a Deterministic and a Probabilistic sentiment model, for real-time sentiment analysis on data streams related to a user-given topic. Experiments on Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election, conducted for real-time analysis of public opinion toward the two presidential candidates, showed that the proposed system correctly predicted Donald Trump as the winner of the 2016 Presidential election.
The cross-validation results showed that the proposed sentiment models, together with the real-time streaming components of our framework, analyzed opinions on the two presidential candidates effectively, with an average accuracy of 81% for the Deterministic model and 80% for the Probabilistic model, an improvement of 1% to 22% over results reported in the existing literature.
In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations’ enterprise systems. This phenomenon provides organizations with unprecedented opportunities to tap into big data to mine valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is the illustration of the development of a novel big data stream analytics framework named BDSASA that leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research work is that organizations can apply our big data stream analytics framework to analyze consumers’ product preferences, and hence develop more effective marketing and production strategies.
With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study aims to explore the development strategies of real-time data analysis and decision-support systems, and analyze their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail the key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
The detection and characterization of non-metallic inclusions are essential for clean steel production. Recently, imaging analysis combined with high-dimensional data processing of metallic materials using artificial intelligence (AI)-based machine learning (ML) has developed rapidly. This technique has achieved impressive results in the field of inclusion classification in process metallurgy. The present study surveys the ML modeling of inclusion prediction in advanced steels, including the detection, classification, and feature prediction of inclusions in different steel grades. Studies on clean steel with different features based on data and image analysis via ML are summarized. Regarding the data analysis, the inclusion prediction methodology based on ML establishes a connection between the experimental parameters and inclusion characteristics and analyzes the importance of the experimental parameters. Regarding the image analysis, the focus is placed on the classification of different types of inclusions via deep learning, in comparison with data analysis. Finally, further development of inclusion analyses using ML-based methods is recommended. This work paves the way for the application of AI-based methodologies for ultraclean-steel studies from a sustainable metallurgy perspective.
AIM: To perform a bibliometric analysis of publications focusing on inflammatory mechanisms in glaucoma, thereby comprehensively understanding the current research status and identifying potential frontier directions for future studies. METHODS: A systematic search was conducted in the Web of Science Core Collection (WoSCC) database to retrieve relevant literature published from January 1, 2000, to August 31, 2025 (data accessed on September 12, 2025). Multiple data visualization tools were employed to conduct in-depth analyses of the included publications, covering aspects such as publication quantity and quality, evolutionary trends of research hotspots, keyword co-occurrence networks, and collaborative patterns among countries/regions, institutions, and authors. RESULTS: A total of 3381 articles related to glaucoma inflammation were extracted from WoSCC. The analysis showed that the USA had the highest research output in this field (29.04%, n=982), followed by China (18.40%, n=622) and the UK (6.01%, n=203). Based on citation frequency and burst intensity, the USA also ranked as the most influential country. Baudouin C and Sun X were identified as the most productive authors, while Journal of Glaucoma and Investigative Ophthalmology & Visual Science were the journals with the highest number of relevant published articles. Additionally, keyword analysis revealed that “neuroinflammation”, “retinal ganglion cells (RGCs)”, “pathophysiology”, and “traditional Chinese medicine” are emerging research hotspots in the field of immune-inflammatory responses in glaucoma. CONCLUSION: This study presents a comprehensive bibliometric overview of research on glaucoma-related inflammation, indicating that this field has received extensive scientific attention with a steady upward trend in research activity. Furthermore, it establishes a theoretical basis for the development of neuroinflammation-targeted therapeutic strategies for glaucoma and emphasizes the necessity of strengthening interdisciplinary collaboration to promote the clinical translation of research findings.
Rowlands et al. [1] present an analysis of accelerometer data from the UK Biobank cohort, examining variations in the duration, intensity, and accumulation of moderate-intensity physical activity (MPA) and vigorous-intensity physical activity (VPA) sufficient to reduce the risk of all-cause mortality. In this study, the authors asked whether shorter durations (i.e., 1, 2, 3, 4, 5, 10, 15, and 20 min/day) of MPA and VPA, performed continuously or accumulated throughout the day, would reduce the risk of all-cause mortality as much as the longer durations of MPA and VPA recommended in the physical activity (PA) guidelines.
To address the challenge of low survival rates and limited data collection efficiency in current virtual probe deployments, which results from anomaly detection mechanisms in location-based service (LBS) applications, this paper proposes a novel virtual probe deployment method based on user behavioral feature analysis. The core idea is to circumvent LBS anomaly detection by mimicking real-user behavior patterns. First, we design an automated data extraction algorithm that recognizes graphical user interface (GUI) elements to collect spatio-temporal behavior data. Then, by analyzing the automatically collected user data, we identify normal users’ spatio-temporal patterns and extract features such as high-activity time windows and spatial clustering characteristics. Subsequently, an anti-detection scheduling strategy is developed, integrating spatial clustering optimization, load-balanced allocation, and time window control to generate probe scheduling schemes. Additionally, a self-correction mechanism based on an exponential backoff strategy is implemented to rectify anomalous behaviors and maintain system stability. Experiments in real-world environments demonstrate that the proposed method significantly outperforms baseline methods in terms of both probe ban rate and task completion rate, while maintaining high time efficiency. This study provides a more reliable and clandestine solution for geosocial data collection and lays the foundation for building more robust virtual probe systems.
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data it generates. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes were chosen from a white mouse gene expression dataset covering two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After the gene chip information was preprocessed, the data were imported into the model, preliminary results were calculated, permutation tests were performed, and the preliminary results were biologically validated using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection; the analysis categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
The exponential expansion of the Internet of Things (IoT), Industrial Internet of Things (IIoT), and Transportation Management of Things (TMoT) produces vast amounts of real-time streaming data. Ensuring system dependability, operational efficiency, and security depends on the identification of anomalies in these dynamic and resource-constrained systems. Due to their high computational requirements and inability to efficiently process continuous data streams, traditional anomaly detection techniques often fail in IoT systems. This work presents a resource-efficient adaptive anomaly detection model for real-time streaming data in IoT systems. Extensive experiments were carried out on multiple real-world datasets, achieving an average accuracy score of 96.06% with an execution time close to 7.5 milliseconds for each individual streaming data point, demonstrating its potential for real-time, resource-constrained applications. The model uses Principal Component Analysis (PCA) for dimensionality reduction and a Z-score technique for anomaly detection. It maintains a low computational footprint with a sliding window mechanism, enabling incremental data processing and identification of both transient and sustained anomalies without storing historical data. The system uses a Multivariate Linear Regression (MLR) based imputation technique that estimates missing or corrupted sensor values, preserving data integrity prior to anomaly detection. The suggested solution is appropriate for many uses in smart cities, industrial automation, environmental monitoring, IoT security, and intelligent transportation systems, and is particularly well-suited for resource-constrained edge devices.
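The sliding-window Z-score step described in the abstract above can be sketched for a single sensor channel. This is an illustrative simplification, not the authors' model: the PCA and MLR imputation stages are omitted, and the function name, window size, threshold, and sample readings are assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, pstdev


def zscore_stream(values, window=5, threshold=3.0):
    """Flag each incoming value whose Z-score against the recent window exceeds the threshold."""
    recent = deque(maxlen=window)  # only the last `window` points are ever kept
    flags = []
    for v in values:
        is_anomaly = False
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Guard against a zero-variance window before dividing.
            is_anomaly = sigma > 0 and abs(v - mu) / sigma > threshold
        flags.append(is_anomaly)
        recent.append(v)  # the oldest point is discarded automatically
    return flags


# Hypothetical sensor readings with one spike.
readings = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 25.0, 10.0]
flags = zscore_stream(readings)
```

The bounded `deque` is what keeps memory constant per channel: no history beyond the current window is stored, which matches the low-footprint goal stated in the abstract.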
This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it clarifies that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, banks should strengthen data collection management, enhance data processing capabilities, and innovate big data application models. The paper thereby provides a reference for bank big data practices and supports the transformation and upgrading of the banking industry in the context of legal digital currencies.
With the accelerating intelligent transformation of energy systems, the monitoring of equipment operation status and the optimization of production processes in thermal power plants face the challenge of integrating multi-source heterogeneous data. In view of the heterogeneous characteristics of physical sensor data, including the temperature, vibration, and pressure data generated by boilers, steam turbines, and other key equipment, and of the real-time working condition data of the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. Built on a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize the collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines the Kalman filter, the wavelet transform, and Bayesian estimation to solve the problems of time series alignment and dimension difference. Simulation results show that the data fusion accuracy can be improved to more than 98% and that the calculation delay can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant diffusion model and supports the cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 seconds, and the data consistency verification accuracy reaches 99.5%.
Strong noise has increasingly become a bottleneck restricting the precision and application range of electromagnetic exploration methods. Suppressing noise and extracting effective electromagnetic response information against a strong noise background is a crucial scientific task. To solve the noise suppression problem of the controlled-source electromagnetic method in areas of strong interference, we propose a data processing approach based on complex-plane 2D k-means clustering. Exploiting the stability of the controlled-source signal response, clustering analysis is applied to classify the spectra of different sources and noises in multiple time segments. Identifying the power spectra with controlled-source characteristics helps to improve the quality of the extracted controlled-source response. This paper presents the principle and workflow of the proposed algorithm and demonstrates its feasibility and effectiveness through synthetic and real data examples. The results show that, compared with the conventional robust denoising method, the clustering algorithm suppresses common noise more strongly, can identify high-quality signals, and improves the quality of the preprocessed data for the controlled-source electromagnetic method.
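The idea of clustering per-segment spectra in the complex plane can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the spectral values are made up, the number of clusters and the initial centers are supplied by hand, and "the tightest cluster is the controlled-source response" is an assumed selection rule motivated by the stability argument in the abstract.

```python
def kmeans_complex(points, centers, iterations=20):
    """Plain k-means on complex numbers, treating (real, imag) as 2D coordinates."""
    centers = list(centers)

    def assign():
        # abs(p - c) is the Euclidean distance in the complex plane.
        return [min(range(len(centers)), key=lambda k: abs(p - centers[k])) for p in points]

    for _ in range(iterations):
        labels = assign()
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # leave an empty cluster's center unchanged
                centers[k] = sum(members) / len(members)
    return assign(), centers


def most_stable_cluster(points, labels, centers):
    """Return the index of the cluster with the smallest mean distance to its center."""
    spreads = {}
    for k, c in enumerate(centers):
        members = [p for p, l in zip(points, labels) if l == k]
        if members:
            spreads[k] = sum(abs(p - c) for p in members) / len(members)
    return min(spreads, key=spreads.get)


# Hypothetical spectral estimates at one frequency, one value per time segment.
segments = [1.0 + 1.0j, 1.02 + 0.98j, 0.99 + 1.01j, 1.01 + 1.0j,  # stable source response
            3.0 - 2.0j, 4.0 - 1.0j, 2.5 - 3.0j]                    # noisy segments
labels, centers = kmeans_complex(segments, centers=[segments[0], segments[4]])
stable = most_stable_cluster(segments, labels, centers)
```

Segments assigned to the `stable` cluster would then be the ones retained for estimating the response; real noise is rarely as compact as in this toy data, so more clusters or a different selection rule may be needed in practice.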
Fracture volume changes gradually as fracture pressure is depleted during production. However, few flowback models available so far can estimate the loss of fracture volume using pressure transient and rate transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before the child wells are fractured. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material balance time (MBT) diagnostic plots using both initial and second flowback data (FB_i and FB_s, respectively) from six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FB_s period is higher than that during the FB_i period, indicating a potential loss of fracture volume from the FB_i to the FB_s period. We estimate the changes in effective fracture volume (V_ef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. V_ef during FB_s is in general 3%-45% lower than that during FB_i. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). Compaction drive is the dominant mechanism during both flowback periods, but its contribution is reduced by 16% in the FB_s period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to the pressure depletion in fractures, which occurs during the production period and can extend to 800 days.
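The RNP-versus-MBT diagnostic described in the abstract above can be sketched numerically. The definitions RNP = (p_initial - p_flowing) / q and MBT = cumulative volume / q are the standard ones; the closing step, however, is an assumption made for illustration: it treats the fracture network as a closed tank so that the RNP slope scales as 1 / (c_t * V_ef). The paper's actual model may differ, and all numbers and function names below are hypothetical.

```python
def rnp_mbt(p_initial, p_flowing, rates, cum_volumes):
    """Rate-normalized pressure and material balance time for each flowback sample."""
    rnp = [(p_initial - p) / q for p, q in zip(p_flowing, rates)]
    mbt = [Q / q for Q, q in zip(cum_volumes, rates)]
    return rnp, mbt


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def volume_ratio(slope_i, ct_i, slope_s, ct_s):
    """V_ef(second) / V_ef(initial), ASSUMING slope = 1 / (c_t * V_ef) in both periods."""
    return (slope_i * ct_i) / (slope_s * ct_s)


# Hypothetical flowback samples: constant rate, linearly declining pressure.
rates = [100.0, 100.0, 100.0, 100.0]
cum_volumes = [100.0, 200.0, 300.0, 400.0]
p_flowing = [4300.0, 4100.0, 3900.0, 3700.0]
rnp, mbt = rnp_mbt(5000.0, p_flowing, rates, cum_volumes)
slope_initial = least_squares_slope(mbt, rnp)

# A steeper second-flowback slope at equal compressibility implies a smaller V_ef.
ratio = volume_ratio(slope_initial, 1e-4, 2.5, 1e-4)
```

Under this assumption a ratio of 0.8 would correspond to a 20% loss in effective fracture volume, which falls inside the 3%-45% range reported in the abstract.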
With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. The emergence of big data technology provides a powerful means for in-depth analysis of these data and insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology in the analysis of user behavior on e-commerce platforms, including the technical methods of data collection, storage, processing and analysis, as well as the specific applications in the construction of user profiles, precision marketing, personalized recommendation, user retention and churn analysis, etc., and discusses the challenges and countermeasures faced in the application. Through the study of actual cases, it demonstrates the remarkable effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and user experience.
Objective: To identify core acupoint patterns and elucidate the molecular mechanisms of acupuncture for primary depressive disorder (PDD) through data mining and network analysis. Methods: A comprehensive literature search was conducted across PubMed, Embase, Ovid Technologies (OVID), Web of Science, Cochrane Library, China National Knowledge Infrastructure (CNKI), China National Knowledge Infrastructure Database (VIP), Wanfang Data, and SinoMed Database from database inception to January 31, 2025, for clinical studies on acupuncture treatment of PDD. Descriptive statistics, high-frequency acupoint analysis, degree and betweenness centrality evaluation, and core acupoint prescription mining identified the predominant therapeutic combinations for PDD. Network acupuncture was used to predict therapeutic targets for the core acupoint prescription. Subsequent protein-protein interaction (PPI) network and molecular complex detection (MCODE) analyses were conducted to identify the key targets and functional modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses explored the underlying biological mechanisms of the core acupoint prescription in treating PDD. Results: A total of 57 acupoint prescriptions underwent systematic analysis. The core therapeutic combination comprised Baihui (GV20), Yintang (GV29), Neiguan (PC6), Hegu (LI4), and Shenmen (HT7). Network acupuncture analysis identified 88 potential therapeutic targets (79 overlapping with PDD), while PPI network analysis revealed central regulatory nodes, including interleukin (IL)-6, IL-1β, tumor necrosis factor (TNF)-α, toll-like receptor 4 (TLR4), IL-10, brain-derived neurotrophic factor (BDNF), transforming growth factor (TGF)-β1, C-X-C motif chemokine ligand 10 (CXCL10), mitogen-activated protein kinase 3 (MAPK3), and nitric oxide synthase 1 (NOS1). MCODE-based modular analysis further elucidated three functionally coherent clusters: inflammation-homeostasis (score=6.571), plasticity-neurotransmission (score=3.143), and oxidative stress (score=3.000). GO and KEGG analyses demonstrated significant enrichment of the MAPK, phosphoinositide 3-kinase/protein kinase B (PI3K/Akt), and hypoxia-inducible factor (HIF)-1 signaling pathways. These mechanistic insights suggest that the antidepressant effects are mediated through neuroinflammatory regulation, neuroplasticity restoration, and immune-oxidative stress homeostasis. Conclusion: This study reveals that acupuncture alleviates depression through a multi-level mechanism, primarily involving neuroinflammation suppression, neuroplasticity enhancement, and oxidative stress regulation. These findings systematically clarify the underlying mechanisms of acupuncture’s antidepressant effects and identify novel therapeutic targets for further mechanistic research.
Semantic communication (SemCom) aims to achieve high-fidelity information delivery at low communication cost by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can fully utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to rethink the CRC mechanism so as to reap the benefits of both SemCom and HARQ more effectively. In this paper, building on swin transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which extracts the inner topological and geometric information of images to capture semantic information and determine whether re-transmission is necessary. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and demonstrate the superiority of the TDA-based error detection method in image transmission.
Abstract: Earthquakes are highly destructive spatio-temporal phenomena whose analysis is essential for disaster preparedness and risk mitigation. Modern seismological research produces vast volumes of heterogeneous data from seismic networks, satellite observations, and geospatial repositories, creating the need for scalable infrastructures capable of integrating and analyzing such data to support intelligent decision-making. Data warehousing technologies provide a robust foundation for this purpose; however, existing earthquake-oriented data warehouses remain limited, often relying on simplified schemas, domain-specific analytics, or cataloguing efforts. This paper presents the design and implementation of a spatio-temporal data warehouse for seismic activity. The framework integrates spatial and temporal dimensions in a unified schema and introduces a novel array-based approach for managing many-to-many relationships between facts and dimensions without intermediate bridge tables. A comparative evaluation against a conventional bridge-table schema demonstrates that the array-based design improves fact-centric query performance, while the bridge-table schema remains advantageous for dimension-centric queries. To reconcile these trade-offs, a hybrid schema is proposed that retains both representations, ensuring balanced efficiency across heterogeneous workloads. The proposed framework demonstrates how spatio-temporal data warehousing can address schema complexity, improve query performance, and support multidimensional visualization. In doing so, it provides a foundation for integrating seismic analysis into broader big data-driven intelligent decision systems for disaster resilience, risk mitigation, and emergency management.
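The trade-off between the two schemas described in this abstract can be illustrated with a small in-memory sketch. This is not the paper's implementation: the "region" dimension, the sample events, and all function names are hypothetical, and plain Python dictionaries stand in for database tables and indexes.

```python
from collections import defaultdict

# Hypothetical sample data: each seismic event may affect several regions.
# Array-based schema: each fact row carries its dimension keys in an array.
facts_array = {
    1: {"magnitude": 5.1, "region_ids": [10, 11]},
    2: {"magnitude": 6.3, "region_ids": [11]},
    3: {"magnitude": 4.7, "region_ids": [10, 12]},
}

# Bridge-table schema: one (event_id, region_id) row per relationship.
bridge = [(eid, rid) for eid, fact in facts_array.items() for rid in fact["region_ids"]]

# An index on the bridge's region column, as a database might maintain.
bridge_by_region = defaultdict(list)
for event_id, region_id in bridge:
    bridge_by_region[region_id].append(event_id)


def regions_of_event_array(event_id):
    """Fact-centric query on arrays: read the keys straight off the fact row."""
    return facts_array[event_id]["region_ids"]


def regions_of_event_bridge(event_id):
    """Fact-centric query on the bridge: requires visiting the bridge rows."""
    return [rid for eid, rid in bridge if eid == event_id]


def events_in_region_array(region_id):
    """Dimension-centric query on arrays: every fact's array is inspected."""
    return [eid for eid, fact in facts_array.items() if region_id in fact["region_ids"]]


def events_in_region_bridge(region_id):
    """Dimension-centric query on the bridge: one indexed lookup."""
    return bridge_by_region[region_id]
```

Both representations return the same answers; they differ in which direction of lookup is direct, which is the motivation the abstract gives for a hybrid schema that keeps both.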
Funding: Project supported by the Anhui Provincial Natural Science Foundation (Grant No. 2308085MA19), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA0410401), the National Natural Science Foundation of China (Grant No. 52202120), the National Key Research and Development Program of China (Grant No. 2023YFA1609800), and the USTC Research Funds of the Double First-Class Initiative (Grant No. YD2310002013).
Abstract: Small-angle X-ray scattering (SAXS) is an advanced technique for characterizing the particle size distribution (PSD) of nanoparticles. However, the ill-posed nature of inverse problems in SAXS data analysis often reduces the accuracy of conventional methods. This article presents GranuSAS, a user-friendly software package for PSD analysis that employs an algorithm integrating truncated singular value decomposition (TSVD) with the Chahine method. The approach uses TSVD for data preprocessing, generating a set of initial solutions with noise suppression. A high-quality initial solution is then selected via the L-curve method. This candidate solution is iteratively refined by the Chahine algorithm, which enforces constraints such as non-negativity and improves physical interpretability. Most importantly, GranuSAS employs a parallel architecture that simultaneously yields inversion results from multiple shape models and, by evaluating the accuracy of each model's reconstructed scattering curve, offers a suggestion for model selection in material systems. To systematically validate the accuracy and efficiency of the software, verification was performed using both simulated and experimental datasets. The results demonstrate that the proposed software delivers both satisfactory accuracy and reliable computational efficiency. It provides an easy-to-use and reliable tool for researchers in materials science, helping them fully exploit the potential of SAXS in nanoparticle characterization.
Abstract: As global climate change intensifies, the power industry, a major source of carbon emissions, plays a pivotal role in achieving carbon peaking and neutrality goals through its low-carbon transition. The carbon management systems of traditional power plants can no longer meet the demands of high-precision, real-time monitoring. Smart power plants now offer innovative solutions for carbon emission tracking and intelligent analysis by integrating IoT, big data, and AI technologies. Current research predominantly focuses on optimizing individual processes and lacks systematic exploration of comprehensive dynamic monitoring and intelligent decision-making across the entire workflow. To address this gap, we propose a smart carbon emission monitoring and analysis platform for power plants that integrates IoT sensing, multimodal data analytics, and AI-driven decision-making. The platform establishes a multi-source sensor network to collect emissions data throughout the fuel combustion, auxiliary equipment operation, and waste treatment processes. It combines carbon emission factor analysis with machine learning models to enable real-time emission calculations, and uses long short-term memory networks to predict future emission trends.
Abstract: The application and development of wide-area measurement systems (WAMS) has enabled many applications and led to several requirements based on dynamic measurement data. Such data are transmitted as a big data information flow. To ensure effective transmission of wide-frequency electrical information by the communication protocol of a WAMS, this study performs real-time traffic monitoring and analysis of the data network of a power information system and establishes corresponding network optimization strategies to solve existing transmission problems. The study uses the traffic analysis results obtained from the current real-time dynamic monitoring system to design an optimization strategy covering three progressive levels: the underlying communication protocol, the source data, and the transmission process. Tests validate that the optimization of the system structure and the scheduling optimization of data information are feasible and practical.
Abstract: Opinion (sentiment) analysis on big data streams, from the constantly generated text streams on social media networks to hundreds of millions of online consumer reviews, provides organizations in every field with opportunities to discover valuable intelligence from massive user-generated text streams. However, traditional content analysis frameworks cannot efficiently handle the unprecedented volume of unstructured text streams or the complexity of text analysis tasks required for real-time opinion analysis on big data streams. In this paper, we propose a parallel real-time sentiment analysis system, Social Media Data Stream Sentiment Analysis Service (SMDSSAS), that performs multiple phases of sentiment analysis on social media text streams effectively in real time, with two fully analytic opinion mining models to combat the scale of text data streams and the complexity of sentiment analysis on unstructured text streams. We propose two aspect-based opinion mining models, the Deterministic and Probabilistic sentiment models, for real-time sentiment analysis on data streams related to a user-given topic. Experiments on Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election, for real-time analysis of public opinion toward the two presidential candidates, showed that the proposed system correctly predicted Donald Trump as the winner of the 2016 Presidential election. The cross-validation results showed that the proposed sentiment models, together with the real-time streaming components of the framework, analyzed opinions on the two presidential candidates effectively, with an average accuracy of 81% for the Deterministic model and 80% for the Probabilistic model, which are 1% - 22% improvements over results in the existing literature.
Abstract: In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations' enterprise systems. This phenomenon provides organizations with unprecedented opportunities to tap into big data to mine valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is to illustrate the development of a novel big data stream analytics framework named BDSASA, which leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research is that organizations can apply the framework to analyze consumers' product preferences and hence develop more effective marketing and production strategies.
Abstract: With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study explores development strategies for real-time data analysis and decision-support systems and analyzes their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
Funding: Support from the National Key Research and Development Program of China (No. 2024YFB3713705) is acknowledged. Wangzhong Mu would like to acknowledge the Strategic Mobility, Sweden (SSF, No. SM22-0039), the Swedish Foundation for International Cooperation in Research and Higher Education (STINT, No. IB2022-9228), and Jernkontoret (Sweden) for supporting this clean steel research. Gonghao Lian would like to acknowledge the China Scholarship Council (CSC, No. 202306080032).
Abstract: The detection and characterization of non-metallic inclusions are essential for clean steel production. Recently, imaging analysis combined with high-dimensional data processing of metallic materials using artificial intelligence (AI)-based machine learning (ML) has developed rapidly. This technique has achieved impressive results in inclusion classification in process metallurgy. The present study surveys ML modeling of inclusion prediction in advanced steels, including the detection, classification, and feature prediction of inclusions in different steel grades. Studies on clean steel with different features based on data and image analysis via ML are summarized. Regarding data analysis, the ML-based inclusion prediction methodology establishes a connection between the experimental parameters and inclusion characteristics and analyzes the importance of the experimental parameters. Regarding image analysis, the focus is on the classification of different types of inclusions via deep learning, in comparison with data analysis. Finally, further development of inclusion analyses using ML-based methods is recommended. This work paves the way for the application of AI-based methodologies to ultraclean-steel studies from a sustainable metallurgy perspective.
Funding: Supported by the National Natural Science Foundation of China (No. 82074500), the Beijing Natural Science Foundation (No. 7252273), the CACMS Innovation Fund (No. CI2021A02605), the Administration of Traditional Chinese Medicine of Zhejiang Province (No. 2024ZR029), and the Science and Technology Program of Wenzhou City (No. Y2023210).
Abstract: AIM: To perform a bibliometric analysis of publications focusing on inflammatory mechanisms in glaucoma, thereby comprehensively understanding the current research status and identifying potential frontier directions for future studies. METHODS: A systematic search was conducted in the Web of Science Core Collection (WoSCC) database to retrieve relevant literature published from January 1, 2000, to August 31, 2025 (data accessed on September 12, 2025). Multiple data visualization tools were employed to conduct in-depth analyses of the included publications, covering publication quantity and quality, evolutionary trends of research hotspots, keyword co-occurrence networks, and collaborative patterns among countries/regions, institutions, and authors. RESULTS: A total of 3381 articles related to glaucoma inflammation were extracted from WoSCC. The analysis showed that the USA had the highest research output in this field (29.04%, n=982), followed by China (18.40%, n=622) and the UK (6.01%, n=203). Based on citation frequency and burst intensity, the USA also ranked as the most influential country. Baudouin C and Sun X were identified as the most productive authors, while Journal of Glaucoma and Investigative Ophthalmology & Visual Science were the journals with the highest number of relevant published articles. Additionally, keyword analysis revealed that “neuroinflammation”, “retinal ganglion cells (RGCs)”, “pathophysiology”, and “traditional Chinese medicine” are emerging research hotspots in the field of immune-inflammatory responses in glaucoma. CONCLUSION: This study presents a comprehensive bibliometric overview of research on glaucoma-related inflammation, indicating that this field has received extensive scientific attention with a steady upward trend in research activity. Furthermore, it establishes a theoretical basis for the development of neuroinflammation-targeted therapeutic strategies for glaucoma and emphasizes the necessity of strengthening interdisciplinary collaboration to promote the clinical translation of research findings.
Abstract: Rowlands et al. [1] present an analysis of accelerometer data from the UK Biobank cohort, examining variations in the duration, intensity, and accumulation of moderate-intensity physical activity (MPA) and vigorous-intensity physical activity (VPA) sufficient to reduce the risk of all-cause mortality. In this study, the authors asked whether shorter durations (i.e., 1, 2, 3, 4, 5, 10, 15, and 20 min/day) of MPA and VPA, performed continuously or accumulated throughout the day, would reduce the risk of all-cause mortality as much as the longer durations of MPA and VPA recommended in the physical activity (PA) guidelines.
Funding: Supported by the National Natural Science Foundation of China (No. U23A20305), the National Key Research and Development Program of China (No. 2022YFB3102900), the Innovation Scientists and Technicians Troop Construction Projects of Henan Province, China (No. 254000510007), and the Key Research and Development Project of Henan Province (No. 221111321200).
Abstract: To address the low survival rates and limited data collection efficiency of current virtual probe deployments, which result from anomaly detection mechanisms in location-based service (LBS) applications, this paper proposes a novel virtual probe deployment method based on user behavioral feature analysis. The core idea is to circumvent LBS anomaly detection by mimicking real-user behavior patterns. First, we design an automated data extraction algorithm that recognizes graphical user interface (GUI) elements to collect spatio-temporal behavior data. Then, by analyzing the automatically collected user data, we identify normal users' spatio-temporal patterns and extract features such as high-activity time windows and spatial clustering characteristics. Subsequently, an anti-detection scheduling strategy is developed, integrating spatial clustering optimization, load-balanced allocation, and time window control to generate probe scheduling schemes. Additionally, a self-correction mechanism based on an exponential backoff strategy is implemented to rectify anomalous behaviors and maintain system stability. Experiments in real-world environments demonstrate that the proposed method significantly outperforms baseline methods in terms of both probe ban rate and task completion rate, while maintaining high time efficiency. This study provides a more reliable and clandestine solution for geosocial data collection and lays the foundation for building more robust virtual probe systems.
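The abstract mentions a self-correction mechanism based on exponential backoff without giving its parameters. As a generic sketch of that strategy only, the following computes capped, exponentially growing retry delays; the function name, the base delay, the growth factor, and the cap are all illustrative assumptions, not values from the paper.

```python
import random


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6, jitter=0.0, rng=None):
    """Return the wait (in seconds) before each retry attempt.

    The delay grows as base * factor**attempt and is clipped at `cap`.
    An optional random jitter (a fraction of the delay) spreads retries out.
    All parameter values here are hypothetical defaults.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * factor ** attempt)
        if jitter:
            delay += rng.uniform(0, jitter * delay)
        delays.append(delay)
    return delays


# Without jitter the schedule is deterministic: 1, 2, 4, 8, 16, then capped at 20.
delays = backoff_delays(base=1.0, factor=2.0, cap=20.0, attempts=6)
```

A probe that is flagged would wait for the next delay in the schedule before resuming, so repeated anomalies lead to progressively longer pauses.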
Abstract: DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For the data, 1176 genes were selected from a white mouse gene expression dataset covering two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed, with GSEA used to biologically validate the preliminary results. The final dataset consists of 20 groups of gene expression data from pneumococcal infection and categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
Funding: Funded by the Ongoing Research Funding Program (ORF-2025-890), King Saud University, Riyadh, Saudi Arabia, and supported by the Competitive Research Fund of the University of Aizu, Japan.
Abstract: The exponential expansion of the Internet of Things (IoT), Industrial Internet of Things (IIoT), and Transportation Management of Things (TMoT) produces vast amounts of real-time streaming data. Ensuring system dependability, operational efficiency, and security depends on identifying anomalies in these dynamic and resource-constrained systems. Because of their high computational requirements and inability to efficiently process continuous data streams, traditional anomaly detection techniques often fail in IoT systems. This work presents a resource-efficient adaptive anomaly detection model for real-time streaming data in IoT systems. Extensive experiments were carried out on multiple real-world datasets, achieving an average accuracy of 96.06% with an execution time close to 7.5 milliseconds per streaming data point, demonstrating its potential for real-time, resource-constrained applications. The model uses Principal Component Analysis (PCA) for dimensionality reduction and a Z-score technique for anomaly detection. It maintains a low computational footprint with a sliding window mechanism, enabling incremental data processing and identification of both transient and sustained anomalies without storing historical data. The system uses a Multivariate Linear Regression (MLR)-based imputation technique that estimates missing or corrupted sensor values, preserving data integrity prior to anomaly detection. The proposed solution is appropriate for many uses in smart cities, industrial automation, environmental monitoring, IoT security, and intelligent transportation systems, and is particularly well suited to resource-constrained edge devices.
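The Z-score and sliding-window parts of this pipeline can be sketched for a single sensor channel as follows. This is a minimal illustration, not the paper's model: the PCA and MLR imputation stages are omitted, and the class name, window size, and threshold are assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, pstdev


class SlidingZScoreDetector:
    """Flag a value when its Z-score against a bounded recent window is too large."""

    def __init__(self, window_size=20, threshold=3.0):
        # A bounded deque keeps only the most recent values, so no history accumulates.
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, value):
        """Process one streaming value and return True if it looks anomalous."""
        is_anomaly = False
        if len(self.window) >= 2:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        # Anomalous values are kept out of the window so they do not shift the
        # baseline; a sustained anomaly therefore keeps being flagged.
        if not is_anomaly:
            self.window.append(value)
        return is_anomaly


detector = SlidingZScoreDetector(window_size=10, threshold=3.0)
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 25.0, 10.1]
flags = [detector.update(x) for x in stream]
```

In this toy stream only the spike to 25.0 is flagged. Each update touches at most `window_size` values, which is what keeps the per-point cost small on edge devices.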
Abstract: This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it shows that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, banks must strengthen data collection management, enhance data processing capabilities, and innovate big data application models. The paper thereby provides a reference for bank big data practices and promotes the transformation and upgrading of the banking industry in the context of legal digital currencies.
Abstract: As the intelligent transformation of energy systems accelerates, the monitoring of equipment operation status and the optimization of production processes in thermal power plants face the challenge of multi-source heterogeneous data integration. In view of the heterogeneous characteristics of physical sensor data, including temperature, vibration, and pressure data generated by boilers, steam turbines, and other key equipment, and of real-time working-condition data from the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. Built on a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize the collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines the Kalman filter, wavelet transform, and Bayesian estimation to solve the problems of time series alignment and dimension differences. Simulation results show that the data fusion accuracy can be improved to more than 98% and the calculation delay can be kept within 500 ms. The data analysis module integrates the Dymola simulation model and the AERMOD pollutant diffusion model and supports cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 seconds, and the data consistency verification accuracy reaches 99.5%.
Funding: Supported by the National Key Research and Development Program Project of China (Grant No. 2023YFF0718003) and the Key Research and Development Plan Project of Yunnan Province (Grant No. 202303AA080006).
Abstract: Strong noise has increasingly become a bottleneck restricting the precision and application range of electromagnetic exploration methods. Suppressing noise and extracting effective electromagnetic response information against a strong noise background is a crucial scientific task. To solve the noise suppression problem of the controlled-source electromagnetic method in areas of strong interference, we propose a data processing approach based on complex-plane 2D k-means clustering. Exploiting the stability of the controlled-source signal response, clustering analysis is applied to classify the spectra of different sources and noises in multiple time segments. Identifying the power spectra with controlled-source characteristics helps improve the quality of the extracted controlled-source response. This paper presents the principle and workflow of the proposed algorithm and demonstrates its feasibility and effectiveness through synthetic and real data examples. The results show that, compared with the conventional robust denoising method, the clustering algorithm suppresses common noise more strongly, can identify high-quality signals, and improves the quality of preprocessed data for the controlled-source electromagnetic method.
Abstract: The fracture volume changes gradually as fracture pressure depletes during production. However, few flowback models available so far can estimate the fracture volume loss using pressure transient and rate transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before the fracturing of child wells. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material balance time (MBT) diagnostic plots using both initial and second flowback data (FB1 and FB2, respectively) of six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FB2 period is higher than that during the FB1 period, indicating a potential loss of fracture volume from the FB1 to the FB2 period. We estimate the changes in effective fracture volume (V_ef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. V_ef during FB2 is in general 3%-45% lower than that during FB1. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is compaction drive, but its contribution is reduced by 16% in the FB2 period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to the pressure depletion in fractures, which occurs during the production period and can extend to 800 days.
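The RNP-versus-MBT diagnostic described here can be sketched numerically. The definitions used below, RNP = (p_i - p_wf)/q and MBT = cumulative volume / q, are the common textbook forms, and the volume comparison assumes a simple closed-tank relation in which the RNP slope equals 1/(C_t · V_ef). Both are assumptions made for illustration, since the abstract does not state the exact model; the function names and the synthetic data are hypothetical.

```python
def rnp_mbt(p_initial, p_wf, rates, dt=1.0):
    """Return (MBT, RNP) points from flowing pressures and rates sampled every `dt`."""
    cumulative = 0.0
    points = []
    for p, q in zip(p_wf, rates):
        cumulative += q * dt                     # cumulative produced volume
        mbt = cumulative / q                     # material balance time
        rnp = (p_initial - p) / q                # rate-normalized pressure
        points.append((mbt, rnp))
    return points


def least_squares_slope(points):
    """Ordinary least-squares slope of RNP against MBT."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


def fracture_volume_ratio(slope_1, ct_1, slope_2, ct_2):
    """V_ef(FB2) / V_ef(FB1), assuming slope = 1 / (C_t * V_ef) in each period."""
    return (slope_1 * ct_1) / (slope_2 * ct_2)


# Synthetic example: constant rate, linearly declining flowing pressure.
points = rnp_mbt(5000.0, [4990.0, 4980.0, 4970.0, 4960.0, 4950.0], [100.0] * 5)
slope_fb1 = least_squares_slope(points)
ratio = fracture_volume_ratio(slope_fb1, 1e-5, 0.125, 1e-5)
```

With equal compressibility in both periods, a steeper second-flowback slope maps directly to a smaller effective fracture volume, which is the qualitative behavior the abstract reports.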
Abstract: With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. Big data technology provides a powerful means of analyzing these data in depth and gaining insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology to the analysis of user behavior on e-commerce platforms, including the technical methods of data collection, storage, processing, and analysis, as well as specific applications in user profile construction, precision marketing, personalized recommendation, and user retention and churn analysis, and discusses the challenges faced in application and the corresponding countermeasures. Through case studies, it demonstrates the effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and improving user experience.
Abstract: Objective: To identify core acupoint patterns and elucidate the molecular mechanisms of acupuncture for primary depressive disorder (PDD) through data mining and network analysis. Methods: A comprehensive literature search was conducted across PubMed, Embase, Ovid Technologies (OVID), Web of Science, Cochrane Library, China National Knowledge Infrastructure (CNKI), China National Knowledge Infrastructure Database (VIP), Wanfang Data, and SinoMed Database from database inception to January 31, 2025, for clinical studies on acupuncture treatment of PDD. Descriptive statistics, high-frequency acupoint analysis, degree and betweenness centrality evaluation, and core acupoint prescription mining identified the predominant therapeutic combinations for PDD. Network acupuncture was used to predict therapeutic targets for the core acupoint prescription. Subsequent protein-protein interaction (PPI) network and molecular complex detection (MCODE) analyses were conducted to identify the key targets and functional modules. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses explored the underlying biological mechanisms of the core acupoint prescription in treating PDD. Results: A total of 57 acupoint prescriptions underwent systematic analysis. The core therapeutic combination comprised Baihui (GV20), Yintang (GV29), Neiguan (PC6), Hegu (LI4), and Shenmen (HT7). Network acupuncture analysis identified 88 potential therapeutic targets (79 overlapping with PDD), while PPI network analysis revealed central regulatory nodes, including interleukin (IL)-6, IL-1β, tumor necrosis factor (TNF)-α, toll-like receptor 4 (TLR4), IL-10, brain-derived neurotrophic factor (BDNF), transforming growth factor (TGF)-β1, C-X-C motif chemokine ligand 10 (CXCL10), mitogen-activated protein kinase 3 (MAPK3), and nitric oxide synthase 1 (NOS1). MCODE-based modular analysis further elucidated three functionally coherent clusters: inflammation-homeostasis (score = 6.571), plasticity-neurotransmission (score = 3.143), and oxidative stress (score = 3.000). GO and KEGG analyses demonstrated significant enrichment of the MAPK, phosphoinositide 3-kinase/protein kinase B (PI3K/Akt), and hypoxia-inducible factor (HIF)-1 signaling pathways. These mechanistic insights suggest that the antidepressant effects are mediated through neuroinflammatory regulation, neuroplasticity restoration, and immune-oxidative stress homeostasis. Conclusion: This study reveals that acupuncture alleviates depression through a multi-level mechanism, primarily involving neuroinflammation suppression, neuroplasticity enhancement, and oxidative stress regulation. These findings systematically clarify the underlying mechanisms of acupuncture's antidepressant effects and identify novel therapeutic targets for further mechanistic research.
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2024YFE0200600, in part by the National Natural Science Foundation of China under Grant 62071425, in part by the Zhejiang Key Research and Development Plan under Grant 2022C01093, in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LR23F010005, in part by the National Key Laboratory of Wireless Communications Foundation under Grant 2023KP01601, and in part by the Big Data and Intelligent Computing Key Lab of CQUPT under Grant BDIC-2023-B-001.
Abstract: Semantic communication (SemCom) aims to achieve high-fidelity information delivery at low communication cost by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can make fuller use of the information semantics. However, given the possible semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to rethink the CRC mechanism so as to reap the benefits of both SemCom and HARQ more effectively. In this paper, building on swin transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which extracts the inner topological and geometric information of images to capture semantic information and determine whether re-transmission is necessary. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and demonstrate the superiority of the TDA-based error detection method in image transmission.
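The limitation of a bit-level CRC that motivates this work can be shown in a few lines. The sketch below uses the standard-library `zlib.crc32`; the pixel values are made up, and the tolerance-based check is only a simple stand-in for a content-aware test, not the TDA-based method proposed in the paper.

```python
import zlib


def crc_ok(payload: bytes, expected_crc: int) -> bool:
    """Bit-level check: passes only if the payload is bit-for-bit identical."""
    return zlib.crc32(payload) == expected_crc


def within_tolerance(sent: bytes, received: bytes, tol: int) -> bool:
    """Hypothetical content-aware check: accept small per-pixel intensity errors."""
    return len(sent) == len(received) and all(
        abs(a - b) <= tol for a, b in zip(sent, received)
    )


# Four made-up grayscale pixel intensities and their checksum at the sender.
sent = bytes([120, 121, 122, 123])
crc = zlib.crc32(sent)

# The receiver reconstructs an image that is off by one intensity level in one pixel.
received = bytes([120, 121, 122, 124])

crc_result = crc_ok(received, crc)                        # rejected by the CRC
tolerant_result = within_tolerance(sent, received, tol=2)  # accepted by the tolerant check
```

A visually negligible difference is enough to fail the CRC and trigger a re-transmission, which is the inefficiency a semantics-aware error detector aims to avoid.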