Viral infectious diseases, characterized by their intricate nature and wide-ranging diversity, pose substantial challenges in the domain of data management. The vast volume of data generated by these diseases, spanning from the molecular mechanisms within cells to large-scale epidemiological patterns, has surpassed the capabilities of traditional analytical methods. In the era of artificial intelligence (AI) and big data, there is an urgent necessity for the optimization of these analytical methods to more effectively handle and utilize the information. Despite the rapid accumulation of data associated with viral infections, the lack of a comprehensive framework for integrating, selecting, and analyzing these datasets has left numerous researchers uncertain about which data to select, how to access it, and how to utilize it most effectively in their research. This review endeavors to fill these gaps by exploring the multifaceted nature of viral infectious diseases and summarizing relevant data across multiple levels, from the molecular details of pathogens to broad epidemiological trends. The scope extends from the micro-scale to the macro-scale, encompassing pathogens, hosts, and vectors. In addition to data summarization, this review thoroughly investigates various dataset sources. It also traces the historical evolution of data collection in the field of viral infectious diseases, highlighting the progress achieved over time. Simultaneously, it evaluates the current limitations that impede data utilization. Furthermore, we propose strategies to surmount these challenges, focusing on the development and application of advanced computational techniques, AI-driven models, and enhanced data integration practices. By providing a comprehensive synthesis of existing knowledge, this review is designed to guide future research and contribute to more informed approaches in the surveillance, prevention, and control of viral infectious diseases, particularly within the context of the expanding big-data landscape.
Astronomical spectra are vital for deriving stellar properties, yet low signal-to-noise ratio (SNR) spectra often obscure key features, complicating accurate analysis. This study presents spec-DDPM, a novel deep learning approach based on denoising diffusion probabilistic models (DDPM), aimed at denoising low-SNR spectra to improve stellar parameter estimation. Leveraging the LAMOST DR10 data set, we developed spec-DDPM using a tailored U-Net architecture (spec-Unet) to iteratively predict and remove noise. The model was trained on 28,500 pairs of low- and high-SNR spectra and benchmarked against conventional methods, including Principal Component Analysis, wavelet techniques, and a modified DnCNN model. spec-DDPM demonstrated superior performance, with a reduced Mean Absolute Error, an elevated Structural Similarity Index Measure, and improved spectral-loss metrics. It effectively preserved critical spectral features and corrected continuum distortions. Validation experiments further confirmed its ability to improve stellar parameter estimation with reduced errors. These results underscore spec-DDPM's potential to elevate spectral data quality, offering applications in restoring defective spectra and refining large-scale astronomical surveys. This work highlights the transformative role of deep learning in astronomical data processing.
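The abstract gives no implementation detail, but the iterative "predict and remove noise" procedure it describes matches the standard DDPM reverse process. The sketch below illustrates only that reverse step; the linear noise schedule, the `spec_unet` stand-in, and the spectrum length are assumptions, not the authors' code.

```python
import torch

# Assumed linear noise schedule; the paper's actual schedule is not given.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def denoise(model, x):
    """Standard DDPM reverse process: iteratively predict and remove noise."""
    for t in reversed(range(T)):
        eps = model(x, torch.tensor([t]))              # predicted noise at step t
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / torch.sqrt(alphas[t])   # posterior mean
        if t > 0:                                      # re-inject noise except at t = 0
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x

# Stand-in for spec-Unet: any callable predicting the noise component of (x, t).
spec_unet = lambda x, t: torch.zeros_like(x)
clean = denoise(spec_unet, torch.randn(1, 3800))       # hypothetical spectrum length
```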
With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study aims to explore the development strategies of real-time data analysis and decision-support systems, and to analyze their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
As industrial production progresses toward digitalization, massive amounts of data have been collected, transmitted, and stored, characterized by large scale, high dimensionality, heterogeneity, and spatiotemporal dynamics. The high complexity of industrial big data poses challenges for the practical decision-making of domain experts, leading to ever-increasing needs for integrating computational intelligence and human perception into traditional data analysis. Industrial big data visualization integrates theoretical methods and practical technologies from multiple disciplines, including data mining, information visualization, computer graphics, and human-computer interaction, providing a highly effective means of understanding and exploring complex industrial processes. This review summarizes the state-of-the-art approaches, characterizes them with six visualization methods, and categorizes them based on analytical tasks and applications. Furthermore, key research challenges and potential future directions are identified.
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. In terms of data selection, 1176 genes from the white mouse gene expression dataset under two experimental conditions were chosen, setting up two conditions: pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final analysis yields 20 groups of gene expression data from pneumococcal infection, categorizing functionally related genes based on the similarity of their expression profiles and facilitating the study of genes with unknown functions.
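As a rough illustration of fitting a mixed-effects model to expression data of this shape (infection condition as a fixed effect, gene-specific random intercepts), one might write the following with statsmodels; the column names, effect sizes, and data layout are invented for the sketch, and the paper's actual model specification may differ.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format layout: one row per (gene, array) measurement.
rng = np.random.default_rng(0)
rows = []
for g in (f"g{i}" for i in range(30)):
    base = rng.normal(5.0, 1.0)                  # gene-specific baseline (random effect)
    for cond, eff in [("control", 0.0), ("infected", 0.8)]:
        for _ in range(3):                       # three replicate arrays per condition
            rows.append((g, cond, base + eff + rng.normal(0.0, 0.3)))
df = pd.DataFrame(rows, columns=["gene", "condition", "expression"])

# Fixed effect: infection condition; random intercept per gene.
result = smf.mixedlm("expression ~ condition", df, groups=df["gene"]).fit()
print(result.params["condition[T.infected]"])    # recovers the ~0.8 infection effect
```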
Parametric survival models are essential for analyzing time-to-event data in fields such as engineering and biomedicine. While the log-logistic distribution is popular for its simplicity and closed-form expressions, it often lacks the flexibility needed to capture complex hazard patterns. In this article, we propose a novel extension of the classical log-logistic distribution, termed the new exponential log-logistic (NExLL) distribution, designed to provide enhanced flexibility in modeling time-to-event data with complex failure behaviors. The NExLL model incorporates a new exponential generator to expand the shape adaptability of the baseline log-logistic distribution, allowing it to capture a wide range of hazard rate shapes, including increasing, decreasing, J-shaped, reversed J-shaped, modified bathtub, and unimodal forms. A key feature of the NExLL distribution is its formulation as a mixture of log-logistic densities, offering both symmetric and asymmetric patterns suitable for diverse real-world reliability scenarios. We establish several theoretical properties of the model, including closed-form expressions for its probability density function, cumulative distribution function, moments, hazard rate function, and quantiles. Parameter estimation is performed using seven classical estimation techniques, with extensive Monte Carlo simulations used to evaluate and compare their performance under various conditions. The practical utility and flexibility of the proposed model are illustrated using two real-world datasets from reliability and engineering applications, where the NExLL model demonstrates superior fit and predictive performance compared to existing log-logistic-based models. This contribution advances the toolbox of parametric survival models, offering a robust alternative for modeling complex aging and failure patterns in reliability, engineering, and other applied domains.
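The NExLL-specific closed forms appear in the article itself rather than the abstract, so they are not reproduced here. For orientation, the baseline log-logistic distribution with scale \(\alpha > 0\) and shape \(\beta > 0\) has the standard closed-form expressions

```latex
F(x) = \frac{x^{\beta}}{\alpha^{\beta} + x^{\beta}}, \qquad
f(x) = \frac{(\beta/\alpha)\,(x/\alpha)^{\beta-1}}{\left[1 + (x/\alpha)^{\beta}\right]^{2}}, \qquad
h(x) = \frac{f(x)}{1 - F(x)} = \frac{(\beta/\alpha)\,(x/\alpha)^{\beta-1}}{1 + (x/\alpha)^{\beta}}, \qquad x > 0,
```

whose hazard is monotonically decreasing for \(\beta \le 1\) and unimodal for \(\beta > 1\); it is precisely this limited menu of hazard shapes that the NExLL generator is designed to broaden.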
Semantic communication (SemCom) aims to achieve high-fidelity information delivery under low communication consumption by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can more fully utilize the transmitted semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to rethink the CRC mechanism so as to reap the benefits of both SemCom and HARQ more effectively. In this paper, built on top of Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike a conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which extracts the intrinsic topological and geometric information of images, to capture semantic information and determine the necessity of re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited-bandwidth conditions, and demonstrate the superiority of the TDA-based error detection method in image transmission.
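The abstract does not spell out the TDA error-detection procedure, so the following is only a minimal sketch of the general idea under stated assumptions: compare persistence diagrams of the transmitted and reconstructed images, and request re-transmission when their bottleneck distance exceeds a threshold. The gudhi calls are real, but the threshold, the use of 1-dimensional homology, and the overall pipeline are illustrative guesses, not the paper's method.

```python
import numpy as np
import gudhi

def persistence_diagram(image, dim=1):
    """1-dimensional persistence of a grayscale image via a cubical complex."""
    cc = gudhi.CubicalComplex(top_dimensional_cells=image)
    cc.persistence()  # compute all persistence pairs
    return cc.persistence_intervals_in_dimension(dim)

def needs_retransmission(sent, received, threshold=0.1):
    """Stand-in semantic check: re-transmit when topology diverges too much."""
    d = gudhi.bottleneck_distance(persistence_diagram(sent),
                                  persistence_diagram(received))
    return d > threshold

rng = np.random.default_rng(0)
img = rng.random((32, 32))
noisy = np.clip(img + 0.2 * rng.standard_normal((32, 32)), 0.0, 1.0)
print(needs_retransmission(img, noisy))
```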
In the section 'Track decoding' of this article, one paragraph was inadvertently omitted after the text '…shows the flow diagram of the Tr2-1121 track mode.' The missing paragraph is provided below.
The fracture volume gradually changes with the depletion of fracture pressure during the production process. However, few flowback models are available so far that can estimate the fracture volume loss using pressure-transient and rate-transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before fracturing of the child wells. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material balance time (MBT) diagnostic plots using both initial and second flowback data (FBi and FBs, respectively) of six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FBs period is higher than that during the FBi period, indicating a potential loss of fracture volume from the FBi to the FBs period. We estimate the changes in effective fracture volume (Vef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. Vef during FBs is in general 3%-45% lower than that during FBi. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is CDI, but its contribution is reduced by 16% in the FBs period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to the pressure depletion in fractures, which occurs during the production period and can extend over 800 days.
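For reference, the diagnostic quantities used here have standard rate-transient-analysis definitions (the notation below is generic and not taken from the paper):

```latex
\mathrm{RNP} = \frac{p_i - p_{wf}}{q}, \qquad
\mathrm{MBT} = \frac{Q}{q} = \frac{1}{q(t)} \int_0^t q(\tau)\,\mathrm{d}\tau ,
```

where \(p_i\) is the initial pressure, \(p_{wf}\) the flowing bottomhole pressure, \(q\) the rate, and \(Q\) the cumulative produced volume. Under a simple tank-depletion model, RNP grows roughly linearly with MBT with slope \(m \approx 1/(c_t V_{ef})\), so a steeper slope at a given total compressibility \(c_t\) implies a smaller effective fracture volume; this is the relation the FBi-versus-FBs comparison exploits.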
This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it clarifies that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, it is necessary to strengthen data collection management, enhance data processing capabilities, and innovate big data application models, providing references for bank big data practices and promoting the transformation and upgrading of the banking industry in the context of legal digital currencies.
The increasing demand for high-resolution solar observations has driven the development of advanced data processing and enhancement techniques for ground-based solar telescopes. This study focuses on developing a Python-based package (GT-scopy) for data processing and enhancement for giant solar telescopes, with application to the 1.6 m Goode Solar Telescope (GST) at Big Bear Solar Observatory. The objective is to develop modern data processing software that refines existing data acquisition, processing, and enhancement methodologies to achieve atmospheric-effect removal and accurate alignment at the sub-pixel level, particularly within processing levels 1.0-1.5. In this research, we implemented an integrated and comprehensive data processing procedure that includes image de-rotation, zone-of-interest selection, coarse alignment, correction for atmospheric distortions, and fine alignment at the sub-pixel level with an advanced algorithm. The results demonstrate a significant improvement in image quality, with enhanced visibility of fine solar structures both in sunspots and in quiet-Sun regions. The data processing package developed in this study significantly improves the utility of data obtained from the GST, paving the way for more precise solar research and contributing to a better understanding of solar dynamics. This package can be adapted for other ground-based solar telescopes, such as the Daniel K. Inouye Solar Telescope (DKIST), the European Solar Telescope (EST), and the 8 m Chinese Giant Solar Telescope, potentially benefiting the broader solar physics community.
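GT-scopy's alignment algorithm is not described in the abstract. As a hedged illustration of one common route to sub-pixel frame alignment, Fourier-based phase cross-correlation with scikit-image is sketched below; the synthetic frames and the upsampling factor are placeholders, not the package's actual implementation.

```python
import numpy as np
from scipy.ndimage import shift as nd_shift
from skimage.registration import phase_cross_correlation

rng = np.random.default_rng(1)
reference = rng.random((256, 256))                  # stand-in for a reference frame
moving = nd_shift(reference, (3.4, -1.7), order=3)  # same frame, offset sub-pixel

# upsample_factor=100 estimates the shift to roughly 1/100 pixel.
shift, error, _ = phase_cross_correlation(reference, moving, upsample_factor=100)
aligned = nd_shift(moving, shift, order=3)          # apply the registering shift
print(shift)  # the translation that registers `moving` onto `reference`
```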
Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
Lunar wrinkle ridges are important stress-related geological structures that reflect the stress state and geological activity of the Moon. They provide important insights into the evolution of the Moon and are key factors influencing future lunar activities, such as the selection of landing sites. However, automatic extraction of lunar wrinkle ridges is a challenging task due to their complex morphology and ambiguous features, and traditional manual extraction methods are time-consuming and labor-intensive. To achieve automated and detailed detection of lunar wrinkle ridges, we constructed a lunar wrinkle ridge data set, incorporating previously unused aspect data to provide edge information, and proposed a Dual-Branch Ridge Detection Network (DBR-Net) based on deep learning. This method employs a dual-branch architecture and an Attention Complementary Feature Fusion module to address the issue of insufficient lunar wrinkle ridge features. Comparisons with the results of various deep learning approaches demonstrate that the proposed method exhibits superior detection performance. Furthermore, the trained model was applied to lunar mare regions, generating a distribution map of lunar mare wrinkle ridges; a significant linear relationship between the length and area of the wrinkle ridges was obtained through statistical analysis, and six previously unrecorded potential wrinkle ridges were detected. The proposed method upgrades the automated extraction of lunar wrinkle ridges to pixel-level precision and verifies the effectiveness of DBR-Net in lunar wrinkle ridge detection.
BFOSC and YFOSC are the most frequently used instruments on the Xinglong 2.16 m telescope and the Lijiang 2.4 m telescope, respectively. We developed a software package named "BYSpec" (BFOSC and YFOSC Spectra Reduction Package) dedicated to automatically reducing the long-slit and echelle spectra obtained by these two instruments. The package supports bias and flat-fielding correction, order location, background subtraction, automatic wavelength calibration, and absolute flux calibration. The optimal extraction method maximizes the signal-to-noise ratio and removes most of the cosmic rays imprinted in the spectra. A comparison with the 1D spectra reduced with IRAF verifies the reliability of the results. This open-source software is publicly available to the community.
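The abstract names optimal extraction without details; the widely used Horne (1986) scheme, in which each pixel is weighted by profile over variance, conveys the idea. The sketch below is a generic illustration under that assumption, not BYSpec's code.

```python
import numpy as np

def optimal_extract(frame, profile, variance):
    """Horne-style optimal extraction of a 1D spectrum from a 2D frame.

    frame    : sky-subtracted counts, shape (cross-dispersion, dispersion)
    profile  : normalized spatial profile (each column sums to 1)
    variance : per-pixel variance estimates
    """
    weights = profile / variance
    flux = (weights * frame).sum(axis=0) / (weights * profile).sum(axis=0)
    flux_var = 1.0 / (profile ** 2 / variance).sum(axis=0)
    # (A full implementation iterates, masking cosmic-ray outliers each pass.)
    return flux, flux_var

# Toy usage with a Gaussian profile and invented numbers:
y = np.arange(21)[:, None]
profile = np.exp(-0.5 * ((y - 10) / 2.0) ** 2) * np.ones((1, 100))
profile /= profile.sum(axis=0)
rng = np.random.default_rng(0)
frame = 1000.0 * profile + rng.normal(0.0, 3.0, (21, 100))
variance = 9.0 + np.abs(frame)      # read noise plus photon noise
flux, flux_var = optimal_extract(frame, profile, variance)
print(flux.mean())                  # close to the injected flux of 1000
```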
The integration of the Internet of Things (IoT) into healthcare systems improves patient care, boosts operational efficiency, and contributes to cost-effective healthcare delivery. However, overcoming several associated challenges, such as data security, interoperability, and ethical concerns, is crucial to realizing the full potential of IoT in healthcare. Real-time anomaly detection plays a key role in protecting patient data and maintaining device integrity amid the additional security risks posed by interconnected systems. In this context, this paper presents a novel method for healthcare data privacy analysis. The technique is based on the identification of anomalies in cloud-based IoT networks, and it is optimized using explainable artificial intelligence. For anomaly detection, the Radial Boltzmann Gaussian Temporal Fuzzy Network (RBGTFN) is used to perform the information privacy analysis for healthcare data. Remora Colony Swarm Optimization is then used to optimize the network. The model's performance in identifying anomalies across a variety of healthcare data was evaluated in an experimental study measuring accuracy, precision, latency, Quality of Service (QoS), and scalability. The proposed model achieved a remarkable 95% precision, 93% latency, 89% quality of service, 98% detection accuracy, and 96% scalability.
With the accelerating intelligent transformation of the energy system, the monitoring of equipment operation status and the optimization of production processes in thermal power plants face the challenge of multi-source heterogeneous data integration. In view of the heterogeneous characteristics of physical sensor data (temperature, vibration, and pressure generated by boilers, steam turbines, and other key equipment) and the real-time operating-condition data of the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize the collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines Kalman filtering, wavelet transforms, and Bayesian estimation to solve the problems of time-series alignment and dimensional differences across data sources. Simulation results show that the data fusion accuracy can be improved to more than 98%, and the computation delay can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant diffusion model, supporting cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 s, and the data consistency verification accuracy reaches 99.5%.
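As a minimal illustration of the Kalman-filter ingredient of such a fusion module, the scalar filter below smooths a noisy sensor stream under a constant-state (random-walk) model; the variances and the boiler-temperature example are invented for the sketch.

```python
import numpy as np

def kalman_1d(measurements, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a constant-state (random-walk) model.

    q: process-noise variance, r: measurement-noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q               # predict: state unchanged, uncertainty grows
        k = p / (p + r)         # Kalman gain
        x = x + k * (z - x)     # update with the new sensor reading
        p = (1 - k) * p
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(0)
true_temp = 520.0                               # hypothetical boiler temperature, deg C
readings = true_temp + rng.normal(0, 2.0, 100)  # noisy sensor stream
print(kalman_1d(readings, x0=readings[0])[-1])  # converges toward 520
```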
There are some limitations when we apply conventional methods to analyze the massive amounts of seismic data acquired with high-density spatial sampling, since processors usually obtain the properties of raw data from common shot gathers or other datasets located at certain points or along lines. We propose a novel method in this paper to observe seismic data on time slices from spatial subsets. The composition of a spatial subset and the unique character of orthogonal or oblique subsets are described, and pre-stack subsets are shown by 3D visualization. In seismic data processing, spatial subsets can be used for the following purposes: (1) to check trace distribution uniformity and regularity; (2) to observe the main features of ground-roll and linear noise; (3) to find abnormal traces from slices of datasets; and (4) to QC the results of pre-stack noise attenuation. The field data application shows that seismic data analysis in spatial subsets is an effective method that may lead to better discrimination among various wavefields and help us obtain more information.
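As a toy illustration of viewing a pre-stack volume on time slices and spatial subsets (the axis layout, array sizes, and QC statistic are assumptions, not the authors' data format):

```python
import numpy as np

# Hypothetical pre-stack volume ordered as (inline, crossline, time sample).
rng = np.random.default_rng(0)
volume = rng.standard_normal((100, 100, 500))

time_slice = volume[:, :, 250]      # one time slice across all traces
subset = volume[::10, ::10, :]      # orthogonal spatial subset: every 10th trace

# A per-trace RMS map makes abnormal traces stand out on the slice:
rms_map = np.sqrt((volume ** 2).mean(axis=2))
print(np.unravel_index(rms_map.argmax(), rms_map.shape))
```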
The rapidly increasing demand and complexity of the manufacturing process motivate using manufacturing data with the highest priority to achieve precise analysis and control, rather than relying on simplified physical models and human expertise. In the era of data-driven manufacturing, the explosion in data volume has revolutionized how data are collected and analyzed. This paper overviews the advances in technologies developed for in-process manufacturing data collection and analysis. It can be concluded that groundbreaking sensing technology that facilitates direct measurement is an important leading trend for advanced data collection, given the complexity and uncertainty of indirect measurement. On the other hand, physical model-based data analysis involves inevitable simplifications and sometimes ill-posed solutions due to its limited capacity to describe complex manufacturing processes. Machine learning, especially deep learning, has great potential for making better decisions to automate the process when fed with abundant data, while trending data-driven manufacturing approaches have succeeded in using limited data to achieve similar or even better decisions. These trends can be demonstrated by analyzing some typical applications in manufacturing processes.
In 2007, China surpassed the USA to become the largest carbon emitter in the world. China has promised a 60%-65% reduction in carbon emissions per unit GDP by 2030, compared to the 2005 baseline. Therefore, it is important to obtain accurate dynamic information on the spatial and temporal patterns of carbon emissions and carbon footprints to support the formulation of effective national carbon emission reduction policies. This study attempts to build a carbon emission panel data model that simulates carbon emissions in China from 2000-2013 using nighttime lighting data and carbon emission statistics. By applying the Exploratory Spatial-Temporal Data Analysis (ESTDA) framework, this study analyzes the spatial patterns and dynamic spatial-temporal interactions of carbon footprints from 2001-2013. The improved Tapio decoupling model was adopted to investigate the levels of coupling or decoupling between the carbon emission load and economic growth in 336 prefecture-level units. The results show that, firstly, the model achieved high accuracy in simulating carbon emissions. Secondly, the total carbon footprints and carbon deficits across China increased at average annual growth rates of 4.82% and 5.72%, respectively. The overall carbon footprints and carbon deficits were larger in the North than in the South, and there were extremely significant spatial autocorrelation features in the carbon footprints of prefecture-level units. Thirdly, the relative lengths of the Local Indicators of Spatial Association (LISA) time paths were longer in the North than in the South, and they increased from the coastal regions to the central and western regions. Lastly, the overall decoupling index was mainly of the weak decoupling type, but the number of cities with weak decoupling continued to decrease. The unsustainable development trend of China's economic growth and carbon emission load will continue for some time.
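For reference, the classical Tapio decoupling index is the elasticity of the carbon emission load with respect to GDP; the "improved" variant used in the paper is not specified in the abstract.

```latex
e = \frac{\Delta C / C}{\Delta \mathrm{GDP} / \mathrm{GDP}},
```

with weak decoupling conventionally the band \(0 < e < 0.8\) when both quantities grow: emissions still rise, but more slowly than the economy.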
Objective: Saccades accompanied by normal gain in video head impulse tests (vHIT) are often observed in patients with vestibular migraine (VM). However, they are not considered an independent indicator, reducing their utility in diagnosing VM. To better understand the clinical features of VM, it is necessary to understand raw saccade data. Methods: Fourteen patients with confirmed VM, 45 patients with probable VM (p-VM), and 14 age-matched healthy volunteers were included in this study. Clinical findings related to spontaneous nystagmus (SN), positional nystagmus (PN), head-shaking nystagmus (HSN), the caloric test, and vHIT were recorded. Raw saccade data were exported and numbered by sequence, and their features analyzed. Results: VM patients showed no SN, PN, or HSN, and less than half of them showed unilateral weakness (UW) on the caloric test. The first saccades from lateral semicircular canal stimulation were the most predominant for both the left and right sides. Neither velocity nor time parameters differed significantly between the two sides. Most VM patients (86%) exhibited small saccades, around 35% of the head peak velocity, with a latency of 200-400 ms. The characteristics of saccades were similar in patients with p-VM. Only four normal subjects showed saccades, all unilateral and seemingly random. Conclusions: Small saccades involving bilateral semicircular canals with a scattered distribution pattern are common in patients with VM and p-VM.