Geochemical survey data are essential across Earth Science disciplines but are often affected by noise,which can obscure important geological signals and compromise subsequent prediction and interpretation.Quantifying...Geochemical survey data are essential across Earth Science disciplines but are often affected by noise,which can obscure important geological signals and compromise subsequent prediction and interpretation.Quantifying prediction uncertainty is hence crucial for robust geoscientific decision-making.This study proposes a novel deep learning framework,the Spatially Constrained Variational Autoencoder(SC-VAE),for denoising geochemical survey data with integrated uncertainty quantification.The SC-VAE incorporates spatial regularization,which enforces spatial coherence by modeling inter-sample relationships directly within the latent space.The performance of the SC-VAE was systematically evaluated against a standard Variational Autoencoder(VAE)using geochemical data from the gold polymetallic district in the northwestern part of Sichuan Province,China.Both models were optimized using Bayesian optimization,with objective functions specifically designed to maintain essential geostatistical characteristics.Evaluation metrics include variogram analysis,quantitative measures of spatial interpolation accuracy,visual assessment of denoised maps,and statistical analysis of data distributions,as well as decomposition of uncertainties.Results show that the SC-VAE achieves superior noise suppression and better preservation of spatial structure compared to the standard VAE,as demonstrated by a significant reduction in the variogram nugget effect and an increased partial sill.The SC-VAE produces denoised maps with clearer anomaly delineation and more regularized data distributions,effectively mitigating outliers and reducing kurtosis.Additionally,it delivers improved interpolation accuracy and spatially explicit uncertainty estimates,facilitating more reliable and interpretable assessments of prediction confidence.The SC-VAE framework thus provides a robust,geostatistically informed solution for enhancing the quality and interpretability of geochemical data,with broad applicability in mineral exploration,environmental geochemistry,and other Earth Science domains.展开更多
Aviation data analysis can help airlines to understand passenger needs,so as to provide passengers with more sophisticated and better services.How to explore the implicit message and analyze contained features from la...Aviation data analysis can help airlines to understand passenger needs,so as to provide passengers with more sophisticated and better services.How to explore the implicit message and analyze contained features from large amounts of data has become an important issue in the civil aviation passenger data analysis process.The uncertainty analysis and visualization methods of data record and property measurement are offered in this paper,based on the visual analysis and uncertainty measure theory combined with parallel coordinates,radar chart,histogram,pixel chart and good interaction.At the same time,the data source expression clearly shows the uncertainty and hidden information as an information base for passengers’service展开更多
Background A task assigned to space exploration satellites involves detecting the physical environment within a certain space.However,space detection data are complex and abstract.These data are not conducive for rese...Background A task assigned to space exploration satellites involves detecting the physical environment within a certain space.However,space detection data are complex and abstract.These data are not conducive for researchers'visual perceptions of the evolution and interaction of events in the space environment.Methods A time-series dynamic data sampling method for large-scale space was proposed for sample detection data in space and time,and the corresponding relationships between data location features and other attribute features were established.A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data.The visualization process is optimized for rendering by merging materials,reducing the number of patches,and performing other operations.Results The results of sampling,feature extraction,and uniform visualization of the detection data of complex types,long duration spans,and uneven spatial distributions were obtained.The real-time visualization of large-scale spatial structures using augmented reality devices,particularly low-performance devices,was also investigated.Conclusions The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space,express the structure and changes in the spatial environment using augmented reality,and assist in intuitively discovering spatial environmental events and evolutionary rules.展开更多
The advent of the big data era has made data visualization a crucial tool for enhancing the efficiency and insights of data analysis. This theoretical research delves into the current applications and potential future...The advent of the big data era has made data visualization a crucial tool for enhancing the efficiency and insights of data analysis. This theoretical research delves into the current applications and potential future trends of data visualization in big data analysis. The article first systematically reviews the theoretical foundations and technological evolution of data visualization, and thoroughly analyzes the challenges faced by visualization in the big data environment, such as massive data processing, real-time visualization requirements, and multi-dimensional data display. Through extensive literature research, it explores innovative application cases and theoretical models of data visualization in multiple fields including business intelligence, scientific research, and public decision-making. The study reveals that interactive visualization, real-time visualization, and immersive visualization technologies may become the main directions for future development and analyzes the potential of these technologies in enhancing user experience and data comprehension. The paper also delves into the theoretical potential of artificial intelligence technology in enhancing data visualization capabilities, such as automated chart generation, intelligent recommendation of visualization schemes, and adaptive visualization interfaces. The research also focuses on the role of data visualization in promoting interdisciplinary collaboration and data democratization. Finally, the paper proposes theoretical suggestions for promoting data visualization technology innovation and application popularization, including strengthening visualization literacy education, developing standardized visualization frameworks, and promoting open-source sharing of visualization tools. This study provides a comprehensive theoretical perspective for understanding the importance of data visualization in the big data era and its future development directions.展开更多
This study aims to explore the application of Bayesian analysis based on neural networks and deep learning in data visualization.The research background is that with the increasing amount and complexity of data,tradit...This study aims to explore the application of Bayesian analysis based on neural networks and deep learning in data visualization.The research background is that with the increasing amount and complexity of data,traditional data analysis methods have been unable to meet the needs.Research methods include building neural networks and deep learning models,optimizing and improving them through Bayesian analysis,and applying them to the visualization of large-scale data sets.The results show that the neural network combined with Bayesian analysis and deep learning method can effectively improve the accuracy and efficiency of data visualization,and enhance the intuitiveness and depth of data interpretation.The significance of the research is that it provides a new solution for data visualization in the big data environment and helps to further promote the development and application of data science.展开更多
Graphical abstracts(GAs)are emerging as a pivotal tool in medical literature,enhancing the dissemination and comprehension of complex clinical data through visual summaries.This editorial highlights the significant ad...Graphical abstracts(GAs)are emerging as a pivotal tool in medical literature,enhancing the dissemination and comprehension of complex clinical data through visual summaries.This editorial highlights the significant advantages of GAs,including improved clarity,increased reader engagement,and enhanced visibility of research findings.By transforming intricate scientific data into accessible visual formats,these abstracts facilitate quick and effective knowledge transfer,crucial in clinical decision-making and patient care.However,challenges such as potential data misrepresentation due to oversimplification,the skill gap in graphic design among researchers,and the lack of standardized creation guidelines pose barriers to their widespread adoption.Additionally,while software such as Adobe Illustrator,BioRender,and Canva are commonly employed to create these visuals,not all researchers may be proficient in their use.To address these issues,we recommend that academic journals establish clear guidelines and provide necessary design training to researchers.This proactive approach will ensure the creation of high-quality GAs,promote their standardization,and expand their use in clinical reporting,ultimately benefiting the medical community and improving healthcare outcomes.展开更多
With the advent of the big data era,real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process.This...With the advent of the big data era,real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process.This study aims to explore the development strategies of real-time data analysis and decision-support systems,and analyze their application status and future development trends in various industries.The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems,and then discusses in detail the key technical aspects such as system architecture,data collection and processing,analysis methods,and visualization techniques.展开更多
This study focuses on the risks associated with the on-balance sheet recognition of data resources.At the legal level,disputes over ownership often arise due to unclear data property rights,while privacy protection,cy...This study focuses on the risks associated with the on-balance sheet recognition of data resources.At the legal level,disputes over ownership often arise due to unclear data property rights,while privacy protection,cybersecurity,and cross-border data flows create additional compliance challenges.In terms of recognition,the subjectivity of traditional valuation methods,the lack of active markets,and the rapid depreciation of data value caused by technological iteration hinder reliable measurement.With respect to disclosure,organizations face a dilemma between transparency and confidentiality.Collectively,these issues exacerbate audit risks.It is therefore imperative to establish an appropriate legal,accounting,and auditing framework to mitigate such risks and remove barriers to the proper recognition of data assets on balance sheets.展开更多
The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations.However,traditional fuzzy clustering algorithms show...The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations.However,traditional fuzzy clustering algorithms show limitations with the inherent uncertainty and imprecision of such data,as they rely on a single-dimensional membership value.To overcome these limitations,we propose an auto-weighted multi-view neutrosophic fuzzy clustering(AW-MVNFC)algorithm.Our method leverages the neutrosophic framework,an extension of fuzzy sets,to explicitly model imprecision and ambiguity through three membership degrees.The core novelty of AWMVNFC lies in a hierarchical weighting strategy that adaptively learns the contributions of both individual data views and the importance of each feature within a view.Through a unified objective function,AW-MVNFC jointly optimizes the neutrosophic membership assignments,cluster centers,and the distributions of view and feature weights.Comprehensive experiments conducted on synthetic and real-world datasets demonstrate that our algorithm achieves more accurate and stable clustering than existing methods,demonstrating its effectiveness in handling the complexities of multi-view data.展开更多
Many bioinformatics applications require determining the class of a newly sequenced Deoxyribonucleic acid(DNA)sequence,making DNA sequence classification an integral step in performing bioinformatics analysis,where la...Many bioinformatics applications require determining the class of a newly sequenced Deoxyribonucleic acid(DNA)sequence,making DNA sequence classification an integral step in performing bioinformatics analysis,where large biomedical datasets are transformed into valuable knowledge.Existing methods rely on a feature extraction step and suffer from high computational time requirements.In contrast,newer approaches leveraging deep learning have shown significant promise in enhancing accuracy and efficiency.In this paper,we investigate the performance of various deep learning architectures:Convolutional Neural Network(CNN),CNN-Long Short-Term Memory(CNNLSTM),CNN-Bidirectional Long Short-Term Memory(CNN-BiLSTM),Residual Network(ResNet),and InceptionV3 for DNA sequence classification.Various numerical and visual data representation techniques are utilized to represent the input datasets,including:label encoding,k-mer sentence encoding,k-mer one-hot vector,Frequency Chaos Game Representation(FCGR)and 5-Color Map(ColorSquare).Three datasets are used for the training of the models including H3,H4 and DNA Sequence Dataset(Yeast,Human,Arabidopsis Thaliana).Experiments are performed to determine which combination of DNA representation and deep learning architecture yields improved performance for the classification task.Our results indicate that using a hybrid CNN-LSTM neural network trained on DNA sequences represented as one-hot encoded k-mer sequences yields the best performance,achieving an accuracy of 92.1%.展开更多
Visual object tracking(VOT),aiming to track a target object in a continuous video,is a fundamental and critical task in computer vision.However,the reliance on third-party resources(e.g.,dataset)for training poses con...Visual object tracking(VOT),aiming to track a target object in a continuous video,is a fundamental and critical task in computer vision.However,the reliance on third-party resources(e.g.,dataset)for training poses concealed threats to the security of VOT models.In this paper,we reveal that VOT models are vulnerable to a poison-only and targeted backdoor attack,where the adversary can achieve arbitrary tracking predictions by manipulating only part of the training data.Specifically,we first define and formulate three different variants of the targeted attacks:size-manipulation,trajectory-manipulation,and hybrid attacks.To implement these,we introduce Random Video Poisoning(RVP),a novel poison-only strategy that exploits temporal correlations within video data by poisoning entire video sequences.Extensive experiments demonstrate that RVP effectively injects controllable backdoors,enabling precise manipulation of tracking behavior upon trigger activation,while maintaining high performance on benign data,thus ensuring stealth.Our findings not only expose significant vulnerabilities but also highlight that the underlying principles could be adapted for beneficial uses,such as dataset watermarking for copyright protection.展开更多
Medical image classification is crucial in disease diagnosis,treatment planning,and clinical decisionmaking.We introduced a novel medical image classification approach that integrates Bayesian Random Semantic Data Aug...Medical image classification is crucial in disease diagnosis,treatment planning,and clinical decisionmaking.We introduced a novel medical image classification approach that integrates Bayesian Random Semantic Data Augmentation(BSDA)with a Vision Mamba-based model for medical image classification(MedMamba),enhanced by residual connection blocks,we named the model BSDA-Mamba.BSDA augments medical image data semantically,enhancing the model’s generalization ability and classification performance.MedMamba,a deep learning-based state space model,excels in capturing long-range dependencies in medical images.By incorporating residual connections,BSDA-Mamba further improves feature extraction capabilities.Through comprehensive experiments on eight medical image datasets,we demonstrate that BSDA-Mamba outperforms existing models in accuracy,area under the curve,and F1-score.Our results highlight BSDA-Mamba’s potential as a reliable tool for medical image analysis,particularly in handling diverse imaging modalities from X-rays to MRI.The open-sourcing of our model’s code and datasets,will facilitate the reproduction and extension of our work.展开更多
Hue-Saturation-Intensity (HSI) color model, a psychologically appealing color model, was employed to visualize uncertainty represented by relative prediction error based on the case of spatial prediction of pH of to...Hue-Saturation-Intensity (HSI) color model, a psychologically appealing color model, was employed to visualize uncertainty represented by relative prediction error based on the case of spatial prediction of pH of topsoil in the peri-urban Beijing. A two-dimensional legend was designed to accompany the visualization-vertical axis (hues) for visualizing the predicted values and horizontal axis (whiteness) for visualizing the prediction error. Moreover, different ways of visualizing uncertainty were briefly reviewed in this paper. This case study indicated that visualization of both predictions and prediction uncertainty offered a possibility to enhance visual exploration of the data uncertainty and to compare different prediction methods or predictions of totally different variables. The whitish region of the visualization map can be simply interpreted as unsatisfactory prediction results, where may need additional samples or more suitable prediction models for a better prediction results.展开更多
The control system of Hefei Light Source II(HLS-Ⅱ) is a distributed system based on the experimental physics and industrial control system(EPICS). It is necessary to maintain the central configuration files for the e...The control system of Hefei Light Source II(HLS-Ⅱ) is a distributed system based on the experimental physics and industrial control system(EPICS). It is necessary to maintain the central configuration files for the existing archiving system. When the process variables in the control system are added, removed, or updated, the configuration files must be manually modified to maintain consistency with the control system. This paper presents a new method for data archiving, which realizes the automatic configuration of the archiving parameters. The system uses microservice architecture to integrate the EPICS Archiver Appliance and Rec Sync. In this way, the system can collect all the archived meta-configuration from the distributed input/output controllers and enter them into the EPICS Archiver Appliance automatically. Furthermore, we also developed a web-based GUI to provide automatic visualization of real-time and historical data. At present,this system is under commissioning at HLS-Ⅱ. The results indicate that the new archiving system is reliable and convenient to operate. The operation mode without maintenance is valuable for large-scale scientific facilities.展开更多
A visualization tool was developed through a web browser based on Java applets embedded into HTML pages, in order to provide a world access to the EAST experimental data. It can display data from various trees in diff...A visualization tool was developed through a web browser based on Java applets embedded into HTML pages, in order to provide a world access to the EAST experimental data. It can display data from various trees in different servers in a single panel. With WebScope, it is easier to make a comparison between different data sources and perform a simple calculation over different data sources.展开更多
Monitoring data are often used to identify stormwater runoff characteristics and in stormwater runoff modelling without consideration of their inherent uncertainties. Integrated with discrete sample analysis and error...Monitoring data are often used to identify stormwater runoff characteristics and in stormwater runoff modelling without consideration of their inherent uncertainties. Integrated with discrete sample analysis and error propagation analysis, this study attempted to quantify the uncertainties of discrete chemical oxygen demand (COD), total suspended solids (TSS) concentration, stormwater flowrate, stormwater event volumes, COD event mean concentration (EMC), and COD event loads in terms of flow measurement, sample collection, storage and laboratory analysis. The results showed that the uncertainties due to sample collection, storage and laboratory analysis of COD from stormwater runoff are 13.99%, 19.48% and 12.28%. Meanwhile, flow measurement uncertainty was 12.82%, and the sample collection uncertainty of TSS from stormwater runoff was 31.63%. Based on the law of propagation of uncertainties, the uncertainties regarding event flow volume, COD EMC and COD event loads were quantified as 7.03%, 10.26% and 18.47%.展开更多
With long-term marine surveys and research,and especially with the development of new marine environment monitoring technologies,prodigious amounts of complex marine environmental data are generated,and continuously i...With long-term marine surveys and research,and especially with the development of new marine environment monitoring technologies,prodigious amounts of complex marine environmental data are generated,and continuously increase rapidly.Features of these data include massive volume,widespread distribution,multiple-sources,heterogeneous,multi-dimensional and dynamic in structure and time.The present study recommends an integrative visualization solution for these data,to enhance the visual display of data and data archives,and to develop a joint use of these data distributed among different organizations or communities.This study also analyses the web services technologies and defines the concept of the marine information gird,then focuses on the spatiotemporal visualization method and proposes a process-oriented spatiotemporal visualization method.We discuss how marine environmental data can be organized based on the spatiotemporal visualization method,and how organized data are represented for use with web services and stored in a reusable fashion.In addition,we provide an original visualization architecture that is integrative and based on the explored technologies.In the end,we propose a prototype system of marine environmental data of the South China Sea for visualizations of Argo floats,sea surface temperature fields,sea current fields,salinity,in-situ investigation data,and ocean stations.An integration visualization architecture is illustrated on the prototype system,which highlights the process-oriented temporal visualization method and demonstrates the benefit of the architecture and the methods described in this study.展开更多
基金supported by the National Natural Science Foundation of China(Nos.42530801,42425208)the Natural Science Foundation of Hubei Province(China)(No.2023AFA001)+1 种基金the MOST Special Fund from State Key Laboratory of Geological Processes and Mineral Resources,China University of Geosciences(No.MSFGPMR2025-401)the China Scholarship Council(No.202306410181)。
文摘Geochemical survey data are essential across Earth Science disciplines but are often affected by noise,which can obscure important geological signals and compromise subsequent prediction and interpretation.Quantifying prediction uncertainty is hence crucial for robust geoscientific decision-making.This study proposes a novel deep learning framework,the Spatially Constrained Variational Autoencoder(SC-VAE),for denoising geochemical survey data with integrated uncertainty quantification.The SC-VAE incorporates spatial regularization,which enforces spatial coherence by modeling inter-sample relationships directly within the latent space.The performance of the SC-VAE was systematically evaluated against a standard Variational Autoencoder(VAE)using geochemical data from the gold polymetallic district in the northwestern part of Sichuan Province,China.Both models were optimized using Bayesian optimization,with objective functions specifically designed to maintain essential geostatistical characteristics.Evaluation metrics include variogram analysis,quantitative measures of spatial interpolation accuracy,visual assessment of denoised maps,and statistical analysis of data distributions,as well as decomposition of uncertainties.Results show that the SC-VAE achieves superior noise suppression and better preservation of spatial structure compared to the standard VAE,as demonstrated by a significant reduction in the variogram nugget effect and an increased partial sill.The SC-VAE produces denoised maps with clearer anomaly delineation and more regularized data distributions,effectively mitigating outliers and reducing kurtosis.Additionally,it delivers improved interpolation accuracy and spatially explicit uncertainty estimates,facilitating more reliable and interpretable assessments of prediction confidence.The SC-VAE framework thus provides a robust,geostatistically informed solution for enhancing the quality and interpretability of geochemical data,with broad applicability in mineral exploration,environmental geochemistry,and other Earth Science domains.
文摘Aviation data analysis can help airlines to understand passenger needs,so as to provide passengers with more sophisticated and better services.How to explore the implicit message and analyze contained features from large amounts of data has become an important issue in the civil aviation passenger data analysis process.The uncertainty analysis and visualization methods of data record and property measurement are offered in this paper,based on the visual analysis and uncertainty measure theory combined with parallel coordinates,radar chart,histogram,pixel chart and good interaction.At the same time,the data source expression clearly shows the uncertainty and hidden information as an information base for passengers’service
文摘Background A task assigned to space exploration satellites involves detecting the physical environment within a certain space.However,space detection data are complex and abstract.These data are not conducive for researchers'visual perceptions of the evolution and interaction of events in the space environment.Methods A time-series dynamic data sampling method for large-scale space was proposed for sample detection data in space and time,and the corresponding relationships between data location features and other attribute features were established.A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data.The visualization process is optimized for rendering by merging materials,reducing the number of patches,and performing other operations.Results The results of sampling,feature extraction,and uniform visualization of the detection data of complex types,long duration spans,and uneven spatial distributions were obtained.The real-time visualization of large-scale spatial structures using augmented reality devices,particularly low-performance devices,was also investigated.Conclusions The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space,express the structure and changes in the spatial environment using augmented reality,and assist in intuitively discovering spatial environmental events and evolutionary rules.
文摘The advent of the big data era has made data visualization a crucial tool for enhancing the efficiency and insights of data analysis. This theoretical research delves into the current applications and potential future trends of data visualization in big data analysis. The article first systematically reviews the theoretical foundations and technological evolution of data visualization, and thoroughly analyzes the challenges faced by visualization in the big data environment, such as massive data processing, real-time visualization requirements, and multi-dimensional data display. Through extensive literature research, it explores innovative application cases and theoretical models of data visualization in multiple fields including business intelligence, scientific research, and public decision-making. The study reveals that interactive visualization, real-time visualization, and immersive visualization technologies may become the main directions for future development and analyzes the potential of these technologies in enhancing user experience and data comprehension. The paper also delves into the theoretical potential of artificial intelligence technology in enhancing data visualization capabilities, such as automated chart generation, intelligent recommendation of visualization schemes, and adaptive visualization interfaces. The research also focuses on the role of data visualization in promoting interdisciplinary collaboration and data democratization. Finally, the paper proposes theoretical suggestions for promoting data visualization technology innovation and application popularization, including strengthening visualization literacy education, developing standardized visualization frameworks, and promoting open-source sharing of visualization tools. This study provides a comprehensive theoretical perspective for understanding the importance of data visualization in the big data era and its future development directions.
文摘This study aims to explore the application of Bayesian analysis based on neural networks and deep learning in data visualization.The research background is that with the increasing amount and complexity of data,traditional data analysis methods have been unable to meet the needs.Research methods include building neural networks and deep learning models,optimizing and improving them through Bayesian analysis,and applying them to the visualization of large-scale data sets.The results show that the neural network combined with Bayesian analysis and deep learning method can effectively improve the accuracy and efficiency of data visualization,and enhance the intuitiveness and depth of data interpretation.The significance of the research is that it provides a new solution for data visualization in the big data environment and helps to further promote the development and application of data science.
文摘Graphical abstracts(GAs)are emerging as a pivotal tool in medical literature,enhancing the dissemination and comprehension of complex clinical data through visual summaries.This editorial highlights the significant advantages of GAs,including improved clarity,increased reader engagement,and enhanced visibility of research findings.By transforming intricate scientific data into accessible visual formats,these abstracts facilitate quick and effective knowledge transfer,crucial in clinical decision-making and patient care.However,challenges such as potential data misrepresentation due to oversimplification,the skill gap in graphic design among researchers,and the lack of standardized creation guidelines pose barriers to their widespread adoption.Additionally,while software such as Adobe Illustrator,BioRender,and Canva are commonly employed to create these visuals,not all researchers may be proficient in their use.To address these issues,we recommend that academic journals establish clear guidelines and provide necessary design training to researchers.This proactive approach will ensure the creation of high-quality GAs,promote their standardization,and expand their use in clinical reporting,ultimately benefiting the medical community and improving healthcare outcomes.
文摘With the advent of the big data era,real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process.This study aims to explore the development strategies of real-time data analysis and decision-support systems,and analyze their application status and future development trends in various industries.The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems,and then discusses in detail the key technical aspects such as system architecture,data collection and processing,analysis methods,and visualization techniques.
文摘This study focuses on the risks associated with the on-balance sheet recognition of data resources.At the legal level,disputes over ownership often arise due to unclear data property rights,while privacy protection,cybersecurity,and cross-border data flows create additional compliance challenges.In terms of recognition,the subjectivity of traditional valuation methods,the lack of active markets,and the rapid depreciation of data value caused by technological iteration hinder reliable measurement.With respect to disclosure,organizations face a dilemma between transparency and confidentiality.Collectively,these issues exacerbate audit risks.It is therefore imperative to establish an appropriate legal,accounting,and auditing framework to mitigate such risks and remove barriers to the proper recognition of data assets on balance sheets.
文摘The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations.However,traditional fuzzy clustering algorithms show limitations with the inherent uncertainty and imprecision of such data,as they rely on a single-dimensional membership value.To overcome these limitations,we propose an auto-weighted multi-view neutrosophic fuzzy clustering(AW-MVNFC)algorithm.Our method leverages the neutrosophic framework,an extension of fuzzy sets,to explicitly model imprecision and ambiguity through three membership degrees.The core novelty of AWMVNFC lies in a hierarchical weighting strategy that adaptively learns the contributions of both individual data views and the importance of each feature within a view.Through a unified objective function,AW-MVNFC jointly optimizes the neutrosophic membership assignments,cluster centers,and the distributions of view and feature weights.Comprehensive experiments conducted on synthetic and real-world datasets demonstrate that our algorithm achieves more accurate and stable clustering than existing methods,demonstrating its effectiveness in handling the complexities of multi-view data.
基金funded by the Researchers Supporting Project number(RSPD2025R857),King Saud University,Riyadh,Saudi Arabia.
文摘Many bioinformatics applications require determining the class of a newly sequenced Deoxyribonucleic acid(DNA)sequence,making DNA sequence classification an integral step in performing bioinformatics analysis,where large biomedical datasets are transformed into valuable knowledge.Existing methods rely on a feature extraction step and suffer from high computational time requirements.In contrast,newer approaches leveraging deep learning have shown significant promise in enhancing accuracy and efficiency.In this paper,we investigate the performance of various deep learning architectures:Convolutional Neural Network(CNN),CNN-Long Short-Term Memory(CNNLSTM),CNN-Bidirectional Long Short-Term Memory(CNN-BiLSTM),Residual Network(ResNet),and InceptionV3 for DNA sequence classification.Various numerical and visual data representation techniques are utilized to represent the input datasets,including:label encoding,k-mer sentence encoding,k-mer one-hot vector,Frequency Chaos Game Representation(FCGR)and 5-Color Map(ColorSquare).Three datasets are used for the training of the models including H3,H4 and DNA Sequence Dataset(Yeast,Human,Arabidopsis Thaliana).Experiments are performed to determine which combination of DNA representation and deep learning architecture yields improved performance for the classification task.Our results indicate that using a hybrid CNN-LSTM neural network trained on DNA sequences represented as one-hot encoded k-mer sequences yields the best performance,achieving an accuracy of 92.1%.
基金supported in part by the"Pioneer"and"Leading Goose"R&D Program of Zhejiang under Grant No. 2024C01169the National Natural Science Foundation of China under Grant Nos. 62441238 and U2441240。
文摘Visual object tracking(VOT),aiming to track a target object in a continuous video,is a fundamental and critical task in computer vision.However,the reliance on third-party resources(e.g.,dataset)for training poses concealed threats to the security of VOT models.In this paper,we reveal that VOT models are vulnerable to a poison-only and targeted backdoor attack,where the adversary can achieve arbitrary tracking predictions by manipulating only part of the training data.Specifically,we first define and formulate three different variants of the targeted attacks:size-manipulation,trajectory-manipulation,and hybrid attacks.To implement these,we introduce Random Video Poisoning(RVP),a novel poison-only strategy that exploits temporal correlations within video data by poisoning entire video sequences.Extensive experiments demonstrate that RVP effectively injects controllable backdoors,enabling precise manipulation of tracking behavior upon trigger activation,while maintaining high performance on benign data,thus ensuring stealth.Our findings not only expose significant vulnerabilities but also highlight that the underlying principles could be adapted for beneficial uses,such as dataset watermarking for copyright protection.
文摘Medical image classification is crucial in disease diagnosis,treatment planning,and clinical decisionmaking.We introduced a novel medical image classification approach that integrates Bayesian Random Semantic Data Augmentation(BSDA)with a Vision Mamba-based model for medical image classification(MedMamba),enhanced by residual connection blocks,we named the model BSDA-Mamba.BSDA augments medical image data semantically,enhancing the model’s generalization ability and classification performance.MedMamba,a deep learning-based state space model,excels in capturing long-range dependencies in medical images.By incorporating residual connections,BSDA-Mamba further improves feature extraction capabilities.Through comprehensive experiments on eight medical image datasets,we demonstrate that BSDA-Mamba outperforms existing models in accuracy,area under the curve,and F1-score.Our results highlight BSDA-Mamba’s potential as a reliable tool for medical image analysis,particularly in handling diverse imaging modalities from X-rays to MRI.The open-sourcing of our model’s code and datasets,will facilitate the reproduction and extension of our work.
基金Under the auspices of Knowledge Innovation Frontier Project of Institute of Soil Science,Chinese Academy of Sciences(No.ISSASIP0716 )the National Nature Science Foundation of China ( No.40701070,40571065)
文摘Hue-Saturation-Intensity (HSI) color model, a psychologically appealing color model, was employed to visualize uncertainty represented by relative prediction error based on the case of spatial prediction of pH of topsoil in the peri-urban Beijing. A two-dimensional legend was designed to accompany the visualization-vertical axis (hues) for visualizing the predicted values and horizontal axis (whiteness) for visualizing the prediction error. Moreover, different ways of visualizing uncertainty were briefly reviewed in this paper. This case study indicated that visualization of both predictions and prediction uncertainty offered a possibility to enhance visual exploration of the data uncertainty and to compare different prediction methods or predictions of totally different variables. The whitish region of the visualization map can be simply interpreted as unsatisfactory prediction results, where may need additional samples or more suitable prediction models for a better prediction results.
基金supported by the National Natural Science Foundation of China(No.11375186)
文摘The control system of Hefei Light Source II(HLS-Ⅱ) is a distributed system based on the experimental physics and industrial control system(EPICS). It is necessary to maintain the central configuration files for the existing archiving system. When the process variables in the control system are added, removed, or updated, the configuration files must be manually modified to maintain consistency with the control system. This paper presents a new method for data archiving, which realizes the automatic configuration of the archiving parameters. The system uses microservice architecture to integrate the EPICS Archiver Appliance and Rec Sync. In this way, the system can collect all the archived meta-configuration from the distributed input/output controllers and enter them into the EPICS Archiver Appliance automatically. Furthermore, we also developed a web-based GUI to provide automatic visualization of real-time and historical data. At present,this system is under commissioning at HLS-Ⅱ. The results indicate that the new archiving system is reliable and convenient to operate. The operation mode without maintenance is valuable for large-scale scientific facilities.
基金supported by National Natural Science Foundation of China (No.10835009)Chinese Academy of Sciences for the Key Project of Knowledge Innovation Program (No.KJCX3.SYW.N4)Chinese Ministry of Sciences for the 973 project (No.2009GB103000)
文摘A visualization tool was developed through a web browser based on Java applets embedded into HTML pages, in order to provide a world access to the EAST experimental data. It can display data from various trees in different servers in a single panel. With WebScope, it is easier to make a comparison between different data sources and perform a simple calculation over different data sources.
基金supported by the National Natural Science Foundation of China(No.50778098)the Youth Project of Fujian Provincial Department of Science&Technology(No.2007F3093)
文摘Monitoring data are often used to identify stormwater runoff characteristics and in stormwater runoff modelling without consideration of their inherent uncertainties. Integrated with discrete sample analysis and error propagation analysis, this study attempted to quantify the uncertainties of discrete chemical oxygen demand (COD), total suspended solids (TSS) concentration, stormwater flowrate, stormwater event volumes, COD event mean concentration (EMC), and COD event loads in terms of flow measurement, sample collection, storage and laboratory analysis. The results showed that the uncertainties due to sample collection, storage and laboratory analysis of COD from stormwater runoff are 13.99%, 19.48% and 12.28%. Meanwhile, flow measurement uncertainty was 12.82%, and the sample collection uncertainty of TSS from stormwater runoff was 31.63%. Based on the law of propagation of uncertainties, the uncertainties regarding event flow volume, COD EMC and COD event loads were quantified as 7.03%, 10.26% and 18.47%.
基金Supported by the Knowledge Innovation Program of the Chinese Academy of Sciences (No.KZCX1-YW-12-04)the National High Technology Research and Development Program of China (863 Program) (Nos.2009AA12Z148,2007AA092202)Support for this study was provided by the Institute of Geographical Sciences and the Natural Resources Research,Chinese Academy of Science (IGSNRR,CAS) and the Institute of Oceanology, CAS
文摘With long-term marine surveys and research,and especially with the development of new marine environment monitoring technologies,prodigious amounts of complex marine environmental data are generated,and continuously increase rapidly.Features of these data include massive volume,widespread distribution,multiple-sources,heterogeneous,multi-dimensional and dynamic in structure and time.The present study recommends an integrative visualization solution for these data,to enhance the visual display of data and data archives,and to develop a joint use of these data distributed among different organizations or communities.This study also analyses the web services technologies and defines the concept of the marine information gird,then focuses on the spatiotemporal visualization method and proposes a process-oriented spatiotemporal visualization method.We discuss how marine environmental data can be organized based on the spatiotemporal visualization method,and how organized data are represented for use with web services and stored in a reusable fashion.In addition,we provide an original visualization architecture that is integrative and based on the explored technologies.In the end,we propose a prototype system of marine environmental data of the South China Sea for visualizations of Argo floats,sea surface temperature fields,sea current fields,salinity,in-situ investigation data,and ocean stations.An integration visualization architecture is illustrated on the prototype system,which highlights the process-oriented temporal visualization method and demonstrates the benefit of the architecture and the methods described in this study.