With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study explores development strategies for real-time data analysis and decision-support systems and analyzes their current applications and future development trends across industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells; the main challenge it currently faces is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes from a white mouse gene expression dataset were chosen under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed, with GSEA used to biologically validate the preliminary results. The final dataset consists of 20 groups of gene expression data from pneumococcal infection; functionally related genes are categorized based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
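As a hedged illustration of the modeling step described in this abstract, the sketch below fits a per-gene mixed-effects model with statsmodels, treating infection status as a fixed effect and the microarray as a random intercept. The synthetic data, column names, and effect sizes are assumptions for demonstration, not the paper's actual dataset or model specification.

```python
# Minimal per-gene mixed-effects sketch (synthetic data, illustrative only).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for array in range(10):                    # 10 hypothetical microarrays
    batch = rng.normal(0, 0.3)             # random array-level (batch) effect
    for gene in range(5):                  # a handful of genes
        for infected in (0, 1):            # no infection vs infection
            expr = 1.0 + 0.8 * infected * (gene == 0) + batch + rng.normal(0, 0.2)
            rows.append({"gene": gene, "array": array,
                         "infected": infected, "expr": expr})
df = pd.DataFrame(rows)

# Fixed effect: infection condition; random intercept: array effect.
for gene, sub in df.groupby("gene"):
    res = smf.mixedlm("expr ~ infected", sub, groups=sub["array"]).fit()
    print(f"gene {gene}: effect={res.params['infected']:+.2f}, "
          f"p={res.pvalues['infected']:.3f}")
```

Only gene 0 carries a true infection effect in this toy setup, so its fixed-effect estimate should stand out, mimicking how the model flags condition-dependent genes.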
Semantic communication (SemCom) aims to achieve high-fidelity information delivery at low communication cost by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can fully utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. Therefore, there emerges a strong incentive to revolutionize the CRC mechanism, thus more effectively reaping the benefits of both SemCom and HARQ. In this paper, built on top of Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which extracts the intrinsic topological and geometric information of images, to capture semantic information and determine the necessity of re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth conditions, and demonstrate the superiority of the TDA-based error detection method in image transmission.
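To make the TDA-based check concrete, here is a hedged sketch (not the authors' SC-TDA-HARQ implementation, whose internals the abstract does not specify): compare 0-dimensional persistence diagrams of the transmitted and reconstructed images via the bottleneck distance and flag re-transmission when the distance exceeds a threshold. The `ripser`/`persim` packages and the threshold value are assumptions.

```python
# Hedged TDA error-detection sketch: requires `pip install ripser persim`.
import numpy as np
from ripser import lower_star_img   # sublevel-set (0-dim) persistence of an image
from persim import bottleneck

def finite(dgm):
    """Drop the infinite-death point so the bottleneck distance is defined."""
    return dgm[np.isfinite(dgm).all(axis=1)]

def needs_retransmission(sent_img, recv_img, threshold=0.1):
    """True if the two images differ topologically by more than `threshold`."""
    d_sent = finite(lower_star_img(sent_img))
    d_recv = finite(lower_star_img(recv_img))
    return bottleneck(d_sent, d_recv) > threshold

rng = np.random.default_rng(0)
img = rng.random((32, 32))
noisy = img + 0.05 * rng.standard_normal((32, 32))
print(needs_retransmission(img, noisy))   # small perturbation -> likely False
```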
Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel's graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python's scripting capabilities. This paper presents an integration solution that empowers non-programmers to leverage Python's capabilities within the familiar Excel environment, enabling users to perform advanced data analysis and automation tasks without extensive programming knowledge. Based on feedback solicited from non-programmers who tested the integration solution, the case study evaluates its ease of implementation, performance, and compatibility with different Excel versions.
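As a minimal sketch of the Excel-Python round trip discussed above (not the paper's integration solution), the snippet below reads a worksheet with pandas, computes a summary, and appends it as a new sheet. The file name, sheet names, and column names are illustrative assumptions; note that openpyxl-based writing rewrites the workbook and may not preserve all formatting, which is exactly the integrity concern the abstract raises.

```python
# Minimal Excel round trip with pandas (requires `pip install openpyxl`).
import pandas as pd

df = pd.read_excel("sales.xlsx", sheet_name="raw")      # hypothetical workbook
summary = df.groupby("region", as_index=False)["amount"].sum()

# mode="a" appends a sheet to the existing workbook instead of overwriting it
with pd.ExcelWriter("sales.xlsx", mode="a", engine="openpyxl",
                    if_sheet_exists="replace") as writer:
    summary.to_excel(writer, sheet_name="summary", index=False)
```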
This research paper compares Excel and the R language for data analysis and concludes that R is more suitable for complex data analysis tasks. R's open-source nature makes it accessible to everyone, and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks. It is also highly customizable, allowing users to create custom functions and packages to meet their specific needs. Additionally, R provides high reproducibility, making it easy to replicate and verify research results, and it has excellent collaboration capabilities, enabling multiple users to work on the same project simultaneously. These advantages make R a more suitable choice for complex data analysis tasks, particularly in scientific research and business applications. The findings of this study will help people understand that R is not just a language that can handle more data than Excel, and will demonstrate that R is essential to the field of data analysis. At the same time, the study will help users and organizations make informed decisions regarding their data analysis needs and software preferences.
Objective: To explain the use of concept mapping in a study about family members' experiences in taking care of people with cancer. Methods: This study used a phenomenological study design. In this study, we describe the analytical process of using concept mapping in our phenomenological studies about family members' experiences in taking care of people with cancer. Results: We developed several concept maps that aided us in analyzing the data collected from the interviews. Conclusions: The use of concept mapping is suggested to researchers who intend to analyze their data in any qualitative study, including those using a phenomenological design, because it is a time-efficient way of dealing with large amounts of qualitative data during the analytical process.
Gravitational wave detection is one of the most cutting-edge research areas in modern physics, with its success relying on advanced data analysis and signal processing techniques. This study provides a comprehensive review of data analysis methods and signal processing techniques in gravitational wave detection. The research begins by introducing the characteristics of gravitational wave signals and the challenges faced in their detection, such as extremely low signal-to-noise ratios and complex noise backgrounds. It then systematically analyzes the application of time-frequency analysis methods in extracting transient gravitational wave signals, including wavelet transforms and Hilbert-Huang transforms. The study focuses on discussing the crucial role of matched filtering techniques in improving signal detection sensitivity and explores strategies for template bank optimization. Additionally, the research evaluates the potential of machine learning algorithms, especially deep learning networks, in rapidly identifying and classifying gravitational wave events. The study also analyzes the application of Bayesian inference methods in parameter estimation and model selection, as well as their advantages in handling uncertainties. However, the research also points out the challenges faced by current technologies, such as dealing with non-Gaussian noise and improving computational efficiency. To address these issues, the study proposes a hybrid analysis framework combining physical models and data-driven methods. Finally, the research looks ahead to the potential applications of quantum computing in future gravitational wave data analysis. This study provides a comprehensive theoretical foundation for the optimization and innovation of gravitational wave data analysis methods, contributing to the advancement of gravitational wave astronomy.
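For the matched-filtering idea highlighted in this review, here is a toy time-domain sketch (illustrative only; real pipelines whiten the data and filter in the frequency domain): correlate noisy strain-like data against a known template and locate the peak response. All waveform parameters are invented.

```python
# Toy matched filter: recover the offset of a weak template buried in noise.
import numpy as np
from scipy.signal import correlate

fs = 1024                                        # Hz, assumed sampling rate
t = np.arange(0, 1, 1 / fs)
template = np.sin(2 * np.pi * 50 * t * (1 + t))  # toy chirp-like waveform
rng = np.random.default_rng(1)

# Bury an attenuated, delayed copy of the template in white noise
data = np.zeros(2 * fs)
data[300:300 + fs] += 0.3 * template
data += rng.normal(0, 1, data.size)

corr = correlate(data, template, mode="valid")   # matched-filter output
snr_proxy = np.abs(corr) / corr.std()
print("best-fit offset:", np.argmax(snr_proxy), "(true offset: 300)")
```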
The connectivity of sandbodies is a key constraint on the exploration effectiveness of Bohai A Oilfield. Conventional connectivity studies often use methods such as seismic attribute fusion, while the development of contiguous composite sandbodies in this area makes it challenging to characterize connectivity changes with conventional seismic attributes. To address this problem in the Bohai A Oilfield, this study proposes a big data analysis method based on the Deep Forest algorithm to predict sandbody connectivity. Firstly, by compiling the abundant exploration and development sandbody data in the study area, typical sandbodies with reliable connectivity were selected. Then, sensitive seismic attributes were extracted to obtain training samples. Finally, based on the Deep Forest algorithm, a mapping model between attribute combinations and sandbody connectivity was established through machine learning. This method achieves the first quantitative determination of connectivity for contiguous composite sandbodies in the Bohai Oilfield. Compared with conventional connectivity discrimination methods such as high-resolution processing and seismic attribute analysis, this method can incorporate the sandbody characteristics of the study area into the machine learning process and judge connectivity jointly from multiple seismic attributes. The study results show that this method has high accuracy and timeliness in predicting connectivity for contiguous composite sandbodies. Applied to the Bohai A Oilfield, it successfully identified multiple sandbody connectivity relationships and provided strong support for the subsequent exploration potential assessment and well placement optimization. This method also provides a new idea and method for studying sandbody connectivity under similar complex geological conditions.
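A hedged sketch of the classification step is given below, using the open-source DF21 `deep-forest` package (`pip install deep-forest`), which provides a cascade-forest implementation of the Deep Forest algorithm. The attribute vectors, labels, and train/test split are invented stand-ins for the study's seismic-attribute samples.

```python
# Hedged Deep Forest connectivity classifier (synthetic data, illustrative).
import numpy as np
from deepforest import CascadeForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                   # 5 seismic attributes per sample
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)   # 1 = connected, 0 = not (toy rule)

model = CascadeForestClassifier(random_state=0)
model.fit(X[:150], y[:150])                     # train on labeled sandbodies
print("hold-out accuracy:", (model.predict(X[150:]) == y[150:]).mean())
```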
This study investigates university English teachers' acceptance and willingness to use learning management system (LMS) data analysis tools in their teaching practices. The research employs a mixed-method approach, combining quantitative surveys and qualitative interviews to understand teachers' perceptions and attitudes, and the factors influencing their adoption of LMS data analysis tools. The findings reveal that perceived usefulness, perceived ease of use, technical literacy, organizational support, and data privacy concerns significantly impact teachers' willingness to use these tools. Based on these insights, the study offers practical recommendations for educational institutions to enhance the effective adoption of LMS data analysis tools in English language teaching.
The advent of the big data era has made data visualization a crucial tool for enhancing the efficiency and insights of data analysis. This theoretical research delves into the current applications and potential future trends of data visualization in big data analysis. The article first systematically reviews the theoretical foundations and technological evolution of data visualization, and thoroughly analyzes the challenges faced by visualization in the big data environment, such as massive data processing, real-time visualization requirements, and multi-dimensional data display. Through extensive literature research, it explores innovative application cases and theoretical models of data visualization in multiple fields including business intelligence, scientific research, and public decision-making. The study reveals that interactive visualization, real-time visualization, and immersive visualization technologies may become the main directions for future development and analyzes the potential of these technologies in enhancing user experience and data comprehension. The paper also delves into the theoretical potential of artificial intelligence technology in enhancing data visualization capabilities, such as automated chart generation, intelligent recommendation of visualization schemes, and adaptive visualization interfaces. The research also focuses on the role of data visualization in promoting interdisciplinary collaboration and data democratization. Finally, the paper proposes theoretical suggestions for promoting data visualization technology innovation and application popularization, including strengthening visualization literacy education, developing standardized visualization frameworks, and promoting open-source sharing of visualization tools. This study provides a comprehensive theoretical perspective for understanding the importance of data visualization in the big data era and its future development directions.
There are some limitations when we apply conventional methods to analyze the massive amounts of seismic data acquired with high-density spatial sampling since processors usually obtain the properties of raw data from common shot gathers or other datasets located at certain points or along lines. We propose a novel method in this paper to observe seismic data on time slices from spatial subsets. The composition of a spatial subset and the unique character of orthogonal or oblique subsets are described and pre-stack subsets are shown by 3D visualization. In seismic data processing, spatial subsets can be used for the following aspects: (1) to check the trace distribution uniformity and regularity; (2) to observe the main features of ground-roll and linear noise; (3) to find abnormal traces from slices of datasets; and (4) to QC the results of pre-stack noise attenuation. The field data application shows that seismic data analysis in spatial subsets is an effective method that may lead to a better discrimination among various wavefields and help us obtain more information.
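The core idea of viewing data on time slices from a spatial subset can be illustrated with a few lines of numpy (an assumed data layout, not the authors' processing system): arrange traces in an (x, y, time) cube, pull out one time slice, and use per-trace statistics to flag abnormal traces.

```python
# Time-slice inspection of a spatial subset (synthetic stand-in data).
import numpy as np

n_x, n_y, n_t = 40, 40, 500
rng = np.random.default_rng(0)
cube = rng.standard_normal((n_x, n_y, n_t))   # (x, y, time) spatial subset
cube[5, 7, :] *= 0.01                          # plant one nearly dead trace

time_slice = cube[:, :, 250]                   # one time slice, shape (n_x, n_y)

# Abnormal (e.g. dead) traces stand out in per-trace RMS amplitude
rms = np.sqrt((cube ** 2).mean(axis=2))
print("suspect traces at (x, y):", np.argwhere(rms < 0.5 * np.median(rms)))
```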
The quantized kernel least mean square (QKLMS) algorithm is an effective nonlinear adaptive online learning algorithm with good performance in constraining the growth of network size through quantization of the input space. It can serve as a powerful tool for performing complex computing for network services and applications. With the purpose of compressing the input to further improve learning performance, this article proposes a novel QKLMS with entropy-guided learning, called EQ-KLMS. Under the consecutive square entropy learning framework, the basic idea of the entropy-guided learning technique is to measure the uncertainty of the input vectors used for QKLMS and delete those data with larger uncertainty, which are insignificant or likely to cause learning errors; the dataset is thereby compressed. Consequently, by using square entropy, the learning performance of the proposed EQ-KLMS is improved, with high precision and low computational cost. The proposed EQ-KLMS is validated using a weather-related dataset, and the results demonstrate the desirable performance of our scheme.
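For reference, the sketch below implements the baseline QKLMS update (Gaussian kernel assumed), where an input either merges into its nearest codebook center or extends the network; the paper's EQ-KLMS additionally filters inputs by square entropy, which is not reproduced here.

```python
# Baseline QKLMS on a toy nonlinear system-identification task.
import numpy as np

def gaussian(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def qklms(X, d, eta=0.5, quant_size=0.3):
    centers, alphas, errors = [X[0]], [eta * d[0]], []
    for x, target in zip(X[1:], d[1:]):
        y = sum(a * gaussian(c, x) for c, a in zip(centers, alphas))
        e = target - y
        errors.append(e)
        dists = [np.linalg.norm(x - c) for c in centers]
        j = int(np.argmin(dists))
        if dists[j] <= quant_size:
            alphas[j] += eta * e              # quantize: merge into nearest center
        else:
            centers.append(x); alphas.append(eta * e)   # grow the network
    return centers, errors

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
d = np.sin(3 * X[:, 0]) * X[:, 1] + 0.01 * rng.normal(size=500)
centers, errors = qklms(X, d)
print("network size:", len(centers),
      "final MSE:", np.mean(np.square(errors[-50:])))
```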
Space debris poses a serious threat to human space activities and needs to be measured and cataloged. As a new technology for space target surveillance, the measurement accuracy of diffuse reflection laser ranging (DRLR) is much higher than that of microwave radar and optoelectronic measurement. Based on the laser ranging data of space debris from the DRLR system at Shanghai Astronomical Observatory acquired in March-April 2013, the characteristics and precision of the laser ranging data are analyzed and their applications in orbit determination of space debris are discussed, which is implemented for the first time in China. The experiment indicates that the precision of the laser ranging data can reach 39-228 cm. When sufficient data are available (four arcs measured over three days), the orbital accuracy of space debris can reach 50 m.
Detection of a periodic signal hidden in noise is the goal of Superconducting Gravimeter (SG) data analysis. Due to spikes, gaps, datum shifts (offsets), and other disturbances, the traditional FFT method shows inherent limitations. Instead, least squares spectral analysis (LSSA) has shown itself more suitable than Fourier analysis for gappy, unequally spaced, and unequally weighted data series in a variety of applications in geodesy and geophysics. This paper reviews the principle of LSSA and gives a possible strategy for the analysis of time series obtained from the Canadian Superconducting Gravimeter Installation (CGSI), with gaps, offsets, unequal sampling decimation of the data, and unequally weighted data points.
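A bare-bones LSSA sketch for unequally spaced data follows (a full implementation would also carry weights, datum offsets, and trend terms, as the abstract describes): at each trial frequency, fit sine and cosine terms by least squares and record the fraction of variance they explain.

```python
# Minimal least-squares spectral analysis on gappy, unevenly sampled data.
import numpy as np

def lssa(t, y, freqs):
    y = y - y.mean()
    power = []
    for f in freqs:
        A = np.column_stack([np.cos(2 * np.pi * f * t),
                             np.sin(2 * np.pi * f * t)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        fit = A @ coef
        power.append(fit @ fit / (y @ y))     # variance fraction in [0, 1]
    return np.array(power)

rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0, 100, 300))          # unequally spaced sample times
y = np.sin(2 * np.pi * 0.2 * t) + 0.5 * rng.normal(size=t.size)
freqs = np.linspace(0.01, 0.5, 200)
print("detected frequency:", freqs[np.argmax(lssa(t, y, freqs))])  # ~0.2
```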
In this paper, variational inference is studied on manifolds with certain metrics. To solve the problem, the analysis is first developed for variational Bayes on Lie groups, and then extended to manifolds that are approximated by Lie groups. The convergence of the proposed algorithm with respect to the manifold metric is then proved for the two iterative processes: the variational Bayesian expectation (VB-E) step and the variational Bayesian maximization (VB-M) step. Moreover, the effectiveness of different metrics for Bayesian analysis is discussed.
This paper presents the development and application of production data analysis software that can analyze and forecast the production performance and reservoir properties of shale gas wells. The theories used in the study were based on analytical and empirical approaches. The software's reliability has been confirmed through comparisons with a commercial package. Using transient data from multi-stage hydraulically fractured horizontal wells, it was confirmed that the modified hyperbolic method showed an error of approximately 4% relative to the actual estimated ultimate recovery (EUR). On the basis of the developed model, reliable productivity forecasts have been obtained by analyzing field production data from wells in Canada. The EUR was computed as 9.6 Bcf using the modified hyperbolic method; employing the Power Law Exponential method, the EUR would be 9.4 Bcf. The models developed in this study will allow new analytical and empirical theories to be integrated in the future more readily than commercial models allow.
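As a worked illustration of the modified hyperbolic method named above (toy parameters, not the paper's Canadian well data): Arps hyperbolic decline is followed until the instantaneous decline rate drops to a limiting value, after which the rate declines exponentially, and EUR is taken as the integral of the rate.

```python
# Modified hyperbolic decline-curve sketch with toy parameters.
import numpy as np

def modified_hyperbolic(t, qi, Di, b, D_lim):
    """Rate q(t): Arps hyperbolic until D(t) = Di/(1+b*Di*t) hits D_lim,
    then exponential decline at D_lim."""
    t_sw = (Di / D_lim - 1.0) / (b * Di)             # switch time
    q = qi / (1.0 + b * Di * t) ** (1.0 / b)
    q_sw = qi / (1.0 + b * Di * t_sw) ** (1.0 / b)
    late = t > t_sw
    q[late] = q_sw * np.exp(-D_lim * (t[late] - t_sw))
    return q

t = np.linspace(0.0, 30 * 365.0, 5000)               # 30 years, in days
q = modified_hyperbolic(t, qi=8000.0, Di=0.004, b=1.2,
                        D_lim=0.05 / 365.0)          # q in Mcf/day (assumed units)
eur_bcf = np.trapz(q, t) / 1e6                       # Mcf -> Bcf
print(f"EUR ~ {eur_bcf:.1f} Bcf (toy parameters)")
```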
RNA-sequencing (RNA-seq), based on next-generation sequencing technologies, has rapidly become a standard and popular technology for transcriptome analysis. However, serious challenges still exist in analyzing and interpreting RNA-seq data. With the development of high-throughput sequencing technology, the sequencing depth of RNA-seq data has increased explosively, and the intricate biological processes of the transcriptome have proved more complicated and diversified than previously imagined. Moreover, most non-model organisms still have no available reference genome or have only incomplete genome annotations. Therefore, a large number of bioinformatics methods for various transcriptomics studies have been proposed to address these challenges effectively. This review comprehensively summarizes the various analyses in RNA-seq data analysis and their corresponding methods, including genome annotation, quality control and pre-processing of reads, read alignment, transcriptome assembly, gene and isoform expression quantification, differential expression analysis, data visualization, and other analyses.
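As a tiny worked example of the quantification step listed above (invented counts, not from the review): converting raw read counts to CPM and TPM, two common within-sample normalizations.

```python
# CPM and TPM from raw counts (toy genes and counts).
import pandas as pd

counts = pd.DataFrame({"sample1": [100, 500, 50]},
                      index=["geneA", "geneB", "geneC"])
lengths_kb = pd.Series([2.0, 5.0, 0.5], index=counts.index)  # gene length, kb

cpm = counts / counts.sum() * 1e6            # counts per million
rpk = counts.div(lengths_kb, axis=0)         # reads per kilobase
tpm = rpk / rpk.sum() * 1e6                  # transcripts per million
print(pd.concat({"CPM": cpm, "TPM": tpm}, axis=1))
```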
A novel study using LC-MS (liquid chromatography tandem mass spectrometry) coupled with multivariate data analysis and bioactivity evaluation was established for discriminating between the aqueous extract and vinegar extract of Shixiao San. Batches of these two kinds of samples were subjected to analysis, and the datasets of sample codes, tR-m/z pairs, and ion intensities were processed with principal component analysis (PCA). The score plot showed a clear classification of the aqueous and vinegar groups, and the chemical markers making the greatest contributions to the differentiation were screened out on the loading plot. The chemical markers were identified by comparing their mass fragments and retention times with those of reference compounds and/or known compounds published in the literature. Based on the proposed strategy, quercetin-3-O-neohesperidoside, isorhamnetin-3-O-neohesperidoside, kaempferol-3-O-neohesperidoside, isorhamnetin-3-O-rutinoside, and isorhamnetin-3-O-(2G-α-L-rhamnosyl)-rutinoside were identified as representative markers distinguishing the vinegar extract from the aqueous extract. The anti-hyperlipidemic activities of the two processed extracts of Shixiao San were examined via serum levels of lipids, lipoproteins, and blood antioxidant enzymes in a rat hyperlipidemia model; the vinegar extract, exerting strong lipid-lowering and antioxidative effects, was superior to the aqueous extract. Therefore, boiling with vinegar was predicted to be the best processing procedure for the anti-hyperlipidemic effect of Shixiao San. Furthermore, combining the changes in the metabolic profiling with the bioactivity evaluation, the five representative markers may be related to the observed anti-hyperlipidemic effect.
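A schematic of the PCA workflow described here, with synthetic data standing in for the tR-m/z feature matrix: rows are extract samples, columns are features, and the largest-magnitude loadings on the separating component point to candidate chemical markers.

```python
# PCA-based marker screening on a synthetic feature matrix.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n_features = 50
aqueous = rng.normal(0, 1, size=(8, n_features))
vinegar = rng.normal(0, 1, size=(8, n_features))
vinegar[:, :5] += 2.0                        # 5 features differ between groups

X = StandardScaler().fit_transform(np.vstack([aqueous, vinegar]))
pca = PCA(n_components=2)
scores = pca.fit_transform(X)                # score plot separates the groups

loadings = pca.components_[0]                # first-PC loadings
print("top candidate marker features:",
      np.argsort(np.abs(loadings))[::-1][:5])
```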
A data processing method was proposed for eliminating end restraint in triaxial tests of soil. A digital image processing method was used to calculate the local deformations and local stresses for any region on the surface of triaxial soil specimens. The principle and implementation of this digital image processing method were introduced, as well as the calculation method for the local mechanical properties of soil specimens. Comparisons were made between the test results calculated from the data for the entire specimen and for local regions, and it was found that deformations were more uniform in the middle region than in the entire specimen. In order to quantify the non-uniform character of deformation, non-uniformity coefficients of strain were defined and calculated. Traditional and end-lubricated triaxial tests were conducted under the same conditions to investigate the effect of using local-region data for deformation calculation on eliminating the end restraint of specimens. After statistical analysis of all test results, it was concluded that for the tested soil specimens with dimensions of 39.1 mm × 80 mm, using the middle 35 mm region of traditional specimens in data processing eliminated end restraint better than end lubrication. Furthermore, the local data analysis in this paper was validated through comparisons with test results from other researchers.
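The local-strain calculation can be illustrated with simple arithmetic on tracked marker coordinates (toy values and an assumed marker layout, not the paper's image-processing code), together with one possible non-uniformity coefficient; the exact definition used in the study may differ.

```python
# Local axial strain of the middle 35 mm region from marker positions.
import numpy as np

# y-coordinates (mm) of marker rows bounding the gauge region,
# before and after loading (toy values)
y_top0, y_bot0 = 57.5, 22.5          # initial gauge length: 35 mm
y_top1, y_bot1 = 56.2, 23.1          # tracked positions after loading

gauge0 = y_top0 - y_bot0
gauge1 = y_top1 - y_bot1
axial_strain = (gauge0 - gauge1) / gauge0
print(f"middle-region axial strain: {axial_strain:.2%}")

# One candidate non-uniformity coefficient: spread of regional strains
# relative to their mean
regional = np.array([0.052, 0.061, 0.049])   # toy strains for three regions
print(f"non-uniformity coefficient: {regional.std() / regional.mean():.3f}")
```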