With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study aims to explore the development strategies of real-time data analysis and decision-support systems and to analyze their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
Semantic communication (SemCom) aims to achieve high-fidelity information delivery at low communication cost by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can fully utilize the information semantics. However, given the possible semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to rework the CRC mechanism so as to reap the benefits of both SemCom and HARQ more effectively. In this paper, building on Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which extracts the inner topological and geometric information of images to capture semantic information and determine whether re-transmission is necessary. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and demonstrate the superiority of the TDA-based error detection method in image transmission.
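To make the concern about a bit-level CRC concrete, here is a minimal sketch. It is not the SC-TDA-HARQ method; the pixel values are made up, and `crc_passes` is a hypothetical helper built on Python's standard `zlib.crc32`. It shows that a one-level change in a single pixel, which would rarely alter what an image means, still fails a bit-level check and would trigger a re-transmission.

```python
import zlib


def crc_passes(received: bytes, expected_crc: int) -> bool:
    """Return True only if the received bytes match the sender's CRC-32 exactly."""
    return zlib.crc32(received) == expected_crc


# Hypothetical 4-pixel grayscale payload (illustrative values only).
original = bytes([120, 121, 122, 123])
sent_crc = zlib.crc32(original)

# The channel perturbs the last pixel by one intensity level.
received = bytes([120, 121, 122, 124])

print(crc_passes(original, sent_crc))   # exact copy is accepted
print(crc_passes(received, sent_crc))   # visually negligible change is rejected
```

A semantic-level detector, such as the TDA-based one the abstract describes, would instead decide on re-transmission from features of the decoded image rather than from exact bit equality.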
Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. In terms of data selection, 1176 genes from the white mouse gene expression dataset were chosen under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
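The abstract does not state which test statistic the permutation tests use. As a rough illustration of the general idea only, the sketch below runs a two-sample permutation test on the absolute difference in means for a single gene. The expression values, the function name `permutation_p_value`, and the choice of statistic are all assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign condition labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical log-expression values for one gene (not from the paper).
infected = [5.1, 5.3, 5.0, 5.2, 5.4]
control = [1.0, 1.2, 0.9, 1.1, 1.3]
p_value = permutation_p_value(infected, control)
print(p_value)
```

With 1176 genes tested, the per-gene p-values would normally also need a multiple-testing correction; the abstract does not say which one, if any, was applied.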
To address the problem that multi-well typical-curve analysis of shale gas reservoirs is rarely applied in engineering, this study proposes a robust production data analysis method based on deconvolution for studying multi-well inter-well interference. In this study, a multi-well conceptual trilinear seepage model for multi-stage fractured horizontal wells was established, and its Laplace solutions under two different outer boundary conditions were obtained. Then, an improved pressure deconvolution algorithm was used to normalize the scattered production data. Furthermore, typical-curve fitting was carried out using the production data and the seepage model solution. Finally, some reservoir and fracturing parameters were interpreted, and the intensity of inter-well interference was compared. The effectiveness of the method was verified by analyzing the production dynamic data of six shale gas wells in the Duvernay area. The results showed that the fit of the typical curves was greatly improved owing to the mutual constraint between tuning the deconvolution calculation parameters and tuning the seepage model parameters. In addition, by using the morphological characteristics of the log-log typical curves and the time corresponding to the intersection point of the log-log typical curves of the two models under different outer boundary conditions, the strength of interference between wells on the same well platform was well judged. This work can provide a reference for optimizing well spacing and hydraulic fracturing measures for shale gas wells.
The advent of the big data era has made data visualization a crucial tool for enhancing the efficiency and insights of data analysis. This theoretical research delves into the current applications and potential future trends of data visualization in big data analysis. The article first systematically reviews the theoretical foundations and technological evolution of data visualization, and thoroughly analyzes the challenges faced by visualization in the big data environment, such as massive data processing, real-time visualization requirements, and multi-dimensional data display. Through extensive literature research, it explores innovative application cases and theoretical models of data visualization in multiple fields including business intelligence, scientific research, and public decision-making. The study reveals that interactive visualization, real-time visualization, and immersive visualization technologies may become the main directions for future development and analyzes the potential of these technologies in enhancing user experience and data comprehension. The paper also delves into the theoretical potential of artificial intelligence technology in enhancing data visualization capabilities, such as automated chart generation, intelligent recommendation of visualization schemes, and adaptive visualization interfaces. The research also focuses on the role of data visualization in promoting interdisciplinary collaboration and data democratization. Finally, the paper proposes theoretical suggestions for promoting data visualization technology innovation and application popularization, including strengthening visualization literacy education, developing standardized visualization frameworks, and promoting open-source sharing of visualization tools. This study provides a comprehensive theoretical perspective for understanding the importance of data visualization in the big data era and its future development directions.
Identifying the causal impact of some intervention in observational studies is a challenging task, and it is even more challenging when one is faced with correlated binary end-points. The statistical literature on analyzing such data is well documented. Dependence between observations from the same study subject in correlated data renders invalid the usual chi-square tests of independence and inflates the variance of parameter estimates. Disaggregated approaches such as hierarchical linear models, which are able to adjust for individual-level covariates, are favoured in the analysis of such data, thereby gaining power over aggregated and individual-level analyses. In this article the authors therefore address the issue of analyzing correlated data with dichotomous end-points by using hierarchical logistic regression, a generalization of the standard logistic regression model for independent outcomes.
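The abstract does not give the model specification. One common way to write a hierarchical logistic regression is the two-level random-intercept form below, offered only as an assumed illustration; the symbols (subject $i$, observation $j$, covariates $\mathbf{x}_{ij}$, random intercept $u_i$) are introduced here and are not taken from the article.

```latex
\operatorname{logit}\,\Pr\!\left(Y_{ij}=1 \mid \mathbf{x}_{ij}, u_i\right)
  = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i,
\qquad u_i \sim \mathcal{N}\!\left(0, \sigma_u^{2}\right)
```

The shared intercept $u_i$ induces correlation among the binary outcomes of the same subject. Setting $\sigma_u^{2}=0$ removes that correlation and recovers the standard logistic regression model for independent outcomes, which is the sense in which the hierarchical model generalizes it.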
The connectivity of sandbodies is a key constraint on the exploration effectiveness of Bohai A Oilfield. Conventional connectivity studies often use methods such as seismic attribute fusion, while the development of contiguous composite sandbodies in this area makes it challenging to characterize connectivity changes with conventional seismic attributes. To address this problem in the Bohai A Oilfield, this study proposes a big data analysis method based on the Deep Forest algorithm to predict sandbody connectivity. First, by compiling the abundant exploration and development sandbody data in the study area, typical sandbodies with reliable connectivity were selected. Then, sensitive seismic attributes were extracted to obtain training samples. Finally, based on the Deep Forest algorithm, a mapping model between attribute combinations and sandbody connectivity was established through machine learning. This method achieves the first quantitative determination of the connectivity of continuous composite sandbodies in the Bohai Oilfield. Compared with conventional connectivity discrimination methods such as high-resolution processing and seismic attribute analysis, this method can incorporate the sandbody characteristics of the study area during machine learning and judge connectivity jointly from multiple seismic attributes. The study results show that this method has high accuracy and timeliness in predicting connectivity for continuous composite sandbodies. Applied to the Bohai A Oilfield, it successfully identified multiple sandbody connectivity relationships and provided strong support for the subsequent exploration potential assessment and well placement optimization. This method also provides a new approach for studying sandbody connectivity under similar complex geological conditions.
Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel’s graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python’s scripting capabilities. This paper presents an integration solution that empowers non-programmers to leverage Python’s capabilities within the familiar Excel environment. This enables users to perform advanced data analysis and automation tasks without requiring extensive programming knowledge. Based on feedback solicited from non-programmers who have tested the integration solution, the case study evaluates the solution's ease of implementation, performance, and compatibility of Python with Excel versions.
This research paper compares Excel and the R language for data analysis and concludes that R is more suitable for complex data analysis tasks. R’s open-source nature makes it accessible to everyone, and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks. It is also highly customizable, allowing users to create custom functions and packages to meet their specific needs. Additionally, R provides high reproducibility, making it easy to replicate and verify research results, and it has excellent collaboration capabilities, enabling multiple users to work on the same project simultaneously. These advantages make R a more suitable choice for complex data analysis tasks, particularly in scientific research and business applications. The findings of this study will help people understand that R is not just a language that can handle more data than Excel, and demonstrate that R is essential to the field of data analysis. At the same time, it will also help users and organizations make informed decisions regarding their data analysis needs and software preferences.
Objective: To explain the use of concept mapping in a study about family members' experiences in taking care of people with cancer. Methods: This study used a phenomenological study design. In this study, we describe the analytical process of using concept mapping in our phenomenological studies about family members' experiences in taking care of people with cancer. Results: We developed several concept maps that aided us in analyzing our collected data from the interviews. Conclusions: The use of concept mapping is suggested to researchers who intend to analyze their data in any qualitative studies, including those using a phenomenological design, because it is a time-efficient way of dealing with large amounts of qualitative data during the analytical process.
This study investigates university English teachers’ acceptance of and willingness to use learning management system (LMS) data analysis tools in their teaching practices. The research employs a mixed-method approach, combining quantitative surveys and qualitative interviews to understand teachers’ perceptions and attitudes, and the factors influencing their adoption of LMS data analysis tools. The findings reveal that perceived usefulness, perceived ease of use, technical literacy, organizational support, and data privacy concerns significantly impact teachers’ willingness to use these tools. Based on these insights, the study offers practical recommendations for educational institutions to enhance the effective adoption of LMS data analysis tools in English language teaching.
Ancient stellar observations are a valuable cultural heritage, profoundly influencing both cultural domains and modern astronomical research. Shi’s Star Catalog (石氏星经), the oldest extant star catalog in China, faces controversy regarding its observational epoch. Determining this epoch via precession assumes accurate ancient coordinates and correspondence with contemporary stars, posing significant challenges. This study introduces a novel method using the Generalized Hough Transform to ascertain the catalog’s observational epoch. This approach statistically accommodates errors in ancient coordinates and discrepancies between ancient and modern stars, addressing limitations in prior methods. Our findings date Shi’s Star Catalog to the 4th century BCE, with 2nd-century CE adjustments. In comparison, the Western tradition’s oldest known catalog, the Ptolemaic Star Catalog (2nd century CE), likely derives from the Hipparchus Star Catalog (2nd century BCE). Thus, Shi’s Star Catalog is identified as the world’s oldest known star catalog. Beyond establishing its observation period, this study aims to consolidate and digitize these cultural artifacts.
There are some limitations when we apply conventional methods to analyze the massive amounts of seismic data acquired with high-density spatial sampling, since processors usually obtain the properties of raw data from common shot gathers or other datasets located at certain points or along lines. We propose a novel method in this paper to observe seismic data on time slices from spatial subsets. The composition of a spatial subset and the unique character of orthogonal or oblique subsets are described, and pre-stack subsets are shown by 3D visualization. In seismic data processing, spatial subsets can be used for the following aspects: (1) to check the trace distribution uniformity and regularity; (2) to observe the main features of ground-roll and linear noise; (3) to find abnormal traces from slices of datasets; and (4) to QC the results of pre-stack noise attenuation. The field data application shows that seismic data analysis in spatial subsets is an effective method that may lead to a better discrimination among various wavefields and help us obtain more information.
RNA-sequencing (RNA-seq), based on next-generation sequencing technologies, has rapidly become a standard and popular technology for transcriptome analysis. However, serious challenges still exist in analyzing and interpreting RNA-seq data. With the development of high-throughput sequencing technology, the sequencing depth of RNA-seq data has increased explosively. The biological processes of the transcriptome are more complicated and diversified than previously imagined. Moreover, most organisms still have no available reference genome or have only incomplete genome annotations. Therefore, a large number of bioinformatics methods for various transcriptomics studies have been proposed to address these challenges. This review comprehensively summarizes the various studies in RNA-seq data analysis and their corresponding analysis methods, including genome annotation, quality control and pre-processing of reads, read alignment, transcriptome assembly, gene and isoform expression quantification, differential expression analysis, data visualization, and other analyses.
The proliferation of textual data in society is currently overwhelming; in particular, unstructured textual data is constantly being generated via call centre logs, emails, documents on the web, blogs, tweets, customer comments, customer reviews, etc. While the amount of textual data is increasing rapidly, users' ability to summarise, understand, and make sense of such data for making better business/living decisions remains challenging. This paper studies how to analyse textual data, based on layered software patterns, for extracting insightful user intelligence from a large collection of documents and for using such information to improve user operations and performance.
Aircraft Meteorological Data Relay (AMDAR) observations have been widely used in numerical weather prediction (NWP) because of their high spatiotemporal resolution. The observational error of AMDAR is influenced by aircraft flight altitude and atmospheric conditions. In this study, the wind speed and altitude dependent observational error of AMDAR is estimated. The statistical results show that the observational errors in temperature and wind speed decrease slightly as altitude increases, and the observational error in wind speed increases as wind speed increases. Pseudo single AMDAR observation assimilation tests demonstrate that the wind speed and altitude dependent observational error can provide a more reasonable analysis increment. Furthermore, to assess the performance of the wind speed and altitude dependent observational error in data assimilation and forecasting, two-month 3-hourly cycling data assimilation and forecast experiments based on the Weather Research and Forecasting Model (WRF) and its Data Assimilation system (WRFDA) are performed for the period from 1 September to 31 October 2017. The results of the two-month 3-hourly cycling experiments indicate that the new observational error improves the analysis and forecast of the wind field and geopotential height, and slightly improves temperature. The Fractions Skill Score (FSS) of the 6-h accumulated precipitation shows that the new wind speed and altitude dependent observational error leads to better precipitation forecast skill than the default observational error in WRFDA.
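For readers unfamiliar with the Fractions Skill Score, the sketch below computes it in its commonly cited form, FSS = 1 - Σ(P_f - P_o)² / (ΣP_f² + ΣP_o²), where P_f and P_o are neighborhood fractions of grid cells exceeding a threshold in the forecast and the observations. The threshold, neighborhood size, zero-padding at the edges, and the toy grids are assumptions for illustration; the abstract does not state the settings used in the study.

```python
def neighborhood_fractions(field, threshold, half_width):
    """Fraction of cells >= threshold in a square window around each cell.

    Cells outside the grid are treated as not exceeding the threshold.
    """
    rows, cols = len(field), len(field[0])
    binary = [[1.0 if v >= threshold else 0.0 for v in row] for row in field]
    window_size = (2 * half_width + 1) ** 2
    fractions = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            total = 0.0
            for di in range(-half_width, half_width + 1):
                for dj in range(-half_width, half_width + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += binary[ii][jj]
            out_row.append(total / window_size)
        fractions.append(out_row)
    return fractions


def fractions_skill_score(forecast, observed, threshold, half_width):
    """FSS in [0, 1]; 1 is a perfect match, 0 is no overlap at this scale."""
    pf = neighborhood_fractions(forecast, threshold, half_width)
    po = neighborhood_fractions(observed, threshold, half_width)
    squared_error = sum((f - o) ** 2 for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    reference = sum(f ** 2 + o ** 2 for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    if reference == 0.0:
        return float("nan")  # neither field exceeds the threshold anywhere
    return 1.0 - squared_error / reference


# Toy 4x4 precipitation grids in mm (hypothetical values).
observed = [[0, 0, 0, 0], [0, 8, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0], [0, 0, 8, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

print(fractions_skill_score(shifted, observed, threshold=5, half_width=0))
print(fractions_skill_score(shifted, observed, threshold=5, half_width=1))
```

The one-cell displacement scores zero at the grid scale but gains skill once a 3x3 neighborhood is used, which is why the score is usually reported as a function of neighborhood size.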
This paper presents the development and application of production data analysis software that can analyze and forecast the production performance and reservoir properties of shale gas wells. The theories used in the study were based on analytical and empirical approaches. Its reliability has been confirmed through comparisons with commercial software. Using transient data relating to multi-stage hydraulically fractured horizontal wells, it was confirmed that the modified hyperbolic method showed an error of approximately 4% compared to the actual estimated ultimate recovery (EUR). On the basis of the developed model, reliable productivity forecasts have been obtained by analyzing field production data relating to wells in Canada. The EUR was computed as 9.6 Bcf using the modified hyperbolic method. Employing the Power Law Exponential method, the EUR would be 9.4 Bcf. The models developed in this study will allow new analytical and empirical theories to be integrated in the future more readily than in commercial models.
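The abstract does not spell out its formulation of the modified hyperbolic method. A widely used version, assumed here, follows Arps hyperbolic decline q(t) = q_i / (1 + b·D_i·t)^(1/b) until the instantaneous decline rate D(t) = D_i / (1 + b·D_i·t) falls to a terminal value D_lim, then switches to exponential decline. The parameter values below are hypothetical and are not the Canadian well data; the function names are illustrative.

```python
import math


def modified_hyperbolic_rate(t, qi, di, b, d_lim):
    """Production rate at time t (requires b > 0 and di > d_lim > 0)."""
    # Time at which the hyperbolic decline rate has dropped to d_lim.
    t_switch = (di / d_lim - 1.0) / (b * di)
    if t <= t_switch:
        return qi / (1.0 + b * di * t) ** (1.0 / b)
    # Exponential tail, continuous with the hyperbolic segment at t_switch.
    q_switch = qi / (1.0 + b * di * t_switch) ** (1.0 / b)
    return q_switch * math.exp(-d_lim * (t - t_switch))


def cumulative_production(qi, di, b, d_lim, t_end, steps=10000):
    """Trapezoidal integral of the rate from 0 to t_end (an EUR proxy)."""
    dt = t_end / steps
    total = 0.0
    previous = modified_hyperbolic_rate(0.0, qi, di, b, d_lim)
    for k in range(1, steps + 1):
        current = modified_hyperbolic_rate(k * dt, qi, di, b, d_lim)
        total += 0.5 * (previous + current) * dt
        previous = current
    return total


# Hypothetical well: rate in Mcf/day, decline rates in 1/year, time in years.
qi, di, b, d_lim = 1000.0, 0.8, 1.2, 0.06
volume = cumulative_production(qi, di, b, d_lim, t_end=30.0) * 365.0  # Mcf
print(modified_hyperbolic_rate(10.0, qi, di, b, d_lim), volume)
```

The terminal exponential segment is what keeps the forecast volume bounded when b ≥ 1, a case in which a pure hyperbolic curve integrates to an unbounded cumulative volume.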
To analyze the effects of gas cannons on clouds and precipitation, multisource observational data, including those from National Centers for Environmental Prediction (NCEP) reanalysis, Hangzhou and Huzhou new-generation weather radars, laser disdrometer, ground-based automatic weather station, wind profiler radar, and Lin'an C-band dual-polarization radar, were adopted in this study. Based on the variational dual-Doppler wind retrieval method and the polarimetric variables obtained by the dual-polarization radar, we analyzed the microphysical processes and the variations in the macro- and microphysical quantities in clouds from the perspective of the synoptic background before precipitation enhancement, the polarization echo characteristics before, during and after enhancement, and the evolution of the fine three-dimensional kinematic structure and the microphysical structure. The results show that the precipitation enhancement operation promoted the development of radar echoes and prolonged their duration, and both the horizontal and vertical wind speeds increased. The dual-polarization radar echo showed that the diameter of the precipitation particles increased, and the concentration of raindrops increased after precipitation enhancement. The raindrops were lifted to a height corresponding to 0 to -20 ℃ due to vertical updrafts. Based on the disdrometer data during precipitation enhancement, the concentration of small raindrops (lg N_w) showed a significant increase, and the mass-weighted diameter D_m decreased, indicating that the precipitation enhancement operation had a certain “lubricating” effect. After the precipitation enhancement, the concentration of raindrops did not change much compared with that during the enhancement process, while D_m increased, corresponding to an increase in rain intensity. The results suggest a positive effect of gas cannons on precipitation enhancement.
A new dynamic model identification method is developed for continuous-time series analysis and forward prediction applications. The quantum of data is defined over moving time intervals in sliding window coordinates for compressing the size of stored data while retaining the resolution of information. Quantum vectors are introduced as the basis of a linear space for defining a Dynamic Quantum Operator (DQO) model of the system defined by its data stream. The transport of the quantum of compressed data is modeled between the time interval bins during the movement of the sliding time window. The DQO model is identified from the samples of the real-time flow of data over the sliding time window. A least-square-fit identification method is used for evaluating the parameters of the quantum operator model, utilizing the repeated use of the sampled data through a number of time steps. The method is tested to analyze and forward-predict air temperature variations accessed from weather data as well as methane concentration variations obtained from measurements of an operating mine. The results show efficient forward prediction capabilities, surpassing those using neural networks and other methods for the same task.
Funding (semantic communication study): Supported in part by the National Key Research and Development Program of China under Grant 2024YFE0200600, in part by the National Natural Science Foundation of China under Grant 62071425, in part by the Zhejiang Key Research and Development Plan under Grant 2022C01093, in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LR23F010005, in part by the National Key Laboratory of Wireless Communications Foundation under Grant 2023KP01601, and in part by the Big Data and Intelligent Computing Key Lab of CQUPT under Grant BDIC-2023-B-001.
Funding (cervical cancer study): Supported by a project funded by the Hebei Provincial Central Guidance Local Science and Technology Development Fund (236Z7714G).
Funding: Financial support from the PetroChina Innovation Foundation.
Abstract: To overcome the shortcoming that multi-well typical curve analysis of shale gas reservoirs is rarely applied in engineering, this study proposes a robust production data analysis method based on deconvolution, which is used for multi-well inter-well interference research. In this study, a multi-well conceptual trilinear seepage model for multi-stage fractured horizontal wells was established, and its Laplace solutions under two different outer boundary conditions were obtained. Then, an improved pressure deconvolution algorithm was used to normalize the scattered production data. Furthermore, typical curve fitting was carried out using the production data and the seepage model solution. Finally, some reservoir parameters and fracturing parameters were interpreted, and the intensity of inter-well interference was compared. The effectiveness of the method was verified by analyzing the production dynamic data of six shale gas wells in the Duvernay area. The results showed that the fitting of typical curves was greatly improved due to the mutual restriction between deconvolution calculation parameter debugging and seepage model parameter debugging. Besides, by using the morphological characteristics of the log-log typical curves and the time corresponding to the intersection point of the log-log typical curves of the two models under different outer boundary conditions, the strength of the interference between wells on the same well platform was well judged. This work can provide a reference for the optimization of well spacing and hydraulic fracturing measures for shale gas wells.
Abstract: The advent of the big data era has made data visualization a crucial tool for enhancing the efficiency and insights of data analysis. This theoretical research delves into the current applications and potential future trends of data visualization in big data analysis. The article first systematically reviews the theoretical foundations and technological evolution of data visualization, and thoroughly analyzes the challenges faced by visualization in the big data environment, such as massive data processing, real-time visualization requirements, and multi-dimensional data display. Through extensive literature research, it explores innovative application cases and theoretical models of data visualization in multiple fields including business intelligence, scientific research, and public decision-making. The study reveals that interactive visualization, real-time visualization, and immersive visualization technologies may become the main directions for future development, and analyzes the potential of these technologies in enhancing user experience and data comprehension. The paper also examines the theoretical potential of artificial intelligence technology in enhancing data visualization capabilities, such as automated chart generation, intelligent recommendation of visualization schemes, and adaptive visualization interfaces. The research also focuses on the role of data visualization in promoting interdisciplinary collaboration and data democratization. Finally, the paper proposes theoretical suggestions for promoting data visualization technology innovation and application popularization, including strengthening visualization literacy education, developing standardized visualization frameworks, and promoting open-source sharing of visualization tools. This study provides a comprehensive theoretical perspective for understanding the importance of data visualization in the big data era and its future development directions.
Abstract: Identifying the causal impact of some intervention in observational studies is a challenging task, and it is even more challenging when one is faced with correlated binary end-points. The statistical literature on analyzing such data is well documented. Dependence between observations from the same study subject in correlated data renders invalid the usual chi-square tests of independence and inflates the variance of parameter estimates. Disaggregated approaches such as hierarchical linear models, which are able to adjust for individual-level covariates, are favoured in the analysis of such data, thereby gaining power over aggregated and individual-level analyses. In this article the authors therefore address the issue of analyzing correlated data with dichotomous end-points by using hierarchical logistic regression, a generalization of the standard logistic regression model for independent outcomes.
Abstract: The connectivity of sandbodies is a key constraint on the exploration effectiveness of Bohai A Oilfield. Conventional connectivity studies often use methods such as seismic attribute fusion, but the development of contiguous composite sandbodies in this area makes it challenging to characterize connectivity changes with conventional seismic attributes. To address this problem in the Bohai A Oilfield, this study proposes a big data analysis method based on the Deep Forest algorithm to predict sandbody connectivity. First, by compiling the abundant exploration and development sandbody data in the study area, typical sandbodies with reliable connectivity were selected. Then, sensitive seismic attributes were extracted to obtain training samples. Finally, based on the Deep Forest algorithm, a mapping model between attribute combinations and sandbody connectivity was established through machine learning. This method achieves the first quantitative determination of connectivity for continuous composite sandbodies in the Bohai Oilfield. Compared with conventional connectivity discrimination methods such as high-resolution processing and seismic attribute analysis, this method can incorporate the sandbody characteristics of the study area in the process of machine learning, and jointly judge connectivity by combining multiple seismic attributes. The study results show that this method has high accuracy and timeliness in predicting connectivity for continuous composite sandbodies. Applied to the Bohai A Oilfield, it successfully identified multiple sandbody connectivity relationships and provided strong support for the subsequent exploration potential assessment and well placement optimization. This method also provides a new approach for studying sandbody connectivity under similar complex geological conditions.
Abstract: Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel's graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python's scripting capabilities. This paper covers an integration solution that empowers non-programmers to leverage Python's capabilities within the familiar Excel environment. This enables users to perform advanced data analysis and automation tasks without requiring extensive programming knowledge. Based on feedback solicited from non-programmers who have tested the integration solution, the case study evaluates the ease of implementation, performance, and compatibility of Python with Excel versions.
Abstract: This research paper compares Excel and the R language for data analysis and concludes that R is more suitable for complex data analysis tasks. R's open-source nature makes it accessible to everyone, and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks. It is also highly customizable, allowing users to create custom functions and packages to meet their specific needs. Additionally, R provides high reproducibility, making it easy to replicate and verify research results, and it has excellent collaboration capabilities, enabling multiple users to work on the same project simultaneously. These advantages make R a more suitable choice for complex data analysis tasks, particularly in scientific research and business applications. The findings of this study will help people understand that R is not just a language that can handle more data than Excel, and demonstrate that R is essential to the field of data analysis. At the same time, it will also help users and organizations make informed decisions regarding their data analysis needs and software preferences.
Funding: Supported by the Faculty of Medicine, Ministry of Education, Cultures, Research and Technology, Tanjungpura University (No. 3483/UN22.9/PG/2021).
Abstract: Objective: To explain the use of concept mapping in a study about family members' experiences in taking care of people with cancer. Methods: This study used a phenomenological study design. In this study, we describe the analytical process of using concept mapping in our phenomenological studies about family members' experiences in taking care of people with cancer. Results: We developed several concept maps that aided us in analyzing our collected data from the interviews. Conclusions: The use of concept mapping is suggested to researchers who intend to analyze their data in any qualitative studies, including those using a phenomenological design, because it is a time-efficient way of dealing with large amounts of qualitative data during the analytical process.
Abstract: This study investigates university English teachers' acceptance of and willingness to use learning management system (LMS) data analysis tools in their teaching practices. The research employs a mixed-method approach, combining quantitative surveys and qualitative interviews to understand teachers' perceptions and attitudes, and the factors influencing their adoption of LMS data analysis tools. The findings reveal that perceived usefulness, perceived ease of use, technical literacy, organizational support, and data privacy concerns significantly impact teachers' willingness to use these tools. Based on these insights, the study offers practical recommendations for educational institutions to enhance the effective adoption of LMS data analysis tools in English language teaching.
Funding: Supported by the China National Astronomical Data Center (NADC), the CAS Astronomical Data Center, and the Chinese Virtual Observatory (China-VO); supported by the Astronomical Big Data Joint Research Center, co-founded by the National Astronomical Observatories, Chinese Academy of Sciences, and Alibaba Cloud.
Abstract: Ancient stellar observations are a valuable cultural heritage, profoundly influencing both cultural domains and modern astronomical research. Shi's Star Catalog (石氏星经), the oldest extant star catalog in China, faces controversy regarding its observational epoch. Determining this epoch via precession assumes accurate ancient coordinates and correspondence with contemporary stars, posing significant challenges. This study introduces a novel method using the Generalized Hough Transform to ascertain the catalog's observational epoch. This approach statistically accommodates errors in ancient coordinates and discrepancies between ancient and modern stars, addressing limitations in prior methods. Our findings date Shi's Star Catalog to the 4th century BCE, with 2nd-century CE adjustments. In comparison, the Western tradition's oldest known catalog, the Ptolemaic Star Catalog (2nd century CE), likely derives from the Hipparchus Star Catalog (2nd century BCE). Thus, Shi's Star Catalog is identified as the world's oldest known star catalog. Beyond establishing its observation period, this study aims to consolidate and digitize these cultural artifacts.
Abstract: Conventional methods have limitations when applied to the massive amounts of seismic data acquired with high-density spatial sampling, since processors usually obtain the properties of raw data from common shot gathers or other datasets located at certain points or along lines. We propose a novel method in this paper to observe seismic data on time slices from spatial subsets. The composition of a spatial subset and the unique character of orthogonal or oblique subsets are described, and pre-stack subsets are shown by 3D visualization. In seismic data processing, spatial subsets can be used for the following purposes: (1) to check the trace distribution uniformity and regularity; (2) to observe the main features of ground-roll and linear noise; (3) to find abnormal traces from slices of datasets; and (4) to QC the results of pre-stack noise attenuation. The field data application shows that seismic data analysis in spatial subsets is an effective method that may lead to a better discrimination among various wavefields and help us obtain more information.
Abstract: RNA-sequencing (RNA-seq), based on next-generation sequencing technologies, has rapidly become a standard and popular technology for transcriptome analysis. However, serious challenges still exist in analyzing and interpreting RNA-seq data. With the development of high-throughput sequencing technology, the sequencing depth of RNA-seq data has increased explosively. The intricate biological processes of the transcriptome are more complicated and diversified than imagined. Moreover, most of the remaining organisms still have no available reference genome or have only incomplete genome annotations. Therefore, a large number of bioinformatics methods for various transcriptomics studies have been proposed to address these challenges. This review comprehensively summarizes the various studies in RNA-seq data analysis and their corresponding analysis methods, including genome annotation, quality control and pre-processing of reads, read alignment, transcriptome assembly, gene and isoform expression quantification, differential expression analysis, data visualization, and other analyses.
Abstract: The proliferation of textual data in society is currently overwhelming; in particular, unstructured textual data is being constantly generated via call centre logs, emails, documents on the web, blogs, tweets, customer comments, customer reviews, etc. While the amount of textual data is increasing rapidly, users' ability to summarise, understand, and make sense of such data for making better business/living decisions remains challenging. This paper studies how to analyse textual data, based on layered software patterns, for extracting insightful user intelligence from a large collection of documents and for using such information to improve user operations and performance.
Funding: National Key R&D Program of China (2017YFC1502102, 2018YFC1506802); National Natural Science Foundation of China (41675102).
Abstract: Aircraft Meteorological Data Relay (AMDAR) observations have been widely used in numerical weather prediction (NWP) because of their high spatiotemporal resolution. The observational error of AMDAR is influenced by aircraft flight altitude and atmospheric conditions. In this study, the wind speed and altitude dependent observational error of AMDAR is estimated. The statistical results show that the temperature and the observational error in wind speeds slightly decrease as altitude increases, and the observational error in wind speed increases as wind speed increases. Pseudo single AMDAR observation assimilation tests demonstrate that the wind speed and altitude dependent observational error can provide a more reasonable analysis increment. Furthermore, to assess the performance of the wind speed and altitude dependent observational error on data assimilation and forecasting, two-month 3-hourly cycling data assimilation and forecast experiments based on the Weather Research and Forecasting Model (WRF) and its Data Assimilation system (WRFDA) are performed for the period 1 September to 31 October 2017. The results of the two-month 3-hourly cycling experiments indicate that the new observational error improves the analysis and forecast of the wind field and geopotential height, and slightly improves temperature. The Fractions Skill Score (FSS) of the 6-h accumulated precipitation shows that the new wind speed and altitude dependent observational error leads to better precipitation forecast skill than the default observational error in the WRFDA does.
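For readers unfamiliar with the Fractions Skill Score cited in the abstract above, the following is a minimal sketch of the commonly used definition, FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2)), where Pf and Po are the neighbourhood fractions of grid points exceeding a threshold in the forecast and the observations. The square neighbourhood, the zero padding at the domain edges, and the `>=` threshold convention are assumptions made for illustration, and the abstract does not state the settings used in the study.

```python
def neighbourhood_fractions(field, threshold, half_width):
    """Fraction of points >= threshold in a square window around each cell.

    Cells outside the domain count as non-exceeding (zero padding), which is
    one of several possible edge conventions.
    """
    rows, cols = len(field), len(field[0])
    binary = [[1.0 if v >= threshold else 0.0 for v in row] for row in field]
    window_size = (2 * half_width + 1) ** 2
    fractions = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in range(-half_width, half_width + 1):
                for dj in range(-half_width, half_width + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += binary[ii][jj]
            fractions[i][j] = total / window_size
    return fractions


def fractions_skill_score(forecast, observed, threshold, half_width):
    """FSS in [0, 1]; 1 is a perfect match, 0 is no overlap at this scale."""
    pf = neighbourhood_fractions(forecast, threshold, half_width)
    po = neighbourhood_fractions(observed, threshold, half_width)
    num = sum((a - b) ** 2 for rf, ro in zip(pf, po) for a, b in zip(rf, ro))
    den = sum(a * a for rf in pf for a in rf) + sum(b * b for ro in po for b in ro)
    # Undefined when neither field exceeds the threshold anywhere.
    return 1.0 - num / den if den > 0.0 else float("nan")
```

Increasing `half_width` makes the score more tolerant of displacement errors, which is why FSS is usually reported for several neighbourhood sizes.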
Funding: Supported by the Energy Efficiency & Resources Core Technology Program of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resource from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20172510102090).
Abstract: This paper presents the development and application of production data analysis software that can analyze and forecast the production performance and reservoir properties of shale gas wells. The theories used in the study were based on analytical and empirical approaches. Its reliability has been confirmed through comparisons with commercial software. Using transient data relating to multi-stage hydraulic fractured horizontal wells, it was confirmed that the modified hyperbolic method showed an error of approximately 4% relative to the actual estimated ultimate recovery (EUR). On the basis of the developed model, reliable productivity forecasts have been obtained by analyzing field production data relating to wells in Canada. The EUR was computed as 9.6 Bcf using the modified hyperbolic method. Employing the Power Law Exponential method, the EUR would be 9.4 Bcf. The models developed in this study will allow new analytical and empirical theories to be integrated in the future more readily than commercial models.
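The abstract above does not spell out its decline-curve formulation. As a rough illustration of how an EUR figure can come out of a modified hyperbolic model, the sketch below assumes the common form in which an Arps hyperbolic decline switches to exponential decline once the instantaneous decline rate falls to a terminal value `d_lim`. All names and parameter values are hypothetical, and this is not the software described in the paper.

```python
import math


def modified_hyperbolic_rate(t, qi, di, b, d_lim):
    """Rate at time t: Arps hyperbolic until the decline reaches d_lim, then exponential."""
    t_switch = (di / d_lim - 1.0) / (b * di)
    if t <= t_switch:
        return qi / (1.0 + b * di * t) ** (1.0 / b)
    q_switch = qi * (d_lim / di) ** (1.0 / b)
    return q_switch * math.exp(-d_lim * (t - t_switch))


def modified_hyperbolic_eur(qi, di, b, d_lim, q_abandon):
    """Cumulative production from t = 0 until the rate drops to q_abandon.

    qi: initial rate; di: initial nominal decline (1/time); b: Arps exponent;
    d_lim: terminal decline (1/time); q_abandon: economic-limit rate.
    Time units of qi, di and d_lim must agree.
    """
    if not (0.0 < d_lim < di) or b <= 0.0 or b == 1.0:
        raise ValueError("need 0 < d_lim < di and b > 0, b != 1")
    if not (0.0 < q_abandon < qi):
        raise ValueError("need 0 < q_abandon < qi")

    def hyperbolic_cum(q_end):
        # Closed-form integral of the hyperbolic segment down to rate q_end.
        return qi**b / ((1.0 - b) * di) * (qi ** (1.0 - b) - q_end ** (1.0 - b))

    q_switch = qi * (d_lim / di) ** (1.0 / b)
    if q_abandon >= q_switch:
        # The economic limit is reached before the switch to exponential decline.
        return hyperbolic_cum(q_abandon)
    # Hyperbolic segment plus the exponential tail (q_switch - q_abandon) / d_lim.
    return hyperbolic_cum(q_switch) + (q_switch - q_abandon) / d_lim
```

For example, `modified_hyperbolic_eur(1000.0, 0.01, 1.5, 0.0005, 10.0)` treats rates as volume per day and declines as per day. The terminal-decline switch is what keeps the EUR finite and conservative when `b` exceeds 1, as is often fitted for shale wells in transient flow.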
Funding: National Natural Science Foundation of China (41675029); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX18_0998); Science and Technology Program of Huzhou (2021GZ14, 2020GZ31); Science and Technology (Key) Program of Zhejiang Meteorological Service (2021ZD27).
Abstract: To analyze the effects of gas cannons on clouds and precipitation, multisource observational data, including those from National Centers for Environmental Prediction (NCEP) reanalysis, Hangzhou and Huzhou new-generation weather radars, a laser disdrometer, a ground-based automatic weather station, a wind profiler radar, and the Lin'an C-band dual-polarization radar, were adopted in this study. Based on the variational dual-Doppler wind retrieval method and the polarimetric variables obtained by the dual-polarization radar, we analyzed the microphysical processes and the variations in the macro- and microphysical quantities in clouds from the perspective of the synoptic background before precipitation enhancement, the polarization echo characteristics before, during, and after enhancement, and the evolution of the fine three-dimensional kinematic structure and the microphysical structure. The results show that the precipitation enhancement operation promoted the development of radar echoes and prolonged their duration, and both the horizontal and vertical wind speeds increased. The dual-polarization radar echo showed that the diameter of the precipitation particles increased, and the concentration of raindrops increased after precipitation enhancement. The raindrops were lifted to a height corresponding to 0 to -20 °C due to vertical updrafts. Based on the disdrometer data during precipitation enhancement, the concentration of small raindrops (lg N_w) showed a significant increase, and the mass-weighted diameter D_m decreased, indicating that the precipitation enhancement operation had a certain "lubricating" effect. After the precipitation enhancement, the concentration of raindrops did not change much compared with that during the enhancement process, while D_m increased, corresponding to an increase in rain intensity. The results suggest a positive effect of gas cannons on precipitation enhancement.
Abstract: A new dynamic model identification method is developed for continuous-time series analysis and forward prediction applications. The quantum of data is defined over moving time intervals in sliding window coordinates for compressing the size of stored data while retaining the resolution of information. Quantum vectors are introduced as the basis of a linear space for defining a Dynamic Quantum Operator (DQO) model of the system defined by its data stream. The transport of the quantum of compressed data is modeled between the time interval bins during the movement of the sliding time window. The DQO model is identified from the samples of the real-time flow of data over the sliding time window. A least-squares-fit identification method is used for evaluating the parameters of the quantum operator model, utilizing the repeated use of the sampled data through a number of time steps. The method is tested to analyze and forward-predict air temperature variations accessed from weather data as well as methane concentration variations obtained from measurements of an operating mine. The results show efficient forward prediction capabilities, surpassing those using neural networks and other methods for the same task.
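The general idea of refitting a linear model by least squares over a sliding time window, then predicting forward, can be sketched as follows. This is not the DQO model from the abstract above, whose quantum vectors and operator structure are not given here. It substitutes a plain first-order autoregressive model, x[t+1] ≈ a·x[t] + c, purely to illustrate the sliding-window identification step. Function names are hypothetical.

```python
def fit_ar1(window):
    """Least-squares fit of x[t+1] ≈ a * x[t] + c over one window of samples."""
    xs, ys = window[:-1], window[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # Constant predictor values: fall back to predicting the mean.
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x


def sliding_forecast(series, window_len):
    """One-step-ahead forecasts, refitting the model on each sliding window.

    Entry k is the forecast for the sample that follows
    series[k + window_len - 1]; window_len must be at least 3.
    """
    forecasts = []
    for end in range(window_len, len(series) + 1):
        a, c = fit_ar1(series[end - window_len:end])
        forecasts.append(a * series[end - 1] + c)
    return forecasts
```

Refitting on every window lets the coefficients track slow changes in the underlying process, at the cost of noisier estimates when the window is short.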