Funding: This project is supported by the Special Foundation for Major State Basic Research of China (Project 973, No. G1998030415).
Abstract: In industrial process settings, principal component analysis (PCA) is a common method for data reconciliation. However, PCA is unsuited to nonlinear feature analysis and is therefore of limited use in nonlinear industrial processes. Kernel PCA (KPCA) is an extension of PCA that can be used for nonlinear feature analysis. A nonlinear data reconciliation method based on KPCA is proposed. The basic idea is that the original data are first mapped into a high-dimensional feature space by a nonlinear function and PCA is performed in that feature space; nonlinear feature analysis is then carried out and the data are reconstructed using the kernel. The KPCA-based data reconciliation method is applied to a ternary distillation column. Simulation results show that the method can filter the noise in measurements of a nonlinear process and that the reconciled data represent the true information of the process.
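As a minimal sketch of this idea, the snippet below (Python with scikit-learn's KernelPCA; the RBF kernel, component count, and toy process data are illustrative assumptions, not the paper's settings) maps noisy measurements into feature space, keeps the leading kernel principal components, and reconstructs denoised values via the learned inverse map:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)

# Toy nonlinear "process": two measured variables related through a curve.
t = rng.uniform(-1.0, 1.0, size=(300, 1))
X_true = np.hstack([t, t**2])                           # nonlinear relation
X_meas = X_true + rng.normal(0.0, 0.05, X_true.shape)   # noisy measurements

# Map to feature space with an RBF kernel, keep leading components,
# then reconstruct in the original space (the reconciliation step).
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=2.0,
                 fit_inverse_transform=True, alpha=1e-3)
X_rec = kpca.inverse_transform(kpca.fit_transform(X_meas))

print("RMSE before:", np.sqrt(np.mean((X_meas - X_true) ** 2)))
print("RMSE after: ", np.sqrt(np.mean((X_rec - X_true) ** 2)))
```

Here fit_inverse_transform=True makes KernelPCA learn an approximate pre-image map, which plays the role of the kernel-based reconstruction step described above.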
Abstract: This paper establishes the phase space from spatial series data, discusses the fractal structure of geological data in terms of correlation functions, and studies the chaotic behavior of these data. In addition, it introduces R/S analysis, originally developed for time series, into spatial series analysis to calculate the structural fractal dimensions of the range and standard deviation of spatial series data, to establish the fractal dimension matrix, and to set out the procedure for plotting the fractal dimension anomaly diagram using vector distances of fractal dimensions. Examples of application are given.
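As an illustration of the R/S technique carried from time series into series analysis, a minimal NumPy sketch (the test series and window sizes are assumptions for demonstration) estimates the Hurst exponent H from the slope of log(R/S) against log(window length); for a one-dimensional profile the associated fractal dimension is D = 2 - H:

```python
import numpy as np

def rescaled_range(x: np.ndarray) -> float:
    """R/S statistic of one window: range of cumulative deviations over std."""
    y = np.cumsum(x - x.mean())
    r = y.max() - y.min()
    s = x.std(ddof=1)
    return r / s if s > 0 else 0.0

def hurst_exponent(series: np.ndarray, window_sizes) -> float:
    """Fit log(R/S) vs log(n) over non-overlapping windows of each size n."""
    log_n, log_rs = [], []
    for n in window_sizes:
        windows = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        rs = np.mean([rescaled_range(w) for w in windows])
        log_n.append(np.log(n))
        log_rs.append(np.log(rs))
    slope, _ = np.polyfit(log_n, log_rs, 1)
    return slope

rng = np.random.default_rng(1)
series = rng.normal(size=4096)          # white noise: expected H near 0.5
H = hurst_exponent(series, [16, 32, 64, 128, 256, 512])
print(f"Hurst H = {H:.3f}, fractal dimension D = {2 - H:.3f}")
```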
Funding: Supported by the National Natural Science Foundation of China (Grant No. 10472006).
Abstract: A heteroscedastic regression model was established, and a heteroscedastic regression analysis method is presented for mixed data composed of complete data, type-I censored data, and type-II censored data from a location-scale distribution. Best unbiased estimates of the regression coefficients, as well as confidence limits for the location and scale parameters, are given. Furthermore, point estimates and confidence limits for percentiles are obtained. The traditional multiple regression analysis method, which is applicable only to complete data from a normal distribution, is thus extended to heteroscedastic mixed data and the location-scale distribution, so the presented method has a broad range of promising applications.
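A minimal sketch of the likelihood machinery underlying such methods, assuming a normal location-scale family and type-I right censoring (the paper's unbiased estimators and confidence limits are more elaborate): complete observations contribute the log-density, censored ones the log-survival function:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(2)
mu_true, sigma_true, c = 10.0, 2.0, 11.5   # type-I censoring at fixed level c
y_full = rng.normal(mu_true, sigma_true, 200)
censored = y_full > c                      # observations cut off at c
y = np.where(censored, c, y_full)

def neg_loglik(theta):
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)              # keep sigma positive
    ll_obs = stats.norm.logpdf(y[~censored], mu, sigma).sum()
    ll_cen = stats.norm.logsf(c, mu, sigma) * censored.sum()  # P(Y > c)
    return -(ll_obs + ll_cen)

res = optimize.minimize(neg_loglik, x0=[y.mean(), 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```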
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 51890912, 51979025, and 52011530189).
Abstract: This article presents a micro-structure tensor enhanced elasto-plastic finite element (FE) method to address strength anisotropy in three-dimensional (3D) soil slope stability analysis. The gravity increase method (GIM) is employed to analyze the stability of 3D anisotropic soil slopes. The accuracy of the proposed method is first verified against data in the literature. We then simulate 3D soil slopes with a straight slope surface, and with convex and concave slope surfaces having a 90° turning corner, to study the 3D effect on slope stability and the failure mechanism under anisotropic conditions. Based on our numerical results, the end effect significantly impacts the failure mechanism and the safety factor. The degree of anisotropy notably affects the safety factor, with higher degrees leading to deeper landslides. Concave slopes can be approximated by straight slopes with suitable boundary conditions to assess their stability. Furthermore, a case study of the Saint-Alban test embankment A in Quebec, Canada, demonstrates the applicability of the proposed FE model.
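The GIM itself reduces to a one-dimensional search: gravity is scaled up until the elasto-plastic FE analysis no longer converges, and the safety factor is the critical multiplier. A schematic sketch, assuming a hypothetical user-supplied fe_converges(gravity_factor) predicate that wraps the FE solver:

```python
def gim_safety_factor(fe_converges, lo=1.0, hi=4.0, tol=1e-3) -> float:
    """Bisection on the gravity multiplier: the safety factor is the largest
    factor for which the elasto-plastic FE analysis still converges."""
    if not fe_converges(lo):
        raise ValueError("slope already unstable at unit gravity")
    while fe_converges(hi):        # expand bracket until failure is reached
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fe_converges(mid):
            lo = mid               # still stable: raise the lower bound
        else:
            hi = mid               # failed: lower the upper bound
    return 0.5 * (lo + hi)

# Toy stand-in for the FE solver: stable below a "true" factor of 1.8.
print(gim_safety_factor(lambda g: g < 1.8))
```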
Funding: Supported by the National Natural Science Foundation of China (Grant No. 40775023).
Abstract: A scatterometer is an instrument that provides all-day, large-scale wind field information, and its application, especially to wind retrieval, has long attracted meteorologists. Various factors cause large direction errors, so it is important to determine where the error mainly comes from: the background field, the normalized radar cross-section (NRCS), or the wind retrieval method itself. First, based on SDP2.0, the simulated 'true' NRCS is calculated from the simulated 'true' wind through the geophysical model function NSCAT2. The simulated background field is configured by adding noise to the simulated 'true' wind under a non-divergence constraint, and the simulated 'measured' NRCS is formed by adding noise to the simulated 'true' NRCS. Sensitivity experiments are then conducted, and a new regularization method is used to improve ambiguity removal in simulation experiments. The results show that the accuracy of wind retrieval is more sensitive to noise in the background than in the measured NRCS; compared with the two-dimensional variational (2DVAR) ambiguity removal method, the accuracy of wind retrieval can be improved with the new Tikhonov regularization method by choosing an appropriate regularization parameter, especially when the background error is large. This work provides important information and a new method for wind retrieval with real data.
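In its simplest linear form, the Tikhonov idea used here penalizes departure from the background field, solving min_x ||Ax - b||^2 + lambda*||x - x_b||^2; a toy NumPy sketch (the observation operator, sizes, and noise levels are invented for illustration) shows the role of the regularization parameter:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 50, 30
A = rng.normal(size=(m, n))              # toy observation operator
x_true = np.sin(np.linspace(0, 3 * np.pi, n))
b = A @ x_true + rng.normal(0, 0.1, m)   # noisy "measured" data
x_b = x_true + rng.normal(0, 0.3, n)     # noisy background field

def tikhonov(A, b, x_b, lam):
    """Closed-form solution of (A^T A + lam I) x = A^T b + lam x_b."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b + lam * x_b)

for lam in (0.01, 0.1, 1.0, 10.0):
    x_hat = tikhonov(A, b, x_b, lam)
    rmse = np.linalg.norm(x_hat - x_true) / np.sqrt(n)
    print(f"lambda = {lam:5.2f}  RMSE = {rmse:.4f}")
```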
Abstract: This paper concentrates on methods for comparing activity units found relatively efficient by data envelopment analysis (DEA). The basic DEA models do not provide direct information regarding the relative performance of such units. The paper provides a systematic framework of alternative ways for ranking DEA-efficient units. The framework contains criteria derived as by-products of the basic DEA models, as well as criteria derived from complementary DEA analyses that need to be carried out. The proposed framework is applied to rank a set of relatively efficient restaurants on the basis of their market efficiency.
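For context, the basic input-oriented CCR efficiency score behind such rankings is itself a small linear program; a sketch with scipy.optimize.linprog on made-up input/output data (the envelopment form: minimize theta subject to X'lambda <= theta*x_k, Y'lambda >= y_k, lambda >= 0):

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: rows are DMUs (e.g. restaurants), columns are inputs/outputs.
X = np.array([[2.0, 3.0], [4.0, 1.0], [3.0, 2.0], [5.0, 4.0]])  # inputs
Y = np.array([[1.0], [1.0], [1.5], [2.0]])                      # outputs

def ccr_efficiency(k: int) -> float:
    """Input-oriented CCR score of DMU k via the envelopment LP."""
    n, m = X.shape                 # n DMUs, m inputs
    s = Y.shape[1]                 # s outputs
    c = np.r_[1.0, np.zeros(n)]    # decision variables: [theta, lambda]
    # Input rows:  sum_j lambda_j * X[j] - theta * x_k <= 0
    A_in = np.hstack([-X[k].reshape(m, 1), X.T])
    # Output rows: -sum_j lambda_j * Y[j] <= -y_k
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(m), -Y[k]], bounds=(0, None))
    return res.fun

for k in range(len(X)):
    print(f"DMU {k}: efficiency = {ccr_efficiency(k):.3f}")
```

Units scoring 1.0 are DEA-efficient and are exactly the ones the paper's ranking framework then needs to differentiate.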
Abstract: DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes from a mouse gene expression dataset were chosen under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were fed into the model, preliminary results were calculated, and permutation tests were performed, with GSEA used to validate the preliminary results biologically. The final dataset consists of 20 groups of gene expression data from pneumococcal infection; it categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
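A minimal sketch of fitting a per-gene mixed-effects model with statsmodels (the toy data, column names, and random-intercept-per-chip structure are illustrative assumptions; the paper's model and the GSEA validation are more involved):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)

# Toy expression data: 8 arrays (4 infected, 4 control), 3 spots per array,
# with a random per-chip effect on top of the infection effect.
arrays = np.repeat(np.arange(8), 3)
infected = (arrays < 4).astype(float)
array_effect = rng.normal(0, 0.3, 8)[arrays]     # random effect per chip
expr = 1.0 + 0.8 * infected + array_effect + rng.normal(0, 0.2, arrays.size)

df = pd.DataFrame({"expr": expr, "infected": infected, "array": arrays})

# Fixed effect: infection status; random intercept: microarray chip.
fit = smf.mixedlm("expr ~ infected", df, groups=df["array"]).fit()
print(fit.summary())
```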
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2023YFF0718003) and the Key Research and Development Plan Project of Yunnan Province (Grant No. 202303AA080006).
Abstract: Strong noise has increasingly become a bottleneck restricting the precision and applicability of electromagnetic exploration methods. Suppressing noise and extracting effective electromagnetic response information against a strong noise background is a crucial scientific task. To solve the noise suppression problem of the controlled-source electromagnetic method in strongly interfered areas, we propose a data processing approach based on 2D k-means clustering in the complex plane. Exploiting the stability of the controlled-source signal response, clustering analysis is applied to classify the spectra of different sources and noises across multiple time segments. Identifying the power spectra with controlled-source characteristics helps improve the quality of the extracted controlled-source response. This paper presents the principle and workflow of the proposed algorithm and demonstrates its feasibility and effectiveness through synthetic and real data examples. The results show that, compared with a conventional robust denoising method, the clustering algorithm suppresses common noise more strongly, can identify high-quality signals, and improves the quality of preprocessed controlled-source electromagnetic data.
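A sketch of the core clustering step under simplifying assumptions (synthetic spectra, two clusters, one frequency): each time segment's spectral value is a point in the complex plane, and k-means on its real and imaginary parts separates the stable controlled-source cluster from scattered noise:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)

# Synthetic spectral values at one frequency over 200 time segments:
# a stable controlled-source response plus scattered noise segments.
signal = 3.0 + 1.0j + 0.1 * (rng.normal(size=120) + 1j * rng.normal(size=120))
noise = 2.0 * (rng.normal(size=80) + 1j * rng.normal(size=80))
spectra = np.concatenate([signal, noise])

# k-means in the complex plane: cluster on (Re, Im) coordinates.
features = np.column_stack([spectra.real, spectra.imag])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# The controlled-source cluster is the tighter one (stable response).
spreads = [features[labels == k].std() for k in (0, 1)]
keep = int(np.argmin(spreads))
print(f"kept {np.sum(labels == keep)} segments as controlled-source response")
```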
Abstract: With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing decision-making. This study explores development strategies for real-time data analysis and decision-support systems and analyzes their current applications and future development trends across industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
Abstract: This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. Drawing on bank big data collection and processing, it shows that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, it is necessary to strengthen data collection management, enhance data processing capabilities, and innovate big data application models; the paper thereby provides a reference for bank big data practices and promotes the transformation and upgrading of the banking industry in the context of legal digital currencies.
Abstract: With the accelerating intelligent transformation of energy systems, monitoring of equipment operating status and optimization of production processes in thermal power plants face the challenge of integrating multi-source heterogeneous data. Considering the heterogeneous characteristics of physical sensor data, including temperature, vibration, and pressure generated by boilers, steam turbines, and other key equipment, together with real-time operating-condition data from the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines Kalman filtering, wavelet transforms, and Bayesian estimation to address time-series alignment and dimensional differences in the data. Simulation results show that the data fusion accuracy can exceed 98% and the computation delay can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant dispersion model, supporting cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 s, and the data consistency verification accuracy reaches 99.5%.
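As a small illustration of the Kalman-filter component of such a fusion module, the sketch below fuses two noisy sensor streams measuring one drifting quantity with a scalar Kalman filter (sensor variances and dynamics are made-up assumptions, not the platform's parameters):

```python
import numpy as np

rng = np.random.default_rng(6)

# True slowly drifting quantity (e.g. a temperature) and two noisy sensors.
truth = 500 + np.cumsum(rng.normal(0, 0.1, 300))
z1 = truth + rng.normal(0, 2.0, 300)     # noisy sensor
z2 = truth + rng.normal(0, 0.5, 300)     # better sensor

x, p = z1[0], 4.0                        # state estimate and its variance
q = 0.1**2                               # process noise variance
est = []
for k in range(300):
    p += q                               # predict: random-walk dynamics
    for z, r in ((z1[k], 2.0**2), (z2[k], 0.5**2)):
        gain = p / (p + r)               # update with each sensor in turn
        x += gain * (z - x)
        p *= (1 - gain)
    est.append(x)

print("RMSE sensor 1:", np.sqrt(np.mean((z1 - truth) ** 2)))
print("RMSE fused:   ", np.sqrt(np.mean((np.array(est) - truth) ** 2)))
```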
Funding: Sponsored by the Youth Project of the National Social Science Foundation of China (21CTY007) and the Special Fund for Basic Scientific Research Business Expenses of Central Universities (2024DAWH008).
Abstract: Taking Beijing Huilongguan Sports and Cultural Park as the research object, this study investigated public service satisfaction in the park using the Importance-Performance Analysis (IPA) method. A questionnaire covering six dimensions, including public transportation, sanitation and environment, and supporting facility construction, was designed. A total of 208 valid samples were collected, and SPSS was employed for reliability and validity tests as well as IPA analysis. The findings were as follows: (1) Visitors were generally quite satisfied with the overall public services in the park. (2) The highest satisfaction levels were observed for sanitation and environment services and the sports and cultural atmosphere, while lower satisfaction was noted for supporting facility construction and public information services. (3) The advantage enhancement zone includes sanitation and environment services and the sports and cultural atmosphere; the continuous maintenance zone includes public transportation services and security management and maintenance; the subsequent opportunity zone includes supporting facility construction and public information services; and no dimensions fall in the urgent improvement zone. The study recommends strengthening service content from three aspects: enhancing facilities with sports as the core, optimizing services with a people-centered approach, and upgrading the information platform through technology. Additionally, a multi-stakeholder collaborative mechanism, with the government coordinating policy resources, the operator improving implementation efficiency, and the public participating in supervision and evaluation, is proposed to drive improvement of public service quality at the park.
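The IPA step itself is easy to reproduce: each dimension is placed by mean importance against mean performance, and the grand means split the plane into four quadrants. A sketch on invented scores (the dimension names follow the study; the numbers and the classic quadrant labels used here are illustrative, not the study's data or zone names):

```python
import pandas as pd

# Mean importance / performance per dimension (illustrative numbers only).
df = pd.DataFrame({
    "importance":  [4.5, 4.6, 4.2, 4.1, 4.4, 4.0],
    "performance": [4.4, 4.5, 3.6, 3.5, 4.2, 4.3],
}, index=["sanitation & environment", "sports & cultural atmosphere",
          "supporting facilities", "public information",
          "public transportation", "security management"])

imp_mean, perf_mean = df["importance"].mean(), df["performance"].mean()

def quadrant(row) -> str:
    hi_imp = row["importance"] >= imp_mean
    hi_perf = row["performance"] >= perf_mean
    if hi_imp and hi_perf:
        return "keep up the good work"
    if hi_imp:
        return "concentrate here"
    if hi_perf:
        return "possible overkill"
    return "low priority"

print(df.assign(quadrant=df.apply(quadrant, axis=1)))
```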
Abstract: In the contemporary era, characterized by the Internet and digitalization, the operation and application of digital currency have gradually developed into a comprehensive structural system. This system restores the essential characteristics of currency while providing auxiliary services related to the formation, circulation, storage, application, and promotion of digital currency. Compared with traditional currency management technologies, the big data analysis technology embedded in digital currency systems enables rapid acquisition of information, facilitating the identification of standard associations within currency data and providing technical support for the operational framework of digital currency.
Abstract: Fracture volume gradually changes with the depletion of fracture pressure during production. However, few flowback models are available that can estimate fracture volume loss using pressure-transient and rate-transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into parent wells before fracturing of child wells. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes; such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material balance time (MBT) diagnostic plots using both initial and second flowback data (FB1 and FB2, respectively) for six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FB2 period is higher than that during the FB1 period, indicating a potential loss of fracture volume from FB1 to FB2. We estimate the changes in effective fracture volume (Vef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods; Vef during FB2 is in general 3%-45% lower than during FB1. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is compaction drive, but its contribution is reduced by 16% in the FB2 period; this drop is generally compensated by a relatively higher HDI. The loss of effective fracture volume might be attributed to pressure depletion in the fractures, which occurs during the production period and can extend over 800 days.
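For reference, the diagnostic quantities are simple: RNP = (p_i - p_wf)/q and MBT = Q/q (cumulative volume over rate), and in a tank-type material balance the RNP-vs-MBT slope scales roughly with 1/(c_t * V_fracture), so a steeper slope suggests a smaller effective volume. A sketch on synthetic flowback data (all values invented for illustration):

```python
import numpy as np

def rnp_mbt_slope(t_hr, q, p_wf, p_i):
    """RNP = (p_i - p_wf)/q, MBT = cumulative volume / rate;
    returns the slope of RNP vs MBT, which scales with 1/(c_t * V)."""
    Q = np.cumsum(q * np.diff(np.r_[0.0, t_hr]))   # cumulative production
    rnp = (p_i - p_wf) / q
    mbt = Q / q
    slope, _ = np.polyfit(mbt, rnp, 1)
    return slope

t = np.linspace(1, 240, 60)                        # hours
q = 50 * t**-0.3                                   # declining flowback rate
p_i = 5000.0
p_wf_fb1 = p_i - 0.8 * np.cumsum(q) / 10           # initial flowback (FB1)
p_wf_fb2 = p_i - 1.2 * np.cumsum(q) / 10           # second flowback (FB2)

s1 = rnp_mbt_slope(t, q, p_wf_fb1, p_i)
s2 = rnp_mbt_slope(t, q, p_wf_fb2, p_i)
print(f"slope FB1 = {s1:.3f}, slope FB2 = {s2:.3f} (steeper -> smaller volume)")
```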
Funding: Supported by the PetroChina Science and Technology Project (2023ZZ0202).
Abstract: Formation water samples in oil and gas fields may be contaminated during testing, trial production, collection, storage, transportation, and analysis, so that the properties of the formation water are not truly reflected. This paper discusses identification methods and a data credibility evaluation method for formation water in oil and gas fields of petroliferous basins within China. The results show that: (1) the identification methods for formation water include basic single-factor methods, such as physical characteristics, water composition characteristics, water type characteristics, and characteristic coefficients, as well as a comprehensive data credibility evaluation method proposed on this basis, which relies mainly on correlation analysis of the sodium chloride coefficient and the desulfurization coefficient combined with evaluation of the geological background; (2) the basic identification methods enable preliminary identification of hydrochemical data and preliminary screening of data on site, while the proposed comprehensive method classifies CaCl2-type water into types A-I to A-VI and NaHCO3-type water into types B-I to B-IV, so that researchers can evaluate the credibility of hydrochemical data in depth and analyze the influencing factors; (3) when the basic methods are used, formation water containing anions such as CO3^2-, OH^-, and NO3^-, or formation water whose sodium chloride coefficient and desulfurization coefficient do not match the geological setting, has been invaded by surface water or polluted by working fluid; (4) when the comprehensive method is used, the data credibility of A-I, A-II, B-I, and B-II formation water, although believed to be high, can be evaluated effectively and accurately only in combination with analysis of the geological setting in respect of factors such as the formation environment, sampling conditions, condensate water, acid fluids, leaching of ancient weathering crusts, and ancient atmospheric fresh water.
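For reference, the two characteristic coefficients named above are simple milliequivalent ratios; the sketch below uses the definitions common in hydrochemistry practice, rNa+/rCl- for the sodium chloride coefficient and 100*rSO4^2-/rCl- for the desulfurization coefficient (r denoting milliequivalent concentration), which are assumed here rather than quoted from the paper:

```python
# Milliequivalent concentrations (meq/L) of a water sample (made-up values).
sample = {"Na+": 820.0, "Cl-": 900.0, "SO4^2-": 4.5}

def sodium_chloride_coeff(s) -> float:
    """rNa+ / rCl-: higher values can suggest surface-water influence for
    deep CaCl2-type formation water; well-sealed brines tend to run lower."""
    return s["Na+"] / s["Cl-"]

def desulfurization_coeff(s) -> float:
    """100 * rSO4^2- / rCl-: low values indicate reducing, well-preserved
    conditions; high values can flag surface water or working-fluid pollution."""
    return 100.0 * s["SO4^2-"] / s["Cl-"]

print(f"sodium chloride coefficient: {sodium_chloride_coeff(sample):.2f}")
print(f"desulfurization coefficient: {desulfurization_coeff(sample):.2f}")
```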
Abstract: When assessing seismic liquefaction potential with data-driven models, addressing the uncertainties in model construction, interpretation of cone penetration test (CPT) data, and decision thresholds is crucial for avoiding biased data selection, ameliorating overconfident models, and remaining flexible to varying practical objectives, especially when the training and testing data are not identically distributed. A workflow leveraging Bayesian methodology is proposed to address these issues. Employing a multi-layer perceptron (MLP) as the foundational model, this approach was benchmarked against empirical methods and advanced algorithms for its simplicity, accuracy, and resistance to overfitting. The analysis revealed that, while MLP models optimized via a maximum a posteriori algorithm suffice for straightforward scenarios, Bayesian neural networks show great potential for preventing overfitting. Additionally, integrating decision thresholds derived from various evaluative principles offers insight for challenging decisions. Two case studies demonstrate the framework's capacity for nuanced interpretation of in situ data, employing a model committee for detailed evaluation of liquefaction potential via Monte Carlo simulations and basic statistics. Overall, the proposed step-by-step workflow for analyzing seismic liquefaction incorporates multifold testing and real-world data validation, showing improved robustness against overfitting and greater versatility in addressing practical challenges. This research contributes to the field of seismic liquefaction assessment by providing a structured, adaptable methodology for accurate and reliable analysis.
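A reduced sketch of the model-committee idea on synthetic CPT-like features (scikit-learn's MLPClassifier; a committee of identically configured, differently seeded networks whose vote spread stands in for predictive uncertainty; the Bayesian neural networks in the paper go further):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(7)

# Synthetic two-feature data standing in for CPT-derived predictors
# (e.g. normalized tip resistance and cyclic stress ratio).
X = rng.normal(size=(400, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 400) > 0).astype(int)

# Committee: same architecture, different random initializations.
committee = [MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                           random_state=s).fit(X, y) for s in range(10)]

x_new = np.array([[0.2, -0.1]])
probs = np.array([m.predict_proba(x_new)[0, 1] for m in committee])
print(f"P(liquefaction) = {probs.mean():.2f} +/- {probs.std():.2f}")
print("triggered" if probs.mean() > 0.5 else "not triggered",
      "(threshold 0.5; other decision thresholds can be substituted)")
```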
Abstract: Data hiding methods embed secret messages into cover objects to enable covert communication in a way that is difficult to detect. In data hiding methods based on image interpolation, the image size is reduced and then enlarged through interpolation, and secret data are embedded into the newly generated pixels. A general improvement for embedding secret messages is proposed. The approach may be regarded as a general model for enhancing the data embedding capacity of various existing image interpolation-based data hiding methods. The enhancement is achieved by expanding the range of pixel values available for embedding secret messages, removing the limitation of many existing methods, in which the range is restricted to powers of two to facilitate the direct embedding of bit-based messages. The improvement is accomplished by applying multiple-based (mixed-radix) number conversion to the secret message data: the message bits are converted into a multiple-based number, and an algorithm embeds each digit of this number into an individual pixel, thereby enhancing the message embedding efficiency, as proved by a theorem derived in this study. The proposed improvement was tested through experiments on three well-known image interpolation-based data hiding methods. The results show that it enhances the three data embedding rates by approximately 14%, 13%, and 10%, respectively, creates stego-images of good quality, and resists RS steganalysis attacks. These results indicate that using multiple-based number conversion to improve the three interpolation-based embedding methods increases the number of message bits embedded in the images. For the many other image interpolation-based data hiding methods that use power-of-two pixel-value ranges for message embedding, the proposed improvement is also expected to be effective in enhancing their data embedding capacity.
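The core conversion is easy to state: treat the message bits as one integer and rewrite it in a mixed-radix ('multiple-based') system whose i-th base is the number of values pixel i can carry, so capacity is not wasted on non-power-of-two ranges. A sketch of the conversion and its inverse (the per-pixel bases are assumptions for illustration; the paper's embedding algorithm and capacity theorem cover the rest):

```python
def to_multiple_based(message_bits: str, bases: list[int]) -> list[int]:
    """Convert a bit string to digits of a mixed-radix number, where
    digit i lies in [0, bases[i]) and can be embedded in pixel i."""
    value = int(message_bits, 2)
    digits = []
    for b in bases:
        value, d = divmod(value, b)
        digits.append(d)
    if value:
        raise ValueError("message too long for the available pixel capacity")
    return digits

def from_multiple_based(digits: list[int], bases: list[int], n_bits: int) -> str:
    """Invert the conversion during extraction."""
    value = 0
    for d, b in zip(reversed(digits), reversed(bases)):
        value = value * b + d
    return format(value, f"0{n_bits}b")

bases = [3, 5, 6, 3, 7, 4]        # per-pixel embeddable ranges (illustrative)
bits = "110101101"                 # secret message
digits = to_multiple_based(bits, bases)
print(digits, "->", from_multiple_based(digits, bases, len(bits)))
```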
Abstract: With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. Big data technology provides a powerful means for in-depth analysis of these data and for insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology to user behavior analysis on e-commerce platforms, covering the technical methods of data collection, storage, processing, and analysis, as well as specific applications in user profiling, precision marketing, personalized recommendation, and user retention and churn analysis, and discusses the challenges faced in these applications and the corresponding countermeasures. Through actual cases, it demonstrates the remarkable effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and the user experience.
Funding: Supported by the National Key Research and Development Program of China (Project No. 2023YFC2307500).
Abstract: Influenza, an acute respiratory infectious disease caused by the influenza virus, exhibits distinct seasonal patterns in China, with peak activity occurring in winter and spring in northern regions and in winter and summer in southern areas [1]. The World Health Organization (WHO) emphasizes that early warning and epidemic intensity assessment are critical public health strategies for influenza prevention and control. Internet-based flu surveillance, with its real-time data and low cost, effectively complements traditional methods. The Baidu Search Index, which reflects flu-related queries, correlates strongly with influenza trends, aiding regional activity assessment and outbreak tracking [2].
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12273080).
Abstract: To address fuzzy matching and association when optical observation data are matched against the orbital elements in a catalog database, this paper proposes a matching and association strategy based on the arc-segment difference method. First, a matching error threshold is set to match the observation data against the known catalog database. Second, the matching results for the same day are sorted by target identity and observation residuals. Different matching error thresholds and dynamic arc-segment association thresholds are then applied to categorize the observation residuals of the same target across different arc-segments, yielding matching results under various thresholds. Finally, the orbital residual is computed through orbit determination (OD), and the positional error is derived by comparing the OD results with the orbit track from the catalog database. An appropriate matching error threshold is then selected on the basis of these results, leading to the final matching and association of the fuzzy correlation data. Experimental results showed that the correct matching rate for data arc-segments is 92.34% when the matching error threshold is set to 720″, with the arc-segment difference method achieving an average matching rate of 97.62% within 8 days. The remaining 5.28% of the fuzzy correlation data are correctly matched and associated, enabling identification of orbital maneuver targets through further processing and analysis. This method substantially enhances the efficiency and accuracy of space target cataloging, offering robust technical support for dynamic maintenance of the space target database.
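A schematic sketch of the first matching step under strong simplifications (angular residuals are assumed precomputed for each observation-catalog pair; the 720″ threshold comes from the experiments, while the arc-segment bookkeeping, dynamic thresholds, and orbit determination are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(8)

# Angular residuals (arcsec) between 10 observations and 5 catalog targets;
# each observation gets one plausibly small residual against its true target.
residuals = rng.uniform(0, 3000, size=(10, 5))
residuals[np.arange(10), rng.integers(0, 5, 10)] = rng.uniform(0, 400, 10)

THRESHOLD_ARCSEC = 720.0   # matching error threshold used in the experiments

best = residuals.argmin(axis=1)
best_res = residuals[np.arange(10), best]
matched = best_res <= THRESHOLD_ARCSEC

for i in range(10):
    if matched[i]:
        print(f"obs {i}: target {best[i]} ({best_res[i]:.0f} arcsec)")
    else:
        print(f"obs {i}: fuzzy -> arc-segment difference analysis")
```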