In recent years, the rapid decline of Arctic sea ice area (SIA) and sea ice extent (SIE), especially of multiyear (MY) ice, has had a significant effect on climate change. Accurate retrieval of the MY ice concentration is important, and challenging, for understanding the ongoing changes. Three MY ice concentration retrieval algorithms were systematically evaluated. These algorithms yielded similar total ice concentrations, while the retrieved MY sea ice concentrations differed from each other. The MY SIA derived from the NASA TEAM algorithm is relatively stable, whereas the other two algorithms produced seasonal fluctuations of MY SIA, particularly in autumn and winter. In this paper, we propose an ice concentration retrieval algorithm that extends the NASA TEAM algorithm by additionally using AMSR-E 6.9 GHz brightness temperature data and the sea ice concentration derived from 89.0 GHz data. Comparison with the reference MY SIA indicates that the mean difference and root mean square (rms) difference of the MY SIA derived from the proposed algorithm are 0.65×10^6 km^2 and 0.69×10^6 km^2 from January to March, and -0.06×10^6 km^2 and 0.14×10^6 km^2 from September to December, respectively. Comparison with the MY SIE obtained from weekly ice age data provided by the University of Colorado shows that the mean difference and rms difference are 0.69×10^6 km^2 and 0.84×10^6 km^2, respectively. The algorithm proposed in this study yields smaller differences from the reference MY ice and from the MY SIE based on ice age data than the Wang, Lomax, and NASA TEAM algorithms.
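The NASA TEAM family of algorithms is built on brightness-temperature ratios. As a minimal illustration (not the extended 6.9/89.0 GHz algorithm proposed above), the standard polarization and spectral gradient ratios can be computed as below; the channel values used in testing are made-up numbers, and mapping these ratios to first-year and MY ice concentrations additionally requires published tie-point coefficients that are omitted here.

```python
def polarization_ratio(tb19v, tb19h):
    """PR(19): polarization ratio between the 19 GHz V and H channels."""
    return (tb19v - tb19h) / (tb19v + tb19h)

def gradient_ratio(tb37v, tb19v):
    """GR(37V/19V): spectral gradient ratio between 37 and 19 GHz V channels."""
    return (tb37v - tb19v) / (tb37v + tb19v)
```

In the full algorithm, total and MY concentrations are rational functions of PR and GR whose coefficients come from tie-point brightness temperatures for open water, first-year ice, and MY ice.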
Based on atmospheric horizontal visibility data from forty-seven observational stations along the eastern coast of China near the Taiwan Strait and simultaneous NOAA/AVHRR multichannel satellite data from January 2001 to December 2002, the spectral characteristics associated with visibility were investigated. Visibility was successfully retrieved from the multichannel NOAA/AVHRR data using the principal component regression (PCR) method, and a sample retrieved visibility distribution is discussed for a sea fog event. The correlation coefficient between the observed and retrieved visibility was about 0.82, far above the 99.9% confidence level by statistical test. The retrieval succeeded in 94.98% of the 458 cases during 2001-2002. The error distribution showed that high visibilities were usually underestimated and low visibilities overestimated; the relative error between the observed and retrieved visibilities was about 21.4%.
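Principal component regression can be sketched as follows: project the centered predictors onto their leading principal components, regress the response on the component scores, and map the coefficients back to the original predictor (channel) space. This is a generic PCR sketch, not the authors' exact channel selection or preprocessing.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Fit PCR with k principal components; returns coefficients in the
    original predictor space plus the centering constants."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Principal directions from the SVD of the centered predictors
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                     # loadings of the first k components
    scores = Xc @ V_k                  # component scores
    # Least-squares regression of the centered response on the scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return V_k @ gamma, x_mean, y_mean # coefficients back in channel space

def pcr_predict(X, beta, x_mean, y_mean):
    return (X - x_mean) @ beta + y_mean
```

With k equal to the full predictor rank, PCR reduces to ordinary least squares; choosing a smaller k discards noisy low-variance directions, which is the point of using it on correlated satellite channels.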
In monitoring systems, multiple sensor nodes can detect a single target of interest simultaneously, and the data collected are usually highly correlated and redundant. If each node sends its data to the base station, energy is wasted and the network's energy is depleted quickly. Data aggregation is an important paradigm for compressing data so that the energy of the network is spent efficiently. In this paper, a novel data aggregation algorithm called Redundancy Elimination for Accurate Data Aggregation (READA) is proposed. By exploiting the spatial correlations of data in the network, READA applies a grouping and compression mechanism to remove duplicate data from the aggregated set sent to the base station without greatly reducing the accuracy of the final aggregated data. One peculiarity of READA is that it uses a prediction model derived from cached values to confirm whether an outlier actually corresponds to an event that has occurred. The simulations conducted show that READA largely preserves the accuracy of the data while taking into consideration the energy dissipated in aggregating the data.
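A much-simplified sketch of the two ideas above: collapsing near-duplicate readings before transmission, and confirming an outlier against a prediction built from cached values. READA's actual grouping and prediction model are more elaborate; the tolerance and threshold values here are illustrative.

```python
def aggregate_readings(readings, tol=0.5):
    """Collapse near-duplicate sensor readings: a value within `tol` of the
    current group representative is treated as redundant and dropped."""
    groups = []  # representative value per group
    for v in sorted(readings):
        if groups and abs(v - groups[-1]) <= tol:
            continue  # redundant reading, already covered by a representative
        groups.append(v)
    return groups

def is_event(value, cache, threshold=3.0):
    """Confirm an outlier against a cache-based prediction (here a simple
    mean): only readings far from the prediction count as real events."""
    prediction = sum(cache) / len(cache)
    return abs(value - prediction) > threshold
```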
Based on surface-related multiple elimination (SRME), this research derives methods for eliminating multiples in the inverse data space. Inverse data processing means moving seismic data from the forward data space (FDS) to the inverse data space (IDS). Surface-related multiples and primaries can then be separated in the IDS, since surface-related multiples form a focused region there. Muting the multiple energy achieves multiple elimination while avoiding the damage to primary energy that occurs during adaptive subtraction. Randomized singular value decomposition (RSVD) is used to increase calculation speed and improve accuracy in the conversion from FDS to IDS. A synthetic shot record over a salt dome model shows that the relationship between primaries and multiples is simple and clear, and that RSVD can easily eliminate multiples while preserving primary energy. Compared with conventional multiple elimination methods and ordinary inverse-data-space methods, this technique has the advantages of high calculation speed and reliable results.
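Randomized SVD itself can be sketched generically (in the style of Halko et al.): sample the column space of the matrix with a random test matrix, orthonormalize the sketch, and take an exact SVD of the much smaller projected matrix. This is a generic RSVD sketch, not the paper's FDS-to-IDS implementation.

```python
import numpy as np

def rsvd(A, k, oversample=10, seed=0):
    """Rank-k randomized SVD of A."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    # Random test matrix; oversampling improves the subspace capture
    Omega = rng.standard_normal((n, k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)     # orthonormal basis for the sketch
    B = Q.T @ A                        # small matrix carrying A's action
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                         # lift back to the original space
    return U[:, :k], s[:k], Vt[:k]
```

For a matrix of effective rank k, the factorization reconstructs the input to machine precision while only ever decomposing a (k + oversample)-column matrix, which is where the speed advantage comes from.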
Internal multiples are commonly present in seismic data due to variations in the velocity or density of subsurface media. They can reduce the signal-to-noise ratio of seismic data and degrade image quality. As seismic exploration moves to deep and ultra-deep targets, especially the complex targets in the western region of China, internal multiple elimination becomes increasingly challenging. Currently, three-dimensional (3D) seismic data are primarily used for oil and gas target recognition and drilling. Effectively eliminating internal multiples in 3D seismic data over complex structures, and mitigating their adverse effects, is crucial for enhancing the success rate of drilling. In this study, we propose an internal multiple prediction algorithm for 3D seismic data over complex structures using Marchenko autofocusing theory. The method predicts internal multiples with accurate traveltimes without requiring an accurate velocity model, and its implementation consists of several steps. First, direct waves are simulated with a 3D macroscopic velocity model. Second, the direct waves and the full 3D seismic acquisition records are used to obtain the up-going and down-going Green's functions between virtual source points and the surface. Third, the internal multiples of the relevant layers are constructed from the up-going and down-going Green's functions. Finally, adaptive matching subtraction removes the predicted internal multiples from the original data to obtain multiple-free seismic records. Compared with the two-dimensional (2D) Marchenko algorithm, the 3D Marchenko algorithm's internal multiple prediction is significantly enhanced, resulting in higher computational accuracy. Numerical simulation results indicate that the proposed method can effectively eliminate internal multiples in 3D seismic data, and thus has important theoretical and industrial application value.
This paper uses simulated satellite channel brightness temperatures to study M-estimator variational retrieval. The approach combines the advantages of classical variational inversion with those of robust M-estimators. Classical variational inversion depends on prior quality control to eliminate outliers, and assumes its errors follow a Gaussian distribution. We couple M-estimators into the framework of classical variational inversion to obtain an M-estimator variational inversion: the cost function contains the M-estimator, which guarantees robustness to outliers and improves the retrieval results. The experimental evaluation adds Gaussian and non-Gaussian errors to Feng Yun-3A (FY-3A) simulated data. The variational inversion is used to obtain the inverted brightness temperature, and temperature and humidity data are used for validation. The preliminary results demonstrate the potential of M-estimator variational retrieval.
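The robust-cost idea can be illustrated with the Huber penalty, a common M-estimator: quadratic near zero (Gaussian-like), linear in the tails (bounding the influence of outliers). The abstract does not say which M-estimator the paper uses, so the Huber choice, the one-dimensional background term, and all values below are illustrative assumptions.

```python
import numpy as np

def huber_rho(r, delta=1.0):
    """Huber penalty: 0.5 r^2 for |r| <= delta, linear growth beyond."""
    small = np.abs(r) <= delta
    return np.where(small, 0.5 * r ** 2, delta * (np.abs(r) - 0.5 * delta))

def robust_cost(x, xb, obs, h, b_var, delta=1.0):
    """1D variational cost: quadratic (Gaussian) background term plus a
    Huber observation term replacing the usual quadratic misfit."""
    background = 0.5 * (x - xb) ** 2 / b_var
    residuals = obs - h(x)
    return background + huber_rho(residuals, delta).sum()
```

Minimizing this cost downweights observations with large residuals instead of requiring a separate quality-control pass to discard them.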
Operation control of power systems has become challenging with the increasing scale and complexity of power distribution systems and extensive access to renewable energy. Improving the capabilities of data-driven operation management, intelligent analysis, and mining is therefore urgently required. To explore similar regularities in historical operating sections of the power distribution system and to help the power grid systematically accumulate high-value historical operation and maintenance experience and knowledge, a neural information retrieval model with an attention mechanism is proposed based on graph data computing technology. Based on the processing flow of power distribution system operating data, a technical framework for neural information retrieval is established. Combined with the natural graph characteristics of the power distribution system, a unified graph data structure and a data fusion method covering data access, data complement, and multi-source data are constructed. Further, a graph node feature-embedding representation learning algorithm and a neural information retrieval algorithm model are constructed. The neural information retrieval model is trained and tested using the generated set of graph node feature representation vectors, and verified on operating sections of the power distribution system of a provincial grid area. The results show that the proposed method achieves high accuracy in the similarity matching of historical operation characteristics and effectively supports intelligent fault diagnosis and elimination in power distribution systems.
Big data analytics in business intelligence often lacks effective data retrieval methods and job scheduling, causing execution inefficiency and low system throughput. This paper aims to enhance data retrieval and job scheduling to speed up big data analytics and overcome these inefficiency and throughput problems. First, integrating a stacked sparse autoencoder with Elasticsearch indexing enables fast, distributed data searching, which reduces the search scope of the database and dramatically speeds up lookups. Next, a deep neural network predicts the approximate execution time of each job, enabling prioritized scheduling based on shortest job first, which reduces the average waiting time of job execution. As a result, the proposed data retrieval approach outperforms a previous method using a deep autoencoder and Solr indexing, improving the speed of data retrieval by up to 53% and increasing system throughput by 53%. The proposed job scheduling algorithm also outperforms both first-in-first-out and memory-sensitive heterogeneous earliest-finish-time scheduling, shortening the average waiting time by up to 5% and the average weighted turnaround time by 19%, respectively.
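The scheduling half of the pipeline can be sketched as shortest-job-first over predicted runtimes; here `predict` is a stand-in for the paper's deep-network execution-time estimator, and the job names and times below are made up.

```python
import heapq

def schedule_sjf(jobs, predict):
    """Run jobs in order of predicted execution time (shortest first),
    returning the execution order and each job's waiting time."""
    heap = [(predict(j), i, j) for i, j in enumerate(jobs)]
    heapq.heapify(heap)
    order, waits, clock = [], {}, 0.0
    while heap:
        est, _, job = heapq.heappop(heap)
        waits[job] = clock      # time the job spent waiting to start
        clock += est
        order.append(job)
    return order, waits
```

With predicted times a: 3, b: 1, c: 2, SJF runs b, c, a with waits 0, 1, 3 (average 4/3), versus waits 0, 3, 4 (average 7/3) for the FIFO order a, b, c, which is the mechanism behind the reported waiting-time reduction.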
As a study of finance and economics information, we examine real-time financial news posted on authoritative sites in the world's major advanced economies. Analyzing massive volumes of financial news from different information sources and language origins, we develop a basic theoretical model and an algorithm for financial news that supports intelligent collection, quick access, deduplication, correction, and integration with the news' background information. Furthermore, we can find connections between financial news and readers' interests, enabling a real-time, on-demand financial news feed, and providing a theoretical basis and verification for the scientific problems of real-time processing of massive information. Finally, a simulation experiment shows that the multilingual financial news matching technology distinguishes similar financial news in different languages better than the traditional method.
With the development of information technology, the online retrieval of remote electronic data has become an important method for investigative agencies to collect evidence. In the current normative documents, the online retrieval of electronic data is positioned as a new type of arbitrary investigative measure. However, a study of its actual operation finds that online retrieval of electronic data does not fully conform to the characteristics of arbitrary investigative measures. The root causes are a nature inaccurately defined through faulty analogy, an emphasis on the authenticity of electronic data at the cost of rights protection, the insufficient force of normative documents to go beyond the boundaries of the law, and the superficial inconsistency produced by mechanical comparison with the nature of existing investigative measures. The nature of electronic data retrieved online should instead be defined according to the circumstances. Retrieval of electronic data publicly disclosed on the Internet is an arbitrary investigative measure, and following procedural specifications should be sufficient. When investigators conceal their true identities and enter the cyberspace of a suspected crime through a registered account to extract dynamic electronic data about criminal activities, this is essentially a covert investigation in cyberspace and should follow the normative requirements for covert investigations. Retrieval of dynamic electronic data from private spaces is a technical investigative measure and should be implemented in accordance with technical investigative procedures. Retrieval of remote "non-public electronic data involving privacy" is a mandatory investigative measure, essentially a search in virtual space; its procedural specifications should therefore be set according to the standards for searches.
This paper focuses on developing a system that allows presentation authors to effectively retrieve presentation slides for reuse from a large volume of existing presentation materials. We assume that the authors' episodic memories can be used as contextual keywords in query expressions to dig out the expected slides for reuse more efficiently than keyword queries based only on parts of the slide descriptions. As the system, a new slide repository is proposed, composed of slide material collections, slide content data, and pieces of information from authors' episodic memories related to each slide and presentation, together with a slide retrieval application that lets authors use episodic memories as part of their queries. The results of our experiment show that episodic-memory queries give better discoverability than keyword-based queries. Additionally, an improvement to the slide retrieval model is discussed for further slide-finding efficiency, expanding the episodic memory model in the repository with links to author- and slide-related data and events posted on private and social media sites.
We develop a data-driven method (a probability model) to construct a composite shape descriptor by combining a pair of scale-based shape descriptors. The selection of a pair of scale-based shape descriptors is modeled as the computation of the union of two events, i.e., retrieving similar shapes using a single scale-based shape descriptor. The pair of scale-based shape descriptors with the highest probability forms the composite shape descriptor. Given a shape database, the composite shape descriptors of the shapes constitute a planar point set. A VoR-Tree of the planar point set is then used as an indexing structure for efficient query operations. Experiments and comparisons show the effectiveness and efficiency of the proposed composite shape descriptor.
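The union-of-events selection follows directly from P(A ∪ B) = P(A) + P(B) − P(A ∩ B): for each candidate pair of descriptors, compute the probability that at least one of them retrieves a similar shape, and keep the pair with the highest value. The probabilities in the sketch below are illustrative, not values from the paper.

```python
def union_probability(p_a, p_b, p_ab):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_ab

def best_pair(descriptors, p, p_joint):
    """Pick the descriptor pair with the highest union probability.
    p[i] is a single descriptor's retrieval probability; p_joint[(i, j)]
    is the probability that both descriptors i and j succeed."""
    best, best_p = None, -1.0
    for i in range(len(descriptors)):
        for j in range(i + 1, len(descriptors)):
            u = union_probability(p[i], p[j], p_joint[(i, j)])
            if u > best_p:
                best, best_p = (descriptors[i], descriptors[j]), u
    return best, best_p
```

Note that a pair of individually weaker but less correlated descriptors can beat the two strongest ones, which is the rationale for optimizing the union rather than the individual probabilities.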
Interference in geochemical hydrocarbon exploration data is a large obstacle to anomaly recognition. The multiresolution analysis of wavelet analysis can extract information at different scales, providing a powerful tool for information analysis and processing. Based on an analysis of the geometric nature of hydrocarbon anomalies and background, the Mallat wavelet and symmetric border treatment are selected and a data pre-processing step (logarithm normalization) is established. The approach gives good results in Shandong and Inner Mongolia, China. It is demonstrated that this approach overcomes the disadvantage of background variation within the window (interference in the window) that affects the moving average, frame filtering, and spatial and scaling modeling methods.
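The multiresolution idea can be illustrated with one level of the simple Haar transform, used here as a stand-in for the Mallat decomposition and symmetric border treatment actually selected in the paper, preceded by the logarithm-normalization pre-processing the paper describes.

```python
import math

def log_normalize(values):
    """Pre-processing before the wavelet transform: take logarithms,
    then scale to zero mean and unit variance."""
    logs = [math.log(v) for v in values]
    mean = sum(logs) / len(logs)
    var = sum((x - mean) ** 2 for x in logs) / len(logs)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 for constant input
    return [(x - mean) / std for x in logs]

def haar_step(signal):
    """One level of the Haar transform: pairwise averages approximate the
    regional background, pairwise differences capture local anomalies."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail
```

Applying the step recursively to the approximation sequence yields coefficients at successively coarser scales, which is the separation of anomaly from background that the paper exploits.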
A kind of singly linked list, named the aggregative chain, is introduced into the algorithm, improving the architecture of the FP-tree. The new FP-tree is a one-way tree in which each node keeps only a pointer to its parent. Route information for different nodes of the same item is compressed into aggregative chains, so frequent patterns can be produced from the aggregative chains without generating node links or conditional pattern bases. An example of Web keyword retrieval is given to analyze and verify the frequent pattern algorithm presented in this paper.
It is well known that parameter retrieval is usually ill-posed and highly nonlinear, so parameter retrieval problems are very difficult. Many important theoretical issues remain under research, although great success has been achieved in data assimilation in meteorology and oceanography. This paper reviews recent research on parameter retrieval, especially that of the authors. First, some concepts and issues of parameter retrieval are introduced and the state-of-the-art parameter retrieval technology in meteorology and oceanography is reviewed briefly. Then atmospheric and oceanic parameters are retrieved using the variational data assimilation method combined with regularization techniques in four examples: retrieval of the vertical eddy diffusion coefficient; of the turbulivity of the atmospheric boundary layer; of wind from Doppler radar data; and of physical process parameters. Model parameter retrieval with global and local observations is also introduced.
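The role of regularization in an ill-posed retrieval can be sketched in its simplest linear form as Tikhonov-regularized least squares; this generic sketch stands in for the paper's variational-assimilation-plus-regularization machinery and uses the normal-equations solution for clarity.

```python
import numpy as np

def tikhonov_retrieve(H, y, alpha):
    """Minimize ||H x - y||^2 + alpha ||x||^2 by solving the regularized
    normal equations (H^T H + alpha I) x = H^T y; the alpha term keeps
    the system well-conditioned even when H^T H is (nearly) singular."""
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + alpha * np.eye(n), H.T @ y)
```

As alpha tends to zero the solution approaches the unregularized least-squares answer; larger alpha trades fidelity to the observations for stability, which is the essential compromise in any ill-posed retrieval.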
With the increasing popularity of cloud computing, privacy has become one of the key problems in cloud security. When data is outsourced to the cloud, data owners need to ensure the security of their privacy; cloud service providers need some information about the data to provide high-QoS services; and authorized users need access to the true values of the data. Existing privacy-preserving methods cannot meet the needs of all three parties at the same time. To address this issue, we propose a retrievable data perturbation method and apply it to privacy preservation in data outsourcing in cloud computing. Our scheme comes in four steps. First, an improved random generator is proposed to generate an accurate "noise". Next, a perturbation algorithm is introduced to add the noise to the original data; this hides the private information while leaving unchanged the mean and covariance of the data, which the service providers may need. Then, a retrieval algorithm is proposed to recover the original data from the perturbed data. Finally, we combine the retrievable perturbation with an access control process to ensure that only authorized users can retrieve the original data. The experiments show that our scheme perturbs data correctly, efficiently, and securely.
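A minimal sketch of retrievable perturbation, assuming a seeded pseudo-random generator plays the role of the paper's improved generator: the noise is re-centred so the sample mean is preserved exactly, and anyone holding the seed can regenerate and subtract the same noise. Preserving the covariance as well, as the paper's scheme does, requires a more structured noise model than this.

```python
import numpy as np

def perturb(data, seed, scale=1.0):
    """Add reproducible noise, re-centred to exactly zero mean so the
    sample mean of the data is unchanged."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(data.shape) * scale
    noise -= noise.mean()
    return data + noise

def retrieve(perturbed, seed, scale=1.0):
    """Regenerate the identical noise from the seed and subtract it,
    recovering the original data; only seed holders can do this."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(perturbed.shape) * scale
    noise -= noise.mean()
    return perturbed - noise
```

In this sketch the seed plays the role of the access-controlled secret: the provider sees only the perturbed data (and its preserved mean), while an authorized user with the seed recovers the true values.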
In this paper we investigate the impact of Atmospheric Infrared Sounder (AIRS) temperature retrievals on data assimilation and the resulting forecasts, using the four-dimensional Local Ensemble Transform Kalman Filter (LETKF) data assimilation scheme and a reduced-resolution version of the NCEP Global Forecast System (GFS). Our results indicate that the AIRS temperature retrievals have a significant and consistent positive impact in the Southern Hemisphere extratropics on both analyses and forecasts, found not only in the temperature field but also in other variables. In the tropics and the Northern Hemisphere extratropics these impacts are smaller, but still generally positive or neutral.
The drastic growth of coastal observation sensors results in copious data that provide weather information. The intricacies of sensor-generated big data are heterogeneity and interpretation, driving high-end Information Retrieval (IR) systems. The Semantic Web (SW) can address this by integrating data into a single platform for information exchange and knowledge retrieval. This paper focuses on exploiting an SW-based system to provide interoperability through ontologies by mapping data concepts to ontology classes. It presents a four-phase weather data model: data processing, ontology creation, SW processing, and a query engine. The developed Oceanographic Weather Ontology helps to enhance data analysis, discovery, IR, and decision making, and is evaluated against other state-of-the-art ontologies: its completeness has improved by 39.28%, its structural complexity has decreased by 45.29%, and precision and accuracy have improved by 11% and 37.7%, respectively. Ocean data from the Indian meteorological satellite INSAT-3D serve as a typical test case for the proposed model. The experimental results show the effectiveness of the proposed data model and its advantages in machine understanding and IR.
A simple, fast method is given for sequentially retrieving all the records in a B-tree, and a file structure for databases is proposed. The records in the primary data file are sorted in key order, and a B-tree is used as a dense index over them. It is easy to insert, delete, or search for a record, and it is also convenient to retrieve records in the sequential order of the keys. The merits and efficiencies of these methods and structures are discussed in detail.
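The proposed file structure can be sketched as follows, with a plain sorted key list standing in for the dense B-tree index: point lookups go through the index, while sequential retrieval is a single pass over the key-ordered primary file. The class and method names are illustrative, not from the paper.

```python
import bisect

class IndexedFile:
    """Records sorted by key in the primary file, plus a dense index
    (a sorted key list here, in place of the paper's B-tree)."""

    def __init__(self):
        self.keys = []     # dense index: one entry per record
        self.records = []  # primary data file, kept in key order

    def insert(self, key, value):
        i = bisect.bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.records.insert(i, (key, value))

    def search(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.records[i][1]
        return None

    def sequential(self):
        """Key-ordered retrieval needs no index traversal at all."""
        return list(self.records)
```

Because the primary file itself is kept sorted, sequential retrieval never touches the index, which is the efficiency argument the paper makes for this layout.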
In this paper, we present machine learning algorithms and systems for similar-video retrieval, where the query is itself a video. For the similarity measurement, exemplars, or representative frames in each video, are extracted by unsupervised learning; for this learning we chose order-aware competitive learning. After obtaining a set of exemplars for each video, the similarity is computed. Because the numbers and positions of the exemplars differ between videos, we use a similarity computation method called M-distance, which generalizes existing global and local alignment methods by using followers of the exemplars. To represent each frame of the video, this paper adopts the Frame Signature of the ISO/IEC standard, so that the total system, along with its graphical user interface, is practical. Experiments on the detection of inserted plagiaristic scenes showed excellent precision-recall curves, with precision values very close to 1; the proposed system can thus work as a plagiarism detector for videos. In addition, the method can be regarded as structuring unstructured data via numerical labeling by exemplars. Finally, further sophistication of this labeling is discussed.
Funding (Arctic multiyear sea ice study): the National Natural Science Foundation of China under contract Nos 41330960, 41276193, and 41206184.
Funding (visibility retrieval study): the National High Technology Development Project (863) of China (Grant No. 2002AA639500); the Natural Science Foundation of Guangdong Province (Grant No. 032212); the National Basic Research Program of China (973 Program, No. 2005CB422301); and the Program for New Century Excellent Talents in University (NCET-05-0591).
文摘Based on the atmospheric horizontal visibility data from forty-seven observational stations along the eastern coast of China near the Taiwan Strait and simultaneous NOAA/AVHRR multichannel satellite data during January 2001 to December 2002, the spectral characters associated with visibility were investigated. Successful retrieval of visibility from multichannel NOAA/AVHRR data was performed using the principal component regression (PCR) method. A sample of retrieved visibility distribution was discussed with a sea fog process. The correlation coefficient between the observed and retrieved visibility was about 0.82, which is far higher than the 99.9% confidence level by statistical test. The rate of successful retrieval is 94.98% of the 458 cases during 2001 2002. The error distribution showed that high visibilities were usually under-estimated and low visibilities were over-estimated and the relative error between the observed and retrieved visibilities was about 21.4%.
文摘In monitoring systems, multiple sensor nodes can detect a single target of interest simultaneously and the data collected are usually highly correlated and redundant. If each node sends data to the base station, energy will be wasted and thus the network energy will be depleted quickly. Data aggregation is an important paradigm for compressing data so that the energy of the network is spent efficiently. In this paper, a novel data aggregation algorithm called Redundancy Elimination for Accurate Data Aggregation (READA) has been proposed. By exploiting the range of spatial correlations of data in the network, READA applies a grouping and compression mechanism to remove duplicate data in the aggregated set of data to be sent to the base station without largely losing the accuracy of the final aggregated data. One peculiarity of READA is that it uses a prediction model derived from cached values to confirm whether any outlier is actually an event which has occurred. From the various simulations conducted, it was observed that in READA the accuracy of data has been highly preserved taking into consideration the energy dissipated for aggregating the
文摘Based on surfaced-related multiple elimination (SRME) , this research has derived the methods on multiples elimination in the inverse data space. Inverse data processing means moving seismic data from forward data space (FDS) to inverse data space ( IDS) . The surface-related multiples and primaries can then be sepa-rated in the IDS, since surface-related multiples wi l l form a focus region in the IDS. Muting the multiples ener-gy can achieve the purpose of multiples elimination and avoid the damage to primaries energy during the process of adaptive subtraction. Randomized singular value decomposition ( RSYD) is used to enhance calculation speed and improve the accuracy in the conversion of FDS to IDS. The synthetic shot record of the salt dome model shows that the relationship between primaries and multiples is simple and clear, and RSVD can easily eliminate multiples and save primaries energy. Compared with conventional multiples elimination methods and ordinary methods of multiples elimination in the inverse data space, this technique has an advantage of high cal-culation speed and reliable outcomes.
Abstract: Internal multiples are commonly present in seismic data due to variations in the velocity or density of subsurface media. They can reduce the signal-to-noise ratio of seismic data and degrade the quality of the image. As seismic exploration extends to deep and ultra-deep targets, especially complex targets in the western region of China, internal multiple elimination becomes increasingly challenging. Currently, three-dimensional (3D) seismic data are primarily used for oil and gas target recognition and drilling. Effectively eliminating internal multiples in 3D seismic data of complex structures and mitigating their adverse effects is crucial for enhancing the success rate of drilling. In this study, we propose an internal multiple prediction algorithm for 3D seismic data in complex structures using the Marchenko autofocusing theory. This method predicts internal multiples with accurate traveltimes without requiring an accurate velocity model, and its implementation consists of several steps. First, direct waves are simulated with a 3D macroscopic velocity model. Second, the direct waves and the full 3D seismic acquisition records are used to obtain the up-going and down-going Green's functions between virtual source points and the surface. Third, internal multiples of the relevant layers are constructed from the up-going and down-going Green's functions. Finally, the adaptive matching subtraction method removes the predicted internal multiples from the original data to obtain seismic records free of multiples. Compared with the two-dimensional (2D) Marchenko algorithm, the 3D Marchenko algorithm's performance for internal multiple prediction is significantly enhanced, with higher computational accuracy. Numerical simulation tests indicate that the proposed method can effectively eliminate internal multiples in 3D seismic data, giving it important theoretical and industrial application value.
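The final step of the workflow, adaptive matching subtraction, can be sketched as a least-squares matching filter: estimate a short filter that best maps the predicted multiples onto the recorded data, then subtract the matched model. This is a minimal 1-D single-trace sketch, not the paper's 3D implementation; the trace length and filter length are assumptions.

```python
import numpy as np

def adaptive_subtract(data, predicted, filt_len=5):
    """Least-squares adaptive matching subtraction on a single trace.

    Finds the filter f minimising ||data - predicted * f|| (convolution),
    then subtracts the matched multiple model from the data.
    """
    n = len(data)
    # Convolution matrix: column j holds the predicted trace delayed by j.
    M = np.zeros((n, filt_len))
    for j in range(filt_len):
        M[j:, j] = predicted[: n - j]
    f, *_ = np.linalg.lstsq(M, data, rcond=None)
    return data - M @ f

# Toy trace: a primary at sample 10; the recorded multiple is the predicted
# model delayed by 2 samples and scaled by 0.8.
primary = np.zeros(64); primary[10] = 1.0
predicted = np.zeros(64); predicted[30] = 1.0
data = primary.copy(); data[32] += 0.8
residual = adaptive_subtract(data, predicted)
```

Because the delay and scaling of the true multiple fall within the span of the short filter, the subtraction removes the multiple exactly while leaving the primary untouched, which is the behaviour the abstract relies on.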
Funding: Supported by the Special Scientific Research Fund of the Meteorological Public Welfare Profession of China (GYHY201406028); the Meteorological Open Research Fund for the Huaihe River Basin (HRM201407); and the Anhui Meteorological Bureau Science and Technology Development Fund (RC201506).
Abstract: This paper uses satellite channel brightness temperature simulation to study M-estimator variational retrieval. The approach combines the advantages of classical variational inversion and robust M-estimators. Classical variational inversion depends on prior quality control to eliminate outliers, and its errors are assumed to follow a Gaussian distribution. We couple M-estimators to the framework of classical variational inversion to obtain an M-estimator variational inversion: the cost function contains the M-estimator, which guarantees robustness to outliers and improves the retrieval results. The experimental evaluation adds Gaussian and non-Gaussian errors to Feng Yun-3A (FY-3A) simulated data. The variational inversion is used to obtain the inverted brightness temperature, and temperature and humidity data are used for validation. The preliminary results demonstrate the potential of M-estimator variational retrieval.
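The key idea, replacing the quadratic observation term of the variational cost function with a robust M-estimator, can be sketched in one dimension. This is an illustrative toy, not the FY-3A system: the Huber penalty is a common M-estimator choice but is an assumption here, as are all numerical values, and the cost is minimised by brute-force grid search rather than a proper descent method.

```python
import numpy as np

def huber(r, delta=1.0):
    """Huber penalty: quadratic for small residuals, linear for outliers."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def variational_retrieve(xb, B, obs, R, delta=1.0):
    """Minimise J(x) = (x - xb)^2 / (2B) + sum_i huber((y_i - x)/sqrt(R)).

    xb, B: background value and its error variance;
    obs, R: observations and their error variance.
    """
    xs = np.linspace(min(obs.min(), xb) - 5, max(obs.max(), xb) + 5, 20001)
    J = (xs - xb) ** 2 / (2 * B)              # background (prior) term
    for y in obs:
        J += huber((y - xs) / np.sqrt(R), delta)  # robust observation term
    return xs[np.argmin(J)]

obs = np.array([250.2, 250.4, 249.9, 280.0])   # last value is a gross outlier
x_robust = variational_retrieve(xb=251.0, B=4.0, obs=obs, R=0.25)
```

With a quadratic observation term the 280 K outlier would drag the estimate up by several kelvin; under the Huber penalty its influence is bounded, so the retrieval stays near the consistent observations without any prior quality control step.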
Funding: Supported by the National Key R&D Program of China (2020YFB0905900).
Abstract: Operation control of power systems has become challenging with the increase in the scale and complexity of power distribution systems and the extensive access of renewable energy. Improving the capabilities of data-driven operation management, intelligent analysis, and mining is therefore urgently required. To investigate similar regularities in historical operating sections of the power distribution system and help the power grid systematically accumulate high-value historical operation and maintenance experience and knowledge, a neural information retrieval model with an attention mechanism is proposed based on graph data computing technology. Based on the processing flow of power distribution system operating data, a technical framework for neural information retrieval is established. Combined with the natural graph characteristics of the power distribution system, a unified graph data structure and a data fusion method covering data access, data completion, and multi-source data are constructed. A graph node feature-embedding representation learning algorithm and a neural information retrieval algorithm model are then built; the retrieval model is trained and tested using the generated set of graph node feature representation vectors. The model is verified on operating sections of the power distribution system of a provincial grid area. The results show that the proposed method achieves high accuracy in similarity matching of historical operation characteristics and effectively supports intelligent fault diagnosis and elimination in power distribution systems.
Funding: Supported and granted by the Ministry of Science and Technology, Taiwan (MOST110-2622-E-390-001 and MOST109-2622-E-390-002-CC3).
Abstract: Big data analytics platforms for business intelligence often lack effective data retrieval methods and job scheduling, which causes execution inefficiency and low system throughput. This paper aims to enhance data retrieval and job scheduling so as to speed up big data analytics and overcome these problems. First, integrating a stacked sparse autoencoder with Elasticsearch indexing enables fast data searching and distributed indexing, which reduces the search scope of the database and dramatically speeds up data searching. Next, a deep neural network is exploited to predict the approximate execution time of a job, enabling prioritized job scheduling based on shortest job first, which reduces the average waiting time of job execution. As a result, the proposed data retrieval approach outperforms the previous method using a deep autoencoder and Solr indexing, improving the speed of data retrieval by up to 53% and increasing system throughput by 53%. The proposed job scheduling algorithm also defeats both the first-in-first-out and memory-sensitive heterogeneous earliest finish time scheduling algorithms, shortening the average waiting time by up to 5% and the average weighted turnaround time by 19%, respectively.
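The scheduling half of this design, order jobs by their *predicted* runtime, shortest first, can be sketched with a heap. The runtime predictions would come from the deep neural network described above; here they are simply given as inputs, and the job names are made up for the example.

```python
import heapq

def sjf_schedule(jobs):
    """Shortest-job-first ordering from predicted runtimes.

    jobs: list of (job_id, predicted_runtime) pairs.
    Returns the execution order and the average waiting time it yields.
    """
    heap = [(t, jid) for jid, t in jobs]
    heapq.heapify(heap)                      # min-heap on predicted runtime
    order, clock, total_wait = [], 0.0, 0.0
    while heap:
        t, jid = heapq.heappop(heap)
        order.append(jid)
        total_wait += clock                  # this job waited until `clock`
        clock += t
    return order, total_wait / len(jobs)

order, avg_wait = sjf_schedule([("a", 6.0), ("b", 2.0), ("c", 4.0)])
```

Running the shortest predicted job first is provably optimal for average waiting time when all jobs are available at once, which is why a reasonably accurate runtime predictor translates directly into lower average wait, as the abstract reports.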
Funding: Supported by the National Social Science Foundation of China (Nos. 15CTQ028 and 14@ZH036); the Social Science Foundation of Beijing (No. 15SHA002); and the Young Faculty Research Fund of Beijing Foreign Studies University (No. 2015JT008).
Abstract: As part of the study of finance and economics information, we examine real-time financial news posted on authoritative sites in the world's major advanced economies. Analyzing massive amounts of financial news from different information sources and language origins, we develop a basic theoretical model and algorithm for financial news that is capable of intelligent collection, quick access, deduplication, correction, and integration with the background of each news item. Furthermore, we can identify connections between financial news and readers' interests, achieving a real-time, on-demand financial news feed as well as providing a theoretical basis and verification for the scientific problems of real-time processing of massive information. Finally, a simulation experiment shows that the multilingual financial news matching technology distinguishes similar financial news in different languages better than the traditional method.
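The deduplication step mentioned above can be sketched with a bag-of-words cosine similarity: articles too similar to one already kept are discarded. This is a deliberately simple baseline, not the paper's multilingual matching technology (which would need cross-lingual representations); the similarity threshold is an assumption.

```python
from collections import Counter
import math

def cosine_sim(a_tokens, b_tokens):
    """Cosine similarity between two token lists (bag-of-words)."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def deduplicate(articles, threshold=0.8):
    """Keep only articles that are not near-duplicates of an earlier one."""
    kept = []
    for tokens in articles:
        if all(cosine_sim(tokens, k) < threshold for k in kept):
            kept.append(tokens)
    return kept

a = "profits rise at big bank".split()
b = "profits rise at the big bank".split()   # near-duplicate of a
c = "storm hits the coast".split()           # unrelated story
kept = deduplicate([a, b, c])
```

A cross-lingual system would replace the raw token overlap with language-independent features, but the keep/discard logic around the similarity score stays the same.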
Funding: A phased research result of the Supreme People's Procuratorate's procuratorial theory research program "Research on the Governance Problems of the Crime of Aiding Information Network Criminal Activities" (Project Approval Number GJ2023D28).
Abstract: With the development of information technology, the online retrieval of remote electronic data has become an important method for investigative agencies to collect evidence. In the current normative documents, the online retrieval of electronic data is positioned as a new type of arbitrary investigative measure. However, study of its actual operation finds that the online retrieval of electronic data does not fully match the characteristics of arbitrary investigative measures. The root causes are its inaccurately defined nature due to analogy errors, an emphasis on the authenticity of electronic data at the cost of rights protection, the insufficient effectiveness of normative documents to break through the boundaries of law, and the superficial inconsistency revealed by mechanical comparison with the nature of existing investigative measures. The nature of electronic data retrieved online should instead be defined according to the circumstances. Retrieval of electronic data publicly disclosed on the Internet is an arbitrary investigative measure, and following procedural specifications should be sufficient. When investigators conceal their true identities and enter the cyberspace of the suspected crime through a registered account to extract dynamic electronic data on criminal activities, this is essentially a covert investigation in cyberspace and should follow the normative requirements for covert investigations. Retrieval of dynamic electronic data from private spaces is a technical investigative measure and should be implemented in accordance with technical investigation procedures. Remote retrieval of non-public electronic data involving privacy is a mandatory investigative measure, essentially a search of virtual space, so its procedural specifications should be set according to the standards for searches.
Abstract: This paper focuses on developing a system that allows presentation authors to effectively retrieve presentation slides for reuse from a large volume of existing presentation materials. We assume that authors' episodic memories can be used as contextual keywords in query expressions to dig out the expected slides for reuse more efficiently than keyword queries based only on parts of the slide descriptions. As a system, a new slide repository is proposed, composed of slide material collections, slide content data, and pieces of information from authors' episodic memories related to each slide and presentation, together with a slide retrieval application enabling authors to use the episodic memories as part of queries. The results of our experiment show that queries using episodic memories offer more discoverability than keyword-based queries. Additionally, an improvement model for further slide-finding efficiency is discussed, expanding the episodic memory model in the repository to take in links with author-and-slide-related data and events posted on private and social media sites.
Funding: Supported by the National Key R&D Plan of China (2016YFB1001501).
Abstract: We develop a data-driven method (a probability model) to construct a composite shape descriptor by combining a pair of scale-based shape descriptors. The selection of a pair of scale-based shape descriptors is modeled as the computation of the union of two events, i.e., retrieving similar shapes by using a single scale-based shape descriptor. The pair of scale-based shape descriptors with the highest probability forms the composite shape descriptor. Given a shape database, the composite shape descriptors for the shapes constitute a planar point set. A VoR-Tree of the planar point set is then used as an indexing structure for efficient query operation. Experiments and comparisons show the effectiveness and efficiency of the proposed composite shape descriptor.
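The pair-selection rule, treating each descriptor's retrieval success as an event and picking the pair maximising the union probability P(A ∪ B) = P(A) + P(B) − P(A ∩ B), can be sketched directly. The descriptor names and probability values below are invented for illustration; in the paper these probabilities would be estimated from retrieval statistics on a training database.

```python
from itertools import combinations

def best_pair(descriptors, p_single, p_joint):
    """Pick the descriptor pair maximising P(A or B retrieves a similar shape).

    p_single: P(success) per descriptor; p_joint: P(both succeed) per pair.
    """
    def p_union(a, b):
        # Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
        return p_single[a] + p_single[b] - p_joint[frozenset((a, b))]
    return max(combinations(descriptors, 2), key=lambda ab: p_union(*ab))

p_single = {"s1": 0.70, "s2": 0.60, "s3": 0.50}
p_joint = {frozenset(("s1", "s2")): 0.55,   # s1 and s2 succeed together often
           frozenset(("s1", "s3")): 0.30,   # s1 and s3 are more complementary
           frozenset(("s2", "s3")): 0.25}
best = best_pair(["s1", "s2", "s3"], p_single, p_joint)
```

Note that the individually strongest pair (s1, s2) loses here: because their successes overlap heavily, the more complementary pair (s1, s3) covers more queries, which is exactly what the union-of-events formulation rewards.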
Abstract: Interference in geochemical hydrocarbon exploration data is a large obstacle to anomaly recognition. The multiresolution analysis of the wavelet transform can extract information at different scales, providing a powerful tool for information analysis and processing. Based on an analysis of the geometric nature of hydrocarbon anomalies and background, the Mallat wavelet and symmetric border treatment are selected, and a data pre-processing step (logarithm normalization) is established. The approach gives good results in Shandong and Inner Mongolia, China. It is demonstrated that it overcomes the disadvantage of background variation within the window (in-window interference) that affects moving average, frame filtering, and spatial and scaling modeling methods.
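The multiresolution idea can be illustrated with the simplest wavelet of all. The sketch below uses a one-level Haar decomposition rather than the Mallat filter-bank scheme of the paper: smooth regional background flows into the approximation coefficients, while sharp local anomalies concentrate in the detail coefficients.

```python
import numpy as np

def haar_decompose(x):
    """One level of the Haar wavelet transform: approximation + detail.

    Input length must be even. Approximation = scaled pairwise sums
    (low-pass); detail = scaled pairwise differences (high-pass).
    """
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

# Flat background of 4 with a step up to 9: the jump shows up only in
# the detail coefficient that straddles it.
approx, detail = haar_decompose([4, 4, 4, 4, 4, 9, 9, 9])
```

In an anomaly-recognition setting one would log-normalize the data first (as the abstract describes), decompose over several levels, and inspect the detail coefficients for localized departures from background.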
Funding: Supported by the Natural Science Foundation of Liaoning Province (20042020).
Abstract: A kind of singly linked list named the aggregative chain is introduced into the algorithm, improving the architecture of the FP-tree. The new FP-tree is a one-way tree in which each node keeps only a pointer to its parent. Route information of different nodes for the same item is compressed into aggregative chains, so that frequent patterns are produced from the aggregative chains without generating node links or conditional pattern bases. An example of Web keyword retrieval is given to analyze and verify the frequent pattern algorithm in this paper.
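For orientation, the result such an algorithm must produce can be sketched with a brute-force frequent-itemset counter. This baseline enumerates candidate itemsets directly, precisely the work the compressed one-way FP-tree with aggregative chains is designed to avoid; `min_support` and `max_len` are illustrative parameters.

```python
from collections import Counter
from itertools import combinations

def frequent_patterns(transactions, min_support=2, max_len=2):
    """Count frequent itemsets up to `max_len` by direct enumeration.

    transactions: list of item lists (e.g. keywords per Web session).
    Returns {itemset_tuple: support_count} for itemsets meeting min_support.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))                 # ignore duplicates in one row
        for k in range(1, max_len + 1):
            counts.update(combinations(items, k))
    return {p: c for p, c in counts.items() if c >= min_support}

# Toy keyword-retrieval log: three sessions.
patterns = frequent_patterns([["a", "b"], ["a", "c"], ["a", "b"]])
```

An FP-tree-style method reaches the same pattern set in far fewer passes by compressing shared prefixes; the brute-force version above is only practical for tiny inputs but makes the target output unambiguous.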
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 90411006) and by the Shanghai Science and Technology Association (Grant No. 02DJ14032).
Abstract: It is well known that parameter retrieval is usually ill-posed and highly nonlinear, so parameter retrieval problems are very difficult. Although great success has been achieved in data assimilation in meteorology and oceanography, many important theoretical issues remain under research. This paper reviews recent research on parameter retrieval, especially that of the authors. First, some concepts and issues of parameter retrieval are introduced and state-of-the-art parameter retrieval technology in meteorology and oceanography is briefly reviewed. Then atmospheric and oceanic parameters are retrieved using the variational data assimilation method combined with regularization techniques in four examples: retrieval of the vertical eddy diffusion coefficient; of the turbulivity of the atmospheric boundary layer; of wind from Doppler radar data; and of physical process parameters. Model parameter retrieval with global and local observations is also introduced.
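The role of regularization in stabilising an ill-posed retrieval can be shown in its simplest linear form. This is a generic Tikhonov sketch, not any of the paper's four examples: minimise ‖Hx − y‖² + α‖x‖², whose closed-form solution is x = (HᵀH + αI)⁻¹Hᵀy; the operator and data below are invented.

```python
import numpy as np

def regularized_retrieve(H, y, alpha=0.1):
    """Tikhonov-regularised least squares: min ||Hx - y||^2 + alpha ||x||^2.

    The alpha * I term keeps the normal-equation matrix well conditioned,
    which is what tames an ill-posed retrieval problem.
    """
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + alpha * np.eye(n), H.T @ y)

# With H = I the solution is simply y / (1 + alpha), showing the
# characteristic shrinkage toward zero that regularisation introduces.
H = np.eye(3)
y = np.array([1.0, 2.0, 3.0])
x = regularized_retrieve(H, y, alpha=0.1)
```

In a variational assimilation setting the same α-weighted penalty appears as an extra term in the cost function, trading a small bias for a large reduction in the variance of the retrieved parameters.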
Funding: Supported in part by NSFC under Grant No. 61172090; the National Science and Technology Major Project under Grant 2012ZX03002001; the Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20120201110013; Scientific and Technological Projects in Shaanxi Province under Grants No. 2012K06-30 and No. 2014JQ8322; the Basic Science Research Fund of Xi'an Jiaotong University (No. XJJ2014049, No. XKJC2014008); and the Shaanxi Science and Technology Innovation Project (2013SZS16-Z01/P01/K01).
Abstract: With the increasing popularity of cloud computing, privacy has become one of the key problems in cloud security. When data is outsourced to the cloud, data owners need to ensure the security of their privacy; cloud service providers need some information about the data to provide high-QoS services; and authorized users need access to the true values of the data. Existing privacy-preserving methods cannot meet all the needs of these three parties at the same time. To address this issue, we propose a retrievable data perturbation method and apply it to privacy preservation in data outsourcing in cloud computing. Our scheme comes in four steps. First, an improved random generator is proposed to generate an accurate "noise". Next, a perturbation algorithm is introduced to add the noise to the original data; this hides the private information while leaving unchanged the mean and covariance of the data, which the service providers may need. Then, a retrieval algorithm is proposed to recover the original data from the perturbed data. Finally, we combine the retrievable perturbation with an access control process to ensure that only authorized users can retrieve the original data. The experiments show that our scheme perturbs data correctly, efficiently, and securely.
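The core retrievability idea, noise that an authorised party can regenerate and subtract, can be sketched with a seeded random generator. This toy only preserves the sample mean (by centring the noise) and models access control simply as possession of the seed; the paper's actual scheme also preserves the covariance and integrates a full access control process.

```python
import numpy as np

def perturb(data, seed):
    """Add reproducible, mean-preserving noise to the data."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(data.shape)
    noise -= noise.mean()          # centred noise leaves the mean unchanged
    return data + noise

def retrieve(perturbed, seed):
    """An authorised user regenerates the identical noise and subtracts it."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(perturbed.shape)
    noise -= noise.mean()
    return perturbed - noise

data = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
protected = perturb(data, seed=42)      # what the cloud stores/sees
recovered = retrieve(protected, seed=42)  # only seed holders can do this
```

Anyone without the seed sees only statistically useful but individually scrambled values; anyone holding it recovers the originals exactly, which is the property the access-control layer then gates.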
Funding: Supported by the National Natural Science Foundation of China (40975067); the 973 Program (2009CB421500); CMA Grant GYHY200806029; and NASA Grant NNX07AM97G in the U.S.A.
Abstract: In this paper we investigate the impact of Atmospheric Infra-Red Sounder (AIRS) temperature retrievals on data assimilation and the resulting forecasts, using the four-dimensional Local Ensemble Transform Kalman Filter (LETKF) data assimilation scheme and a reduced-resolution version of the NCEP Global Forecast System (GFS). Our results indicate that the AIRS temperature retrievals have a significant and consistent positive impact on both analyses and forecasts in the Southern Hemisphere extratropics, found not only in the temperature field but also in other variables. In the tropics and the Northern Hemisphere extratropics these impacts are smaller, but still generally positive or neutral.
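For readers unfamiliar with ensemble assimilation, the basic update step can be sketched. Note this is the simpler *stochastic* EnKF with a global gain, a relative of LETKF shown for flavour only; the LETKF performs a deterministic transform locally in ensemble space, and all matrices below are toy inputs.

```python
import numpy as np

def enkf_update(X, y, H, R, seed=0):
    """Stochastic ensemble Kalman filter analysis update.

    X: (n, N) forecast ensemble of n state variables, N members;
    y: (m,) observation vector; H: (m, n) observation operator;
    R: (m, m) observation error covariance.
    """
    rng = np.random.default_rng(seed)
    n, N = X.shape
    Xm = X.mean(axis=1, keepdims=True)
    A = X - Xm                                     # ensemble anomalies
    P = A @ A.T / (N - 1)                          # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, N).T
    return X + K @ (Y - H @ X)                     # analysis ensemble

# One directly observed variable, accurate observation (tiny R):
# the analysis ensemble should collapse onto the observation.
rng = np.random.default_rng(1)
X = rng.standard_normal((1, 200))   # forecast ensemble, mean ~0, spread ~1
y, H, R = np.array([1.0]), np.array([[1.0]]), np.array([[1e-4]])
Xa = enkf_update(X, y, H, R)
```

Assimilating AIRS retrievals amounts to feeding retrieved temperature profiles in as the observation vector `y`; the ensemble spread then determines how strongly each analysis variable is pulled toward them.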
Funding: This work is financially supported by the Ministry of Earth Science (MoES), Government of India (Grant No. MoES/36/OOIS/Extra/45/2015), URL: https://www.moes.gov.in.
Abstract: The drastic growth of coastal observation sensors results in copious data that provide weather information. The intricacies of sensor-generated big data are heterogeneity and interpretation, driving high-end Information Retrieval (IR) systems. The Semantic Web (SW) can solve this issue by integrating data into a single platform for information exchange and knowledge retrieval. This paper focuses on exploiting an SW-based system to provide interoperability through ontologies by combining data concepts with ontology classes. A four-phase weather data model is presented: data processing, ontology creation, SW processing, and a query engine. The developed Oceanographic Weather Ontology helps to enhance data analysis, discovery, IR, and decision making; it is also evaluated against other state-of-the-art ontologies. The proposed ontology's quality improved by 39.28% in terms of completeness, its structural complexity decreased by 45.29%, and precision and accuracy improved by 11% and 37.7%, respectively. Ocean data from the Indian meteorological satellite INSAT-3D serve as a typical example for testing the proposed model. The experimental results show the effectiveness of the proposed data model and its advantages in machine understanding and IR.
Abstract: A simple, fast method is given for sequentially retrieving all the records in a B-tree. A file structure for databases is proposed: the records in its primary data file are sorted according to key order, and a B-tree is used as a dense index. It is easy to insert, delete, or search for a record, and it is also convenient to retrieve records in the sequential order of the keys. The merits and efficiency of these methods and structures are discussed in detail.
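For contrast with the proposed structure, key-ordered retrieval from a plain B-tree can be sketched as an in-order traversal. This generic sketch (node layout and the sample tree are invented) shows the pointer-chasing the paper's design sidesteps: with the primary file already sorted by key and the B-tree serving only as a dense index, sequential retrieval reduces to a linear scan of the file.

```python
class BTreeNode:
    """Minimal B-tree node: sorted `keys`; `children` is one longer than
    `keys` for internal nodes, and empty for leaves."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def inorder(node):
    """Yield every key of a B-tree in ascending order (in-order traversal)."""
    if not node.children:            # leaf: emit its keys directly
        yield from node.keys
        return
    for i, key in enumerate(node.keys):
        yield from inorder(node.children[i])   # subtree left of this key
        yield key
    yield from inorder(node.children[-1])      # rightmost subtree

# Root [10, 20] with three leaf children.
root = BTreeNode([10, 20],
                 [BTreeNode([1, 5]), BTreeNode([12, 15]), BTreeNode([25])])
ordered = list(inorder(root))
```

The traversal touches every node, possibly revisiting internal nodes between leaves; a sorted primary file delivers the same ordered sequence with purely sequential I/O.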
Abstract: In this paper, we present machine learning algorithms and systems for similar video retrieval, where the query is itself a video. For the similarity measurement, exemplars, or representative frames in each video, are extracted by unsupervised learning; for this learning, we chose order-aware competitive learning. After obtaining a set of exemplars for each video, the similarity is computed. Because the numbers and positions of the exemplars differ between videos, we use a similarity computing method called M-distance, which generalizes existing global and local alignment methods by using followers to the exemplars. To represent each frame in the video, this paper adopts the Frame Signature of the ISO/IEC standard so that the total system, along with its graphical user interface, becomes practical. Experiments on the detection of inserted plagiaristic scenes showed excellent precision-recall curves, with precision values very close to 1; thus, the proposed system can work as a plagiarism detector for videos. In addition, this method can be regarded as the structuring of unstructured data via numerical labeling by exemplars. Finally, further sophistication of this labeling is discussed.