The proliferation of textual data in society is currently overwhelming; in particular, unstructured textual data is constantly generated via call centre logs, emails, documents on the web, blogs, tweets, customer comments, customer reviews, etc. While the amount of textual data is increasing rapidly, users' ability to summarise, understand, and make sense of such data for making better business/living decisions remains limited. This paper studies how to analyse textual data, based on layered software patterns, to extract insightful user intelligence from a large collection of documents and to use such information to improve user operations and performance.
In database application development, sharing information among different DBMSs is an important and meaningful technical subject. This paper analyzes the schema definition and physical organization of popular relational DBMSs and suggests the use of an intermediary schema. This technique provides advantages such as strong extensibility and ease of integrating data conversions among different DBMSs. The paper also introduces a data conversion system that runs under the DOS and XENIX operating systems.
In conjunction with association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that the judgment obtained by weak association rules or non-association rules is more accurate and more credible than that obtained by strong association rules. When the testing grades of two indices in the weak association rules are inconsistent, the testing grades of indices are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering data mining technology was used to analyze the reliability of a diagnosis, or to perform health diagnosis directly. Analysis showed that the clustering results are related to the indices selected, and that if the indices selected are more significant, the characteristics of clustering results are also more significant, and the analysis or diagnosis is more credible. The indices and diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of hydraulic metal structure health diagnosis technology.
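As a rough illustration of the association-rule measures this abstract relies on, the sketch below computes the standard support and confidence of a rule between two testing indices. The abstract does not give its thresholds for "strong" versus "weak" rules, and the index names and grades used here (`corrosion`, `deformation`, grades A to C) are hypothetical, chosen only to make the example concrete.

```python
from typing import Iterable, List, Set


def support(records: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of inspection records that contain every item in `itemset`."""
    items = set(itemset)
    hits = sum(1 for record in records if items <= record)
    return hits / len(records)


def confidence(records: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so no evidence either way
    return support(records, set(antecedent) | set(consequent)) / base


# Hypothetical inspection records: each item is "index=grade".
records = [
    {"corrosion=B", "deformation=B"},
    {"corrosion=B", "deformation=B"},
    {"corrosion=B", "deformation=C"},
    {"corrosion=A", "deformation=A"},
]

rule_support = support(records, {"corrosion=B", "deformation=B"})
rule_confidence = confidence(records, {"corrosion=B"}, {"deformation=B"})
```

A rule whose support and confidence both exceed user-chosen minimums is conventionally called strong; how the study separates weak rules from non-association rules is not stated in the abstract, so any threshold applied to these values would be an assumption.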
Taking autonomous and driverless driving as the research object, we discuss and define the intelligent high-precision map. The intelligent high-precision map is considered a key link in future travel, a carrier of real-time perception of traffic resources over the entire space-time range, and the criterion for the operation and control of the whole process of the vehicle. As a new form of map, it has distinctive features in terms of cartography theory and application requirements compared with traditional electronic navigation maps. Thus, it is necessary to analyze and discuss its key features and problems to promote the research and application of intelligent high-precision maps. Accordingly, we propose an information transmission model based on cartography theory and combine it with the wheeled robot’s control flow in practical application. Next, we put forward the data logic structure of the intelligent high-precision map and analyze its application in autonomous driving. Then, we summarize the computing mode of “Crowdsourcing + Edge-Cloud Collaborative Computing” and carry out a key technical analysis of how to improve the quality of crowdsourced data. We also analyze the effective application scenarios of intelligent high-precision maps in the future. Finally, we present some thoughts and suggestions for the future development of this field.
Tree logic, inherited from ambient logic, is introduced as the formal foundation of related programming languages and type systems. In this paper, we introduce recursion into this logic system, which allows tree data to be described more clearly and concisely. By making a distinction between proposition and predicate, a concise semantic interpretation for our modal logic is given. We also develop a model checking algorithm for the logic without the △ operator and show the correctness of the algorithm. This work can be seen as the basis of a semi-structured data processing language and a more flexible type system.
To make inorganic structure data more useful for further studies, a five-point list of simple procedures to be followed by authors of crystal structure papers is proposed. 1. A crystal structure should be described with the space group corresponding to its true symmetry. 2. A new structure proposal should be tested to see whether it is realistic in principle. 3. A structure should be described with a space group in a setting given in the International Tables. 4. For comparison with other structures, the structure data should be standardized with the program STRUCTURE TIDY. 5. “New” structure data should be checked against the databases, Chemical Abstracts, or on-line internet resources to confirm that they are really new. The list is supplemented with many explanations, commentaries, examples and references.
To extract structured data from a web page with customized requirements, a user labels some DOM elements on the page with attribute names. The common features of the labeled elements are used to guide the user through the labeling process, minimizing user effort, and are also used to retrieve attribute values. To turn the attribute values into a structured result, the attribute pattern needs to be induced. For this purpose, a space-optimized suffix tree called an attribute tree is built to transform the document object model (DOM) tree into a simpler form while preserving its useful properties, such as attribute sequence order. The pattern is induced bottom-up on the attribute tree and is further used to build the structured result. Experiments show that our approach performs well in terms of precision, recall, and structural correctness.
3D city models are widely used in many disciplines and applications, such as urban planning, disaster management, and environmental simulation. Usually, the terrain and embedded objects like buildings are taken into consideration. A consistent model integrating these elements is vital for GIS analysis, especially if the geometry is accompanied by the topological relations between neighboring objects. Such a model allows for more efficient and error-free analysis. Memory consumption is another crucial aspect when a wide area of a city is considered; lightweight models are highly desirable. Three methods of terrain representation using a geometrical-topological data structure, the dual half-edge, are proposed in this article. The integration of buildings and other structures like bridges with the terrain is also presented.
The statistical map is usually used to indicate the quantitative features of various socio-economic phenomena among regions on a base map of administrative divisions or on other base maps connected with statistical units. Making use of geographic information system (GIS) techniques, and supported by AutoCAD software, the author of this paper puts forward a practical method for making statistical maps and has developed a software tool (SMT), written in C, for making small-scale statistical maps.
Abstract: Enterprise applications utilize relational databases and structured business processes, requiring slow and expensive conversion of inputs and outputs, from business documents such as invoices, purchase orders, and receipts, into known templates and schemas before processing. We propose a new LLM Agent-based intelligent data extraction, transformation, and load (IntelligentETL) pipeline that not only ingests PDFs and detects inputs within them but also addresses the extraction of structured and unstructured data by developing tools that deal most efficiently and securely with the respective data types. We study the efficiency of our proposed pipeline and compare it with enterprise solutions that also utilize LLMs. We demonstrate the superiority of our approach in timely and accurate data extraction and transformation when analyzing data from varied sources based on nested and/or interlinked input constraints.
Funding: the National Natural Science Foundation of China (Grant Numbers 62272478, 62102450, 62102451).
Abstract: With the rapid advancement of cloud computing technology, reversible data hiding in encrypted images (RDH-EI) has developed into an important field of study focused on safeguarding privacy in distributed cloud environments. However, existing algorithms often suffer from low embedding capacities and are inadequate for complex data access scenarios. To address these challenges, this paper proposes a novel reversible data hiding algorithm in encrypted images based on adaptive median edge detection (AMED) and ciphertext-policy attribute-based encryption (CP-ABE). The proposed algorithm enhances the conventional median edge detection (MED) by incorporating dynamic variables to improve pixel prediction accuracy. The carrier image is subsequently reconstructed using the Huffman coding technique. Encrypted image generation is then achieved by encrypting the image based on system user attributes and data access rights, with the hierarchical embedding of the group’s secret data seamlessly integrated during the encryption process using the CP-ABE scheme. Ultimately, the encrypted image is transmitted to the data hider, enabling independent embedding of the secret data and resulting in the creation of the marked encrypted image. This approach allows only the receiver to extract the authorized group’s secret data, thereby enabling fine-grained, controlled access. Test results indicate that, in contrast to current algorithms, the method introduced here considerably improves the embedding rate while preserving lossless image recovery. Specifically, the average maximum embedding rates for the (3,4)-threshold and (6,6)-threshold schemes reach 5.7853 bits per pixel (bpp) and 7.7781 bpp, respectively, across the BOSSbase, BOW-2, and USD databases. Furthermore, the algorithm facilitates permission-granting and joint-decryption capabilities. Additionally, this paper conducts a comprehensive examination of the algorithm’s robustness using metrics such as image correlation, information entropy, and number of pixels change rate (NPCR), confirming its high level of security. Overall, the algorithm can be applied in a multi-user and multi-level cloud service environment to realize the secure storage of carrier images and secret data.
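For readers unfamiliar with the baseline this abstract extends, the following is a minimal sketch of the conventional MED predictor as it is commonly defined in LOCO-I/JPEG-LS. It covers only the conventional predictor; the adaptive variant (AMED) and its dynamic variables are not specified in the abstract and are not reproduced here. The function names and the nested-list image representation are assumptions made for illustration.

```python
from typing import List


def med_predict(a: int, b: int, c: int) -> int:
    """Conventional MED prediction from the left (a), upper (b),
    and upper-left (c) neighbours of the current pixel."""
    if c >= max(a, b):
        return min(a, b)   # an edge is likely: pick the smaller neighbour
    if c <= min(a, b):
        return max(a, b)   # an edge is likely: pick the larger neighbour
    return a + b - c       # smooth region: planar prediction


def prediction_errors(img: List[List[int]]) -> List[List[int]]:
    """Prediction error for every pixel that has all three neighbours.
    The first row and first column are left at 0 in this sketch."""
    height, width = len(img), len(img[0])
    errors = [[0] * width for _ in range(height)]
    for y in range(1, height):
        for x in range(1, width):
            a = img[y][x - 1]
            b = img[y - 1][x]
            c = img[y - 1][x - 1]
            errors[y][x] = img[y][x] - med_predict(a, b, c)
    return errors


# Tiny hypothetical 2x2 grayscale image with a vertical edge.
sample = [[10, 20],
          [10, 20]]
sample_errors = prediction_errors(sample)
```

In prediction-based reversible data hiding, small prediction errors are generally what make room for embedding, which is why a more accurate predictor tends to raise the embedding rate.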
Abstract: With the rapid development of information technology, smart teaching platforms have become important tools for higher education teaching reform. As a core course of computer science and technology-related majors in higher education, the data structure course lays a solid foundation for students’ professional learning and plays an important role in promoting their future success in technology, research, and industry. This study conducts an in-depth analysis of the pain points faced by the data structure course and explores a teaching reform that integrates theory and practice through the systematic use of a smart teaching platform before, during, and after class. The reform practice shows that this teaching mode improves students’ learning initiative, learning motivation, and practical skills. Students not only achieved better results in knowledge mastery but also improved significantly in problem analysis and problem solving.
Abstract: A data warehouse provides storage and management for mass data, but the data schema evolves over time. When the data schema is changed, extended, or reduced, the data in the data warehouse must comply with the changed schema, so the data warehouse must be reorganized or reconstructed, a process that is exhausting and wasteful. To cope with these problems, this paper develops an approach to modeling the data cube with XML, which has emerged as a universal format for data exchange on the Web and which can make a data warehouse flexible and scalable. This paper also extends OLAP algebra to the XML-based data cube; the extension is called X-OLAP.
Abstract: A robust and efficient algorithm is presented to build multiresolution models (MRMs) of arbitrary meshes without requirement of subdivision connectivity. To overcome the sampling difficulty of arbitrary meshes, edge contraction and vertex expansion are used as downsampling and upsampling methods. Our MRMs of a mesh are composed of a base mesh and a series of edge split operations, which are organized as a directed graph. Each split operation encodes two parts of information. One is the modification to the mesh, and the other is the dependency relation among splits. Such organization ensures the efficiency and robustness of our MRM algorithm. Examples demonstrate the functionality of our method.
Funding: Program for New Century Excellent Talents in University (No. NCET-06-0290); the National Natural Science Foundation of China (No. 60503036); the Fok Ying Tong Education Foundation Award (No. 104027).
Abstract: In order to improve the quality of web search, a new query expansion method that chooses meaningful structured data from a domain database is proposed. It categorizes attributes into three classes, namely concept attributes, context attributes, and meaningless attributes, according to their semantic features, which are document frequency features and distinguishing capability features. It also defines the semantic relevance between two attributes when they are correlated in the database. It then proposes a trie-bitmap structure and pair pointer tables to implement efficient algorithms for discovering attribute semantic features and detecting their semantic relevances. Using semantic attributes and their semantic relevances, expansion words can be generated and embedded into a vector space model with interpolation parameters. The experiments use an IMDB movie database and real text collections to evaluate the proposed method by comparing its performance with a classical vector space model. The results show that the proposed method can improve text search efficiently and that both semantic features and semantic relevances have good separation capabilities.
Funding: supported by the Knowledge Innovation Program of the Chinese Academy of Sciences and the National High-Tech R&D Program of China (2008BAK49B05).
Abstract: More and more web pages apply AJAX (Asynchronous JavaScript and XML) because of its rich interactivity and incremental communication. Observation shows that AJAX contents, which cannot be seen by traditional crawlers, are generally well-structured and belong to one specific domain. Extracting the structured data from AJAX contents and annotating its semantics are very significant for further applications. In this paper, a structured AJAX data extraction method for the agricultural domain based on an agricultural ontology is proposed. Firstly, Crawljax, an open AJAX crawling tool, was adapted to explore and retrieve the AJAX contents. Secondly, the retrieved contents were partitioned into items and then classified with the help of the agricultural ontology; HTML tags and punctuation were used to segment the retrieved contents into entity items. Finally, the entity items were clustered, and semantic annotations were assigned to the clustering results according to the agricultural ontology. Experimental evaluation showed the proposed approach to be effective in resource exploration, entity extraction, and semantic annotation.
Funding: Mainly presented at the 6th international meeting of acoustics in Aug. 2003, and the 1999 SPE Asia Pacific Oil and Gas Conference and Exhibition held in Jakarta, Indonesia, 20-22 April 1999, SPE 54274.
Abstract: In this paper, a new concept called the numerical structure of seismic data is introduced, and the difference between the numerical structure and the numerical value of seismic data is explained. Our study shows that the numerical seismic structure is closely related to oil- and gas-bearing reservoirs, so it is very useful for a geologist or a geophysicist seeking to precisely interpret the oil-bearing layers from the seismic data. This technology can be applied to any exploration or production stage. The new method has been tested on a series of exploratory and development wells in China and proved to be reliable. Hydrocarbon detection with this new method for 39 exploration wells on 25 structures indicates a success ratio of over 80 percent. The new method of hydrocarbon prediction can be applied to: (1) depositional environments of reservoirs with marine facies, delta, or non-marine facies (including fluvial facies and lacustrine facies); (2) sedimentary rocks of reservoirs that are non-marine clastic rocks and carbonate rock; and (3) burial depths ranging from 300 m to 7000 m, where the minimum thickness of the reservoirs is over 8 m (main frequency is about 50 Hz).
Funding: This reservoir research is sponsored by the National 973 Subject Project (No. 2001CB209).
Abstract: Seismic data structure characteristics are the waveform characters arranged in time sequence at the discrete data points of each 2-D or 3-D seismic trace. Hydrocarbon prediction using seismic data structure characteristics is a new reservoir prediction technique. When the main pay interval lies in carbonate fracture and fissure-cavern type reservoirs with very strong inhomogeneity, hydrocarbon prediction is difficult. Because of the special geological conditions of the eighth zone in the Tahe oil field, we apply seismic data structure characteristics to hydrocarbon prediction for the Ordovician reservoir in this zone. We divide the oil zone area into favorable and unfavorable blocks. Eighteen well locations were proposed in the favorable block; the wells were drilled and yielded higher oil and gas output.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2020R1A6A1A03046811). This paper was also supported by the Konkuk University Researcher Fund in 2021.
Abstract: Multi-fidelity Data Fusion (MDF) frameworks have emerged as a prominent approach to producing economical but accurate surrogate models for aerodynamic data modeling by integrating data with different fidelity levels. However, most existing MDF frameworks assume a uniform data structure across the sampling data sources; producing an accurate solution at the required level is therefore challenging for non-uniform data structures. To address this challenge, an Adaptive Multi-fidelity Data Fusion (AMDF) framework is proposed to produce a composite surrogate model that can efficiently model multi-fidelity data featuring non-uniform structures. Firstly, the design space of the input data with non-uniform data structures is decomposed into subdomains containing simplified structures. Secondly, different MDF frameworks and a rule-based selection process are adopted to construct multiple local models for the subdomain data. Thirdly, the Enhanced Local Fidelity Modeling (ELFM) method is proposed to combine the generated local models into a unique and continuous global model. Finally, the resulting model inherits the features of the local models and approximates a complete database for the whole design space. The proposed framework is validated to demonstrate its approximation capabilities in (A) four multi-dimensional analytical problems and (B) a practical engineering case study of constructing an F16C fighter aircraft's aerodynamic database. The accuracy of models generated with the proposed AMDF framework is compared with that of conventional MDF approaches using a single global modeling algorithm, to reveal the adaptability of the proposed approach for fusing multi-fidelity data featuring non-uniform structures. The results indicate that the proposed framework outperforms the state-of-the-art MDF approach for non-uniform data.
Abstract: The proliferation of textual data in society is currently overwhelming. In particular, unstructured textual data is constantly being generated via call centre logs, emails, documents on the web, blogs, tweets, customer comments, customer reviews, etc. While the amount of textual data is increasing rapidly, users' ability to summarise, understand, and make sense of such data for making better business and living decisions remains limited. This paper studies how to analyse textual data, based on layered software patterns, to extract insightful user intelligence from a large collection of documents and to use such information to improve user operations and performance.
Abstract: In database application development, sharing information among different DBMSs is an important and meaningful technical subject. This paper analyzes the schema definition and physical organization of popular relational DBMSs and suggests the use of an intermediary schema. This technique provides many advantages, such as strong extensibility and ease of integrating data conversions among different DBMSs. The paper also introduces a data conversion system running under the DOS and XENIX operating systems.
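The intermediary-schema idea above can be illustrated with a small sketch, under stated assumptions: the DBMS names `dbms_a` and `dbms_b`, their native type codes, and the neutral type names are hypothetical placeholders, and only column type mapping is shown (physical file organization, which the paper also analyzes, is out of scope here).

```python
from typing import Dict, List, Tuple

# A column is (name, type, length); the type is native or neutral depending on stage.
Column = Tuple[str, str, int]

# Hypothetical native-to-neutral type maps, one per DBMS.
TO_NEUTRAL: Dict[str, Dict[str, str]] = {
    "dbms_a": {"C": "string", "N": "number", "D": "date"},
    "dbms_b": {"CHAR": "string", "DECIMAL": "number", "DATE": "date"},
}

# Reverse maps (neutral-to-native), derived so the two directions stay consistent.
FROM_NEUTRAL: Dict[str, Dict[str, str]] = {
    dbms: {neutral: native for native, neutral in mapping.items()}
    for dbms, mapping in TO_NEUTRAL.items()
}


def to_intermediate(dbms: str, columns: List[Column]) -> List[Column]:
    """Translate a native schema into the intermediary (neutral) schema."""
    mapping = TO_NEUTRAL[dbms]
    return [(name, mapping[col_type], length) for name, col_type, length in columns]


def from_intermediate(dbms: str, columns: List[Column]) -> List[Column]:
    """Translate the intermediary schema into a target DBMS's native schema."""
    mapping = FROM_NEUTRAL[dbms]
    return [(name, mapping[col_type], length) for name, col_type, length in columns]


def convert(src: str, dst: str, columns: List[Column]) -> List[Column]:
    """Convert between any two registered DBMSs via the intermediary schema."""
    return from_intermediate(dst, to_intermediate(src, columns))


if __name__ == "__main__":
    print(convert("dbms_a", "dbms_b", [("id", "N", 8), ("name", "C", 20)]))
```

In this sketch, supporting a new DBMS means adding one entry to `TO_NEUTRAL` rather than writing a converter for every existing DBMS pair, which is one way to read the extensibility advantage the abstract mentions.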
Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 50539010) and the Special Fund for Public Welfare Industry of the Ministry of Water Resources of China (Grant No. 200801019).
Abstract: Using association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that a judgment obtained from weak association rules or non-association rules is more accurate and more credible than one obtained from strong association rules. When the testing grades of two indices in a weak association rule are inconsistent, the testing grades are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering data mining technology was used to analyze the reliability of a diagnosis, or to perform health diagnosis directly. Analysis showed that the clustering results depend on the indices selected: the more significant the selected indices, the more significant the characteristics of the clustering results, and the more credible the analysis or diagnosis. The indices and diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of hydraulic metal structure health diagnosis technology.
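For readers unfamiliar with association rules, the sketch below computes the two standard measures, support and confidence, for a rule between the testing grades of two indices, and labels a rule "strong" when both exceed a threshold. The index names, grades, records, and threshold values are hypothetical, and the abstract does not state how the authors define strong versus weak rules, so the thresholding here is the textbook convention rather than their method.

```python
from typing import Dict, List

# One inspection record maps an index name to its testing grade.
Record = Dict[str, str]


def support(records: List[Record], items: Dict[str, str]) -> float:
    """Fraction of records in which every (index, grade) pair in `items` holds."""
    hits = sum(all(r.get(k) == v for k, v in items.items()) for r in records)
    return hits / len(records)


def confidence(records: List[Record], antecedent: Dict[str, str],
               consequent: Dict[str, str]) -> float:
    """Estimate P(consequent | antecedent) from the records."""
    ante = support(records, antecedent)
    if ante == 0:
        return 0.0
    return support(records, {**antecedent, **consequent}) / ante


def is_strong(records: List[Record], antecedent: Dict[str, str],
              consequent: Dict[str, str],
              min_sup: float = 0.3, min_conf: float = 0.7) -> bool:
    """Textbook test: a rule is strong if it meets both thresholds (assumed values)."""
    sup = support(records, {**antecedent, **consequent})
    return sup >= min_sup and confidence(records, antecedent, consequent) >= min_conf


if __name__ == "__main__":
    # Hypothetical grades for two indices across five inspections.
    data = [
        {"corrosion": "A", "deformation": "A"},
        {"corrosion": "A", "deformation": "A"},
        {"corrosion": "A", "deformation": "B"},
        {"corrosion": "B", "deformation": "B"},
        {"corrosion": "B", "deformation": "B"},
    ]
    rule = ({"corrosion": "A"}, {"deformation": "A"})
    print(support(data, {**rule[0], **rule[1]}), confidence(data, *rule))
    print(is_strong(data, *rule))
```

A consistency check of the kind the abstract describes would then flag records whose grades contradict a rule of interest for manual review.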
Funding: National Key Research and Development Program (No. 2018YFB1305001); Major Consulting and Research Project of Chinese Academy of Engineering (No. 2018-ZD-02-07).
Abstract: Taking autonomous and driverless driving as the research object, we discuss and define the intelligent high-precision map. The intelligent high-precision map is considered a key link in future travel, a carrier of real-time perception of traffic resources over the entire space-time range, and the criterion for operating and controlling the vehicle throughout its journey. As a new form of map, it differs distinctly from traditional electronic navigation maps in cartographic theory and application requirements. It is therefore necessary to analyze and discuss its key features and problems in order to promote research on and application of the intelligent high-precision map. Accordingly, we propose an information transmission model based on cartographic theory, combined with the control flow of a wheeled robot in practical application. Next, we put forward the logical data structure of the intelligent high-precision map and analyze its application in autonomous driving. Then, we summarize the computing mode of “Crowdsourcing + Edge-Cloud Collaborative Computing” and carry out a key technical analysis of how to improve the quality of crowdsourced data. We also analyze the effective future application scenarios of the intelligent high-precision map. Finally, we present some thoughts and suggestions for the future development of this field.
Funding: Supported by the National Natural Science Foundation of China (60233010, 60273034, 60403014), the 863 Program of China (2002AA116010), and the 973 Program of China (2002CB312002).
Abstract: Tree logic, inherited from ambient logic, has been introduced as the formal foundation of related programming languages and type systems. In this paper, we introduce recursion into this logic system so that it can describe tree data more clearly and concisely. By distinguishing between propositions and predicates, a concise semantic interpretation of our modal logic is given. We also develop a model checking algorithm for the logic without the △ operator, and the correctness of the algorithm is shown. This work can serve as the basis of a semi-structured data processing language and a more flexible type system.
Abstract: To make inorganic structure data more useful for further studies, a five-point list of simple procedures to be followed by authors of crystal structure papers is proposed. 1. A crystal structure should be described with the space group corresponding to its true symmetry. 2. A new structure proposal should be tested to see whether it is realistic in principle. 3. A structure should be described with a space group in a setting given in the International Tables. 4. For comparison with other structures, the structure data should be standardized with the program STRUCTURE TIDY. 5. “New” structure data should be checked against the databases, Chemical Abstracts, or on-line internet resources to confirm that they are really new. The list is supplemented with many explanations, commentaries, examples and references.
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2009AA01Z141), the National Natural Science Foundation of China (No. 60573117), and the Beijing Natural Science Foundation (No. 4131001).
Abstract: To extract structured data from a web page according to customized requirements, a user labels some DOM elements on the page with attribute names. The common features of the labeled elements are used to guide the user through the labeling process, minimizing user effort, and are also used to retrieve attribute values. To turn the attribute values into a structured result, the attribute pattern needs to be induced. For this purpose, a space-optimized suffix tree called the attribute tree is built to transform the document object model (DOM) tree into a simpler form while preserving its useful properties, such as attribute sequence order. The pattern is induced bottom-up on the attribute tree and is then used to build the structured result. Experiments show that our approach performs well in terms of precision, recall and structural correctness.
Funding: The authors would like to thank the sponsors for their support: research on the dual half-edge data structure was funded by the EPSRC and Ordnance Survey, UK (New CASE Award, 2006-2010), and by the Technical University of Malaysia and the Ministry of Science, Technology and Innovation, Malaysia (eScience 01-01-06-SF1046, Vot No. 4S049) (2011-2014).
Abstract: 3D city models are widely used in many disciplines and applications, such as urban planning, disaster management, and environmental simulation. Usually, the terrain and embedded objects such as buildings are taken into consideration. A consistent model integrating these elements is vital for GIS analysis, especially if the geometry is accompanied by the topological relations between neighboring objects. Such a model allows more efficient and error-free analysis. Memory consumption is another crucial aspect when a wide city area is considered; lightweight models are highly desirable. This article proposes three methods of terrain representation using a geometrical-topological data structure, the dual half-edge. The integration of buildings and other structures such as bridges with the terrain is also presented.
Abstract: A statistical map is usually used to show the quantitative features of various socio-economic phenomena across regions, drawn on a base map of administrative divisions or on other base maps tied to statistical units. Using geographic information system (GIS) techniques, with the support of AutoCAD software, the author puts forward a practical method for making statistical maps and has developed a software tool (SMT), written in C, for making small-scale statistical maps.