Enterprise applications utilize relational databases and structured business processes, requiring slow and expensive conversion of inputs and outputs from business documents such as invoices, purchase orders, and receipts into known templates and schemas before processing. We propose a new LLM agent-based intelligent data extraction, transformation, and load (IntelligentETL) pipeline that not only ingests PDFs and detects the inputs within them but also addresses the extraction of structured and unstructured data by developing tools that deal most efficiently and securely with the respective data types. We study the efficiency of our proposed pipeline and compare it with enterprise solutions that also utilize LLMs. We establish the superiority of our approach's timely and accurate data extraction and transformation capabilities for analyzing data from varied sources based on nested and/or interlinked input constraints.
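To make the routing idea above concrete, here is a minimal Python sketch of an agent-style extract step that sends structured blocks to a deterministic parser and unstructured text to an LLM. All names (`Block`, `call_llm`, `extract`) are hypothetical, and `call_llm` is a stub standing in for a real model client; this illustrates the pattern, not the paper's pipeline.

```python
# Minimal sketch of an agent-style ETL router (hypothetical, not the paper's code).
import json
from dataclasses import dataclass

@dataclass
class Block:
    kind: str      # "table" or "text", as detected by an upstream PDF parser
    payload: str

def call_llm(prompt: str) -> str:
    """Stub for an LLM call; a real pipeline would invoke a hosted model."""
    return json.dumps({"vendor": "ACME", "total": 1234.5})

def extract(block: Block) -> dict:
    if block.kind == "table":
        # Structured input: deterministic parsing is cheaper and auditable.
        rows = [line.split("|") for line in block.payload.splitlines()]
        return {"rows": rows}
    # Unstructured input: delegate schema mapping to the LLM agent.
    return json.loads(call_llm(f"Extract invoice fields as JSON:\n{block.payload}"))

if __name__ == "__main__":
    print(extract(Block("text", "Invoice from ACME, total due $1,234.50")))
```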
Addressing the current challenges in transforming pixel displacement into physical displacement in visual monitoring technologies, as well as the inability to achieve precise full-field monitoring, this paper proposes a method for identifying the structural dynamic characteristics of wind turbines based on visual monitoring data fusion. Firstly, the Lucas-Kanade-Tomasi (LKT) optical flow method and a multi-region-of-interest (ROI) monitoring structure are employed to track pixel displacements, which are subsequently subjected to band-pass filtering and resampling operations. Secondly, the actual displacement time history is derived through double integration of the acquired acceleration data and subsequent band-pass filtering. The scale factor is obtained by applying the least squares method to compare the visual displacement with the displacement derived from double integration of the acceleration data. Based on this, the multi-point displacement time histories in physical coordinates are obtained using the vision data and the scale factor. Subsequently, when visual monitoring of displacements becomes impossible due to issues such as image blurring or lens occlusion, the structural vibration equation and boundary condition constraints, among other key parameters, are employed to predict the displacements at unknown monitoring points, thereby enabling full-field displacement monitoring and dynamic characteristic testing of the structure. Finally, a small-scale shaking table test was conducted on a simulated wind turbine structure undergoing shutdown to validate the proposed method. The results indicate that the proposed method achieves a time-domain error within the submillimeter range and a frequency-domain accuracy of over 99%, effectively monitoring the full-field structural dynamic characteristics of wind turbines and providing a basis for the condition assessment of wind turbine structures.
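The scale-factor step lends itself to a short worked sketch. The following numpy code (an assumption-laden illustration, not the authors' implementation) double-integrates acceleration with the trapezoidal rule and fits the scale factor by least squares; in practice both series would first be band-pass filtered to suppress integration drift, as the abstract notes.

```python
import numpy as np

def double_integrate(acc, dt):
    """Trapezoidal double integration: acceleration -> displacement (zero initial conditions)."""
    vel = np.insert(np.cumsum((acc[:-1] + acc[1:]) / 2.0) * dt, 0, 0.0)
    return np.insert(np.cumsum((vel[:-1] + vel[1:]) / 2.0) * dt, 0, 0.0)

def scale_factor(pixel_disp, acc, dt):
    """Least-squares s minimizing ||s * pixel_disp - physical_disp||."""
    phys = double_integrate(acc, dt)
    return float(np.dot(pixel_disp, phys) / np.dot(pixel_disp, pixel_disp))

# Synthetic check: a 1 Hz motion observed at 0.5 mm per pixel.
t = np.linspace(0.0, 10.0, 2001)
disp = 0.01 * (1.0 - np.cos(2.0 * np.pi * t))        # metres, zero initial velocity
acc = 0.01 * (2.0 * np.pi) ** 2 * np.cos(2.0 * np.pi * t)
print(scale_factor(disp / 5e-4, acc, t[1] - t[0]))   # ~5e-4 m per pixel
```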
Surgical site infections (SSIs) are the most common healthcare-related infections in patients with lung cancer. Constructing a lung cancer SSI risk prediction model requires the extraction of relevant risk factors from lung cancer case texts, which involves two types of text structuring tasks: attribute discrimination and attribute extraction. This article proposes a joint model, Multi-BGLC, built around these two tasks, using bidirectional encoder representations from transformers (BERT) as the encoder and fine-tuning a decoder composed of a graph convolutional neural network (GCNN), long short-term memory (LSTM), and conditional random field (CRF) on cancer case data. The GCNN is used for attribute discrimination, whereas the LSTM and CRF are used for attribute extraction. Experiments verified the effectiveness and accuracy of the model compared with other baseline models.
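As a rough schematic of the decoder composition (BERT encodings feeding an LSTM whose emission scores a CRF would decode), consider the minimal PyTorch sketch below. The BERT encoder is replaced by a plain embedding, and the GCNN attribute-discrimination branch and the CRF layer are omitted to keep it dependency-free; every dimension here is an assumption, not a value from the paper.

```python
# Schematic of a BiLSTM tagging head over token encodings (not the authors' code).
import torch
import torch.nn as nn

class TaggerSketch(nn.Module):
    def __init__(self, vocab_size=21128, hidden=256, num_tags=9):
        super().__init__()
        # Stand-in for a BERT encoder; a real model would use its hidden states.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden // 2, bidirectional=True, batch_first=True)
        self.emit = nn.Linear(hidden, num_tags)  # emissions a CRF would decode

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))
        return self.emit(h)

scores = TaggerSketch()(torch.randint(0, 21128, (2, 32)))
print(scores.shape)  # (batch, sequence, tags)
```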
More and more web pages apply AJAX (Asynchronous JavaScript and XML) due to its rich interactivity and incremental communication. It is observed that AJAX contents, which cannot be seen by traditional crawlers, are well structured and generally belong to one specific domain. Extracting structured data from AJAX contents and annotating their semantics are very significant for further applications. In this paper, a structured AJAX data extraction method for the agricultural domain based on agricultural ontology is proposed. Firstly, Crawljax, an open AJAX crawling tool, was extended to explore and retrieve the AJAX contents; secondly, the retrieved contents were partitioned into items and then classified by combining them with the agricultural ontology. HTML tags and punctuation were used to segment the retrieved contents into entity items. Finally, the entity items were clustered and semantic annotations were assigned to the clustering results according to the agricultural ontology. Experimental evaluation proved the proposed approach effective in resource exploring, entity extraction, and semantic annotation.
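The "segment, then cluster, then annotate" stage can be illustrated with a toy sketch (invented sample text; scikit-learn stands in for whatever tooling the authors used): items are split on tags and punctuation, vectorized, and grouped so each cluster can receive one ontology label.

```python
import re
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented AJAX-retrieved content; real input would come from the crawler.
content = "Wheat: 320 yuan/ton; Corn: 280 yuan/ton; Aphid alert in Hebei; Rice: 350 yuan/ton"
items = [s.strip() for s in re.split(r"[;<>]", content) if s.strip()]

# Character n-grams cope with short, punctuation-heavy entity items.
X = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3)).fit_transform(items)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for item, label in zip(items, labels):
    print(label, item)  # each cluster would then get one ontology annotation
```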
The inter-city linkage heat data provided by Baidu Migration is employed as a characterization of inter-city linkages in order to facilitate the study of the network linkage characteristics and hierarchical structure of the urban agglomeration in the Greater Bay Area through the use of the social network analysis method. This is the inaugural application of location-based-services (LBS) big data in the study of urban agglomeration network structure, which represents a novel research perspective on this topic. The study reveals that the density of network linkages in the Greater Bay Area urban agglomeration has reached 100%, indicating a mature network-like spatial structure. This structure has given rise to three distinct communities: Shenzhen-Dongguan-Huizhou, Guangzhou-Foshan-Zhaoqing, and Zhuhai-Zhongshan-Jiangmen. Additionally, cities within the Greater Bay Area urban agglomeration play different roles, suggesting that varying development strategies may be necessary to achieve staggered development. The study demonstrates that large datasets represented by LBS can offer novel insights and methodologies for the examination of urban agglomeration network structures, contingent on the appropriate mining and processing of the data.
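The two network measures behind these findings, linkage density and community partitioning, are easy to demonstrate with networkx on a small weighted graph (the edge weights below are invented for illustration, not Baidu Migration values):

```python
import networkx as nx
from networkx.algorithms import community

# Invented inter-city linkage weights for illustration only.
G = nx.Graph()
G.add_weighted_edges_from([
    ("Shenzhen", "Dongguan", 9.0), ("Dongguan", "Huizhou", 7.5),
    ("Guangzhou", "Foshan", 9.5), ("Foshan", "Zhaoqing", 6.0),
    ("Zhuhai", "Zhongshan", 8.0), ("Zhongshan", "Jiangmen", 6.5),
    ("Shenzhen", "Guangzhou", 8.5), ("Guangzhou", "Zhuhai", 5.0),
])
print(nx.density(G))  # density 1.0 would mean every city pair is directly linked
for c in community.greedy_modularity_communities(G, weight="weight"):
    print(sorted(c))   # expected to recover the three communities named above
```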
The structural modeling of open-high-low-close (OHLC) data contained within the candlestick chart is crucial to financial practice. However, the inherent constraints in OHLC data pose immense challenges to its structural modeling. Models that fail to process these constraints may yield results deviating from those of the original OHLC data structure. To address this issue, a novel unconstrained transformation method, along with its explicit inverse transformation, is proposed to properly handle the inherent constraints of OHLC data. A flexible and effective framework for structurally modeling OHLC data is designed, and the detailed procedure for modeling OHLC data through the vector autoregression and vector error correction models is provided as an example of multivariate time-series analysis. Extensive simulations and three authentic financial datasets from the Kweichow Moutai, CSI 100 index, and 50 ETF of the Chinese stock market demonstrate the effectiveness and stability of the proposed modeling approach. The modeling results of support vector regression provide further evidence that the proposed unconstrained transformation not only ensures structural forecasting of OHLC data but also serves as an effective feature-extraction method that can improve the forecasting accuracy of machine-learning models for close prices.
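One natural way to realize such an unconstrained transformation, sketched below in numpy, maps (open, high, low, close) to four unconstrained reals via log and logit. The paper's exact construction may differ, so treat this as an illustrative guess; it does, however, satisfy the constraints low > 0, low ≤ high, and open, close ∈ [low, high] on inversion (strict interior assumed for the logit).

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def ohlc_to_unconstrained(o, h, l, c):
    """Map (open, high, low, close) to four unconstrained reals.

    Assumes l > 0, h > l, and o, c strictly inside (l, h)."""
    return (np.log(l),                 # positivity of low
            np.log(h - l),             # high strictly above low
            logit((o - l) / (h - l)),  # open inside (low, high)
            logit((c - l) / (h - l)))  # close inside (low, high)

def unconstrained_to_ohlc(y1, y2, y3, y4):
    """Explicit inverse: always yields data satisfying the OHLC constraints."""
    expit = lambda y: 1.0 / (1.0 + np.exp(-y))
    l = np.exp(y1)
    h = l + np.exp(y2)
    return (l + (h - l) * expit(y3), h, l, l + (h - l) * expit(y4))

print(unconstrained_to_ohlc(*ohlc_to_unconstrained(10.2, 10.8, 9.9, 10.5)))
# round-trips to (10.2, 10.8, 9.9, 10.5)
```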
With the rapid development of information technology, smart teaching platforms have become important tools for higher education teaching reform. As a core course of computer science and technology-related majors in higher education, the data structure course lays a solid foundation for students' professional learning and plays an important role in promoting their future success in technology, research, and industry. This study conducts an in-depth analysis of the pain points faced by the data structure course, and explores a teaching reform and practice integrating theory and practice, based on the systematic application of a smart teaching platform before, during, and after class. The reform practice shows that this teaching mode improves students' learning initiative, learning motivation, and practical skills. Students not only achieved better results in knowledge mastery but also improved significantly in problem analysis and solution.
Tree logic, inherited from ambient logic, is introduced as the formal foundation of related programming languages and type systems. In this paper, we introduce recursion into such a logic system, which can describe tree data more clearly and concisely. By making a distinction between proposition and predicate, a concise semantic interpretation for our modal logic is given. We also develop a model checking algorithm for the logic without the △ operator, and show the correctness of the algorithm. Such work can be seen as the basis of a semi-structured data processing language and a more flexible type system.
Hotel review data analysis is a key way to understand customers' opinions on hotel service quality and experience. By analyzing these comments, hotel managers can gain an in-depth understanding of customers' needs and expectations, and thereby adjust strategies and improve service quality. This article introduces how to conduct hotel review data analysis and how to transform this data into practical operational suggestions.
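A trivially small sketch of the comments-to-suggestions step (invented ratings; a real pipeline would first extract aspects and sentiment from free text): aggregate scores per service aspect and surface the weakest aspects as candidate improvement actions.

```python
from collections import defaultdict

# Invented (aspect, rating) pairs standing in for mined review sentiment.
reviews = [("cleanliness", 2), ("breakfast", 5), ("cleanliness", 1), ("wifi", 3)]
scores = defaultdict(list)
for aspect, rating in reviews:
    scores[aspect].append(rating)

# Lowest-scoring aspects come first: the shortlist for operational action.
for aspect, ratings in sorted(scores.items(), key=lambda kv: sum(kv[1]) / len(kv[1])):
    print(f"{aspect}: avg {sum(ratings) / len(ratings):.1f} over {len(ratings)} reviews")
```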
With the rapid development of science and technology, the application of intelligent technology in the field of civil engineering has become more extensive, especially in the safety evaluation and management of engineering structures. This paper discusses the role of intelligent technologies (such as artificial intelligence, the Internet of Things, BIM, and big data analysis) in the monitoring, evaluation, and maintenance of engineering structure safety. By studying the principles, application scenarios, and advantages of intelligent technology in structural safety evaluation, this paper summarizes how intelligent technology can improve engineering management efficiency and reduce safety risks, and puts forward the trends and challenges of future development.
In 1971, the world's biggest structural biology collaboration, the Research Collaboratory for Structural Bioinformatics (RCSB), was formed to gather all structural biologists on a single platform, and it then grew into the world's most extensive structural data repository, the RCSB Protein Data Bank (PDB) (https://www.rcsb.org/), which has provided its service for more than 50 years and continues its legacy of discoveries and repositories for structural data. The RCSB has evolved from a collaboratory network into a full-fledged database and toolset with a huge list of protein structures, nucleic acid-containing structures, ModelArchive, and AlphaFold structures, and it is expanding day by day with computational advancements in tools and visual experiences. In this review article, we discuss how RCSB has been a successful collaboratory network, its expansion in each decade, and how it has helped ground-breaking research. The PDB tools that help researchers, yearly data deposition, validation, and processing, and suggestions that can help developers improve in upcoming years are also discussed. This review will help future researchers understand the complete history of RCSB and its developments in each decade, and how future collaborative networks in various scientific areas can be developed and succeed by keeping RCSB as a case study.
In this paper, we research mass structured data storage and sorting algorithms and methodology for SQL databases under the big data environment. As the data storage market develops, the server-centric storage model is shifting to a data-centric one. Traditionally, storage is considered only at the outset and merely keeps a series of data; the management system and storage devices rarely consider the intrinsic value of the stored data. The prosperity of the Internet has changed data storage worldwide, with the emergence of many new applications. Theoretically, the proposed algorithm has the ability to deal with massive data and, numerically, it can enhance processing accuracy and speed, which is meaningful.
In this paper, we conduct research on structured data mining algorithms and applications in the machine learning field. With the advancement of informatization and digitization in various fields, a lot of multi-source and heterogeneous data are stored in a distributed fashion; in order to achieve sharing, a series of mechanisms, methods, and implementation technologies, from storage management to interoperability, must be solved. Unstructured data do not have a strict structure and, compared with structured information, are therefore more difficult to standardize and manage. According to these characteristics, large-capacity unstructured data are stored separately as files, with pointer-like indexes stored in the database. Against this background, we propose a new and meaningful idea for a structured data mining algorithm.
A robust and efficient algorithm is presented to build multiresolution models (MRMs) of arbitrary meshes without the requirement of subdivision connectivity. To overcome the sampling difficulty of arbitrary meshes, edge contraction and vertex expansion are used as downsampling and upsampling methods. Our MRMs of a mesh are composed of a base mesh and a series of edge split operations, which are organized as a directed graph. Each split operation encodes two parts of information: one is the modification to the mesh, and the other is the dependency relation among splits. Such organization ensures the efficiency and robustness of our MRM algorithm. Examples demonstrate the functionality of our method.
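The directed-graph organization of split operations can be sketched as a small record type plus a replay routine that respects dependencies; the field names below are illustrative, not the paper's.

```python
from dataclasses import dataclass, field

@dataclass
class SplitOp:
    vertex: int                  # vertex whose split this operation replays
    child_a: int                 # the two vertices restored by the split
    child_b: int
    depends_on: list = field(default_factory=list)  # prerequisite SplitOps

def replay(op, applied):
    """Apply a split only after all splits it depends on (DAG traversal)."""
    for dep in op.depends_on:
        replay(dep, applied)
    if op.vertex not in applied:
        applied.add(op.vertex)   # a real mesh kernel would edit connectivity here

root = SplitOp(vertex=0, child_a=1, child_b=2)
leaf = SplitOp(vertex=1, child_a=3, child_b=4, depends_on=[root])
done = set()
replay(leaf, done)
print(done)  # {0, 1}: the prerequisite split ran before the dependent one
```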
Structural health monitoring (SHM) is a multi-discipline field that involves the automatic sensing of structural loads and response by means of a large number of sensors and instruments, followed by a diagnosis of the structural health based on the collected data. Because an SHM system implemented into a structure automatically senses, evaluates, and warns about structural conditions in real time, massive data are a significant feature of SHM. The techniques related to massive data are referred to as data science and engineering, and include acquisition techniques, transition techniques, management techniques, and processing and mining algorithms for massive data. This paper provides a brief review of the state of the art of data science and engineering in SHM as investigated by these authors, and covers the compressive sampling-based data-acquisition algorithm, the anomaly data diagnosis approach using a deep learning algorithm, crack identification approaches using computer vision techniques, and condition assessment approaches for bridges using machine learning algorithms. Future trends are discussed in the conclusion.
In this paper, a new concept called the numerical structure of seismic data is introduced, and the difference between the numerical structure and the numerical value of seismic data is explained. Our study shows that the numerical seismic structure is closely related to oil- and gas-bearing reservoirs, so it is very useful for a geologist or a geophysicist to precisely interpret the oil-bearing layers from the seismic data. This technology can be applied at any exploration or production stage. The new method has been tested on a series of exploratory and development wells and proved to be reliable in China. Hydrocarbon detection with this new method for 39 exploration wells on 25 structures indicates a success ratio of over 80 percent. The new method of hydrocarbon prediction can be applied to: (1) depositional environments of reservoirs with marine facies, delta, or non-marine facies (including fluvial facies and lacustrine facies); (2) sedimentary rocks of reservoirs that are non-marine clastic rocks and carbonate rocks; and (3) burial depths ranging from 300 m to 7000 m, with a minimum reservoir thickness of over 8 m (main frequency about 50 Hz).
In order to improve the quality of web search, a new query expansion method that chooses meaningful structured data from a domain database is proposed. It categorizes attributes into three different classes, named concept attributes, context attributes, and meaningless attributes, according to their semantic features, which are document frequency features and distinguishing capability features. It also defines the semantic relevance between two attributes when they have correlations in the database. It then proposes a trie-bitmap structure and pair pointer tables to implement efficient algorithms for discovering attribute semantic features and detecting their semantic relevances. By using semantic attributes and their semantic relevances, expansion words can be generated and embedded into a vector space model with interpolation parameters. The experiments use an IMDB movie database and real text collections to evaluate the proposed method by comparing its performance with a classical vector space model. The results show that the proposed method can improve text search efficiently, and that the discovered semantic features and semantic relevances have good separation capabilities.
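The final embedding step, expansion words mixed into a vector space model with an interpolation parameter, might look like the toy function below; `alpha` and the weighting scheme are assumptions for illustration, not the paper's formula.

```python
# Toy interpolated query expansion: mix original and expansion term weights.
def expand_query(q_terms, expansion_terms, alpha=0.7):
    weights = {}
    for t in q_terms:
        weights[t] = weights.get(t, 0.0) + alpha / len(q_terms)
    for t in expansion_terms:
        weights[t] = weights.get(t, 0.0) + (1.0 - alpha) / len(expansion_terms)
    return weights

# Movie-domain example in the spirit of the IMDB evaluation above.
print(expand_query(["heat", "movie"], ["thriller", "1995", "crime"]))
```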
Seismic data structure characteristics refer to the waveform character arranged in the time sequence at discrete data points in each 2-D or 3-D seismic trace. Hydrocarbon prediction using seismic data structure characteristics is a new reservoir prediction technique. When the main pay interval is in carbonate fracture and fissure-cavern type reservoirs with very strong inhomogeneity, there are some difficulties with hydrocarbon prediction. Because of the special geological conditions of the eighth zone in the Tahe oil field, we apply seismic data structure characteristics to hydrocarbon prediction for the Ordovician reservoir in this zone. We divide the oil zone in the area into favorable and unfavorable blocks. Eighteen well locations were proposed in the favorable oil block, drilled, and recovered higher output of oil and gas.
Aiming to increase the efficiency of gem design and manufacturing, a new method for computer-aided design (CAD) of convex faceted gem cuts (CFGC) based on the half-edge data structure (HDS), including the algorithms for its implementation, is presented in this work. By using object-oriented methods, the geometrical elements of CFGC are classified and corresponding geometrical feature classes are established. Each class is implemented and embedded in accordance with the gem process. Matrix arithmetic and analytical geometry are used to derive the affine transformation and the cutting algorithm. Based on the demand for a diversity of gem cuts, CAD functions for both free-style faceted cuts and parametric designs of typical cuts, as well as visualization and human-computer interactions of the CAD system, including two-dimensional and three-dimensional interactions, have been realized, which enhances the flexibility and universality of the CAD system. Furthermore, data in this CAD system can also be used directly by the gem CAM module, which will promote gem CAD/CAM integration.
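For readers unfamiliar with HDS, here is a textbook-style half-edge record and a face-walking loop in Python; the field names are illustrative, and the paper's classes are richer and tied to gem geometry.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HalfEdge:
    origin: int                           # vertex this half-edge leaves
    twin: Optional["HalfEdge"] = None     # oppositely directed partner edge
    nxt: Optional["HalfEdge"] = None      # next half-edge around the same facet
    face: Optional[int] = None            # incident facet of the cut

def face_vertices(start):
    """Collect the vertex loop of one facet by following `nxt` pointers."""
    out, e = [], start
    while True:
        out.append(e.origin)
        e = e.nxt
        if e is start:
            return out

# A single triangular facet: three half-edges chained into a loop.
a, b, c = HalfEdge(0), HalfEdge(1), HalfEdge(2)
a.nxt, b.nxt, c.nxt = b, c, a
print(face_vertices(a))  # [0, 1, 2]
```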
Monodisperse nanoparticle assembly with tunable structure, composition, and properties can be taken as a superstructured building block for the construction of hierarchical nanostructures from the bottom up, which also represents a great challenge in nanotechnology. Here we report on a facile and controllable method that enables a high-yield fabrication of uniform gold nanoparticle (AuNP) core-satellites with a definable (average) number of satellite particles and a tunable core-to-satellite distance. The formation of the core-satellite nanostructures is driven by programmable DNA base-pairing, with the resulting nanocomplexes being isolatable via gel electrophoresis. By rationally controlling the DNA coverages on the core and shell particles, high production yields are achieved for the assembly/isolation process. As well, benefiting from a minimum DNA coverage on the satellite AuNPs, a strong affinity is observed for the as-prepared core-satellites to get adsorbed on protein-coated graphene oxide, which allows for a two-dimensional hierarchical assembly of the core-satellite structures. The resulting hierarchical nanoassemblies are expected to find applications in various areas, including plasmonics, biosensing, and nanocatalysis. The method should be generalizable to make even more complicated and higher-order structures by making use of the structural programmability of DNA molecules.