Funding: Co-supported by the Special Research on Civil Aircraft of China (No. MJZ-2017-J-96) and the Defense Industrial Technology Development Program of China (No. JCKY2016206B009).
Abstract: In cabin-type component alignment, digital measurement technology is usually adopted to provide guidance for assembly. Depending on the measurement system, the alignment process can be divided into measurement-assisted assembly (MAA) and force-driven assembly. In MAA, the relative pose between components is measured directly to guide assembly, while in force-driven assembly only the contact state can be recognized from the measured six-dimensional force and torque (6D F/T), and the process is completed according to a preset assembly strategy. Aiming to improve the efficiency of force-driven cabin-type component alignment, this paper proposes a heuristic alignment method based on multi-source data fusion, in which the measured 6D F/T, pose data, and geometric information of the components are fused to calculate the relative pose between components and to guide the movement of the pose adjustment platform. Among these data types, the pose data and the measured 6D F/T are combined into data sets. To collect the data sets needed for data fusion, a dynamic gravity compensation method and a hybrid motion control method are designed. The relative pose calculation method is then elaborated: it transforms the collected data sets into discrete geometric elements and calculates the relative poses based on the geometric information of the components. Finally, experiments conducted in a simulation environment show that the proposed alignment method is feasible and effective.
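As a rough sketch of the gravity-compensation step mentioned above (illustrative only, not the paper's implementation), the snippet below removes the gravity wrench of a grasped component from a measured 6D F/T reading given the current orientation; the mass, centre-of-mass offset, and frame conventions are assumed values.

```python
import numpy as np

def compensate_gravity(ft_meas, R_world_to_sensor, mass, com_offset, g=9.81):
    """Subtract the gravity wrench of the grasped component from a 6D F/T reading.

    ft_meas           : (6,) measured [Fx, Fy, Fz, Tx, Ty, Tz] in the sensor frame
    R_world_to_sensor : (3, 3) rotation mapping world-frame vectors into the sensor frame
    mass              : component mass in kg (assumed known)
    com_offset        : (3,) centre of mass of the component in the sensor frame
    """
    ft_meas = np.asarray(ft_meas, dtype=float)
    gravity_world = np.array([0.0, 0.0, -mass * g])            # gravity force in the world frame
    f_grav = R_world_to_sensor @ gravity_world                  # expressed in the sensor frame
    t_grav = np.cross(np.asarray(com_offset, float), f_grav)    # torque induced about the sensor origin
    return ft_meas - np.concatenate([f_grav, t_grav])           # contact wrench only

# Example: a 12 kg component with the sensor frame aligned to the world frame
ft = compensate_gravity([1.0, 0.5, -120.0, 0.2, -0.4, 0.0],
                        np.eye(3), mass=12.0, com_offset=[0.0, 0.0, 0.15])
print(ft)
```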
Funding: The Key National Natural Science Foundation of China (No. U1864211), the National Natural Science Foundation of China (No. 11772191), and the Natural Science Foundation of Shanghai (No. 21ZR1431500).
Abstract: Industrial data mining usually deals with data from different sources. These heterogeneous datasets describe the same object from different views. However, samples from some of the datasets may be lost, so the remaining samples no longer correspond one-to-one. Mismatched datasets caused by missing samples make the industrial data unavailable for further machine learning. To align the mismatched samples, this article presents a cooperative iteration matching method (CIMM) based on modified dynamic time warping (DTW). The proposed method regards the sequentially accumulated industrial data as time series, and mismatched samples are aligned by DTW. In addition, dynamic constraints are applied to the warping distance of the DTW process to make the alignment more efficient. A series of models is then trained iteratively with the accumulated samples. Several groups of numerical experiments on different missing patterns and missing locations are designed and analyzed to prove the effectiveness and applicability of the proposed method.
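To make the warping-distance constraint concrete, here is a minimal sketch of DTW restricted to a band around the diagonal (a generic Sakoe-Chiba-style window, not the paper's specific dynamic constraint):

```python
import numpy as np

def constrained_dtw(a, b, window):
    """Plain DTW between 1-D sequences a and b, restricted to a band of
    half-width `window` around the diagonal (a generic warping-distance constraint)."""
    n, m = len(a), len(b)
    window = max(window, abs(n - m))                    # band must contain the diagonal endpoints
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],           # insertion
                                 D[i, j - 1],           # deletion
                                 D[i - 1, j - 1])       # match
    return D[n, m]

print(constrained_dtw([1, 2, 3, 4, 5], [1, 2, 2, 3, 5], window=2))
```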
Abstract: A data acquisition system based on LabVIEW and the NI PXI-5105 is presented for multi-channel data acquisition. It realizes parameter setting, data acquisition and storage, waveform display, and data analysis using LabVIEW and the NI-SCOPE device driver. The advantages of the system are convenient configuration, easy operation, a friendly interface, and practical functions. Experimental results show that the system has good stability and high reliability and is a powerful tool for multi-channel data acquisition.
Funding: Financial support from the National Natural Science Foundation of China (Grant Nos. 62174073, 61875073, 11674130, 91750110 and 61522504), the National Key R&D Program of China (Grant No. 2018YFB1107200), the Guangdong Provincial Innovation and Entrepreneurship Project (Grant No. 2016ZT06D081), the Natural Science Foundation of Guangdong Province, China (Grant Nos. 2016A030306016 and 2016TQ03X981), the Pearl River Nova Program of Guangzhou (Grant No. 201806010040), and the Technology Innovation and Development Plan of Yantai (Grant No. 2020XDRH095).
Abstract: Encoding information in light polarization is of great importance for optical data storage (ODS), both for information security and for escalating data storage capacity. However, despite recent advances in nanophotonic techniques that vastly enhance the feasibility of applying polarization channels, the data fidelity of reconstructed bits has been constrained by severe crosstalk between different polarization angles during data recording and reading, which has gravely hindered the practical use of this technique. In this paper, we demonstrate an ultra-low-crosstalk polarization-encoding multilayer ODS technique for high-fidelity data recording and retrieval, which utilizes a nanofibre-based nanocomposite film containing highly aligned gold nanorods (GNRs). By aligning the gold nanorods in parallel in the recording medium, the information-carrier configuration minimizes miswriting and misreading during information input and output, respectively, compared with randomly self-assembled counterparts. The enhanced data accuracy significantly improves the bit recall fidelity, quantified by a correlation coefficient higher than 0.99. It is anticipated that the demonstrated technique can facilitate the development of multiplexing ODS for a greener future.
Abstract: A data acquisition system (DAS) for high-speed, real-time, multi-channel data acquisition and storage is presented. System control is implemented by a combination of a complex programmable logic device (CPLD) and a digital signal processor (DSP); the bulk buffer is implemented by the CPLD, the DSP, and synchronous dynamic random access memory (SDRAM); and data transfer is implemented by the DSP, a first-in first-out (FIFO) buffer, the universal serial bus (USB), and a USB hub. The system can not only work independently in single-channel mode, but also form a high-speed real-time multi-channel data acquisition system (MCDAS) by combining multiple single channels. The sampling rate and data storage capacity of each channel reach up to 100 million samples per second and 256 MB, respectively.
Funding: National Natural Science Foundation of China (No. 50905169).
Abstract: This paper describes the detailed design of a multi-channel, high-precision data acquisition device for aerospace applications. Based on a detailed analysis of the advantages and disadvantages of two common acquisition circuits, the design of the acquisition device focuses on accuracy, sampling rate, hardware overhead, and design space. The mechanical structure of the system is divided into card layers according to function, and this structure offers high reliability, convenient installation, and scalability. To ensure a reliable operation mode, the interface is isolated from the external circuit by optocouplers. Signal transmission is determined by the current in the current loop formed by the optocouplers between the acquisition device and the test bench. For the multi-channel switching circuit, the selection principles of the circuit modes are given by establishing an analog multiplexer model.
Funding: The National Natural Science Foundation of China (Grant No. 62062001) and the Ningxia Youth Top Talent Project (2021).
Abstract: In the realm of data privacy protection, federated learning aims to collaboratively train a global model. However, heterogeneous data across clients presents challenges, often resulting in slow convergence and inadequate accuracy of the global model. Utilizing shared feature representations alongside customized classifiers for individual clients emerges as a promising personalized solution. Nonetheless, previous research has frequently neglected the integration of global knowledge into local representation learning and the synergy between global and local classifiers, thereby limiting model performance. To tackle these issues, this study proposes a hierarchical optimization method for federated learning with feature alignment and the fusion of classification decisions (FedFCD). FedFCD regularizes the relationship between global and local feature representations to achieve alignment, and incorporates decision information from the global classifier to facilitate the late fusion of decision outputs from both global and local classifiers. Additionally, FedFCD employs a hierarchical optimization strategy to flexibly optimize model parameters. Experiments on the Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets demonstrate the effectiveness and superiority of FedFCD. For instance, on the CIFAR-100 dataset, FedFCD improved average test accuracy by 6.83% compared with four outstanding personalized federated learning approaches. Furthermore, extended experiments confirm the robustness of FedFCD across various hyperparameter values.
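A minimal PyTorch sketch of the two ingredients highlighted above, pulling local feature representations toward the global ones and late-fusing the two classifiers' decisions, is shown below; all function names, weights, and tensor shapes are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn.functional as F

def local_training_loss(features_local, features_global, logits_local, logits_global,
                        labels, align_weight=0.1, fusion_weight=0.5):
    """One client-side loss combining feature alignment and decision fusion.

    features_local/global : (batch, d) representations from the local and global feature extractors
    logits_local/global   : (batch, n_classes) outputs of the local and global classifiers
    """
    # Regularize local representations toward the (frozen) global ones.
    align_loss = F.mse_loss(features_local, features_global.detach())
    # Late fusion of decision outputs from both classifiers.
    fused_logits = fusion_weight * logits_local + (1.0 - fusion_weight) * logits_global
    cls_loss = F.cross_entropy(fused_logits, labels)
    return cls_loss + align_weight * align_loss

# Toy usage with random tensors
f_loc, f_glob = torch.randn(8, 32), torch.randn(8, 32)
z_loc, z_glob = torch.randn(8, 10), torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
print(local_training_loss(f_loc, f_glob, z_loc, z_glob, y))
```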
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 5200110367), the Natural Science Foundation of Jiangsu Province (Grant No. SBK2020043219), the Scientific Research Foundation of the Higher Education Institutions of Jiangsu Province (Grant No. 19KJB510052), and NUPTSF (Grant No. NY219023).
Abstract: Initial alignment is the precondition for a strapdown inertial navigation system (SINS) to navigate. Its two important indexes are accuracy and rapidity: the accuracy of the initial alignment is directly related to the working accuracy of the SINS, but in self-alignment the two indexes are often contradictory. In view of the limitations of conventional data processing algorithms, a novel compass alignment method for SINS based on stored data and repeated navigation calculation is proposed. By means of data storage, the same data is reused in different stages of the initial alignment, which helps shorten the initial alignment time and improve the alignment accuracy. To verify the correctness of the compass algorithm based on stored data and repeated navigation calculation, a simulation experiment was performed. The results show that, when computer performance is sufficiently high, compass alignment based on stored data and forward-and-reverse navigation calculation can effectively improve both the alignment speed and the alignment accuracy.
Abstract: In this paper, we present machine learning algorithms and systems for similar-video retrieval, where the query is itself a video. For the similarity measurement, exemplars, or representative frames of each video, are extracted by unsupervised learning; for this learning, we chose order-aware competitive learning. After obtaining a set of exemplars for each video, the similarity is computed. Because the number and positions of the exemplars differ between videos, we use a similarity computing method called M-distance, which generalizes existing global and local alignment methods by using followers of the exemplars. To represent each frame of the video, this paper emphasizes the Frame Signature of the ISO/IEC standard so that the total system, along with its graphical user interface, becomes practical. Experiments on the detection of inserted plagiaristic scenes showed excellent precision-recall curves, with precision values very close to 1. Thus, the proposed system can work as a plagiarism detector for videos. In addition, this method can be regarded as structuring unstructured data via numerical labeling by exemplars. Finally, further sophistication of this labeling is discussed.
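The overall pipeline can be sketched as follows, with k-means standing in for order-aware competitive learning and a symmetric nearest-exemplar distance standing in for the paper's M-distance (both are stand-ins, not the authors' method):

```python
import numpy as np
from sklearn.cluster import KMeans

def exemplars(frame_features, k=8, seed=0):
    """Pick k exemplar frames (cluster centres) from an (n_frames, d) feature array.
    KMeans is only a stand-in for the order-aware competitive learning used in the paper."""
    k = min(k, len(frame_features))
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(frame_features).cluster_centers_

def exemplar_set_distance(ex_a, ex_b):
    """Symmetric mean nearest-exemplar distance between two exemplar sets
    (a simple stand-in for the paper's M-distance)."""
    d = np.linalg.norm(ex_a[:, None, :] - ex_b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

video_a = np.random.rand(200, 64)   # hypothetical per-frame signatures
video_b = np.random.rand(180, 64)
print(exemplar_set_distance(exemplars(video_a), exemplars(video_b)))
```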
Funding: Supported by the National Natural Science Foundation of China (No. 41821002), the Fundamental Research Funds for the Central Universities (Nos. 18CX02065A, 20CX06046A), the Young Elite Scientist Sponsorship Program by the China Association for Science and Technology, Major Scientific and Technological Projects of CNPC (No. ZD2019-183-004), the Qingdao Postdoctoral Applied Research Project (No. qdyy20190079), and the China Postdoctoral Science Foundation (No. 2020M672171).
Abstract: Currently, most rock physics models used for evaluating the elastic properties of cracked or fractured media take into account the crack properties but not the background anisotropy, which creates errors in anisotropy estimates obtained from field logging data. In this work, based on scattered-wavefield theory, a sphere-equivalency method of elastic wave scattering was developed to accurately calculate the elastic properties of a vertically transversely isotropic solid containing aligned cracks. By setting the scattered wavefield due to a crack equal to that due to an equivalent sphere, an effective elastic stiffness tensor was derived for the cracked medium. The stability and accuracy of the approach were determined for varying background anisotropy values. The results show that, for a transversely isotropic background permeated by horizontally aligned cracks, the anisotropy of the effective medium is affected by both the cracks and the background anisotropy, especially for elastic waves propagating along the horizontal direction. Meanwhile, the crack orientation has a significant influence on the elastic wave velocity anisotropy. The theory was subsequently applied to model laboratory ultrasonic experimental data for artificially cracked samples and to model borehole acoustic anisotropy measurements. After considering the background anisotropy, the model shows improved agreement between theoretical predictions and measurement data, demonstrating that the present theory can adequately explain the anisotropic characteristics of cracked media.
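The sphere-equivalency derivation itself is in the paper; as a standard (and separate) way to quantify the anisotropy of such a VTI effective stiffness tensor, the Thomsen parameters can be read off the Voigt matrix as below. The example stiffness values are illustrative only.

```python
import numpy as np

def thomsen_parameters(C):
    """Standard Thomsen anisotropy parameters of a VTI stiffness matrix C (6x6, Voigt notation)."""
    c11, c13, c33, c44, c66 = C[0, 0], C[0, 2], C[2, 2], C[3, 3], C[5, 5]
    epsilon = (c11 - c33) / (2.0 * c33)                                     # P-wave anisotropy
    gamma = (c66 - c44) / (2.0 * c44)                                       # SH-wave anisotropy
    delta = ((c13 + c44) ** 2 - (c33 - c44) ** 2) / (2.0 * c33 * (c33 - c44))
    return epsilon, gamma, delta

# Example: a weakly anisotropic stiffness matrix (GPa); values are illustrative, not from the paper
C = np.zeros((6, 6))
C[0, 0] = C[1, 1] = 40.0; C[2, 2] = 30.0
C[3, 3] = C[4, 4] = 10.0; C[5, 5] = 14.0
C[0, 2] = C[2, 0] = C[1, 2] = C[2, 1] = 12.0
C[0, 1] = C[1, 0] = C[0, 0] - 2 * C[5, 5]
print(thomsen_parameters(C))
```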
Abstract: With the popularisation of intelligent power, power devices come in different shapes, numbers, and specifications. As a result, power data exhibits distributional variability, and the model learning process cannot extract data features sufficiently, which seriously affects the accuracy and performance of anomaly detection. Therefore, this paper proposes a deep-learning-based anomaly detection model for power data that integrates a data alignment enhancement technique based on random sampling and an adaptive feature fusion method leveraging dimension reduction. To address the distributional variability of power data, a sliding-window-based data adjustment method is developed for this model, which solves the problems of high-dimensional feature noise and low-dimensional missing data. To address insufficient feature fusion, an adaptive feature fusion method based on feature dimension reduction and dictionary learning is proposed to improve the anomaly detection accuracy of the model. To verify the effectiveness of the proposed method, we conducted comparisons through ablation experiments. The experimental results show that, compared with traditional anomaly detection methods, the proposed method not only has an advantage in model accuracy, but also reduces the amount of parameter calculation during feature matching and improves detection speed.
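As a loose illustration of a sliding-window data adjustment (the window length and the median-imputation rule are assumptions, not the paper's method), short gaps in a power series can be filled window by window:

```python
import numpy as np

def sliding_window_impute(x, window=5):
    """Fill missing samples (NaN) in a 1-D power-data series with the median of a
    sliding window around each gap; the window length is an assumption for illustration."""
    x = np.asarray(x, dtype=float).copy()
    half = window // 2
    for i in np.flatnonzero(np.isnan(x)):
        neigh = x[max(0, i - half): i + half + 1]
        neigh = neigh[~np.isnan(neigh)]
        if neigh.size:                       # leave the gap if the whole window is missing
            x[i] = np.median(neigh)
    return x

print(sliding_window_impute([1.0, 1.2, np.nan, 1.4, 1.1, np.nan, 1.3]))
```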
Abstract: The massive expansion of biological data has created a need for user-friendly bioinformatics tools that can be used for routine biological data manipulation. Bioanalyzer is a simple analytical software package that implements a variety of tools to perform common analyses on different biological data types and databases. Bioanalyzer covers general aspects of data analysis, such as handling nucleotide data, fetching information from different data formats, NGS quality control, data visualization, multiple sequence alignment, and sequence BLAST. These tools accept common biological data formats and produce human-readable output files that can be stored on local machines. Bioanalyzer has a user-friendly graphical user interface to simplify large-scale biological data analysis, and it consumes little memory and processing power. The Bioanalyzer source code was written in the Python programming language, which keeps memory usage and initial startup time low. Bioanalyzer is free and open-source software whose code can be modified, extended, or integrated into different bioinformatics pipelines. Bioinformatics produces huge amounts of data in FASTA and GenBank formats, from which a great deal of annotation information can be derived; Python's flexibility and simplicity in data analysis make it well suited to this task and inspired us to develop new multi-tool software able to manipulate FASTA and GenBank files. The goal was to develop new software that uses genomic data files to produce annotated data. The software was written using the Python programming language and Biopython packages.
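For readers unfamiliar with the Biopython building blocks such a tool relies on, a minimal example of reading FASTA and GenBank files and listing their annotation looks like this (file paths are placeholders):

```python
from Bio import SeqIO

# Iterate over records in a FASTA file (path is a placeholder)
for record in SeqIO.parse("sequences.fasta", "fasta"):
    print(record.id, len(record.seq))

# Read a GenBank file and list its annotated features
gb_record = SeqIO.read("entry.gb", "genbank")
print(gb_record.description)
for feature in gb_record.features:
    print(feature.type, feature.location)
```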
Funding: This work was supported in part by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2018R1C1B5084424), and in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2019R1A6A1A03032119).
Abstract: Advancements in next-generation sequencer (NGS) platforms have improved NGS sequence data production and reduced the cost involved, which has resulted in the production of a large amount of genome data. The downstream analysis of multiple associated sequences has become a bottleneck for the growing genomic data due to storage and space utilization issues in the domain of bioinformatics. Traditional string-matching algorithms are efficient for small data sequences but cannot process large amounts of data for downstream analysis. This study proposes a novel bit-parallelism algorithm called BitmapAligner to overcome the issues caused by a large number of sequences and to improve the speed and quality of multiple sequence alignment (MSA). The input files (sequences) tested with BitmapAligner can be easily managed and organized using the Hadoop distributed file system. The proposed aligner converts the test file (the whole-genome sequence) into equal-length binary representations of the sequence, line by line, before the sequence alignment processing. The Hadoop distributed file system splits larger files into blocks based on a defined block size, which is 128 MB by default. After sorting the data, BitmapAligner can accurately process the sequence alignment using the bitmask approach on large-scale sequences. The experimental results indicate that BitmapAligner operates in real time with a large number of sequences. Moreover, BitmapAligner finds the exact start and end positions of the pattern sequence when testing the MSA application on the whole-genome query sequence. The MSA accuracy is verified by the bitmask indexing property of the bit-parallelism extended shifts (BXS) algorithm. The dynamic and exact approach of the BXS algorithm is implemented through the MapReduce function of Apache Hadoop. Conversely, the traditional seed-and-extend approach risks errors when identifying the positions of pattern sequences. Moreover, the proposed model resolves the large-scale data challenges that are handled through MapReduce in the Hadoop framework. Since Hadoop is primarily used for data organization and handles text documents, Hive, Yarn, HBase, Cassandra, and many other pertinent flavors are to be used in the future for data structuring and annotation on top of Hadoop.
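The BXS algorithm is the paper's own contribution, but the underlying bit-parallel idea, holding all partial matches in one bitmask and updating it with shifts, is the classic Shift-And matcher, sketched here for exact pattern matching:

```python
def shift_and_search(text, pattern):
    """Classic bit-parallel Shift-And exact matcher: returns 0-based start positions
    of `pattern` in `text` (the pattern length is limited by the machine word in C;
    Python integers are unbounded, so no limit applies here)."""
    m = len(pattern)
    if m == 0:
        return []
    mask = {}
    for i, ch in enumerate(pattern):                  # bitmask of pattern positions per symbol
        mask[ch] = mask.get(ch, 0) | (1 << i)
    state, accept, hits = 0, 1 << (m - 1), []
    for j, ch in enumerate(text):
        state = ((state << 1) | 1) & mask.get(ch, 0)  # extend every partial match by one symbol
        if state & accept:
            hits.append(j - m + 1)
    return hits

print(shift_and_search("ACGTACGGTACGT", "ACG"))   # -> [0, 4, 9]
```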
Funding: This work was supported by the Natural Science Foundation of China (Nos. U1334210 and 61374059).
Abstract: In order to achieve low-latency and high-reliability data gathering in heterogeneous wireless sensor networks (HWSNs), the problem of multi-channel-based data gathering with minimum latency (MCDGML), which involves the construction of data gathering trees, channel allocation, power assignment of nodes, and link scheduling, is formulated as an optimization problem in this paper. The optimization problem is then proved to be NP-hard. To make the problem tractable, a multi-channel-based low-latency (MCLL) algorithm that constructs data gathering trees by optimizing the topology of nodes is first proposed. Secondly, a maximum links scheduling (MLS) algorithm is proposed to further reduce the latency of data gathering; it ensures that the signal-to-interference-plus-noise ratio (SINR) of all scheduled links is not less than a certain threshold, guaranteeing link reliability. In addition, considering the interruption of data gathering caused by dead nodes or failed links, a robust mechanism is proposed that selects assistant nodes based on a defined one-hop weight. Simulation results show that our algorithms achieve lower data gathering latency than comparable data gathering algorithms while guaranteeing link reliability, and that a higher packet arrival rate at the sink node is achieved when the proposed algorithms are combined with the robust mechanism.
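The SINR constraint used for link scheduling can be checked directly; in the sketch below the path gains, noise power, and threshold are illustrative values rather than the paper's simulation settings.

```python
import numpy as np

def schedulable(powers, gains, noise, sinr_threshold):
    """Check whether a set of links can be scheduled in the same slot.

    powers[i]   : transmit power of link i
    gains[i, j] : channel gain from the transmitter of link j to the receiver of link i
    Every scheduled link must satisfy
        SINR_i = P_i * g_ii / (noise + sum_{j != i} P_j * g_ij) >= sinr_threshold.
    """
    powers, gains = np.asarray(powers, float), np.asarray(gains, float)
    signal = powers * np.diag(gains)
    interference = gains @ powers - signal
    sinr = signal / (noise + interference)
    return bool(np.all(sinr >= sinr_threshold)), sinr

ok, sinr = schedulable(powers=[0.1, 0.1],
                       gains=[[1e-3, 1e-5],
                              [2e-5, 8e-4]],
                       noise=1e-7, sinr_threshold=2.0)
print(ok, sinr)
```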
Funding: National Natural Science Foundation of China (No. 62006039).
Abstract: Supervised models for event detection, especially neural models, usually require large-scale human-annotated training data. A data augmentation technique is proposed to improve the performance of event detection by generating paraphrase sentences that enrich the expressions of the original data. Specifically, based on an existing human-annotated event detection dataset, we first automatically build a paraphrase dataset and label it with a designed event annotation alignment algorithm. To alleviate possible wrong labels in the generated paraphrase dataset, a multi-instance learning (MIL) method is adopted for joint training on both the gold human-annotated data and the generated paraphrase dataset. Experimental results on the widely used ACE2005 dataset show the effectiveness of our approach.
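One common way to realize the multi-instance idea on the paraphrase side (a generic max-aggregation MIL loss combined with ordinary cross-entropy on the gold data; the paper's exact formulation may differ) is to treat all paraphrases of one gold sentence as a bag and trust only the most confident instance:

```python
import torch
import torch.nn.functional as F

def mil_bag_loss(bag_logits, bag_label):
    """bag_logits: (n_paraphrases, n_event_types) model outputs for one paraphrase bag;
    bag_label: the event type inherited from the gold sentence. The bag loss keeps only
    the instance that supports the label most strongly (max aggregation)."""
    log_probs = F.log_softmax(bag_logits, dim=-1)[:, bag_label]
    return -log_probs.max()

def joint_loss(gold_logits, gold_labels, paraphrase_bags, bag_labels, mil_weight=0.5):
    """Joint training on gold data (ordinary cross-entropy) and paraphrase bags (MIL)."""
    loss = F.cross_entropy(gold_logits, gold_labels)
    if paraphrase_bags:
        mil = torch.stack([mil_bag_loss(b, y) for b, y in zip(paraphrase_bags, bag_labels)]).mean()
        loss = loss + mil_weight * mil
    return loss

# Toy usage with random tensors (34 classes is only a placeholder for the event-type count)
gold_logits, gold_labels = torch.randn(4, 34), torch.randint(0, 34, (4,))
bags = [torch.randn(3, 34), torch.randn(5, 34)]
print(joint_loss(gold_logits, gold_labels, bags, bag_labels=[2, 7]))
```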