Funding: Supported by the National Natural Science Foundation of China (61473041, 61571044, 11590772)
Abstract: Analysis of customers' satisfaction provides a guarantee to improve the service quality in call centers. In this paper, a novel satisfaction recognition framework is introduced to analyze customers' satisfaction. In natural conversations, the interaction between a customer and their agent takes place more than once. One of the difficulties in satisfaction analysis at call centers is that not all conversation turns exhibit customer satisfaction or dissatisfaction. To solve this problem, an intelligent system is proposed that utilizes acoustic features to recognize customers' emotion and utilizes the global features of emotion and duration to analyze satisfaction. Experiments on real-call data show that the proposed system offers significantly higher accuracy in analyzing satisfaction than the baseline system. The average F value is improved from 0.664 to 0.701.
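A minimal sketch of the two-stage idea described in this abstract, under loose assumptions: a turn-level emotion recognizer trained on acoustic features, followed by call-level global statistics over emotion and duration that feed a satisfaction classifier. The feature set, classifiers, and synthetic data are illustrative, not the paper's implementation.

```python
# Sketch: turn-level emotion recognition, then call-level aggregation of emotion/duration cues.
import numpy as np
from sklearn.linear_model import LogisticRegression

def call_level_features(turn_feats, turn_durations, emotion_clf):
    """Aggregate per-turn emotion scores and durations into global call-level features."""
    neg_prob = emotion_clf.predict_proba(turn_feats)[:, 1]      # P(negative emotion) per turn
    dur = np.asarray(turn_durations, dtype=float)
    return np.array([neg_prob.mean(), neg_prob.max(),           # emotion statistics
                     (neg_prob > 0.5).mean(),                   # fraction of negative turns
                     dur.sum(), dur.mean()])                    # duration statistics

rng = np.random.default_rng(0)
X_turns = rng.normal(size=(200, 13))                            # stand-in acoustic features per turn
y_emotion = rng.integers(0, 2, 200)                             # stand-in turn-level emotion labels
emotion_clf = LogisticRegression(max_iter=1000).fit(X_turns, y_emotion)

# One synthetic call with 8 turns -> one global feature vector for a satisfaction classifier.
call_vec = call_level_features(rng.normal(size=(8, 13)), rng.uniform(1.0, 20.0, 8), emotion_clf)
print(call_vec.shape)   # (5,)
```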
Funding: Supported by a Project of the Shandong Province Higher Educational Science and Technology Program (No. J11LG77)
Abstract: In this paper, we present a tire defect detection algorithm based on sparse representation. The dictionary learned from reference images can efficiently represent the test image. As the representation coefficients of normal images follow a specific distribution, the local feature can be estimated by comparing representation coefficient distributions. Meanwhile, a coding length is used to measure the global features of the representation coefficients. The tire defect is located using both these local and global features. Experimental results demonstrate that the proposed method can accurately detect and locate tire defects.
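A rough sketch of the sparse-representation scheme, assuming scikit-learn is available: a dictionary is learned from defect-free reference patches, test patches are sparse-coded against it, and patches whose coefficient statistics or reconstruction errors deviate from the normal range are flagged. The patch size, sparsity level, and 3-sigma threshold are assumptions, not the paper's settings.

```python
# Sketch: learn a dictionary from normal patches, code test patches, flag outliers.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, SparseCoder

rng = np.random.default_rng(0)
normal_patches = rng.normal(size=(500, 64))          # flattened 8x8 reference patches (stand-in data)
test_patches = rng.normal(size=(50, 64))

dico = MiniBatchDictionaryLearning(n_components=32, alpha=1.0, random_state=0)
D = dico.fit(normal_patches).components_             # learned dictionary atoms

coder = SparseCoder(dictionary=D, transform_algorithm='omp', transform_n_nonzero_coefs=5)
codes_test = coder.transform(test_patches)

# Local cue: per-patch reconstruction error; global cue: an L0-style "coding length".
recon_err = np.linalg.norm(test_patches - codes_test @ D, axis=1)
coding_len = np.count_nonzero(codes_test, axis=1)
threshold = recon_err.mean() + 3 * recon_err.std()
print("suspect patches:", np.where(recon_err > threshold)[0])
```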
Funding: Partially supported by the Doctor Startup Fund of Liaoning Province under Grant No. 20101021
Abstract: Chinese new words are particularly problematic in Chinese natural language processing. With the fast development of the Internet and the explosion of information, it is impossible to build a complete system lexicon for Chinese natural language processing applications, as new words outside dictionaries are constantly being created. The procedures of new word identification and POS tagging are usually separated, so the features of lexical information cannot be fully used. A latent discriminative model, which combines the strengths of the Latent Dynamic Conditional Random Field (LDCRF) and the semi-CRF, is proposed to detect new words together with their POS tags synchronously, regardless of the types of new words, from Chinese text that has not been pre-segmented. Unlike the semi-CRF, the proposed latent discriminative model applies the LDCRF to generate candidate entities, which accelerates training and decreases the computational cost. The complexity of the proposed hidden semi-CRF can be further adjusted by tuning the number of hidden variables and the number of candidate entities from the N-best outputs of the LDCRF model. A new-word-generating framework is proposed for model training and testing, under which the definitions and distributions of new words conform to those in real text. A global feature called "Global Fragment Features" is adopted for new word identification. We tested our model on the corpus from SIGHAN-6. Experimental results show that the proposed method is capable of detecting even low-frequency new words together with their POS tags with satisfactory results, and that the proposed model performs competitively with state-of-the-art models.
Abstract: Recognizing road scene context from a single image remains a critical challenge for intelligent autonomous driving systems, particularly in dynamic and unstructured environments. While recent advancements in deep learning have significantly enhanced road scene classification, simultaneously achieving high accuracy, computational efficiency, and adaptability across diverse conditions continues to be difficult. To address these challenges, this study proposes HybridLSTM, a novel and efficient framework that integrates deep learning-based, object-based, and handcrafted feature extraction methods within a unified architecture. HybridLSTM is designed to classify four distinct road scene categories, namely crosswalk (CW), highway (HW), overpass/tunnel (OP/T), and parking (P), by leveraging multiple publicly available datasets, including Places-365, BDD100K, LabelMe, and KITTI, thereby promoting domain generalization. The framework fuses object-level features extracted using YOLOv5 and VGG19, scene-level global representations obtained from a modified VGG19, and fine-grained texture features captured through eight handcrafted descriptors. This hybrid feature fusion enables the model to capture both semantic context and low-level visual cues, which are critical for robust scene understanding. To model spatial arrangements and latent sequential dependencies present even in static imagery, the combined features are processed through a Long Short-Term Memory (LSTM) network, allowing the extraction of discriminative patterns across heterogeneous feature spaces. Extensive experiments conducted on 2725 annotated road scene images, with an 80:20 training-to-testing split, validate the effectiveness of the proposed model. HybridLSTM achieves a classification accuracy of 96.3%, a precision of 95.8%, a recall of 96.1%, and an F1-score of 96.0%, outperforming several existing state-of-the-art methods. These results demonstrate the robustness, scalability, and generalization capability of HybridLSTM across varying environments and scene complexities. Moreover, the framework is optimized to balance classification performance with computational efficiency, making it highly suitable for real-time deployment in embedded autonomous driving systems. Future work will focus on extending the model to multi-class detection within a single frame and optimizing it further for edge-device deployments to reduce computational overhead in practical applications.
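A compact sketch of the fusion-then-LSTM idea, assuming PyTorch: object-level, scene-level, and handcrafted feature vectors are concatenated, split into fixed-size chunks that act as a short sequence, and classified by an LSTM head. All dimensions, the chunking scheme, and the classifier head are illustrative assumptions, not the published HybridLSTM architecture.

```python
# Sketch: concatenate heterogeneous features, treat chunks as a sequence, classify with an LSTM.
import torch
import torch.nn as nn

class FusionLSTMClassifier(nn.Module):
    def __init__(self, chunk_dim=256, hidden=128, n_classes=4):
        super().__init__()
        self.chunk_dim = chunk_dim
        self.lstm = nn.LSTM(input_size=chunk_dim, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, obj_feat, scene_feat, handcrafted_feat):
        fused = torch.cat([obj_feat, scene_feat, handcrafted_feat], dim=1)   # (B, D_total)
        seq = fused.view(fused.size(0), -1, self.chunk_dim)                  # (B, T, chunk_dim)
        _, (h_n, _) = self.lstm(seq)
        return self.head(h_n[-1])                                            # logits: CW / HW / OP-T / P

model = FusionLSTMClassifier()
# Stand-in feature vectors: 512-d object, 512-d scene, 256-d handcrafted (total 1280 = 5 x 256).
logits = model(torch.randn(2, 512), torch.randn(2, 512), torch.randn(2, 256))
print(logits.shape)   # torch.Size([2, 4])
```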
Funding: Reform and Practice of the Practical Teaching System for Applied Translation Undergraduate Majors from the Perspective of the Technology Hard Trend, Henan Province Education Reform Project in 2024 (Project number: 2024SJGLX0581); Teaching Reform Project of Zhengzhou University of Science and Technology in 2024, "Innovative Research on Practical Teaching of Digital-Intelligence Technology Enabling Production-Teaching Integration" (Project number: 2024JGZD11).
Abstract: Because of the ambiguity and dynamic nature of natural language, research on named entity recognition is very challenging. As an international language, English plays an important role in the fields of science and technology, finance, and business. Therefore, early named entity recognition technology was mainly based on English and was often used to identify the names of people, places, and organizations in text. International conferences in the field of natural language processing, such as CoNLL, MUC, and ACE, have identified named entity recognition as a specific evaluation task, and the relevant research uses evaluation corpora from English-language media organizations such as the Wall Street Journal, the New York Times, and Wikipedia. Research on named entity recognition with such data has achieved good results. Aiming at the sparse distribution of entities in text, a model combining local and global features is proposed. The model takes a single English character as input, and uses a local feature layer composed of local attention and convolution to process the text piece by piece with a sliding window, constructing the corresponding local features. In addition, the self-attention mechanism is used to generate global features of the text to improve the model's recognition of long sentences. Experiments on three data sets, Resume, MSRA, and Weibo, show that the proposed method can effectively improve the model's recognition of English named entities.
Abstract: Gait recognition, a promising biometric technology, relies on analyzing individuals' walking patterns and offers a non-intrusive and convenient approach to identity verification. However, gait recognition accuracy is often compromised by external factors such as changes in viewpoint and attire, which present substantial challenges in practical applications. To enhance gait recognition performance under diverse viewpoints and complex conditions, a global-local part-shift network is proposed in this paper. This framework integrates two novel modules: the part-shift feature extractor and the dynamic feature aggregator. The part-shift feature extractor strategically shifts body parts to capture the intrinsic relationships between non-adjacent regions, enriching the recognition process with both global and local spatial features. The dynamic feature aggregator addresses long-range dependency issues by incorporating multi-range temporal modeling, effectively aggregating information across parts and time steps to achieve a more robust recognition outcome. Comprehensive experiments on the CASIA-B dataset demonstrate that the proposed global-local part-shift network delivers superior performance compared with state-of-the-art methods, highlighting its potential for practical deployment.
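A small sketch of what a part-shift operation can look like, assuming part-pooled features with layout (batch, parts, channels); the shift size and the concatenation are assumptions, not the paper's exact module.

```python
# Sketch: roll the part axis so each part is paired with a non-adjacent part's features.
import torch

def part_shift(x, shift=2):
    """x: (batch, parts, channels). Mix each part with a shifted copy along the part axis."""
    shifted = torch.roll(x, shifts=shift, dims=1)      # bring a non-adjacent part alongside
    return torch.cat([x, shifted], dim=2)              # per-part local + shifted features

feats = torch.randn(4, 16, 64)       # 4 sequences, 16 horizontal body parts, 64 channels
print(part_shift(feats).shape)       # torch.Size([4, 16, 128])
```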
Abstract: With the increasing popularity of high-resolution remote sensing images, remote sensing image retrieval (RSIR) has become a topic of major interest. A combined descriptor, consisting of global non-subsampled shearlet transform (NSST)-domain statistical features (NSSTds) and local three-dimensional local ternary pattern (3D-LTP) features, is proposed for high-resolution remote sensing images. We model the NSST image coefficients of the detail subbands using a 2-state Laplacian mixture (LM) distribution, and its three parameters are estimated using the Expectation-Maximization (EM) algorithm. We also calculate statistical parameters such as subband kurtosis and skewness from the detail subbands, along with the mean and standard deviation calculated from the approximation subband, and concatenate all of them with the 2-state LM parameters to describe the global features of the image. The various properties of NSST, such as multiscale analysis, localization, and flexible directional sensitivity, make it a suitable choice to provide an effective approximation of an image. In order to extract dense local features, a new 3D-LTP is proposed in which dimension reduction is performed via selection of 'uniform' patterns. The 3D-LTP is calculated from the spatial RGB planes of the input image. The proposed inter-channel 3D-LTP not only exploits local texture information but also captures color information. Finally, a fused feature representation (NSSTds-3DLTP) is proposed using the new global (NSSTds) and local (3D-LTP) features to enhance the discriminativeness of the features. The retrieval performance of the proposed NSSTds-3DLTP features is tested on three challenging remote sensing image datasets, WHU-RS19, the Aerial Image Dataset (AID), and PatternNet, in terms of mean average precision (MAP), average normalized modified retrieval rank (ANMRR), and the precision-recall (P-R) graph. The experimental results are encouraging, and the NSSTds-3DLTP features lead to superior retrieval performance compared with many well-known existing descriptors such as Gabor RGB, Granulometry, local binary pattern (LBP), Fisher vector (FV), vector of locally aggregated descriptors (VLAD), and median robust extended local binary pattern (MRELBP). For WHU-RS19, in terms of {MAP, ANMRR}, the NSSTds-3DLTP improves upon the Gabor RGB, Granulometry, LBP, FV, VLAD, and MRELBP descriptors by {41.93%, 20.87%}, {92.30%, 32.68%}, {86.14%, 31.97%}, {18.18%, 15.22%}, {8.96%, 19.60%}, and {15.60%, 13.26%}, respectively. For AID, in terms of {MAP, ANMRR}, the NSSTds-3DLTP improves upon the Gabor RGB, Granulometry, LBP, FV, VLAD, and MRELBP descriptors by {152.60%, 22.06%}, {226.65%, 25.08%}, {185.03%, 23.33%}, {80.06%, 12.16%}, {50.58%, 10.49%}, and {62.34%, 3.24%}, respectively. For PatternNet, the NSSTds-3DLTP respectively improves upon the Gabor RGB, Granulometry, LBP, FV, VLAD, and MRELBP descriptors by {32.79%, 10.34%}, {141.30%, 24.72%}, {17.47%, 10.34%}, {83.20%, 19.07%}, {21.56%, 3.60%}, and {19.30%, 0.48%} in terms of {MAP, ANMRR}. The moderate dimensionality of the simple NSSTds-3DLTP allows the system to run in real time.
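A sketch of the global statistical part of the descriptor, assuming the subbands are already available; here they are random stand-ins, whereas a real pipeline would take them from an NSST decomposition and would also append the 2-state Laplacian-mixture parameters estimated by EM.

```python
# Sketch: kurtosis/skewness of detail subbands plus mean/std of the approximation subband.
import numpy as np
from scipy.stats import kurtosis, skew

def global_stats(approx_subband, detail_subbands):
    feats = [approx_subband.mean(), approx_subband.std()]
    for sb in detail_subbands:
        feats.extend([kurtosis(sb, axis=None), skew(sb, axis=None)])
    return np.asarray(feats)

rng = np.random.default_rng(0)
approx = rng.normal(size=(64, 64))
details = [rng.laplace(size=(64, 64)) for _ in range(6)]     # e.g. 6 directional subbands
print(global_stats(approx, details).shape)                   # (14,)
```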
Abstract: A distinct aridity trend in China over the last 100 years is presented by applying a linear fit to both the climate records and the hydrological records; it is supported by evidence of environmental changes and seems to be associated with the global warming trend during this period. The Mann-Kendall rank statistic test reveals a very interesting feature: the climate of China entered a dry regime abruptly in about the 1920s, which synchronized with the rapid warming of the global temperature at almost the same time. According to an analysis of the meridional profile of observed global zonal-mean precipitation anomalies during the peak period of global warming (1930-1940), the drought occurred over the whole mid-latitude zone (25°N-55°N) of the Northern Hemisphere, where most of China is located. Although this pattern is in good agreement with the latitudinal distribution of the difference in zonal-mean precipitation rates between 4 × CO2 and 1 × CO2 simulated by a climate model (Manabe and Wetherald, 1983), more studies are required to understand the linkage between the aridity trend in China and the greenhouse effect. The EOF analysis of the Northern Hemisphere sea level pressure for the June-to-August season shows an abrupt change of the time coefficient of its first eigenvector from positive to negative in the mid-1920s, indicating an enhancement of the subtropical high over Southeast Asia and the western Pacific after that time. This is an atmospheric circulation pattern that is favorable to the development of a dry climate in China.
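For reference, a compact version of the Mann-Kendall trend statistic mentioned above, using the usual normal approximation without tie corrections; the series here is synthetic.

```python
# Sketch: Mann-Kendall trend test (no tie correction) on a synthetic drying series.
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    return z, 2 * (1 - norm.cdf(abs(z)))          # z statistic and two-sided p-value

rng = np.random.default_rng(0)
series = np.linspace(0, -1, 100) + rng.normal(scale=0.3, size=100)   # downward trend + noise
print(mann_kendall(series))
```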
Funding: The National Natural Science Foundation of China (Nos. 61702347 and 62027801); the Natural Science Foundation of Hebei Province (Nos. F2022210007 and F2017210161); the Science and Technology Project of Hebei Education Department (Nos. ZD2022100 and QN2017132); the Central Guidance on Local Science and Technology Development Fund (No. 226Z0501G).
Abstract: Video summarization aims at selecting valuable clips for browsing videos with high efficiency. Previous approaches typically focus on aggregating temporal features while ignoring the potential role of visual representations in summarizing videos. In this paper, we present a global difference-aware network (GDANet) that exploits the feature difference across frame and video as guidance to enhance visual features. Initially, a difference optimization module (DOM) is devised to enhance the discriminability of visual features, bringing gains in accurately aggregating temporal cues. Subsequently, a dual-scale attention module (DSAM) is introduced to capture informative contextual information. Eventually, we design an adaptive feature fusion module (AFFM) to make the network adaptively learn context representations and perform feature fusion effectively. We have conducted experiments on benchmark datasets, and the empirical results demonstrate the effectiveness of the proposed framework.
Abstract: Heading into the second half of the year, the global apparel fabrics and accessories industry's attention has begun to focus on the 2016 Autumn Edition of Intertextile Shanghai Apparel Fabrics, which will be held from 11-13 October. Over 5,000 exhibitors from more than 25 countries and regions will take part and showcase an all-encompassing range of products across 260,000 sqm of exhibition space at the National Exhibition and Convention Center (Shanghai).
Funding: The studies mentioned in this paper were supported in part by Grants R01 CA160205 and R01 CA197150 from the National Cancer Institute, National Institutes of Health, USA, and Grant HR15-016 from the Oklahoma Center for the Advancement of Science and Technology, USA.
Abstract: In order to develop precision or personalized medicine, identifying new quantitative imaging markers and building machine learning models to predict cancer risk and prognosis has been attracting broad research interest recently. Most of these research approaches use concepts similar to those of conventional computer-aided detection schemes for medical images, which include steps for detecting and segmenting suspicious regions or tumors, followed by training machine learning models on the fusion of multiple image features computed from the segmented regions or tumors. However, due to the heterogeneity and boundary fuzziness of the suspicious regions or tumors, segmenting subtle regions is often difficult and unreliable. Additionally, ignoring global and/or background parenchymal tissue characteristics may also be a limitation of the conventional approaches. In our recent studies, we investigated the feasibility of developing new computer-aided schemes implemented with machine learning models that are trained on global image features to predict cancer risk and prognosis. We trained and tested several models using images obtained from full-field digital mammography, magnetic resonance imaging, and computed tomography of breast, lung, and ovarian cancers. Study results showed that many of these new models yielded higher performance than other approaches used in current clinical practice. Furthermore, the computed global image features also contain information complementary to the features computed from the segmented regions or tumors in predicting cancer prognosis. Therefore, the global image features can be used alone to develop new case-based prediction models or can be added to current tumor-based models to increase their discriminatory power.
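A toy sketch of the case-based idea, assuming scikit-learn: simple global features are computed over the whole image with no tumor segmentation and fed to a classifier. The feature set, classifier, and stand-in data are illustrative, not the models used in the cited studies.

```python
# Sketch: whole-image (global) features -> case-based risk/prognosis classifier.
import numpy as np
from scipy.stats import kurtosis, skew
from sklearn.ensemble import RandomForestClassifier

def global_image_features(img):
    """First-order statistics over the full image plus a coarse gradient-energy measure."""
    gy, gx = np.gradient(img.astype(float))
    return np.array([img.mean(), img.std(), skew(img, axis=None), kurtosis(img, axis=None),
                     np.mean(np.hypot(gx, gy))])

rng = np.random.default_rng(0)
images = rng.normal(size=(40, 128, 128))                 # stand-in for mammograms / CT slices
labels = rng.integers(0, 2, 40)                          # stand-in risk / prognosis labels
X = np.stack([global_image_features(im) for im in images])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.score(X, labels))
```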
Funding: Supported by the National Natural Science Foundation of China (No. 62376287); the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province (2021CB1013); the Natural Science Foundation of Hunan Province (Nos. 2022JJ30762, 2023JJ70016).
Abstract: Globally, diabetic retinopathy (DR) is the primary cause of blindness, affecting millions of people worldwide. This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment. Deep learning-based automated diagnosis of diabetic retinopathy can facilitate early detection and treatment. However, traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level. On the other hand, models that focus on global semantic-level information might overlook critical, subtle local pathological features. To address this issue, we propose an adaptive multi-scale feature fusion network (AMSFuse), which can adaptively combine multi-scale global and local features without compromising their individual representation. Specifically, our model incorporates global features for extracting high-level contextual information from retinal images. Concurrently, local features capture fine-grained details, such as microaneurysms, hemorrhages, and exudates, which are critical for DR diagnosis. These global and local features are adaptively fused using a fusion block, followed by an Integrated Attention Mechanism (IAM) that refines the fused features by emphasizing relevant regions, thereby enhancing classification accuracy for DR classification. Our model achieves 86.3% accuracy on the APTOS dataset and 96.6% on RFMiD, both of which are comparable to state-of-the-art methods.
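A rough sketch of what an adaptive global-local fusion block followed by an attention gate can look like, assuming PyTorch; the shapes, gating form, and module layout are assumptions rather than the published AMSFuse design.

```python
# Sketch: channel-wise adaptive fusion of global/local maps, then a spatial attention gate.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())
        self.attn = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, global_feat, local_feat):
        w = self.gate(torch.cat([global_feat, local_feat], dim=1))   # (B, C, 1, 1) channel weights
        fused = w * global_feat + (1 - w) * local_feat               # adaptive combination
        return fused * self.attn(fused)                              # spatial attention refinement

block = AdaptiveFusion()
out = block(torch.randn(2, 256, 14, 14), torch.randn(2, 256, 14, 14))
print(out.shape)   # torch.Size([2, 256, 14, 14])
```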
Funding: Funded by the National Natural Science Foundation of China (grant number: 62172292).
Abstract: Vehicle re-identification involves matching images of vehicles across varying camera views. The diversity of camera locations along different roadways leads to significant intra-class variation and only minimal inter-class similarity in the collected vehicle images, which increases the complexity of re-identification tasks. To tackle these challenges, this study proposes AG-GCN (Attention-Guided Graph Convolutional Network), a novel framework integrating several pivotal components. Initially, AG-GCN embeds a lightweight attention module within the ResNet-50 structure to learn feature weights automatically, thereby improving the representation of vehicle features globally by highlighting salient features and suppressing extraneous ones. Moreover, AG-GCN adopts a graph-based structure to encapsulate deep local features. A graph convolutional network then amalgamates these features to understand the relationships among vehicle-related characteristics. Subsequently, we amalgamate feature maps from both the attention and graph-based branches for a more comprehensive representation of vehicle features. The framework then gauges feature similarities and ranks them, thus enhancing the accuracy of vehicle re-identification. Comprehensive qualitative and quantitative analyses on two publicly available datasets verify the efficacy of AG-GCN in addressing intra-class and inter-class variability issues.
Abstract: With the aim of extracting the features of face images for face recognition, a new method of face recognition that fuses global features and local features is presented. The global features are extracted using principal component analysis (PCA). An active appearance model (AAM) locates 58 facial fiducial points, from which 17 points are characterized as local features using the Gabor wavelet transform (GWT). A normalized global match degree (and local match degree) is obtained from the global features (local features) of the probe image and each gallery image. After the fusion of the normalized global match degree and normalized local match degree, the recognition result is the class of the gallery image with the largest fused match degree. The method is evaluated by the recognition rates over two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination, and pose variation to some degree.
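A minimal sketch of the score-level fusion described above: normalized global and local match degrees are combined, and the gallery class with the largest fused score is returned. The cosine similarity and the equal fusion weight are stand-in assumptions, not the paper's exact similarity measures.

```python
# Sketch: fuse normalized global (PCA) and local (Gabor) match degrees, pick the best class.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def fused_identify(probe_g, probe_l, gallery_g, gallery_l, w=0.5):
    s_g = np.array([cosine(probe_g, g) for g in gallery_g])
    s_l = np.array([cosine(probe_l, l) for l in gallery_l])
    s_g = (s_g - s_g.min()) / (s_g.max() - s_g.min() + 1e-12)   # normalized global match degree
    s_l = (s_l - s_l.min()) / (s_l.max() - s_l.min() + 1e-12)   # normalized local match degree
    return int(np.argmax(w * s_g + (1 - w) * s_l))              # class of largest fused match degree

rng = np.random.default_rng(0)
gallery_g, gallery_l = rng.normal(size=(10, 50)), rng.normal(size=(10, 40))
print(fused_identify(rng.normal(size=50), rng.normal(size=40), gallery_g, gallery_l))
```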
Funding: Hefei Municipal Natural Science Foundation, Grant/Award Number: 2022009; Suqian Guiding Program Project, Grant/Award Number: Z202309; Suqian Traditional Chinese Medicine Science and Technology Plan, Grant/Award Number: MS202301.
Abstract: Quantitative analysis of clinical function parameters from MRI images is crucial for diagnosing and assessing cardiovascular disease. However, the manual calculation of these parameters is challenging due to the high variability among patients and the time-consuming nature of the process. In this study, the authors introduce a framework named MultiJSQ, comprising a feature representation network (FRN) and an indicator prediction network (IEN), which is designed for simultaneous joint segmentation and quantification. The FRN is tailored for representing global image features, facilitating the direct acquisition of left ventricle (LV) contour images through pixel classification. Additionally, the IEN incorporates specifically designed modules to extract relevant clinical indices. The authors' method considers the interdependence of different tasks, demonstrating the validity of these relationships and yielding favourable results. Through extensive experiments on cardiac MR images from 145 patients, MultiJSQ achieves impressive outcomes, with low mean absolute errors of 124 mm², 1.72 mm, and 1.21 mm for areas, dimensions, and regional wall thicknesses, respectively, along with a Dice metric score of 0.908. The experimental findings underscore the excellent performance of our framework in LV segmentation and quantification, highlighting its promising clinical application prospects.
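For clarity, the two evaluation quantities referred to above, sketched in plain NumPy: the Dice metric between a predicted LV mask and the ground truth, and the mean absolute error of regressed clinical indices. The masks and index values are toy stand-ins.

```python
# Sketch: Dice overlap for a segmentation mask and MAE for regressed clinical indices.
import numpy as np

def dice(pred_mask, gt_mask, eps=1e-7):
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    return (2.0 * np.logical_and(pred, gt).sum() + eps) / (pred.sum() + gt.sum() + eps)

def mean_absolute_error(pred_vals, gt_vals):
    return np.mean(np.abs(np.asarray(pred_vals) - np.asarray(gt_vals)))

rng = np.random.default_rng(0)
pred = rng.random((128, 128)) > 0.5
gt = rng.random((128, 128)) > 0.5
print(dice(pred, gt), mean_absolute_error([120.0, 1.8], [124.0, 1.72]))   # toy values
```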
Funding: Supported by the National Key Research and Development Program [2020YFB1006302].
Abstract: An exhaustive study has been conducted to investigate span-based models for the joint entity and relation extraction task. However, these models sample a large number of negative entities and negative relations during model training, which are essential but result in grossly imbalanced data distributions and in turn cause suboptimal model performance. In order to address these issues, we propose a two-phase paradigm for span-based joint entity and relation extraction, which involves classifying the entities and relations in the first phase, and predicting the types of these entities and relations in the second phase. The two-phase paradigm enables our model to significantly reduce the data distribution gap, including the gap between negative entities and other entities, as well as the gap between negative relations and other relations. In addition, we make the first attempt at combining entity type and entity distance as global features, which has proven effective, especially for relation extraction. Experimental results on several datasets demonstrate that the span-based joint extraction model augmented with the two-phase paradigm and the global features consistently outperforms previous state-of-the-art span-based models for the joint extraction task, establishing a new standard benchmark. Qualitative and quantitative analyses further validate the effectiveness of the proposed paradigm and the global features.
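A hypothetical sketch of how entity type and entity distance can be encoded as global features for a candidate span pair; the type inventory, one-hot encoding, and distance buckets are assumptions, not the paper's exact formulation.

```python
# Sketch: type one-hots and a bucketed span distance, concatenated as global pair features.
import numpy as np

ENTITY_TYPES = ["PER", "ORG", "LOC", "MISC"]            # assumed type inventory
DIST_BUCKETS = [1, 2, 4, 8, 16, 32]                     # assumed distance buckets (in tokens)

def global_pair_features(head_span, tail_span, head_type, tail_type):
    """Spans are (start, end) token offsets; returns type one-hots + a bucketed distance."""
    t1 = np.eye(len(ENTITY_TYPES))[ENTITY_TYPES.index(head_type)]
    t2 = np.eye(len(ENTITY_TYPES))[ENTITY_TYPES.index(tail_type)]
    gap = max(0, max(head_span[0], tail_span[0]) - min(head_span[1], tail_span[1]))
    dist = np.zeros(len(DIST_BUCKETS) + 1)
    dist[np.searchsorted(DIST_BUCKETS, gap, side="right")] = 1.0
    return np.concatenate([t1, t2, dist])

print(global_pair_features((3, 5), (12, 14), "PER", "ORG").shape)   # (15,)
```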
Funding: Supported by the Key Laboratory of Environment Controlled Aquaculture (Dalian Ocean University), Ministry of Education (No. 2021-MOEKLECA-KF-05); the National Natural Science Foundation of China Youth Science (No. 61802046).
Abstract: An aquatic medicine knowledge graph is an effective means to realize intelligent aquaculture. Graph completion technology is key to improving the quality of knowledge graph construction. However, the difficulty of semantic discrimination among similar entities and inconspicuous semantic features result in low accuracy when completing an aquatic medicine knowledge graph with complex relationships. In this study, an aquatic medicine knowledge graph completion method (TransH+HConvAM) is proposed. Firstly, TransH is applied to split the vector plane between entities and relations, ameliorating the poor completion effect caused by the low semantic resolution of entities. Then, hybrid convolution is introduced to obtain the global interaction of triples based on the complete interaction between head/tail entities and relations, which improves the semantic features of triples and enhances the completion of complex relationships in the graph. Experiments are conducted to verify the performance of the proposed method. The MR, MRR, and Hit@10 of TransH+HConvAM are found to be 674, 0.339, and 0.361, respectively. This study shows that the model effectively overcomes the poor completion of complex relationships and improves the construction quality of the aquatic medicine knowledge graph, providing technical support for intelligent aquaculture.
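A small sketch of the standard TransH scoring function that the first component relies on: entities are projected onto a relation-specific hyperplane before the translation h + r ≈ t is scored. The embedding dimension and vectors are arbitrary stand-ins, and the hybrid-convolution part is not shown.

```python
# Sketch: TransH plausibility score for a (head, relation, tail) triple.
import numpy as np

def transh_score(h, t, r, w_r):
    """Lower score = more plausible triple. w_r is the normal of the relation hyperplane."""
    w_r = w_r / np.linalg.norm(w_r)
    h_proj = h - (h @ w_r) * w_r          # project head onto the hyperplane
    t_proj = t - (t @ w_r) * w_r          # project tail onto the hyperplane
    return np.linalg.norm(h_proj + r - t_proj, ord=2)

rng = np.random.default_rng(0)
dim = 64
h, t, r, w_r = (rng.normal(size=dim) for _ in range(4))
print(transh_score(h, t, r, w_r))
```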
Funding: This research was supported by the National Natural Science Foundation of China (Nos. 61271353, 61871389); the Major Funding Projects of the National University of Defense Technology (No. ZK18-01-02); the Foundation of the State Key Laboratory of Pulsed Power Laser Technology (No. SKL2018ZR09).
Abstract: Airborne LIDAR can flexibly obtain point cloud data with three-dimensional structural information, which can improve the effectiveness of automatic target recognition in complex environments. Compared with 2D information, 3D information performs better in separating objects from the background. However, the aircraft platform can have a negative influence on LIDAR data because of varying flight attitudes, flight heights, and atmospheric disturbances. A global-feature-based 3D automatic target recognition method for airborne LIDAR is proposed, composed of an offline phase and an online phase. The performance of four global feature descriptors is compared. Considering the summed volume region (SVR) discrepancy in real objects, SVR selection is added to the pre-processing operations to eliminate clusters that mismatch the target of interest. Highly reliable simulated data are obtained under various sensor altitudes, detection distances, and atmospheric disturbances. The final experimental results show that the added step increases the recognition rate by more than 2.4% and decreases the execution time by about 33%.
Funding: Supported by the National Natural Science Foundation of China (61802253).
Abstract: To address the problem that traditional keypoint detection methods are susceptible to complex backgrounds and local similarity of images, resulting in inaccurate descriptor matching and bias in visual localization, keypoints and descriptors based on cross-modality fusion are proposed and applied to the study of camera motion estimation. A convolutional neural network is used to detect the positions of keypoints and generate the corresponding descriptors, and pyramid convolution is used to extract multi-scale features in the network. The problem of local similarity of images is solved by capturing local and global feature information and fusing the geometric position information of keypoints to generate descriptors. According to our experiments, the repeatability of our method is improved by 3.7%, and the homography estimation is improved by 1.6%. To demonstrate the practicability of the method, the visual odometry part of simultaneous localization and mapping is constructed, and our method achieves 35% higher positioning accuracy than the traditional method.
Funding: Supported by the Beijing Natural Science Foundation (Z210009); the National Science and Technology Innovation 2030 Major Program (STI2030-Major Projects 2022ZD0204800); the National Natural Science Foundation of China (32070987, 31722025, 31730039); the Chinese Academy of Sciences Key Program of Frontier Sciences (QYZDB-SSW-SMC019).
Abstract: During natural viewing, we often recognize multiple objects, detect their motion, and select one object as the target to track. It remains to be determined how such behavior is guided by the integration of visual form and motion perception. To address this, we studied how monkeys chose which of two moving targets with different forms to track with smooth pursuit eye movements in a two-target task. We found that pursuit responses were biased toward the motion direction of a target with a hole. By computing the relative weighting, we found that the target with a hole exhibited a larger weight in the vector computation. The global hole feature dominated other form properties. This dominance failed to account for changes in pursuit responses to a target with different forms moving singly. These findings suggest that the integration of visual form and motion perception can reshape the competition in sensorimotor networks to guide behavioral selection.
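A toy illustration of the relative-weighting analysis mentioned above, under the assumption that the initial pursuit velocity is modelled as a weighted vector average of the two target velocities; the velocity values are synthetic.

```python
# Sketch: recover the relative weight w in  v_pursuit ≈ w * v_a + (1 - w) * v_b  by least squares.
import numpy as np

def fit_relative_weight(v_target_a, v_target_b, v_pursuit):
    """Return the scalar weight w that best explains the pursuit velocity."""
    d = np.asarray(v_target_a) - np.asarray(v_target_b)
    return float(np.dot(np.asarray(v_pursuit) - np.asarray(v_target_b), d) / np.dot(d, d))

v_hole = np.array([10.0, 0.0])         # deg/s, target with a hole moving rightward
v_plain = np.array([0.0, 10.0])        # deg/s, plain target moving upward
v_eye = np.array([6.5, 3.5])           # measured initial pursuit velocity (synthetic)
print(fit_relative_weight(v_hole, v_plain, v_eye))   # > 0.5 indicates bias toward the hole target
```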