To address the issues of frequent identity switches(IDs)and degraded identification accuracy in multi object tracking(MOT)under complex occlusion scenarios,this study proposes an occlusion-robust tracking framework ba...To address the issues of frequent identity switches(IDs)and degraded identification accuracy in multi object tracking(MOT)under complex occlusion scenarios,this study proposes an occlusion-robust tracking framework based on face-pedestrian joint feature modeling.By constructing a joint tracking model centered on“intra-class independent tracking+cross-category dynamic binding”,designing a multi-modal matching metric with spatio-temporal and appearance constraints,and innovatively introducing a cross-category feature mutual verification mechanism and a dual matching strategy,this work effectively resolves performance degradation in traditional single-category tracking methods caused by short-term occlusion,cross-camera tracking,and crowded environments.Experiments on the Chokepoint_Face_Pedestrian_Track test set demonstrate that in complex scenes,the proposed method improves Face-Pedestrian Matching F1 area under the curve(F1 AUC)by approximately 4 to 43 percentage points compared to several traditional methods.The joint tracking model achieves overall performance metrics of IDF1:85.1825%and MOTA:86.5956%,representing improvements of 0.91 and 0.06 percentage points,respectively,over the baseline model.Ablation studies confirm the effectiveness of key modules such as the Intersection over Area(IoA)/Intersection over Union(IoU)joint metric and dynamic threshold adjustment,validating the significant role of the cross-category identity matching mechanism in enhancing tracking stability.Our_model shows a 16.7%frame per second(FPS)drop vs.fairness of detection and re-identification in multiple object tracking(FairMOT),with its cross-category binding module adding aboute 10%overhead,yet maintains near-real-time performance for essential face-pedestrian tracking at small resolutions.展开更多
An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyram...An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.展开更多
Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images.Obtaining class-specific precise representations at different scales is a key aspect of feat...Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images.Obtaining class-specific precise representations at different scales is a key aspect of feature representation.However,existing methods often rely on the single-scale deep feature,neglecting shallow and deeper layer features,which poses challenges when predicting objects of varying scales within the same image.Although some studies have explored multi-scale features,they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales.To address these issues,we propose a two-stage,three-branch Transformer-based framework.The first stage incorporates multi-scale image feature extraction and hierarchical scale attention.This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales,improving the model’s generalization to diverse object scales.The second stage includes a global feature enhancement module and a region selection module.The global feature enhancement module strengthens interconnections between different image regions,mitigating the issue of incomplete represen-tations,while the region selection module models the cross-modal relationships between image features and labels.Together,these components enable the efficient acquisition of class-specific precise feature representations.Extensive experiments on public datasets,including COCO2014,VOC2007,and VOC2012,demonstrate the effectiveness of our proposed method.Our approach achieves consistent performance gains of 0.3%,0.4%,and 0.2%over state-of-the-art methods on the three datasets,respectively.These results validate the reliability and superiority of our approach for multi-label image classification.展开更多
Aiming at the problem of low detection accuracy due to the different scale sizes of apple leaf disease spots and their similarity to the background,this paper proposes a multi-scale lightweight network(MSL-Net).Firstl...Aiming at the problem of low detection accuracy due to the different scale sizes of apple leaf disease spots and their similarity to the background,this paper proposes a multi-scale lightweight network(MSL-Net).Firstly,a multiplexed aggregated feature extraction network is proposed using residual bottleneck block(RES-Bottleneck)and middle partial-convolution(MP-Conv)to capture multi-scale spatial features and enhance focus on disease features for better differentiation between disease targets and background information.Secondly,a lightweight feature fusion network is designed using scale-fuse concatenation(SF-Cat)and triple-scale sequence feature fusion(TSSF)module to merge multi-scale feature maps comprehensively.Depthwise convolution(DWConv)and GhostNet lighten the network,while the cross stage partial bottleneck with 3 convolutions ghost-normalization attention module(C3-GN)reduces missed detections by suppressing irrelevant background information.Finally,soft non-maximum suppression(Soft-NMS)is used in the post-processing stage to improve the problem of misdetection of dense disease sites.The results show that the MSL-Net improves mean average precision at intersection over union of 0.5(mAP@0.5)by 2.0%over the baseline you only look once version 5s(YOLOv5s)and reduces parameters by 44%,reducing computation by 27%,outperforming other state-of-the-art(SOTA)models overall.This method also shows excellent performance compared to the latest research.展开更多
The impact of location services on people’s lives has grown significantly in the era of widespread smart device usage.Due to global navigation satellite system(GNSS)signal rejection,weak signal strength in indoor env...The impact of location services on people’s lives has grown significantly in the era of widespread smart device usage.Due to global navigation satellite system(GNSS)signal rejection,weak signal strength in indoor environments and radio signal interference caused by multiwall environments,which collectively lead to significant positioning errors,vision-based positioning has emerged as a crucial method in indoor positioning research.This paper introduces a scale hierarchical matching model to tackle challenges associated with large visual databases and high scene similarity,both of which will compromise matching accuracy and lead to prolonged positioning delays.The proposed model establishes an image feature database using GIST features and speeded up robust feature(SURF)in the offline stage.In the online stage,a positioning navigating algorithm is constructed based on Dijkstra’s path planning.Additionally,a corresponding Android application has been developed to facilitate visual positioning and navigation in indoor environments.Experimental results obtained in real indoor environments demonstrate that the proposed method significantly enhances positioning accuracy compared with similar algorithms,while effectively reducing time overhead.This improvement caters to the requirements for indoor positioning and navigation,thereby meeting user needs.展开更多
To address the issues of unknown target size,blurred edges,background interference and low contrast in infrared small target detection,this paper proposes a method based on density peaks searching and weighted multi-f...To address the issues of unknown target size,blurred edges,background interference and low contrast in infrared small target detection,this paper proposes a method based on density peaks searching and weighted multi-feature local difference.Firstly,an improved high-boost filter is used for preprocessing to eliminate background clutter and high-brightness interference,thereby increasing the probability of capturing real targets in the density peak search.Secondly,a triple-layer window is used to extract features from the area surrounding candidate targets,addressing the uncertainty of small target sizes.By calculating multi-feature local differences between the triple-layer windows,the problems of blurred target edges and low contrast are resolved.To balance the contribution of different features,intra-class distance is used to calculate weights,achieving weighted fusion of multi-feature local differences to obtain the weighted multi-feature local differences of candidate targets.The real targets are then extracted using the interquartile range.Experiments on datasets such as SIRST and IRSTD-IK show that the proposed method is suitable for various complex types and demonstrates good robustness and detection performance.展开更多
There is instability in the distributed energy storage cloud group end region on the power grid side.In order to avoid large-scale fluctuating charging and discharging in the power grid environment and make the capaci...There is instability in the distributed energy storage cloud group end region on the power grid side.In order to avoid large-scale fluctuating charging and discharging in the power grid environment and make the capacitor components showa continuous and stable charging and discharging state,a hierarchical time-sharing configuration algorithm of distributed energy storage cloud group end region on the power grid side based on multi-scale and multi feature convolution neural network is proposed.Firstly,a voltage stability analysis model based onmulti-scale and multi feature convolution neural network is constructed,and the multi-scale and multi feature convolution neural network is optimized based on Self-OrganizingMaps(SOM)algorithm to analyze the voltage stability of the cloud group end region of distributed energy storage on the grid side under the framework of credibility.According to the optimal scheduling objectives and network size,the distributed robust optimal configuration control model is solved under the framework of coordinated optimal scheduling at multiple time scales;Finally,the time series characteristics of regional power grid load and distributed generation are analyzed.According to the regional hierarchical time-sharing configuration model of“cloud”,“group”and“end”layer,the grid side distributed energy storage cloud group end regional hierarchical time-sharing configuration algorithm is realized.The experimental results show that after applying this algorithm,the best grid side distributed energy storage configuration scheme can be determined,and the stability of grid side distributed energy storage cloud group end region layered timesharing configuration can be improved.展开更多
A sectional curve feature extraction algorithm based on multi-scale analysis is proposed for reverse engineering. The algorithm consists of two parts: feature segmentation and feature classification. In the first part...A sectional curve feature extraction algorithm based on multi-scale analysis is proposed for reverse engineering. The algorithm consists of two parts: feature segmentation and feature classification. In the first part,curvature scale space is applied to multi-scale analysis and original feature detection. To obtain the primary and secondary curve primitives,feature fusion is realized by multi-scale feature detection information transmission. In the second part: projection height function is presented based on the area of quadrilateral,which improved criterions of sectional curve feature classification. Results of synthetic curves and practical scanned sectional curves are given to illustrate the efficiency of the proposed algorithm on feature extraction. The consistence between feature extraction based on multi-scale curvature analysis and curve primitives is verified.展开更多
Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (...Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (PCA), wavelets transform or Fourier transform methods are often used for feature extraction. In this paper, we propose a multi-scale PCA, which combines discrete wavelet transform, and PCA for feature extraction of signals in both the spatial and temporal domains. Our study shows that the multi-scale PCA combined with the proposed new classification methods leads to high classification accuracy for the considered signals.展开更多
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware reso...Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.展开更多
In order to obtain a large number of correct matches with high accuracy,this article proposes a robust wide baseline point matching method,which is based on Scott s proximity matrix and uses the scale invariant featur...In order to obtain a large number of correct matches with high accuracy,this article proposes a robust wide baseline point matching method,which is based on Scott s proximity matrix and uses the scale invariant feature transform (SIFT). First,the distance between SIFT features is included in the equations of the proximity matrix to measure the similarity between two feature points; then the normalized cross correlation (NCC) used in Scott s method,which has been modified with adaptive scale and orientation,...展开更多
Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. Processing speed and detection accuracy of the face detection have been improved continuously. Th...Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. Processing speed and detection accuracy of the face detection have been improved continuously. This paper describes a novel method of fast face detection with multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small difference of pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that processing speed of our method is 3.66 times faster than a conventional method, adopting HOG features combined to an SVM classifier, without accuracy degradation.展开更多
In the video-based surveillance application, moving shadows can affect the correct localization and detection of moving objects. This paper aims to present a method for shadow detection and suppression used for moving...In the video-based surveillance application, moving shadows can affect the correct localization and detection of moving objects. This paper aims to present a method for shadow detection and suppression used for moving visual object detection. The major novelty of the shadow suppression is the integration of several features including photometric invariant color feature, motion edge feature, and spatial feature etc. By modifying process for false shadow detected, the averaging detection rate of moving object reaches above 90% in the test of Hall-Monitor sequence.展开更多
Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method by including an improved multi-scale and multi...Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method by including an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensor images with noise and illumination changes. Extensive experimental studies prove that our proposed method is capable of receiving stable and even distribution of key points as well as obtaining robust and accurate correspondence matches. It is a promising scheme in multi-source remote sensing image registration.展开更多
Contraposing the need of the robust digital watermark for the copyright protection field, a new digital watermarking algorithm in the non-subsampled contourlet transform (NSCT) domain is proposed. The largest energy...Contraposing the need of the robust digital watermark for the copyright protection field, a new digital watermarking algorithm in the non-subsampled contourlet transform (NSCT) domain is proposed. The largest energy sub-band after NSCT is selected to embed watermark. The watermark is embedded into scaleinvariant feature transform (SIFT) regions. During embedding, the initial region is divided into some cirque sub-regions with the same area, and each watermark bit is embedded into one sub-region. Extensive simulation results and comparisons show that the algorithm gets a good trade-off of invisibility, robustness and capacity, thus obtaining good quality of the image while being able to effectively resist common image processing, and geometric and combo attacks, and normalized similarity is almost all reached.展开更多
Multi-label learning deals with data associated with a set of labels simultaneously. Dimensionality reduction is an important but challenging task in multi-label learning. Feature selection is an efficient technique f...Multi-label learning deals with data associated with a set of labels simultaneously. Dimensionality reduction is an important but challenging task in multi-label learning. Feature selection is an efficient technique for dimensionality reduction to search an optimal feature subset preserving the most relevant information. In this paper, we propose an effective feature evaluation criterion for multi-label feature selection, called neighborhood relationship preserving score. This criterion is inspired by similarity preservation, which is widely used in single-label feature selection. It evaluates each feature subset by measuring its capability in preserving neighborhood relationship among samples. Unlike similarity preservation, we address the order of sample similarities which can well express the neighborhood relationship among samples, not just the pairwise sample similarity. With this criterion, we also design one ranking algorithm and one greedy algorithm for feature selection problem. The proposed algorithms are validated in six publicly available data sets from machine learning repository. Experimental results demonstrate their superiorities over the compared state-of-the-art methods.展开更多
We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy and inefficient manual classification of existing oceanic biological image classification methods. Firstly, the hie...We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy and inefficient manual classification of existing oceanic biological image classification methods. Firstly, the hierarchical efficient multi-scale attention(H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to integrate information from different scales better and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module(CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of complex samples to address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification.展开更多
Urban land provides a suitable location for various economic activities which affect the development of surrounding areas. With rapid industrialization and urbanization, the contradictions in land-use become more noti...Urban land provides a suitable location for various economic activities which affect the development of surrounding areas. With rapid industrialization and urbanization, the contradictions in land-use become more noticeable. Urban administrators and decision-makers seek modern methods and technology to provide information support for urban growth. Recently, with the fast development of high-resolution sensor technology, more relevant data can be obtained, which is an advantage in studying the sustainable development of urban land-use. However, these data are only information sources and are a mixture of "information" and "noise". Processing, analysis and information extraction from remote sensing data is necessary to provide useful information. This paper extracts urban land-use information from a high-resolution image by using the multi-feature information of the image objects, and adopts an object-oriented image analysis approach and multi-scale image segmentation technology. A classification and extraction model is set up based on the multi-features of the image objects, in order to contribute to information for reasonable planning and effective management. This new image analysis approach offers a satisfactory solution for extracting information quickly and efficiently.展开更多
Aiming at the problem of multi-label classification, a multi-label classification algorithm based on label-specific features is proposed in this paper. In this algorithm, we compute feature density on the positive and...Aiming at the problem of multi-label classification, a multi-label classification algorithm based on label-specific features is proposed in this paper. In this algorithm, we compute feature density on the positive and negative instances set of each class firstly and then select mk features of high density from the positive and negative instances set of each class, respectively; the intersec- tion is taken as the label-specific features of the corresponding class. Finally, multi-label data are classified on the basis of la- bel-specific features. The algorithm can show the label-specific features of each class. Experiments show that our proposed method, the MLSF algorithm, performs significantly better than the other state-of-the-art multi-label learning approaches.展开更多
Cracks represent a significant hazard to pavement integrity,making their efficient and automated extraction essential for effective road health monitoring and maintenance.In response to this challenge,we propose a cra...Cracks represent a significant hazard to pavement integrity,making their efficient and automated extraction essential for effective road health monitoring and maintenance.In response to this challenge,we propose a crack automatic extraction network model that integrates multi⁃scale image features,thereby enhancing the model’s capability to capture crack characteristics and adaptation to complex scenarios.This model is based on the ResUNet architecture,makes modification to the convolutional layer of the model,proposes to construct multiple branches utilizing different convolution kernel sizes,and adds a atrous spatial pyramid pooling module within the intermediate layers.In this paper,comparative experiments on the performance of the basic model,ablation experiments,comparative experiments before and after data augmentation,and generalization verification experiments are conducted.Comparative experimental results indicate that the improved model exhibits superior detail processing capability at crack edges.The overall performance of the model,as measured by the F1⁃score,reaches 71.03%,reflecting a 2.1%improvement over the conventional ResUNet.展开更多
基金supported by the confidential research grant No.a8317。
文摘To address the issues of frequent identity switches(IDs)and degraded identification accuracy in multi object tracking(MOT)under complex occlusion scenarios,this study proposes an occlusion-robust tracking framework based on face-pedestrian joint feature modeling.By constructing a joint tracking model centered on“intra-class independent tracking+cross-category dynamic binding”,designing a multi-modal matching metric with spatio-temporal and appearance constraints,and innovatively introducing a cross-category feature mutual verification mechanism and a dual matching strategy,this work effectively resolves performance degradation in traditional single-category tracking methods caused by short-term occlusion,cross-camera tracking,and crowded environments.Experiments on the Chokepoint_Face_Pedestrian_Track test set demonstrate that in complex scenes,the proposed method improves Face-Pedestrian Matching F1 area under the curve(F1 AUC)by approximately 4 to 43 percentage points compared to several traditional methods.The joint tracking model achieves overall performance metrics of IDF1:85.1825%and MOTA:86.5956%,representing improvements of 0.91 and 0.06 percentage points,respectively,over the baseline model.Ablation studies confirm the effectiveness of key modules such as the Intersection over Area(IoA)/Intersection over Union(IoU)joint metric and dynamic threshold adjustment,validating the significant role of the cross-category identity matching mechanism in enhancing tracking stability.Our_model shows a 16.7%frame per second(FPS)drop vs.fairness of detection and re-identification in multiple object tracking(FairMOT),with its cross-category binding module adding aboute 10%overhead,yet maintains near-real-time performance for essential face-pedestrian tracking at small resolutions.
基金supported by the National Natural Science Foundation of China(No.62241109)the Tianjin Science and Technology Commissioner Project(No.20YDTPJC01110)。
文摘An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.
基金supported by the National Natural Science Foundation of China(62302167,62477013)Natural Science Foundation of Shanghai(No.24ZR1456100)+1 种基金Science and Technology Commission of Shanghai Municipality(No.24DZ2305900)the Shanghai Municipal Special Fund for Promoting High-Quality Development of Industries(2211106).
文摘Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images.Obtaining class-specific precise representations at different scales is a key aspect of feature representation.However,existing methods often rely on the single-scale deep feature,neglecting shallow and deeper layer features,which poses challenges when predicting objects of varying scales within the same image.Although some studies have explored multi-scale features,they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales.To address these issues,we propose a two-stage,three-branch Transformer-based framework.The first stage incorporates multi-scale image feature extraction and hierarchical scale attention.This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales,improving the model’s generalization to diverse object scales.The second stage includes a global feature enhancement module and a region selection module.The global feature enhancement module strengthens interconnections between different image regions,mitigating the issue of incomplete represen-tations,while the region selection module models the cross-modal relationships between image features and labels.Together,these components enable the efficient acquisition of class-specific precise feature representations.Extensive experiments on public datasets,including COCO2014,VOC2007,and VOC2012,demonstrate the effectiveness of our proposed method.Our approach achieves consistent performance gains of 0.3%,0.4%,and 0.2%over state-of-the-art methods on the three datasets,respectively.These results validate the reliability and superiority of our approach for multi-label image classification.
文摘Aiming at the problem of low detection accuracy due to the different scale sizes of apple leaf disease spots and their similarity to the background,this paper proposes a multi-scale lightweight network(MSL-Net).Firstly,a multiplexed aggregated feature extraction network is proposed using residual bottleneck block(RES-Bottleneck)and middle partial-convolution(MP-Conv)to capture multi-scale spatial features and enhance focus on disease features for better differentiation between disease targets and background information.Secondly,a lightweight feature fusion network is designed using scale-fuse concatenation(SF-Cat)and triple-scale sequence feature fusion(TSSF)module to merge multi-scale feature maps comprehensively.Depthwise convolution(DWConv)and GhostNet lighten the network,while the cross stage partial bottleneck with 3 convolutions ghost-normalization attention module(C3-GN)reduces missed detections by suppressing irrelevant background information.Finally,soft non-maximum suppression(Soft-NMS)is used in the post-processing stage to improve the problem of misdetection of dense disease sites.The results show that the MSL-Net improves mean average precision at intersection over union of 0.5(mAP@0.5)by 2.0%over the baseline you only look once version 5s(YOLOv5s)and reduces parameters by 44%,reducing computation by 27%,outperforming other state-of-the-art(SOTA)models overall.This method also shows excellent performance compared to the latest research.
基金Supported by the National Natural Science Foundation of China(No.61971162,61771186)the Natural Science Foundation of Heilongjiang Province(No.PL2024F025)+1 种基金the Open Research Fund of National Mobile Communications Research Laboratory in Southeast University(No.2023D07)the Fundamental Scientific Research Funds of Heilongjiang Province(No.2022-KYYWF-1050).
文摘The impact of location services on people’s lives has grown significantly in the era of widespread smart device usage.Due to global navigation satellite system(GNSS)signal rejection,weak signal strength in indoor environments and radio signal interference caused by multiwall environments,which collectively lead to significant positioning errors,vision-based positioning has emerged as a crucial method in indoor positioning research.This paper introduces a scale hierarchical matching model to tackle challenges associated with large visual databases and high scene similarity,both of which will compromise matching accuracy and lead to prolonged positioning delays.The proposed model establishes an image feature database using GIST features and speeded up robust feature(SURF)in the offline stage.In the online stage,a positioning navigating algorithm is constructed based on Dijkstra’s path planning.Additionally,a corresponding Android application has been developed to facilitate visual positioning and navigation in indoor environments.Experimental results obtained in real indoor environments demonstrate that the proposed method significantly enhances positioning accuracy compared with similar algorithms,while effectively reducing time overhead.This improvement caters to the requirements for indoor positioning and navigation,thereby meeting user needs.
基金supported by the National Natural Science Foundation of China (No.52205548)。
文摘To address the issues of unknown target size,blurred edges,background interference and low contrast in infrared small target detection,this paper proposes a method based on density peaks searching and weighted multi-feature local difference.Firstly,an improved high-boost filter is used for preprocessing to eliminate background clutter and high-brightness interference,thereby increasing the probability of capturing real targets in the density peak search.Secondly,a triple-layer window is used to extract features from the area surrounding candidate targets,addressing the uncertainty of small target sizes.By calculating multi-feature local differences between the triple-layer windows,the problems of blurred target edges and low contrast are resolved.To balance the contribution of different features,intra-class distance is used to calculate weights,achieving weighted fusion of multi-feature local differences to obtain the weighted multi-feature local differences of candidate targets.The real targets are then extracted using the interquartile range.Experiments on datasets such as SIRST and IRSTD-IK show that the proposed method is suitable for various complex types and demonstrates good robustness and detection performance.
基金supported by State Grid Corporation Limited Science and Technology Project Funding(Contract No.SGCQSQ00YJJS2200380).
文摘There is instability in the distributed energy storage cloud group end region on the power grid side.In order to avoid large-scale fluctuating charging and discharging in the power grid environment and make the capacitor components showa continuous and stable charging and discharging state,a hierarchical time-sharing configuration algorithm of distributed energy storage cloud group end region on the power grid side based on multi-scale and multi feature convolution neural network is proposed.Firstly,a voltage stability analysis model based onmulti-scale and multi feature convolution neural network is constructed,and the multi-scale and multi feature convolution neural network is optimized based on Self-OrganizingMaps(SOM)algorithm to analyze the voltage stability of the cloud group end region of distributed energy storage on the grid side under the framework of credibility.According to the optimal scheduling objectives and network size,the distributed robust optimal configuration control model is solved under the framework of coordinated optimal scheduling at multiple time scales;Finally,the time series characteristics of regional power grid load and distributed generation are analyzed.According to the regional hierarchical time-sharing configuration model of“cloud”,“group”and“end”layer,the grid side distributed energy storage cloud group end regional hierarchical time-sharing configuration algorithm is realized.The experimental results show that after applying this algorithm,the best grid side distributed energy storage configuration scheme can be determined,and the stability of grid side distributed energy storage cloud group end region layered timesharing configuration can be improved.
基金Supported by the Natural Science Foundation of China (50175063)
文摘A sectional curve feature extraction algorithm based on multi-scale analysis is proposed for reverse engineering. The algorithm consists of two parts: feature segmentation and feature classification. In the first part,curvature scale space is applied to multi-scale analysis and original feature detection. To obtain the primary and secondary curve primitives,feature fusion is realized by multi-scale feature detection information transmission. In the second part: projection height function is presented based on the area of quadrilateral,which improved criterions of sectional curve feature classification. Results of synthetic curves and practical scanned sectional curves are given to illustrate the efficiency of the proposed algorithm on feature extraction. The consistence between feature extraction based on multi-scale curvature analysis and curve primitives is verified.
文摘Feature extraction of signals plays an important role in classification problems because of data dimension reduction property and potential improvement of a classification accuracy rate. Principal component analysis (PCA), wavelets transform or Fourier transform methods are often used for feature extraction. In this paper, we propose a multi-scale PCA, which combines discrete wavelet transform, and PCA for feature extraction of signals in both the spatial and temporal domains. Our study shows that the multi-scale PCA combined with the proposed new classification methods leads to high classification accuracy for the considered signals.
文摘Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
基金National High-tech Research and Development Program (2007AA01Z314)National Natural Science Foundation of China (60873085)
文摘In order to obtain a large number of correct matches with high accuracy,this article proposes a robust wide baseline point matching method,which is based on Scott s proximity matrix and uses the scale invariant feature transform (SIFT). First,the distance between SIFT features is included in the equations of the proximity matrix to measure the similarity between two feature points; then the normalized cross correlation (NCC) used in Scott s method,which has been modified with adaptive scale and orientation,...
文摘Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. Processing speed and detection accuracy of the face detection have been improved continuously. This paper describes a novel method of fast face detection with multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small difference of pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that processing speed of our method is 3.66 times faster than a conventional method, adopting HOG features combined to an SVM classifier, without accuracy degradation.
文摘In the video-based surveillance application, moving shadows can affect the correct localization and detection of moving objects. This paper aims to present a method for shadow detection and suppression used for moving visual object detection. The major novelty of the shadow suppression is the integration of several features including photometric invariant color feature, motion edge feature, and spatial feature etc. By modifying process for false shadow detected, the averaging detection rate of moving object reaches above 90% in the test of Hall-Monitor sequence.
基金supported by National Nature Science Foundation of China (Nos. 61462046 and 61762052)Natural Science Foundation of Jiangxi Province (Nos. 20161BAB202049 and 20161BAB204172)+2 种基金the Bidding Project of the Key Laboratory of Watershed Ecology and Geographical Environment Monitoring, NASG (Nos. WE2016003, WE2016013 and WE2016015)the Science and Technology Research Projects of Jiangxi Province Education Department (Nos. GJJ160741, GJJ170632 and GJJ170633)the Art Planning Project of Jiangxi Province (Nos. YG2016250 and YG2017381)
文摘Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method by including an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensor images with noise and illumination changes. Extensive experimental studies prove that our proposed method is capable of receiving stable and even distribution of key points as well as obtaining robust and accurate correspondence matches. It is a promising scheme in multi-source remote sensing image registration.
基金supported by the National Natural Science Foundation of China(61379010)the Natural Science Basic Research Plan in Shaanxi Province of China(2015JM6293)
文摘Contraposing the need of the robust digital watermark for the copyright protection field, a new digital watermarking algorithm in the non-subsampled contourlet transform (NSCT) domain is proposed. The largest energy sub-band after NSCT is selected to embed watermark. The watermark is embedded into scaleinvariant feature transform (SIFT) regions. During embedding, the initial region is divided into some cirque sub-regions with the same area, and each watermark bit is embedded into one sub-region. Extensive simulation results and comparisons show that the algorithm gets a good trade-off of invisibility, robustness and capacity, thus obtaining good quality of the image while being able to effectively resist common image processing, and geometric and combo attacks, and normalized similarity is almost all reached.
基金supported in part by the National Natural Science Foundation of China(61379049,61772120)
文摘Multi-label learning deals with data associated with a set of labels simultaneously. Dimensionality reduction is an important but challenging task in multi-label learning. Feature selection is an efficient technique for dimensionality reduction to search an optimal feature subset preserving the most relevant information. In this paper, we propose an effective feature evaluation criterion for multi-label feature selection, called neighborhood relationship preserving score. This criterion is inspired by similarity preservation, which is widely used in single-label feature selection. It evaluates each feature subset by measuring its capability in preserving neighborhood relationship among samples. Unlike similarity preservation, we address the order of sample similarities which can well express the neighborhood relationship among samples, not just the pairwise sample similarity. With this criterion, we also design one ranking algorithm and one greedy algorithm for feature selection problem. The proposed algorithms are validated in six publicly available data sets from machine learning repository. Experimental results demonstrate their superiorities over the compared state-of-the-art methods.
基金supported by the National Natural Science Foundation of China (Nos.61806107 and 61702135)。
文摘We propose a hierarchical multi-scale attention mechanism-based model in response to the low accuracy and inefficient manual classification of existing oceanic biological image classification methods. Firstly, the hierarchical efficient multi-scale attention(H-EMA) module is designed for lightweight feature extraction, achieving outstanding performance at a relatively low cost. Secondly, an improved EfficientNetV2 block is used to integrate information from different scales better and enhance inter-layer message passing. Furthermore, introducing the convolutional block attention module(CBAM) enhances the model's perception of critical features, optimizing its generalization ability. Lastly, Focal Loss is introduced to adjust the weights of complex samples to address the issue of imbalanced categories in the dataset, further improving the model's performance. The model achieved 96.11% accuracy on the intertidal marine organism dataset of Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating its strong generalization ability to meet the demands of oceanic biological image classification.
基金The paper is supported by the Research Foundation for OutstandingYoung Teachers , China University of Geosciences ( Wuhan) ( No .CUGQNL0616) Research Foundationfor State Key Laboratory of Geo-logical Processes and Mineral Resources ( No . MGMR2002-02)Hubei Provincial Depart ment of Education (B) .
文摘Urban land provides a suitable location for various economic activities which affect the development of surrounding areas. With rapid industrialization and urbanization, the contradictions in land-use become more noticeable. Urban administrators and decision-makers seek modern methods and technology to provide information support for urban growth. Recently, with the fast development of high-resolution sensor technology, more relevant data can be obtained, which is an advantage in studying the sustainable development of urban land-use. However, these data are only information sources and are a mixture of "information" and "noise". Processing, analysis and information extraction from remote sensing data is necessary to provide useful information. This paper extracts urban land-use information from a high-resolution image by using the multi-feature information of the image objects, and adopts an object-oriented image analysis approach and multi-scale image segmentation technology. A classification and extraction model is set up based on the multi-features of the image objects, in order to contribute to information for reasonable planning and effective management. This new image analysis approach offers a satisfactory solution for extracting information quickly and efficiently.
基金Supported by the Opening Fund of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education (93K-17-2010-K02)the Opening Fund of Key Discipline of Computer Soft-Ware and Theory of Zhejiang Province at Zhejiang Normal University (ZSDZZZZXK05)
文摘Aiming at the problem of multi-label classification, a multi-label classification algorithm based on label-specific features is proposed in this paper. In this algorithm, we compute feature density on the positive and negative instances set of each class firstly and then select mk features of high density from the positive and negative instances set of each class, respectively; the intersec- tion is taken as the label-specific features of the corresponding class. Finally, multi-label data are classified on the basis of la- bel-specific features. The algorithm can show the label-specific features of each class. Experiments show that our proposed method, the MLSF algorithm, performs significantly better than the other state-of-the-art multi-label learning approaches.
基金supported in part by the National Natural Science Foundation of China(No.42401166)the Open Fund of Key Laboratory of Polar Environment Monitoring and Public Governance,Ministry of Education(No.202405)the Key Research and Development Program of Hebei Province(No.23375405D).
文摘Cracks represent a significant hazard to pavement integrity,making their efficient and automated extraction essential for effective road health monitoring and maintenance.In response to this challenge,we propose a crack automatic extraction network model that integrates multi⁃scale image features,thereby enhancing the model’s capability to capture crack characteristics and adaptation to complex scenarios.This model is based on the ResUNet architecture,makes modification to the convolutional layer of the model,proposes to construct multiple branches utilizing different convolution kernel sizes,and adds a atrous spatial pyramid pooling module within the intermediate layers.In this paper,comparative experiments on the performance of the basic model,ablation experiments,comparative experiments before and after data augmentation,and generalization verification experiments are conducted.Comparative experimental results indicate that the improved model exhibits superior detail processing capability at crack edges.The overall performance of the model,as measured by the F1⁃score,reaches 71.03%,reflecting a 2.1%improvement over the conventional ResUNet.