Abstract: Biometric recognition refers to the process of recognizing a person's identity using physiological or behavioral modalities, such as face, voice, fingerprint, and gait. Such biometric modalities are mostly used in recognition tasks either separately, as in unimodal systems, or jointly, as in multimodal systems. Multimodal systems can usually enhance recognition performance over unimodal systems by integrating the biometric data of multiple modalities at different fusion levels. Despite this enhancement, some factors degrade multimodal systems' performance in real-life applications, such as occlusion, face pose, and noise in voice data. In this paper, we propose two algorithms that apply dynamic fusion at the feature level based on the data quality of multimodal biometrics. The proposed algorithms minimize the negative influence of confusing and low-quality features, by either exclusion or weight reduction, to achieve better recognition performance. The proposed dynamic fusion was achieved using face and voice biometrics, where face features were extracted using principal component analysis (PCA) and Gabor filters separately, while voice features were extracted using Mel-frequency cepstral coefficients (MFCCs). Here, the quality assessment of face images is based mainly on the presence of occlusion, whereas the quality assessment of voice data is based substantially on the signal-to-noise ratio (SNR), which reflects the presence of noise. To evaluate the performance of the proposed algorithms, several experiments were conducted using two combinations of three different databases: the AR database and the Extended Yale Face Database B for face images, and the VOiCES database for voice data. The obtained results show that both proposed dynamic fusion algorithms attain improved performance and offer more advantages in identification and verification over not only standard unimodal algorithms but also multimodal algorithms using standard fusion methods.
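To make the quality-driven fusion idea concrete, the following is a minimal sketch, not the authors' implementation: each modality's L2-normalized feature vector is weighted by a simple quality score before concatenation. The occlusion flag, the SNR thresholds, and all function names here are illustrative assumptions.

```python
import numpy as np

def voice_snr_db(signal: np.ndarray, noise: np.ndarray) -> float:
    """Estimate SNR in dB from a clean-signal estimate and a noise estimate."""
    return 10.0 * np.log10(np.sum(signal**2) / (np.sum(noise**2) + 1e-12))

def dynamic_feature_fusion(face_feat: np.ndarray, voice_feat: np.ndarray,
                           face_occluded: bool, snr_db: float,
                           snr_floor: float = 5.0, snr_ceil: float = 30.0) -> np.ndarray:
    """Weight each L2-normalized modality vector by a quality score, then
    concatenate. An occluded face is excluded (zero weight); the voice weight
    scales linearly with SNR between snr_floor and snr_ceil (assumed values)."""
    w_face = 0.0 if face_occluded else 1.0
    w_voice = float(np.clip((snr_db - snr_floor) / (snr_ceil - snr_floor), 0.0, 1.0))
    f = face_feat / (np.linalg.norm(face_feat) + 1e-12)
    v = voice_feat / (np.linalg.norm(voice_feat) + 1e-12)
    return np.concatenate([w_face * f, w_voice * v])
```

In the weight-reduction variant the abstract mentions, the hard zero for an occluded face would be replaced by a continuous occlusion score rather than exclusion.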
Funding: Project (No. 2004CB719401) supported by the National Basic Research Program (973) of China.
Abstract: A new framework for region-based dynamic image fusion is proposed. First, target detection is applied to dynamic images (image sequences) to segment them into target and background regions. Then different fusion rules are employed in different regions so that target information is preserved as much as possible. In addition, a steerable non-separable wavelet frame transform is used in the multi-resolution analysis, so the system achieves favorable orientation selectivity and shift invariance. Experimental results show that, compared with other image fusion methods, the proposed method has better target recognition capability and preserves clearer background information.
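As a rough illustration of region-dependent fusion rules, the sketch below substitutes a standard separable wavelet decomposition (PyWavelets) for the paper's steerable non-separable wavelet frame transform, which is not shown here. The rule choice, max-absolute coefficients inside the target mask and averaging in the background, is a common baseline, and all names are illustrative.

```python
import numpy as np
import pywt  # PyWavelets

def _nearest_resize(mask: np.ndarray, shape) -> np.ndarray:
    """Nearest-neighbour resample of a boolean mask to a coefficient band's shape."""
    ys = np.linspace(0, mask.shape[0] - 1, shape[0]).round().astype(int)
    xs = np.linspace(0, mask.shape[1] - 1, shape[1]).round().astype(int)
    return mask[np.ix_(ys, xs)]

def _fuse_band(a, b, mask):
    """Target region: keep the larger-magnitude coefficient; background: average."""
    m = _nearest_resize(mask, a.shape)
    target = np.where(np.abs(a) >= np.abs(b), a, b)
    return np.where(m, target, 0.5 * (a + b))

def region_based_fusion(img_a, img_b, target_mask, wavelet="db2", level=2):
    """Fuse two registered grayscale images with region-dependent rules
    applied band by band in a multi-resolution decomposition."""
    ca = pywt.wavedec2(img_a, wavelet, level=level)
    cb = pywt.wavedec2(img_b, wavelet, level=level)
    fused = [_fuse_band(ca[0], cb[0], target_mask)]          # approximation band
    for (ha, va, da), (hb, vb, db) in zip(ca[1:], cb[1:]):   # detail bands
        fused.append((_fuse_band(ha, hb, target_mask),
                      _fuse_band(va, vb, target_mask),
                      _fuse_band(da, db, target_mask)))
    return pywt.waverec2(fused, wavelet)
```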
Funding: Funded by the Ongoing Research Funding Program (ORF-2025-102), King Saud University, Riyadh, Saudi Arabia; by the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJQN202400813); and by the Graduate Research Innovation Project (Grant Nos. yjscxx2025-269-193 and CYS25618).
Abstract: Medical image analysis based on deep learning has become an important technical requirement in the field of smart healthcare. In view of the difficulty of jointly modeling local details and global features in multimodal ophthalmic image analysis, as well as the information redundancy in cross-modal data fusion, this paper proposes a multimodal fusion framework based on cross-modal collaboration and a weighted attention mechanism. For feature extraction, the framework collaboratively extracts local fine-grained features and global structural dependencies through a parallel dual-branch architecture, overcoming the limitation of traditional single-modality models that capture either local or global information but not both. For the fusion strategy, the framework designs a cross-modal dynamic fusion strategy that combines overlapping multi-head self-attention modules with a bidirectional feature alignment mechanism, addressing the bottlenecks of low feature-interaction efficiency and excessive attention-fusion computation in traditional parallel fusion. It further introduces a cross-domain local integration technique, which enhances the representation of the lesion area through pixel-level feature recalibration and improves diagnostic robustness for complex cases. Experiments show that the framework exhibits excellent feature expression and generalization performance in cross-domain scenarios spanning ophthalmic medical images and natural images, providing a high-precision, low-redundancy fusion paradigm for multimodal medical image analysis and promoting the upgrade of intelligent diagnosis and treatment from single-modal static analysis to dynamic decision-making.
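The following is a loose sketch of the general pattern of bidirectional cross-modal attention with learned modality weighting, built from standard PyTorch modules; it does not reproduce the paper's overlapping multi-head self-attention or pixel-level recalibration components, and all dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class BidirectionalCrossAttentionFusion(nn.Module):
    """Each modality's token sequence attends to the other (bidirectional
    alignment), then a learned gate weights the two pooled streams."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.a2b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.b2a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))
        self.norm_a = nn.LayerNorm(dim)
        self.norm_b = nn.LayerNorm(dim)

    def forward(self, tok_a: torch.Tensor, tok_b: torch.Tensor) -> torch.Tensor:
        fa, _ = self.a2b(tok_a, tok_b, tok_b)   # modality A queries B
        fb, _ = self.b2a(tok_b, tok_a, tok_a)   # modality B queries A
        fa = self.norm_a(tok_a + fa)            # residual + norm
        fb = self.norm_b(tok_b + fb)
        pa, pb = fa.mean(dim=1), fb.mean(dim=1)            # pool tokens
        w = self.gate(torch.cat([pa, pb], dim=-1))         # per-sample weights
        return w[:, :1] * pa + w[:, 1:] * pb               # weighted fusion
```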
Abstract: Micro-expressions, fleeting involuntary facial cues lasting under half a second, reveal genuine emotions and are valuable in clinical diagnosis and psychotherapy. Real-time recognition on resource-constrained embedded devices remains challenging, as current methods struggle to balance performance and efficiency. This study introduces a semi-lightweight multifunctional network that improves both real-time deployability and accuracy. Unlike prior simplistic feature fusion techniques, our multi-feature fusion strategy leverages temporal, spatial, and differential features to better capture dynamic changes. Built on a Residual Network (ResNet) architecture with channel and spatial attention mechanisms, the model improves feature representation while maintaining a lightweight design. Evaluations on SMIC, CASME II, SAMM, and their composite dataset show superior performance in Unweighted F1 Score (UF1) and Unweighted Average Recall (UAR), alongside faster detection speeds, compared with existing algorithms.
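As a toy illustration of fusing temporal, spatial, and differential cues under channel and spatial attention, here is a hypothetical PyTorch sketch; the choice of onset/apex frames as inputs, the CBAM-style attention, and the layer sizes are assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """CBAM-style channel attention followed by spatial attention."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # (B, C, H, W)
        b, c, _, _ = x.shape
        ca = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                           self.mlp(x.amax(dim=(2, 3)))).view(b, c, 1, 1)
        x = x * ca                                          # channel re-weighting
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)))
        return x * sa                                       # spatial re-weighting

class MultiFeatureFusion(nn.Module):
    """Stack spatial (apex), temporal (onset) and differential (apex - onset)
    cues as input channels to a shared stem, then apply attention."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(9, 32, kernel_size=3, padding=1)  # 3 RGB cues
        self.att = ChannelSpatialAttention(32)

    def forward(self, onset: torch.Tensor, apex: torch.Tensor) -> torch.Tensor:
        diff = apex - onset                                  # differential cue
        x = torch.cat([apex, onset, diff], dim=1)            # 9-channel input
        return self.att(torch.relu(self.stem(x)))
```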
Funding: Sponsored by the Autonomous Region Key R&D Task Special (2022B01008) and the National Key R&D Program of China (SQ2022AAA010308-5).
Abstract: Network intrusion detection systems (NIDS) based on deep learning have continued to make significant advances. However, two challenges remain. On the one hand, applying only Temporal Convolutional Networks (TCNs) can yield models that ignore the impact of network traffic features at different scales on detection performance. On the other hand, some intrusion detection methods consider multi-scale information in traffic data but use only forward network traffic, leaving multi-scale temporal features incompletely captured. To address both issues, we propose a hybrid convolutional neural network supporting a multi-output strategy (BONUS) for industrial internet intrusion detection. First, we build a multi-scale Temporal Convolutional Network by stacking TCNs of different scales to capture the multi-scale information of network traffic. Meanwhile, we propose a bidirectional structure with dynamically set weights to fuse the forward and backward contextual information of network traffic at each scale, enhancing the model's ability to capture multi-scale temporal features. In addition, we introduce a gated network for each of the two branches to help the model learn each branch's feature representation. Extensive experiments on two publicly available intrusion detection datasets, UNSW-NB15 and NSL-KDD, demonstrate the effectiveness of the proposed approach, with F1 scores of 85.03% and 99.31%, respectively, validating the benefit of capturing multi-scale temporal features of traffic data for detection performance.
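A minimal sketch of the bidirectional idea follows, assuming a simple causal TCN and two learned fusion logits; the dilation schedule and module names are illustrative, not the BONUS architecture itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TCNBlock(nn.Module):
    """Minimal causal dilated temporal convolution block."""
    def __init__(self, channels: int, dilation: int, kernel: int = 3):
        super().__init__()
        self.pad = (kernel - 1) * dilation       # left-pad keeps causality
        self.conv = nn.Conv1d(channels, channels, kernel, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # (B, C, T)
        return torch.relu(self.conv(F.pad(x, (self.pad, 0))))

class BiTCNFusion(nn.Module):
    """Run a TCN over the sequence forward and backward, then fuse the two
    directions with dynamically learned weights (softmax over two logits).
    Multiple dilations stand in for the stacked multi-scale TCNs."""
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.fwd = nn.Sequential(*[TCNBlock(channels, d) for d in dilations])
        self.bwd = nn.Sequential(*[TCNBlock(channels, d) for d in dilations])
        self.w = nn.Parameter(torch.zeros(2))    # learned fusion logits

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # (B, C, T)
        f = self.fwd(x)
        b = self.bwd(x.flip(-1)).flip(-1)        # backward context
        w = torch.softmax(self.w, dim=0)
        return w[0] * f + w[1] * b               # weighted bidirectional fusion
```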
Funding: Jointly supported by the National Key R&D Project (2020YFD0900204) and the Yantai Key R&D Project (2019XDHZ084).
Abstract: Detection and counting of abalones is one of the key technologies for estimating abalone breeding density. Abalones in the breeding stage are small, densely distributed, and occluded by one another, so existing object detection algorithms detect them with low precision. To solve this problem, a detection and counting method for juvenile abalones based on an improved SSD network is proposed in this research. The innovations of this method are as follows. First, a multi-layer feature dynamic fusion method is proposed to obtain more color and texture information and improve detection precision for small juvenile abalones (see the sketch after this abstract). Second, a multi-scale attention feature extraction method is proposed to highlight the shape and edge features of juvenile abalones and increase detection precision under dense distribution and mutual occlusion. Finally, a loss feedback training method is used to increase the diversity of the data and the pixel coverage of juvenile abalones in the images, further raising detection precision for small individuals. The experimental results show that the AP@0.5, AP@0.7, and AP@0.75 values of the proposed method are 91.14%, 89.90%, and 80.14%, respectively. The precision and recall of the counting results are 99.59% and 97.74%, respectively, which are superior to the counting results of the SSD, FSSD, MutualGuide, EfficientDet, and VarifocalNet models. The proposed method can support real-time monitoring of aquaculture density for juvenile abalones.
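To illustrate what multi-layer feature dynamic fusion can look like in code, here is a hedged sketch: a shallow, texture-rich feature map is fused with an upsampled deeper map under learned per-layer weights. The channel sizes, projection layers, and weighting scheme are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicLayerFusion(nn.Module):
    """Fuse a shallow (high-resolution, color/texture-rich) feature map with
    an upsampled deeper map using learned per-layer weights, as a sketch of
    multi-layer dynamic feature fusion for small-object detection heads."""
    def __init__(self, shallow_ch: int, deep_ch: int, out_ch: int):
        super().__init__()
        self.proj_s = nn.Conv2d(shallow_ch, out_ch, kernel_size=1)
        self.proj_d = nn.Conv2d(deep_ch, out_ch, kernel_size=1)
        self.w = nn.Parameter(torch.zeros(2))    # dynamic fusion logits

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        # Bring the deeper map up to the shallow map's spatial resolution.
        deep_up = F.interpolate(deep, size=tuple(shallow.shape[-2:]), mode="nearest")
        w = torch.softmax(self.w, dim=0)         # weights learned end-to-end
        return torch.relu(w[0] * self.proj_s(shallow) + w[1] * self.proj_d(deep_up))
```

Because the weights are learned end-to-end, the detector can shift emphasis toward the shallow layer when fine color and texture cues matter most, which is the intuition behind fusing layers dynamically for small, densely packed targets.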