A Fine-Grained Recognition Model Based on Discriminative Region Localization and Efficient Second-Order Feature Encoding
1
Authors: Xiaorui Zhang, Yingying Wang, Wei Sun, Shiyu Zhou, Haoming Zhang, Pengpai Wang. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 946-965 (20 pages)
Discriminative region localization and efficient feature encoding are crucial for fine-grained object recognition. However, existing data augmentation methods struggle to accurately locate discriminative regions in complex backgrounds, small target objects, and limited training data, leading to poor recognition. Fine-grained images exhibit “small inter-class differences,” and while second-order feature encoding enhances discrimination, it often requires dual Convolutional Neural Networks (CNN), increasing training time and complexity. This study proposes a model integrating discriminative region localization and efficient second-order feature encoding. By ranking feature map channels via a fully connected layer, it selects high-importance channels to generate an enhanced map, accurately locating discriminative regions. Cropping and erasing augmentations further refine recognition. To improve efficiency, a novel second-order feature encoding module generates an attention map from the fourth convolutional group of the 50-layer Residual Network (ResNet-50) and multiplies it with features from the fifth group, producing second-order features while reducing dimensionality and training time. Experiments on the Caltech-University of California, San Diego Birds-200-2011 (CUB-200-2011), Stanford Car, and Fine-Grained Visual Classification of Aircraft (FGVC Aircraft) datasets show state-of-the-art accuracy of 88.9%, 94.7%, and 93.3%, respectively.
Keywords: fine-grained recognition; feature encoding; data augmentation; second-order feature; discriminative regions
Research on Fine-Grained Recognition Method for Sensitive Information in Social Networks Based on CLIP
2
Authors: Menghan Zhang, Fangfang Shan, Mengyao Liu, Zhenyu Wang. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 10, pp. 1565-1580 (16 pages)
With the emergence and development of social networks, people can stay in touch with friends, family, and colleagues more quickly and conveniently, regardless of their location. This ubiquitous digital internet environment has also led to large-scale disclosure of personal privacy. Due to the complexity and subtlety of sensitive information, traditional sensitive information identification technologies cannot thoroughly address the characteristics of each piece of data, thus weakening the deep connections between text and images. In this context, this paper adopts the CLIP model as a modality discriminator. By using comparative learning between sensitive image descriptions and images, the similarity between the images and the sensitive descriptions is obtained to determine whether the images contain sensitive information. This provides the basis for identifying sensitive information using different modalities. Specifically, if the original data does not contain sensitive information, only single-modality text-sensitive information identification is performed; if the original data contains sensitive information, multimodality sensitive information identification is conducted. This approach allows for differentiated processing of each piece of data, thereby achieving more accurate sensitive information identification. The aforementioned modality discriminator can address the limitations of existing sensitive information identification technologies, making the identification of sensitive information from the original data more appropriate and precise.
Keywords: Deep learning; social networks; sensitive information recognition; multi-modal fusion
A teacher-student based attention network for fine-grained image recognition
3
Authors: Ang Li, Xueyi Zhang, Peilin Li, Bin Kang. Source: Digital Communications and Networks, 2025, Issue 1, pp. 52-59 (8 pages)
The Fine-grained Image Recognition (FGIR) task is dedicated to distinguishing similar sub-categories that belong to the same super-category, such as bird species and car types. In order to highlight visual differences, existing FGIR works often follow two steps: discriminative sub-region localization and local feature representation. However, these works pay less attention to global context information. They neglect the fact that the subtle visual differences in challenging scenarios can be highlighted by exploiting the spatial relationship among different sub-regions from a global viewpoint. Therefore, in this paper, we consider both global and local information for FGIR, and propose a collaborative teacher-student strategy to reinforce and unify the two types of information. Our framework is implemented mainly by a convolutional neural network, referred to as the Teacher-Student Based Attention Convolutional Neural Network (T-S-ACNN). For fine-grained local information, we choose the classic Multi-Attention Network (MA-Net) as our baseline, and propose a type of boundary constraint to further reduce background noise in the local attention maps. In this way, the discriminative sub-regions tend to appear in the area occupied by fine-grained objects, leading to more accurate sub-region localization. For fine-grained global information, we design a graph convolution based Global Attention Network (GA-Net), which combines the local attention maps extracted by MA-Net with non-local techniques to explore the spatial relationship among sub-regions. Finally, we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes, so as to enhance the cooperative reinforcement of MA-Net and GA-Net. Extensive experiments on the CUB-200-2011, Stanford Cars and FGVC Aircraft datasets illustrate the promising performance of our framework.
Keywords: fine-grained image recognition; collaborative teacher-student strategy; multi-attention; global attention
Fine-Grained Ship Recognition Based on Visible and Near-Infrared Multimodal Remote Sensing Images: Dataset, Methodology and Evaluation (cited by: 1)
4
Authors: Shiwen Song, Rui Zhang, Min Hu, Feiyao Huang. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 5243-5271 (29 pages)
Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. Currently, with the emergence of massive high-resolution multi-modality images, the use of multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples. The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model to pinpoint the key information in the image, resulting in a significant improvement in the model’s performance. In this paper, a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images is first proposed, named Dataset for Multimodal Fine-grained Recognition of Ships (DMFGRS). It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories, collated from the digital orthophoto model provided by commercial remote sensing satellites. DMFGRS provides two types of annotation format files, as well as segmentation mask images corresponding to the ship targets. Then, a Multimodal Information Cross-Enhancement Network (MICE-Net), which fuses features of visible and near-infrared remote sensing images, is proposed. In the network, a dual-branch feature extraction and fusion module is designed to obtain more expressive features. The Feature Cross Enhancement Module (FCEM) achieves the fusion enhancement of the two modal features by making the channel attention and spatial attention work cross-functionally on the feature map. A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS. In experiments on DMFGRS, MICE-Net reached a precision, recall, mAP@0.5 and mAP@0.5:0.95 of 87%, 77.1%, 83.8% and 63.9%, respectively. Extensive experiments demonstrate that the proposed MICE-Net achieves superior performance on DMFGRS. Built on the lightweight YOLO network, the model has excellent generalizability, and thus has good potential for application in real-life scenarios.
Keywords: Multi-modality dataset; ship recognition; fine-grained recognition; attention mechanism
Method for Behavior Recognition of Hu Sheep in Intensive Farming Based on HLNC-YOLO
5
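Authors: JI Ronghua, CHANG Hongrui, ZHANG Suoxiang, LIU Zhongying, WU Zhonghong. Source: 农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery; PKU Core journal), 2026, Issue 2, pp. 265-275 (11 pages)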
Behavior recognition of Hu sheep contributes to their intensive and intelligent farming. Because Hu sheep are generally farmed at high density, severe occlusion occurs among sheep performing different behaviors and even among sheep performing the same behavior, leading to missed and false detections in existing behavior recognition methods. To address these occlusion-induced errors in intensive farming, a high-low frequency aggregated attention and negative sample comprehensive score loss and comprehensive score soft non-maximum suppression-YOLO (HLNC-YOLO) was proposed for identifying the behavior of Hu sheep. Firstly, images of four typical behaviors (standing, lying, eating, and drinking) were collected from the sheep farm to construct the Hu sheep behavior dataset (HSBD). Next, to address occlusion during the training phase, the C2F-HLAtt module, which incorporates high-low frequency aggregation attention, was integrated into the YOLO v8 backbone to perceive occluded objects, and an auxiliary reversible branch was introduced to retain more effective features. A comprehensive score regression loss (CSLoss) was used to reduce the scores of suboptimal boxes and enhance the comprehensive scores of occluded object boxes. Finally, the soft comprehensive score non-maximum suppression (Soft-CS-NMS) algorithm filtered prediction boxes during inference. When tested on the HSBD, HLNC-YOLO achieved a mean average precision (mAP@50) of 87.8% with a memory footprint of 17.4 MB, an improvement of 7.1, 2.2, 4.6, and 11 percentage points over YOLO v8, YOLO v9, YOLO v10, and Faster R-CNN, respectively. The results indicated that HLNC-YOLO accurately identified the behavior of Hu sheep in intensive farming and possessed generalization capabilities, providing technical support for smart farming.
Keywords: behavior recognition; YOLO; loss function; attention mechanism
RSG-Conformer: ReLU-Based Sparse and Grouped Conformer for Audio-Visual Speech Recognition
6
Authors: Yewei Xiao, Xin Du, Wei Zeng. Source: Computers, Materials & Continua, 2026, Issue 3, pp. 1325-1348 (24 pages)
Audio-visual speech recognition (AVSR), which integrates audio and visual modalities to improve recognition performance and robustness in noisy or adverse acoustic conditions, has attracted significant research interest. However, Conformer-based architectures remain computationally expensive because the space and time complexity of their softmax-based attention mechanisms grows quadratically with sequence length. In addition, Conformer-based architectures may not provide sufficient flexibility for modeling local dependencies at different granularities. To mitigate these limitations, this study introduces a novel AVSR framework based on a ReLU-based Sparse and Grouped Conformer (RSG-Conformer) architecture. Specifically, we propose a Global-enhanced Sparse Attention (GSA) module incorporating an efficient context restoration block to recover lost contextual cues. Concurrently, a Grouped-scale Convolution (GSC) module replaces the standard Conformer convolution module, providing adaptive local modeling across varying temporal resolutions. Furthermore, we integrate a Refined Intermediate Contextual CTC (RIC-CTC) supervision strategy. This approach applies progressively increasing loss weights combined with convolution-based context aggregation, thereby further relaxing the constraint of conditional independence inherent in standard CTC frameworks. Evaluations on the LRS2 and LRS3 benchmarks validate the efficacy of our approach, with word error rates (WERs) reduced to 1.8% and 1.5%, respectively. These results demonstrate its state-of-the-art performance in AVSR tasks.
Keywords: Audio-visual speech recognition; Conformer; CTC; sparse attention
Hybrid Quantum Gate Enabled CNN Framework with Optimized Features for Human-Object Detection and Recognition
7
Authors: Nouf Abdullah Almujally, Tanvir Fatima Naik Bukht, Shuaa S. Alharbi, Asaad Algarni, Ahmad Jalal, Jeongmin Park. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 2254-2271 (18 pages)
Recognising human-object interactions (HOI) is a challenging task for traditional machine learning models, including convolutional neural networks (CNNs). Existing models show limited transferability across complex datasets such as D3D-HOI and SYSU 3D HOI. The conventional architecture of CNNs restricts their ability to handle HOI scenarios with high complexity. HOI recognition requires improved feature extraction methods to overcome the current limitations in accuracy and scalability. This work proposes a novel quantum gate-enabled hybrid CNN (QEH-CNN) for effective HOI recognition. The model enhances CNN performance by integrating quantum computing components. The framework begins with bilateral image filtering, followed by multi-object tracking (MOT) and Felzenszwalb superpixel segmentation. A watershed algorithm refines object boundaries by cleaning merged superpixels. Feature extraction combines a histogram of oriented gradients (HOG), Global Image Statistics for Texture (GIST) descriptors, and a novel 23-joint keypoint extraction method using relative joint angles and joint proximity measures. A fuzzy optimization process refines the extracted features before feeding them into the QEH-CNN model. The proposed model achieves 95.06% accuracy on the D3D-HOI dataset and 97.29% on the SYSU 3D HOI dataset. The integration of quantum computing enhances feature optimization, leading to improved accuracy and overall model efficiency.
Keywords: Pattern recognition; image segmentation; computer vision; object detection
A CNN-Transformer Hybrid Model for Real-Time Recognition of Affective Tactile Biosignals
8
Authors: Chang Xu, Xianbo Yin, Zhiyong Zhou, Bomin Liu. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 2343-2356 (14 pages)
This study presents a hybrid CNN-Transformer model for real-time recognition of affective tactile biosignals. The proposed framework combines convolutional neural networks (CNNs) to extract spatial and local temporal features with the Transformer encoder that captures long-range dependencies in time-series data through multi-head attention. Model performance was evaluated on two widely used tactile biosignal datasets, HAART and CoST, which contain diverse affective touch gestures recorded from pressure sensor arrays. The CNN-Transformer model achieved recognition rates of 93.33% on HAART and 80.89% on CoST, outperforming existing methods on both benchmarks. By incorporating temporal windowing, the model enables instantaneous prediction, improving generalization across gestures of varying duration. These results highlight the effectiveness of deep learning for tactile biosignal processing and demonstrate the potential of the CNN-Transformer approach for future applications in wearable sensors, affective computing, and biomedical monitoring.
Keywords: Tactile biosignals; affective touch recognition; wearable sensors; signal processing; human-machine interaction
Human Activity Recognition Using Weighted Average Ensemble by Selected Deep Learning Models
9
Authors: Waseem Akhtar, Mahwish Ilyas, Romana Aziz, Ghadah Aldehim, Tassawar Iqbal, Muhammad Ramzan. Source: Computer Modeling in Engineering & Sciences, 2026, Issue 2, pp. 971-989 (19 pages)
Human Activity Recognition (HAR) is a novel area for computer vision. It has a great impact on healthcare, smart environments, and surveillance because it can automatically detect human behavior. It plays a vital role in many applications, such as smart homes, healthcare, human-computer interaction, sports analysis, and especially intelligent surveillance. In this paper, we propose a robust and efficient HAR system by leveraging deep learning paradigms, including pre-trained models, CNN architectures, and their average-weighted fusion. However, due to the diversity of human actions and various environmental influences, as well as a lack of data and resources, achieving high recognition accuracy remains elusive. In this work, a weighted average ensemble technique is employed to fuse three deep learning models: EfficientNet, ResNet50, and a custom CNN. The results of this study indicate that a weighted average ensemble strategy may be a promising way to develop more effective HAR models for the detection and classification of human activities. Experiments on the benchmark dataset showed that the proposed weighted ensemble approach outperformed existing approaches in terms of accuracy and other key performance measures. The combined average-weighted ensemble of pre-trained and CNN models obtained an accuracy of 98%, compared to 97%, 96%, and 95% for the customized CNN, EfficientNet, and ResNet50 models, respectively.
Keywords: Artificial intelligence; computer vision; deep learning; recognition; human activity classification; image processing
Boruta-LSTMAE: Feature-Enhanced Depth Image Denoising for 3D Recognition
10
Authors: Fawad Salam Khan, Noman Hasany, Muzammil Ahmad Khan, Shayan Abbas, Sajjad Ahmed, Muhammad Zorain, Wai Yie Leong, Susama Bagchi, Sanjoy Kumar Debnath. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 2181-2206 (26 pages)
Noise in depth images obtained with RGB-D sensors arises from hardware limitations combined with environmental factors, and it degrades downstream computer vision results. Common image denoising techniques based on spatial and frequency filtering tend to remove significant image details along with the noise. This paper presents a novel denoising model that combines Boruta-driven feature selection with a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, which are used to maximize spatial structure integrity and reduce redundancy. An LSTMAE then processes these selected features and models depth pixel sequences to generate robust, noise-resistant representations. The encoder compresses the input data into a latent space, which is then decoded to recover the clean image. Experiments on a benchmark dataset show that the proposed technique attains a PSNR of 45 dB and an SSIM of 0.90, which is 10 dB higher than the performance of conventional convolutional autoencoders and 15 times higher than that of the wavelet-based models. Moreover, the feature selection step decreases the input dimensionality by 40%, resulting in a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers a highly efficient and scalable system for depth image denoising, with high potential for close-range 3D systems such as robotic manipulation and gesture-based interfaces.
Keywords: Boruta; LSTM autoencoder; feature fusion; denoising; 3D object recognition; depth images
Enantioselective recognition of amino acids in water using emission-tunable chiral fluorescent probes
11
Authors: Yi-Xin Zhang, Fang-Qi Zhang, Ao-Pei Peng, Tao Jiang, Ya-Xi Meng, Yang Li, Shuang-Xi Gu, Yuan-Yuan Zhu. Source: Chinese Chemical Letters, 2026, Issue 1, pp. 338-343 (6 pages)
The detection of amino acid enantiomers holds significant importance in biomedical, chemical, food, and other fields. Traditional chiral recognition methods using fluorescent probes primarily rely on fluorescence intensity changes, which can compromise accuracy and repeatability. In this study, we report a novel fluorescent probe, (R)-Z1, that achieves effective enantioselective recognition of chiral amino acids in water by altering emission wavelengths (>60 nm). This water-soluble probe (R)-Z1 exhibits cyan or yellow-green luminescence upon interaction with amino acid enantiomers, enabling reliable chiral detection of 14 natural amino acids. It also allows for the determination of enantiomeric excess through monitoring changes in luminescent color. Additionally, a logic operation with two inputs and three outputs was constructed based on these optical properties. Notably, amino acid enantiomers were successfully detected via dual-channel analysis at both the food and cellular levels. This study provides a new dynamic luminescence-based tool for the accurate sensing and detection of amino acid enantiomers.
Keywords: Fluorescent probe; Amino acid enantiomers; Chiral recognition; Aqueous solution; Dynamic multicolor emissions
Improved spatio-temporal evidence fusion for radar aerial target tactical intention recognition
12
Authors: HUAI Liangliang, ZHANG Xinyu, WU Shuying, YUN Peng, LI Bo. Source: Journal of Systems Engineering and Electronics, 2026, Issue 1, pp. 148-156 (9 pages)
To address the issue of incorrect fusion results caused by conflicting evidence due to inaccurate evidence and incomplete recognition frameworks in radar airborne target tactical intention recognition, a spatiotemporal evidence fusion algorithm is proposed. To resolve the conflict evidence fusion problem caused by inaccurate evidence, the algorithm performs discounting of evidence from both spatial and temporal dimensions. Spatial discounting is influenced by both inter-evidence inconsistency and intra-evidence inconsistency, while temporal discounting is determined by time intervals and information entropy. For the problem of conflicting evidence fusion due to an incomplete recognition framework, an open recognition architecture based on dynamic composite focal elements is proposed. This approach allocates some conflicting information to temporary composite focal elements, avoiding excessive basic probability assignment (BPA) of the empty set after fusion, which can lead to deviations from the actual fusion results. Simulation experiments comparing various methods indicate that the proposed method can effectively improve target intention recognition accuracy and demonstrates good stability.
Keywords: evidence fusion; tactical intention recognition; evidence discount; open frame of discernment
Intelligent Human Interaction Recognition with Multi-Modal Feature Extraction and Bidirectional LSTM
13
Authors: Muhammad Hamdan Azhar, Yanfeng Wu, Nouf Abdullah Almujally, Shuaa S. Alharbi, Asaad Algarni, Ahmad Jalal, Hui Liu. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 1632-1649 (18 pages)
Recognizing human interactions in RGB videos is a critical task in computer vision, with applications in video surveillance. Existing deep learning-based architectures have achieved strong results, but are computationally intensive, sensitive to video resolution changes and often fail in crowded scenes. We propose a novel hybrid system that is computationally efficient, robust to degraded video quality and able to filter out irrelevant individuals, making it suitable for real-life use. The system leverages multi-modal handcrafted features for interaction representation and a deep learning classifier for capturing complex dependencies. Using Mask R-CNN and YOLO11-Pose, we extract grayscale silhouettes and keypoint coordinates of interacting individuals, while filtering out irrelevant individuals using a proposed algorithm. From these, we extract silhouette-based features (local ternary pattern and histogram of optical flow) and keypoint-based features (distances, angles and velocities) that capture distinct spatial and temporal information. A Bidirectional Long Short-Term Memory network (BiLSTM) then classifies the interactions. Extensive experiments on the UT Interaction, SBU Kinect Interaction and the ISR-UOL 3D social activity datasets demonstrate that our system achieves competitive accuracy. They also validate the effectiveness of the chosen features and classifier, along with the proposed system’s computational efficiency and robustness to occlusion.
Keywords: Human interaction recognition; keypoint coordinates; grayscale silhouettes; bidirectional long short-term memory network
MDGET-MER: Multi-Level Dynamic Gating and Emotion Transfer for Multi-Modal Emotion Recognition
14
Authors: Musheng Chen, Qiang Wen, Xiaohong Qiu, Junhua Wu, Wenqing Fu. Source: Computers, Materials & Continua, 2026, Issue 3, pp. 872-893 (22 pages)
In multi-modal emotion recognition, excessive reliance on historical context often impedes the detection of emotional shifts, while modality heterogeneity and unimodal noise limit recognition performance. Existing methods struggle to dynamically adjust cross-modal complementary strength to optimize fusion quality and lack effective mechanisms to model the dynamic evolution of emotions. To address these issues, we propose a multi-level dynamic gating and emotion transfer framework for multi-modal emotion recognition. A dynamic gating mechanism is applied across unimodal encoding, cross-modal alignment, and emotion transfer modeling, substantially improving noise robustness and feature alignment. First, we construct a unimodal encoder based on gated recurrent units and feature-selection gating to suppress intra-modal noise and enhance contextual representation. Second, we design a gated-attention cross-modal encoder that dynamically calibrates the complementary contributions of visual and audio modalities to the dominant textual features and eliminates redundant information. Finally, we introduce a gated enhanced emotion transfer module that explicitly models the temporal dependence of emotional evolution in dialogues via transfer gating and optimizes continuity modeling with a comparative learning loss. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on the public MELD and IEMOCAP datasets.
Keywords: Multi-modal emotion recognition; dynamic gating; emotion transfer module; cross-modal dynamic alignment; noise robustness
RNPC-net: Automatic recognition and mapping of weathering degree and groundwater condition of tunnel faces
15
Authors: Xiang Wu, Fengyan Wang, Jianping Chen, Mingchang Wang, Lina Cheng, Chengyao Zhang, Junke Xu. Source: Journal of Rock Mechanics and Geotechnical Engineering, 2026, Issue 2, pp. 1138-1159 (22 pages)
Accurate and rapid recognition of weathering degree (WD) and groundwater condition (GC) is essential for evaluating rock mass quality and conducting stability analyses in underground engineering. Conventional WD and GC recognition methods often rely on subjective evaluation by field experts, supplemented by field sampling and laboratory testing. These methods are frequently complex and time-consuming, making it challenging to meet the rapidly evolving demands of underground engineering. Therefore, this study proposes a rock non-geometric parameter classification network (RNPC-net) to rapidly achieve the recognition and mapping of WD and GC of tunnel faces. The hybrid feature extraction module (HFEM) in RNPC-net can fully extract, fuse, and utilize multi-scale features of images, enhancing the network's classification performance. Moreover, the designed adaptive weighting auxiliary classifier (AC) helps the network learn features more efficiently. Experimental results show that RNPC-net achieved classification accuracies of 0.8756 and 0.8710 for WD and GC, respectively, representing an improvement of approximately 2%-10% compared to other methods. Both quantitative and qualitative experiments confirm the effectiveness and superiority of RNPC-net. Furthermore, for WD and GC mapping, RNPC-net outperformed other methods by achieving the highest mean intersection over union (mIOU) across most tunnel faces. The mapping results closely align with measurements provided by field experts. The application of WD and GC mapping results to the rock mass rating (RMR) system achieved a transition from conventional qualitative to quantitative evaluation. This advancement enables more accurate and reliable rock mass quality evaluations, particularly under critical conditions of RMR.
Keywords: Tunnel face; Weathering degree; Groundwater condition; RNPC-net; Hybrid feature extraction module; Recognition and mapping
Shen Weirong: The Identification and Recognition of Reincarnated Living Buddhas Must Be Conducted in Strict Accordance with National Laws
16
Author: Wang Xi. Source: China's Tibet, 2026, Issue 1, pp. 19-23 (5 pages)
What are the origins, historical development, and lineages of the reincarnation system of Living Buddhas in Tibetan Buddhism? What kind of academic framework is "Han-Tibetan Buddhist Studies"? In an interview with this journal, Professor Shen Weirong of Tsinghua University discusses these issues on the basis of his research.
Keywords: reincarnated living buddhas identification recognition living buddhas Tibetan Buddhism LINEAGES reincarnation system academic framework historical development
Research on the visualization method of lithology intelligent recognition based on deep learning using mine tunnel images
17
Authors: Aiai Wang, Shuai Cao, Erol Yilmaz, Hui Cao. International Journal of Minerals, Metallurgy and Materials, 2026, No. 1, pp. 141-152 (12 pages)
An image processing and deep learning method for identifying different types of rock images was proposed. Preprocessing, such as rock image acquisition, gray scaling, Gaussian blurring, and feature dimensionality reduction, was conducted to extract useful feature information and recognize and classify rock images using a TensorFlow-based convolutional neural network (CNN) and PyQt5. A rock image dataset was established and separated into training, validation, and test sets. The framework was subsequently compiled and trained. The categorization approach was evaluated using image data from the validation and test datasets, and key metrics, such as accuracy, precision, and recall, were analyzed. Finally, the classification model conducted a probabilistic analysis of the measured data to determine the equivalent lithological type for each image. The experimental results indicated that the method combining deep learning, a TensorFlow-based CNN, and PyQt5 to recognize and classify rock images has an accuracy rate of up to 98.8% and can be successfully utilized for rock image recognition. The system can be extended to geological exploration, mine engineering, and other rock and mineral resource development to more efficiently and accurately recognize rock samples. Moreover, it can match them with the intelligent support design system to effectively improve the reliability and economy of the support scheme. The system can serve as a reference for supporting the design of other mining and underground space projects.
Keywords: rock picture recognition convolutional neural network intelligent support for roadways deep learning lithology determination
TENG-Based Self-Powered Silent Speech Recognition Interface: from Assistive Communication to Immersive AR/VR Interaction
18
Authors: Shuai Lin, Yanmin Guo, Xiangyao Zeng, Xiongtu Zhou, Yongai Zhang, Chengda Li, Chaoxing Wu. Nano-Micro Letters, 2026, No. 5, pp. 31-44 (14 pages)
Lip language provides a silent, intuitive, and efficient mode of communication, offering a promising solution for individuals with speech impairments. Its articulation relies on complex movements of the jaw and the muscles surrounding it. However, the accurate and real-time acquisition and decoding of these movements into reliable silent speech signals remains a significant challenge. In this work, we propose a real-time silent speech recognition system, which integrates a triboelectric nanogenerator-based flexible pressure sensor (FPS) with a deep learning framework. The FPS employs a porous pyramid-structured silicone film as the negative triboelectric layer, enabling highly sensitive pressure detection in the low-force regime (1 V N^(-1) for 0-10 N and 4.6 V N^(-1) for 10-24 N). This allows it to precisely capture jaw movements during speech and convert them into electrical signals. To decode the signals, we propose a convolutional neural network-long short-term memory (CNN-LSTM) hybrid network, combining CNN and LSTM models to extract both local spatial features and temporal dynamics. The model achieved 95.83% classification accuracy in 30 categories of daily words. Furthermore, the decoded silent speech signals can be directly translated into executable commands for contactless and precise control of the smartphone. The system can also be connected to AR glasses, offering a novel human-machine interaction approach with promising potential in AR/VR applications.
Keywords: Flexible pressure sensor Silent speech recognition Triboelectric nanogenerator Deep learning AR/VR interaction
Image recognition-based detection system for preventing accidental dislodgement of head-and-neck medical supplies in ICU patients: A feasibility randomized controlled trial
19
Authors: Zhongjie Shi, Taotao Shi, Xin Gao, Jian Li, Hong Xu, Xiaojun Li, Zhanxiang Wang, Sifang Chen. International Journal of Nursing Sciences, 2026, No. 1, pp. 3-10, I0001 (9 pages)
Objectives: This study aimed to design and evaluate a detection system for the accidental dislodgement of head-and-neck medical supplies through hand position recognition and tracking in Intensive Care Unit (ICU) patients. Methods: We conducted a single-center, prospective, parallel-group feasibility randomized controlled trial. We recruited 80 participants using convenience sampling from the ICU of a hospital in Ningbo City, Zhejiang Province, between March 2025 and June 2025, and they were randomly assigned to either the control group (routine care) or the intervention group (routine care plus image recognition-based detection system). The system continuously tracked patients' hand positions via bedside cameras and generated real-time alarms when hands entered predefined risk zones, notifying on-duty nurses to enable early intervention. System stability was assessed by continuous system uptime; system performance and clinical feasibility were evaluated by the frequencies of risk actions and accidental dislodgement of medical supplies (ADMS). Results: All 80 participants completed the intervention, with 40 patients in each group. The baseline characteristics and median observation time of the two groups were balanced (intervention group: 48 h/patient vs. control group: 49 h/patient). Compared with the control group, the intervention group showed fewer ADMS (2/40 vs. 9/40) and detected more risk actions per 100 h (36 vs. 25); all system-detected events had corroborating images with complete concordance on manual review, and all nurse-recorded hand-contact events were accurately captured. Conclusions: The study demonstrated that the image recognition-based detection system can function stably in clinical settings, providing accurate and continuous surveillance while supporting the early detection of risk actions. By reducing the observation burden and offering real-time cognitive support, the system complements routine nursing care and serves as an additional safety measure in ICU practice. With further optimization and larger multicenter validation, this approach has the potential to make a significant contribution to the development of smart ICUs and the broader digital transformation of nursing care.
Keywords: Accidental dislodgement of medical supplies Feasibility randomized trial Image recognition Intensive Care Unit Risk monitoring
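The entry above describes alarms raised when a tracked hand enters a predefined risk zone. The sketch below illustrates that idea in its simplest form and is not the system from the study: the rectangular `Zone`, the `alarm_frames` helper, and the `min_consecutive` debounce threshold are all hypothetical choices made for this example, and hand coordinates are assumed to come from some upstream tracker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangle in image coordinates (an assumed zone shape)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def alarm_frames(hand_track, zone, min_consecutive=3):
    """Return frame indices at which an alarm would fire.

    An alarm fires once per visit, on the frame where the hand has been
    inside the zone for `min_consecutive` frames in a row. The debounce
    is an assumption of this sketch, meant to suppress one-frame jitter.
    """
    alarms = []
    run = 0
    for i, (x, y) in enumerate(hand_track):
        run = run + 1 if zone.contains(x, y) else 0
        if run == min_consecutive:
            alarms.append(i)
    return alarms


# Hypothetical track: the hand enters the zone at frame 1 and leaves at frame 5.
zone = Zone(0, 0, 10, 10)
track = [(20, 20), (5, 5), (6, 6), (7, 7), (8, 8), (30, 30), (1, 1)]
frames = alarm_frames(track, zone)
```

With these inputs the only alarm is raised at frame 3, the third consecutive frame inside the zone.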
Automated recognition of rock discontinuity in underground engineering using geometric feature analysis
20
Authors: Adili Rusuli, Xiaojun Li, Yuyun Wang, Yi Rui. Journal of Rock Mechanics and Geotechnical Engineering, 2026, No. 2, pp. 1016-1033 (18 pages)
Discontinuities in rock masses critically impact the stability and safety of underground engineering. Mainstream discontinuity identification methods, which rely on normal vector estimation and clustering algorithms, suffer from accuracy degradation, omit critical discontinuities when orientation density is unevenly distributed, and need manual intervention. To overcome these limitations, this paper introduces a novel discontinuity identification method based on geometric feature analysis of rock mass. By analyzing the spatial distribution variability of the point cloud and integrating an adaptive region growing algorithm, the method accurately detects independent discontinuities under complex geological conditions. Given that rock mass orientations typically follow a Fisher distribution, an adaptive hierarchical clustering algorithm based on statistical analysis is employed to automatically determine the optimal number of structural sets, eliminating the need for preset clusters or thresholds inherent in traditional methods. The proposed approach effectively handles diverse rock mass shapes and sizes, leveraging both local and global geometric features to minimize noise interference. Experimental validation on three real-world rock mass models, alongside comparisons with three conventional directional clustering algorithms, demonstrates superior accuracy and robustness in identifying optimal discontinuity sets. The proposed method offers a reliable and efficient tool for discontinuity detection and grouping in underground engineering, significantly enhancing design and construction outcomes.
Keywords: Underground engineering Rock mass discontinuity Orientation grouping Fisher distribution 3D point cloud Automated recognition