Journal Articles
158,303 articles found
1. Cryptomining Malware Detection Based on Edge Computing-Oriented Multi-Modal Features Deep Learning (cited by 2)
Authors: Wenjuan Lian, Guoqing Nie, Yanyan Kang, Bin Jia, Yang Zhang. China Communications (SCIE, CSCD), 2022, No. 2, pp. 174-185 (12 pages)
In recent years, with the increase in the price of cryptocurrencies, the number of malicious cryptomining software has increased significantly. With their powerful spreading ability, cryptomining malware can unknowingly occupy our resources, harm our interests, and damage more legitimate assets. However, although current traditional rule-based malware detection methods have a low false alarm rate, they have a relatively low detection rate when faced with a large volume of emerging malware. Even though common machine learning-based or deep learning-based methods have a certain ability to learn and detect unknown malware, the characteristics they learn are single and independent, and cannot be learned adaptively. Aiming at the above problems, we propose a deep learning model with multi-input of multi-modal features, which can simultaneously accept digital features and image features on different dimensions. The model in turn includes parallel learning of three sub-models and ensemble learning of another specific sub-model. The four sub-models can be processed in parallel on different devices and can be further applied to edge computing environments. The model can adaptively learn multi-modal features and output prediction results. The detection rate of our model is as high as 97.01% and the false alarm rate is only 0.63%. The experimental results prove the advantage and effectiveness of the proposed method.
Keywords: cryptomining malware; multi-modal; ensemble learning; deep learning; edge computing
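The decision-level idea in this abstract — several sub-models score a sample in parallel, a combiner fuses the scores, and performance is reported as detection rate and false alarm rate — can be sketched as below. The toy scores and the simple weighted-average combiner are illustrative assumptions, not the paper's actual sub-model architecture.

```python
# Sketch: fuse parallel sub-model scores, then compute detection rate (TPR)
# and false alarm rate (FPR). Toy data; not the paper's model or dataset.

def combine_scores(sub_scores, weights=None):
    """Weighted average of per-sub-model malware scores (equal weights by default)."""
    if weights is None:
        weights = [1.0 / len(sub_scores)] * len(sub_scores)
    return sum(w * s for w, s in zip(weights, sub_scores))

def detection_and_false_alarm(y_true, y_score, threshold=0.5):
    """Detection rate = TP/(TP+FN); false alarm rate = FP/(FP+TN)."""
    tp = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s < threshold)
    fp = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s >= threshold)
    tn = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)

# Toy labels: 1 = cryptomining malware, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0]
per_model = [[0.9, 0.8, 0.4, 0.2, 0.1, 0.6],   # sub-model A scores
             [0.8, 0.7, 0.6, 0.1, 0.2, 0.4],   # sub-model B scores
             [0.7, 0.9, 0.7, 0.3, 0.1, 0.3]]   # sub-model C scores
fused = [combine_scores(col) for col in zip(*per_model)]
tpr, fpr = detection_and_false_alarm(y_true, fused)
```

Because each sub-model scores independently, the three score lists could be produced on separate edge devices and only the fused decision computed centrally.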
2. MDGET-MER: Multi-Level Dynamic Gating and Emotion Transfer for Multi-Modal Emotion Recognition
Authors: Musheng Chen, Qiang Wen, Xiaohong Qiu, Junhua Wu, Wenqing Fu. Computers, Materials & Continua, 2026, No. 3, pp. 872-893 (22 pages)
In multi-modal emotion recognition, excessive reliance on historical context often impedes the detection of emotional shifts, while modality heterogeneity and unimodal noise limit recognition performance. Existing methods struggle to dynamically adjust cross-modal complementary strength to optimize fusion quality and lack effective mechanisms to model the dynamic evolution of emotions. To address these issues, we propose a multi-level dynamic gating and emotion transfer framework for multi-modal emotion recognition. A dynamic gating mechanism is applied across unimodal encoding, cross-modal alignment, and emotion transfer modeling, substantially improving noise robustness and feature alignment. First, we construct a unimodal encoder based on gated recurrent units and feature-selection gating to suppress intra-modal noise and enhance contextual representation. Second, we design a gated-attention cross-modal encoder that dynamically calibrates the complementary contributions of the visual and audio modalities to the dominant textual features and eliminates redundant information. Finally, we introduce a gated enhanced emotion transfer module that explicitly models the temporal dependence of emotional evolution in dialogues via transfer gating and optimizes continuity modeling with a comparative learning loss. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on the public MELD and IEMOCAP datasets.
Keywords: multi-modal emotion recognition; dynamic gating; emotion transfer module; cross-modal dynamic alignment; noise robustness
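The feature-selection gating described above — a learned sigmoid gate that scales each feature dimension so noisy dimensions are suppressed before fusion — can be sketched as follows. The projection weights here are random placeholders; in the paper the gates are trained jointly with the unimodal encoders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_features(x, w, b):
    """Elementwise gate g = sigmoid(x @ w + b); returns (g * x, g)."""
    g = sigmoid(x @ w + b)
    return g * x, g

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))          # 4 utterances, 8-dim unimodal features
w = rng.standard_normal((8, 8)) * 0.1    # gate projection (placeholder weights)
b = np.zeros(8)
y, g = gated_features(x, w, b)           # gated features and the gate values
```

Each gate value lies in (0, 1), so the mechanism interpolates between passing a feature dimension through unchanged and zeroing it out.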
3. Clinical features and prognosis of orbital inflammatory myofibroblastic tumor
Authors: Jing Li, Liang-Yuan Xu, Nan Wang, Rui Liu, Shan-Feng Zhao, Ting-Ting Ren, Qi-Han Guo, Bin Zhang, Hong Zhang, Hai-Han Yan, Yu-Fei Zhang, Jian-Min Ma. International Journal of Ophthalmology (English edition), 2026, No. 1, pp. 105-114 (10 pages)
AIM: To investigate the clinical features and prognosis of patients with orbital inflammatory myofibroblastic tumor (IMT). METHODS: This retrospective study collected clinical data from 22 patients diagnosed with orbital IMT based on histopathological examination. The patients were followed up to assess their prognosis. Clinical data, including age, gender, course of disease, past medical history, primary symptoms, ophthalmologic examination findings, general condition, as well as imaging, laboratory, histopathological, and immunohistochemical results, were collected from digital records. Orbital magnetic resonance imaging (MRI) and/or computed tomography (CT) scans were performed to assess bone destruction by the mass, invasion of surrounding tissues, and any inflammatory changes in periorbital areas. RESULTS: The mean age of patients with orbital IMT was 28.24±3.30 y, with a male-to-female ratio of 1.2:1. The main clinical manifestations were proptosis, blurred vision, palpable mass, and pain. Bone destruction and surrounding tissue invasion occurred in 72.73% and 54.55% of cases, respectively. Inflammatory changes in the periorbital site were observed in 77.27% of the patients. Hematoxylin and eosin staining showed proliferation of fibroblasts and myofibroblasts, accompanied by infiltration of lymphocytes and plasma cells. Immunohistochemical staining revealed that smooth muscle actin (SMA) and vimentin were positive in 100% of cases, while anaplastic lymphoma kinase (ALK) showed positivity in 47.37%. The recurrence rate of orbital IMT was 27.27%, and sarcomatous degeneration could occur. There were no significant correlations between recurrence and factors such as age, gender, laterality, duration of the disease, periorbital tissue invasion, bone destruction, periorbital inflammation, tumor size, fever, leukocytosis, or treatment (P>0.05). However, lymphadenopathy and a Ki-67 index of 10% or higher may be risk factors for recurrence (P=0.046; P=0.023). CONCLUSION: Orbital IMT is a locally invasive disease that may recur or lead to sarcomatoid degeneration, primarily affecting young and middle-aged patients. The presence of lymphadenopathy and a Ki-67 index of 10% or higher may signify a poor prognosis.
Keywords: inflammatory myofibroblastic tumor; orbital disease; clinical features; prognosis
4. A machine learning-based depression recognition model integrating spirit-expression features from traditional Chinese medicine
Authors: Minghui Yao, Rongrong Zhu, Peng Qian, Huilin Liu, Xirong Sun, Limin Gao, Fufeng Li. Digital Chinese Medicine, 2026, No. 1, pp. 68-79 (12 pages)
Objective: To develop a depression recognition model by integrating the spirit-expression diagnostic framework of traditional Chinese medicine (TCM) with machine learning algorithms. The proposed model seeks to establish a TCM-informed tool for early depression screening, thereby bridging traditional diagnostic principles with modern computational approaches. Methods: The study included patients with depression who visited the Shanghai Pudong New Area Mental Health Center from October 1, 2022 to October 1, 2023, as well as students and teachers from Shanghai University of Traditional Chinese Medicine during the same period as the healthy control group. Videos of 3-10 s were captured using a Xiaomi Pad 5, and the TCM spirit and expressions were determined by TCM experts (at least 3 out of 5 experts had to agree on the category of TCM spirit and expressions). Basic information, facial images, and interview information were collected through a portable TCM intelligent analysis and diagnosis device, and facial diagnosis features were extracted using the OpenCV computer vision library. Statistical analysis methods such as parametric and non-parametric tests were used to analyze the baseline data, TCM spirit and expression features, and facial diagnosis feature parameters of the two groups, to compare the differences in TCM spirit and expression and facial features. Five machine learning algorithms, including extreme gradient boosting (XGBoost), decision tree (DT), Bernoulli naive Bayes (BernoulliNB), support vector machine (SVM), and k-nearest neighbor (KNN) classification, were used to construct a depression recognition model based on the fusion of TCM spirit and expression features. The performance of the model was evaluated using metrics such as accuracy, precision, and the area under the receiver operating characteristic (ROC) curve (AUC). The model results were explained using Shapley Additive exPlanations (SHAP). Results: A total of 93 depression patients and 87 healthy individuals were ultimately included in this study. There was no statistically significant difference in the baseline characteristics between the two groups (P>0.05). The differences in the characteristics of the spirit and expressions in TCM and facial features between the two groups were as follows. (i) Quantispirit facial analysis revealed that depression patients exhibited significantly reduced facial spirit and luminance compared with healthy controls (P<0.05), with characteristic features such as sad expressions, facial erythema, and changes in lip color ranging from erythematous to cyanotic. (ii) Depressed patients exhibited significantly lower values in facial complexion L, lip L and a values, and gloss index, but higher values in facial complexion a and b, lip b, low gloss index, and matte index (all P<0.05). (iii) The results of multiple models show that the XGBoost-based depression recognition model, integrating the TCM "spirit-expression" diagnostic framework, achieved an accuracy of 98.61% and significantly outperformed four benchmark algorithms: DT, BernoulliNB, SVM, and KNN (P<0.01). (iv) The SHAP visualization results show that in the recognition model constructed by the XGBoost algorithm, the complexion b value, categories of facial spirit, high gloss index, low gloss index, categories of facial expression, and texture features contribute significantly to the model. Conclusion: This study demonstrates that integrating TCM spirit-expression diagnostic features with machine learning enables the construction of a high-precision depression detection model, offering a novel paradigm for objective depression diagnosis.
Keywords: traditional Chinese medicine; spirit; expression; feature fusion; depression; recognition model
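The headline evaluation metrics in this study, accuracy and the area under the ROC curve, can be computed without any ML dependencies; a minimal sketch follows. The scores below are toy values, not the study's data, and the rank-based AUC formula is the standard Mann-Whitney formulation rather than anything specific to this paper.

```python
# Dependency-free accuracy and ROC-AUC, as used to compare the five classifiers.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, y_score):
    """AUC via the rank (Mann-Whitney) formulation, counting ties as half wins."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0]                      # 1 = depression, 0 = healthy
y_score = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]         # toy classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in y_score]
acc = accuracy(y_true, y_pred)   # one of six mislabelled -> 5/6
auc = roc_auc(y_true, y_score)   # 8 of 9 positive/negative pairs ranked correctly
```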
5. Clinicopathologic features of SMARCB1/INI1-deficient pancreatic undifferentiated rhabdoid carcinoma: A case report and review of literature
Authors: Wan-Qi Yao, Xin-Yi Ma, Gui-Hua Wang. World Journal of Gastrointestinal Oncology, 2026, No. 1, pp. 250-262 (13 pages)
BACKGROUND: SMARCB1/INI1-deficient pancreatic undifferentiated rhabdoid carcinoma is a highly aggressive tumor, and spontaneous splenic rupture (SSR) as its presenting manifestation is rarely reported among pancreatic malignancies. CASE SUMMARY: We herein report a rare case of a 59-year-old female who presented with acute left upper quadrant abdominal pain without any history of trauma. Abdominal imaging demonstrated a heterogeneous splenic lesion with hemoperitoneum, raising clinical suspicion of SSR. Emergency laparotomy revealed a pancreatic tumor invading the spleen and left kidney, with associated splenic rupture and dense adhesions, necessitating en bloc resection of the distal pancreas, spleen, and left kidney. Histopathology revealed a biphasic malignancy composed of moderately differentiated pancreatic ductal adenocarcinoma and an undifferentiated carcinoma with rhabdoid morphology and loss of SMARCB1 expression. Immunohistochemical analysis confirmed complete loss of SMARCB1/INI1 in the undifferentiated component, along with a high Ki-67 index (approximately 80%) and CD10 positivity. The ductal adenocarcinoma component retained SMARCB1/INI1 expression and was positive for CK7 and CK-pan. Transitional zones between the two tumor components suggested progressive dedifferentiation and underlying genomic instability. The patient received adjuvant chemotherapy with gemcitabine and nab-paclitaxel and maintained a satisfactory quality of life at the 6-month follow-up. CONCLUSION: This study reports a rare case of SMARCB1/INI1-deficient undifferentiated rhabdoid carcinoma of the pancreas combined with ductal adenocarcinoma, presenting as SSR, an exceptionally uncommon initial manifestation of pancreatic malignancy.
Keywords: rhabdoid features; switch/sucrose non-fermentable; chemotherapy; case report
6. Adaptive Reinforcement Learning with Multi-Modal Perception for Autonomous Formation Control and Exploration in Large-Scale Multi-UAV Swarms
Authors: Ziyuan Ma, Huajun Gong, Xinhua Wang. Journal of Beijing Institute of Technology, 2026, No. 1, pp. 63-83 (21 pages)
To address the challenge of achieving decentralized, scalable, and adaptive control for large-scale multiple unmanned aerial vehicle (multi-UAV) swarms in dynamic urban environments with obstacles and wind perturbations, we propose a hybrid framework integrating adaptive reinforcement learning (RL), multi-modal perception fusion, and enhanced pigeon flock optimization (PFO) with curiosity-driven exploration to enable robust autonomous and formation control. The framework leverages meta-learning to optimize RL policies for real-time adaptation, fuses sensor data for precise state estimation, and enhances PFO with learned leader-follower dynamics and exploration rewards to maintain cohesive formations and explore uncertain areas. For swarms of 10-30 UAVs, it achieves 34% faster convergence, 61% reduced stability root mean square error (RMSE), 88% fewer collisions, and 85.6%-92.3% success rates in target detection and encirclement, outperforming standard multi-agent RL, pure PFO, and single-modality RL. Three-dimensional trajectory visualizations confirm cohesive formations, collision-free maneuvers, and efficient exploration in urban search-and-rescue scenarios. Innovations include meta-RL for rapid adaptation, multi-modal fusion for robust perception, and curiosity-driven PFO for scalable, decentralized control, advancing real-world multi-UAV swarm autonomy and coordination.
Keywords: multiple unmanned aerial vehicle (multi-UAV) swarm; autonomous control; reinforcement learning (RL); multi-modal perception; pigeon flock optimization (PFO)
7. Multimodal deep learning with time-frequency health features for battery SOH and RUL prediction
Authors: Rongzheng Wang, Le Chen, Jiahao Xu, Fei Yuan, Junjie Han, Zongrun Li, Zekun Li, Yiwei Zhang, Peiyan Li, Lipeng Zhang, Zhouguang Lu. Journal of Energy Chemistry, 2026, No. 2, pp. 303-314, I0009 (13 pages)
This study proposes a multimodal deep learning framework for joint prediction of the state of health (SOH) and remaining useful life (RUL) of lithium-ion batteries. Twelve representative impedance features, covering charge-transfer resistance, solid electrolyte interface (SEI) layer impedance, and ion diffusion, are extracted from electrochemical impedance spectroscopy (EIS) and combined with short voltage/current segments to form a compact, interpretable feature set. A residual multi-layer perceptron (ResMLP) is employed for SOH regression, and a temporal convolutional network with attention (TCN-Attention) is used for RUL estimation. Lifetime experiments on two battery types with different chemistries and form factors, evaluated through three rounds of paired cross-validation, validate the approach. Results show that the proposed features significantly reduce dimensionality and computational cost while substantially lowering SOH error, achieving an average normalized root mean square error of 2.3%. The RUL prediction reaches an average error of 14.8%. Overall, the framework balances interpretability, robustness, and feasibility, providing a practical solution for battery management system (BMS) monitoring and life prediction.
Keywords: state of health; remaining useful life; feature selection; electrochemical impedance spectroscopy; machine learning
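The headline error metric, normalized root mean square error between predicted and measured SOH, can be sketched in a few lines. The SOH values below are toy numbers (the paper reports an average NRMSE of 2.3%), and normalizing by the measured range is one common convention; the paper may normalize differently.

```python
import math

def nrmse(y_true, y_pred):
    """RMSE normalized by the range of the measured values."""
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
    return rmse / (max(y_true) - min(y_true))

# Toy SOH trajectory (fraction of nominal capacity) and a toy prediction.
soh_true = [1.00, 0.97, 0.94, 0.90, 0.86, 0.80]
soh_pred = [0.99, 0.97, 0.95, 0.89, 0.86, 0.81]
err = nrmse(soh_true, soh_pred)   # about 4% on this toy trajectory
```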
8. Numerical investigation of flow features and aero-optical effects of turret with different bottom cylinder heights in a transonic flow
Authors: Xiaotong TAN, Heyong XU. Chinese Journal of Aeronautics, 2026, No. 2, pp. 123-140 (18 pages)
Improved delayed detached eddy simulation is performed to explore the flow features and aero-optical effects of turrets with different bottom cylinder heights at a freestream Mach number Ma=0.7. Analysis of both the time-averaged and instantaneous flow features demonstrates that the shock motion causes the oscillation of the separated shear layer. In the flow analysis, two unsteady shock-wake-correlated modes are discerned: the asymmetric shifting mode and the symmetric breathing mode. With the increase of cylinder height, the relative energy of the shock gradually increases, going from 26% to 59%. The proper orthogonal decomposition analysis yields a single frequency peak for each of the two dominant modes. The frequency peaks of the shifting mode are generally at StD<0.23, while the frequency peaks of the breathing mode are generally at StD>0.26. The dynamic mode decomposition analysis gives the range of the frequency peaks: the frequency peaks of the shifting mode are in the range StD=0.11-0.23, and those of the breathing mode are in the range StD=0.26-0.41. Optical distortion analysis indicates that the distortion calculated in the five cases is linked to the breathing mode. When the beam passes through the turbulent wake, it exhibits high-frequency and high-amplitude characteristics.
Keywords: aero-optical effect; bottom cylinder height; dynamic mode decomposition; flow features; proper orthogonal decomposition
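The snapshot proper orthogonal decomposition used in the modal analysis — stack mean-subtracted flow snapshots as columns, take an SVD, and read each mode's relative energy from the squared singular values (the paper compares the shock mode's relative energy, 26% to 59%, across cylinder heights) — can be sketched with synthetic data standing in for the CFD snapshots:

```python
import numpy as np

rng = np.random.default_rng(1)
n_points, n_snaps = 200, 40
snapshots = rng.standard_normal((n_points, n_snaps))   # synthetic flow snapshots

# Subtract the time-mean field to get the fluctuation matrix.
fluct = snapshots - snapshots.mean(axis=1, keepdims=True)

# POD modes are the left singular vectors; modal energies are the squared
# singular values, normalized so the fractions sum to one.
modes, sing, _ = np.linalg.svd(fluct, full_matrices=False)
energy_frac = sing**2 / np.sum(sing**2)
```

The leading columns of `modes` would correspond to the shifting and breathing modes in the paper's analysis; here they are just the dominant directions of random data.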
9. HDFPM: A Heterogeneous Disk Failure Prediction Method Based on Time Series Features
Authors: Zhongrui Jing, Hongzhang Yang, Jiangpu Guo. Computers, Materials & Continua, 2026, No. 2, pp. 2187-2211 (25 pages)
Hard disk drives (HDDs) serve as the primary storage devices in modern data centers. Once a failure occurs, it often leads to severe data loss, significantly degrading the reliability of storage systems. Numerous studies have proposed machine learning-based HDD failure prediction models. However, the Self-Monitoring, Analysis, and Reporting Technology (SMART) attributes differ across HDD manufacturers. We define hard drives of the same brand and model as homogeneous HDD groups, and those from different brands or models as heterogeneous HDD groups. In practical engineering scenarios, a data center is often composed of a heterogeneous population of HDDs spanning multiple vendors and models. Existing research predominantly focuses on homogeneous datasets, ignoring the model's generalization capability across heterogeneous HDDs. As a result, HDD models with limited samples often suffer from poor training effectiveness and prediction performance. To address this issue, we investigate generalizable SMART predictors across heterogeneous HDD groups. By extracting time-series features within a fixed sliding time window, we propose a Heterogeneous Disk Failure Prediction Method based on Time Series Features (HDFPM). This method is adaptable to HDD models with limited sample sizes, thereby enhancing its applicability and robustness across diverse drive populations. Experimental results show that the proposed model achieves an F1-score of 0.9518 when applied to two different Seagate HDD models, while maintaining the False Positive Rate (FPR) below 1%. After incorporating the Complexity-Ratio Dynamic Time Warping (CDTW) based feature enhancement method, the best prediction model achieves a True Positive Rate (TPR) of up to 0.93 between the two models. For next-day failure prediction across various Seagate models, the model achieves an F1-score of up to 0.8792. Moreover, the experimental results also show that within the same brand, the higher the proportion of shared SMART attributes across different models, the better the prediction performance. In addition, HDFPM demonstrates the best stability and most significant performance in heterogeneous environments.
Keywords: heterogeneous hard disk drives; failure prediction; time series feature; constrained dynamic time warping; sensitivity analysis
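The fixed-sliding-window feature extraction at the core of HDFPM — each window over a SMART attribute series yields a vector of time-series statistics — can be sketched as below. The window length and the particular statistics (mean, standard deviation, end-to-end slope) are illustrative choices, not necessarily the feature set HDFPM uses.

```python
import statistics

def window_features(series, width):
    """Slide a fixed-width window over a SMART attribute series and emit
    simple time-series statistics for each window position."""
    feats = []
    for i in range(len(series) - width + 1):
        w = series[i:i + width]
        feats.append({
            "mean": statistics.fmean(w),
            "std": statistics.pstdev(w),
            "slope": (w[-1] - w[0]) / (width - 1),   # end-to-end trend
        })
    return feats

# Toy daily readings of a degradation-related SMART counter.
smart_197 = [0, 0, 1, 1, 4, 9, 16]
feats = window_features(smart_197, width=4)   # 4 overlapping windows
```

Because the statistics describe the shape of the series rather than raw attribute values, the same feature vector layout applies to drives whose SMART attribute sets only partially overlap.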
10. Long-range masked autoencoder for pre-extraction of trajectory features in within-visual-range maneuver recognition
Authors: Feilong Jiang, Hutao Cui, Yuqing Li, Minqiang Xu, Rixin Wang. Defence Technology, 2026, No. 1, pp. 301-315 (15 pages)
In the field of intelligent air combat, real-time and accurate recognition of within-visual-range (WVR) maneuver actions serves as the foundational cornerstone for constructing autonomous decision-making systems. However, existing methods face two major challenges: traditional feature engineering suffers from insufficient effective dimensionality in the feature space due to kinematic coupling, making it difficult to distinguish essential differences between maneuvers, while end-to-end deep learning models lack controllability in implicit feature learning and fail to model high-order long-range temporal dependencies. This paper proposes a trajectory feature pre-extraction method based on a Long-range Masked Autoencoder (LMAE), incorporating three key innovations: (1) Random Fragment High-ratio Masking (RFH-Mask), which enforces the model to learn long-range temporal correlations by masking 80% of trajectory data while retaining continuous fragments; (2) a Kalman Filter-Guided Objective Function (KFG-OF), integrating trajectory continuity constraints to align the feature space with kinematic principles; and (3) a Two-stage Decoupled Architecture, enabling efficient and controllable feature learning through unsupervised pre-training and frozen-feature transfer. Experimental results demonstrate that LMAE significantly improves the average recognition accuracy for 20-class maneuvers compared to traditional end-to-end models, while significantly accelerating convergence. The contributions of this work lie in introducing high-masking-rate autoencoders into low-information-density trajectory analysis, proposing a feature engineering framework with enhanced controllability and efficiency, and providing a novel technical pathway for intelligent air combat decision-making systems.
Keywords: within-visual-range maneuver recognition; trajectory feature pre-extraction; long-range masked autoencoder; Kalman filter constraints; intelligent air combat
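The RFH-Mask idea — mask roughly 80% of the trajectory while the visible 20% is kept as a few continuous fragments, so reconstruction forces long-range temporal reasoning — can be sketched as follows. The fragment count and placement policy here are illustrative assumptions; the paper's sampling scheme may differ.

```python
import random

def rfh_mask(seq_len, mask_ratio=0.8, n_fragments=2, rng=None):
    """Return a boolean mask (True = masked) covering about `mask_ratio` of the
    sequence, with the visible portion left as continuous fragments."""
    rng = rng or random.Random()
    visible_total = round(seq_len * (1.0 - mask_ratio))
    frag_len = max(1, visible_total // n_fragments)
    mask = [True] * seq_len
    for _ in range(n_fragments):
        start = rng.randrange(0, seq_len - frag_len + 1)
        for i in range(start, start + frag_len):
            mask[i] = False                    # keep this fragment visible
    return mask

mask = rfh_mask(seq_len=100, mask_ratio=0.8, rng=random.Random(0))
masked_ratio = sum(mask) / len(mask)
```

If the randomly placed fragments overlap, slightly more than 80% ends up masked; a production implementation would rejection-sample non-overlapping fragments to hit the ratio exactly.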
11. Multi-modal retinal disease diagnosis based on fundus photography and OCT images
Authors: Han Xu, Ruichan Lv. Journal of Innovative Optical Health Sciences, 2026, No. 2, pp. 69-88 (20 pages)
Retinal diseases are a serious threat to human visual health, and their early diagnosis is crucial. Currently, most retinal disease diagnostic algorithms are based on a single imaging modality, fundus color photography (FCP) or optical coherence tomography (OCT). These methods can only reflect retinal diseases to a certain extent, ignoring the specificity of the different imaging modalities. In this research, a new multi-scale feature fusion network (MSFF-Net) model for multi-modal retinal image diagnosis is proposed. The MSFF-Net model employs a dual-branch architecture, enabling efficient learning and extraction of multi-modal feature information related to retinal diseases from FCP and OCT images. MSFF-Net improves disease diagnosis by combining multi-scale features of FCP and OCT images. When evaluated on challenging datasets, the model achieved an accuracy of 95.00% and an F1-score of 95.24% for retinal disease diagnosis. Even under low-quality dataset conditions, it maintained robust performance, with diagnostic accuracy and F1-score of 71.50% and 71.73%, respectively. In addition, the MSFF-Net model outperformed eight state-of-the-art single- and multi-modal models in the comparison experiments. The proposed MSFF-Net model provides ophthalmologists with a more accurate and efficient diagnostic pathway that helps them detect and treat retinal diseases earlier.
Keywords: fundus color photography (FCP); optical coherence tomography (OCT); multi-scale feature fusion; disease diagnosis
12. Tomato Growth Height Prediction Method by Phenotypic Feature Extraction Using Multi-modal Data
Authors: GONG Yu, WANG Ling, ZHAO Rongqiang, YOU Haibo, ZHOU Mo, LIU Jie. Smart Agriculture (智慧农业), 2025, No. 1, pp. 97-110 (14 pages)
[Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that utilize either image data or environmental data. These methods fail to fully leverage multi-modal data to comprehensively capture the diverse aspects of plant growth. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms of recurrent neural networks (RNN) and long short-term memory (LSTM) was developed. The model integrated environment and plant information to provide a holistic understanding of the growth process, employed phenotypic and temporal feature extractors to comprehensively capture both types of features, and enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions, the 10-day prediction for 4 days ahead and the 30-day prediction for 12 days ahead, the PFE-RNN model continued to outperform the two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
Keywords: tomato growth prediction; deep learning; phenotypic feature extraction; multi-modal data; recurrent neural network; long short-term memory; large language model
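The comparison metric throughout this study is mean absolute percentage error; a minimal sketch with toy plant heights (not the study's measurements) follows:

```python
# MAPE between measured and predicted tomato heights, in percent.

def mape(y_true, y_pred):
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

height_cm_true = [50.0, 52.0, 55.0, 60.0]   # toy measured heights
height_cm_pred = [49.5, 52.5, 54.0, 61.2]   # toy model predictions
err = mape(height_cm_true, height_cm_pred)  # about 1.4% on this toy data
```

MAPE is scale-free, which is why it allows direct comparison of the PFE models against the LLM and Transformer baselines across prediction horizons.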
13. Learning Multi-Modality Features for Scene Classification of High-Resolution Remote Sensing Images (cited by 1)
Authors: Feng'an Zhao, Xiongmei Zhang, Xiaodong Mu, Zhaoxiang Yi, Zhou Yang. Journal of Computer and Communications, 2018, No. 11, pp. 185-193 (9 pages)
Scene classification of high-resolution remote sensing (HRRS) images is an important research topic and has been applied broadly in many fields. Deep learning methods have shown high potential in this domain, owing to their powerful ability to characterize complex patterns. However, deep learning methods omit some global and local information of the HRRS image. To this end, in this article we adopt explicit global and local information to provide complementary information to deep models. Specifically, we use a patch-based MS-CLBP method to acquire global and local representations, and then we use a pretrained CNN model as a feature extractor and extract deep hierarchical features from the fully connected layers. After Fisher vector (FV) encoding, we obtain the holistic visual representation of the scene image. We view scene classification as a reconstruction procedure and train several class-specific stacked denoising autoencoders (SDAEs), i.e., one SDAE per class, and classify the test image according to the reconstruction error. Experimental results show that our combination method outperforms state-of-the-art deep learning classification methods without employing fine-tuning.
Keywords: feature fusion; multiple features; scene classification; stacked denoising autoencoder
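The reconstruction-error decision rule described above — one reconstruction model per class, and a test sample goes to the class whose model reconstructs its feature vector best — can be sketched as follows. For illustration, each "autoencoder" is a linear projection onto the class's top principal directions rather than a trained stacked denoising autoencoder, and the two toy scene classes are synthetic.

```python
import numpy as np

def fit_class_model(X, k=2):
    """Per-class 'reconstructor': class mean + top-k principal directions."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def recon_error(x, model):
    """Squared error after projecting onto the class subspace and back."""
    mu, basis = model
    z = (x - mu) @ basis.T
    x_hat = mu + z @ basis
    return float(np.sum((x - x_hat) ** 2))

def classify(x, models):
    """Assign x to the class with the smallest reconstruction error."""
    return int(np.argmin([recon_error(x, m) for m in models]))

rng = np.random.default_rng(2)
class0 = rng.standard_normal((50, 10))        # toy class-0 feature vectors
class1 = rng.standard_normal((50, 10)) + 6.0  # toy class-1, well separated
models = [fit_class_model(class0), fit_class_model(class1)]
pred = classify(class1[0], models)
```

Swapping the linear projections for trained SDAEs changes only `fit_class_model` and `recon_error`; the argmin-over-errors rule is the same.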
14. A Hand Features Based Fusion Recognition Network with Enhancing Multi-Modal Correlation
Authors: Wei Wu, Yuan Zhang, Yunpeng Li, Chuanyang Li, Yan Hao. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 7, pp. 537-555 (19 pages)
Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities and leverages inter-modal correlation to enhance recognition performance. Concurrently, the robustness and recognition performance of the system can be enhanced by judiciously leveraging the correlation among multi-modal features. Nevertheless, two issues persist in multi-modal feature fusion recognition. Firstly, efforts to enhance recognition performance have not comprehensively considered the inter-modality correlations among distinct modalities. Secondly, during modal fusion, improper weight selection diminishes the salience of crucial modal features, thereby diminishing the overall recognition performance. To address these two issues, we introduce an enhanced DenseNet multi-modal recognition network founded on feature-level fusion. The information from the three modalities is fused akin to RGB, and the input network augments the correlation between modes through channel correlation. Within the enhanced DenseNet network, the Efficient Channel Attention Network (ECA-Net) dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature. Depthwise separable convolution markedly reduces the training parameters and further enhances the feature correlation. Experimental evaluations were conducted on four multi-modal databases, comprising six unimodal databases, including the multispectral palmprint and palm vein databases from the Chinese Academy of Sciences. The Equal Error Rate (EER) values were 0.0149%, 0.0150%, 0.0099%, and 0.0050%, respectively. In comparison to other network methods for palmprint, palm vein, and finger vein fusion recognition, this approach substantially enhances recognition performance, rendering it suitable for high-security environments with practical applicability. The experiments in this article utilized a modest sample database comprising 200 individuals; the next phase involves extending the method to larger databases.
Keywords: biometrics; multi-modal; correlation; deep learning; feature-level fusion
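The abstract above describes ECA-Net dynamically re-weighting channels to amplify salient modal features. A minimal pure-Python sketch of the ECA idea (global average pooling per channel, a small 1-D convolution across the channel descriptor, and a sigmoid gate) might look as follows; the uniform kernel and the function names are illustrative assumptions, not the authors' implementation:

```python
import math

def eca_weights(channel_means, k=3):
    """ECA-style gates: 1-D convolution over the per-channel averages,
    then a sigmoid. The kernel here is uniform (1/k) for illustration;
    a trained ECA layer would learn these k weights."""
    n = len(channel_means)
    pad = k // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    gates = []
    for i in range(n):
        conv = sum(padded[i + j] / k for j in range(k))  # uniform kernel
        gates.append(1.0 / (1.0 + math.exp(-conv)))      # sigmoid gate
    return gates

def apply_eca(feature_map, k=3):
    # feature_map: list of channels, each a list of activations
    means = [sum(ch) / len(ch) for ch in feature_map]    # global avg pool
    gates = eca_weights(means, k)
    return [[v * g for v in ch] for ch, g in zip(feature_map, gates)]
```

Each channel is scaled by a gate in (0, 1) derived only from its neighborhood in the channel-descriptor vector, which is what keeps ECA's parameter count so small.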
MMGC-Net: Deep neural network for classification of mineral grains using multi-modal polarization images (Cited by 1)
15
Authors: Jun Shu, Xiaohai He, Qizhi Teng, Pengcheng Yan, Haibo He, Honggang Chen 《Journal of Rock Mechanics and Geotechnical Engineering》 2025, Issue 6, pp. 3894-3909
The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from the two types of polarized images, ensuring feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Ultimately, the inter-modal feature fusion module integrates the two types of modal features to enhance classification precision. Quantitative and qualitative experimental results indicate that, compared with current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in mineral particle multi-modal feature learning and in four classification evaluation metrics. It also demonstrates better stability than existing models.
Keywords: mineral particles; multi-modal image classification; shared parameters; feature fusion; spatiotemporal features
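The shared-parameter alignment step above can be illustrated with a toy extractor: one set of weights embeds both the plane-polarized image and every cross-polarized frame, so the resulting features live in the same space before fusion. All names here are hypothetical and the "backbone" is deliberately trivial (a weighted pixel sum standing in for a 2D CNN):

```python
def extract(image, weights):
    """Toy 'backbone': a weighted sum of pixel values. In MMGC-Net this
    would be a 2-D CNN; the key point is that the SAME weights serve
    both polarization modalities."""
    return sum(w * p for w, p in zip(weights, image))

def shared_parameter_features(plane_img, cross_seq, weights):
    """Shared-parameter extraction sketch: the identical extractor maps
    the plane-polarized image and each cross-polarized frame into one
    aligned feature space; concatenation stands in for the inter-modal
    fusion module."""
    plane_feat = extract(plane_img, weights)
    cross_feats = [extract(img, weights) for img in cross_seq]
    return [plane_feat] + cross_feats
```

Because the extractor's parameters are shared, any distance computed between the plane-polarized and cross-polarized features is meaningful, which is what "feature alignment" buys the downstream fusion modules.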
Multi-modal face parts fusion based on Gabor feature for face recognition (Cited by 1)
16
Author: Xiang Yan 《High Technology Letters》 EI CAS 2009, Issue 1, pp. 70-74
A novel face recognition method, a fusion of multi-modal face parts based on Gabor features (MMP-GF), is proposed in this paper. Firstly, the bare face image detached from the normalized image was convolved with a family of Gabor kernels; then, according to the face structure and the key-point locations, the calculated Gabor images were divided into five parts: Gabor face, Gabor eyebrow, Gabor eye, Gabor nose, and Gabor mouth. After that, the multi-modal Gabor features were spatially partitioned into non-overlapping regions, and the averages of the regions were concatenated into a low-dimensional feature vector, whose dimension was further reduced by principal component analysis (PCA). In the decision-level fusion, match results calculated separately for the five parts were combined according to linear discriminant analysis (LDA), and a normalized matching algorithm was used to improve performance. Experiments on the FERET database show that the proposed MMP-GF method achieves good robustness to expression and age variations.
Keywords: Gabor filter; multi-modal Gabor features; principal component analysis (PCA); linear discriminant analysis (LDA); normalized matching algorithm
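The regional-averaging step described above (partitioning a Gabor response map into non-overlapping regions and concatenating the regional means into a low-dimensional vector) can be sketched in a few lines. `region_averages` is a hypothetical helper illustrating only that step; the Gabor convolution, PCA reduction, and LDA fusion are omitted:

```python
def region_averages(image, region_h, region_w):
    """Partition a 2-D response map (list of rows) into non-overlapping
    region_h x region_w blocks and return the concatenated block means,
    as in the MMP-GF feature construction (simplified sketch)."""
    h, w = len(image), len(image[0])
    feats = []
    for r in range(0, h - region_h + 1, region_h):
        for c in range(0, w - region_w + 1, region_w):
            block = [image[r + i][c + j]
                     for i in range(region_h) for j in range(region_w)]
            feats.append(sum(block) / len(block))
    return feats
```

For an H x W map and h x w regions this yields (H//h) * (W//w) values per Gabor channel, which is what makes the concatenated vector low-dimensional before PCA is even applied.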
Construction and evaluation of a predictive model for the degree of coronary artery occlusion based on adaptive weighted multi-modal fusion of traditional Chinese and western medicine data (Cited by 2)
17
Authors: Jiyu Zhang, Jiatuo Xu, Liping Tu, Hongyuan Fu 《Digital Chinese Medicine》 2025, Issue 2, pp. 163-173
Objective: To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods: Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using metrics including accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results: A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, F1 score = 0.933). Compared with the model ablation and data ablation experiments and various traditional machine learning models, the AWMDF model demonstrated superior performance. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion: The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
Keywords: coronary artery disease; deep learning; multi-modal; clinical prediction; traditional Chinese medicine diagnosis
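The adaptive-weighting idea the abstract contrasts with fixed weights, averaging, and voting can be illustrated with a toy decision-level fusion: per-modality class scores combined using softmax-normalized confidence weights. This is a hedged sketch of the general strategy only; the function names and the softmax weighting are assumptions, not the published AWMDF architecture:

```python
import math

def adaptive_fusion(modality_scores, confidences):
    """Decision-level fusion sketch: each modality contributes its class
    scores weighted by a softmax over per-modality confidences, so a
    more reliable modality dominates the fused prediction.
    modality_scores: list of per-modality score vectors (same length).
    confidences: one scalar per modality (e.g. a learned reliability)."""
    exps = [math.exp(c) for c in confidences]
    total = sum(exps)
    weights = [e / total for e in exps]          # softmax, sums to 1
    n_classes = len(modality_scores[0])
    fused = [0.0] * n_classes
    for w, scores in zip(weights, modality_scores):
        for i, s in enumerate(scores):
            fused[i] += w * s                    # confidence-weighted sum
    return fused, weights
```

With equal confidences this degenerates to plain averaging, which is exactly the baseline the adaptive scheme is reported to beat.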
TCM network pharmacology: new perspective integrating network target with artificial intelligence and multi-modal multi-omics technologies (Cited by 1)
18
Authors: Ziyi Wang, Tingyu Zhang, Boyang Wang, Shao Li 《Chinese Journal of Natural Medicines》 2025, Issue 11, pp. 1425-1434
Traditional Chinese medicine (TCM) demonstrates distinctive advantages in disease prevention and treatment. However, analyzing its biological mechanisms through the modern medical research paradigm of "single drug, single target" presents significant challenges due to TCM's holistic approach. Network pharmacology and its core theory of network targets connect drugs and diseases from a holistic and systematic perspective based on biological networks, overcoming the limitations of reductionist research models and showing considerable value in TCM research. The recent integration of network target computational and experimental methods with artificial intelligence (AI) and multi-modal multi-omics technologies has substantially enhanced network pharmacology methodology. These advances in computational and experimental techniques provide complementary support for network target theory in decoding TCM principles. This review, centered on network targets, examines the progress of network target methods combined with AI in predicting disease molecular mechanisms and drug-target relationships, alongside the application of multi-modal multi-omics technologies in analyzing TCM formulae, syndromes, and toxicity. Looking forward, network target theory is expected to incorporate emerging technologies while developing novel approaches aligned with its unique characteristics, potentially leading to significant breakthroughs in TCM research and advancing scientific understanding and innovation in TCM.
Keywords: network pharmacology; traditional Chinese medicine; network target; artificial intelligence; multi-modal; multi-omics
Multi-modal intelligent situation awareness in real-time air traffic control: Control intent understanding and flight trajectory prediction (Cited by 1)
19
Authors: Dongyue Guo, Jianwei Zhang, Bo Yang, Yi Lin 《Chinese Journal of Aeronautics》 2025, Issue 6, pp. 41-57
With the advent of the next-generation Air Traffic Control (ATC) system, there is growing interest in using Artificial Intelligence (AI) techniques to enhance Situation Awareness (SA) for ATC Controllers (ATCOs), i.e., Intelligent SA (ISA). However, existing AI-based SA approaches often rely on unimodal data and lack a comprehensive description and benchmark of ISA tasks utilizing multi-modal data in real-time ATC environments. To address this gap, by analyzing the situation awareness procedure of ATCOs, the ISA task is refined to the processing of two primary elements, i.e., spoken instructions and flight trajectories. Subsequently, ISA is further formulated into the Controlling Intent Understanding (CIU) and Flight Trajectory Prediction (FTP) tasks. For the CIU task, an innovative automatic speech recognition and understanding framework is designed to extract the controlling intent from unstructured and continuous ATC communications. For the FTP task, single- and multi-horizon FTP approaches are investigated to support high-precision prediction of the situation's evolution. A total of 32 unimodal/multi-modal advanced methods with extensive evaluation metrics are introduced to conduct benchmarks on a real-world multi-modal ATC situation dataset. Experimental results demonstrate the effectiveness of AI-based techniques in enhancing ISA for the ATC environment.
Keywords: air traffic control; automatic speech recognition and understanding; flight trajectory prediction; multi-modal; situation awareness
Test method of laser paint removal based on multi-modal feature fusion (Cited by 2)
20
Authors: HUANG Hai-peng, HAO Ben-tian, YE De-jun, GAO Hao, LI Liang 《Journal of Central South University》 SCIE EI CAS CSCD 2022, Issue 10, pp. 3385-3398
Laser cleaning is a highly nonlinear physical process; single-modal (e.g., acoustic or visual) detection of it performs poorly and under-utilizes the information shared between modalities. In this study, a multi-modal feature fusion network model was constructed based on a laser paint removal experiment. The alignment of heterogeneous data from different modalities was achieved by combining piecewise aggregate approximation and the Gramian angular field. Moreover, an attention mechanism was introduced to optimize the dual-path network and the dense connection network, enabling the sampled characteristics to be extracted and integrated. Consequently, multi-modal discriminant detection of laser paint removal was realized. According to the experimental results, the verification accuracy of the constructed model on the experimental dataset was 99.17%, which is 5.77% higher than the optimal single-modal detection result for laser paint removal. With the feature extraction network optimized by the attention mechanism, the model accuracy increased by 3.3%. The results verify the improved classification performance of the constructed multi-modal feature fusion model in detecting laser paint removal, the effective integration of acoustic and visual image data, and the accurate detection of laser paint removal.
Keywords: laser cleaning; multi-modal fusion; image processing; deep learning
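The abstract mentions aligning heterogeneous acoustic and visual data by combining piecewise aggregate approximation (PAA) and the Gramian angular field. A minimal sketch of these two standard time-series transforms (not the authors' network) in plain Python; `paa` and `gasf` are illustrative names:

```python
import math

def paa(series, segments):
    """Piecewise aggregate approximation: shorten a 1-D signal to the
    mean of each equal-length segment (length must divide evenly here)."""
    step = len(series) // segments
    return [sum(series[i * step:(i + 1) * step]) / step
            for i in range(segments)]

def gasf(series):
    """Gramian Angular Summation Field: rescale to [-1, 1], map each
    value to a polar angle phi = arccos(x), and form cos(phi_i + phi_j),
    turning a 1-D acoustic signal into a 2-D image that can be fed to
    the same kind of network as the visual modality."""
    lo, hi = min(series), max(series)
    scaled = [2 * (x - lo) / (hi - lo) - 1 if hi > lo else 0.0
              for x in series]
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]
```

PAA first compresses the acoustic sequence to a fixed length, and GASF then lifts it to an image, which is what lets the two modalities share one convolutional pipeline.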