Biometric characteristics have played a vital role in security in recent years. Human gait classification in video sequences is an important biometric attribute and is used for security purposes. A new framework for human gait classification in video sequences is proposed, based on deep learning (DL) feature fusion and posterior probability-based moth flame optimization (MFO). In the first step, the video frames are resized and two pre-trained lightweight DL models, EfficientNetB0 and MobileNetV2, are fine-tuned. Both models are selected for their top-5 accuracy and small number of parameters. The models are then trained through deep transfer learning, and the extracted deep features are fused using a voting scheme. In the last step, the authors develop a posterior probability-based MFO feature selection algorithm to select the best features. The selected features are classified using several supervised learning methods. The publicly available CASIA-B dataset is used for the experiments. On this dataset, the authors selected six angles (0°, 18°, 90°, 108°, 162°, and 180°) and obtained average accuracies of 96.9%, 95.7%, 86.8%, 90.0%, 95.1%, and 99.7%, respectively. The results show an improvement in accuracy and a significant reduction in computational time compared with recent state-of-the-art techniques.
Hematoxylin and Eosin (H&E) images, widely used in digital pathology, often pose challenges because of their limited color richness, which hinders the differentiation of subtle cell features crucial for accurate classification. Enhancing the visibility of these elusive cell features helps train robust deep-learning models. However, the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community. To address this challenge, we introduce Salient Features Guided Augmentation (SFGA), an approach that strategically integrates machine learning and image processing. SFGA uses machine learning algorithms to identify crucial features within cell images and then maps these features to appropriate image processing techniques to enhance training images. By emphasizing salient features and aligning them with corresponding image processing methods, SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks. Our research undertakes a series of experiments, each exploring the performance of different datasets and data enhancement techniques in classifying cell types, highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics. Specifically, SFGA focuses on identifying tumor cells in tissue for extranodal extension detection, with the SFGA-enhanced dataset showing notable advantages in accuracy. We conducted a preliminary study of five experiments, among which the accuracy of the pleomorphism experiment improved significantly, from 50.81% to 95.15%. The accuracy of the other four experiments also increased, with improvements ranging from 3 to 43 percentage points. Our preliminary study shows the potential to enhance the diagnostic accuracy of deep learning models and proposes a systematic approach that could improve cancer diagnosis, serving as a first step toward using SFGA in medical image enhancement.
Human Activity Recognition (HAR) has become increasingly critical in civic surveillance, medical care monitoring, and institutional protection. Current deep learning-based approaches often suffer from excessive computational complexity, limited generalizability under varying conditions, and compromised real-time performance. To address these issues, this paper introduces an Active Learning-aided Heuristic Deep Spatio-Textural Ensemble Learning (ALH-DSEL) framework. The model first identifies keyframes in the surveillance videos with a Multi-Constraint Active Learning (MCAL) approach, using features extracted from DenseNet121. The frames are then segmented with a Fuzzy C-Means clustering algorithm optimized by the Firefly algorithm to identify areas of interest. A deep ensemble feature extractor, comprising DenseNet121, EfficientNet-B7, MobileNet, and GLCM, extracts varied spatial and textural features. The fused features are refined through PCA and Min-Max normalization and classified by a majority-voting ensemble of RF, AdaBoost, and XGBoost. The experimental results show that ALH-DSEL provides higher accuracy, precision, recall, and F1-score, supporting its suitability for real-time HAR in surveillance scenarios.
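The last two stages named in the abstract above, Min-Max normalization and a voting ensemble, can be illustrated with a minimal sketch. This is not the ALH-DSEL implementation: the function names `min_max_normalize` and `majority_vote` are hypothetical, the three predictions stand in for the outputs of RF, AdaBoost, and XGBoost, and the tie-breaking rule is an assumption.

```python
from collections import Counter


def min_max_normalize(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def majority_vote(predictions):
    """Return the most common label among the classifiers' predictions.

    Assumption: ties go to the label that appears first in the list.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical labels predicted by three classifiers for one video clip.
label = majority_vote(["walking", "running", "walking"])
scaled = min_max_normalize([2, 4, 6])
```

In practice each feature column would be normalized with the minimum and maximum seen on the training set, so that test samples are scaled consistently.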
As petroleum exploration advances and most of the oil-gas reservoirs in shallow layers have been explored, petroleum exploration is moving toward deep basins, which has become an inevitable choice. In this paper, the petroleum geology features of, and research progress on, oil-gas reservoirs in deep petroliferous basins across the world are characterized using the latest results of worldwide deep petroleum exploration. Research has demonstrated that deep petroleum shows ten major geological features. (1) While oil-gas reservoirs have been discovered in many different types of deep petroliferous basins, most have been discovered in low heat flux deep basins. (2) Many types of petroliferous traps are developed in deep basins, and tight oil-gas reservoirs in deep basin traps are attracting increasing attention. (3) Deep petroleum normally has more natural gas than liquid oil, and the natural gas ratio increases with burial depth. (4) The residual organic matter in deep source rocks decreases, but the hydrocarbon expulsion rate and efficiency increase with burial depth. (5) There are many types of rocks in deep hydrocarbon reservoirs, and most are clastic rocks and carbonates. (6) The ages of deep hydrocarbon reservoirs differ widely, but those recently discovered are predominantly Paleogene and Upper Paleozoic. (7) The porosity and permeability of deep hydrocarbon reservoirs differ widely, but they vary in a regular way with lithology and burial depth. (8) The temperatures of deep oil-gas reservoirs differ widely, but they typically vary with burial depth and basin geothermal gradient. (9) The pressures of deep oil-gas reservoirs differ significantly, but they typically vary with burial depth, genesis, and evolution period. (10) Deep oil-gas reservoirs may exist with or without a cap, and those without a cap are typically of unconventional genesis. Over the past decade, six major advances have been made in the understanding of deep hydrocarbon reservoir formation. (1) Deep petroleum in petroliferous basins has multiple sources and many different genetic mechanisms. (2) There are high-porosity, high-permeability reservoirs in deep basins, the formation of which is associated with tectonic events and subsurface fluid movement. (3) Capillary pressure differences between the inside and outside of the target reservoir are the principal driving force of hydrocarbon enrichment in deep basins. (4) There are three dynamic boundaries for deep oil-gas reservoirs: a buoyancy-controlled threshold, hydrocarbon accumulation limits, and the upper limit of hydrocarbon generation. (5) The formation and distribution of deep hydrocarbon reservoirs are controlled by free, limited, and bound fluid dynamic fields. (6) Tight conventional, tight deep, tight superimposed, and related reconstructed hydrocarbon reservoirs formed in deep limited fluid dynamic fields have great resource potential and vast scope for exploration. Compared with middle-shallow strata, the petroleum geology and accumulation in deep basins are more complex, superimposing the features of basin evolution in different stages. We recommend that further study pay more attention to four aspects: (1) identification of deep petroleum sources and evaluation of their relative contributions; (2) preservation conditions and genetic mechanisms of deep high-quality reservoirs with high permeability and high porosity; (3) facies features and transformation of deep petroleum and their potential distribution; and (4) economic feasibility evaluation of deep tight petroleum exploration and development.
In the area of medical image processing, stomach cancer is one of the most important cancers that need to be diagnosed at an early stage. In this paper, an optimized deep learning method is presented for multiple stomach disease classification. The proposed method works in a few important steps: preprocessing using the fusion of filtered images along with Ant Colony Optimization (ACO), deep transfer learning-based feature extraction, optimization of the deep extracted features using nature-inspired algorithms, and finally fusion of the optimal vectors and classification using a Multi-Layered Perceptron Neural Network (MLNN). In the feature extraction step, a pretrained Inception V3 is retrained on the selected stomach infection classes using deep transfer learning. The activation function is then applied to the Global Average Pool (GAP) layer for feature extraction. The extracted features are optimized through two different nature-inspired algorithms: Particle Swarm Optimization (PSO) with a dynamic fitness function and the Crow Search Algorithm (CSA). The outputs of both methods are fused by a maximal value approach, and the fused feature vector is classified by the MLNN. Two datasets are used to evaluate the proposed method, CUI Wah Stomach Diseases and a Combined dataset, and an average accuracy of 99.5% is achieved. A comparison with existing techniques shows that the proposed method performs significantly better.
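The abstract above does not define its "maximal value approach". One plausible reading is an element-wise maximum over two aligned feature vectors, sketched below. The function name `max_value_fusion` is hypothetical, and the assumption that the PSO-optimized and CSA-optimized vectors have equal length and aligned positions is ours, not the paper's.

```python
def max_value_fusion(features_a, features_b):
    """Fuse two aligned feature vectors by keeping the larger value per position.

    Assumption: both vectors have the same length and position i refers to
    the same feature in each; the source abstract does not confirm this.
    """
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    return [max(a, b) for a, b in zip(features_a, features_b)]


# Hypothetical optimized vectors from the two selection algorithms.
pso_vector = [0.2, 0.9, 0.1]
csa_vector = [0.5, 0.3, 0.1]
fused = max_value_fusion(pso_vector, csa_vector)
```

If the two selectors kept different feature subsets, the vectors would first need to be padded or aligned to a common index before this step applies.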
Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results in human action recognition under different conditions. The main idea of sparse representation classification is to construct a general classification scheme in which the training samples of each class serve as the dictionary for expressing the query sample, and the minimal reconstruction error indicates the corresponding class. However, learning a discriminative dictionary remains difficult. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Second, we construct a novel classification model that consists of a representation-constrained term and a coefficient incoherence term. Experimental results on benchmark datasets show that our modified model obtains competitive results in comparison with other state-of-the-art models.
Multi-object tracking (MOT) techniques have been increasingly applied in a diverse range of tasks. Unmanned aerial vehicles (UAVs) are one of the typical application scenarios. Due to the scene complexity and the low resolution of moving targets in UAV applications, it is difficult to extract target features and identify targets. To solve this problem, we propose a new re-identification (re-ID) network to extract association features for tracking in the association stage. Moreover, to reduce the complexity of the detection model, we perform lightweight optimization on it. Experimental results show that the proposed re-ID network can effectively reduce the number of identity switches and surpass current state-of-the-art algorithms. Meanwhile, the optimized detector increases speed by 27% owing to its lightweight design, which enables it to better meet the requirements of UAV tracking tasks.
The Deep Seismic Sounding (DSS) projects carried out since the 1970s in the lower Yangtze region and its neighboring area are reviewed in this paper, and the basic wave group features of the wide-angle reflection/refraction record sections and of the crustal structure are summarized. There are five clear wave groups in total on the record sections: the first arrival Pg, the reflection P1 from the bottom interface of the upper crust, the reflection P3 from the bottom interface of the middle crust, the strong reflection Pm from the Moho boundary, and the refraction Pn from the uppermost mantle. In general, these phases are easily and consistently traced and compared, although some first arrivals are delayed or arrive earlier than normal because of the shallow sedimentary cover or bedrock. In particular, in the Dabie Mountain region the seismic events of a few shot gathers always have weak reflection energy, are twisted, or exhibit disorganized waveforms, which could be attributed to abrupt variations in reflection depth, the broken Moho, and the discontinuity of the reflection boundary within the crust. The regional crustal structure is composed of the upper, middle and lower crust, of which the middle and lower layers can be divided into two weakly reflective layers. The crustal thickness of the North China and Yangtze platforms is 30-36 km, and the Moho exhibits a flat geometry despite some local uplifts. The average P-wave velocity in the lower crust beneath these two tectonic areas is 6.7 ± 0.3 km/s. Beneath the Dabieshan area, however, the crustal thickness is 32-41 km, and the Moho bends down sharply and shows an abrupt 4-7 km dislocation in the vertical direction. The average P-wave velocity in the lower crust beneath the Dabieshan area is 6.8 ± 0.2 km/s.
Identifying fruit disease manually is time-consuming, requires experts, and is expensive; thus, a computer-based automated system is widely needed. Fruit diseases affect not only the quality but also the quantity of the crop. Computer-based techniques make it possible to detect the disease early and treat the fruit. However, computer-based methods face several challenges, including low contrast, a lack of data for training a model, and inappropriate feature extraction for final classification. In this paper, we propose an automated framework for detecting apple fruit leaf diseases using a CNN and a hybrid optimization algorithm. Data augmentation is performed initially to balance the selected apple dataset. After that, two pre-trained deep models are fine-tuned and trained using transfer learning. Then, a fusion technique named Parallel Correlation Threshold (PCT) is proposed. The fused feature vector is optimized in the next step using a hybrid optimization algorithm. The selected features are finally classified using machine learning algorithms. Four different experiments were carried out on the augmented Plant Village dataset and yielded a best accuracy of 99.8%. The accuracy of the proposed framework is also compared with that of several neural nets, and it outperforms them all.
Gait recognition is an active research area that uses a person's walking pattern to identify the subject correctly. Human Gait Recognition (HGR) is performed without any cooperation from the individual. However, in practice, it remains a challenging task under diverse walking sequences because of covariant factors such as normal walking and walking while wearing a coat. Over the years, researchers have worked on identifying subjects using different techniques, but there is still room for improvement in accuracy because of these covariant factors. This paper proposes an automated model-free framework for human gait recognition. The proposed method has a few critical steps. First, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. The second step involves training a fine-tuned pre-trained MobileNetV2 model on both the original and the optical flow cropped frames; the training is conducted using static hyperparameters. The third step proposes a fusion technique known as normal distribution serial fusion. In the fourth step, an improved optimization algorithm is applied to select the best features, which are then classified using a Bi-Layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experiments, yielding average accuracies of 99.6%, 91.6%, and 95.02%, respectively. The proposed framework achieved improved accuracy compared with other methods.
Artificial intelligence aids for healthcare have received a great deal of attention. Approximately one million patients with gastrointestinal diseases have been diagnosed via wireless capsule endoscopy (WCE). Early diagnosis facilitates appropriate treatment and saves lives. Deep learning-based techniques have been used to identify gastrointestinal ulcers, bleeding sites, and polyps. However, small lesions may be misclassified. We developed a deep learning-based best-feature method to classify various stomach diseases evident in WCE images. Initially, we use hybrid contrast enhancement to distinguish diseased from normal regions. Then, a pretrained model is fine-tuned, and further training is done via transfer learning. Deep features are extracted from the last two layers and fused using a vector length-based approach. We improve the genetic algorithm using a fitness function and kurtosis to select optimal features that are graded by a classifier. We evaluate a database containing 24,000 WCE images of ulcers, bleeding sites, polyps, and healthy tissue. The cubic support vector machine classifier was optimal; the average accuracy was 99%.
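The abstract above mentions kurtosis as part of the improved genetic algorithm but does not say how it enters the selection. The sketch below only shows how the statistic itself can be computed for one feature column. It uses the population (non-excess) definition, which is an assumption, and the function name `kurtosis` is hypothetical.

```python
def kurtosis(values):
    """Population kurtosis: fourth central moment divided by squared variance.

    A normal distribution gives a value near 3 under this definition.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant feature")
    return m4 / (m2 * m2)


# Hypothetical values of one deep feature across five images.
k = kurtosis([1, 2, 3, 4, 5])
```

A selection scheme could, for example, rank or threshold features by this value, but that usage is speculative and not stated in the source.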
Wind speed forecasting is important for wind energy forecasting. In the modern era, the increase in energy demand can be managed effectively by forecasting the wind speed accurately. The main objective of this research is to improve the performance of wind speed forecasting by handling uncertainty, the curse of dimensionality, overfitting, and non-linearity. The curse of dimensionality and overfitting are handled by using Boruta feature selection. The uncertainty and non-linearity are addressed by using the deep learning-based Bi-directional Long Short Term Memory (Bi-LSTM). In this paper, Bi-LSTM with Boruta feature selection, named BFS-Bi-LSTM, is proposed to improve the performance of wind speed forecasting. The model identifies relevant features for wind speed forecasting from the meteorological features using Boruta wrapper feature selection (BFS). The Bi-LSTM then predicts the wind speed by considering the wind speed from past and future time steps. The proposed BFS-Bi-LSTM model is compared against the Multilayer Perceptron (MLP), MLP with Boruta (BFS-MLP), Long Short Term Memory (LSTM), LSTM with Boruta (BFS-LSTM), and Bi-LSTM in terms of Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Square Error (MSE), and R². The BFS-Bi-LSTM surpassed the other models, producing an RMSE of 0.784, an MAE of 0.530, an MSE of 0.615, and an R² of 0.8766. The experimental results show that the BFS-Bi-LSTM produces better forecasts than the other models.
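The four error measures reported above follow standard definitions, sketched here with plain Python. The function name `regression_metrics` and the sample values are illustrative only and are not taken from the paper's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, and R^2 for paired observations and forecasts."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Illustrative wind speeds (m/s): observed values and model forecasts.
metrics = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Because RMSE is the square root of MSE, the reported pair is internally consistent: 0.784 squared is approximately 0.615.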
In the healthcare sector, image classification is one of the crucial problems that affect the quality of output from the image processing domain. The purpose of image classification is to categorize different healthcare images under various class labels, which in turn helps in the detection and management of diseases. Magnetic Resonance Imaging (MRI) is one of the effective non-invasive strategies that generate a large and distinct number of tissue contrasts in every imaging modality. This technique is commonly used by healthcare professionals for Brain Tumor (BT) diagnosis. With recent advancements in Machine Learning (ML) and Deep Learning (DL) models, it is possible to detect a tumor from images automatically, using a computer-aided design. The current study focuses on the design of an automated Deep Learning-based BT Detection and Classification model using MRI images (DLBTDC-MRI). The proposed DLBTDC-MRI technique aims to detect and classify different stages of BT. It applies a median filtering technique to remove noise and enhance the quality of the MRI images. In addition, a morphological operations-based image segmentation approach is applied to determine the BT-affected regions in the brain MRI image. Moreover, a fusion of handcrafted and deep features using VGGNet is used to derive a valuable set of feature vectors. Finally, Artificial Fish Swarm Optimization (AFSO) with an Artificial Neural Network (ANN) model is used as a classifier to decide the presence of BT. To assess the BT classification performance of the proposed model, a comprehensive set of simulations was performed on a benchmark dataset and the results were validated under several measures.
Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution (HSI SR) has been a very attractive research topic in computer vision, drawing the attention of many researchers. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information, and cannot guarantee the efficient extraction of image information. In this paper, a multidimensional features network (MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial, spectral, and frequency features of HSI. Spatial features contain rich local details, spectral features contain the information in and correlation between spectral bands, and frequency features reflect the global information of the image and can be used to obtain the global context of HSI. The fusion of the three features can better guide image super-resolution and yield higher-quality high-resolution hyperspectral images. In MFNet, we use the frequency feature extraction module (FFEM) to extract the frequency feature. On this basis, a multidimensional features extraction module (MFEM) is designed to learn and fuse multidimensional features. Experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.
Existing multi-view deep subspace clustering methods aim to learn a unified representation from multi-view data, but the learned representation struggles to preserve the underlying structure hidden in the original samples, especially the high-order neighbor relationships between samples. To overcome these challenges, this paper proposes a novel multi-order neighborhood fusion based multi-view deep subspace clustering model. We integrate the multi-order proximity graph structures of different views into the self-expressive layer through a multi-order neighborhood fusion module. With this design, the multi-order Laplacian matrix supervises the learning of the view-consistent self-representation affinity matrix; we can then obtain an optimal global affinity matrix in which each connected node belongs to one cluster. In addition, a discriminative constraint between views is designed to further improve the clustering performance. A range of experiments on six public datasets demonstrates that the method performs better than other advanced multi-view clustering methods. The code is available at https://github.com/songzuolong/MNF-MDSC (accessed on 25 December 2024).
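The abstract above does not specify how its multi-order proximity graphs are built. A common construction, used here purely as an assumption, takes powers of an adjacency matrix: the k-th power counts walks of length k, so a weighted sum of powers mixes first-order and higher-order neighbor relationships. The names `matmul` and `multi_order_fusion` and the weights are hypothetical and are not taken from the linked repository.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def multi_order_fusion(adjacency, weights):
    """Return sum_k weights[k] * adjacency^(k+1) for k = 0 .. len(weights)-1."""
    n = len(adjacency)
    fused = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adjacency]  # starts at the first-order graph
    for w in weights:
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * power[i][j]
        power = matmul(power, adjacency)   # move to the next order
    return fused


# Path graph 0 - 1 - 2: nodes 0 and 2 are only second-order neighbors.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
fused = multi_order_fusion(adjacency, weights=[1.0, 0.5])
```

In the path-graph example, nodes 0 and 2 share no edge, yet the fused matrix gives them a nonzero affinity because they are connected through node 1.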
Accurate and robust navigation in complex surgical environments is crucial for bronchoscopic surgeries. This study proposes a bronchoscopic lumen feature matching network (BLFM-Net) based on deep learning to address the challenges of image noise, anatomical complexity, and stringent real-time requirements. The BLFM-Net enhances bronchoscopic image processing by integrating several functional modules. The FFA-Net preprocessing module mitigates image fogging and improves visual clarity for subsequent processing. The feature extraction module derives multi-dimensional features, such as centroids, area, and shape descriptors, from the dehazed images. The Faster RCNN object detection module detects bronchial regions of interest and generates bounding boxes to localize key areas. The feature matching module accelerates the process by combining detection boxes, extracted features, and a KD-Tree (K-Dimensional Tree)-based algorithm, ensuring efficient and accurate regional feature associations. The BLFM-Net was evaluated on 5212 bronchoscopic images and demonstrated superior performance compared with traditional and other deep learning-based image matching methods. It achieved real-time matching with an average frame time of 6 ms and a matching accuracy of over 96%. The method remained robust under challenging conditions including frame dropping (0, 5, 10, 20), shadowed regions, and variable lighting, maintaining an accuracy above 94% even when 20 frames were dropped. The BLFM-Net shows improved accuracy, real-time performance, and reliability, making it a valuable tool for bronchoscopic surgeries.
Acute lymphoblastic leukemia (ALL) is characterized by overgrowth of immature lymphoid cells in the bone marrow at the expense of normal hematopoiesis. Early and correct diagnosis of this malignancy is a high priority; however, manual observation of the blood smear is very time-consuming and requires labor and expertise. Transfer learning in deep neural networks is of growing importance for intricate medical tasks such as medical imaging. Our work proposes a novel ensemble architecture that combines a Vision Transformer and EfficientNetV2. This approach fuses deep and spatial features to optimize discriminative power by selecting features accurately, reducing redundancy, and promoting sparsity. In addition to the ensemble architecture, advanced feature selection is performed by the Frog-Snake Prey-Predation Relationship Optimization (FSRO) algorithm. FSRO prioritizes the most relevant features while dynamically reducing redundant and noisy data, thereby improving the efficiency and accuracy of the classification model. We compared our feature selection method against state-of-the-art techniques and recorded an accuracy of 94.88%, a recall of 94.38%, a precision of 96.18%, and an F1-score of 95.63%. These figures are better than those of classical deep learning methods. Although our dataset, collected from four different hospitals, is non-standard and heterogeneous, which makes the analysis more challenging, and although the approach is computationally expensive, it proves diagnostically superior in cancer detection. Source codes and datasets are available on GitHub.
Heart disease prediction is a critical issue in healthcare, where accurate early diagnosis can save lives and reduce healthcare costs. The problem is inherently complex because of the high dimensionality of medical data, irrelevant or redundant features, and the variability in risk factors such as age, lifestyle, and medical history. These challenges often lead to inefficient and less accurate models. Traditional prediction methodologies are limited in handling large feature sets and optimizing classification performance, which can result in overfitting, poor generalization, and high computational cost. This work proposes a novel classification model for heart disease prediction that addresses these challenges by integrating feature selection through a Genetic Algorithm (GA) with an ensemble deep learning approach optimized using the Tunicate Swarm Algorithm (TSA). The GA selects the most relevant features, reducing dimensionality and improving model efficiency. The selected features are then used to train an ensemble of deep learning models, where the TSA optimizes the weight of each model in the ensemble to enhance prediction accuracy. This hybrid approach addresses key challenges in the field, such as high dimensionality, redundant features, and classification performance, by introducing an efficient feature selection mechanism and optimizing the weighting of the deep learning models in the ensemble. These enhancements result in a model that achieves superior accuracy, generalization, and efficiency compared with traditional methods. Specifically, the proposed model achieved an accuracy of 97.5%, a sensitivity of 97.2%, and a specificity of 97.8%. Additionally, with a 60-40 data split and 5-fold cross-validation, the model showed a significant reduction in training time (90 s), memory consumption (950 MB), and CPU usage (80%), highlighting its effectiveness in processing large, complex medical datasets for heart disease prediction.
Image captioning, the task of generating descriptive sentences for images, has advanced significantly with the integration of semantic information. However, traditional models still rely on static visual features that do not evolve with the changing linguistic context, which can hinder the ability to form meaningful connections between the image and the generated captions. This limitation often leads to captions that are less accurate or descriptive. In this paper, we propose a novel approach to enhance image captioning by introducing dynamic interactions where visual features continuously adapt to the evolving linguistic context. Our model strengthens the alignment between visual and linguistic elements, resulting in more coherent and contextually appropriate captions. Specifically, we introduce two innovative modules: the Visual Weighting Module (VWM) and the Enhanced Features Attention Module (EFAM). The VWM adjusts visual features using partial attention, enabling dynamic reweighting of the visual inputs, while the EFAM further refines these features to improve their relevance to the generated caption. By continuously adjusting visual features in response to the linguistic context, our model bridges the gap between static visual features and dynamic language generation. We demonstrate the effectiveness of our approach through experiments on the MS-COCO dataset, where our method outperforms state-of-the-art techniques in terms of caption quality and contextual relevance. Our results show that dynamic visual-linguistic alignment significantly enhances image captioning performance.
Because traditional sparse representation models lose accuracy under multiple complex environmental factors, this study focuses on improving feature extraction and model construction. First, the convolutional neural network (CNN) features of the face are extracted by a trained deep learning network. Next, steady-state and dynamic classifiers for face recognition are constructed based on the CNN features and Haar features, respectively; two-stage sparse representation is introduced when constructing the steady-state classifier, and highly reliable feature templates are dynamically selected as alternative templates from the sparse representation template dictionary built from the CNN features. Finally, the face recognition result is given by combining the classification results of the steady-state and dynamic classifiers. On this basis, the feature weights of the steady-state classifier template are adjusted in real time and the dictionary set is dynamically updated to reduce the probability of irrelevant features entering the dictionary set. The average recognition accuracy of this method is 94.45% on the CMU PIE face database and 96.58% on the AR face database, a significant improvement over traditional face recognition methods.
Funding: King Saud University, Grant/Award Number: RSP2024R157.
Abstract: Biometric characteristics have played a vital role in security over the last few years. Human gait classification in video sequences is an important biometric attribute and is used for security purposes. A new framework for human gait classification in video sequences is proposed, using deep learning (DL) fusion and posterior probability-based moth-flame optimization (MFO). In the first step, the video frames are resized, and two pre-trained lightweight DL models, EfficientNetB0 and MobileNetV2, are fine-tuned. Both models are selected based on their top-5 accuracy and small number of parameters. Later, both models are trained through deep transfer learning, and the extracted deep features are fused using a voting scheme. In the last step, the authors develop a posterior probability-based MFO feature selection algorithm to select the best features. The selected features are classified using several supervised learning methods. The publicly available CASIA-B dataset was employed for the experiments. On this dataset, the authors selected six angles, 0°, 18°, 90°, 108°, 162°, and 180°, and obtained average accuracies of 96.9%, 95.7%, 86.8%, 90.0%, 95.1%, and 99.7%, respectively. Results demonstrate a comparable improvement in accuracy and a significant reduction in computational time compared with recent state-of-the-art techniques.
Funding: Supported by grants from the North China University of Technology Research Start-Up Fund (11005136024XN147-14) and (110051360024XN151-97); Guangzhou Development Zone Science and Technology Project (2023GH02); the National Key R&D Program of China (2021YFE0201100 and 2022YFA1103401 to Juntao Gao); National Natural Science Foundation of China (981890991 to Juntao Gao); Beijing Municipal Natural Science Foundation (Z200021 to Juntao Gao); CAS Interdisciplinary Innovation Team (JCTD-2020-04 to Juntao Gao); 0032/2022/A, by Macao FDCT; and MYRG2022-00271-FST.
Abstract: Hematoxylin and Eosin (H&E) images, widely used in digital pathology, often pose challenges due to their limited color richness, which hinders the differentiation of subtle cell features crucial for accurate classification. Enhancing the visibility of these elusive cell features helps train robust deep-learning models. However, the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community. To address this challenge, we introduce Salient Features Guided Augmentation (SFGA), an approach that strategically integrates machine learning and image processing. SFGA uses machine learning algorithms to identify crucial features within cell images, then maps these features to appropriate image processing techniques to enhance training images. By emphasizing salient features and aligning them with corresponding image processing methods, SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks. Our research undertakes a series of experiments, each exploring the performance of different datasets and data enhancement techniques in classifying cell types, highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics. Specifically, SFGA focuses on identifying tumor cells from tissue for extranodal extension detection, with the SFGA-enhanced dataset showing notable advantages in accuracy. We conducted a preliminary study of five experiments, among which the accuracy of the pleomorphism experiment improved significantly, from 50.81% to 95.15%. The accuracy of the other four experiments also increased, with improvements ranging from 3 to 43 percentage points. Our preliminary study shows the possibility of enhancing the diagnostic accuracy of deep learning models and proposes a systematic approach that could enhance cancer diagnosis, serving as a first step in using SFGA in medical image enhancement.
Abstract: Human Activity Recognition (HAR) has become increasingly critical in civic surveillance, medical care monitoring, and institutional protection. Current deep learning-based approaches often suffer from excessive computational complexity, limited generalizability under varying conditions, and compromised real-time performance. To counter these limitations, this paper introduces an Active Learning-aided Heuristic Deep Spatio-Textural Ensemble Learning (ALH-DSEL) framework. The model first identifies keyframes from the surveillance videos with a Multi-Constraint Active Learning (MCAL) approach, using features extracted from DenseNet121. The frames are then segmented with a Fuzzy C-Means clustering algorithm optimized with Firefly to identify areas of interest. A deep ensemble feature extractor, comprising DenseNet121, EfficientNet-B7, MobileNet, and GLCM, extracts varied spatial and textural features. The fused features are refined through PCA and Min-Max normalization and classified by a maximum voting ensemble of RF, AdaBoost, and XGBoost. The experimental results show that ALH-DSEL provides higher accuracy, precision, recall, and F1-score, validating its superiority for real-time HAR in surveillance scenarios.
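The final stage described above, a maximum voting ensemble, can be illustrated with a minimal sketch. The labels, the three prediction lists, and the tie-breaking rule below are assumptions made for illustration only; the abstract does not specify how ties are resolved or what the classifiers output.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label chosen by the most classifiers for one sample.

    Ties are broken in favor of the label that appears first in
    `predictions` (an assumed rule, not taken from the abstract).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical per-frame outputs of three classifiers (e.g. RF, AdaBoost, XGBoost).
rf_pred = ["walk", "run", "sit"]
ada_pred = ["walk", "walk", "sit"]
xgb_pred = ["run", "run", "stand"]

# Vote sample by sample across the three classifiers.
fused = [majority_vote(votes) for votes in zip(rf_pred, ada_pred, xgb_pred)]
print(fused)
```

Hard voting of this kind needs only the predicted labels, so the base classifiers can be trained and swapped independently.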
Funding: The National Basic Research Program of China (973 Program, 2011CB201100), "Complex hydrocarbon accumulation mechanism and enrichment regularities of deep superimposed basins in Western China"; National Natural Science Foundation of China (U1262205), under the guidance of related department heads and experts.
Abstract: As petroleum exploration advances and most of the oil-gas reservoirs in shallow layers have been explored, petroleum exploration has started to move toward deep basins, which has become an inevitable choice. In this paper, the petroleum geology features and research progress on oil-gas reservoirs in deep petroliferous basins across the world are characterized by using the latest results of worldwide deep petroleum exploration. Research has demonstrated that deep petroleum shows ten major geological features. (1) While oil-gas reservoirs have been discovered in many different types of deep petroliferous basins, most have been discovered in low heat flux deep basins. (2) Many types of petroliferous traps are developed in deep basins, and tight oil-gas reservoirs in deep basin traps are arousing increasing attention. (3) Deep petroleum normally has more natural gas than liquid oil, and the natural gas ratio increases with the burial depth. (4) The residual organic matter in deep source rocks decreases, but the hydrocarbon expulsion rate and efficiency increase with the burial depth. (5) There are many types of rocks in deep hydrocarbon reservoirs, and most are clastic rocks and carbonates. (6) The ages of deep hydrocarbon reservoirs differ widely, but those recently discovered are predominantly Paleogene and Upper Paleozoic. (7) The porosity and permeability of deep hydrocarbon reservoirs differ widely, but they vary in a regular way with lithology and burial depth. (8) The temperatures of deep oil-gas reservoirs differ widely, but they typically vary with the burial depth and basin geothermal gradient. (9) The pressures of deep oil-gas reservoirs differ significantly, but they typically vary with burial depth, genesis, and evolution period. (10) Deep oil-gas reservoirs may exist with or without a cap, and those without a cap are typically of unconventional genesis.
Over the past decade, six major steps have been made in the understanding of deep hydrocarbon reservoir formation. (1) Deep petroleum in petroliferous basins has multiple sources and many different genetic mechanisms. (2) There are high-porosity, high-permeability reservoirs in deep basins, the formation of which is associated with tectonic events and subsurface fluid movement. (3) Capillary pressure differences inside and outside the target reservoir are the principal driving force of hydrocarbon enrichment in deep basins. (4) There are three dynamic boundaries for deep oil-gas reservoirs: a buoyancy-controlled threshold, hydrocarbon accumulation limits, and the upper limit of hydrocarbon generation. (5) The formation and distribution of deep hydrocarbon reservoirs are controlled by free, limited, and bound fluid dynamic fields. (6) Tight conventional, tight deep, tight superimposed, and related reconstructed hydrocarbon reservoirs formed in deep-limited fluid dynamic fields have great resource potential and vast scope for exploration. Compared with middle-shallow strata, the petroleum geology and accumulation in deep basins are more complex, overlapping the features of basin evolution in different stages. We recommend that further study pay more attention to four aspects: (1) identification of deep petroleum sources and evaluation of their relative contributions; (2) preservation conditions and genetic mechanisms of deep high-quality reservoirs with high permeability and high porosity; (3) facies features and transformation of deep petroleum and their potential distribution; and (4) economic feasibility evaluation of deep tight petroleum exploration and development.
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and the Soonchunhyang University Research Fund.
Abstract: In medical image processing, stomach cancer is one of the most important cancers that need to be diagnosed at an early stage. In this paper, an optimized deep learning method is presented for multiple stomach disease classification. The proposed method works in a few important steps: preprocessing using the fusion of filtered images along with Ant Colony Optimization (ACO), deep transfer learning-based feature extraction, optimization of the deep extracted features using nature-inspired algorithms, and finally fusion of the optimal vectors and classification using a Multi-Layered Perceptron Neural Network (MLNN). In the feature extraction step, a pretrained Inception V3 is retrained on the selected stomach infection classes using deep transfer learning. Later on, the activation function is applied to the Global Average Pool (GAP) layer for feature extraction. The extracted features are then optimized through two different nature-inspired algorithms: Particle Swarm Optimization (PSO) with a dynamic fitness function and the Crow Search Algorithm (CSA). The outputs of both methods are fused by a maximal value approach, and the fused feature vector is classified by the MLNN. Two datasets are used to evaluate the proposed method, CUI Wah Stomach Diseases and a Combined dataset, on which it achieved an average accuracy of 99.5%. A comparison with existing techniques shows that the proposed method achieves significant performance.
Funding: This research was funded by the National Natural Science Foundation of China (21878124, 31771680 and 61773182).
Abstract: Human action recognition in complex environments is challenging. Recently, sparse representation has achieved excellent results in human action recognition under different conditions. The main idea of sparse representation classification is to construct a general classification scheme in which the training samples of each class serve as a dictionary to express the query sample, and the minimal reconstruction error indicates the corresponding class. However, learning a discriminative dictionary remains difficult. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Second, we construct a novel classification model that consists of a representation-constrained term and a coefficients incoherence term. Experimental results on benchmark datasets show that our modified model obtains competitive results in comparison to other state-of-the-art models.
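The decision rule mentioned above, picking the class whose dictionary reconstructs the query with minimal error, can be sketched as follows. This is a deliberately simplified stand-in, not the paper's model: each query is approximated with a single training sample (a 1-sparse code) per class, and the class names and feature vectors are hypothetical.

```python
import math

def residual_one_atom(query, atom):
    """Reconstruction error of `query` using the best scalar multiple of one atom."""
    dot = sum(q * a for q, a in zip(query, atom))
    norm_sq = sum(a * a for a in atom)
    coef = dot / norm_sq if norm_sq else 0.0
    return math.sqrt(sum((q - coef * a) ** 2 for q, a in zip(query, atom)))

def classify(query, dictionary):
    """Pick the class whose training samples reconstruct the query best.

    `dictionary` maps a class label to its list of training vectors (atoms).
    Returns the winning label and the per-class reconstruction errors.
    """
    errors = {
        label: min(residual_one_atom(query, atom) for atom in atoms)
        for label, atoms in dictionary.items()
    }
    return min(errors, key=errors.get), errors

# Hypothetical 3-dimensional features for two action classes.
dictionary = {
    "wave": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "jump": [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]],
}
label, errors = classify([0.8, 0.1, 0.0], dictionary)
print(label, errors)
```

A full sparse representation classifier would solve a sparsity-regularized least-squares problem over all atoms jointly; the per-class residual comparison at the end is the same.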
Funding: Supported by the Research Foundation of Nanjing University of Posts and Telecommunications (No. NY219076).
Abstract: Multi-object tracking (MOT) techniques have been increasingly applied in a diverse range of tasks. Unmanned aerial vehicle (UAV) tracking is one of the typical application scenarios. Due to the scene complexity and the low resolution of moving targets in UAV applications, it is difficult to extract target features and identify targets. To solve this problem, we propose a new re-identification (re-ID) network to extract association features for tracking in the association stage. Moreover, to reduce the complexity of the detection model, we perform lightweight optimization on it. Experimental results show that the proposed re-ID network can effectively reduce the number of identity switches and surpass current state-of-the-art algorithms. Meanwhile, the optimized detector increases speed by 27% owing to its lightweight design, which enables it to better meet the requirements of UAV tracking tasks.
Funding: Funded by the Special Public Welfare Industry Research of China Earthquake Administration (201408023), Academician Chen Yong Workstation Special Funds of Yunnan Province, and Natural Science Foundation of China (41374062, 41174075).
Abstract: This paper reviews the Deep Seismic Sounding (DSS) projects carried out since the 1970s in the lower Yangtze region and its neighboring area, and summarizes the basic wave group features of the wide-angle reflection/refraction record sections and of the crustal structure. In total, five clear wave groups appear on the record sections: the first arrival Pg, the reflection P1 from the bottom interface of the upper crust, the reflection P3 from the bottom interface of the middle crust, the strong reflection Pm from the Moho boundary, and the refraction Pn from the uppermost mantle. In general, these phases are easily and consistently traced and compared, despite some first arrivals being delayed or arriving earlier than normal due to the shallow sedimentary cover or bedrock. In particular, in the Dabie Mountain region the seismic events of a few gathered shots always have weak reflection energy, are twisted, or exhibit disorganized waveforms, which could be attributed to abrupt variations in reflection depth, the broken Moho, and the discontinuity of the reflection boundary within the crust. The regional crustal structure is composed of the upper, middle, and lower crust, of which the middle and lower layers can be divided into two weak-reflection layers. The crustal thickness of the North China and Yangtze platforms is 30-36 km, and the Moho exhibits a flat geometry despite some local uplifts. The average P-wave velocity in the lower crust beneath these two tectonic areas is 6.7 ± 0.3 km/s. Beneath the Dabieshan area, however, the crustal thickness is 32-41 km, and the Moho bends down sharply and takes an abrupt 4-7 km dislocation in the vertical direction. The average P-wave velocity in the lower crust beneath the Dabieshan area is 6.8 ± 0.2 km/s.
Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Identifying fruit disease manually is time-consuming, requires experts, and is expensive; thus, a computer-based automated system is widely required. Fruit diseases affect not only the quality but also the quantity. With computer-based techniques, it is possible to detect the disease early and cure the fruits. However, computer-based methods face several challenges, including low contrast, a lack of datasets for training a model, and inappropriate feature extraction for final classification. In this paper, we propose an automated framework for detecting apple fruit leaf diseases using a CNN and a hybrid optimization algorithm. Data augmentation is performed initially to balance the selected apple dataset. After that, two pre-trained deep models are fine-tuned and trained using transfer learning. Then, a fusion technique named Parallel Correlation Threshold (PCT) is proposed. The fused feature vector is optimized in the next step using a hybrid optimization algorithm. The selected features are finally classified using machine learning algorithms. Four different experiments were carried out on the augmented Plant Village dataset and yielded a best accuracy of 99.8%. The accuracy of the proposed framework is also compared to that of several neural nets, and it outperforms them all.
Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Gait recognition is an active research area that uses a person's walking pattern to identify the subject correctly. Human Gait Recognition (HGR) is performed without any cooperation from the individual. However, in practice, it remains a challenging task under diverse walking sequences due to covariant factors such as normal walking and walking while wearing a coat. Researchers, over the years, have worked on successfully identifying subjects using different techniques, but there is still room for improvement in accuracy due to these covariant factors. This paper proposes an automated model-free framework for human gait recognition. There are a few critical steps in the proposed method. First, optical flow-based motion region estimation and dynamic coordinates-based cropping are performed. The second step involves training a fine-tuned pre-trained MobileNetV2 model on both original and optical flow cropped frames; the training is conducted using static hyperparameters. The third step proposes a fusion technique known as normal distribution serial fusion. In the fourth step, a better optimization algorithm is applied to select the best features, which are then classified using a Bi-Layered neural network. Three publicly available datasets, CASIA A, CASIA B, and CASIA C, were used in the experiments, yielding average accuracies of 99.6%, 91.6%, and 95.02%, respectively. The proposed framework achieved improved accuracy compared to the other methods.
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and the Soonchunhyang University Research Fund.
Abstract: Artificial intelligence aids for healthcare have received a great deal of attention. Approximately one million patients with gastrointestinal diseases have been diagnosed via wireless capsule endoscopy (WCE). Early diagnosis facilitates appropriate treatment and saves lives. Deep learning-based techniques have been used to identify gastrointestinal ulcers, bleeding sites, and polyps. However, small lesions may be misclassified. We developed a deep learning-based best-feature method to classify various stomach diseases evident in WCE images. Initially, we use hybrid contrast enhancement to distinguish diseased from normal regions. Then, a pretrained model is fine-tuned, and further training is done via transfer learning. Deep features are extracted from the last two layers and fused using a vector length-based approach. We improve the genetic algorithm using a fitness function and kurtosis to select optimal features that are graded by a classifier. We evaluate a database containing 24,000 WCE images of ulcers, bleeding sites, polyps, and healthy tissue. The cubic support vector machine classifier was optimal; the average accuracy was 99%.
Abstract: Wind speed forecasting is important for wind energy forecasting. In the modern era, the increase in energy demand can be managed effectively by forecasting the wind speed accurately. The main objective of this research is to improve the performance of wind speed forecasting by handling uncertainty, the curse of dimensionality, overfitting, and non-linearity issues. The curse of dimensionality and overfitting issues are handled by using Boruta feature selection. The uncertainty and non-linearity issues are addressed by using the deep learning-based Bi-directional Long Short Term Memory (Bi-LSTM). In this paper, Bi-LSTM with Boruta feature selection, named BFS-Bi-LSTM, is proposed to improve the performance of wind speed forecasting. The model identifies relevant features for wind speed forecasting from the meteorological features using Boruta wrapper feature selection (BFS). The Bi-LSTM then predicts the wind speed by considering the wind speed from past and future time steps. The proposed BFS-Bi-LSTM model is compared against Multilayer Perceptron (MLP), MLP with Boruta (BFS-MLP), Long Short Term Memory (LSTM), LSTM with Boruta (BFS-LSTM), and Bi-LSTM in terms of Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Square Error (MSE), and R2. The BFS-Bi-LSTM surpassed the other models, producing an RMSE of 0.784, MAE of 0.530, MSE of 0.615, and R2 of 0.8766. The experimental results show that the BFS-Bi-LSTM produced better forecasting results than the others.
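The four error measures named above have standard definitions, sketched below for reference. The function name and the sample wind speeds are hypothetical and are not taken from the paper's data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, MSE and R2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "MSE": mse,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical observed and forecast wind speeds (m/s).
observed = [3.0, 5.0, 7.0, 9.0]
forecast = [2.5, 5.5, 7.0, 8.0]
metrics = regression_metrics(observed, forecast)
print(metrics)
```

Because RMSE is the square root of MSE, the two reported figures can be cross-checked: 0.784 squared is about 0.615, which matches the reported MSE.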
Funding: Supported through the Annual Funding track by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Project No. AN000684].
Abstract: In the healthcare sector, image classification is one of the crucial problems that affect the quality of output from the image processing domain. The purpose of image classification is to categorize different healthcare images under various class labels, which in turn helps in the detection and management of diseases. Magnetic Resonance Imaging (MRI) is one of the effective non-invasive strategies that generate a huge and distinct number of tissue contrasts in every imaging modality. This technique is commonly utilized by healthcare professionals for Brain Tumor (BT) diagnosis. With recent advancements in Machine Learning (ML) and Deep Learning (DL) models, it is possible to detect tumors from images automatically using computer-aided design. The current study focuses on the design of an automated Deep Learning-based BT Detection and Classification model using MRI images (DLBTDC-MRI). The proposed DLBTDC-MRI technique aims at detecting and classifying different stages of BT. It involves a median filtering technique to remove noise and enhance the quality of MRI images. Besides, a morphological operations-based image segmentation approach is applied to determine the BT-affected regions in the brain MRI image. Moreover, a fusion of handcrafted deep features using VGGNet is utilized to derive a valuable set of feature vectors. Finally, Artificial Fish Swarm Optimization (AFSO) with an Artificial Neural Network (ANN) model is utilized as a classifier to decide the presence of BT. To assess the enhanced BT classification performance of the proposed model, a comprehensive set of simulations was performed on a benchmark dataset and the results were validated under several measures.
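The median filtering step used for noise removal can be sketched in a few lines. This is a generic illustration, assuming a grayscale image stored as a list of rows and a window that is clipped at the borders; the abstract does not state the window size or border handling used in the paper.

```python
from statistics import median

def median_filter(image, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighborhood.

    `image` is a list of rows of numeric intensities. Near the borders the
    window is clipped to the image (an assumed border policy).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = median(window)
    return out

# Hypothetical 3x3 patch with one impulse-noise pixel in the center.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
clean = median_filter(noisy)
print(clean)
```

Unlike a mean filter, the median discards the outlier entirely instead of smearing it into neighboring pixels, which is why it is a common choice for impulse noise.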
Funding: Supported by the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. GK249909299001-036), the National Key Research and Development Program of China (No. 2023YFB4502803), and the Zhejiang Provincial Natural Science Foundation of China (No. LDT23F01014F01).
Abstract: Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution (HSI SR) has been a very attractive research topic in computer vision, attracting the attention of many researchers. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information, and cannot guarantee the efficient extraction of image information. In this paper, a multidimensional features network (MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial, spectral, and frequency multidimensional features of HSI. Spatial features contain rich local details, spectral features contain the information and correlation between spectral bands, and frequency features reflect the global information of the image and can be used to obtain the global context of HSI. The fusion of the three features can better guide image super-resolution to obtain higher-quality high-resolution hyperspectral images. In MFNet, we use the frequency feature extraction module (FFEM) to extract the frequency feature. On this basis, a multidimensional features extraction module (MFEM) is designed to learn and fuse multidimensional features. In addition, experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.
Funding: Supported by the National Key R&D Program of China (2023YFC3304600).
Abstract: Existing multi-view deep subspace clustering methods aim to learn a unified representation from multi-view data, but the learned representation struggles to preserve the underlying structure hidden in the original samples, especially the high-order neighbor relationships between samples. To overcome this challenge, this paper proposes a novel multi-order neighborhood fusion based multi-view deep subspace clustering model. We integrate the multi-order proximity graph structures of different views into the self-expressive layer through a multi-order neighborhood fusion module. With this design, the multi-order Laplacian matrix supervises the learning of the view-consistent self-representation affinity matrix; we can then obtain an optimal global affinity matrix where each connected node belongs to one cluster. In addition, a discriminative constraint between views is designed to further improve the clustering performance. A range of experiments on six public datasets demonstrates that the method performs better than other advanced multi-view clustering methods. The code is available at https://github.com/songzuolong/MNF-MDSC (accessed on 25 December 2024).
Funding: Funded by the National Natural Science Foundation of China (Grant No. 52175028).
Abstract: Accurate and robust navigation in complex surgical environments is crucial for bronchoscopic surgeries. This study proposes a bronchoscopic lumen feature matching network (BLFM-Net) based on deep learning to address the challenges of image noise, anatomical complexity, and stringent real-time requirements. The BLFM-Net enhances bronchoscopic image processing by integrating several functional modules. The FFA-Net preprocessing module mitigates image fogging and improves visual clarity for subsequent processing. The feature extraction module derives multi-dimensional features, such as centroids, area, and shape descriptors, from dehazed images. The Faster RCNN object detection module detects bronchial regions of interest and generates bounding boxes to localize key areas. The feature matching module accelerates the process by combining detection boxes, extracted features, and a KD-Tree (K-Dimensional Tree)-based algorithm, ensuring efficient and accurate regional feature associations. The BLFM-Net was evaluated on 5212 bronchoscopic images, demonstrating superior performance compared to traditional and other deep learning-based image matching methods. It achieved real-time matching with an average frame time of 6 ms and a matching accuracy of over 96%. The method remained robust under challenging conditions including frame dropping (0, 5, 10, 20), shadowed regions, and variable lighting, maintaining an accuracy above 94% even with a frame dropping of 20. This study presents BLFM-Net, a deep learning-based matching network designed to enhance and match bronchial features in bronchoscopic images. The BLFM-Net shows improved accuracy, real-time performance, and reliability, making it a valuable tool for bronchoscopic surgeries.
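To illustrate the KD-Tree-based association mentioned above, here is a minimal sketch of nearest-neighbor matching between lumen descriptors of two frames. The descriptor layout (normalized centroid x, centroid y, area), the lumen names, and all values are assumptions for illustration; the paper's actual features and matching rules are not given in the abstract.

```python
def build_kdtree(points, depth=0):
    """Build a KD-tree from a list of (vector, payload) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, target, best=None):
    """Return (squared distance, (vector, payload)) of the closest stored point."""
    if node is None:
        return best
    vec, _ = node["point"]
    d = sq_dist(vec, target)
    if best is None or d < best[0]:
        best = (d, node["point"])
    diff = target[node["axis"]] - vec[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    if diff * diff < best[0]:                 # other side may still hold a closer point
        best = nearest(far, target, best)
    return best

# Hypothetical descriptors from the previous frame: (centroid_x, centroid_y, area).
previous = [
    ((0.20, 0.30, 0.10), "lumen_A"),
    ((0.70, 0.40, 0.25), "lumen_B"),
    ((0.50, 0.80, 0.05), "lumen_C"),
]
tree = build_kdtree(previous)

# Match two descriptors detected in the current frame.
match_1 = nearest(tree, (0.68, 0.42, 0.24))[1][1]
match_2 = nearest(tree, (0.22, 0.28, 0.11))[1][1]
print(match_1, match_2)
```

The pruning test on the splitting axis is what lets the search skip whole subtrees, which is the reason a KD-tree can beat a linear scan when many regions are stored.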
Abstract: Acute lymphoblastic leukemia (ALL) is characterized by overgrowth of immature lymphoid cells in the bone marrow at the expense of normal hematopoiesis. One of the most prioritized tasks is the early and correct diagnosis of this malignancy; however, manual observation of the blood smear is very time-consuming and requires labor and expertise. Transfer learning in deep neural networks is of growing importance to intricate medical tasks such as medical imaging. Our work proposes an application of a novel ensemble architecture that combines Vision Transformer and EfficientNetV2. This approach fuses deep and spatial features to optimize discriminative power by selecting features accurately, reducing redundancy, and promoting sparsity. Besides the architecture of the ensemble, advanced feature selection is performed by the Frog-Snake Prey-Predation Relationship Optimization (FSRO) algorithm. FSRO prioritizes the most relevant features while dynamically reducing redundant and noisy data, hence improving the efficiency and accuracy of the classification model. We have compared our method for feature selection against state-of-the-art techniques and recorded an accuracy of 94.88%, a recall of 94.38%, a precision of 96.18%, and an F1-score of 95.63%. These figures are better than those of classical deep learning methods. Although our dataset, collected from four different hospitals, is non-standard and heterogeneous, which makes the analysis more challenging, and although the approach is computationally expensive, it proves diagnostically superior in cancer detection. Source codes and datasets are available on GitHub.
Abstract: Heart disease prediction is a critical issue in healthcare, where accurate early diagnosis can save lives and reduce healthcare costs. The problem is inherently complex due to the high dimensionality of medical data, irrelevant or redundant features, and the variability in risk factors such as age, lifestyle, and medical history. These challenges often lead to inefficient and less accurate models. Traditional prediction methodologies face limitations in effectively handling large feature sets and optimizing classification performance, which can result in overfitting, poor generalization, and high computational cost. This work proposes a novel classification model for heart disease prediction that addresses these challenges by integrating feature selection through a Genetic Algorithm (GA) with an ensemble deep learning approach optimized using the Tunicate Swarm Algorithm (TSA). GA selects the most relevant features, reducing dimensionality and improving model efficiency. The selected features are then used to train an ensemble of deep learning models, where the TSA optimizes the weight of each model in the ensemble to enhance prediction accuracy. This hybrid approach addresses key challenges in the field, such as high dimensionality, redundant features, and classification performance, by introducing an efficient feature selection mechanism and optimizing the weighting of deep learning models in the ensemble. These enhancements result in a model that achieves superior accuracy, generalization, and efficiency compared to traditional methods. The proposed model demonstrated notable advancements in both prediction accuracy and computational efficiency over traditional models. Specifically, it achieved an accuracy of 97.5%, a sensitivity of 97.2%, and a specificity of 97.8%. Additionally, with a 60-40 data split and 5-fold cross-validation, the model showed a significant reduction in training time (90 s), memory consumption (950 MB), and CPU usage (80%), highlighting its effectiveness in processing large, complex medical datasets for heart disease prediction.
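The pipeline in this abstract has three mechanical pieces that a short sketch can make concrete: a binary feature mask of the kind a GA evolves, a weighted combination of per-model probabilities whose weights an optimizer such as TSA would tune, and the sensitivity and specificity metrics it reports. The sketch below shows only these building blocks under assumed names. It does not implement the GA or TSA search itself, and the weights and data are illustrative values, not results from the paper.

```python
def apply_mask(row, mask):
    """Keep only the features whose mask bit is 1 (one GA chromosome)."""
    return [value for value, keep in zip(row, mask) if keep]


def weighted_vote(probabilities, weights):
    """Combine per-model positive-class probabilities using normalized weights."""
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical patient record and mask: keep features 0, 2 and 3.
    print(apply_mask([63, 1, 145, 233, 0], [1, 0, 1, 1, 0]))  # [63, 145, 233]

    # Three hypothetical models vote on one patient; weights are illustrative.
    score = weighted_vote([0.9, 0.4, 0.6], [0.5, 0.3, 0.2])
    print("positive" if score >= 0.5 else "negative")

    print(sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In a full system, the mask and the weight vector would each be the search variable of its optimizer, with validation accuracy (or a similar score) as the fitness function.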
Funding: Supported by the National Natural Science Foundation of China (Nos. U22A2034, 62177047), the High Caliber Foreign Experts Introduction Plan funded by MOST, and the Central South University Research Programme of Advanced Interdisciplinary Studies (No. 2023QYJC020).
Abstract: Image captioning, the task of generating descriptive sentences for images, has advanced significantly with the integration of semantic information. However, traditional models still rely on static visual features that do not evolve with the changing linguistic context, which can hinder the ability to form meaningful connections between the image and the generated captions. This limitation often leads to captions that are less accurate or descriptive. In this paper, we propose a novel approach to enhance image captioning by introducing dynamic interactions in which visual features continuously adapt to the evolving linguistic context. Our model strengthens the alignment between visual and linguistic elements, resulting in more coherent and contextually appropriate captions. Specifically, we introduce two innovative modules: the Visual Weighting Module (VWM) and the Enhanced Features Attention Module (EFAM). The VWM adjusts visual features using partial attention, enabling dynamic reweighting of the visual inputs, while the EFAM further refines these features to improve their relevance to the generated caption. By continuously adjusting visual features in response to the linguistic context, our model bridges the gap between static visual features and dynamic language generation. We demonstrate the effectiveness of our approach through experiments on the MS-COCO dataset, where our method outperforms state-of-the-art techniques in terms of caption quality and contextual relevance. Our results show that dynamic visual-linguistic alignment significantly enhances image captioning performance.
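The idea of reweighting visual features according to the current linguistic context can be sketched with plain dot-product attention. This is a generic illustration, not the paper's VWM or EFAM: the abstract does not specify their equations, so the scoring function, the function names, and the toy vectors below are all assumptions.

```python
from math import exp


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweight_visual_features(region_features, language_state):
    """Scale each image-region feature by its relevance to the language state.

    Relevance is the dot product between the region vector and the decoder's
    current hidden state (an assumed scoring rule), normalized with softmax.
    """
    scores = [
        sum(f * h for f, h in zip(region, language_state))
        for region in region_features
    ]
    weights = softmax(scores)
    reweighted = [
        [w * f for f in region] for w, region in zip(weights, region_features)
    ]
    return weights, reweighted


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]  # two toy region features
    # As the decoder state changes from step to step, the weights shift with it.
    for state in ([2.0, 0.0], [0.0, 2.0]):
        weights, _ = reweight_visual_features(regions, state)
        print([round(w, 3) for w in weights])
```

Because the weights are recomputed at every decoding step, the same image regions contribute differently as the caption unfolds, which is the contrast with static visual features that the abstract draws.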
Funding: Financial support from the Natural Science Foundation of Gansu Province (Nos. 22JR5RA217, 22JR5RA216), the Lanzhou Science and Technology Program (No. 2022-2-111), the Lanzhou University of Arts and Sciences School Innovation Fund Project (No. XJ2022000103), and the Lanzhou College of Arts and Sciences 2023 Talent Cultivation Quality Improvement Project (No. 2023-ZL-jxzz-03).
Abstract: Because the accuracy of traditional sparse representation models is low under the influence of multiple complex environmental factors, this study focuses on improving feature extraction and model construction. First, the convolutional neural network (CNN) features of the face are extracted by a trained deep learning network. Next, steady-state and dynamic classifiers for face recognition are constructed based on the CNN features and Haar features, respectively. Two-stage sparse representation is introduced when constructing the steady-state classifier, and feature templates with high reliability are dynamically selected as alternative templates from the sparse representation template dictionary built from the CNN features. Finally, the face recognition result is determined jointly from the classification results of the steady-state classifier and the dynamic classifier. On this basis, the feature weights of the steady-state classifier template are adjusted in real time and the dictionary set is dynamically updated to reduce the probability of irrelevant features entering the dictionary set. The average recognition accuracy of this method is 94.45% on the CMU PIE face database and 96.58% on the AR face database, a significant improvement over traditional face recognition methods.