The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to u...The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.展开更多
Gastrointestinal(GI)diseases,including gastric and colorectal cancers,signi-ficantly impact global health,necessitating accurate and efficient diagnostic me-thods.Endoscopic examination is the primary diagnostic tool;...Gastrointestinal(GI)diseases,including gastric and colorectal cancers,signi-ficantly impact global health,necessitating accurate and efficient diagnostic me-thods.Endoscopic examination is the primary diagnostic tool;however,its accu-racy is limited by operator dependency and interobserver variability.Advance-ments in deep learning,particularly convolutional neural networks(CNNs),show great potential for enhancing GI disease detection and classification.This review explores the application of CNNs in endoscopic imaging,focusing on polyp and tumor detection,disease classification,endoscopic ultrasound,and capsule endo-scopy analysis.We discuss the performance of CNN models with traditional dia-gnostic methods,highlighting their advantages in accuracy and real-time decision support.Despite promising results,challenges remain,including data availability,model interpretability,and clinical integration.Future directions include impro-ving model generalization,enhancing explainability,and conducting large-scale clinical trials.With continued advancements,CNN-powered artificial intelligence systems could revolutionize GI endoscopy by enhancing early disease detection,reducing diagnostic errors,and improving patient outcomes.展开更多
System design and optimization problems require large-scale chemical kinetic models. Pure kinetic models of naphtha pyrolysis need to solve a complete set of stiff ODEs and is therefore too computational expensive. On...System design and optimization problems require large-scale chemical kinetic models. Pure kinetic models of naphtha pyrolysis need to solve a complete set of stiff ODEs and is therefore too computational expensive. On the other hand, artificial neural networks that completely neglect the topology of the reaction networks often have poor generalization. In this paper, a framework is proposed for learning local representations from largescale chemical reaction networks. At first, the features of naphtha pyrolysis reactions are extracted by applying complex network characterization methods. The selected features are then used as inputs in convolutional architectures. Different CNN models are established and compared to optimize the neural network structure.After the pre-training and fine-tuning step, the ultimate CNN model reduces the computational cost of the previous kinetic model by over 300 times and predicts the yields of main products with the average error of less than 3%. The obtained results demonstrate the high efficiency of the proposed framework.展开更多
AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid.METHODS: A two-dimensional(2D) fully convolutional network for retinal segment...AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid.METHODS: A two-dimensional(2D) fully convolutional network for retinal segmentation was employed. In order to solve the category imbalance in retinal optical coherence tomography(OCT) images, the network parameters and loss function based on the 2D fully convolutional network were modified. For this network, the correlations of corresponding positions among adjacent images in space are ignored. Thus, we proposed a three-dimensional(3D) fully convolutional network for segmentation in the retinal OCT images.RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy rate is 99.56%, Kappa coefficient is 98.47%, and F1 score of retinal fluid is 95.50%. CONCLUSION: The OCT image segmentation algorithm based on deep learning is primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors, and can provide doctors with more accurate diagnostic data.展开更多
Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on ...Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on the standard convolutional auto-encoder.In this model,the parallel convolutional and deconvolutional kernels of different scales are used to extract the features from the input signal and reconstruct the input signal;then the feature map extracted by multi-scale convolutional kernels is used as the input of the classifier;and finally the parameters of the whole model are fine-tuned using labeled data.Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method.The results show that the model can achieve 99.75%,99.3%and 100%diagnostic accuracy,respectively.In addition,the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning,convolutional neural networks and a traditional convolutional auto-encoder.The final results show that the proposed model has a better recognition effect for rolling bearing fault data.展开更多
When carrying out calculations for turbulent flow simulation,one inevitably has to face the choice between accuracy and speed of calculations.In order to simultaneously obtain both a computationally efficient and more...When carrying out calculations for turbulent flow simulation,one inevitably has to face the choice between accuracy and speed of calculations.In order to simultaneously obtain both a computationally efficient and more accurate model,a surrogate model can be built on the basis of some fast special model and knowledge of previous calculations obtained by more accurate base models from various test bases or some results of serial calculations.The objective of this work is to construct a surrogate model which allows to improve the accuracy of turbulent calculations obtained by a special model on unstructured meshes.For this purpose,we use 1D Convolutional Neural Network(CNN)of the encoder-decoder architecture and reduce the problem to a single dimension by applying space-filling curves.Such an approach would have the benefit of being applicable to solutions obtained on unstructured meshes.In this work,a non-local approach is applied where entire flow fields obtained by the special and base models are used as input and ground truth output respectively.Spalart-Allmaras(SA)model and Near-wall Domain Decomposition(NDD)method for SA are taken as the base and special models respectively.The efficiency and accuracy of the obtained surrogate model are demonstrated in a case of supersonic flow over a compression corner with different values for angleαand Reynolds number Re.We conducted an investigation into interpolation and extrapolation by Re and also into interpolation byα.展开更多
Since chemical processes are highly non-linear and multiscale,it is vital to deeply mine the multiscale coupling relationships embedded in the massive process data for the prediction and anomaly tracing of crucial pro...Since chemical processes are highly non-linear and multiscale,it is vital to deeply mine the multiscale coupling relationships embedded in the massive process data for the prediction and anomaly tracing of crucial process parameters and production indicators.While the integrated method of adaptive signal decomposition combined with time series models could effectively predict process variables,it does have limitations in capturing the high-frequency detail of the operation state when applied to complex chemical processes.In light of this,a novel Multiscale Multi-radius Multi-step Convolutional Neural Network(Msrt Net)is proposed for mining spatiotemporal multiscale information.First,the industrial data from the Fluid Catalytic Cracking(FCC)process decomposition using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise(CEEMDAN)extract the multi-energy scale information of the feature subset.Then,convolution kernels with varying stride and padding structures are established to decouple the long-period operation process information encapsulated within the multi-energy scale data.Finally,a reconciliation network is trained to reconstruct the multiscale prediction results and obtain the final output.Msrt Net is initially assessed for its capability to untangle the spatiotemporal multiscale relationships among variables in the Tennessee Eastman Process(TEP).Subsequently,the performance of Msrt Net is evaluated in predicting product yield for a 2.80×10^(6) t/a FCC unit,taking diesel and gasoline yield as examples.In conclusion,Msrt Net can decouple and effectively extract spatiotemporal multiscale information from chemical process data and achieve a approximately reduction of 30%in prediction error compared to other time-series models.Furthermore,its robustness and transferability underscore its promising potential for broader applications.展开更多
Due to the strong background noise and the acquisition system noise,the useful characteristics are often difficult to be detected.To solve this problem,sparse coding captures a concise representation of the high-level...Due to the strong background noise and the acquisition system noise,the useful characteristics are often difficult to be detected.To solve this problem,sparse coding captures a concise representation of the high-level features in the signal using the underlying structure of the signal.Recently,an Online Convolutional Sparse Coding(OCSC)denoising algorithm has been proposed.However,it does not consider the structural characteristics of the signal,the sparsity of each iteration is not enough.Therefore,a threshold shrinkage algorithm considering neighborhood sparsity is proposed,and a training strategy from loose to tight is developed to further improve the denoising performance of the algorithm,called Variable Threshold Neighborhood Online Convolution Sparse Coding(VTNOCSC).By embedding the structural sparse threshold shrinkage operator into the process of solving the sparse coefficient and gradually approaching the optimal noise separation point in the training,the signal denoising performance of the algorithm is greatly improved.VTNOCSC is used to process the actual bearing fault signal,the noise interference is successfully reduced and the interest features are more evident.Compared with other existing methods,VTNOCSC has better denoising performance.展开更多
This paper proposes a novel grading method of apples,in an automated grading device that uses convolutional neural networks to extract the size,color,texture,and roundness of an apple.The developed machine learning me...This paper proposes a novel grading method of apples,in an automated grading device that uses convolutional neural networks to extract the size,color,texture,and roundness of an apple.The developed machine learning method uses the ability of learning representative features by means of a convolutional neural network(CNN),to determine suitable features of apples for the grading process.This information is fed into a one-to-one classifier that uses a support vector machine(SVM),instead of the softmax output layer of the CNN.In this manner,Yantai apples with similar shapes and low discrimination are graded using four different approaches.The fusion model using both CNN and SVM classifiers is much more accurate than the simple k-nearest neighbor(KNN),SVM,and CNN model when used separately for grading,and the learning ability and the generalization ability of the model is correspondingly increased by the combined method.Grading tests are carried out using the automated grading device that is developed in the present work.It is verified that the actual effect of apple grading using the combined CNN-SVM model is fast and accurate,which greatly reduces the manpower and labor costs of manual grading,and has important commercial prospects.展开更多
In recent years,deep convolution neural network has exhibited excellent performance in computer vision and has a far-reaching impact.Traditional plant taxonomic identification requires high expertise,which is time-con...In recent years,deep convolution neural network has exhibited excellent performance in computer vision and has a far-reaching impact.Traditional plant taxonomic identification requires high expertise,which is time-consuming.Most nature reserves have problems such as incomplete species surveys,inaccurate taxonomic identification,and untimely updating of status data.Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology to explore the best network model.Taking 24 typical desert plant species that are widely distributed in the nature reserves in Xinjiang Uygur Autonomous Region of China as the research objects,this study established an image database and select the optimal network model for the image recognition of desert plant species to provide decision support for fine management in the nature reserves in Xinjiang,such as species investigation and monitoring,by using deep learning.Since desert plant species were not included in the public dataset,the images used in this study were mainly obtained through field shooting and downloaded from the Plant Photo Bank of China(PPBC).After the sorting process and statistical analysis,a total of 2331 plant images were finally collected(2071 images from field collection and 260 images from the PPBC),including 24 plant species belonging to 14 families and 22 genera.A large number of numerical experiments were also carried out to compare a series of 37 convolutional neural network models with good performance,from different perspectives,to find the optimal network model that is most suitable for the image recognition of desert plant species in Xinjiang.The results revealed 24 models with a recognition Accuracy,of greater than 70.000%.Among which,Residual Network X_8GF(RegNetX_8GF)performs the best,with Accuracy,Precision,Recall,and F1(which refers to the harmonic mean of the Precision and Recall values)values of 78.33%,77.65%,69.55%,and 71.26%,respectively.Considering the demand factors of hardware equipment and inference time,Mobile NetworkV2 achieves the best balance among the Accuracy,the number of parameters and the number of floating-point operations.The number of parameters for Mobile Network V2(MobileNetV2)is 1/16 of RegNetX_8GF,and the number of floating-point operations is 1/24.Our findings can facilitate efficient decision-making for the management of species survey,cataloging,inspection,and monitoring in the nature reserves in Xinjiang,providing a scientific basis for the protection and utilization of natural plant resources.展开更多
With the improvement of people’s living standards,the demand for health monitoring and exercise detection is increasing.It is of great significance to study human activity recognition(HAR)methods that are different f...With the improvement of people’s living standards,the demand for health monitoring and exercise detection is increasing.It is of great significance to study human activity recognition(HAR)methods that are different from traditional feature extraction methods.This article uses convolutional neural network(CNN)algorithms in deep learning to automatically extract features of activities related to human life.We used a stochastic gradient descent algorithm to optimize the parameters of the CNN.The trained network model is compressed on STM32CubeMX-AI.Finally,this article introduces the use of neural networks on embedded devices to recognize six human activities of daily life,such as sitting,standing,walking,jogging,upstairs,and downstairs.The acceleration sensor related to human activity information is used to obtain the relevant characteristics of the activity,thereby solving the HAR problem.By drawing the accuracy curve,loss function curve,and confusion matrix diagram of the training model,the recognition effect of the convolutional neural network can be seen more intuitively.After comparing the average accuracy of each set of experiments and the test set of the best model obtained from it,the best model is then selected.展开更多
Convolutional neural networks(CNNs)require a lot of multiplication and addition operations completed by traditional electrical multipliers,leading to high power consumption and limited speed.Here,a silicon waveguide-b...Convolutional neural networks(CNNs)require a lot of multiplication and addition operations completed by traditional electrical multipliers,leading to high power consumption and limited speed.Here,a silicon waveguide-based wavelength division multiplexing(WDM)architecture for CNN is optimized with high energy efficiency Fano resonator.Coupling of T-waveguide and micro-ring resonator generates Fano resonance with small half-width,which can significantly reduce the modulator power consumption.Insulator dataset from state grid is used to test Fano resonance modulator-based CNNs.The results show that accuracy for insulator defect recognition reaches 99.27%with much lower power consumption.Obviously,our optimized photonic integration architecture for CNNs has broad potential for the artificial intelligence hardware platform.展开更多
Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals,which has certain limitations.Conversely,deep learning techniques have gained promine...Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals,which has certain limitations.Conversely,deep learning techniques have gained prominence as a central focus of research in the field of fault diagnosis by strong fault feature extraction ability and end-to-end fault diagnosis efficiency.Recently,utilizing the respective advantages of convolution neural network(CNN)and Transformer in local and global feature extraction,research on cooperating the two have demonstrated promise in the field of fault diagnosis.However,the cross-channel convolution mechanism in CNN and the self-attention calculations in Transformer contribute to excessive complexity in the cooperative model.This complexity results in high computational costs and limited industrial applicability.To tackle the above challenges,this paper proposes a lightweight CNN-Transformer named as SEFormer for rotating machinery fault diagnosis.First,a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals.Then,an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective.Finally,experimental results on the planetary gearbox dataset and themotor roller bearing dataset prove that the proposed framework can balance the advantages of robustness,generalization and lightweight compared to recent state-of-the-art fault diagnosis models based on CNN and Transformer.This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.展开更多
Graph Convolutional Neural Networks(GCNs)have been widely used in various fields due to their powerful capabilities in processing graph-structured data.However,GCNs encounter significant challenges when applied to sca...Graph Convolutional Neural Networks(GCNs)have been widely used in various fields due to their powerful capabilities in processing graph-structured data.However,GCNs encounter significant challenges when applied to scale-free graphs with power-law distributions,resulting in substantial distortions.Moreover,most of the existing GCN models are shallow structures,which restricts their ability to capture dependencies among distant nodes and more refined high-order node features in scale-free graphs with hierarchical structures.To more broadly and precisely apply GCNs to real-world graphs exhibiting scale-free or hierarchical structures and utilize multi-level aggregation of GCNs for capturing high-level information in local representations,we propose the Hyperbolic Deep Graph Convolutional Neural Network(HDGCNN),an end-to-end deep graph representation learning framework that can map scale-free graphs from Euclidean space to hyperbolic space.In HDGCNN,we define the fundamental operations of deep graph convolutional neural networks in hyperbolic space.Additionally,we introduce a hyperbolic feature transformation method based on identity mapping and a dense connection scheme based on a novel non-local message passing framework.In addition,we present a neighborhood aggregation method that combines initial structural featureswith hyperbolic attention coefficients.Through the above methods,HDGCNN effectively leverages both the structural features and node features of graph data,enabling enhanced exploration of non-local structural features and more refined node features in scale-free or hierarchical graphs.Experimental results demonstrate that HDGCNN achieves remarkable performance improvements over state-ofthe-art GCNs in node classification and link prediction tasks,even when utilizing low-dimensional embedding representations.Furthermore,when compared to shallow hyperbolic graph convolutional neural network models,HDGCNN exhibits notable advantages and performance enhancements.展开更多
BACKGROUND A convolutional neural network(CNN) is a deep learning algorithm based on the principle of human brain visual cortex processing and image recognition.AIM To automatically identify the invasion depth and ori...BACKGROUND A convolutional neural network(CNN) is a deep learning algorithm based on the principle of human brain visual cortex processing and image recognition.AIM To automatically identify the invasion depth and origin of esophageal lesions based on a CNN.METHODS A total of 1670 white-light images were used to train and validate the CNN system.The method proposed in this paper included the following two parts:(1)Location module,an object detection network,locating the classified main image feature regions of the image for subsequent classification tasks;and(2) Classification module,a traditional classification CNN,classifying the images cut out by the object detection network.RESULTS The CNN system proposed in this study achieved an overall accuracy of 82.49%,sensitivity of 80.23%,and specificity of 90.56%.In this study,after follow-up pathology,726 patients were compared for endoscopic pathology.The misdiagnosis rate of endoscopic diagnosis in the lesion invasion range was approximately 9.5%;41 patients showed no lesion invasion to the muscularis propria,but 36 of them pathologically showed invasion to the superficial muscularis propria.The patients with invasion of the tunica adventitia were all treated by surgery with an accuracy rate of 100%.For the examination of submucosal lesions,the accuracy of endoscopic ultrasonography(EUS) was approximately 99.3%.Results of this study showed that EUS had a high accuracy rate for the origin of submucosal lesions,whereas the misdiagnosis rate was slightly high in the evaluation of the invasion scope of lesions.Misdiagnosis could be due to different operating and diagnostic levels of endoscopists,unclear ultrasound probes,and unclear lesions.CONCLUSION This study is the first to recognize esophageal EUS images through deep learning,which can automatically identify the invasion depth and lesion origin of submucosal tumors and classify such tumors,thereby achieving good accuracy.In future studies,this method can provide guidance and help to clinical endoscopists.展开更多
In the era of intelligent economy, the click-through rate(CTR) prediction system can evaluate massive service information based on user historical information, and screen out the products that are most likely to be fa...In the era of intelligent economy, the click-through rate(CTR) prediction system can evaluate massive service information based on user historical information, and screen out the products that are most likely to be favored by users, thus realizing customized push of information and achieve the ultimate goal of improving economic benefits. Sequence modeling is one of the main research directions of CTR prediction models based on deep learning. The user's general interest hidden in the entire click history and the short-term interest hidden in the recent click behaviors have different influences on the CTR prediction results, which are highly important. In terms of capturing the user's general interest, existing models paid more attention to the relationships between item embedding vectors(point-level), while ignoring the relationships between elements in item embedding vectors(union-level). The Lambda layer-based Convolutional Sequence Embedding(LCSE) model proposed in this paper uses the Lambda layer to capture features from click history through weight distribution, and uses horizontal and vertical filters on this basis to learn the user's general preferences from union-level and point-level. In addition, we also incorporate the user's short-term preferences captured by the embedding-based convolutional model to further improve the prediction results. The AUC(Area Under Curve) values of the LCSE model on the datasets Electronic, Movie & TV and MovieLens are 0.870 7, 0.903 6 and 0.946 7, improving 0.45%, 0.36% and 0.07% over the Caser model, proving the effectiveness of our proposed model.展开更多
BACKGROUND Esophageal cancer is the seventh-most common cancer type worldwide,accounting for 5%of death from malignancy.Development of novel diagnostic techniques has facilitated screening,early detection,and improved...BACKGROUND Esophageal cancer is the seventh-most common cancer type worldwide,accounting for 5%of death from malignancy.Development of novel diagnostic techniques has facilitated screening,early detection,and improved prognosis.Convolutional neural network(CNN)-based image analysis promises great potential for diagnosing and determining the prognosis of esophageal cancer,enabling even early detection of dysplasia.METHODS PubMed,EMBASE,Web of Science and Cochrane Library databases were searched for articles published up to November 30,2022.We evaluated the diagnostic accuracy of using the CNN model with still image-based analysis and with video-based analysis for esophageal cancer or HGD,as well as for the invasion depth of esophageal cancer.The pooled sensitivity,pooled specificity,positive likelihood ratio(PLR),negative likelihood ratio(NLR),diagnostic odds ratio(DOR)and area under the curve(AUC)were estimated,together with the 95%confidence intervals(CI).A bivariate method and hierarchical summary receiver operating characteristic method were used to calculate the diagnostic test accuracy of the CNN model.Meta-regression and subgroup analyses were used to identify sources of hetero-geneity.RESULTS A total of 28 studies were included in this systematic review and meta-analysis.Using still image-based analysis for the diagnosis of esophageal cancer or HGD provided a pooled sensitivity of 0.95(95%CI:0.92-0.97),pooled specificity of 0.92(0.89-0.94),PLR of 11.5(8.3-16.0),NLR of 0.06(0.04-0.09),DOR of 205(115-365),and AUC of 0.98(0.96-0.99).When video-based analysis was used,a pooled sensitivity of 0.85(0.77-0.91),pooled specificity of 0.73(0.59-0.83),PLR of 3.1(1.9-5.0),NLR of 0.20(0.12-0.34),DOR of 15(6-38)and AUC of 0.87(0.84-0.90)were found.Prediction of invasion depth resulted in a pooled sensitivity of 0.90(0.87-0.92),pooled specificity of 0.83(95%CI:0.76-0.88),PLR of 7.8(1.9-32.0),NLR of 0.10(0.41-0.25),DOR of 118(11-1305),and AUC of 0.95(0.92-0.96).CONCLUSION CNN-based image analysis in diagnosing esophageal cancer and HGD is an excellent diagnostic method with high sensitivity and specificity that merits further investigation in large,multicenter clinical trials.展开更多
In recent years,gait-based emotion recognition has been widely applied in the field of computer vision.However,existing gait emotion recognition methods typically rely on complete human skeleton data,and their accurac...In recent years,gait-based emotion recognition has been widely applied in the field of computer vision.However,existing gait emotion recognition methods typically rely on complete human skeleton data,and their accuracy significantly declines when the data is occluded.To enhance the accuracy of gait emotion recognition under occlusion,this paper proposes a Multi-scale Suppression Graph ConvolutionalNetwork(MS-GCN).TheMS-GCN consists of three main components:Joint Interpolation Module(JI Moudle),Multi-scale Temporal Convolution Network(MS-TCN),and Suppression Graph Convolutional Network(SGCN).The JI Module completes the spatially occluded skeletal joints using the(K-Nearest Neighbors)KNN interpolation method.The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait,compensating for the temporal occlusion of gait information.The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features,thereby reducing the negative impact of occlusion on emotion recognition results.The proposed method is evaluated on two comprehensive datasets:Emotion-Gait,containing 4227 real gaits from sources like BML,ICT-Pollick,and ELMD,and 1000 synthetic gaits generated using STEP-Gen technology,and ELMB,consisting of 3924 gaits,with 1835 labeled with emotions such as“Happy,”“Sad,”“Angry,”and“Neutral.”On the standard datasets Emotion-Gait and ELMB,the proposed method achieved accuracies of 0.900 and 0.896,respectively,attaining performance comparable to other state-ofthe-artmethods.Furthermore,on occlusion datasets,the proposedmethod significantly mitigates the performance degradation caused by occlusion compared to other methods,the accuracy is significantly higher than that of other methods.展开更多
One of the most obvious clinical reasons of dementia or The Behavioral and Psychological Symptoms of Dementia(BPSD)are the lack of emotional expression,the increased frequency of negative emotions,and the impermanence...One of the most obvious clinical reasons of dementia or The Behavioral and Psychological Symptoms of Dementia(BPSD)are the lack of emotional expression,the increased frequency of negative emotions,and the impermanence of emotions.Observing the reduction of BPSD in dementia through emotions can be considered effective and widely used in the field of non-pharmacological therapy.At present,this article will verify whether the image recognition artificial intelligence(AI)system can correctly reflect the emotional performance of the elderly with dementia through a questionnaire survey of three professional elderly nursing staff.The ANOVA(sig.=0.50)is used to determine that the judgment given by the nursing staff has no obvious deviation,and then Kendall's test(0.722**)and spearman's test(0.863**)are used to verify the judgment severity of the emotion recognition system and the nursing staff unanimously.This implies the usability of the tool.Additionally,it can be expected to be further applied in the research related to BPSD elderly emotion detection.展开更多
文摘The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
基金Supported by Open Funds for Shaanxi Provincial Key Laboratory of Infection and Immune Diseases,No.2023-KFMS-1.
文摘Gastrointestinal(GI)diseases,including gastric and colorectal cancers,signi-ficantly impact global health,necessitating accurate and efficient diagnostic me-thods.Endoscopic examination is the primary diagnostic tool;however,its accu-racy is limited by operator dependency and interobserver variability.Advance-ments in deep learning,particularly convolutional neural networks(CNNs),show great potential for enhancing GI disease detection and classification.This review explores the application of CNNs in endoscopic imaging,focusing on polyp and tumor detection,disease classification,endoscopic ultrasound,and capsule endo-scopy analysis.We discuss the performance of CNN models with traditional dia-gnostic methods,highlighting their advantages in accuracy and real-time decision support.Despite promising results,challenges remain,including data availability,model interpretability,and clinical integration.Future directions include impro-ving model generalization,enhancing explainability,and conducting large-scale clinical trials.With continued advancements,CNN-powered artificial intelligence systems could revolutionize GI endoscopy by enhancing early disease detection,reducing diagnostic errors,and improving patient outcomes.
基金Supported by the National Natural Science Foundation of China(U1462206)
文摘System design and optimization problems require large-scale chemical kinetic models. Pure kinetic models of naphtha pyrolysis need to solve a complete set of stiff ODEs and is therefore too computational expensive. On the other hand, artificial neural networks that completely neglect the topology of the reaction networks often have poor generalization. In this paper, a framework is proposed for learning local representations from largescale chemical reaction networks. At first, the features of naphtha pyrolysis reactions are extracted by applying complex network characterization methods. The selected features are then used as inputs in convolutional architectures. Different CNN models are established and compared to optimize the neural network structure.After the pre-training and fine-tuning step, the ultimate CNN model reduces the computational cost of the previous kinetic model by over 300 times and predicts the yields of main products with the average error of less than 3%. The obtained results demonstrate the high efficiency of the proposed framework.
基金Supported by National Science Foundation of China(No.81800878)Interdisciplinary Program of Shanghai Jiao Tong University(No.YG2017QN24)+1 种基金Key Technological Research Projects of Songjiang District(No.18sjkjgg24)Bethune Langmu Ophthalmological Research Fund for Young and Middle-aged People(No.BJ-LM2018002J)
文摘AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid.METHODS: A two-dimensional(2D) fully convolutional network for retinal segmentation was employed. In order to solve the category imbalance in retinal optical coherence tomography(OCT) images, the network parameters and loss function based on the 2D fully convolutional network were modified. For this network, the correlations of corresponding positions among adjacent images in space are ignored. Thus, we proposed a three-dimensional(3D) fully convolutional network for segmentation in the retinal OCT images.RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy rate is 99.56%, Kappa coefficient is 98.47%, and F1 score of retinal fluid is 95.50%. CONCLUSION: The OCT image segmentation algorithm based on deep learning is primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors, and can provide doctors with more accurate diagnostic data.
基金The National Natural Science Foundation of China(No.51675098)
文摘Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on the standard convolutional auto-encoder.In this model,the parallel convolutional and deconvolutional kernels of different scales are used to extract the features from the input signal and reconstruct the input signal;then the feature map extracted by multi-scale convolutional kernels is used as the input of the classifier;and finally the parameters of the whole model are fine-tuned using labeled data.Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method.The results show that the model can achieve 99.75%,99.3%and 100%diagnostic accuracy,respectively.In addition,the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning,convolutional neural networks and a traditional convolutional auto-encoder.The final results show that the proposed model has a better recognition effect for rolling bearing fault data.
文摘When carrying out calculations for turbulent flow simulation,one inevitably has to face the choice between accuracy and speed of calculations.In order to simultaneously obtain both a computationally efficient and more accurate model,a surrogate model can be built on the basis of some fast special model and knowledge of previous calculations obtained by more accurate base models from various test bases or some results of serial calculations.The objective of this work is to construct a surrogate model which allows to improve the accuracy of turbulent calculations obtained by a special model on unstructured meshes.For this purpose,we use 1D Convolutional Neural Network(CNN)of the encoder-decoder architecture and reduce the problem to a single dimension by applying space-filling curves.Such an approach would have the benefit of being applicable to solutions obtained on unstructured meshes.In this work,a non-local approach is applied where entire flow fields obtained by the special and base models are used as input and ground truth output respectively.Spalart-Allmaras(SA)model and Near-wall Domain Decomposition(NDD)method for SA are taken as the base and special models respectively.The efficiency and accuracy of the obtained surrogate model are demonstrated in a case of supersonic flow over a compression corner with different values for angleαand Reynolds number Re.We conducted an investigation into interpolation and extrapolation by Re and also into interpolation byα.
文摘Since chemical processes are highly non-linear and multiscale,it is vital to deeply mine the multiscale coupling relationships embedded in the massive process data for the prediction and anomaly tracing of crucial process parameters and production indicators.While the integrated method of adaptive signal decomposition combined with time series models could effectively predict process variables,it does have limitations in capturing the high-frequency detail of the operation state when applied to complex chemical processes.In light of this,a novel Multiscale Multi-radius Multi-step Convolutional Neural Network(Msrt Net)is proposed for mining spatiotemporal multiscale information.First,the industrial data from the Fluid Catalytic Cracking(FCC)process decomposition using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise(CEEMDAN)extract the multi-energy scale information of the feature subset.Then,convolution kernels with varying stride and padding structures are established to decouple the long-period operation process information encapsulated within the multi-energy scale data.Finally,a reconciliation network is trained to reconstruct the multiscale prediction results and obtain the final output.Msrt Net is initially assessed for its capability to untangle the spatiotemporal multiscale relationships among variables in the Tennessee Eastman Process(TEP).Subsequently,the performance of Msrt Net is evaluated in predicting product yield for a 2.80×10^(6) t/a FCC unit,taking diesel and gasoline yield as examples.In conclusion,Msrt Net can decouple and effectively extract spatiotemporal multiscale information from chemical process data and achieve a approximately reduction of 30%in prediction error compared to other time-series models.Furthermore,its robustness and transferability underscore its promising potential for broader applications.
基金supported by the National Key Research and Development Program of China(No.2018YFB2003300)National Science and Technology Major Project,China(No.2017-IV-0008-0045)National Natural Science Foundation of China(No.51675262).
文摘Due to the strong background noise and the acquisition system noise,the useful characteristics are often difficult to be detected.To solve this problem,sparse coding captures a concise representation of the high-level features in the signal using the underlying structure of the signal.Recently,an Online Convolutional Sparse Coding(OCSC)denoising algorithm has been proposed.However,it does not consider the structural characteristics of the signal,the sparsity of each iteration is not enough.Therefore,a threshold shrinkage algorithm considering neighborhood sparsity is proposed,and a training strategy from loose to tight is developed to further improve the denoising performance of the algorithm,called Variable Threshold Neighborhood Online Convolution Sparse Coding(VTNOCSC).By embedding the structural sparse threshold shrinkage operator into the process of solving the sparse coefficient and gradually approaching the optimal noise separation point in the training,the signal denoising performance of the algorithm is greatly improved.VTNOCSC is used to process the actual bearing fault signal,the noise interference is successfully reduced and the interest features are more evident.Compared with other existing methods,VTNOCSC has better denoising performance.
文摘This paper proposes a novel grading method of apples,in an automated grading device that uses convolutional neural networks to extract the size,color,texture,and roundness of an apple.The developed machine learning method uses the ability of learning representative features by means of a convolutional neural network(CNN),to determine suitable features of apples for the grading process.This information is fed into a one-to-one classifier that uses a support vector machine(SVM),instead of the softmax output layer of the CNN.In this manner,Yantai apples with similar shapes and low discrimination are graded using four different approaches.The fusion model using both CNN and SVM classifiers is much more accurate than the simple k-nearest neighbor(KNN),SVM,and CNN model when used separately for grading,and the learning ability and the generalization ability of the model is correspondingly increased by the combined method.Grading tests are carried out using the automated grading device that is developed in the present work.It is verified that the actual effect of apple grading using the combined CNN-SVM model is fast and accurate,which greatly reduces the manpower and labor costs of manual grading,and has important commercial prospects.
基金supported by the West Light Foundation of the Chinese Academy of Sciences(2019-XBQNXZ-A-007)the National Natural Science Foundation of China(12071458,71731009).
文摘In recent years,deep convolution neural network has exhibited excellent performance in computer vision and has a far-reaching impact.Traditional plant taxonomic identification requires high expertise,which is time-consuming.Most nature reserves have problems such as incomplete species surveys,inaccurate taxonomic identification,and untimely updating of status data.Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology to explore the best network model.Taking 24 typical desert plant species that are widely distributed in the nature reserves in Xinjiang Uygur Autonomous Region of China as the research objects,this study established an image database and select the optimal network model for the image recognition of desert plant species to provide decision support for fine management in the nature reserves in Xinjiang,such as species investigation and monitoring,by using deep learning.Since desert plant species were not included in the public dataset,the images used in this study were mainly obtained through field shooting and downloaded from the Plant Photo Bank of China(PPBC).After the sorting process and statistical analysis,a total of 2331 plant images were finally collected(2071 images from field collection and 260 images from the PPBC),including 24 plant species belonging to 14 families and 22 genera.A large number of numerical experiments were also carried out to compare a series of 37 convolutional neural network models with good performance,from different perspectives,to find the optimal network model that is most suitable for the image recognition of desert plant species in Xinjiang.The results revealed 24 models with a recognition Accuracy,of greater than 70.000%.Among which,Residual Network X_8GF(RegNetX_8GF)performs the best,with Accuracy,Precision,Recall,and F1(which refers to the harmonic mean of the Precision and Recall values)values of 78.33%,77.65%,69.55%,and 71.26%,respectively.Considering the demand factors of hardware equipment and inference time,Mobile NetworkV2 achieves the best balance among the Accuracy,the number of parameters and the number of floating-point operations.The number of parameters for Mobile Network V2(MobileNetV2)is 1/16 of RegNetX_8GF,and the number of floating-point operations is 1/24.Our findings can facilitate efficient decision-making for the management of species survey,cataloging,inspection,and monitoring in the nature reserves in Xinjiang,providing a scientific basis for the protection and utilization of natural plant resources.
文摘With the improvement of people’s living standards,the demand for health monitoring and exercise detection is increasing.It is of great significance to study human activity recognition(HAR)methods that are different from traditional feature extraction methods.This article uses convolutional neural network(CNN)algorithms in deep learning to automatically extract features of activities related to human life.We used a stochastic gradient descent algorithm to optimize the parameters of the CNN.The trained network model is compressed on STM32CubeMX-AI.Finally,this article introduces the use of neural networks on embedded devices to recognize six human activities of daily life,such as sitting,standing,walking,jogging,upstairs,and downstairs.The acceleration sensor related to human activity information is used to obtain the relevant characteristics of the activity,thereby solving the HAR problem.By drawing the accuracy curve,loss function curve,and confusion matrix diagram of the training model,the recognition effect of the convolutional neural network can be seen more intuitively.After comparing the average accuracy of each set of experiments and the test set of the best model obtained from it,the best model is then selected.
基金supported by the Science and Technology Project of the State Grid Zhejiang Electric Power Company Limited(No.B311XT21004G)。
文摘Convolutional neural networks(CNNs)require a lot of multiplication and addition operations completed by traditional electrical multipliers,leading to high power consumption and limited speed.Here,a silicon waveguide-based wavelength division multiplexing(WDM)architecture for CNN is optimized with high energy efficiency Fano resonator.Coupling of T-waveguide and micro-ring resonator generates Fano resonance with small half-width,which can significantly reduce the modulator power consumption.Insulator dataset from state grid is used to test Fano resonance modulator-based CNNs.The results show that accuracy for insulator defect recognition reaches 99.27%with much lower power consumption.Obviously,our optimized photonic integration architecture for CNNs has broad potential for the artificial intelligence hardware platform.
基金supported by the National Natural Science Foundation of China(No.52277055).
文摘Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals,which has certain limitations.Conversely,deep learning techniques have gained prominence as a central focus of research in the field of fault diagnosis by strong fault feature extraction ability and end-to-end fault diagnosis efficiency.Recently,utilizing the respective advantages of convolution neural network(CNN)and Transformer in local and global feature extraction,research on cooperating the two have demonstrated promise in the field of fault diagnosis.However,the cross-channel convolution mechanism in CNN and the self-attention calculations in Transformer contribute to excessive complexity in the cooperative model.This complexity results in high computational costs and limited industrial applicability.To tackle the above challenges,this paper proposes a lightweight CNN-Transformer named as SEFormer for rotating machinery fault diagnosis.First,a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals.Then,an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective.Finally,experimental results on the planetary gearbox dataset and themotor roller bearing dataset prove that the proposed framework can balance the advantages of robustness,generalization and lightweight compared to recent state-of-the-art fault diagnosis models based on CNN and Transformer.This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.
基金supported by the National Natural Science Foundation of China-China State Railway Group Co.,Ltd.Railway Basic Research Joint Fund (Grant No.U2268217)the Scientific Funding for China Academy of Railway Sciences Corporation Limited (No.2021YJ183).
文摘Graph Convolutional Neural Networks(GCNs)have been widely used in various fields due to their powerful capabilities in processing graph-structured data.However,GCNs encounter significant challenges when applied to scale-free graphs with power-law distributions,resulting in substantial distortions.Moreover,most of the existing GCN models are shallow structures,which restricts their ability to capture dependencies among distant nodes and more refined high-order node features in scale-free graphs with hierarchical structures.To more broadly and precisely apply GCNs to real-world graphs exhibiting scale-free or hierarchical structures and utilize multi-level aggregation of GCNs for capturing high-level information in local representations,we propose the Hyperbolic Deep Graph Convolutional Neural Network(HDGCNN),an end-to-end deep graph representation learning framework that can map scale-free graphs from Euclidean space to hyperbolic space.In HDGCNN,we define the fundamental operations of deep graph convolutional neural networks in hyperbolic space.Additionally,we introduce a hyperbolic feature transformation method based on identity mapping and a dense connection scheme based on a novel non-local message passing framework.In addition,we present a neighborhood aggregation method that combines initial structural featureswith hyperbolic attention coefficients.Through the above methods,HDGCNN effectively leverages both the structural features and node features of graph data,enabling enhanced exploration of non-local structural features and more refined node features in scale-free or hierarchical graphs.Experimental results demonstrate that HDGCNN achieves remarkable performance improvements over state-ofthe-art GCNs in node classification and link prediction tasks,even when utilizing low-dimensional embedding representations.Furthermore,when compared to shallow hyperbolic graph convolutional neural network models,HDGCNN exhibits notable advantages and performance enhancements.
基金Supported by the Natural Science Foundation of Jiangsu,No.BK20171508.
文摘BACKGROUND A convolutional neural network(CNN) is a deep learning algorithm based on the principle of human brain visual cortex processing and image recognition.AIM To automatically identify the invasion depth and origin of esophageal lesions based on a CNN.METHODS A total of 1670 white-light images were used to train and validate the CNN system.The method proposed in this paper included the following two parts:(1)Location module,an object detection network,locating the classified main image feature regions of the image for subsequent classification tasks;and(2) Classification module,a traditional classification CNN,classifying the images cut out by the object detection network.RESULTS The CNN system proposed in this study achieved an overall accuracy of 82.49%,sensitivity of 80.23%,and specificity of 90.56%.In this study,after follow-up pathology,726 patients were compared for endoscopic pathology.The misdiagnosis rate of endoscopic diagnosis in the lesion invasion range was approximately 9.5%;41 patients showed no lesion invasion to the muscularis propria,but 36 of them pathologically showed invasion to the superficial muscularis propria.The patients with invasion of the tunica adventitia were all treated by surgery with an accuracy rate of 100%.For the examination of submucosal lesions,the accuracy of endoscopic ultrasonography(EUS) was approximately 99.3%.Results of this study showed that EUS had a high accuracy rate for the origin of submucosal lesions,whereas the misdiagnosis rate was slightly high in the evaluation of the invasion scope of lesions.Misdiagnosis could be due to different operating and diagnostic levels of endoscopists,unclear ultrasound probes,and unclear lesions.CONCLUSION This study is the first to recognize esophageal EUS images through deep learning,which can automatically identify the invasion depth and lesion origin of submucosal tumors and classify such tumors,thereby achieving good accuracy.In future studies,this method can provide guidance and help to clinical endoscopists.
基金Supported by the National Natural Science Foundation of China (62272214)。
文摘In the era of intelligent economy, the click-through rate(CTR) prediction system can evaluate massive service information based on user historical information, and screen out the products that are most likely to be favored by users, thus realizing customized push of information and achieve the ultimate goal of improving economic benefits. Sequence modeling is one of the main research directions of CTR prediction models based on deep learning. The user's general interest hidden in the entire click history and the short-term interest hidden in the recent click behaviors have different influences on the CTR prediction results, which are highly important. In terms of capturing the user's general interest, existing models paid more attention to the relationships between item embedding vectors(point-level), while ignoring the relationships between elements in item embedding vectors(union-level). The Lambda layer-based Convolutional Sequence Embedding(LCSE) model proposed in this paper uses the Lambda layer to capture features from click history through weight distribution, and uses horizontal and vertical filters on this basis to learn the user's general preferences from union-level and point-level. In addition, we also incorporate the user's short-term preferences captured by the embedding-based convolutional model to further improve the prediction results. The AUC(Area Under Curve) values of the LCSE model on the datasets Electronic, Movie & TV and MovieLens are 0.870 7, 0.903 6 and 0.946 7, improving 0.45%, 0.36% and 0.07% over the Caser model, proving the effectiveness of our proposed model.
基金Supported by the Special Program for Science and Technology Cooperation and Exchange of Shanxi,No.202104041101034.
文摘BACKGROUND Esophageal cancer is the seventh-most common cancer type worldwide,accounting for 5%of death from malignancy.Development of novel diagnostic techniques has facilitated screening,early detection,and improved prognosis.Convolutional neural network(CNN)-based image analysis promises great potential for diagnosing and determining the prognosis of esophageal cancer,enabling even early detection of dysplasia.METHODS PubMed,EMBASE,Web of Science and Cochrane Library databases were searched for articles published up to November 30,2022.We evaluated the diagnostic accuracy of using the CNN model with still image-based analysis and with video-based analysis for esophageal cancer or HGD,as well as for the invasion depth of esophageal cancer.The pooled sensitivity,pooled specificity,positive likelihood ratio(PLR),negative likelihood ratio(NLR),diagnostic odds ratio(DOR)and area under the curve(AUC)were estimated,together with the 95%confidence intervals(CI).A bivariate method and hierarchical summary receiver operating characteristic method were used to calculate the diagnostic test accuracy of the CNN model.Meta-regression and subgroup analyses were used to identify sources of hetero-geneity.RESULTS A total of 28 studies were included in this systematic review and meta-analysis.Using still image-based analysis for the diagnosis of esophageal cancer or HGD provided a pooled sensitivity of 0.95(95%CI:0.92-0.97),pooled specificity of 0.92(0.89-0.94),PLR of 11.5(8.3-16.0),NLR of 0.06(0.04-0.09),DOR of 205(115-365),and AUC of 0.98(0.96-0.99).When video-based analysis was used,a pooled sensitivity of 0.85(0.77-0.91),pooled specificity of 0.73(0.59-0.83),PLR of 3.1(1.9-5.0),NLR of 0.20(0.12-0.34),DOR of 15(6-38)and AUC of 0.87(0.84-0.90)were found.Prediction of invasion depth resulted in a pooled sensitivity of 0.90(0.87-0.92),pooled specificity of 0.83(95%CI:0.76-0.88),PLR of 7.8(1.9-32.0),NLR of 0.10(0.41-0.25),DOR of 118(11-1305),and AUC of 0.95(0.92-0.96).CONCLUSION CNN-based image analysis in diagnosing esophageal cancer and HGD is an excellent diagnostic method with high sensitivity and specificity that merits further investigation in large,multicenter clinical trials.
基金supported by the National Natural Science Foundation of China(62272049,62236006,62172045)the Key Projects of Beijing Union University(ZKZD202301).
文摘In recent years,gait-based emotion recognition has been widely applied in the field of computer vision.However,existing gait emotion recognition methods typically rely on complete human skeleton data,and their accuracy significantly declines when the data is occluded.To enhance the accuracy of gait emotion recognition under occlusion,this paper proposes a Multi-scale Suppression Graph ConvolutionalNetwork(MS-GCN).TheMS-GCN consists of three main components:Joint Interpolation Module(JI Moudle),Multi-scale Temporal Convolution Network(MS-TCN),and Suppression Graph Convolutional Network(SGCN).The JI Module completes the spatially occluded skeletal joints using the(K-Nearest Neighbors)KNN interpolation method.The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait,compensating for the temporal occlusion of gait information.The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features,thereby reducing the negative impact of occlusion on emotion recognition results.The proposed method is evaluated on two comprehensive datasets:Emotion-Gait,containing 4227 real gaits from sources like BML,ICT-Pollick,and ELMD,and 1000 synthetic gaits generated using STEP-Gen technology,and ELMB,consisting of 3924 gaits,with 1835 labeled with emotions such as“Happy,”“Sad,”“Angry,”and“Neutral.”On the standard datasets Emotion-Gait and ELMB,the proposed method achieved accuracies of 0.900 and 0.896,respectively,attaining performance comparable to other state-ofthe-artmethods.Furthermore,on occlusion datasets,the proposedmethod significantly mitigates the performance degradation caused by occlusion compared to other methods,the accuracy is significantly higher than that of other methods.
文摘One of the most obvious clinical reasons of dementia or The Behavioral and Psychological Symptoms of Dementia(BPSD)are the lack of emotional expression,the increased frequency of negative emotions,and the impermanence of emotions.Observing the reduction of BPSD in dementia through emotions can be considered effective and widely used in the field of non-pharmacological therapy.At present,this article will verify whether the image recognition artificial intelligence(AI)system can correctly reflect the emotional performance of the elderly with dementia through a questionnaire survey of three professional elderly nursing staff.The ANOVA(sig.=0.50)is used to determine that the judgment given by the nursing staff has no obvious deviation,and then Kendall's test(0.722**)and spearman's test(0.863**)are used to verify the judgment severity of the emotion recognition system and the nursing staff unanimously.This implies the usability of the tool.Additionally,it can be expected to be further applied in the research related to BPSD elderly emotion detection.