AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. To address the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. Because this network ignores the correlations between corresponding positions in spatially adjacent images, we proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy is 99.56%, the Kappa coefficient is 98.47%, and the F1 score for retinal fluid is 95.50%. CONCLUSION: Deep-learning-based OCT image segmentation algorithms are primarily founded on 2D convolutional networks. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors and can provide doctors with more accurate diagnostic data.
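The three evaluation measures named in the abstract above (overall accuracy, Kappa coefficient, and per-class F1 score) can all be derived from a confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `metrics_from_confusion` and the toy pixel counts are hypothetical, and the sketch assumes the matrix is not degenerate (chance agreement below 1).

```python
def metrics_from_confusion(cm, positive):
    """Return (accuracy, Cohen's kappa, F1 of class `positive`).

    cm[i][j] is the number of pixels whose true class is i and whose
    predicted class is j. Hypothetical helper for illustration only.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    # Observed agreement is the overall accuracy.
    observed = sum(cm[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row marginal * column marginal).
    expected = sum(
        sum(cm[i]) * sum(cm[r][i] for r in range(n)) for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    # F1 for the chosen class from true/false positives and false negatives.
    tp = cm[positive][positive]
    fp = sum(cm[r][positive] for r in range(n)) - tp
    fn = sum(cm[positive]) - tp
    f1 = 2 * tp / (2 * tp + fp + fn)
    return observed, kappa, f1


# Toy two-class example (0 = background, 1 = fluid); the counts are made up.
toy_cm = [[90, 2],
          [3, 5]]
accuracy, kappa, f1 = metrics_from_confusion(toy_cm, positive=1)
print(accuracy, kappa, f1)
```

In the toy matrix the accuracy is high while the Kappa coefficient and F1 score are much lower, which illustrates why accuracy alone can mislead when one class dominates, as it does in retinal OCT images.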
Three-dimensional (3D) gravitational and magnetic exploration is performed using aerial measurement tools; however, it faces difficulties in measuring-height design and in the construction of a joint-interpretation scheme. At present, the height in such experiments is set according to the measurement scale, and the distribution characteristics of anomalies are not fully considered. Here, we present the idea of using the attenuation characteristics of a singular-value spectrum to evaluate the contributions of various measurement heights and multi-height combinations to inversion, so that appropriate measuring heights and the number of measurement heights can be designed correctly and reasonably. The joint-gradient Euler-deconvolution method can accurately obtain the distribution of geological bodies from 3D gravitational and magnetic data at an improved resolution, and experimental tests confirm these findings. Therefore, an actual 3D aeromagnetic data acquisition and inversion test was carried out in the vicinity of the Zhurihe Iron Mine in Inner Mongolia. The flight-height difference was set to 60 m, and the specific distribution of lodes was obtained by the joint-gradient Euler-deconvolution method. This provides a reliable basis for future detailed exploration and shows that the methods presented in this paper have good practical application effects and prospects.
Hyperspectral images carry numerous spectral bands, and their wealth of band data is a valuable source of information for the accurate classification of ground objects. Three-dimensional (3D) convolution, although an excellent spectral information extraction method, is limited by its huge number of parameters and long model training time. To allow better integration of 3D convolution with the currently popular transformer models, a new architecture called mobile 3D convolutional vision transformer (MDvT) is proposed. The MDvT introduces an inverted residual structure to reduce the number of model parameters and balance the data mining efficiency of low-dimensional data input. Simultaneously, a square patch is used to cut the sequence of tokens to accelerate model operation. Through extensive experiments, we evaluated the overall classification performance of the proposed MDvT on the WHU-Hi and Pavia University datasets and demonstrated significant improvements in classification accuracy and model runtime compared with classical deep learning models. It is worth noting that, compared with directly integrating 3D convolution into the transformer model, the MDvT architecture improves accuracy while reducing the time to train an epoch by approximately 58.54%. To facilitate reproduction of the work in this paper, the model code is available at https://github.com/gloryofroad/MDvT.
This paper presents a formulation for three-dimensional elasto-dynamics with an elliptic crack based on the Laplace and Fourier transforms and the convolution theorem. The dynamic stress intensity factor for the crack is determined by solving a Fredholm integral equation of the first kind. The results of this paper are very close to those given by the two-dimensional dual integral equation method.
Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features from signals, which has certain limitations. In contrast, deep learning techniques have become a central focus of fault diagnosis research owing to their strong fault feature extraction ability and end-to-end diagnosis efficiency. Recently, research that combines the respective advantages of the convolutional neural network (CNN) and the Transformer in local and global feature extraction has shown promise in the field of fault diagnosis. However, the cross-channel convolution mechanism in the CNN and the self-attention calculations in the Transformer make the combined model excessively complex. This complexity results in high computational costs and limited industrial applicability. To tackle these challenges, this paper proposes a lightweight CNN-Transformer named SEFormer for rotating machinery fault diagnosis. First, a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals. Then, an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective. Finally, experimental results on the planetary gearbox dataset and the motor roller bearing dataset show that the proposed framework balances robustness, generalization, and lightweight design compared with recent state-of-the-art fault diagnosis models based on the CNN and Transformer. This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.
In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy declines significantly when the data is occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: the Joint Interpolation Module (JI Module), the Multi-scale Temporal Convolution Network (MS-TCN), and the Suppression Graph Convolutional Network (SGCN). The JI Module completes the spatially occluded skeletal joints using the K-Nearest Neighbors (KNN) interpolation method. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources like BML, ICT-Pollick, and ELMD, and 1000 synthetic gaits generated using STEP-Gen technology, and ELMB, consisting of 3924 gaits, with 1835 labeled with emotions such as "Happy," "Sad," "Angry," and "Neutral." On the standard datasets Emotion-Gait and ELMB, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method significantly mitigates the performance degradation caused by occlusion and achieves significantly higher accuracy than other methods.
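The abstract above does not specify how the KNN interpolation of occluded joints is carried out, so the following is only an illustrative sketch under loudly stated assumptions: neighbours are chosen by distance in a hypothetical reference pose (`template`), and each neighbour "votes" for the missing joint by adding the template offset to its own observed position. The function name `knn_fill` and the 2D toy skeleton are not from the paper.

```python
import math


def knn_fill(joints, template, k=2):
    """Fill occluded joints (None) from the k nearest visible joints.

    joints:   list of (x, y) tuples, or None where a joint is occluded.
    template: list of (x, y) tuples giving an assumed reference pose.
    Hypothetical helper; the paper's actual JI Module may differ.
    """
    visible = [i for i, p in enumerate(joints) if p is not None]
    if not visible:
        raise ValueError("at least one joint must be visible")
    filled = list(joints)
    for i, p in enumerate(joints):
        if p is not None:
            continue
        # Rank visible joints by their distance to joint i in the template.
        nearest = sorted(
            visible, key=lambda j: math.dist(template[i], template[j])
        )[:k]
        # Each neighbour predicts joint i as its own observed position
        # plus the template offset from that neighbour to joint i.
        xs = [joints[j][0] + template[i][0] - template[j][0] for j in nearest]
        ys = [joints[j][1] + template[i][1] - template[j][1] for j in nearest]
        filled[i] = (sum(xs) / len(nearest), sum(ys) / len(nearest))
    return filled


# Toy example: a four-joint chain shifted by (10, 5), with joint 2 occluded.
template = [(0, 0), (1, 0), (2, 0), (3, 0)]
observed = [(10, 5), (11, 5), None, (13, 5)]
print(knn_fill(observed, template, k=2))
```

Averaging offset-corrected votes rather than raw neighbour positions keeps the estimate from collapsing toward the neighbours' centroid, which is one plausible design choice among several.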
Three-dimensional (3D) urban structures play a critical role in informing climate mitigation strategies aimed at the built environment and facilitating sustainable urban development. Regrettably, there exists a significant gap in detailed and consistent data on 3D building space structures with global coverage due to the challenges inherent in the data collection and model calibration processes. In this study, we constructed a global urban structure (GUS-3D) dataset, including building volume, height, and footprint information, at a 500 m spatial resolution using extensive satellite observation products and numerous reference building samples. Our analysis indicated that the total volume of buildings worldwide in 2015 exceeded 1×10^(12) m^(3). Over the 1985 to 2015 period, we observed a slight increase in the magnitude of 3D building volume growth (i.e., it increased from 166.02 km^(3) during the 1985–2000 period to 175.08 km^(3) during the 2000–2015 period), while the expansion magnitudes of the two-dimensional (2D) building footprint (22.51×10^(3) vs 13.29×10^(3) km^(2)) and urban extent (157×10^(3) vs 133.8×10^(3) km^(2)) notably decreased. This trend highlights the significant increase in intensive vertical utilization of urban land. Furthermore, we identified significant heterogeneity in building space provision and inequality across cities worldwide. This inequality is particularly pronounced in many populous Asian cities, which has been overlooked in previous studies on economic inequality. The GUS-3D dataset shows great potential to deepen our understanding of the urban environment and creates new horizons for numerous 3D urban studies.
To address the problem of multi-missile cooperative interception against maneuvering targets at a prespecified impact time and desired Line-of-Sight (LOS) angles in Three-Dimensional (3D) space, this paper proposes a 3D leader-following cooperative interception guidance law. First, in the LOS direction of the leader, an impact time-controlled guidance law is derived based on the fixed-time stability theory, which enables the leader to complete the interception task at a prespecified impact time. Next, in the LOS direction of the followers, by introducing a time consensus tracking error function, a fixed-time consensus tracking guidance law is investigated to guarantee the consensus tracking convergence of the time-to-go. Then, in the direction normal to the LOS, by combining the designed global integral sliding mode surface and the second-order Sliding Mode Control (SMC) theory, an innovative 3D LOS-angle-constrained interception guidance law is developed, which eliminates the reaching phase in traditional sliding mode guidance laws and effectively saves energy. Moreover, it effectively suppresses the chattering phenomenon while avoiding the singularity issue, and compensates online for unknown interference caused by target maneuvering, making it convenient for practical engineering applications. Finally, theoretical proofs and multiple sets of numerical simulation results verify the effectiveness, superiority, and robustness of the investigated guidance law.
Severe ground-level ozone (O_(3)) pollution over major Chinese cities has become one of the most challenging problems, with deleterious effects on human health and the sustainability of society. This study explored the spatiotemporal distribution characteristics of ground-level O_(3) and its precursors based on conventional pollutant and meteorological monitoring data in Zhejiang Province from 2016 to 2021. Then, a high-performance convolutional neural network (CNN) model was established by expanding the moment and the concentration variations to general factors. Finally, the response mechanism of O_(3) to variations in crucial influencing factors is explored by controlling variables and interpolating target variables. The results indicated that the annual average MDA8-90th concentrations in Zhejiang Province are higher in the north and lower in the south. When the wind direction (WD) ranges from east to southwest and the wind speed (WS) ranges between 2 and 3 m/s, higher O_(3) concentrations are prone to occur. At different temperatures (T), the O_(3) concentration first increases and then decreases with increasing NO_(2) concentration, peaking at an NO_(2) concentration of around 0.02 mg/m^(3). The sensitivity of NO_(2) to O_(3) formation is not easily affected by temperature, barometric pressure, or dew point temperature. Additionally, there is a minimum IRNO_(2) at each temperature when the NO_(2) concentration is 0.03 mg/m^(3), and this minimum IRNO_(2) decreases with increasing temperature. The study explores the response mechanism of O_(3) to changes in driving variables, which can provide a scientific foundation and methodological support for the targeted management of O_(3) pollution.
The isolated fracture-vug systems controlled by small-scale strike-slip faults within ultra-deep carbonate rocks of the Tarim Basin exhibit significant exploration potential. The study employs a novel training set incorporating innovative fault labels to train a U-Net-structured CNN model, enabling effective identification of small-scale strike-slip faults through seismic data interpretation. Based on the CNN faults, we analyze the distribution patterns of small-scale strike-slip faults. The small-scale strike-slip faults can be categorized into NNW-trending and NE-trending groups with strike lengths ranging from 200 to 5000 m. The development intensity of small-scale strike-slip faults in the Lower Yingshan Member notably exceeds that in the Upper Member. The Lower and Upper Yingshan members are two distinct mechanical layers with contrasting brittleness characteristics, separated by a low-brittleness layer. The superior brittleness of the Lower Yingshan Member enhances the development intensity of small-scale strike-slip faults compared to the Upper Member, while the low-brittleness layer restricts vertical fault propagation. Fracture-vug systems formed by interactions of two or more small-scale strike-slip faults are larger than those controlled by individual faults. All fracture-vug system sizes show positive correlations with the vertical extents of associated small-scale strike-slip faults; in particular, intersection and approaching fracture-vug systems exhibit accelerated size increases proportional to the vertical extents.
With the emphasis on user privacy and communication security, encrypted traffic has increased dramatically, which brings great challenges to traffic classification. Classification methods for encrypted traffic based on GNNs can deal with encrypted traffic well. However, existing GNN-based approaches ignore the relationship between client or server packets. In this paper, we design a network traffic topology based on GCN, called Flow Mapping Graph (FMG). FMG establishes sequential edges between vertexes by the arrival order of packets and establishes jump-order edges between vertexes by connecting packets in different bursts with the same direction. It not only reflects the time characteristics of the packets but also strengthens the relationship between the client or server packets. Based on FMG, a Traffic Mapping Classification model (TMC-GCN) is designed, which can automatically capture and learn the characteristics and structure information of the top vertex in FMG. The TMC-GCN model is used to classify the encrypted traffic. The encrypted traffic classification problem is thereby transformed into a graph classification problem, which can effectively deal with data from different data sources and application scenarios. By comparing the performance of TMC-GCN with other classical models on four public datasets, including CICIOT2023, ISCXVPN2016, CICAAGM2017, and GraphDapp, the effectiveness of the FMG algorithm is verified. The experimental results show that the accuracy of the TMC-GCN model is 96.13%, the recall is 95.04%, and the F1 score is 94.54%.
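The two edge types described in the abstract above can be sketched in a few lines. This is an illustrative reading, not the authors' construction: the abstract does not say exactly which packets a jump-order edge joins, so the sketch assumes that the last packet of a burst is linked to the first packet of the next burst travelling in the same direction. The function name `build_fmg_edges` and the direction encoding (+1 for client to server, -1 for server to client) are hypothetical.

```python
def build_fmg_edges(directions):
    """Return (sequential_edges, jump_edges) for one flow.

    directions: list of +1 / -1 values, one per packet in arrival order.
    A burst is a maximal run of consecutive packets with the same direction.
    """
    # Sequential edges follow the packet arrival order.
    sequential = [(i, i + 1) for i in range(len(directions) - 1)]

    # Group packet indices into bursts.
    bursts = []
    for i, d in enumerate(directions):
        if bursts and directions[bursts[-1][-1]] == d:
            bursts[-1].append(i)
        else:
            bursts.append([i])

    # Jump-order edges (assumed rule): last packet of the previous burst
    # with the same direction -> first packet of the current burst.
    jump = []
    last_burst = {}
    for burst in bursts:
        d = directions[burst[0]]
        if d in last_burst:
            jump.append((last_burst[d][-1], burst[0]))
        last_burst[d] = burst
    return sequential, jump


# Toy flow: two client packets, one server, one client, two server packets.
seq_edges, jump_edges = build_fmg_edges([1, 1, -1, 1, -1, -1])
print(seq_edges)
print(jump_edges)
```

The jump-order edges skip over the opposite-direction burst in between, which is how the graph can relate packets sent by the same endpoint even when they are not adjacent in time.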
The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. The comparative analysis of DMST-GNODE and baseline models indicates that the DMST-GNODE model demonstrates superior performance across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model's advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
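For readers comparing the error figures quoted above, the two error measures follow their standard definitions. The sketch below is a minimal illustration with made-up traffic values; the abstract does not define its accuracy measure, so that measure is deliberately left out.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared difference."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute differences."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Made-up observed and predicted flows for four time steps.
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 40]
print(rmse(observed, predicted), mae(observed, predicted))
```

RMSE penalises large individual errors more heavily than MAE, so reporting both, as the abstract does, gives a fuller picture of forecast quality.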
Micro-expression (ME) recognition is a complex task that requires advanced techniques to extract informative features from facial expressions. Numerous deep neural networks (DNNs) with convolutional structures have been proposed. However, unlike DNNs, shallow convolutional neural networks often outperform deeper models in mitigating overfitting, particularly with small datasets. Still, many of these methods rely on a single feature for recognition, resulting in an insufficient ability to extract highly effective features. To address this limitation, this paper introduces an Improved Dual-stream Shallow Convolutional Neural Network based on an Extreme Gradient Boosting Algorithm (IDSSCNN-XgBoost) for ME recognition. The proposed method utilizes a dual-stream architecture in which motion vectors (temporal features) are extracted using Optical Flow TV-L1 and subtle changes (spatial features) are amplified via Eulerian Video Magnification (EVM). These features are processed by IDSSCNN, with an attention mechanism applied to refine the extracted effective features. The outputs are then fused, concatenated, and classified using the XgBoost algorithm. This comprehensive approach significantly improves recognition accuracy by leveraging the strengths of both temporal and spatial information, supported by the robust classification power of XgBoost. The proposed method is evaluated on three publicly available ME databases: Chinese Academy of Sciences Micro-expression Database (CASMEII), Spontaneous Micro-Expression Database (SMIC-HS), and Spontaneous Actions and Micro-Movements (SAMM). Experimental results indicate that the proposed model can achieve outstanding results compared to recent models. The accuracies are 79.01%, 69.22%, and 68.99% on CASMEII, SMIC-HS, and SAMM, and the F1-scores are 75.47%, 68.91%, and 63.84%, respectively. The proposed method has the advantages of operational efficiency and less computational time.
Human disturbance activities are one of the main causes of geohazards. Metrics for assessing the ecological impact of roads are numerous and rely on inconsistent criteria. From the perspective of visual observation, environmental damage can be shown by detecting areas without vegetation cover in images taken along roads. To realize this, an end-to-end environmental damage detection model based on a convolutional neural network is proposed. A 50-layer residual network is used to extract the feature map. The initial parameters are optimized by transfer learning. An example application of the method is presented. The dataset, which includes cliff and landslide damage, was collected by us along roads in Shennongjia National Forest Park. Results show an average precision (AP) of 0.4703 for cliff damage and 0.4809 for landslide damage. Compared with YOLOv3, our model shows better accuracy in cliff and landslide detection, although a certain amount of speed is sacrificed.
Liposarcoma is one of the most common soft tissue sarcomas; however, it is still rare compared to other cancers. Due to its rarity, in vitro experiments are an essential approach to elucidate liposarcoma pathobiology. Conventional cell culture-based research (2D cell culture) still plays a pivotal role, although several shortcomings have recently come under discussion. In vivo mouse models are usually adopted for pre-clinical analyses with the expectation of overcoming the issues of 2D cell culture. However, they do not fully recapitulate human dedifferentiated liposarcoma (DDLPS) characteristics. Therefore, three-dimensional (3D) culture systems have become a recent research focus in the cell biology field, with the expectation of overcoming the disadvantages of both 2D cell culture and in vivo animal models and filling the gap between them. Given the rarity of liposarcoma, we believe that 3D cell culture techniques, including 3D cell cultures/co-cultures and Patient-Derived tumor Organoids (PDOs), represent a promising approach to facilitate liposarcoma investigation and to elucidate its molecular mechanisms and support the development of effective therapies. In this review, we first provide a general overview of 3D cell cultures compared to 2D cell cultures. We then focus on one of the recent 3D cell culture applications, PDOs, summarizing and discussing several PDO methodologies. Finally, we discuss the current and future applications of PDOs to sarcoma, particularly in the field of liposarcoma.
Landslide susceptibility mapping (LSM) plays a crucial role in assessing geological risks. Current LSM techniques face a significant challenge in achieving accurate results due to uncertainties associated with regional-scale geotechnical parameters. To explore rainfall-induced LSM, this study proposes a hybrid model that combines a physically-based probabilistic model (PPM) with a convolutional neural network (CNN). The PPM is capable of effectively capturing the spatial distribution of landslides by incorporating the probability of failure (POF), considering the slope stability mechanism under rainfall conditions. This characterizes the variation of POF caused by parameter uncertainties. The CNN was used as a binary classifier to capture the spatial and channel correlation between landslide conditioning factors and the probability of landslide occurrence. An OpenCV image enhancement technique was utilized to extract non-landslide points based on the POF of landslides. The proposed model comprehensively considers physical mechanics when selecting non-landslide samples, effectively filtering out samples that do not adhere to physical principles and reducing the risk of overfitting. The results indicate that, for the landslide case of the Niangniangba area of Gansu Province, China, the proposed PPM-CNN hybrid model achieves a higher prediction accuracy, with an area under the curve (AUC) value of 0.85, compared with the individual CNN model (AUC=0.61) and the PPM (AUC=0.74). This model can also consider the statistical correlation and non-normal probability distributions of model parameters. These results offer practical guidance for future research on rainfall-induced LSM at the regional scale.
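The AUC values compared in the abstract above have a simple probabilistic reading: the AUC is the chance that a randomly chosen landslide sample receives a higher susceptibility score than a randomly chosen non-landslide sample, with ties counted as one half. The sketch below computes it directly from that definition; the function name `pairwise_auc` and the labels and scores are made up for illustration and are not data from the study.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for landslide, 0 for non-landslide.
    scores: predicted susceptibility, higher meaning more likely.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for six map cells.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.6]
print(pairwise_auc(labels, scores))
```

The pairwise form is quadratic in the number of samples, so it suits small illustrative sets; for large maps a rank-based computation would be the usual choice.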
An evolution inequality of Sobolev type involving a nonlinear convolution term is considered. By using the nonlinear capacity method and a contradiction argument, the non-existence of nontrivial local weak solutions is proved.
The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-Score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, representing improvements over the baseline results on the original images. Moreover, the weighted average F1-Score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-Score, with an F1-Score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance compared to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings of this study can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.
Due to self-occlusion and the high degrees of freedom of the hand, estimating 3D hand pose from a single RGB image is a highly challenging problem. Graph convolutional networks (GCNs) use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression. However, GCNs cannot effectively describe the relationships between non-adjacent hand joints. Recently, hypergraph convolutional networks (HGCNs) have received much attention as they can describe multi-dimensional relationships between nodes through hyperedges; therefore, this paper proposes a framework for 3D hand pose estimation based on HGCN, which can better extract correlated relationships between adjacent and non-adjacent hand joints. To overcome the shortcomings of predefined hypergraph structures, a dynamic hypergraph convolutional network is proposed, in which hyperedges are constructed dynamically based on hand joint feature similarity. To better explore the local semantic relationships between nodes, a semantic dynamic hypergraph convolution is proposed. The proposed method is evaluated on publicly available benchmark datasets. Qualitative and quantitative experimental results both show that the proposed HGCN and improved methods for 3D hand pose estimation are better than GCN and achieve state-of-the-art performance compared with existing methods.
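One common way to build hyperedges from feature similarity, which may or may not match the construction used in the paper above, is to give every joint its own hyperedge containing that joint and its k most similar joints. The sketch below follows that assumption and returns a node-by-hyperedge incidence matrix; the function name `dynamic_hyperedges`, the Euclidean distance as the similarity measure, and the toy one-dimensional features are all hypothetical.

```python
import math


def dynamic_hyperedges(features, k):
    """Build an incidence matrix H with H[node][edge] in {0, 1}.

    features: list of equal-length feature tuples, one per hand joint.
    Hyperedge e groups joint e with its k nearest joints in feature space
    (assumed rule; Euclidean distance stands in for feature similarity).
    """
    n = len(features)
    incidence = [[0] * n for _ in range(n)]
    for e in range(n):
        # Rank the other joints by feature distance to the centre joint e.
        others = sorted(
            (j for j in range(n) if j != e),
            key=lambda j: math.dist(features[e], features[j]),
        )
        for j in [e] + others[:k]:
            incidence[j][e] = 1
    return incidence


# Toy features for four joints forming two obvious similarity groups.
toy_features = [(0.0,), (0.1,), (5.0,), (5.2,)]
for row in dynamic_hyperedges(toy_features, k=1):
    print(row)
```

Because the features change from layer to layer and sample to sample, rebuilding the incidence matrix in this way lets joints that are not physically connected share a hyperedge whenever their features are close.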
Aspect-oriented sentiment analysis is a fine-grained sentiment analysis task that aims to analyse the sentiment polarity of specific aspects. Most current research builds graph convolutional networks based on dependency syntax trees, which improves the classification performance of the models to some extent. However, the technical limitations of dependency syntax trees can introduce considerable noise into the model. Meanwhile, it is difficult for a single graph convolutional network to aggregate both semantic and syntactic structural information of nodes, which affects the final sentence classification. To cope with the above problems, this paper proposes a bi-channel graph convolutional network model. The model introduces a phrase structure tree and transforms it into a hierarchical phrase matrix. The adjacency matrix of the dependency syntax tree and the hierarchical phrase matrix are combined as the initial matrix of the graph convolutional network to enhance the syntactic information. The semantic information feature representations of the sentences are obtained by the graph convolutional network with a multi-head attention mechanism and fused to achieve complementary learning of dual-channel features. Experimental results show that the model performs well and improves the accuracy of sentiment classification on three public benchmark datasets, namely Rest14, Lap14 and Twitter.
Funding: Supported by the National Science Foundation of China (No. 81800878); the Interdisciplinary Program of Shanghai Jiao Tong University (No. YG2017QN24); the Key Technological Research Projects of Songjiang District (No. 18sjkjgg24); and the Bethune Langmu Ophthalmological Research Fund for Young and Middle-aged People (No. BJ-LM2018002J).
Abstract: AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. To address the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. This network ignores the correlations between corresponding positions in spatially adjacent images. Thus, we proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy is 99.56%, the Kappa coefficient is 98.47%, and the F1 score for retinal fluid is 95.50%. CONCLUSION: OCT image segmentation algorithms based on deep learning are primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors and can provide doctors with more accurate diagnostic data.
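The three evaluation measures named in the abstract above (overall accuracy, Kappa coefficient, and per-class F1 score) can all be derived from a confusion matrix. The following is a minimal sketch using the standard textbook definitions; it is not the authors' evaluation code, and the function names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    m = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m


def accuracy(m):
    """Fraction of samples (e.g. pixels) on the diagonal."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total


def cohen_kappa(m):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(m)
    total = sum(sum(row) for row in m)
    p_observed = sum(m[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    p_expected = sum(
        sum(m[i]) * sum(m[j][i] for j in range(n)) for i in range(n)
    ) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def f1_score(m, cls):
    """F1 for one class, e.g. the 'retinal fluid' label."""
    tp = m[cls][cls]
    fp = sum(m[j][cls] for j in range(len(m))) - tp
    fn = sum(m[cls]) - tp
    return 2 * tp / (2 * tp + fp + fn)


# Toy example: class 1 plays the role of the minority (fluid) class.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(accuracy(cm), cohen_kappa(cm), f1_score(cm, cls=1))
```

Under strong class imbalance, accuracy alone can look high even when the minority class is segmented poorly, which is one reason to report Kappa and the per-class F1 alongside it.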
Funding: Supported by the National Key Research and Development Program of China (Nos. 2017YFC0602203, 2017YFC0601606, 2017YFC0601305 and 2017YFC0602000); the National Science and Technology Major Project task (No. 2016ZX05027-002-003); the National Natural Science Foundation of China (No. 41604098); and the State Key Program of National Natural Science of China (No. 41430322).
Abstract: Three-dimensional (3D) gravitational and magnetic exploration is performed using aerial measurement tools; however, it is difficult to design the measuring height and to construct a joint-interpretation scheme. At present, the height in such experiments is set according to the measurement scale, and the distribution characteristics of anomalies are not fully considered. Here, we present the idea of using the attenuation characteristics of a singular-value spectrum to evaluate the contributions of various measurement heights and multi-height combinations to inversion, so that appropriate measuring heights and the number of measurement heights can be designed correctly and reasonably. The joint-gradient Euler-deconvolution method can accurately obtain the distribution of geological bodies from 3D gravitational and magnetic data at an improved resolution, and experimental tests confirm these findings. Therefore, an actual 3D aeromagnetic-data-acquisition and inversion test was carried out in the vicinity of the Zhurihe Iron Mine in Inner Mongolia. The flight-height difference was set to 60 m, and the specific distribution of lodes was obtained by the joint-gradient Euler-deconvolution method. This provides a reliable basis for future detailed exploration and shows that the methods presented in this paper have good practical application effects and prospects.
Funding: Funded by the National Science and Technology Basic Resource Investigation Program (No. 2017FY100900).
Abstract: Hyperspectral images carry numerous spectral bands, and their wealth of band data is a valuable source of information for the accurate classification of ground objects. Three-dimensional (3D) convolution, although an excellent spectral information extraction method, is limited by its huge number of parameters and long model training time. To allow better integration of 3D convolution with the most popular transformer models currently available, a new architecture called mobile 3D convolutional vision transformer (MDvT) is proposed. The MDvT introduces an inverted residual structure to reduce the number of model parameters and balance the data mining efficiency of low-dimensional data input. In addition, a square patch is used to cut the sequence of tokens to accelerate model operation. Through extensive experiments, we evaluated the overall classification performance of the proposed MDvT on the WHU-Hi and Pavia University datasets and demonstrated significant improvements in classification accuracy and model runtime compared with classical deep learning models. It is worth noting that, compared with directly integrating 3D convolution into the transformer model, the MDvT architecture improves the accuracy while reducing the time to train an epoch by approximately 58.54%. To facilitate the reproduction of the work in this paper, the model code is available at https://github.com/gloryofroad/MDvT.
Funding: Project supported by the National Natural Science Foundation of China (K19672007).
Abstract: This paper presents a formulation for three-dimensional elasto-dynamics with an elliptic crack based on the Laplace and Fourier transforms and the convolution theorem. The dynamic stress intensity factor for the crack is determined by solving a Fredholm integral equation of the first kind. The results of this paper are very close to those given by the two-dimensional dual integral equation method.
Funding: Supported by the National Natural Science Foundation of China (No. 52277055).
Abstract: Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features from signals, which has certain limitations. Conversely, deep learning techniques have become a central focus of research in fault diagnosis owing to their strong fault feature extraction ability and end-to-end diagnosis efficiency. Recently, research on combining the convolutional neural network (CNN) and the Transformer, which exploits their respective advantages in local and global feature extraction, has shown promise in the field of fault diagnosis. However, the cross-channel convolution mechanism in the CNN and the self-attention calculations in the Transformer contribute to excessive complexity in the combined model. This complexity results in high computational costs and limited industrial applicability. To tackle these challenges, this paper proposes a lightweight CNN-Transformer named SEFormer for rotating machinery fault diagnosis. First, a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals. Then, an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective. Finally, experimental results on the planetary gearbox dataset and the motor roller bearing dataset show that the proposed framework balances robustness, generalization and lightweight design compared with recent state-of-the-art fault diagnosis models based on CNN and Transformer. This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.
Funding: Supported by the National Natural Science Foundation of China (62272049, 62236006, 62172045) and the Key Projects of Beijing Union University (ZKZD202301).
Abstract: In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy declines significantly when the data is occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: the Joint Interpolation Module (JI Module), the Multi-scale Temporal Convolution Network (MS-TCN), and the Suppression Graph Convolutional Network (SGCN). The JI Module completes the spatially occluded skeletal joints using the K-Nearest Neighbors (KNN) interpolation method. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources like BML, ICT-Pollick, and ELMD, and 1000 synthetic gaits generated using STEP-Gen technology; and ELMB, consisting of 3924 gaits, with 1835 labeled with emotions such as “Happy,” “Sad,” “Angry,” and “Neutral.” On the standard datasets Emotion-Gait and ELMB, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method significantly mitigates the performance degradation caused by occlusion, and its accuracy is significantly higher than that of other methods.
Funding: Supported by the National Science Fund for Distinguished Young Scholars (42225107); the National Natural Science Foundation of China (42001326, 42371414, 42171409, and 42271419); the Natural Science Foundation of Guangdong Province of China (2022A1515012207); and the Basic and Applied Basic Research Project of Guangzhou Science and Technology Planning (202201011539).
Abstract: Three-dimensional (3D) urban structures play a critical role in informing climate mitigation strategies aimed at the built environment and in facilitating sustainable urban development. Regrettably, there is a significant gap in detailed and consistent data on 3D building space structures with global coverage, owing to the challenges inherent in the data collection and model calibration processes. In this study, we constructed a global urban structure (GUS-3D) dataset, including building volume, height, and footprint information, at a 500 m spatial resolution using extensive satellite observation products and numerous reference building samples. Our analysis indicated that the total volume of buildings worldwide in 2015 exceeded 1×10^(12) m^(3). Over the 1985 to 2015 period, we observed a slight increase in the magnitude of 3D building volume growth (i.e., it increased from 166.02 km^(3) during the 1985–2000 period to 175.08 km^(3) during the 2000–2015 period), while the expansion magnitudes of the two-dimensional (2D) building footprint (22.51×10^(3) vs. 13.29×10^(3) km^(2)) and urban extent (157×10^(3) vs. 133.8×10^(3) km^(2)) notably decreased. This trend highlights the significant increase in intensive vertical utilization of urban land. Furthermore, we identified significant heterogeneity in building space provision and inequality across cities worldwide. This inequality is particularly pronounced in many populous Asian cities, which has been overlooked in previous studies on economic inequality. The GUS-3D dataset shows great potential to deepen our understanding of the urban environment and creates new horizons for numerous 3D urban studies.
Abstract: To address the problem of multi-missile cooperative interception against maneuvering targets at a prespecified impact time and desired Line-of-Sight (LOS) angles in Three-Dimensional (3D) space, this paper proposes a 3D leader-following cooperative interception guidance law. First, in the LOS direction of the leader, an impact-time-controlled guidance law is derived based on fixed-time stability theory, which enables the leader to complete the interception task at a prespecified impact time. Next, in the LOS direction of the followers, a fixed-time consensus tracking guidance law is developed by introducing a time consensus tracking error function, guaranteeing the consensus tracking convergence of the time-to-go. Then, in the direction normal to the LOS, by combining the designed global integral sliding mode surface with second-order Sliding Mode Control (SMC) theory, an innovative 3D LOS-angle-constrained interception guidance law is developed, which eliminates the reaching phase of traditional sliding mode guidance laws and reduces energy consumption. Moreover, it effectively suppresses the chattering phenomenon while avoiding the singularity issue, and compensates online for unknown interference caused by target maneuvering, making it convenient for practical engineering applications. Finally, theoretical proofs and multiple sets of numerical simulation results verify the effectiveness, superiority, and robustness of the proposed guidance law.
Funding: Supported by the National Key Research and Development Program of China (Nos. 2022YFC3702000 and 2022YFC3703500) and the Key R&D Project of Zhejiang Province (No. 2022C03146).
Abstract: Severe ground-level ozone (O_(3)) pollution over major Chinese cities has become one of the most challenging problems, with deleterious effects on human health and the sustainability of society. This study explored the spatiotemporal distribution characteristics of ground-level O_(3) and its precursors based on conventional pollutant and meteorological monitoring data in Zhejiang Province from 2016 to 2021. Then, a high-performance convolutional neural network (CNN) model was established by expanding the moment and the concentration variations to general factors. Finally, the response mechanism of O_(3) to variations in crucial influencing factors was explored by controlling variables and interpolating target variables. The results indicated that the annual average MDA8-90th concentrations in Zhejiang Province are higher in the north and lower in the south. When the wind direction (WD) ranges from east to southwest and the wind speed (WS) is between 2 and 3 m/s, higher O_(3) concentrations are prone to occur. At different temperatures (T), the O_(3) concentration first increases and then decreases with increasing NO_(2) concentration, peaking at an NO_(2) concentration of around 0.02 mg/m^(3). The sensitivity of NO_(2) to O_(3) formation is not easily affected by temperature, barometric pressure or dew point temperature. Additionally, there is a minimum IRNO_(2) at each temperature when the NO_(2) concentration is 0.03 mg/m^(3), and this minimum IRNO_(2) decreases with increasing temperature. The study explores the response mechanism of O_(3) to changes in driving variables, which can provide a scientific foundation and methodological support for the targeted management of O_(3) pollution.
Funding: Supported by the National Natural Science Foundation of China (No. U21B2062).
Abstract: The isolated fracture-vug systems controlled by small-scale strike-slip faults within ultra-deep carbonate rocks of the Tarim Basin exhibit significant exploration potential. The study employs a novel training set incorporating innovative fault labels to train a U-Net-structured CNN model, enabling effective identification of small-scale strike-slip faults through seismic data interpretation. Based on the CNN-identified faults, we analyze the distribution patterns of small-scale strike-slip faults. These faults can be categorized into NNW-trending and NE-trending groups with strike lengths ranging from 200 to 5000 m. The development intensity of small-scale strike-slip faults in the Lower Yingshan Member notably exceeds that in the Upper Member. The Lower and Upper Yingshan members are two distinct mechanical layers with contrasting brittleness characteristics, separated by a low-brittleness layer. The superior brittleness of the Lower Yingshan Member enhances the development intensity of small-scale strike-slip faults compared with the Upper Member, while the low-brittleness layer restricts vertical fault propagation. Fracture-vug systems formed by interactions of two or more small-scale strike-slip faults are larger than those controlled by individual faults. All fracture-vug system sizes correlate positively with the vertical extents of the associated small-scale strike-slip faults; in particular, intersection and approaching fracture-vug systems exhibit accelerated size increases in proportion to the vertical extents.
Funding: Supported by the National Key Research and Development Program of China (No. 2023YFA1009500).
Abstract: With the growing emphasis on user privacy and communication security, encrypted traffic has increased dramatically, which brings great challenges to traffic classification. Classification methods for encrypted traffic based on graph neural networks (GNNs) can handle encrypted traffic well. However, existing GNN-based approaches ignore the relationship between client or server packets. In this paper, we design a network traffic topology based on the graph convolutional network (GCN), called the Flow Mapping Graph (FMG). FMG establishes sequential edges between vertices according to the arrival order of packets and establishes jump-order edges between vertices by connecting packets in different bursts with the same direction. It not only reflects the time characteristics of the packets but also strengthens the relationship between the client or server packets. Based on FMG, a Traffic Mapping Classification model (TMC-GCN) is designed, which can automatically capture and learn the characteristics and structure information of the top vertex in FMG. The TMC-GCN model is used to classify the encrypted traffic. The encrypted traffic classification problem is thereby transformed into a graph classification problem, which can effectively deal with data from different data sources and application scenarios. The effectiveness of the FMG algorithm is verified by comparing the performance of TMC-GCN with other classical models on four public datasets: CICIOT2023, ISCXVPN2016, CICAAGM2017, and GraphDapp. The experimental results show that the accuracy of the TMC-GCN model is 96.13%, the recall is 95.04%, and the F1 score is 94.54%.
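The two edge types described in the abstract above can be illustrated with a small sketch. The abstract does not specify exactly which packets the jump-order edges join, so the code below assumes one plausible reading: a burst is a maximal run of consecutive packets in the same direction, and a jump-order edge links the last packet of a burst to the first packet of the next burst with the same direction. The function name `build_flow_graph` and the +1/-1 direction encoding are hypothetical.

```python
def build_flow_graph(directions):
    """Build the edge lists of a flow graph from packet directions.

    directions: list of +1 (client to server) or -1 (server to client),
    given in packet arrival order. Vertex i is the i-th packet.
    Returns (sequential_edges, jump_edges).
    """
    n = len(directions)

    # Sequential edges follow arrival order: packet i -> packet i + 1.
    sequential = [(i, i + 1) for i in range(n - 1)]

    # Split the flow into bursts: maximal runs with the same direction.
    bursts = []
    for i, d in enumerate(directions):
        if bursts and directions[bursts[-1][-1]] == d:
            bursts[-1].append(i)
        else:
            bursts.append([i])

    # Jump-order edges (assumed rule): last packet of the previous
    # same-direction burst -> first packet of the current burst.
    jump = []
    last_burst_by_direction = {}
    for burst in bursts:
        d = directions[burst[0]]
        if d in last_burst_by_direction:
            jump.append((last_burst_by_direction[d][-1], burst[0]))
        last_burst_by_direction[d] = burst

    return sequential, jump


# Two client packets, two server packets, one client, one server.
seq_edges, jump_edges = build_flow_graph([1, 1, -1, -1, 1, -1])
print(seq_edges)
print(jump_edges)
```

The resulting edge lists could then be turned into an adjacency matrix and passed, together with per-packet features, to any graph classification model.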
Abstract: The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. A comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE performs best across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend at 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model’s advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These results indicate that DMST-GNODE outperforms the baseline models, with higher accuracy and lower errors across different time intervals and datasets.
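For reference, the two error measures reported in the abstract above have standard definitions, sketched below for flat lists of observed and predicted values. This is a generic illustration, not the authors' evaluation script; the function names and sample values are hypothetical, and the abstract's "accuracy" measure is not reproduced because its definition is not given there.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: penalises large deviations more strongly."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the deviations."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares each error before averaging, a single large miss raises it more than it raises MAE, so the two values are usually read together.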
Funding: Supported by the Key Research and Development Program of Jiangsu Province under Grant BE2022059-3; CTBC Bank through the Industry-Academia Cooperation Project; and the Ministry of Science and Technology of Taiwan through Grants MOST-108-2218-E-002-055, MOST-109-2223-E-009-002-MY3, MOST-109-2218-E-009-025, and MOST431109-2218-E-002-015.
Abstract: Micro-expression (ME) recognition is a complex task that requires advanced techniques to extract informative features from facial expressions. Numerous deep neural networks (DNNs) with convolutional structures have been proposed. However, shallow convolutional neural networks often outperform deeper models in mitigating overfitting, particularly with small datasets. Still, many of these methods rely on a single feature for recognition, resulting in an insufficient ability to extract highly effective features. To address this limitation, this paper introduces an Improved Dual-stream Shallow Convolutional Neural Network based on an Extreme Gradient Boosting Algorithm (IDSSCNN-XgBoost) for ME recognition. The proposed method uses a dual-stream architecture in which motion vectors (temporal features) are extracted using TV-L1 optical flow and subtle changes (spatial features) are amplified via Eulerian Video Magnification (EVM). These features are processed by IDSSCNN, with an attention mechanism applied to refine the extracted features. The outputs are then fused, concatenated, and classified using the XgBoost algorithm. This approach significantly improves recognition accuracy by leveraging both temporal and spatial information, supported by the robust classification power of XgBoost. The proposed method is evaluated on three publicly available ME databases: the Chinese Academy of Sciences Micro-expression Database (CASMEII), the Spontaneous Micro-Expression Database (SMIC-HS), and Spontaneous Actions and Micro-Movements (SAMM). Experimental results indicate that the proposed model achieves outstanding results compared with recent models. The accuracies are 79.01%, 69.22%, and 68.99% on CASMEII, SMIC-HS, and SAMM, and the F1-scores are 75.47%, 68.91%, and 63.84%, respectively. The proposed method also has the advantages of operational efficiency and lower computational time.
Abstract: Human disturbance is one of the main causes of geohazards. Ecological impact assessment metrics for roads are numerous and use inconsistent criteria. From the perspective of visual observation, environmental damage can be revealed by detecting areas without vegetation cover in images taken along roads. To this end, an end-to-end environmental damage detection model based on a convolutional neural network is proposed. A 50-layer residual network is used to extract the feature map. The initial parameters are optimized by transfer learning. An example application of this method is presented. The dataset, which includes cliff and landslide damage, was collected by the authors along roads in Shennongjia National Forest Park. The results show an average precision (AP) of 0.4703 for cliff damage and 0.4809 for landslide damage. Compared with YOLOv3, our model shows better accuracy in cliff and landslide detection, although a certain amount of speed is sacrificed.
Abstract: Liposarcoma is one of the most common soft tissue sarcomas; however, it is still rare compared with other cancers. Due to its rarity, in vitro experiments are an essential approach to elucidating liposarcoma pathobiology. Conventional cell culture-based research (2D cell culture) still plays a pivotal role, although several of its shortcomings have recently come under discussion. In vivo mouse models are usually adopted for pre-clinical analyses with the expectation of overcoming the issues of 2D cell culture. However, they do not fully recapitulate the characteristics of human dedifferentiated liposarcoma (DDLPS). Therefore, three-dimensional (3D) culture systems have become a recent research focus in cell biology, with the expectation of overcoming the disadvantages of both 2D cell culture and in vivo animal models and filling the gap between them. Given the rarity of liposarcoma, we believe that 3D cell culture techniques, including 3D cell cultures/co-cultures and Patient-Derived tumor Organoids (PDOs), represent a promising approach to facilitate liposarcoma investigation, elucidate its molecular mechanisms, and support the development of effective therapies. In this review, we first provide a general overview of 3D cell cultures compared with 2D cell cultures. We then focus on one of the recent 3D cell culture applications, PDOs, summarizing and discussing several PDO methodologies. Finally, we discuss the current and future applications of PDOs to sarcoma, particularly in the field of liposarcoma.
Funding: Funding support from the National Natural Science Foundation of China (Grant Nos. U22A20594, 52079045). Hong-Zhi Cui acknowledges the financial support of the China Scholarship Council (Grant No. CSC: 202206710014) for his research at Universitat Politecnica de Catalunya, Barcelona.
Abstract: Landslide susceptibility mapping (LSM) plays a crucial role in assessing geological risks. Current LSM techniques face a significant challenge in achieving accurate results due to uncertainties associated with regional-scale geotechnical parameters. To explore rainfall-induced LSM, this study proposes a hybrid model that combines a physically-based probabilistic model (PPM) with a convolutional neural network (CNN). The PPM effectively captures the spatial distribution of landslides by incorporating the probability of failure (POF), which accounts for the slope stability mechanism under rainfall conditions. This characterizes the variation of POF caused by parameter uncertainties. The CNN was used as a binary classifier to capture the spatial and channel correlation between landslide conditioning factors and the probability of landslide occurrence. An OpenCV image enhancement technique was used to extract non-landslide points based on the POF of landslides. The proposed model comprehensively considers physical mechanics when selecting non-landslide samples, effectively filtering out samples that do not adhere to physical principles and reducing the risk of overfitting. The results indicate that, for the landslide case of the Niangniangba area of Gansu Province, China, the proposed PPM-CNN hybrid model achieves a higher prediction accuracy, with an area under the curve (AUC) of 0.85, than the individual CNN model (AUC=0.61) and the PPM (AUC=0.74). This model can also consider the statistical correlation and non-normal probability distributions of model parameters. These results offer practical guidance for future research on rainfall-induced LSM at the regional scale.
Funding: Supported by the Scientific Research Fund of Hunan Provincial Education Department (23A0361).
Abstract: An evolution inequality of Sobolev type involving a nonlinear convolution term is considered. By using the nonlinear capacity method and a contradiction argument, the non-existence of a nontrivial local weak solution is proved.
Abstract: The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-Score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, all improvements over the baseline results on the original images. Moreover, the weighted average F1-Score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-Score, with an F1-Score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings of this study can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
Funding: The National Key Research and Development Program of China (No. 2021ZD0111902) and the National Natural Science Foundation of China (Nos. 62172022 and U21B2038).
Abstract: Due to self-occlusion and a high degree of freedom, estimating 3D hand pose from a single RGB image is a highly challenging problem. Graph convolutional networks (GCNs) use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression. However, GCNs cannot effectively describe the relationships between non-adjacent hand joints. Recently, hypergraph convolutional networks (HGCNs) have received much attention because they can describe multi-dimensional relationships between nodes through hyperedges. This paper therefore proposes a framework for 3D hand pose estimation based on HGCN, which can better extract correlated relationships between adjacent and non-adjacent hand joints. To overcome the shortcomings of predefined hypergraph structures, a dynamic hypergraph convolutional network is proposed, in which hyperedges are constructed dynamically based on hand joint feature similarity. To better explore the local semantic relationships between nodes, a semantic dynamic hypergraph convolution is proposed. The proposed method is evaluated on publicly available benchmark datasets. Qualitative and quantitative experimental results both show that the proposed HGCN and improved methods for 3D hand pose estimation are better than GCN and achieve state-of-the-art performance compared with existing methods.
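To make the idea of similarity-based hyperedges in the abstract above concrete, here is a minimal sketch. The abstract does not say how similarity is measured or how hyperedges are formed, so the code assumes a common construction: each joint spawns one hyperedge containing itself and its k nearest neighbours in feature space (Euclidean distance). The function names and toy features are hypothetical.

```python
import math


def knn_hyperedges(features, k):
    """One hyperedge per node: the node plus its k nearest neighbours.

    features: list of equal-length feature vectors, one per joint.
    Returns a list of hyperedges, each a sorted list of node indices.
    """
    n = len(features)
    edges = []
    for i in range(n):
        # Rank every other node by Euclidean distance to node i.
        ranked = sorted(
            (math.dist(features[i], features[j]), j)
            for j in range(n) if j != i
        )
        edges.append(sorted([i] + [j for _, j in ranked[:k]]))
    return edges


def incidence_matrix(edges, n):
    """H[v][e] = 1 if vertex v belongs to hyperedge e, else 0."""
    return [[1 if v in edge else 0 for edge in edges] for v in range(n)]


# Four toy joints forming two tight clusters in feature space.
joint_features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
hyperedges = knn_hyperedges(joint_features, k=1)
H = incidence_matrix(hyperedges, n=len(joint_features))
print(hyperedges)
print(H)
```

Because the hyperedges are recomputed from the current features, joints that are far apart on the skeleton can still share a hyperedge when their features are similar, which a fixed skeleton graph cannot express.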
Abstract: Aspect-oriented sentiment analysis is a fine-grained sentiment analysis task that aims to analyse the sentiment polarity of specific aspects. Most current research builds graph convolutional networks based on dependency syntax trees, which improves the classification performance of the models to some extent. However, the technical limitations of dependency syntax trees can introduce considerable noise into the model. Meanwhile, it is difficult for a single graph convolutional network to aggregate both the semantic and the syntactic structural information of nodes, which affects the final sentence classification. To address these problems, this paper proposes a bi-channel graph convolutional network model. The model introduces a phrase structure tree and transforms it into a hierarchical phrase matrix. The adjacency matrix of the dependency syntax tree and the hierarchical phrase matrix are combined as the initial matrix of the graph convolutional network to enhance the syntactic information. The semantic feature representations of the sentences are obtained by the graph convolutional network with a multi-head attention mechanism and fused to achieve complementary learning of dual-channel features. Experimental results show that the model performs well and improves the accuracy of sentiment classification on three public benchmark datasets, namely Rest14, Lap14 and Twitter.