Funding: Supported by the Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004) and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. RS-2022-00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)).
Abstract: Two-dimensional endoscopic images are susceptible to interference such as specular reflections and monotonous texture illumination, which hinders accurate three-dimensional lesion reconstruction by surgical robots. This study proposes a novel end-to-end disparity estimation model to address these challenges. The approach combines a Pseudo-Siamese neural network architecture with pyramid dilated convolutions, integrating multi-scale image information to improve robustness against lighting interference. The resulting Pseudo-Siamese structure-based disparity regression model simplifies left-right image comparison, improving accuracy and efficiency. The model was evaluated on a dataset of stereo endoscopic videos captured by the Da Vinci surgical robot, comprising simulated silicone heart sequences and real heart video data. Experimental results demonstrate a significant improvement in the network’s resistance to lighting interference without a substantial increase in parameters. Moreover, the model converged faster during training, contributing to the overall performance gain. This study advances the accuracy of endoscopic image processing and has potential implications for surgical robot applications in complex environments.
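The multi-scale idea behind pyramid dilated convolutions can be illustrated with a minimal one-dimensional sketch. This is not the authors' network: the function names, the shared kernel, and the dilation rates (1, 2, 4) are assumptions chosen only to show how the same kernel applied at several dilation rates covers progressively wider context without adding parameters.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation).

    Assumes an odd-length kernel so the output stays aligned with the input.
    """
    k = len(kernel)
    half = (k // 2) * dilation
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def pyramid_features(signal, kernel, dilations=(1, 2, 4)):
    """Apply one kernel at several dilation rates (hypothetical rates)
    and stack the responses into a per-position feature vector."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for d in (1, 2, 4):
        print(d, dilated_conv1d(impulse, [1, 1, 1], d))
```

With a three-tap kernel, a dilation rate of d spans 2d + 1 input positions, so stacking several rates gives each position a view of both fine and coarse context.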
Funding: Supported by the Science and Technology Development Fund of Macao, Macao SAR (Nos. 0021/2023/RIA1 and 0046/2021/A), the Faculty Research Grant of Macao University of Science and Technology (No. FRG-22-103-FIE), and the National Natural Science Foundation of China (Nos. 61872167 and 61502205).
Abstract: Although Convolutional Neural Networks (CNNs) have achieved remarkable success in image classification, most CNNs are trained on image datasets in the Red-Green-Blue (RGB) color space, one of the most commonly used color spaces. The existing literature on how the choice of color space influences CNN performance is limited. This paper explores the impact of different color spaces on image classification using CNNs. We compare the performance of five CNN models with different convolution operations and numbers of layers on four image datasets, each converted to nine color spaces. We find that color space selection can significantly affect classification accuracy, and that some classes are more sensitive to color space changes than others. Different color spaces may express different image features, such as brightness, saturation, and hue, to different degrees. To leverage the complementary information from different color spaces, we propose a pseudo-Siamese network that fuses two color spaces without modifying the network architecture. Our experiments show that the proposed model can outperform the single-color-space models on most datasets. We also find that the method is simple, flexible, and compatible with any CNN and image dataset.
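As a minimal sketch of the input side of such a two-branch setup, the snippet below converts one RGB image into two color spaces using Python's standard `colorsys` module, yielding one input per branch. The choice of HSV and YIQ, the function name `to_color_space_pair`, and the nested-list image layout are assumptions for illustration; the abstract does not state which two of the nine color spaces are fused or how the fusion is performed.

```python
import colorsys


def to_color_space_pair(rgb_image):
    """Convert an RGB image into an (HSV, YIQ) pair of images.

    rgb_image: rows of (r, g, b) tuples with channels scaled to [0, 1].
    HSV and YIQ are hypothetical choices, not the paper's stated pair.
    """
    hsv = [[colorsys.rgb_to_hsv(*px) for px in row] for row in rgb_image]
    yiq = [[colorsys.rgb_to_yiq(*px) for px in row] for row in rgb_image]
    return hsv, yiq


if __name__ == "__main__":
    image = [[(1.0, 0.0, 0.0), (0.5, 0.5, 0.5)]]
    branch_a_input, branch_b_input = to_color_space_pair(image)
    print(branch_a_input)
    print(branch_b_input)
```

Each returned image would feed its own branch; because the two branches of a pseudo-Siamese network do not share weights, each can specialize in the features its color space expresses best.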
Funding: Supported by the National Key Research and Development Program [grant number: 2022YFF1303301], the Open Foundation of the Key Laboratory of Coupling Process and Effect of Natural Resources Elements [grant number: 2022KFKTC001], the National Natural Science Foundation of China [grant number: 42271480], and the Fundamental Research Funds for the Central Universities [grant numbers: 2023ZKPYDC10, BBJ2023026].
Abstract: Automatic extraction of tailing ponds from Very High-Resolution (VHR) remotely sensed images is vital for mineral resource management. This study proposes a Pseudo-Siamese Visual Geometry Group Encoder-Decoder network (PSVED) to achieve high-accuracy tailing pond extraction from VHR images. First, handcrafted feature (HCF) images are calculated from the VHR images using an index calculation algorithm, highlighting the tailing ponds' signals. Second, considering the information gap between VHR images and HCF images, a Pseudo-Siamese Visual Geometry Group (Pseudo-Siamese VGG) network is used to extract independent and representative deep semantic features from the VHR images and the HCF images, respectively. Third, a deep supervision mechanism is attached to handle the optimization problem of vanishing or exploding gradients. A self-made tailing pond extraction dataset (TPSet), produced from Gaofen-6 images of part of Hebei Province, China, was used for the experiments. The results show that the proposed method achieves the best visual performance and accuracy for tailing pond extraction among all tested methods, while its running time remains at the same level as that of the other methods. This study has practical significance for automatically extracting tailing ponds from VHR images, which benefits tailing pond management and monitoring.
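The first step, deriving a handcrafted feature image from spectral bands, can be sketched with a generic normalized-difference index. The abstract does not name the index it uses, so the formula (a - b) / (a + b), the function name `normalized_difference`, and the band values below are assumptions for illustration only.

```python
def normalized_difference(band_a, band_b, eps=1e-12):
    """Per-pixel (a - b) / (a + b) for two equally sized 2D bands.

    Pixels whose denominator is (near) zero are set to 0.0.
    This generic index is a stand-in, not the paper's stated index.
    """
    return [
        [
            (a - b) / (a + b) if abs(a + b) > eps else 0.0
            for a, b in zip(row_a, row_b)
        ]
        for row_a, row_b in zip(band_a, band_b)
    ]


if __name__ == "__main__":
    # Hypothetical reflectance values for two bands of a 2x2 tile.
    green = [[0.6, 0.2], [0.0, 0.5]]
    nir = [[0.2, 0.6], [0.0, 0.5]]
    hcf_image = normalized_difference(green, nir)
    print(hcf_image)
```

The resulting index image would be passed to one encoder branch while the original VHR image goes to the other, which is why the two branches are kept separate rather than sharing weights.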