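Abstract: The main carbide types in M50 bearing steel are MC, M2C, and M23C6. Under scanning electron microscopy (SEM), the three carbides differ markedly in shape, size, and distribution within the material. Some carbides are large and unevenly distributed. When the bearing is under load, these carbides become regions of stress concentration and degrade the bearing's fatigue performance. To obtain carbide information efficiently, an improved Mask Region-based Convolutional Neural Network (Mask R-CNN) model is proposed that can identify the three carbide types in SEM images in batches and determine their sizes and distributions. The image and numerical outputs of the network show that M2C carbides in M50 bearing steel are large and unevenly distributed, whereas MC carbides, which are the largest overall, and M23C6 carbides, which are the smallest, are distributed relatively uniformly.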
Funding: the Industry-University-Research Cooperation Fund Project of the Eighth Research Institute of China Aerospace Science and Technology Corporation (No. USCAST2021-5).
Abstract: Recent multimedia and computer vision research has focused on analyzing human behavior and activity using images. Skeleton estimation, known as pose estimation, has received significant attention. For human pose estimation, deep learning approaches primarily emphasize keypoint features. However, for occluded or incomplete poses, the keypoint feature alone is insufficient, especially when there are multiple humans in a single frame. Other features, such as the body border and visibility conditions, can contribute to pose estimation in addition to the keypoint feature. Our model framework integrates multiple features via a mask region-based convolutional neural network (Mask-RCNN): the human body mask features, which can serve as a constraint on keypoint location estimation, the body keypoint features, and the keypoint visibility. A sequential multi-feature learning setup is formed to share multiple features across the structure, whereas in Mask-RCNN the only feature that can be shared through the system is the region-of-interest feature. By two-way up-scaling with a shared-weight process to produce the mask, we address the problems of improper segmentation, small intrusion, and object loss that arise when Mask-RCNN is used for instance segmentation. Accuracy is measured by the percentage of correct keypoints, and our model identifies 86.1% of the keypoints correctly.
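The accuracy metric named in this abstract, the percentage of correct keypoints, can be sketched as follows. The abstract does not state the distance threshold or the normalization length, so `alpha`, `ref_length`, and the toy coordinates below are illustrative assumptions rather than the authors' settings.

```python
import math


def pck(pred, truth, ref_length, alpha=0.5):
    """Fraction of predicted keypoints lying within alpha * ref_length
    (Euclidean distance) of their ground-truth positions.

    ref_length and alpha are assumed parameters; the abstract does not
    specify which normalization the authors used.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    threshold = alpha * ref_length
    correct = sum(
        1
        for (px, py), (tx, ty) in zip(pred, truth)
        if math.hypot(px - tx, py - ty) <= threshold
    )
    return correct / len(pred)


# Made-up coordinates, for illustration only.
truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
pred = [(11.0, 10.0), (20.0, 23.0), (30.0, 30.0), (48.0, 40.0)]
score = pck(pred, truth, ref_length=10.0, alpha=0.5)
```

Here three of the four predictions fall within the 5-pixel threshold, so `score` is 0.75.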
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 62172436, 62102452), the National Key Research and Development Program of China (2023YFB3106100, 2021YFB3100100), and the Natural Science Foundation of Shaanxi Province (2023-JC-YB-584).
Abstract: With the increasing awareness of privacy protection and the improvement of relevant laws, federated learning has gradually become a new choice for cross-agency and cross-device machine learning. To address the privacy leakage, high computational overhead, and high traffic of some federated learning schemes, this paper proposes a multiplicative double privacy mask algorithm that is convenient for homomorphic addition aggregation. The combination of homomorphic encryption and secret sharing ensures that the server cannot compromise user privacy from the private gradients uploaded by the participants. At the same time, the proposed TQRR (Top-Q-Random-R) gradient selection algorithm filters the gradients to be encrypted and uploaded, which reduces computing overhead by 51.78% and traffic by 64.87% while preserving the accuracy of the model. This makes the privacy-preserving federated learning framework lighter, so that it can adapt to smaller federated learning terminals.
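The abstract names TQRR (Top-Q-Random-R) but does not define it. One plausible reading of the name, offered purely as an assumption, is that each participant uploads the Q gradient entries with the largest magnitude plus R further entries drawn at random from the remainder. The function name `top_q_random_r` and the example values are hypothetical.

```python
import random


def top_q_random_r(gradient, q, r, rng=None):
    """Return sorted indices of the q largest-magnitude entries plus
    r entries sampled uniformly from the remaining ones.

    This is an assumed interpretation of "Top-Q-Random-R", not the
    algorithm as specified by the paper.
    """
    if rng is None:
        rng = random.Random()
    n = len(gradient)
    if q < 0 or r < 0 or q + r > n:
        raise ValueError("need q >= 0, r >= 0 and q + r <= len(gradient)")
    # Indices ordered by descending absolute value.
    order = sorted(range(n), key=lambda i: abs(gradient[i]), reverse=True)
    top, rest = order[:q], order[q:]
    extra = rng.sample(rest, r)
    return sorted(top + extra)


# Made-up gradient vector, for illustration only.
grad = [0.02, -0.90, 0.10, 0.75, -0.01, 0.30]
chosen = top_q_random_r(grad, q=2, r=1, rng=random.Random(0))
sparse_update = {i: grad[i] for i in chosen}  # only these entries would be encrypted and sent
```

Under this reading, only `q + r` of the entries need to be encrypted and transmitted, which is where the savings in computation and traffic would come from.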
Funding: supported in part by the National Natural Science Foundation of China (No. 62176041) and in part by the Excellent Science and Technique Talent Foundation of Dalian (No. 2022RY21).
Abstract: Significant advancements have been witnessed in visual tracking applications in recent years, mainly due to the formidable modeling capabilities of the Vision Transformer (ViT). However, the strong performance of such trackers relies heavily on ViT models pretrained for long periods, which limits more flexible model designs for tracking tasks. To address this issue, we propose an efficient unsupervised ViT pretraining method for the tracking task based on masked autoencoders, called TrackMAE. During pretraining, we employ two shared-parameter ViTs, serving as the appearance encoder and the motion encoder, respectively. The appearance encoder encodes randomly masked image data, while the motion encoder encodes randomly masked pairs of video frames. Subsequently, an appearance decoder and a motion decoder separately reconstruct the original image data and video frame data at the pixel level. In this way, the ViT learns to understand both the appearance of images and the motion between video frames simultaneously. Experimental results demonstrate that ViT-Base and ViT-Large models, pretrained with TrackMAE and combined with a simple tracking head, achieve state-of-the-art (SOTA) performance without additional design. Moreover, compared to the currently popular MAE pretraining methods, TrackMAE consumes only 1/5 of the training time, which will facilitate the customization of diverse models for tracking. For instance, we additionally customize a lightweight ViT-XS, which achieves SOTA efficient tracking performance.
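The random masking step that feeds the encoders can be illustrated with a small sketch. The patch count (196, as for a 14 x 14 grid) and the 75% mask ratio are assumptions borrowed from common masked-autoencoder practice; the abstract does not give TrackMAE's actual values, and `random_patch_mask` is a hypothetical helper.

```python
import random


def random_patch_mask(num_patches, mask_ratio, rng=None):
    """Split patch indices into (visible, masked) by uniform random sampling.

    Only the visible patches would be passed to the encoder; the decoder
    is then asked to reconstruct the masked ones.
    """
    if rng is None:
        rng = random.Random()
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")
    num_masked = int(round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    visible = [i for i in range(num_patches) if i not in masked]
    return visible, sorted(masked)


# Assumed values: 196 patches, 75% of them masked.
visible, masked = random_patch_mask(196, 0.75, rng=random.Random(0))
```

With these assumed values the encoder sees 49 patches and 147 are left for reconstruction.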
Abstract: Detecting pavement cracks is critical for road safety and infrastructure management. Traditional methods, relying on manual inspection and basic image processing, are time-consuming and prone to errors. Recent deep-learning (DL) methods automate crack detection, but many still struggle with variable crack patterns and environmental conditions. This study aims to address these limitations by introducing the Masker Transformer, a novel hybrid deep learning model that integrates the precise localization capabilities of the Mask Region-based Convolutional Neural Network (Mask R-CNN) with the global contextual awareness of the Vision Transformer (ViT). The research focuses on leveraging the strengths of both architectures to enhance segmentation accuracy and adaptability across different pavement conditions. We evaluated the performance of the MaskerTransformer against other state-of-the-art models such as U-Net, Transformer U-Net (TransUNet), U-Net Transformer (UNETr), Swin U-Net Transformer (Swin-UNETr), You Only Look Once version 8 (YoloV8), and Mask R-CNN using two benchmark datasets: Crack500 and DeepCrack. The findings reveal that the MaskerTransformer significantly outperforms the existing models, achieving the highest Dice Similarity Coefficient (DSC), precision, recall, and F1-Score across both datasets. Specifically, the model attained a DSC of 80.04% on Crack500 and 91.37% on DeepCrack, demonstrating superior segmentation accuracy and reliability. The high precision and recall rates further substantiate its effectiveness in real-world applications, suggesting that the Masker Transformer can serve as a robust tool for automated pavement crack detection, potentially replacing more traditional methods.
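For readers unfamiliar with the Dice Similarity Coefficient reported above, a minimal sketch on flattened binary masks follows. It uses the standard definition, twice the overlap divided by the total number of foreground pixels in both masks; the masks are made-up, and treating two empty masks as a perfect match is a convention chosen here, not something the abstract states.

```python
def dice_coefficient(pred, target):
    """Dice similarity between two flattened binary masks:
    2 * |pred AND target| / (|pred| + |target|).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * intersection / total


# Made-up 8-pixel masks (1 = crack pixel), for illustration only.
pred_mask = [0, 1, 1, 1, 0, 0, 1, 0]
true_mask = [0, 1, 1, 0, 0, 1, 1, 0]
dsc = dice_coefficient(pred_mask, true_mask)
```

Each mask has four foreground pixels and they share three, so `dsc` is 2 * 3 / 8 = 0.75.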
Funding: supported by the National Natural Science Foundation of China (No. 61174034).
Abstract: The masking techniques typically adopted in conventional secure communication schemes are additive masking and modulation by multiplication. To enhance security, this paper presents a nonlinear masking methodology that is applicable to the conventional schemes. In the proposed cryptographic scheme, the plaintext spans a pre-specified finite-time interval, is modulated through parameter modulation, and is masked chaotically by a nonlinear mechanism. An efficient iterative learning algorithm is exploited for decryption, and a sufficient condition for convergence is derived, by which the learning gain can be chosen. Case studies are conducted to demonstrate the effectiveness of the proposed masking method.
Funding: supported by the National Natural Science Foundation of China under Grant 52077146.
Abstract: With the construction of the power Internet of Things (IoT), communication between smart devices in urban distribution networks has been gradually moving towards high speed, high compatibility, and low latency, which provides reliable support for reconfiguration optimization in urban distribution networks. Thus, this study proposed a deep reinforcement learning based multi-level dynamic reconfiguration method for urban distribution networks in a cloud-edge collaboration architecture to obtain a real-time optimal multi-level dynamic reconfiguration solution. First, the multi-level dynamic reconfiguration method was discussed, which included the feeder, transformer, and substation levels. Subsequently, the multi-agent system was combined with the cloud-edge collaboration architecture to build a deep reinforcement learning model for multi-level dynamic reconfiguration in an urban distribution network. The cloud-edge collaboration architecture can effectively support the multi-agent system in the "centralized training and decentralized execution" operation mode and improve the learning efficiency of the model. Thereafter, for the multi-agent system, this study adopted a combination of offline and online learning to give the model the ability to optimize and update its strategy automatically. In the offline learning phase, a Q-learning-based multi-agent conservative Q-learning (MACQL) algorithm was proposed to stabilize the learning results and reduce the risk of the subsequent online learning phase. In the online learning phase, a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on policy gradients was proposed to explore the action space and update the experience pool. Finally, the effectiveness of the proposed method was verified through a simulation analysis of a real-world 445-node system.
Funding: supported by the National Key Research and Development Program of China (No. 2021YFF0900503) and partly by the National Natural Science Foundation of China (No. 62262018, 61971382).
Abstract: Short video applications like TikTok have seen significant growth in recent years. One common behavior of users on these platforms is watching and swiping through videos, which can lead to a significant waste of bandwidth. As such, an important challenge in short video streaming is to design a preloading algorithm that can effectively decide which videos to download, at what bitrate, and when to pause the download in order to reduce bandwidth waste while improving the Quality of Experience (QoE). However, designing such an algorithm is non-trivial, especially when considering the conflicting objectives of minimizing bandwidth waste and maximizing QoE. In this paper, we propose an end-to-end Deep reinforcement learning framework with Action Masking, called DAM, that leverages domain knowledge to learn an optimal policy for short video preloading. To achieve this, we introduce a reward shaping technique to minimize bandwidth waste and use action masking to make actions more reasonable, reduce playback rebuffering, and accelerate the training process. We have conducted extensive experiments using real-world video datasets and network traces including 4G, Wi-Fi, and 5G. Our results show that DAM improves the QoE score by 3.73%-11.28% compared to state-of-the-art algorithms and achieves an average bandwidth waste of only 10.27%-12.07%, outperforming all baseline methods.
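Action masking, as mentioned in this abstract, is commonly implemented by setting the logits of disallowed actions to negative infinity before the softmax, so that the policy assigns them zero probability. The sketch below shows that generic technique; which actions DAM actually masks, and under what conditions, is not stated in the abstract, so the logits and the validity flags are hypothetical.

```python
import math


def masked_softmax(logits, valid):
    """Softmax over logits in which invalid actions get probability 0.

    Invalid logits are replaced by -inf before normalization, which is
    the usual way to apply an action mask to a policy head.
    """
    if len(logits) != len(valid) or not any(valid):
        raise ValueError("need equal lengths and at least one valid action")
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical policy output over four preloading actions; the second
# action is assumed to be disallowed in the current state.
logits = [1.0, 2.0, 0.5, 2.0]
valid = [True, False, True, True]
probs = masked_softmax(logits, valid)
```

Because the masked action can never be sampled, the agent does not spend exploration on choices that domain knowledge already rules out, which is consistent with the faster training the abstract reports.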
Funding: This work was supported by King Abdulaziz University under grant number IFPHI-033-611-2020.
Abstract: Today, due to the COVID-19 pandemic, the entire world is facing a serious health crisis. According to the World Health Organization (WHO), people in public places should wear a face mask to control the rapid transmission of COVID-19. The governmental bodies of different countries have made wearing a face mask compulsory in public places. However, it is very difficult to manually monitor people in overcrowded areas. This research focuses on providing a solution to enforce one of the important preventative measures against COVID-19 in public places by presenting an automated system that localizes masked and unmasked human faces within an image or video of an area, which can assist during this outbreak of COVID-19. This paper demonstrates a transfer learning approach with the Faster-RCNN model to detect faces that are masked or unmasked. The proposed framework is built by fine-tuning the state-of-the-art deep learning model, Faster-RCNN, and has been validated on a publicly available dataset named Face Mask Dataset (FMD), achieving the highest average precision (AP) of 81% and the highest average recall (AR) of 84%. This shows the strong robustness and capability of the Faster-RCNN model to detect individuals with masked and unmasked faces. Moreover, this work is applicable in real time and can be implemented in any public service area.
Funding: This work was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University (KAU), Jeddah, Saudi Arabia, under grant no. (HO:023-611-1443). The authors, therefore, gratefully acknowledge DSR technical and financial support.
Abstract: Notwithstanding the religious intention of billions of devotees, the religious mass gathering raised major public health concerns since it likely became a huge super-spreading event for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Most attendees ignored preventive measures, namely maintaining physical distance, practising hand hygiene, and wearing face masks. Wearing a face mask in public areas helps prevent people from spreading COVID-19. Artificial intelligence (AI) based on deep learning (DL) and machine learning (ML) could assist in fighting COVID-19 in several ways. This study introduces a new deep learning-based Face Mask Detection in Religious Mass Gathering (DLFMD-RMG) technique during the COVID-19 pandemic. The DLFMD-RMG technique focuses mainly on detecting face masks in a religious mass gathering. To accomplish this, the presented DLFMD-RMG technique undergoes two pre-processing levels: Bilateral Filtering (BF) and Contrast Enhancement. For face detection, the DLFMD-RMG technique uses YOLOv5 with a ResNet-50 detector. In addition, the face detection performance can be improved by using the seeker optimization algorithm (SOA) to tune the hyperparameters of the ResNet-50 module, which constitutes the novelty of the work. Finally, the faces with and without masks are classified using the Fuzzy Neural Network (FNN) model. The simulation study of the DLFMD-RMG algorithm is conducted on a benchmark dataset. The results highlight the remarkable performance of the DLFMD-RMG model over other recent approaches.
Funding: co-supported by the National Key Research and Development Program of China (No. 2022YFF0503100) and the Youth Innovation Project of Pandeng Program of National Space Science Center, Chinese Academy of Sciences (No. E3PD40012S).
Abstract: As we look ahead to future lunar exploration missions, such as crewed lunar exploration and the establishment of lunar scientific research stations, lunar rovers will need to cover vast distances. These distances could range from kilometers to tens of kilometers, and even hundreds or thousands of kilometers. Therefore, it is crucial to develop effective long-range path planning for lunar rovers to meet the demands of lunar patrol exploration. This paper presents a hierarchical map model path planning method that utilizes existing high-resolution images, digital elevation models, and mineral abundance maps. The objective is to address the problem of constructing lunar rover travel costs in the absence of large-scale, high-resolution digital elevation models. The method models the reference and semantic layers using middle- and low-resolution remote sensing data. Multi-scale obstacles on the lunar surface are extracted by applying a deep learning algorithm to the high-resolution images, and the obstacle avoidance layer is modeled from them. A two-stage exploratory path planning decision is employed for long-distance driving path planning on a global–local scale. The proposed method is used to analyze the long-distance accessibility of various areas of scientific significance, such as Rima Bode. A high-precision digital elevation model is created using stereo images to validate the method. The results show that the entire route spans 930.32 km. The route avoids meter-level impact craters and linear structures while maintaining an average slope of less than 8°. It traverses at least seven basalt units, supporting scientific investigation of lunar volcanic activity and the establishment of 'golden spike' reference points for lunar stratigraphy. The final path planning result can serve as a valuable reference for the design, mission demonstration, and subsequent project implementation of the new manned lunar rover.