Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
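The abstract above names Soft-NMS but does not say which variant or parameters the authors use. The following is a minimal sketch of the commonly used Gaussian form, in which the score of each overlapping box is multiplied by exp(-IoU²/σ) instead of being discarded. The function names, the (x1, y1, x2, y2) box layout, and the default values of `sigma` and `score_threshold` are assumptions made for illustration, not details from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative defaults, not the paper's settings).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        if best_score < score_threshold:
            break
        kept.append((best_idx, best_score))
        # Decay the scores of the other boxes according to their overlap
        # with the selected box, then drop those that fall below threshold.
        decayed = []
        for i, s in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s *= math.exp(-(overlap ** 2) / sigma)
            if s >= score_threshold:
                decayed.append((i, s))
        remaining = decayed
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 3))
```

In this toy input the second box overlaps the first heavily, so its score is reduced and it is ranked below the distant third box rather than being removed outright. Hard NMS would instead delete it whenever its IoU exceeded a fixed cutoff.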
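To address the insufficient real-time performance of deep-learning segmentation networks in visual SLAM (Simultaneous Localization and Mapping) systems operating in dynamic scenes, and the pose-estimation bias caused by unintended camera motion, a visual SLAM algorithm based on cross-domain mask segmentation is proposed. The algorithm detects moving objects with a lightweight YOLO-fastest network combined with background subtraction, builds a cross-domain mask segmentation mechanism from the depth map and depth-threshold segmentation, and applies a geometric correction strategy for camera motion to compensate for coordinate errors in the detection boxes, segmenting moving objects while increasing processing speed. To make better use of feature points, pyramid optical flow tracks and updates dynamic feature points continuously across frames, while only static feature points take part in pose estimation. In a systematic evaluation on the TUM dataset, the algorithm reduces the absolute pose error by 97.1% on average relative to ORB-SLAM3, and its per-frame tracking time is substantially lower than that of DynaSLAM and DS-SLAM, dynamic SLAM algorithms that use deep-learning segmentation networks, giving a better balance between accuracy and efficiency.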
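The abstract above does not spell out how the depth threshold is applied inside a detection box. As a rough sketch under stated assumptions, one plausible reading is that pixels in the box whose depth lies close to the box's median depth are marked as belonging to the moving object, and feature points falling on that mask are excluded from pose estimation. The function names, the median reference, and the 0.3 m threshold are hypothetical choices for illustration only.

```python
from statistics import median


def depth_mask_in_box(depth, box, threshold=0.3):
    """Mark pixels inside a detection box that lie near the box's median depth.

    depth: 2D list of depths in metres, with 0 meaning "no measurement".
    box: (x1, y1, x2, y2) in pixel coordinates, x2 and y2 exclusive.
    threshold: hypothetical depth tolerance in metres.
    """
    height, width = len(depth), len(depth[0])
    # Clamp the box to the image so indexing is always valid.
    x1, y1 = max(0, box[0]), max(0, box[1])
    x2, y2 = min(width, box[2]), min(height, box[3])
    mask = [[False] * width for _ in range(height)]
    valid = [depth[y][x] for y in range(y1, y2) for x in range(x1, x2)
             if depth[y][x] > 0]
    if not valid:
        return mask
    reference = median(valid)
    for y in range(y1, y2):
        for x in range(x1, x2):
            d = depth[y][x]
            if d > 0 and abs(d - reference) <= threshold:
                mask[y][x] = True
    return mask


def static_features(keypoints, mask):
    """Keep only the (x, y) keypoints that do not fall on the dynamic mask."""
    return [(x, y) for x, y in keypoints if not mask[y][x]]


if __name__ == "__main__":
    demo_depth = [
        [5.0, 5.0, 5.0, 5.0],
        [5.0, 1.0, 1.1, 5.0],
        [5.0, 1.0, 5.0, 5.0],
    ]
    demo_mask = depth_mask_in_box(demo_depth, (1, 1, 3, 3))
    print(static_features([(0, 0), (1, 1), (2, 2), (2, 1)], demo_mask))
```

The background pixel at depth 5.0 inside the box is left unmasked, so a feature point on it would still contribute to pose estimation. That is the practical benefit of refining a rectangular detection box with depth.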
Funding: Shenzhen Institute of Artificial Intelligence and Robotics for Society, Grant/Award Number: AC01202201003-02; GuangDong Basic and Applied Basic Research Foundation, Grant/Award Number: 2024A1515010252; Longgang District Shenzhen's “Ten Action Plan” for Supporting Innovation Projects, Grant/Award Number: LGKCSDPT2024002.
Abstract: Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors’ approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Funding: Funded by the Liaoning Provincial Department of Education Project, Award number JYTMS20230418.
Abstract: Pest detection techniques help reduce the frequency and scale of pest outbreaks; however, applying them in actual agricultural production remains challenging owing to interspecies similarity, multiple target scales, and complex backgrounds. To address these problems, this study proposes an FD-YOLO pest target detection model. The FD-YOLO model uses a Fully Connected Feature Pyramid Network (FC-FPN) instead of a PANet in the neck, which can adaptively fuse multi-scale information so that the model retains small-scale target features in the deep layer, enhances large-scale target features in the shallow layer, and improves the reuse of effective features. A dual self-attention module (DSA) is then embedded in the C3 module of the neck, which captures the dependencies between the information in both spatial and channel dimensions, effectively enhancing global features. We selected 16 types of pests that widely damage field crops from the IP102 pest dataset and used them as our dataset after data supplementation and enhancement. The experimental results showed that FD-YOLO’s mAP@0.5 improved by 6.8% compared to YOLOv5, reaching 82.6%, and was 5% to 19.1% better than other state-of-the-art models. This method provides an effective new approach for detecting similar or multi-scale pests in field crops.
Funding: Financially supported by the National Natural Science Foundation of China (12072211, 12232008), the Foundation of Key Laboratory (2022JCJQLB05703), and the Sichuan Province Science and Technology Project (2023NSFSC0914).
Abstract: Pyramidal dislocations are important for ductility enhancement of magnesium alloys. In this work, molecular dynamics simulations were employed to study the gliding behavior of pyramidal (c+a) dislocations under c-axis compressive loading and tensile loading. The Peierls stress of the Py-Ⅰ dislocation shows strong tension-compression asymmetry. However, no tension-compression asymmetry is seen for the Py-Ⅱ dislocation or the basal dislocation. The tension-compression asymmetry originates from the asymmetry of the partial dislocations of the Py-Ⅰ dislocation, which causes the dislocation core to contract under c-axis compressive loading and expand under tensile loading. By analyzing the forces acting on the partial dislocations, we defined a neutral direction, which deviates from the full dislocation Burgers vector by 70.3°. The neutral direction depends on the ratio of the lattice stresses of the partial dislocations. If the shear stress is applied along the neutral direction, the tension-compression asymmetry is eliminated and the dislocation core neither contracts nor expands. For symmetrical dislocations (the Py-Ⅱ dislocation and the basal dislocation), the neutral direction is simply the full dislocation Burgers vector. The tension-compression asymmetry and dislocation core contraction/expansion have an important influence on dislocation behaviors such as cross-slip, decomposition, basal transition, and mobility, which can be used to explain the mechanical behaviors of Mg single crystals compressed along the c-axis.
Funding: Supported by the Deutsche Forschungsgemeinschaft (DFG) through projects 420149269, 394480829 as part of the CRC1394 “Structural and Chemical Atomic Complexity-From Defect Phase Diagrams to Material Properties” (project 409476157).
Abstract: Dynamic recrystallization (DRX) in inhomogeneous deformation zones, such as grain boundaries, shear bands, and deformation bands, is critical for texture modification in magnesium alloys during deformation at elevated temperatures. This study investigates the DRX mechanisms in AZWX3100 magnesium alloy under plane strain compression at 200 °C. Microstructural analysis revealed necklace-type DRX accompanied by evidence of local grain boundary bulging. Additionally, ribbons of recrystallized grains were observed within fine deformation bands, aligned with theoretical pyramidal I and II slip traces derived from the matrix. The distribution of local misorientation within the deformed microstructure demonstrated a clear association between deformation bands and localized strain. Dislocation analysis of lamellar specimens extracted from two pyramidal slip bands revealed <c+a> dislocations, indicating a connection between <c+a> slip activation and the formation of deformation bands. Crystal plasticity simulations suggest that the orientation of deformation bands is responsible for the unique recrystallization texture of the DRX grains within these bands. The texture characteristics imply a progressive, glide-induced DRX mechanism. A fundamental understanding of the role of deformation bands in texture modification can facilitate future alloy and process design.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (*MSIT) (No. 2018R1A5A7059549).
Abstract: The generation of high-quality, realistic faces has emerged as a key field of research in computer vision. This paper proposes a robust approach that combines a Super-Resolution Generative Adversarial Network (SRGAN) with a Pyramid Attention Module (PAM) to enhance the quality of deep face generation. The SRGAN framework is designed to improve the resolution of generated images, addressing common challenges such as blurriness and a lack of intricate details. The Pyramid Attention Module complements this by focusing on multi-scale feature extraction, enabling the network to capture finer details and complex facial features more effectively. The proposed method was trained and evaluated over 100 epochs on the CelebA dataset, demonstrating consistent improvements in image quality and a marked decrease in generator and discriminator losses, reflecting the model’s capacity to learn and synthesize high-quality images effectively, given adequate computational resources. Experimental results show that the SRGAN model with the PAM yields an aggregate discriminator loss of 0.055 for real images and 0.043 for fake images, and a generator loss of 10.58, after training for 100 epochs. The model achieves a structural similarity index measure of 0.923, outperforming the other models considered in the study.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 6226070954 and the Jiangxi Provincial Key R&D Programme under Grant 20244BBG73002.
Abstract: Rail surface damage is a critical concern for high-speed railway infrastructure, directly affecting train operational stability and safety. Existing methods face limitations in accuracy and speed for small-sample, multi-category, and multi-scale target segmentation tasks. To address these challenges, this paper proposes Pyramid-MixNet, an intelligent segmentation model for high-speed rail surface damage, leveraging dataset construction and expansion alongside a feature pyramid-based encoder-decoder network with multi-attention mechanisms. The encoding network integrates Spatial Reduction Masked Multi-Head Attention (SRMMHA) to enhance global feature extraction while reducing trainable parameters. The decoding network incorporates Mix-Attention (MA), enabling multi-scale structural understanding and cross-scale token group correlation learning. Experimental results demonstrate that the proposed method achieves 62.17% average segmentation accuracy, an 80.28% Damage Dice Coefficient, and 56.83 FPS, meeting real-time detection requirements. The model’s high accuracy and scene adaptability significantly improve the detection of small-scale and complex multi-scale rail damage, offering practical value for real-time monitoring in high-speed railway maintenance systems.
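The abstract above reports a Damage Dice Coefficient without defining it. For orientation, the standard Dice coefficient between a predicted and a reference binary mask is 2|P∩T| / (|P| + |T|). The sketch below computes that standard form; whether the paper averages it per image, per class, or over the whole dataset is not stated, so the function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_coefficient(pred, target):
    """Standard Dice coefficient for two binary masks of equal shape.

    pred, target: 2D lists whose entries are truthy for "damage" pixels.
    Returns 1.0 when both masks are empty (an assumed convention).
    """
    intersection = pred_count = target_count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p:
                pred_count += 1
            if t:
                target_count += 1
            if p and t:
                intersection += 1
    denominator = pred_count + target_count
    if denominator == 0:
        return 1.0
    return 2.0 * intersection / denominator


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    print(round(dice_coefficient(predicted, reference), 4))
```

Here two of the three predicted pixels coincide with two of the three reference pixels, so the score is 2·2 / (3 + 3).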
Funding: Supported by the China Postdoctoral Science Foundation (No. 2024M754122), the Postdoctoral Fellowship Program of CPSF (No. GZB20240972), the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2024ZB194), the Natural Science Foundation of Jiangsu Province (No. BK20241389), the Basic Science Research Fund of China (No. JCKY2023203C026), and the 2024 Jiangsu Province Talent Programme Qinglan Project.
Abstract: This study focuses on tool condition recognition through data-driven approaches to enhance the intelligence level of computerized numerical control (CNC) machining processes and improve tool utilization efficiency. Traditional tool monitoring methods that rely on empirical knowledge or limited mathematical models struggle to adapt to complex and dynamic machining environments. To address this, we implement real-time tool condition recognition by introducing deep learning technology. To address insufficient recognition accuracy, we propose a pyramid pooling-based vision Transformer network (P2ViT-Net) method for tool condition recognition. Using images as input effectively mitigates the issue of low-dimensional signal features. We enhance the vision Transformer (ViT) framework for image classification by developing the P2ViT model and adapt it to tool condition recognition. Experimental results demonstrate that our improved P2ViT model achieves 94.4% recognition accuracy, showing a 10% improvement over conventional ViT and outperforming all comparative convolutional neural network models.
Abstract: Conventional concentrator photovoltaics (CPV) face a persistent trade-off between high efficiency and high cost, driven by expensive multi-junction solar cells and complex active cooling systems. This study presents a computational investigation of a novel Multi-Focal Pyramidal Array (MFPA)-based CPV system designed to overcome this limitation. The MFPA architecture employs a geometrically optimized pyramidal concentrator to distribute concentrated sunlight onto strategically placed, low-cost monocrystalline silicon cells, enabling high-efficiency energy capture while passively managing thermal loads. Coupled optical-thermal-electrical simulations in COMSOL Multiphysics demonstrate a geometric concentration ratio of 120×, with system temperatures maintained below 110 °C under standard 1000 W/m² Direct Normal Irradiance (DNI). Ray tracing confirms 95% optical efficiency and a concentrated light spot radius of 2.48 mm. Compared with conventional CPV designs, the MFPA improves power-per-cost by 25% and reduces tracking requirements by 50% owing to its wide ±15° acceptance angle. These results highlight the MFPA’s potential as a scalable, low-cost, and energy-efficient pathway for expanding solar power generation.
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2572025BR14) and the China Energy Digital Intelligence Technology Development (Beijing) Co., Ltd. Science and Technology Innovation Project (No. YA2024001500).
Abstract: In remote sensing imagery, approximately 67% of the data are affected by cloud cover, significantly increasing the difficulty of image classification, recognition, and other downstream interpretation tasks. To address the randomness of cloud distribution and the non-uniformity of cloud thickness, we propose a coarse-to-fine thin cloud removal architecture. In the coarse-level declouding network, we introduce a multi-scale attention mechanism, pyramid nonlocal attention (PNA). By integrating global context with local detail information, it specifically addresses image quality degradation caused by the uncertainty in cloud distribution. During the fine-level declouding stage, we focus on the impact of cloud thickness on declouding results, which manifests primarily as insufficient detail information. A residual dense module enhances the extraction and utilization of feature details. Our approach thus restores lost local texture features on top of the coarse-level results, substantially improving declouding quality. To evaluate the effectiveness of our cloud removal technology and attention mechanism, we conducted comprehensive analyses on publicly available datasets. Results demonstrate that our method achieves state-of-the-art performance compared with a wide range of techniques.
Abstract: Dear Editor, The importance of the medial entorhinal cortex (MEC) for memory and spatial navigation has been shown repeatedly in many species, including mice and humans [1, 2]. It is, therefore, not surprising that the connectivity of this structure has been studied extensively over the past century, mainly using a range of anterograde and retrograde anatomical tracers [3].
Funding: the National Natural Science Foundation of China (No. 51975374).
Abstract: Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the further development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure, and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution in fully extracting the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of the proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and on hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that the proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Funding: funded by the Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, through research group No. (RG-NBU-2022-1234).
Abstract: Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of Intelligent Transportation Systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that the model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
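The traffic light abstract above names Soft-NMS as one of its components but does not describe its configuration. As a rough illustration only, the following is a minimal sketch of the generic Gaussian variant of Soft-NMS in plain Python. The function names, the `sigma` and `score_thresh` defaults, and the `(x1, y1, x2, y2)` box layout are assumptions made for this example and are not taken from the paper.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (hypothetical defaults).

    Instead of discarding boxes that overlap the current best box,
    their scores are decayed by exp(-iou^2 / sigma). Boxes whose score
    falls below score_thresh are dropped.
    Returns a list of (original_index, final_score) in selection order.
    """
    candidates = list(enumerate(scores))
    kept = []
    while candidates:
        # Select the remaining box with the highest (possibly decayed) score.
        best = max(range(len(candidates)), key=lambda k: candidates[k][1])
        idx, score = candidates.pop(best)
        kept.append((idx, score))
        # Decay the scores of the remaining boxes according to their overlap.
        rescored = []
        for j, s in candidates:
            s *= math.exp(-(iou(boxes[idx], boxes[j]) ** 2) / sigma)
            if s >= score_thresh:
                rescored.append((j, s))
        candidates = rescored
    return kept

if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for index, final_score in soft_nms(boxes, scores):
        print(index, round(final_score, 3))
```

In this toy input the second box overlaps the first heavily, so its score is reduced rather than removed, which is the property that lets Soft-NMS retain closely spaced detections that hard NMS would suppress.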
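Abstract: To address the insufficient real-time performance of deep-learning segmentation networks in visual SLAM (Simultaneous Localization and Mapping) systems operating in dynamic scenes, as well as the pose estimation bias caused by undesired camera motion, a visual SLAM algorithm based on cross-domain mask segmentation is proposed. The algorithm detects moving objects using a lightweight YOLO-fastest network combined with background subtraction, builds a cross-domain mask segmentation mechanism from the depth map together with depth-threshold segmentation, and applies a geometric camera-motion correction strategy to compensate for coordinate errors in the detection boxes, so that moving objects are segmented while processing speed improves. To make better use of feature points, pyramidal optical flow is used to track and update dynamic feature points continuously across frames, while ensuring that only static feature points take part in pose estimation. In a systematic evaluation on the TUM dataset, the algorithm reduces the absolute pose error by an average of 97.1% relative to ORB-SLAM3; compared with the dynamic SLAM algorithms DynaSLAM and DS-SLAM, which use deep-learning segmentation networks, its per-frame tracking time is substantially lower, giving a better balance between accuracy and efficiency.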
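The SLAM abstract above mentions depth-threshold segmentation inside detection boxes without giving the exact rule. The sketch below shows one plausible reading, stated as an assumption: pixels inside a detection box are marked as belonging to the moving object when their depth lies within a tolerance of the box's median depth. The function name, the `tol` value, the use of the median as reference depth, and the list-of-lists depth format are all hypothetical and chosen only for illustration.

```python
from statistics import median

def depth_mask_in_box(depth, box, tol=0.3):
    """Mark pixels in a detection box that lie near the box's median depth.

    depth: 2D list of depths in metres, where 0 marks an invalid reading.
    box:   (x1, y1, x2, y2) in pixel coordinates, end-exclusive.
    tol:   hypothetical depth tolerance in metres.
    Returns a 2D list of booleans with the same shape as depth.
    Assumes the object fills most of the box, so that the median depth
    falls on the object rather than on the background.
    """
    x1, y1, x2, y2 = box
    valid = [depth[y][x]
             for y in range(y1, y2)
             for x in range(x1, x2)
             if depth[y][x] > 0]
    mask = [[False] * len(row) for row in depth]
    if not valid:
        return mask  # no usable depth inside the box
    ref = median(valid)
    for y in range(y1, y2):
        for x in range(x1, x2):
            d = depth[y][x]
            mask[y][x] = d > 0 and abs(d - ref) <= tol
    return mask

if __name__ == "__main__":
    # 4x4 depth map: background at 3 m, a 2x2 object at 1 m.
    depth = [
        [3.0, 3.0, 3.0, 3.0],
        [3.0, 1.0, 1.0, 3.0],
        [3.0, 1.0, 1.0, 3.0],
        [3.0, 3.0, 3.0, 3.0],
    ]
    for row in depth_mask_in_box(depth, (1, 1, 4, 3)):
        print(row)
```

Feature points that fall on `True` pixels would then be treated as dynamic and excluded from pose estimation, while background pixels inside the same box remain available as static features.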