Visible and infrared (RGB-IR) fusion object detection plays an important role in security, disaster relief, and related fields. In recent years, deep-learning-based RGB-IR fusion detection methods have developed rapidly, but they still struggle with the complex and changing scenarios captured by drones, mainly for two reasons: (A) RGB-IR fusion detectors are susceptible to inferior inputs that degrade performance and stability. (B) RGB-IR fusion detectors are susceptible to redundant features that reduce accuracy and efficiency. In this paper, an innovative RGB-IR fusion detection framework based on global-local feature optimization, named GLFDet, is proposed to improve detection performance and efficiency for drone-captured objects. The key components of GLFDet include a Global Feature Optimization (GFO) module, a Local Feature Optimization (LFO) module, and a Channel Separation Fusion (CSF) module. Specifically, GFO calculates the information content of the input image in the frequency domain and optimizes the features holistically. Then, LFO dynamically selects high-value features and filters out low-value features before fusion, which significantly improves the efficiency of fusion. Finally, CSF fuses the RGB and IR features across the corresponding channels, which avoids rearranging the channel relationships and enhances model stability. Extensive experimental results show that the proposed method achieves the best performance on three popular RGB-IR datasets: DroneVehicle, VEDAI, and LLVIP. In addition, GLFDet is more lightweight than other comparable models, making it more appealing for edge devices such as drones. The code is available at https://github.com/lao chen330/GLFDet.
Ultrasonic-Assisted Grinding (UAG) is a novel manufacturing technology that shows considerable promise for processing Ceramic Matrix Composites (CMCs). Nevertheless, analyzing the material removal process of CMCs with multidirectional structure during UAG is challenging, which impedes the progress and improvement of the UAG process. This work examined the impact of ultrasonic vibration on the dynamic mechanical characteristics during processing. Additionally, we experimentally elucidated the material removal mechanism of CMCs during the scratching process under the influence of vertical vibration. The results indicate that the introduction of ultrasonic vibration causes a strain rate effect, which modifies the material removal mechanism and in turn affects the processing quality. Ultrasonic vibration increases the dynamic strength and brittleness of the fibers in CMCs, leading to more cracks at fracture, and the fracture mode changes from the original bending fracture to shear fracture. In addition, ultrasonic vibration can effectively suppress the influence of scratching depth and anisotropy on the removal mechanism of CMCs, resulting in a more uniform surface after processing.
Human object detection and recognition is essential for elderly monitoring and assisted living; however, models relying solely on pose or scene context often struggle in cluttered or visually ambiguous settings. To address this, we present SCENET-3D, a transformer-driven multimodal framework that unifies human-centric skeleton features with scene-object semantics for intelligent robotic vision through a three-stage pipeline. In the first stage, scene analysis, rich geometric and texture descriptors are extracted from RGB frames, including surface-normal histograms, angles between neighboring normals, Zernike moments, directional standard deviation, and Gabor-filter responses. In the second stage, scene-object analysis, non-human objects are segmented and represented using local feature descriptors and complementary surface-normal information. In the third stage, human-pose estimation, silhouettes are processed through an enhanced MoveNet to obtain 2D anatomical keypoints, which are fused with depth information and converted into RGB-based point clouds to construct pseudo-3D skeletons. Features from all three stages are fused and fed into a transformer encoder with multi-head attention to resolve visually similar activities. Experiments on UCLA (95.8%), ETRI-Activity3D (89.4%), and CAD-120 (91.2%) demonstrate that combining pseudo-3D skeletons with rich scene-object fusion significantly improves generalizable activity recognition, enabling safer elderly care, natural human–robot interaction, and robust context-aware robotic perception in real-world environments.
Multi-objective optimization problems, especially in constrained environments such as power distribution planning, demand robust strategies for discovering effective solutions. This work presents an improved variant of the Multi-population Cooperative Constrained Multi-Objective Optimization (MCCMO) algorithm, termed MCCMO-ADP, where ADP stands for Adaptive Diversity Preservation. The enhancement focuses primarily on improved constraint handling strategies, local search integration, hybrid selection approaches, and adaptive parameter control. The improved variant was evaluated on the RWMOP50 power distribution system planning benchmark. According to the findings, it outperformed the original MCCMO across the eleven performance metrics, particularly in terms of convergence speed, constraint handling efficiency, and solution diversity. The results also establish that MCCMO-ADP consistently delivers substantial performance gains over the baseline MCCMO, demonstrating its effectiveness across performance metrics. The new variant also excels at maintaining a balanced trade-off between exploration and exploitation throughout the search process, making it especially suitable for complex optimization problems in multi-constrained power systems. These enhancements make MCCMO-ADP a valuable and promising candidate for handling problems such as renewable energy scheduling, logistics planning, and power system optimization. Future work will benchmark MCCMO-ADP against widely recognized algorithms such as NSGA-II, NSGA-III, and MOEA/D, and will also extend its validation to large-scale real-world optimization domains to further consolidate its generalizability.
Graphene-based separation membranes hold promise for water treatment. However, their practical deployment in high-salinity brines remains challenging due to structural instability. Herein, a defect-free Na^(+)-Cu^(2+)/GO-PEI nanocomposite membrane was fabricated via a pH-controlled cross-linking polymerization strategy. Polyethyleneimine (PEI) serves as a critical interfacial stabilizer, enhancing the connection between the Na^(+)-GO and Cu^(2+)-GO layers through amide bond formation with GO nanosheets while facilitating Cu^(2+) chelation. The Na^(+)/GO layer modifies the pore structure of the polyether sulfone (PES) substrate, synergistically optimizing the membrane's microstructure. Performance evaluation revealed that the as-prepared membrane achieved exceptional separation efficiency (>98%) for tributyl phosphate, sulfonated kerosene, and bis(2-ethylhexyl) phosphate in high-salinity brine, accompanied by a high flux of 160–224 L·m^(-2)·h^(-1). Notably, it exhibited robust chemical stability in corrosive environments and maintained mechanical durability after 500 folding cycles, coupled with consistent separation performance over 10 recycling runs. This study presents a novel multi-component modification approach for constructing high-performance GO-based membranes, with promising practical applications in removing organic pollutants from high-salt solutions.
Recognising human-object interactions (HOI) is a challenging task for traditional machine learning models, including convolutional neural networks (CNNs). Existing models show limited transferability across complex datasets such as D3D-HOI and SYSU 3D HOI. The conventional architecture of CNNs restricts their ability to handle HOI scenarios with high complexity. HOI recognition requires improved feature extraction methods to overcome the current limitations in accuracy and scalability. This work proposes a novel quantum gate-enabled hybrid CNN (QEH-CNN) for effective HOI recognition. The model enhances CNN performance by integrating quantum computing components. The framework begins with bilateral image filtering, followed by multi-object tracking (MOT) and Felzenszwalb superpixel segmentation. A watershed algorithm refines object boundaries by cleaning merged superpixels. Feature extraction combines a histogram of oriented gradients (HOG), Global Image Statistics for Texture (GIST) descriptors, and a novel 23-joint keypoint extraction method using relative joint angles and joint proximity measures. A fuzzy optimization process refines the extracted features before feeding them into the QEH-CNN model. The proposed model achieves 95.06% accuracy on the D3D-HOI dataset and 97.29% on the SYSU 3D HOI dataset. The integration of quantum computing enhances feature optimization, leading to improved accuracy and overall model efficiency.
The discharge of micro-polluted water from sources such as agricultural runoff, urban stormwater, and treated effluents presents significant challenges to aquatic ecosystems. Constructed wetlands (CWs) have gained recognition as an eco-friendly solution for removing pollutants from various wastewater sources and are increasingly applied to micro-polluted water treatment. A review of 78 full-scale CW studies from Web of Science shows that the ammonium nitrogen (NH4+-N) concentrations in runoff, wastewater treatment plant effluent, and polluted rivers ranged over 0.1–6.6, 0.3–12.3, and 0.2–41.1 mg/L, respectively. The corresponding nitrate nitrogen concentrations ranged over 0.2–14.2, 0–5.7, and 0–2.6 mg/L, respectively. The removal efficiencies of CWs for micro-polluted water varied by CW type. The total nitrogen removal efficiencies for subsurface-flow CWs, free-water surface-flow CWs, and hybrid CWs ranged from 27.4% to 66.5%, 16.8% to 89.8%, and 19.4% to 88.2%, respectively. The NH4+-N removal efficiencies ranged from 34.2% to 73.6%, 38.4% to 89.4%, and 13.5% to 94.2%, respectively. Additionally, other factors influencing contaminant removal efficiency, such as hydraulic retention time, vegetation type, redox micro-environment, and influent water quality, were evaluated. Based on these findings, two strategies for improving the purification performance of CWs were proposed: the selection and incorporation of electron donor substrates, and the optimization of operating parameters. This paper serves as a synthesis of information to guide future research and full-scale CW applications in micro-polluted water treatment.
Deep learning has made significant progress in oriented object detection for remote sensing images. However, existing methods still face challenges with difficult cases such as multi-scale targets, complex backgrounds, and small objects. Keeping the model lightweight to meet the resource constraints of remote sensing scenarios while improving task performance remains a research hotspot. Therefore, we propose EM-YOLO, an enhanced multi-scale feature extraction lightweight network based on the YOLOv8s architecture, specifically optimized for the large variations in target scale, diverse orientations, and numerous small objects found in remote sensing images. Our innovations lie in two main aspects. First, a dynamic snake convolution (DSC) is introduced into the backbone network to enhance the model's feature extraction capability for oriented targets. Second, an innovative focusing-diffusion module is designed in the feature fusion neck to effectively integrate multi-scale feature information. In addition, we apply the Layer-Adaptive Sparsity for magnitude-based Pruning (LASP) method to prune the network so that it performs better in resource-constrained scenarios. Experimental results on the lightweight Orin platform demonstrate that the proposed method significantly outperforms the original YOLOv8s model in oriented remote sensing object detection tasks, and achieves comparable or superior performance to state-of-the-art methods on three authoritative remote sensing datasets (DOTA v1.0, DOTA v1.5, and HRSC2016).
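The abstract above does not spell out the pruning score. As a rough illustration of what layer-adaptive magnitude pruning can look like, the sketch below uses one published formulation: each squared weight is divided by the sum of squared weights of equal or larger magnitude in the same layer. Whether EM-YOLO uses exactly this score is an assumption, and the function name `layer_adaptive_scores` is hypothetical.

```python
def layer_adaptive_scores(weights):
    """Score each weight of ONE layer (assumed formulation, see lead-in).

    score(u) = w_u^2 / sum of w_v^2 over all v with |w_v| >= |w_u|
    """
    # Indices sorted by ascending magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    scores = [0.0] * len(weights)
    tail = 0.0  # running sum of squares, accumulated from the largest weight down
    for i in reversed(order):
        tail += weights[i] ** 2
        scores[i] = weights[i] ** 2 / tail if tail > 0 else 0.0
    return scores


# Toy layer with three weights (illustrative values only).
scores = layer_adaptive_scores([3.0, -1.0, 2.0])
print(scores)
```

Under this score the largest weight in every layer receives 1.0, so a single global threshold on the scores cannot remove an entire layer, which is the usual motivation for normalising per layer.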
The ubiquity of mobile devices has driven advancements in mobile object detection. However, challenges in multi-scale object detection in open, complex environments persist due to limited computational resources. Traditional approaches such as network compression, quantization, and lightweight design often sacrifice accuracy or the robustness of feature representation. This article introduces the Fast Multi-scale Channel Shuffling Network (FMCSNet), a novel lightweight detection model optimized for mobile devices. FMCSNet integrates a fully convolutional Multilayer Perceptron (MLP) module, offering global perception without significantly increasing parameters and effectively bridging the gap between CNNs and Vision Transformers. FMCSNet balances computation and accuracy mainly through two key modules: the ShiftMLP module, which comprises a shift operation and an MLP module, and a Partial group Convolutional (PGConv) module, which reduces computation while enhancing information exchange between channels. With a computational complexity of 1.4G FLOPs and 1.3M parameters, FMCSNet outperforms CNN-based and DWConv-based ShuffleNetv2 by 1% and 4.5% mAP on the Pascal VOC 2007 dataset, respectively. Additionally, FMCSNet achieves an mAP of 30.0 (0.5:0.95 IoU threshold) with only 2.5G FLOPs and 2.0M parameters. It reaches 32 FPS on low-performance i5-series CPUs, meeting real-time detection requirements. The adaptability of the PGConv module across scenarios further highlights FMCSNet as a promising solution for real-time mobile object detection.
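For readers unfamiliar with channel shuffling, the following minimal sketch shows the generic ShuffleNet-style operation on a list of channel indices. It is not the authors' ShiftMLP or PGConv module; the function name `channel_shuffle` and the toy sizes are assumptions for illustration.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (generic ShuffleNet-style shuffle).

    Conceptually: view the channel list as a (groups, per_group) grid,
    transpose it to (per_group, groups), and flatten it again.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Six channels split into two groups: [0, 1, 2] and [3, 4, 5].
print(channel_shuffle(list(range(6)), groups=2))  # → [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive block of channels mixes members of both original groups, which is how a following grouped convolution gets to see information from every group.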
Small object detection has been a focus of attention since the emergence of deep learning-based object detection. Although classical object detection frameworks have made significant contributions to the development of object detection, many issues remain unresolved in detecting small objects due to the inherent complexity and diversity of real-world visual scenes. In particular, the YOLO (You Only Look Once) series of detection models, renowned for their real-time performance, have undergone numerous adaptations aimed at improving the detection of small targets. In this survey, we summarize the state-of-the-art YOLO-based small object detection methods. This review presents a systematic categorization of YOLO-based approaches for small-object detection, organized into four methodological avenues, namely attention-based feature enhancement, detection-head optimization, loss function design, and multi-scale feature fusion strategies. We then examine the principal challenges addressed by each category. Finally, we analyze the performance of these methods on public benchmarks and, by comparing current approaches, identify limitations and outline directions for future research.
In recent years, with the rapid advancement of artificial intelligence, object detection algorithms have made significant strides in accuracy and computational efficiency. Notably, research on and applications of Anchor-Free models have opened new avenues for real-time target detection in optical remote sensing images (ORSIs). However, in the realm of adversarial attacks, developing adversarial techniques tailored to Anchor-Free models remains challenging. Adversarial examples generated from Anchor-Based models often exhibit poor transferability to these new model architectures. Furthermore, the growing diversity of Anchor-Free models poses additional hurdles to achieving robust transferability of adversarial attacks. This study presents an improved cross-conv-block feature fusion You Only Look Once (YOLO) architecture, engineered to facilitate the extraction of more comprehensive semantic features during the backpropagation process. To address the asymmetry between densely distributed objects in ORSIs and the corresponding detector outputs, a novel dense bounding box attack strategy is proposed. This approach leverages a dense target bounding box loss in the calculation of the adversarial loss function. Furthermore, by integrating translation-invariant (TI) and momentum-iteration (MI) adversarial methodologies, the proposed framework significantly improves the transferability of adversarial attacks. Experimental results demonstrate that our method achieves superior adversarial attack performance, with adversarial transferability rates (ATR) of 67.53% on the NWPU VHR-10 dataset and 90.71% on the HRSC2016 dataset. Compared to ensemble adversarial attack and cascaded adversarial attack approaches, our method generates adversarial examples in an average of 0.64 s, representing an approximately 14.5% improvement in efficiency under equivalent conditions.
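The momentum-iteration (MI) idea mentioned above can be sketched on a toy problem. The code below follows the commonly cited MI-FGSM update (accumulate the L1-normalised gradient with a decay factor, then step along its sign and clip to the perturbation budget). The linear toy loss, the function names, and all parameter values are assumptions for illustration; the translation-invariant step and the paper's dense bounding box loss are not modelled.

```python
def sign(v):
    return (v > 0) - (v < 0)


def mi_attack(x0, grad_fn, eps, alpha, steps, mu=1.0):
    """Momentum-iterative sign-gradient ascent within an L-infinity ball.

    x0      : clean input as a list of floats
    grad_fn : callable returning the loss gradient at x (toy stand-in
              for backpropagation through a detector)
    eps     : maximum per-coordinate perturbation
    alpha   : step size per iteration
    mu      : momentum decay factor
    """
    x = list(x0)
    g = [0.0] * len(x0)
    for _ in range(steps):
        grad = grad_fn(x)
        l1 = sum(abs(v) for v in grad) or 1.0  # guard against a zero gradient
        g = [mu * gi + gv / l1 for gi, gv in zip(g, grad)]
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, g)]
        # Project back into the eps-ball around the clean input.
        x = [min(c + eps, max(c - eps, xi)) for xi, c in zip(x, x0)]
    return x


# Toy loss L(x) = 1*x[0] - 2*x[1], whose gradient is constant.
adv = mi_attack([0.0, 0.0], lambda x: [1.0, -2.0], eps=0.3, alpha=0.1, steps=3)
print(adv)
```

The momentum term stabilises the update direction across iterations, which is the property usually credited with better transfer to unseen models.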
In modern industrial production, foreign object detection in complex environments is crucial to ensure product quality and production safety. Detection systems based on deep-learning image processing algorithms often face challenges with handling high-resolution images and achieving accurate detection against complex backgrounds. To address these issues, this study employs the PatchCore unsupervised anomaly detection algorithm combined with data augmentation techniques to enhance the system's generalization capability across varying lighting conditions, viewing angles, and object scales. The proposed method is evaluated in a complex industrial detection scenario involving the bogie of an electric multiple unit (EMU). A dataset consisting of complex backgrounds, diverse lighting conditions, and multiple viewing angles is constructed to validate the performance of the detection system in real industrial environments. Experimental results show that the proposed model achieves an average area under the receiver operating characteristic curve (AUROC) of 0.92 and an average F1 score of 0.85. Combined with data augmentation, the proposed model improves AUROC by 0.06 and F1 score by 0.03, demonstrating enhanced accuracy and robustness for foreign object detection in complex industrial settings. In addition, the effects of key factors on detection performance are systematically analyzed, providing practical guidance for parameter selection in real industrial applications.
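The two metrics reported above can be computed from per-image anomaly scores as in the sketch below. It is a plain-Python illustration using the standard definitions (AUROC as the fraction of correctly ordered positive/negative pairs, F1 at a fixed threshold); the labels, scores, and threshold are made-up toy values, not data from the study.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold):
    """F1 of the binary decision 'score >= threshold means anomalous'."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


labels = [0, 0, 1, 1]            # 1 = foreign object present (toy data)
scores = [0.1, 0.4, 0.35, 0.8]   # anomaly scores (toy data)
print(auroc(labels, scores))             # → 0.75
print(f1_score(labels, scores, 0.5))
```

AUROC is threshold-free, whereas F1 depends on the chosen operating threshold, which is one reason the two are usually reported together.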
Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. In the proposed approach, a "split-and-enhance" principle, introduced to our knowledge for the first time in the SOD literature, hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy with lower computational cost compared to recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches an MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
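The MAE figure quoted above is, in the usual SOD convention, the mean absolute difference between the predicted saliency map and the ground-truth mask, both scaled to [0, 1]. The following sketch assumes that convention; the 2×2 maps are toy values, not results from the paper.

```python
def saliency_mae(pred, gt):
    """Mean absolute error between two equally sized 2D maps with values in [0, 1]."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    diffs = [abs(p - g) for prow, grow in zip(pred, gt) for p, g in zip(prow, grow)]
    return sum(diffs) / len(diffs)


pred = [[0.9, 0.2],
        [0.1, 0.8]]   # predicted saliency (toy values)
gt = [[1.0, 0.0],
      [0.0, 1.0]]     # binary ground-truth mask (toy values)
print(round(saliency_mae(pred, gt), 3))  # → 0.15
```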
Laser-induced aerosols, predominantly submicron in size, pose significant environmental and health risks during the decommissioning of nuclear reactors. This study experimentally investigated the removal of laser-generated aerosol particles using a water spray system integrated with an innovative system for pre-injecting electrically charged mist in our facility. To simulate aerosol generation in reactor decommissioning, a high-power laser was used to irradiate various materials (including stainless steel, carbon steel, and concrete), generating aerosol particles that were agglomerated with injected water mist and subsequently scavenged by water spray. Experimental results demonstrate enhanced aerosol removal via aerosol-mist agglomeration, with charged mist significantly improving particle capture by increasing wettability and size. The average improvements for stainless steel, carbon steel, and concrete were 40%, 44%, and 21%, respectively. The results of experiments using charged mist with different polarities (both positive and negative) and different surface coatings reveal that the dominant polarity of aerosols varies with the irradiated material, influenced by its crystal structure and electron emission properties. Notably, surface coatings such as ZrO_(2) and CeO_(2) were found to possibly alter aerosol charging characteristics, thereby affecting aerosol removal efficiency with charged mist configurations. The innovative aerosol-mist agglomeration approach shows promise in mitigating radiation exposure, ensuring environmental safety, and reducing contaminated water during reactor dismantling. This study contributes critical knowledge for the development of advanced aerosol management strategies for nuclear reactor decommissioning. The understanding obtained in this work is also expected to be useful for various environmental and chemical engineering applications such as gas decontamination, air purification, and pollution control.
To address the issues of frequent identity switches (IDs) and degraded identification accuracy in multi-object tracking (MOT) under complex occlusion scenarios, this study proposes an occlusion-robust tracking framework based on face-pedestrian joint feature modeling. By constructing a joint tracking model centered on "intra-class independent tracking + cross-category dynamic binding", designing a multi-modal matching metric with spatio-temporal and appearance constraints, and introducing a cross-category feature mutual verification mechanism and a dual matching strategy, this work effectively resolves the performance degradation that traditional single-category tracking methods suffer under short-term occlusion, cross-camera tracking, and crowded environments. Experiments on the Chokepoint_Face_Pedestrian_Track test set demonstrate that in complex scenes, the proposed method improves the Face-Pedestrian Matching F1 area under the curve (F1 AUC) by approximately 4 to 43 percentage points compared to several traditional methods. The joint tracking model achieves overall performance metrics of IDF1: 85.1825% and MOTA: 86.5956%, representing improvements of 0.91 and 0.06 percentage points, respectively, over the baseline model. Ablation studies confirm the effectiveness of key modules such as the Intersection over Area (IoA)/Intersection over Union (IoU) joint metric and dynamic threshold adjustment, validating the significant role of the cross-category identity matching mechanism in enhancing tracking stability. Our model shows a 16.7% drop in frames per second (FPS) relative to FairMOT (fairness of detection and re-identification in multiple object tracking), with its cross-category binding module adding about 10% overhead, yet it maintains near-real-time performance for essential face-pedestrian tracking at small resolutions.
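To illustrate why an IoA term can complement IoU when binding a small face box to a larger pedestrian box, the sketch below computes both for axis-aligned boxes in `(x1, y1, x2, y2)` form. Using the face box's area as the IoA denominator is an assumption, as are the function names and the toy coordinates; the paper's actual joint metric and thresholds are not reproduced here.

```python
def intersection_area(a, b):
    """Overlap area of two boxes given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def box_area(box):
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two boxes."""
    inter = intersection_area(a, b)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def ioa(inner, outer):
    """Intersection over the area of `inner` (assumed here to be the face box)."""
    area = box_area(inner)
    return intersection_area(inner, outer) / area if area > 0 else 0.0


face = (2.0, 2.0, 4.0, 4.0)        # toy face box, area 4
pedestrian = (0.0, 0.0, 4.0, 8.0)  # toy pedestrian box, area 32
print(iou(face, pedestrian))  # → 0.125
print(ioa(face, pedestrian))  # → 1.0
```

A face lying entirely inside a pedestrian box scores a low IoU because of the size mismatch, while IoA stays at 1.0, which is the intuition behind combining the two.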
With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) a Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs. the baseline's 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO's superior performance in detecting objects and handling occlusions in complex drone scenarios.
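The reported gains are absolute differences in percentage points, which is easy to confuse with relative improvement. The short check below recomputes both from the figures quoted in the abstract; the helper name `gains` is hypothetical, and the relative figures are derived here rather than stated by the authors.

```python
def gains(baseline, improved):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = improved - baseline
    relative = absolute / baseline * 100.0
    return round(absolute, 1), round(relative, 1)


# Figures quoted in the abstract (YOLOv11n baseline vs. EHDC-YOLO).
print(gains(33.2, 46.1))  # mAP@0.5
print(gains(19.5, 28.0))  # mAP@0.5:0.95
print(gains(2.58, 2.81))  # parameters in millions: absolute value is in M, not points
```

The first element of each pair reproduces the 12.9 and 8.5 point improvements stated above; the parameter count grows by 0.2 M after rounding, or roughly 9% in relative terms.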
Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone, and inconsistent, emphasizing the need for a reliable and automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for detecting various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the images through preprocessing and augmentation, a dataset consisting of 3300 annotated RGB-D images was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall balance, inference latency (target ≥30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter's increased latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced excellence with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, with an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOv11-based Windows application was successfully deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
The goal of the present work is to demonstrate the potential of Artificial Neural Network (ANN)-driven Genetic Algorithm (GA) methods for optimizing the energy and economic performance of energy efficiency measures in a multi-family residential building in Greece. The energy efficiency measures include different heating/cooling systems (such as low-temperature and high-temperature heat pumps, natural gas boilers, and split units), building envelope components for the floor, walls, roof, and windows with variable heat transfer coefficients, and the installation of solar thermal collectors and PVs. The calculations of the building loads and of the investment, operating, and maintenance costs of the measures are based on the methodology defined in Directive 2010/31/EU, while the economic assumptions are based on the EN 15459-1 standard. Multi-objective optimization of energy efficiency measures often requires the simulation of very large numbers of cases involving numerous possible combinations, resulting in an intense computational load. The results of the study indicate that ANN-driven GA methods can be used as an alternative, valuable tool for reliably predicting the optimal measures that minimize the primary energy consumption and life cycle cost of the building with greatly reduced computational requirements. Through these GA methods, the computational time needed to obtain the optimal solutions is reduced by 96.4%-96.8%.
To enhance the efficiency of wastewater biotreatment with microalgae, the effects of physical parameters need to be investigated and optimized. In this regard, the individual and interactive effects of temperature, p ...To enhance the efficiency of wastewater biotreatment with microalgae, the effects of physical parameters need to be investigated and optimized. In this regard, the individual and interactive effects of temperature, p H and aeration rate on the performance of biological removal of nitrate and phosphate by Chlorella vulgaris were studied by response surface methodology(RSM). Furthermore, a multi-objective optimization technique was applied to the response equations to simultaneously find optimal combinations of input parameters capable of removing the highest possible amount of nitrate and phosphate. The optimal calculated values were temperature of 26.3 °C, pH of 8 and aeration rate of 4.7 L·min^(-1). Interestingly, under the optimum condition, approximately 85% of total nitrate and 77% of whole phosphate were removed after 48 h and 24 h, respectively, which were in excellent agreement with the predicted values. Finally, the effect of baffle on mixing performance and, as a result, on bioremoval efficiency was investigated in Stirred Tank Photobioreactor(STP) by means of Computational Fluid Dynamics(CFD). Flow behavior indicated substantial enhancement in mixing performance when the baffle was inserted into the tank. Obtained simulation results were validated experimentally. Under the optimum condition, due to proper mixing in baffled STP, nitrate and phosphate removal increased up to 93% and 86%,respectively, compared to unbaffled one.展开更多
Dear Editor, This letter focuses on the fact that, in object detection tasks, small objects with few pixels disappear from feature maps with large receptive fields as the network deepens. The detection of dense small objects is therefore challenging.
Funding: Supported by the National Natural Science Foundation of China (No. 62276204), the Fundamental Research Funds for the Central Universities, China (No. YJSJ24011), the Natural Science Basic Research Program of Shaanxi, China (Nos. 2022JM-340 and 2023-JC-QN-0710), and the China Postdoctoral Science Foundation (Nos. 2020T130494 and 2018M633470).
Abstract: Visible and infrared (RGB-IR) fusion object detection plays an important role in security, disaster relief, and related applications. In recent years, deep-learning-based RGB-IR fusion detection methods have developed rapidly, but they still struggle with the complex and changing scenarios captured by drones, mainly for two reasons: (A) RGB-IR fusion detectors are susceptible to inferior inputs that degrade performance and stability. (B) RGB-IR fusion detectors are susceptible to redundant features that reduce accuracy and efficiency. In this paper, an RGB-IR fusion detection framework based on global-local feature optimization, named GLFDet, is proposed to improve the detection performance and efficiency for drone-captured objects. The key components of GLFDet are a Global Feature Optimization (GFO) module, a Local Feature Optimization (LFO) module, and a Channel Separation Fusion (CSF) module. Specifically, GFO calculates the information content of the input image in the frequency domain and optimizes the features holistically. Then, LFO dynamically selects high-value features and filters out low-value features before fusion, which significantly improves fusion efficiency. Finally, CSF fuses the RGB and IR features across corresponding channels, which avoids rearranging the channel relationships and enhances model stability. Extensive experimental results show that the proposed method achieves the best performance on three popular RGB-IR datasets: DroneVehicle, VEDAI, and LLVIP. In addition, GLFDet is more lightweight than other comparable models, making it more appealing for edge devices such as drones. The code is available at https://github.com/lao chen330/GLFDet.
Funding: Supported by the National Science Foundation for Distinguished Young Scholars of China (No. 52325506) and the Fundamental Research Funds for the Central Universities (No. DUT22LAB501).
Abstract: Ultrasonic-Assisted Grinding (UAG) is a novel manufacturing technology that shows great promise for processing Ceramic Matrix Composites (CMCs). Nevertheless, analyzing the material removal process of CMCs with a multidirectional structure during UAG is challenging, which impedes the progress and improvement of the UAG process. This work examined the impact of ultrasonic vibration on the dynamic mechanical characteristics during processing. Additionally, we experimentally elucidated the material removal mechanism of CMCs during the scratching process under the influence of vertical vibration. The results indicate that the introduction of ultrasonic vibration causes a strain rate effect, which modifies the material removal mechanism and in turn affects the processing quality. Ultrasonic vibration increases the dynamic strength and brittleness of the fibers in CMCs, leading to more cracks at fracture, and the fracture mode changes from the original bending fracture to shear fracture. In addition, ultrasonic vibration can effectively inhibit the impact of scratching depth and anisotropy on the removal mechanism of CMCs, resulting in a more uniform CMC surface after processing.
Funding: Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Human object detection and recognition is essential for elderly monitoring and assisted living; however, models relying solely on pose or scene context often struggle in cluttered or visually ambiguous settings. To address this, we present SCENET-3D, a transformer-driven multimodal framework that unifies human-centric skeleton features with scene-object semantics for intelligent robotic vision through a three-stage pipeline. In the first stage, scene analysis, rich geometric and texture descriptors are extracted from RGB frames, including surface-normal histograms, angles between neighboring normals, Zernike moments, directional standard deviation, and Gabor-filter responses. In the second stage, scene-object analysis, non-human objects are segmented and represented using local feature descriptors and complementary surface-normal information. In the third stage, human-pose estimation, silhouettes are processed through an enhanced MoveNet to obtain 2D anatomical keypoints, which are fused with depth information and converted into RGB-based point clouds to construct pseudo-3D skeletons. Features from all three stages are fused and fed into a transformer encoder with multi-head attention to resolve visually similar activities. Experiments on UCLA (95.8%), ETRI-Activity3D (89.4%), and CAD-120 (91.2%) demonstrate that combining pseudo-3D skeletons with rich scene-object fusion significantly improves generalizable activity recognition, enabling safer elderly care, natural human–robot interaction, and robust context-aware robotic perception in real-world environments.
Abstract: Multi-objective optimization problems, especially in constrained environments such as power distribution planning, demand robust strategies for discovering effective solutions. This work presents an improved variant of the Multi-population Cooperative Constrained Multi-Objective Optimization (MCCMO) algorithm, termed Adaptive Diversity Preservation (ADP). The enhancement focuses primarily on improved constraint handling strategies, local search integration, hybrid selection approaches, and adaptive parameter control. The improved variant was tested on the RWMOP50 power distribution system planning benchmark. According to the findings, it outperformed the original MCCMO across the eleven performance metrics, particularly in terms of convergence speed, constraint handling efficiency, and solution diversity. The results also establish that MCCMO-ADP consistently delivers substantial performance gains over the baseline MCCMO, demonstrating its effectiveness across performance metrics. The new variant also excels at maintaining a balanced trade-off between exploration and exploitation throughout the search process, making it especially suitable for complex optimization problems in multi-constrained power systems. These enhancements make MCCMO-ADP a promising candidate for handling problems such as renewable energy scheduling, logistics planning, and power system optimization. Future work will benchmark MCCMO-ADP against widely recognized algorithms such as NSGA-Ⅱ, NSGA-Ⅲ, and MOEA/D and will also extend its validation to large-scale real-world optimization domains to further consolidate its generalizability.
Funding: Special Research Assistant Program, China (2024000020); the Science and Technology Department of Qinghai Province, China (2024-ZJ-918); the "Kunlun Talents" Program of Qinghai (2024000075).
Abstract: Graphene-based separation membranes hold promise for water treatment. However, their practical deployment in high-salinity brines remains challenging due to structural instability. Herein, a defect-free Na^(+)-Cu^(2+)/GO-PEI nanocomposite membrane was fabricated via a pH-controlled cross-linking polymerization strategy. Polyethyleneimine (PEI) serves as a critical interfacial stabilizer, enhancing the connection between the Na^(+)-GO and Cu^(2+)-GO layers through amide bond formation with GO nanosheets while facilitating Cu^(2+) chelation. The Na^(+)/GO layer modifies the pore structure of the polyether sulfone (PES) substrate, synergistically optimizing the membrane's microstructure. Performance evaluation revealed that the as-prepared membrane achieved exceptional separation efficiency (>98%) for tributyl phosphate, sulfonated kerosene, and bis(2-ethylhexyl) phosphate in high-salinity brine, accompanied by a high flux of 160–224 L·m^(-2)·h^(-1). Notably, it exhibited robust chemical stability in corrosive environments and maintained mechanical durability after 500 folding cycles, with consistent separation performance over 10 recycling runs. This study presents a novel multi-component modification approach for constructing high-performance GO-based membranes, with promising practical applications in organic pollutant removal from high-salt solutions.
Funding: Supported and funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Recognising human-object interactions (HOI) is a challenging task for traditional machine learning models, including convolutional neural networks (CNNs). Existing models show limited transferability across complex datasets such as D3D-HOI and SYSU 3D HOI. The conventional architecture of CNNs restricts their ability to handle highly complex HOI scenarios. HOI recognition requires improved feature extraction methods to overcome the current limitations in accuracy and scalability. This work proposes a novel quantum gate-enabled hybrid CNN (QEH-CNN) for effective HOI recognition. The model enhances CNN performance by integrating quantum computing components. The framework begins with bilateral image filtering, followed by multi-object tracking (MOT) and Felzenszwalb superpixel segmentation. A watershed algorithm refines object boundaries by cleaning merged superpixels. Feature extraction combines a histogram of oriented gradients (HOG), Global Image Statistics for Texture (GIST) descriptors, and a novel 23-joint keypoint extraction method using relative joint angles and joint proximity measures. A fuzzy optimization process refines the extracted features before feeding them into the QEH-CNN model. The proposed model achieves 95.06% accuracy on the 3D-D3D-HOI dataset and 97.29% on the SYSU 3D HOI dataset. The integration of quantum computing enhances feature optimization, leading to improved accuracy and overall model efficiency.
Funding: Supported by the Natural Science Foundation of China (No. 52470105) and the Young Taishan Scholars Program of Shandong Province (No. 358202103017).
Abstract: The discharge of micro-polluted water from sources such as agricultural runoff, urban stormwater, and treated effluents presents significant challenges to aquatic ecosystems. Constructed wetlands (CWs) have gained recognition as an eco-friendly solution for removing pollutants from various wastewater sources and are increasingly applied to micro-polluted water treatment. A review of 78 full-scale CW studies from Web of Science shows that ammonium nitrogen (NH4+-N) concentrations in runoff, wastewater treatment plant effluent, and polluted rivers ranged over 0.1–6.6, 0.3–12.3, and 0.2–41.1 mg/L, respectively. The corresponding nitrate nitrogen concentrations ranged over 0.2–14.2, 0–5.7, and 0–2.6 mg/L, respectively. Removal efficiencies of CWs for micro-polluted water varied by CW type. The total nitrogen removal efficiencies for subsurface-flow CWs, free-water surface-flow CWs, and hybrid CWs ranged from 27.4% to 66.5%, 16.8% to 89.8%, and 19.4% to 88.2%, respectively. The NH4+-N removal efficiencies ranged from 34.2% to 73.6%, 38.4% to 89.4%, and 13.5% to 94.2%, respectively. Additionally, other factors influencing contaminant removal efficiency, such as hydraulic retention time, vegetation type, redox micro-environment, and influent water quality, were evaluated. Based on these findings, two strategies for improving the purification performance of CWs were proposed: selecting and incorporating electron donor substrates, and optimizing operating parameters. This paper serves as a synthesis of information to guide future research and full-scale CW applications in micro-polluted water treatment.
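The removal efficiencies quoted above are conventionally computed from influent and effluent concentrations. The sketch below assumes that standard definition; the abstract does not state the exact formula used by the reviewed studies, and the function name and example values are hypothetical.

```python
def removal_efficiency(c_in: float, c_out: float) -> float:
    """Percentage of a pollutant removed between inlet and outlet.

    c_in and c_out are concentrations in the same unit (e.g. mg/L).
    Assumes the conventional definition (c_in - c_out) / c_in * 100.
    """
    if c_in <= 0:
        raise ValueError("influent concentration must be positive")
    return (c_in - c_out) / c_in * 100.0


# Hypothetical influent/effluent values, not taken from the reviewed studies.
print(f"{removal_efficiency(10.0, 2.5):.1f}% removed")
```

A negative result would indicate that the effluent concentration exceeds the influent, for example when a wetland releases stored nitrogen.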
Funding: Funded by the Hainan Province Science and Technology Special Fund under Grant ZDYF2024GXJS292.
Abstract: Deep learning has made significant progress in oriented object detection for remote sensing images. However, existing methods still face challenges with difficult tasks such as multi-scale targets, complex backgrounds, and small objects. Keeping models lightweight to meet resource constraints in remote sensing scenarios while improving task performance remains a research hotspot. Therefore, we propose EM-YOLO, an enhanced multi-scale feature extraction lightweight network based on the YOLOv8s architecture and specifically optimized for the large target scale variations, diverse orientations, and numerous small objects found in remote sensing images. Our main innovations are as follows. First, a dynamic snake convolution (DSC) is introduced into the backbone network to enhance the model's feature extraction capability for oriented targets. Second, a focusing-diffusion module is designed in the feature fusion neck to effectively integrate multi-scale feature information. Finally, we introduce the Layer-Adaptive Sparsity for magnitude-based Pruning (LASP) method to prune the network so that it can better complete tasks in resource-constrained scenarios. Experimental results on the lightweight Orin platform demonstrate that the proposed method significantly outperforms the original YOLOv8s model in oriented remote sensing object detection tasks and achieves comparable or superior performance to state-of-the-art methods on three authoritative remote sensing datasets (DOTA v1.0, DOTA v1.5, and HRSC2016).
Funding: Funded by the National Natural Science Foundation of China under Grant No. 62371187 and the Open Program of Hunan Intelligent Rehabilitation Robot and Auxiliary Equipment Engineering Technology Research Center under Grant No. 2024JS101.
Abstract: The ubiquity of mobile devices has driven advancements in mobile object detection. However, challenges in multi-scale object detection in open, complex environments persist due to limited computational resources. Traditional approaches such as network compression, quantization, and lightweight design often sacrifice accuracy or the robustness of feature representation. This article introduces the Fast Multi-scale Channel Shuffling Network (FMCSNet), a novel lightweight detection model optimized for mobile devices. FMCSNet integrates a fully convolutional Multilayer Perceptron (MLP) module, offering global perception without significantly increasing parameters and effectively bridging the gap between CNNs and Vision Transformers. FMCSNet balances computation and accuracy mainly through two key modules: the ShiftMLP module, which comprises a shift operation and an MLP module, and a Partial group Convolutional (PGConv) module, which reduces computation while enhancing information exchange between channels. With a computational complexity of 1.4G FLOPs and 1.3M parameters, FMCSNet outperforms CNN-based and DWConv-based ShuffleNetv2 by 1% and 4.5% mAP on the Pascal VOC 2007 dataset, respectively. Additionally, FMCSNet achieves a mAP of 30.0 (0.5:0.95 IoU threshold) with only 2.5G FLOPs and 2.0M parameters. It achieves 32 FPS on low-performance i5-series CPUs, meeting real-time detection requirements. The adaptability of the PGConv module across scenarios further highlights FMCSNet as a promising solution for real-time mobile object detection.
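To make "information exchange between channels" concrete, the following is a minimal sketch of the ShuffleNet-style channel shuffle that grouped-convolution networks commonly use. It is an illustration under that assumption only: the abstract does not describe how FMCSNet or PGConv implements channel mixing, and `channel_shuffle` is a hypothetical helper that operates on a flat list standing in for the channel axis.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (ShuffleNet-style shuffle).

    The channel list is viewed as a [groups][per_group] grid, transposed,
    and flattened, so each output group contains channels from every
    input group.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
```

After the shuffle, a subsequent grouped convolution sees channels that originated in different groups, which is what allows information to cross group boundaries at no arithmetic cost.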
Funding: Supported in part by the Chongqing Research Program of Basic Research and Frontier Technology under Grant CSTB2025NSCQ-GPX1309.
Abstract: Small object detection has been a focus of attention since the emergence of deep learning-based object detection. Although classical object detection frameworks have contributed significantly to the field, many issues remain unresolved in detecting small objects because of the inherent complexity and diversity of real-world visual scenes. In particular, the YOLO (You Only Look Once) series of detection models, renowned for their real-time performance, have undergone numerous adaptations aimed at improving the detection of small targets. In this survey, we summarize the state-of-the-art YOLO-based small object detection methods. The review presents a systematic categorization of YOLO-based approaches for small object detection, organized into four methodological avenues: attention-based feature enhancement, detection-head optimization, loss function design, and multi-scale feature fusion strategies. We then examine the principal challenges addressed by each category. Finally, we analyze the performance of these methods on public benchmarks and, by comparing current approaches, identify limitations and outline directions for future research.
Abstract: In recent years, with the rapid advancement of artificial intelligence, object detection algorithms have made significant strides in accuracy and computational efficiency. Notably, research on and applications of Anchor-Free models have opened new avenues for real-time target detection in optical remote sensing images (ORSIs). However, in the realm of adversarial attacks, developing adversarial techniques tailored to Anchor-Free models remains challenging. Adversarial examples generated from Anchor-Based models often transfer poorly to these new model architectures. Furthermore, the growing diversity of Anchor-Free models poses additional hurdles to achieving robust transferability of adversarial attacks. This study presents an improved cross-conv-block feature fusion You Only Look Once (YOLO) architecture, engineered to facilitate the extraction of more comprehensive semantic features during backpropagation. To address the asymmetry between densely distributed objects in ORSIs and the corresponding detector outputs, a novel dense bounding box attack strategy is proposed. This approach uses a dense target bounding box loss in the calculation of the adversarial loss function. Furthermore, by integrating translation-invariant (TI) and momentum-iteration (MI) adversarial methodologies, the proposed framework significantly improves the transferability of adversarial attacks. Experimental results demonstrate that our method achieves superior adversarial attack performance, with adversarial transferability rates (ATR) of 67.53% on the NWPU VHR-10 dataset and 90.71% on the HRSC2016 dataset. Compared to ensemble adversarial attack and cascaded adversarial attack approaches, our method generates adversarial examples in an average of 0.64 s, an approximately 14.5% improvement in efficiency under equivalent conditions.
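The momentum-iteration (MI) component mentioned above can be illustrated with one update step of the generic MI-FGSM rule: accumulate the L1-normalized gradient into a momentum term, step along its sign, and clip to the perturbation budget. This is a sketch of that generic rule only; the dense bounding box loss and the translation-invariant smoothing used in the paper are not reproduced, and the function names, parameters, and values are hypothetical.

```python
def _sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def mi_fgsm_step(x, x_orig, grad, momentum, alpha, eps, mu=1.0):
    """One momentum-iterative FGSM update on a flat list of pixel values.

    x        current adversarial example
    x_orig   clean input (centre of the allowed perturbation ball)
    grad     gradient of the adversarial loss with respect to x
    momentum accumulated gradient from earlier iterations
    alpha    step size, eps: L-infinity budget, mu: momentum decay
    """
    l1 = sum(abs(g) for g in grad) or 1.0  # avoid division by zero
    momentum = [mu * m + g / l1 for m, g in zip(momentum, grad)]
    stepped = [xi + alpha * _sign(m) for xi, m in zip(x, momentum)]
    # Project back into the eps-ball around the clean input.
    clipped = [
        min(max(s, xo - eps), xo + eps) for s, xo in zip(stepped, x_orig)
    ]
    return clipped, momentum


# Hypothetical two-pixel example with an all-zero initial momentum.
x_adv, m = mi_fgsm_step(
    [0.5, 0.5], [0.5, 0.5], [2.0, -2.0], [0.0, 0.0], alpha=0.1, eps=0.15
)
print(x_adv, m)
```

The momentum term smooths the update direction across iterations, which is the property generally credited with improving transfer to unseen models.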
Abstract: In modern industrial production, foreign object detection in complex environments is crucial to ensuring product quality and production safety. Detection systems based on deep-learning image processing algorithms often struggle to handle high-resolution images and to detect accurately against complex backgrounds. To address these issues, this study employs the PatchCore unsupervised anomaly detection algorithm combined with data augmentation techniques to enhance the system's generalization capability across varying lighting conditions, viewing angles, and object scales. The proposed method is evaluated in a complex industrial detection scenario involving the bogie of an electric multiple unit (EMU). A dataset covering complex backgrounds, diverse lighting conditions, and multiple viewing angles is constructed to validate the performance of the detection system in real industrial environments. Experimental results show that the proposed model achieves an average area under the receiver operating characteristic curve (AUROC) of 0.92 and an average F1 score of 0.85. With data augmentation, the model improves AUROC by 0.06 and F1 score by 0.03, demonstrating enhanced accuracy and robustness for foreign object detection in complex industrial settings. In addition, the effects of key factors on detection performance are systematically analyzed, providing practical guidance for parameter selection in real industrial applications.
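For readers unfamiliar with the two reported metrics, the sketch below computes AUROC as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one, and F1 from detection counts. These are the standard definitions; the scores, labels, and counts are hypothetical and unrelated to the EMU bogie dataset.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores.

    labels use 1 for anomalous (positive) and 0 for normal (negative);
    ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical anomaly scores and ground-truth labels.
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
print(f1_score(tp=8, fp=2, fn=2))
```

AUROC is threshold-free, whereas F1 depends on the decision threshold chosen for the anomaly score, which is one reason the two are usually reported together.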
Abstract: Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. The proposed approach follows a "split-and-enhance" principle, introduced to our knowledge for the first time in the SOD literature, which hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy at lower computational cost than recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches a MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
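The MAE figure quoted above is, in the SOD literature, usually the mean absolute difference between the predicted saliency map and the binary ground-truth mask. The sketch below assumes that conventional definition; the 2×2 maps are hypothetical and far smaller than real benchmark images.

```python
def saliency_mae(pred, gt):
    """Mean absolute error between a saliency map and a ground-truth mask.

    pred holds values in [0, 1]; gt holds 0 or 1. Both are lists of rows
    with identical shape.
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must share a shape")
    total = sum(abs(p - g) for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    count = sum(len(row) for row in pred)
    return total / count


# Hypothetical 2x2 prediction and mask.
pred = [[0.9, 0.1], [0.2, 0.8]]
mask = [[1, 0], [0, 1]]
print(saliency_mae(pred, mask))
```

Lower is better: a MAE of 0.034 means the predicted map deviates from the mask by about 3.4% of the value range per pixel on average.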
Funding: Financial support from the Nuclear Energy Science & Technology and Human Resource Development Project of the Japan Atomic Energy Agency/Collaborative Laboratories for Advanced Decommissioning Science (No. R04I034). The author Ruicong Xu appreciates the scholarship (financial support) from the China Scholarship Council (CSC, No. 202106380073).
Abstract: Laser-induced aerosols, predominantly submicron in size, pose significant environmental and health risks during the decommissioning of nuclear reactors. This study experimentally investigated the removal of laser-generated aerosol particles using a water spray system integrated with an innovative system for pre-injecting electrically charged mist in our facility. To simulate aerosol generation in reactor decommissioning, a high-power laser was used to irradiate various materials (including stainless steel, carbon steel, and concrete), generating aerosol particles that were agglomerated with injected water mist and subsequently scavenged by water spray. Experimental results demonstrate enhanced aerosol removal via aerosol-mist agglomeration, with charged mist significantly improving particle capture by increasing wettability and size. The average improvements for stainless steel, carbon steel, and concrete were 40%, 44%, and 21%, respectively. Experiments using charged mist of different polarities (both positive and negative) and different surface coatings reveal that the dominant polarity of aerosols varies with the irradiated material, influenced by its crystal structure and electron emission properties. Notably, surface coatings such as ZrO_(2) and CeO_(2) were found to possibly alter aerosol charging characteristics, thereby affecting aerosol removal efficiency with charged mist configurations. The aerosol-mist agglomeration approach shows promise in mitigating radiation exposure, ensuring environmental safety, and reducing contaminated water during reactor dismantling. This study contributes critical knowledge for the development of advanced aerosol management strategies for nuclear reactor decommissioning. The understanding obtained in this work is also expected to be useful for various environmental and chemical engineering applications such as gas decontamination, air purification, and pollution control.
Funding: Supported by confidential research grant No. a8317.
Abstract: To address frequent identity switches (IDs) and degraded identification accuracy in multi-object tracking (MOT) under complex occlusion scenarios, this study proposes an occlusion-robust tracking framework based on face-pedestrian joint feature modeling. The framework constructs a joint tracking model centered on "intra-class independent tracking + cross-category dynamic binding", designs a multi-modal matching metric with spatio-temporal and appearance constraints, and introduces a cross-category feature mutual verification mechanism and a dual matching strategy. Together, these effectively resolve the performance degradation that traditional single-category tracking methods suffer under short-term occlusion, cross-camera tracking, and crowded environments. Experiments on the Chokepoint_Face_Pedestrian_Track test set demonstrate that, in complex scenes, the proposed method improves the Face-Pedestrian Matching F1 area under the curve (F1 AUC) by approximately 4 to 43 percentage points compared to several traditional methods. The joint tracking model achieves an overall IDF1 of 85.1825% and a MOTA of 86.5956%, improvements of 0.91 and 0.06 percentage points, respectively, over the baseline model. Ablation studies confirm the effectiveness of key modules such as the Intersection over Area (IoA)/Intersection over Union (IoU) joint metric and dynamic threshold adjustment, validating the significant role of the cross-category identity matching mechanism in enhancing tracking stability. The proposed model shows a 16.7% drop in frames per second (FPS) relative to FairMOT (Fairness of Detection and Re-identification in Multiple Object Tracking), with its cross-category binding module adding about 10% overhead, yet it maintains near-real-time performance for essential face-pedestrian tracking at small resolutions.
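The IoA/IoU joint metric mentioned in the ablation can be motivated with a small example: a face box lying entirely inside a pedestrian box has a low IoU but an IoA of 1. The sketch below uses the standard definitions, with IoA taken relative to the first box; which box's area the paper normalizes by, and how the two metrics are combined, is not stated in the abstract, so both are assumptions here. Boxes are hypothetical `(x1, y1, x2, y2)` tuples.

```python
def _area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    return max(box[2] - box[0], 0.0) * max(box[3] - box[1], 0.0)


def _intersection(a, b):
    """Overlap area of two axis-aligned boxes (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def iou(a, b):
    """Intersection over Union of boxes a and b."""
    inter = _intersection(a, b)
    union = _area(a) + _area(b) - inter
    return inter / union if union > 0 else 0.0


def ioa(a, b):
    """Intersection over the Area of box a (assumed: the smaller box)."""
    area_a = _area(a)
    return _intersection(a, b) / area_a if area_a > 0 else 0.0


# Hypothetical face box fully contained in a pedestrian box.
face = (2.0, 0.0, 4.0, 2.0)
pedestrian = (0.0, 0.0, 6.0, 10.0)
print(round(iou(face, pedestrian), 4), ioa(face, pedestrian))
```

IoU alone would reject this pairing because the boxes differ greatly in size, whereas IoA captures the containment relationship that binds a face to its pedestrian.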
Abstract: With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) a Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs. the baseline's 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO's superior performance in detecting objects and handling occlusions in complex drone scenarios.
Abstract: Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone, and inconsistent, emphasizing the need for a reliable, automated inspection system. Leveraging both object detection and image segmentation, this research proposes a vision-based solution that uses deep learning (DL) models to detect the various kinds of tools in a toolkit. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After multiple constraints were applied and the images were enhanced through preprocessing and augmentation, a dataset of 3300 annotated RGB-D photos was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall equilibrium, inference latency (target ≥ 30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models because of the latter's higher latency and resource requirements. YOLOV5, YOLOV8, YOLOV11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOV11 demonstrated balanced excellence, with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, and 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, at an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOV11-based Windows application was successfully deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application precisely measures socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
Abstract: The goal of the present work is to demonstrate the potential of Artificial Neural Network (ANN)-driven Genetic Algorithm (GA) methods for optimizing the energy and economic performance of energy efficiency measures in a multi-family residential building in Greece. The energy efficiency measures include different heating/cooling systems (such as low-temperature and high-temperature heat pumps, natural gas boilers, and split units), building envelope components for the floor, walls, roof, and windows with variable heat transfer coefficients, and the installation of solar thermal collectors and PVs. The calculations of the building loads and of the investment, operating, and maintenance costs of the measures are based on the methodology defined in Directive 2010/31/EU, while the economic assumptions are based on the EN 15459-1 standard. Multi-objective optimization of energy efficiency measures typically requires the simulation of very large numbers of cases involving numerous possible combinations, resulting in an intense computational load. The results of the study indicate that ANN-driven GA methods can be used as a valuable alternative tool for reliably predicting the optimal measures that minimize the primary energy consumption and life cycle cost of the building with greatly reduced computational requirements. Through GA methods, the computational time needed to obtain the optimal solutions is reduced by 96.4%-96.8%.
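The idea of a surrogate-driven GA can be sketched as follows, under loud assumptions: the `surrogate` function is an invented stand-in for a trained ANN, the option counts in `OPTIONS` are hypothetical, and the two objectives are collapsed into a weighted sum for brevity, which is a simplification and not the multi-objective procedure of the study.

```python
import random

# Hypothetical decision variables: number of options per measure category.
OPTIONS = {"heating": 4, "wall": 3, "roof": 3, "window": 3, "solar": 2, "pv": 2}
KEYS = list(OPTIONS)


def surrogate(ind):
    """Stand-in for a trained ANN: returns (primary_energy, life_cycle_cost).

    The formulas are invented purely for illustration.
    """
    level = sum(ind)                  # higher index = more ambitious measure
    energy = 200.0 - 12.0 * level     # made-up energy score
    cost = 100.0 + 1.5 * level ** 2   # made-up cost score
    return energy, cost


def fitness(ind, w_energy=0.5, w_cost=0.5):
    """Weighted-sum scalarisation of the two objectives (lower is better)."""
    energy, cost = surrogate(ind)
    return w_energy * energy + w_cost * cost


def random_individual(rng):
    return [rng.randrange(OPTIONS[k]) for k in KEYS]


def mutate(ind, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return [rng.randrange(OPTIONS[k]) if rng.random() < rate else g
            for k, g in zip(KEYS, ind)]


def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def tournament(pop, rng, k=3):
    """Pick the fittest of k randomly sampled individuals."""
    return min(rng.sample(pop, k), key=fitness)


def run_ga(pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = min(pop, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = min(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = run_ga()
    print(dict(zip(KEYS, best)), fitness(best))
```

Each fitness evaluation here costs a few arithmetic operations instead of a full building simulation, which is the source of the computational savings the abstract describes; the quality of the result then depends entirely on how well the surrogate approximates the simulator.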
Abstract: To enhance the efficiency of wastewater biotreatment with microalgae, the effects of physical parameters need to be investigated and optimized. In this regard, the individual and interactive effects of temperature, pH, and aeration rate on the biological removal of nitrate and phosphate by Chlorella vulgaris were studied by response surface methodology (RSM). Furthermore, a multi-objective optimization technique was applied to the response equations to find the combination of input parameters that simultaneously maximizes nitrate and phosphate removal. The calculated optimal values were a temperature of 26.3 °C, a pH of 8, and an aeration rate of 4.7 L·min⁻¹. Under the optimum condition, approximately 85% of total nitrate and 77% of total phosphate were removed after 48 h and 24 h, respectively, in excellent agreement with the predicted values. Finally, the effect of a baffle on mixing performance and, as a result, on bioremoval efficiency was investigated in a Stirred Tank Photobioreactor (STP) by means of Computational Fluid Dynamics (CFD). The flow behavior indicated a substantial enhancement in mixing performance when the baffle was inserted into the tank. The simulation results were validated experimentally. Under the optimum condition, owing to proper mixing in the baffled STP, nitrate and phosphate removal increased to 93% and 86%, respectively, compared with the unbaffled one.
Funding: supported in part by the National Science Foundation of China (52371372), the Project of Science and Technology Commission of Shanghai Municipality, China (22JC1401400, 21190780300), and the 111 Project, China (D18003).
Abstract: Dear Editor, This letter addresses the fact that, in object detection tasks, small objects with few pixels disappear from feature maps with large receptive fields as the network deepens. The detection of dense small objects is therefore challenging.
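A rough way to see why small objects vanish is to compare the object size with the cumulative downsampling stride of each feature level: once the object covers less than one feature-map cell, little of it survives. The sketch below uses stride as a proxy for that effect; the stride values (4, 8, 16, 32) and the one-cell threshold are common illustrative assumptions, not values taken from the letter.

```python
def footprint_cells(object_px: float, stride: int) -> float:
    """Side length of the object measured in feature-map cells at a given stride."""
    return object_px / stride


def levels_resolving(object_px: float, strides=(4, 8, 16, 32), min_cells=1.0):
    """Return the strides at which the object still spans at least `min_cells`."""
    return [s for s in strides if footprint_cells(object_px, s) >= min_cells]


if __name__ == "__main__":
    for size in (12, 64):
        print(size, levels_resolving(size))
```

Under these assumptions a 12-pixel object is still resolved at strides 4 and 8 but shrinks below one cell at stride 16, while a 64-pixel object remains visible at every level.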