The accurate segmentation of deep gray matter nuclei is critical for neuropathological research, disease diagnosis, and treatment. Existing methods employ a supervised learning approach, which requires large labeled datasets that are challenging and time-consuming to obtain for medical image analysis. In addition, methods based on convolutional neural networks (CNNs) achieve only suboptimal performance due to the locality of convolutional operations. Vision Transformers (ViTs) efficiently model long-range dependencies and thus have the potential to outperform these methods in segmentation tasks. To address these issues, we propose a novel hybrid network based on self-supervised pre-training for deep gray matter nuclei segmentation. Specifically, we present a CNN-Transformer hybrid network (CTNet), whose encoder combines a 3D CNN and a ViT to learn local spatially detailed features and global semantic information. A self-supervised learning (SSL) approach that integrates rotation prediction and masked feature reconstruction is proposed to pre-train CTNet, enabling the model to learn valuable visual representations from unlabeled data. We evaluate the effectiveness of our method on 3T and 7T human brain MRI datasets. The results demonstrate that CTNet achieves better performance than comparison models and that our pre-training strategy outperforms other advanced self-supervised methods. When the training set contains only one sample, the pre-trained CTNet improves segmentation performance by 8.4% in Dice similarity coefficient (DSC) compared to a randomly initialized CTNet.
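The rotation-prediction pretext task this abstract mentions can be sketched in a few lines: the network is asked to predict which of four in-plane rotations was applied to an unlabeled volume. This is a minimal illustration of the idea, not the paper's pipeline; the helper name and the 90-degree rotation set are assumptions.

```python
import numpy as np

def rotation_pretext_batch(volume, rng):
    """Create one rotation-prediction pretext sample from a 3D volume.

    The volume is rotated in-plane by k * 90 degrees; the pretext label is
    the rotation index k the network must predict. (Hypothetical helper;
    the paper's exact augmentation pipeline is not given in the abstract.)
    """
    k = int(rng.integers(0, 4))                   # rotation index: k * 90 degrees
    rotated = np.rot90(volume, k=k, axes=(1, 2))  # rotate each axial slice
    return rotated, k

rng = np.random.default_rng(0)
vol = np.zeros((8, 16, 16), dtype=np.float32)
vol[:, 0, :] = 1.0                                # mark one edge so rotations differ
view, label = rotation_pretext_batch(vol, rng)
```

In the actual pre-training, the predicted index would feed a cross-entropy loss alongside the masked-feature reconstruction objective.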
The collection and annotation of large-scale bird datasets are resource-intensive and time-consuming processes that significantly limit the scalability and accuracy of biodiversity monitoring systems. While self-supervised learning (SSL) has emerged as a promising approach for leveraging unannotated data, current SSL methods face two critical challenges in bird species recognition: (1) long-tailed data distributions that result in poor performance on underrepresented species; and (2) domain shift caused by data augmentation strategies designed to mitigate class imbalance. Here we present SDNet, a novel SSL-based bird recognition framework that integrates diffusion models with large language models (LLMs) to overcome these limitations. SDNet employs LLMs to generate semantically rich textual descriptions for tail-class species by prompting the models with species taxonomy, morphological attributes, and habitat information, producing detailed natural language priors that capture fine-grained visual characteristics (e.g., plumage patterns, body proportions, and distinctive markings). These textual descriptions are then used by a conditional diffusion model to synthesize new bird image samples through cross-attention mechanisms that fuse textual embeddings with intermediate visual feature representations during the denoising process, ensuring that generated images preserve species-specific morphological details while maintaining photorealistic quality. Additionally, we incorporate a Swin Transformer as the feature extraction backbone, whose hierarchical window-based attention and shifted windowing scheme enable multi-scale local feature extraction that is particularly effective at capturing fine-grained discriminative patterns (such as beak shape and feather texture) while mitigating domain shift between synthetic and original images through consistent feature representations across both data sources. SDNet is validated on both a self-constructed dataset (Bird_BXS) and a publicly available benchmark (Birds_25), demonstrating substantial improvements over conventional SSL approaches. Our results indicate that the synergistic integration of LLMs, diffusion models, and the Swin Transformer architecture contributes significantly to recognition accuracy, particularly for rare and morphologically similar species. These findings highlight the potential of SDNet to address fundamental limitations of existing SSL methods in avian recognition tasks and to establish a new paradigm for efficient self-supervised learning in large-scale ornithological vision applications.
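The cross-attention fusion step, in which text embeddings condition the denoising features, reduces to a softmax-weighted mixture of text tokens. The sketch below is a single-head NumPy illustration with assumed dimensions, not SDNet's implementation.

```python
import numpy as np

def cross_attention(visual, text):
    """Single-head cross-attention sketch: visual tokens (queries) attend over
    text-embedding tokens (keys/values), as in the conditioning step the
    abstract describes. Dimensions and the single-head form are assumptions."""
    d = text.shape[-1]
    scores = visual @ text.T / np.sqrt(d)           # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over text tokens
    return weights @ text                           # text-conditioned features

vis = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0]])              # 2 visual tokens, d = 4
txt = np.ones((3, 4))                               # 3 identical text tokens
out = cross_attention(vis, txt)
```

With identical text tokens, every attention weight averages to the same token, so the output equals that token; in practice the weights would vary with token content.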
The authors regret that there were errors in the affiliations and the funding declaration in the original published version. Affiliations a and b of the original manuscript are "School of Information Engineering, Jiangxi Provincial Key Laboratory of Advanced Signal Processing and Intelligent Communications, Nanchang University, Nanchang 330031, China" and "School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China", respectively. The order of the two affiliations was incorrect.
Intelligent Transportation Systems (ITS) leverage Integrated Sensing and Communications (ISAC) to enhance data exchange between vehicles and infrastructure in the Internet of Vehicles (IoV). This integration inevitably increases computing demands, risking real-time system stability. Vehicle Edge Computing (VEC) addresses this by offloading tasks to Road Side Units (RSUs), ensuring timely services. Our previous work, the FLSimCo algorithm, which uses local resources for federated Self-Supervised Learning (SSL), has a limitation: vehicles often cannot complete all iteration tasks. Our improved algorithm offloads partial tasks to RSUs and optimizes energy consumption by adjusting transmission power, CPU frequency, and task assignment ratios, balancing local and RSU-based training. Setting an offloading threshold further prevents inefficiencies. Simulation results show that the enhanced algorithm reduces energy consumption and improves the offloading efficiency and accuracy of federated SSL.
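The local/RSU split with an offloading threshold can be illustrated with a toy energy model. The rule below (offload only when the per-iteration saving exceeds the threshold, then split work in proportion to cost) is purely an assumption for illustration; the paper jointly optimizes transmission power and CPU frequency as well.

```python
def split_iterations(total_iters, e_local, e_tx, threshold):
    """Decide how many federated-SSL iterations stay local vs. offload to an RSU.

    e_local: energy per local iteration; e_tx: energy to ship one iteration's
    workload to the RSU. Offloading happens only when it saves at least
    `threshold` energy per iteration (illustrative rule, not the paper's
    exact criterion).
    """
    if e_local - e_tx < threshold:
        return total_iters, 0            # offloading not worthwhile
    ratio = e_tx / (e_local + e_tx)      # assign more work to the cheaper side
    local = round(total_iters * ratio)
    return local, total_iters - local
```

Here a cheap uplink (small `e_tx`) pushes most iterations to the RSU, while a marginal saving below the threshold keeps everything local.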
Two-dimensional endoscopic images are susceptible to interference such as specular reflections and monotonous texture illumination, hindering accurate three-dimensional lesion reconstruction by surgical robots. This study proposes a novel end-to-end disparity estimation model to address these challenges. Our approach combines a Pseudo-Siamese neural network architecture with pyramid dilated convolutions, integrating multi-scale image information to enhance robustness against lighting interference. The study also introduces a Pseudo-Siamese disparity regression model that simplifies left-right image comparison, improving accuracy and efficiency. The model was evaluated on a dataset of stereo endoscopic videos captured by the Da Vinci surgical robot, comprising simulated silicone heart sequences and real heart video data. Experimental results demonstrate a significant improvement in the network's resistance to lighting interference without a substantial increase in parameter count. Moreover, the model exhibited faster convergence during training, contributing to overall performance enhancement. This study advances endoscopic image processing accuracy and has potential implications for surgical robot applications in complex environments.
Few-shot intent detection is a practical yet challenging task, because new intents emerge frequently and collecting large-scale data for them is costly. Meta-learning, a promising technique for leveraging data from previous tasks to enable efficient learning of new tasks, has been a popular way to tackle this problem. However, existing meta-learning models have been shown to overfit when meta-training tasks are insufficient. To overcome this challenge, we present a novel self-supervised task augmentation with meta-learning framework, namely STAM. First, we introduce task augmentation, which explores two different strategies and combines them to extend the meta-training tasks. Second, we devise two auxiliary losses that integrate self-supervised learning into meta-learning to learn more generalizable and transferable features. Experimental results show that STAM achieves consistent and considerable performance improvements over existing state-of-the-art methods on four datasets.
Recent years have witnessed significant progress in deep learning for remote sensing image Super-Resolution (SR). However, in real-world applications, paired data is often unavailable, making supervised training infeasible, while unknown degradation factors constrain reconstruction performance and impair detail recovery. To this end, we propose a Degradation-Adaptive Self-supervised SR method, named DASSR, which recovers high-fidelity details from low-resolution remote sensing images without requiring supervision from high-resolution ground truth. DASSR employs a dual-path closed-loop architecture, enabling joint learning of SR reconstruction and blur kernel estimation through cycle consistency in the main branch and regularization in the auxiliary branch. Specifically, we incorporate an Edge-Preserving SR network (EPSRN) into DASSR, whose core Hybrid Attention Enhancement Block (HAEB) captures precise structural representations to guide accurate detail reconstruction. Furthermore, a composite loss function is designed, integrating spatial reconstruction consistency, frequency-domain spectrum alignment, and kernel sparsity constraints to ensure stable and efficient self-supervised learning. Experiments on both simulated and real-world remote sensing datasets demonstrate that the proposed DASSR method outperforms competitive deep learning-based SR methods, notably achieving approximately 9% and 15% improvements in the Average Gradient (AG) and Spatial Frequency (SF) metrics, respectively, over the best-performing competitor.
Computed Tomography (CT) reconstruction is essential in medical imaging and other engineering fields. However, blurring of the projection during CT imaging can lead to artifacts in the reconstructed images. Projection blur arises from a combination of factors such as large ray sources, scattering, and imaging system vibration. To address this problem, we propose DeblurTomo, a novel self-supervised learning-based deblurring and reconstruction algorithm that efficiently reconstructs sharp CT images from blurry input without needing external data or blur measurements. Specifically, we construct a coordinate-based implicit neural representation reconstruction network, which maps coordinates to attenuation coefficients in the reconstructed space for more convenient ray representation. We then model the blur as a weighted sum of offset rays and design the Ray Correction Network (RCN) and Weight Proposal Network (WPN) to fit these rays and their weights via multi-view consistency and geometric information, thereby extending 2D deblurring to 3D space. In the training phase, we use the blurry input as the supervision signal to optimize the reconstruction network, the RCN, and the WPN simultaneously. Extensive experiments on a widely used synthetic dataset show that DeblurTomo performs superiorly in limited-angle and sparse-view settings under simulated blur. Further experiments on real datasets demonstrate the superiority of our method in practical scenarios.
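The core blur model, a measurement expressed as a weighted sum of offset rays, can be written directly. In DeblurTomo the offsets and weights come from the learned RCN and WPN; here they are fixed arrays purely to show the forward model.

```python
import numpy as np

def blurred_ray_value(render_fn, origin, direction, offsets, weights):
    """Model a blurry measurement as a weighted sum of offset rays.

    render_fn(origin, direction) returns the sharp value of a single ray;
    the blur weights are normalized into a convex combination. Offsets and
    weights stand in for the learned RCN/WPN outputs (illustrative only).
    """
    vals = np.array([render_fn(np.asarray(origin) + np.asarray(o), direction)
                     for o in offsets])
    w = np.asarray(weights, dtype=float)
    w /= w.sum()                        # blur weights form a convex combination
    return float(vals @ w)

# Toy renderer: a ray's value is just the x-coordinate of its origin.
render = lambda o, d: o[0]
val = blurred_ray_value(render, [0.0, 0.0, 0.0], [0.0, 0.0, 1.0],
                        offsets=[[0, 0, 0], [1, 0, 0]], weights=[1, 1])
```

Training then compares such blurred renders against the observed blurry projections, so the blur itself is the supervision signal.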
Blended acquisition offers efficiency improvements over conventional seismic data acquisition, at the cost of introducing blending noise effects. In addition, seismic data often suffers from irregularly missing shots caused by artificial or natural effects during blended acquisition. Therefore, blending noise attenuation and missing-shot reconstruction are essential for providing high-quality seismic data for further seismic processing and interpretation. The iterative shrinkage thresholding algorithm can help obtain deblended data based on sparsity assumptions about the complete unblended data, but it characterizes seismic data linearly. Supervised learning algorithms can effectively capture the nonlinear relationship between incomplete pseudo-deblended data and complete unblended data; however, their dependence on complete unblended labels limits their practicality in field applications. Consequently, a self-supervised algorithm is presented for simultaneous deblending and interpolation of incomplete blended data, which minimizes the difference between simulated and observed incomplete pseudo-deblended data. The blind-trace U-Net (BTU-Net) used here prevents identity mapping during complete unblended data estimation. Furthermore, a multistep process with blending noise simulation-subtraction and missing-trace reconstruction-insertion is used at each step to improve deblending and interpolation performance. Experiments with synthetic and field incomplete blended data demonstrate the effectiveness of the multistep self-supervised BTU-Net algorithm.
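The iterative shrinkage thresholding algorithm mentioned above alternates a gradient step with soft-thresholding to promote sparsity. A generic dense-matrix sketch of the iteration (a textbook ISTA, not the seismic blending operator) is:

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft-thresholding (shrinkage) operator: the proximal map of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, y, tau, step, n_iters=200):
    """Minimal ISTA sketch for min_x ||Ax - y||^2 + tau * ||x||_1.

    A is a generic linear operator as a matrix; in deblending, A would be
    the blending/sampling operator and x the sparse representation.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)                 # gradient of the data term
        x = soft_threshold(x - step * grad, step * tau)
    return x
```

With `A` as the identity, the iteration converges to the elementwise shrinkage of `y`, which makes the sparsifying effect easy to see.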
Feature fusion is an important technique in medical image classification that can improve diagnostic accuracy by integrating complementary information from multiple sources. Recently, Deep Learning (DL) has been widely used in pulmonary disease diagnosis, such as for pneumonia and tuberculosis. However, traditional feature fusion methods often suffer from feature disparity, information loss, redundancy, and increased complexity, hindering the further extension of DL algorithms. To address these limitations, we propose a Graph-Convolution Fusion Network with Self-Supervised Feature Alignment (Self-FAGCFN) for deep learning-based medical image classification of respiratory diseases such as pneumonia and tuberculosis. The network integrates Convolutional Neural Networks (CNNs) for robust feature extraction from two-dimensional grid structures and Graph Convolutional Networks (GCNs) within a Graph Neural Network branch to capture graph-structured features, focusing on significant node representations. Additionally, an Attention-Embedding Ensemble Block is included to capture critical features from the GCN outputs. To ensure effective feature alignment between the pre- and post-fusion stages, we introduce a feature alignment loss that minimizes their disparity. Moreover, to address the remaining limitations of this design, namely inappropriate centroid discrepancies during feature alignment and class imbalance in the dataset, we develop a Feature-Centroid Fusion (FCF) strategy and a Multi-Level Feature-Centroid Update (MLFCU) algorithm, respectively. Extensive experiments on the public LungVision and Chest-Xray datasets demonstrate that the Self-FAGCFN model significantly outperforms existing methods in diagnosing pneumonia and tuberculosis, highlighting its potential for practical medical applications.
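The feature alignment loss described above can be illustrated as a centroid-matching objective: the mean feature before fusion should stay close to the mean feature after fusion. The exact form below (MSE between batch centroids) is an assumption based on the abstract's description, not the paper's formula.

```python
import numpy as np

def feature_alignment_loss(pre_fusion, post_fusion):
    """Alignment loss sketch: mean squared distance between the centroid of the
    pre-fusion features and the centroid of the post-fusion features.
    Each input is an (n_samples, n_features) array."""
    pre_centroid = pre_fusion.mean(axis=0)
    post_centroid = post_fusion.mean(axis=0)
    return float(np.mean((pre_centroid - post_centroid) ** 2))
```

Minimizing this term pulls the fused representation back toward the statistics of the unfused one, which is the disparity-reduction effect the abstract refers to.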
Self-supervised monocular depth estimation has emerged as a major research focus in recent years, primarily due to its elimination of ground-truth depth dependence. However, the prevailing architectures in this domain suffer from inherent limitations: existing pose network branches infer camera ego-motion exclusively under static-scene and Lambertian-surface assumptions. These assumptions are often violated in real-world scenarios due to dynamic objects, non-Lambertian reflectance, and unstructured background elements, leading to pervasive artifacts such as depth discontinuities ("holes"), structural collapse, and ambiguous reconstruction. To address these challenges, we propose a novel framework that integrates scene dynamic pose estimation into the conventional self-supervised depth network, enhancing its ability to model complex scene dynamics. Our contributions are threefold: (1) a pixel-wise dynamic pose estimation module that jointly resolves the pose transformations of moving objects and localized scene perturbations; (2) a physically informed loss function that couples dynamic pose and depth predictions, designed to mitigate depth errors arising from high-speed distant objects and geometrically inconsistent motion profiles; (3) an efficient SE(3) transformation parameterization that streamlines network complexity and temporal pre-processing. Extensive experiments on the KITTI and NYU-V2 benchmarks show that our framework achieves state-of-the-art performance in both quantitative metrics and qualitative visual fidelity, significantly improving the robustness and generalization of monocular depth estimation under dynamic conditions.
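Contribution (3) relies on the standard SE(3) exponential map, which turns a 6-vector twist into a rigid transform. A plain NumPy version of this parameterization (the textbook Rodrigues formula, not the paper's code) is:

```python
import numpy as np

def se3_exp(xi):
    """Map a twist xi = (omega, v) in R^6 to a 4x4 SE(3) transform.

    omega is the rotation axis scaled by angle; v is the translational part.
    Uses Rodrigues' formula for R and the closed-form left Jacobian V.
    """
    omega, v = xi[:3], xi[3:]
    theta = np.linalg.norm(omega)
    K = np.array([[0.0, -omega[2], omega[1]],      # skew-symmetric matrix of omega
                  [omega[2], 0.0, -omega[0]],
                  [-omega[1], omega[0], 0.0]])
    if theta < 1e-8:                               # small-angle limit
        R, V = np.eye(3), np.eye(3)
    else:
        R = (np.eye(3) + np.sin(theta) / theta * K
             + (1 - np.cos(theta)) / theta**2 * K @ K)
        V = (np.eye(3) + (1 - np.cos(theta)) / theta**2 * K
             + (theta - np.sin(theta)) / theta**3 * K @ K)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, V @ v
    return T
```

Predicting the 6 twist components instead of a full 3x4 matrix keeps the network output minimal while guaranteeing a valid rigid transform.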
Few-shot learning has emerged as a crucial technique for coral species classification, addressing the challenge of limited labeled data in underwater environments. This study introduces an optimized few-shot learning model that enhances classification accuracy while minimizing reliance on extensive data collection. The proposed model integrates a hybrid similarity measure combining Euclidean distance and cosine similarity, effectively capturing both feature magnitude and directional relationships. This approach achieves a notable accuracy of 71.8% under a 5-way 5-shot evaluation, outperforming state-of-the-art models such as Prototypical Networks, FEAT, and ESPT by up to 10%. Notably, the model demonstrates high precision in classifying Siderastreidae (87.52%) and Fungiidae (88.95%), underscoring its effectiveness in distinguishing subtle morphological differences. To further enhance performance, we incorporate a self-supervised learning mechanism based on contrastive learning, enabling the model to extract robust representations by leveraging local structural patterns in corals. This enhancement significantly improves classification accuracy, particularly for species with high intra-class variation, leading to an overall accuracy of 76.52% under a 5-way 10-shot evaluation. Additionally, the model exploits the repetitive structures inherent in corals, introducing a local feature aggregation strategy that refines classification through spatial information integration. Beyond its technical contributions, this study presents a scalable and efficient approach for automated coral reef monitoring, reducing annotation costs while maintaining high classification accuracy. By improving few-shot learning performance in underwater environments, our model enhances monitoring accuracy by up to 15% compared to traditional methods, offering a practical solution for large-scale coral conservation efforts.
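The hybrid similarity measure, a mix of negative Euclidean distance and cosine similarity, is straightforward to state. The mixing weight `alpha` below is an assumed hyperparameter; the abstract does not give the exact combination rule.

```python
import numpy as np

def hybrid_similarity(query, prototype, alpha=0.5):
    """Hybrid metric: mix (negative) Euclidean distance, which captures feature
    magnitude, with cosine similarity, which captures direction. Larger values
    mean more similar. alpha is an assumed mixing weight."""
    eucl = -np.linalg.norm(query - prototype)
    cos = float(query @ prototype /
                (np.linalg.norm(query) * np.linalg.norm(prototype)))
    return alpha * eucl + (1.0 - alpha) * cos
```

In a prototypical-network setting, a query embedding would be scored against each class prototype with this function and assigned to the highest-scoring class.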
In the era of the Internet of Things, distributed computing alleviates the problem of insufficient terminal computing power by integrating the idle resources of heterogeneous devices. However, the imbalance between task execution delay and node energy consumption, together with the scheduling and adaptation challenges brought about by device heterogeneity, urgently needs to be addressed. To tackle this problem, this paper constructs a multi-objective real-time task scheduling model that considers task real-time performance, execution delay, system energy consumption, and node interests. The model aims to minimize the delay upper bound and total energy consumption while maximizing system satisfaction. A real-time task scheduling algorithm based on a bilateral matching game is proposed. By designing a bidirectional preference mechanism between tasks and computing nodes, combined with a multi-round stable matching strategy, accurate matching between tasks and nodes is achieved. Simulation results show that, compared with the baseline scheme, the proposed algorithm significantly reduces the total execution cost, effectively balances task execution delay against the energy consumption of compute nodes, and takes into account the interests of each network compute node.
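The bilateral matching game resembles deferred acceptance: tasks propose to nodes in preference order, and nodes tentatively keep their best proposer. A one-to-one Gale-Shapley sketch of this idea follows; the paper's multi-round, capacity-aware variant is richer, so treat this as the structural core only.

```python
def stable_match(task_prefs, node_prefs):
    """Deferred-acceptance matching between tasks and computing nodes.

    task_prefs[t] lists nodes in task t's preference order; node_prefs[n]
    ranks tasks from best to worst. Returns a stable task -> node assignment.
    """
    # Precompute each node's rank of each task for O(1) comparisons.
    rank = {n: {t: i for i, t in enumerate(p)} for n, p in node_prefs.items()}
    match = {}                         # node -> currently held task
    free = list(task_prefs)            # tasks still proposing
    nxt = {t: 0 for t in task_prefs}   # index of each task's next proposal
    while free:
        t = free.pop(0)
        n = task_prefs[t][nxt[t]]      # t proposes to its next-best node
        nxt[t] += 1
        cur = match.get(n)
        if cur is None:
            match[n] = t               # node was free: tentatively accept
        elif rank[n][t] < rank[n][cur]:
            match[n] = t               # node prefers the new proposer
            free.append(cur)           # previous task resumes proposing
        else:
            free.append(t)             # proposal rejected
    return {t: n for n, t in match.items()}
```

The result is stable: no task-node pair would both rather be matched to each other than to their assigned partners.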
This study compares the relative efficacy of the continuation task and the model-as-feedback writing (MAFW) task in EFL writing development. Ninety intermediate-level Chinese EFL learners were randomly assigned to a continuation group, a MAFW group, and a control group, each with 30 learners. A pretest and a posttest were used to gauge L2 writing development. Results showed that the continuation task outperformed the MAFW task not only in enhancing the overall quality of L2 writing, but also in promoting the quality of three components of L2 writing, namely content, organization, and language. The finding has important implications for L2 writing teaching and learning.
With the widespread deployment of assembly robots in smart manufacturing, efficiently offloading tasks and allocating resources in highly dynamic industrial environments has become a critical challenge for Mobile Edge Computing (MEC). To address this challenge, this paper constructs a cloud-edge-end collaborative MEC system that enables assembly robots to offload complex workflow tasks via multiple paths (horizontal, vertical, and hybrid collaboration). To mitigate uncertainties arising from mobility, a location prediction module is employed. This enables proactive channel-quality estimation, providing forward-looking insights for offloading decisions. Furthermore, we propose a fairness-aware joint optimization framework. Using an improved Multi-Agent Deep Reinforcement Learning (MADRL) algorithm whose reward function incorporates total system cost, positional reliability, and timeout penalties, the framework aims to balance resource distribution among assembly robots while maximizing system utility. Simulation results demonstrate that the proposed framework outperforms traditional offloading strategies. By integrating predictive mobility management with fairness-aware optimization, the framework offers a robust solution for dynamic industrial MEC environments.
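The reward described for the MADRL agents, combining total system cost, positional reliability, and timeout penalties, can be written as a simple weighted sum. The weights and the hard timeout penalty below are illustrative assumptions, not the paper's tuned values.

```python
def madrl_reward(total_cost, positional_reliability, deadline, latency,
                 w_cost=1.0, w_rel=1.0, w_to=5.0):
    """Composite reward sketch matching the abstract's description: penalize
    system cost and deadline misses, reward positional reliability.
    All weights (w_cost, w_rel, w_to) are hypothetical."""
    timeout_penalty = w_to if latency > deadline else 0.0
    return -w_cost * total_cost + w_rel * positional_reliability - timeout_penalty
```

A reward shaped this way pushes each agent toward cheap, reliable offloading paths while making deadline violations strictly worse than any on-time outcome of similar cost.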
Task scheduling in cloud computing is a multi-objective optimization problem, often involving conflicting objectives such as minimizing execution time, reducing operational cost, and maximizing resource utilization. However, traditional approaches frequently rely on single-objective optimization methods, which are insufficient for capturing the complexity of such problems. To address this limitation, we introduce MDMOSA (Multi-objective Dwarf Mongoose Optimization with Simulated Annealing), a hybrid algorithm for efficient multi-objective task scheduling in Infrastructure-as-a-Service (IaaS) cloud environments. MDMOSA harmonizes the exploration capabilities of the biologically inspired Dwarf Mongoose Optimization (DMO) with the exploitation strengths of Simulated Annealing (SA), achieving a balanced search process. The algorithm optimizes task allocation by reducing makespan and financial cost while improving system resource utilization. We evaluate MDMOSA through extensive simulations using the real-world Google Cloud Jobs (GoCJ) dataset within the CloudSim environment. Comparative analysis against benchmark algorithms such as SMOACO, MOTSGWO, and MFPAGWO reveals that MDMOSA consistently achieves superior performance in terms of scheduling efficiency, cost-effectiveness, and scalability. These results confirm the potential of MDMOSA as a robust and adaptable solution for resource scheduling in dynamic and heterogeneous cloud computing infrastructures.
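The SA component contributes a Metropolis acceptance rule: candidate schedules from the DMO phase are kept if they improve the objective, and occasionally kept even when worse, with a probability that decays as the temperature drops. A minimal sketch:

```python
import math
import random

def sa_accept(delta_cost, temperature, rng):
    """Metropolis acceptance rule behind SA's exploitation phase.

    delta_cost is candidate cost minus current cost (lower cost is better).
    Improving moves are always accepted; worsening moves are accepted with
    probability exp(-delta/T), which vanishes as the temperature cools.
    """
    if delta_cost <= 0:
        return True                     # never reject an improvement
    return rng.random() < math.exp(-delta_cost / temperature)
```

Early in the run (high temperature) this lets the search escape local optima; as the temperature schedule cools, the rule degenerates to greedy hill-climbing.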
The iterative continuation task (ICT) requires English as a foreign language (EFL) learners to read a segment of an English novel and write a continuation that aligns with the preceding segment over successive turns, offering exposure to diverse grammatical structures and opportunities for contextualized usage. Given the importance of integrating technology into second language (L2) writing and the critical role grammar plays in L2 writing development, the automated written corrective feedback provided by Grammarly has gained significant attention. This study investigates the impact of Grammarly on grammar learning strategies, grammar grit, and grammar competence among EFL college students engaged in ICT. The study employed a mixed-methods sequential exploratory design; 56 participants were divided into an experimental group (n=28), receiving Grammarly feedback for ICT, and a control group (n=28), completing ICT without Grammarly feedback. Quantitative results revealed that both groups improved in L2 grammar learning strategies, grit, and competence. For the experimental group, significant differences were observed across all variables of L2 grammar learning strategies, grit, and competence between the pre- and post-tests. For the control group, significant differences were observed only in the affective dimension of grammar learning strategies, the Consistency of Interest (COI) facet of grammar grit, and grammar competence. However, the control group showed a significantly higher improvement in grammar competence. Qualitative analysis revealed both positive and negative perceptions of Grammarly. The pedagogical implications of integrating Grammarly and ICT for L2 grammar development are discussed.
In scenarios where ground-based cloud computing infrastructure is unavailable, unmanned aerial vehicles (UAVs) act as mobile edge computing (MEC) servers to provide on-demand computation services for ground terminals. To address the challenge of jointly optimizing task scheduling and UAV trajectory under the limited resources and high mobility of UAVs, this paper presents PER-MATD3, a multi-agent deep reinforcement learning algorithm that incorporates prioritized experience replay (PER) into the Centralized Training with Decentralized Execution (CTDE) framework. Specifically, PER-MATD3 enables each agent to learn a decentralized policy using only local observations during execution, while leveraging a shared replay buffer with prioritized sampling and a centralized critic during training to accelerate convergence and improve sample efficiency. Simulation results show that PER-MATD3 reduces average task latency by up to 23%, improves energy efficiency by 21%, and enhances service coverage compared to state-of-the-art baselines, demonstrating its effectiveness and practicality in scenarios without terrestrial networks.
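Proportional prioritized experience replay, the PER in PER-MATD3, samples transitions with probability proportional to their priority raised to an exponent alpha. A minimal buffer sketch follows; the interface, the alpha value, and the omission of importance-sampling weights are all simplifying assumptions.

```python
import numpy as np

class PrioritizedReplay:
    """Minimal proportional prioritized experience replay.

    Transitions are sampled with probability p_i^alpha / sum_j p_j^alpha,
    where p_i is the stored priority (typically the TD error magnitude).
    A production buffer would use a sum-tree and importance weights.
    """
    def __init__(self, alpha=0.6, seed=0):
        self.alpha = alpha
        self.items, self.prios = [], []
        self.rng = np.random.default_rng(seed)

    def add(self, transition, priority):
        self.items.append(transition)
        self.prios.append(priority)

    def sample(self, k):
        p = np.asarray(self.prios, dtype=float) ** self.alpha
        p /= p.sum()                                 # normalize to probabilities
        idx = self.rng.choice(len(self.items), size=k, p=p)
        return [self.items[i] for i in idx]
```

High-TD-error transitions are therefore replayed far more often, which is what accelerates convergence relative to uniform sampling.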
Advanced technologies such as Cyber-Physical Systems (CPS) and the Internet of Things (IoT) have supported the modernization and automation of the transportation sector through the introduction of Intelligent Transportation Systems (ITS). Integrating CPS-ITS and IoT provides real-time Vehicle-to-Infrastructure (V2I) communication, supporting better traffic management, safety, and efficiency. These technological innovations also generate complex problems, particularly concerning data routing and Task Scheduling (TS) in ITS. Previous attempts to solve these problems relied mainly on traditional and experimental methods, with limited success due to the dynamic nature of ITS. Machine Learning (ML) and Swarm Intelligence (SI) have since shown significant promise in dealing with these challenges; along this line, this paper presents a novel method for TS and data routing in CPS-ITS. For data transmission, we propose a cutting-edge ML algorithm, Gated Linear Unit-approximated Reinforcement Learning (GLRL). For TS, Greedy Iterative Particle Swarm Optimization (GI-PSO) is recommended as an extension of Particle Swarm Optimization (PSO). The primary objective of this study is to enhance the security and effectiveness of ITS systems that utilize CPS-ITS. The models were trained and validated using a network simulation dataset of 50 nodes drawn from numerous ITS environments. The experiments demonstrate that the proposed GLRL reduces End-to-End Delay (EED) by 12%, raises data size utilization from 83.6% to 88.6%, and achieves higher bandwidth allocation, particularly in high-demand scenarios such as multimedia data streams, where adherence improved to 98.15%. Furthermore, GLRL reduced Network Congestion (NC) by 5.5%, demonstrating its efficiency in managing complex traffic conditions across several environments. The model passed simulation tests in three different environments: urban (UE), suburban (SE), and rural (RE). It met high bandwidth requirements, made task scheduling more efficient, and increased network throughput (NT), showing that it is robust and flexible enough for scalable ITS applications. These innovations provide robust, scalable solutions for real-time traffic management, ultimately improving safety, reducing NC, and increasing overall NT. This study can advance ITS by making it more responsive, safe, and effective across UE, SE, and RE deployments.
Funding: supported in part by the National Natural Science Foundation of China under Grants 62071405 and 12175189.
Abstract: The accurate segmentation of deep gray matter nuclei is critical for neuropathological research, disease diagnosis, and treatment. Existing methods employ a supervised learning approach, which requires large labeled datasets; obtaining such datasets for medical image analysis is challenging and time-consuming. In addition, methods based on convolutional neural networks (CNNs) achieve only suboptimal performance due to the locality of convolutional operations. Vision Transformers (ViTs) efficiently model long-range dependencies and thus have the potential to outperform these methods in segmentation tasks. To address these issues, we propose a novel hybrid network based on self-supervised pre-training for deep gray matter nuclei segmentation. Specifically, we present a CNN-Transformer hybrid network (CTNet), whose encoder combines a 3D CNN and a ViT to learn local spatially detailed features and global semantic information. A self-supervised learning (SSL) approach that integrates rotation prediction and masked feature reconstruction is proposed to pre-train CTNet, enabling the model to learn valuable visual representations from unlabeled data. We evaluate the effectiveness of our method on 3T and 7T human brain MRI datasets. The results demonstrate that CTNet outperforms the comparison models and that our pre-training strategy outperforms other advanced self-supervised methods. When the training set contains only one sample, our pre-trained CTNet improves segmentation performance, showing an 8.4% improvement in Dice similarity coefficient (DSC) over the randomly initialized CTNet.
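The Dice similarity coefficient (DSC) reported above has a standard definition that is easy to sketch; this is a generic implementation on binary masks, not the paper's code:

```python
def dice_coefficient(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks (flat lists of 0/1).
    DSC = 2 * |pred AND target| / (|pred| + |target|); eps guards empty masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)

# Example: 3 of 4 foreground voxels overlap in each mask.
pred   = [1, 1, 1, 0, 0, 1]
target = [1, 1, 1, 1, 0, 0]
score = dice_coefficient(pred, target)  # 2*3 / (4+4) = 0.75
```

An 8.4% DSC improvement, as quoted in the abstract, would be measured on scores of exactly this form, averaged over the test volumes.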
Funding: supported by the National Natural Science Foundation of China (32471964).
Abstract: The collection and annotation of large-scale bird datasets are resource-intensive and time-consuming processes that significantly limit the scalability and accuracy of biodiversity monitoring systems. While self-supervised learning (SSL) has emerged as a promising approach for leveraging unannotated data, current SSL methods face two critical challenges in bird species recognition: (1) long-tailed data distributions that result in poor performance on underrepresented species; and (2) domain shift issues caused by data augmentation strategies designed to mitigate class imbalance. Here we present SDNet, a novel SSL-based bird recognition framework that integrates diffusion models with large language models (LLMs) to overcome these limitations. SDNet employs LLMs to generate semantically rich textual descriptions for tail-class species by prompting the models with species taxonomy, morphological attributes, and habitat information, producing detailed natural language priors that capture fine-grained visual characteristics (e.g., plumage patterns, body proportions, and distinctive markings). These textual descriptions are subsequently used by a conditional diffusion model to synthesize new bird image samples through cross-attention mechanisms that fuse textual embeddings with intermediate visual feature representations during the denoising process, ensuring generated images preserve species-specific morphological details while maintaining photorealistic quality. Additionally, we incorporate a Swin Transformer as the feature extraction backbone, whose hierarchical window-based attention mechanism and shifted windowing scheme enable multi-scale local feature extraction that proves particularly effective at capturing fine-grained discriminative patterns (such as beak shape and feather texture) while mitigating domain shift between synthetic and original images through consistent feature representations across both data sources. SDNet is validated on both a self-constructed dataset (Bird_BXS) and a publicly available benchmark (Birds_25), demonstrating substantial improvements over conventional SSL approaches. Our results indicate that the synergistic integration of LLMs, diffusion models, and the Swin Transformer architecture contributes significantly to recognition accuracy, particularly for rare and morphologically similar species. These findings highlight the potential of SDNet for addressing fundamental limitations of existing SSL methods in avian recognition tasks and establishing a new paradigm for efficient self-supervised learning in large-scale ornithological vision applications.
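The cross-attention fusion described above (text-token embeddings attended to by visual features during denoising) follows the standard scaled dot-product pattern. A minimal pure-Python sketch with toy vectors, not SDNet's implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each (visual) query attends over all
    (text-token) keys and returns a weighted mix of the values."""
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused

# One visual feature attending over two text-token embeddings (toy values).
fused = cross_attention(queries=[[1.0, 0.0]],
                        keys=[[1.0, 0.0], [0.0, 1.0]],
                        values=[[2.0], [4.0]])
```

The query matches the first key more strongly, so the fused output is pulled toward that token's value.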
Abstract: The authors regret that there were errors in the affiliations and the funding declaration in the original published version. Affiliations a and b of the original manuscript are "School of Information Engineering, Jiangxi Provincial Key Laboratory of Advanced Signal Processing and Intelligent Communications, Nanchang University, Nanchang 330031, China" and "School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China", respectively. The order of the two affiliations is incorrect.
Abstract: Intelligent Transportation Systems (ITS) leverage Integrated Sensing and Communications (ISAC) to enhance data exchange between vehicles and infrastructure in the Internet of Vehicles (IoV). This integration inevitably increases computing demands, risking real-time system stability. Vehicle Edge Computing (VEC) addresses this by offloading tasks to Road Side Units (RSUs), ensuring timely services. Our previous work, the FLSimCo algorithm, which uses local resources for federated Self-Supervised Learning (SSL), has a limitation: vehicles often cannot complete all of their iteration tasks. Our improved algorithm offloads partial tasks to RSUs and optimizes energy consumption by adjusting transmission power, CPU frequency, and task assignment ratios, balancing local and RSU-based training. Meanwhile, setting an offloading threshold further prevents inefficiencies. Simulation results show that the enhanced algorithm reduces energy consumption and improves the offloading efficiency and accuracy of federated SSL.
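The trade-off described above, local computation versus offloading to an RSU gated by a threshold, can be illustrated with a common MEC energy model. The formulas and constants below are textbook modeling conventions assumed for illustration, not the paper's actual model:

```python
def local_energy(cycles, freq, kappa=1e-27):
    """Dynamic CPU energy: kappa * f^2 per cycle (a common MEC modeling
    assumption; kappa is an effective-capacitance constant)."""
    return kappa * cycles * freq ** 2

def offload_energy(bits, tx_power, rate):
    """Transmission energy: power * airtime for `bits` sent at `rate` bit/s."""
    return tx_power * bits / rate

def should_offload(cycles, freq, bits, tx_power, rate, threshold=1.0):
    """Offload when local energy exceeds `threshold` times the transmit energy."""
    return local_energy(cycles, freq) > threshold * offload_energy(bits, tx_power, rate)

# A compute-heavy task (1e9 cycles at 2 GHz) vs. a cheap 10 Mbit/s uplink.
decision = should_offload(cycles=1e9, freq=2e9, bits=1e6, tx_power=0.1, rate=1e7)
```

Under these illustrative numbers, local execution costs about 4 J while transmission costs about 0.01 J, so offloading wins.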
Funding: supported by the Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004) and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. RS-2022-00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)).
Abstract: Two-dimensional endoscopic images are susceptible to interferences such as specular reflections and monotonous texture illumination, hindering accurate three-dimensional lesion reconstruction by surgical robots. This study proposes a novel end-to-end disparity estimation model to address these challenges. Our approach combines a Pseudo-Siamese neural network architecture with pyramid dilated convolutions, integrating multi-scale image information to enhance robustness against lighting interferences. This study introduces a Pseudo-Siamese structure-based disparity regression model that simplifies left-right image comparison, improving accuracy and efficiency. The model was evaluated using a dataset of stereo endoscopic videos captured by the Da Vinci surgical robot, comprising simulated silicone heart sequences and real heart video data. Experimental results demonstrate a significant improvement in the network's resistance to lighting interference without substantially increasing parameters. Moreover, the model exhibited faster convergence during training, contributing to overall performance enhancement. This study advances endoscopic image processing accuracy and has potential implications for surgical robot applications in complex environments.
Funding: the National Natural Science Foundation of China under Grant Nos. 61936012 and 61976114.
Abstract: Few-shot intent detection is a practical yet challenging task, because new intents frequently emerge and collecting large-scale data for them can be costly. Meta-learning, a promising technique for leveraging data from previous tasks to enable efficient learning of new tasks, has been a popular way to tackle this problem. However, existing meta-learning models have been shown to overfit when the meta-training tasks are insufficient. To overcome this challenge, we present a novel self-supervised task augmentation with meta-learning framework, namely STAM. Firstly, we introduce task augmentation, which explores two different strategies and combines them to extend the meta-training tasks. Secondly, we devise two auxiliary losses for integrating self-supervised learning into meta-learning to learn more generalizable and transferable features. Experimental results show that STAM achieves consistent and considerable performance improvements over existing state-of-the-art methods on four datasets.
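Meta-training of the kind described above is organized into episodes. A generic N-way K-shot episode sampler makes the setting concrete; the names and pool below are illustrative, and this is not the STAM task-augmentation strategy itself:

```python
import random

def sample_episode(data_by_class, n_way, k_shot, q_query, rng):
    """Build one N-way K-shot episode (support set + query set) from a
    labeled pool. `data_by_class` maps class name -> list of examples."""
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label in classes:
        items = rng.sample(data_by_class[label], k_shot + q_query)
        support += [(x, label) for x in items[:k_shot]]
        query += [(x, label) for x in items[k_shot:]]
    return support, query

# Illustrative pool: 6 intent classes with 10 utterances each.
pool = {f"intent_{i}": [f"utt_{i}_{j}" for j in range(10)] for i in range(6)}
rng = random.Random(0)
support, query = sample_episode(pool, n_way=3, k_shot=2, q_query=4, rng=rng)
```

Task augmentation in this framing means generating more such episodes than the raw class inventory would naturally yield.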
Funding: National Natural Science Foundation of China (Nos. 42501465, 42471504).
Abstract: Recent years have witnessed significant progress in deep learning for remote sensing image Super-Resolution (SR). However, in real-world applications, paired data is often unavailable, making supervised training infeasible, while unknown degradation factors constrain reconstruction performance and impair detail recovery. To this end, we propose a Degradation-Adaptive Self-supervised SR method, named DASSR, which recovers high-fidelity details from low-resolution remote sensing images without requiring supervision from high-resolution ground truth. DASSR employs a dual-path closed-loop architecture, enabling joint learning of SR reconstruction and blur kernel estimation through cycle consistency in the main branch and regularization in the auxiliary branch. Specifically, we incorporate an Edge-Preserving SR Network (EPSRN) into DASSR, whose core Hybrid Attention Enhancement Block (HAEB) captures precise structural representations to guide accurate detail reconstruction. Furthermore, a composite loss function is designed, integrating spatial reconstruction consistency, frequency-domain spectrum alignment, and kernel sparsity constraints to ensure stable and efficient self-supervised learning. Experiments on both simulated and real-world remote sensing datasets demonstrate that the proposed DASSR method outperforms competitive deep learning-based SR methods, notably achieving approximately 9% and 15% improvements in the Average Gradient (AG) and Spatial Frequency (SF) metrics, respectively, over the best-performing competitor.
Funding: supported in part by the National Natural Science Foundation of China under Grants 62472434 and 62402171, in part by the National Key Research and Development Program of China under Grant 2022YFF1203001, in part by the Science and Technology Innovation Program of Hunan Province under Grant 2022RC3061, and in part by the Sci-Tech Innovation 2030 Agenda under Grant 2023ZD0508600.
Abstract: Computed Tomography (CT) reconstruction is essential in medical imaging and other engineering fields. However, blurring of the projection during CT imaging can lead to artifacts in the reconstructed images. Projection blur arises from a combination of factors such as larger ray sources, scattering, and imaging system vibration. To address this problem, we propose DeblurTomo, a novel self-supervised learning-based deblurring and reconstruction algorithm that efficiently reconstructs sharp CT images from blurry input without needing external data or blur measurement. Specifically, we constructed a coordinate-based implicit neural representation reconstruction network, which maps coordinates to the attenuation coefficient in the reconstructed space for more convenient ray representation. Then, we model the blur as a weighted sum of offset rays and design the Ray Correction Network (RCN) and Weight Proposal Network (WPN) to fit these rays and their weights by multi-view consistency and geometric information, thereby extending 2D deblurring to 3D space. In the training phase, we use the blurry input as the supervision signal to optimize the reconstruction network, the RCN, and the WPN simultaneously. Extensive experiments on a widely used synthetic dataset show that DeblurTomo performs well in limited-angle and sparse-view settings under simulated blur. Further experiments on real datasets demonstrate the superiority of our method in practical scenarios.
Funding: supported by the National Natural Science Foundation of China (42374134, 42304125, U20B6005), the Science and Technology Commission of Shanghai Municipality (23JC1400502), and the Fundamental Research Funds for the Central Universities.
Abstract: Blended acquisition offers efficiency improvements over conventional seismic data acquisition, at the cost of introducing blending noise effects. Besides, seismic data often suffers from irregularly missing shots caused by artificial or natural effects during blended acquisition. Therefore, blending noise attenuation and missing shots reconstruction are essential for providing high-quality seismic data for further seismic processing and interpretation. The iterative shrinkage thresholding algorithm can help obtain deblended data based on sparsity assumptions of complete unblended data, and it characterizes seismic data linearly. Supervised learning algorithms can effectively capture the nonlinear relationship between incomplete pseudo-deblended data and complete unblended data. However, the dependence on complete unblended labels limits their practicality in field applications. Consequently, a self-supervised algorithm is presented for simultaneous deblending and interpolation of incomplete blended data, which minimizes the difference between simulated and observed incomplete pseudo-deblended data. The blind-trace U-Net (BTU-Net) employed here prevents identity mapping during complete unblended data estimation. Furthermore, a multistep process with blending noise simulation-subtraction and missing traces reconstruction-insertion is used in each step to improve the deblending and interpolation performance. Experiments with synthetic and field incomplete blended data demonstrate the effectiveness of the multistep self-supervised BTU-Net algorithm.
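The iterative shrinkage thresholding algorithm mentioned above alternates a gradient step with soft-thresholding (the proximal operator of the L1 norm). A minimal sketch of that update, generic and unrelated to the BTU-Net specifics:

```python
def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def ista_step(x, grad, step, lam):
    """One iterative shrinkage-thresholding update: take a gradient step on the
    data-fit term, then apply soft-thresholding to promote sparsity."""
    return [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, grad)]

# With zero gradient, one step just shrinks each coefficient by step*lam = 0.5.
shrunk = ista_step(x=[3.0, 0.2, -1.5], grad=[0.0, 0.0, 0.0], step=1.0, lam=0.5)
```

Small coefficients (here 0.2) are zeroed out, which is exactly the sparsity assumption the deblending relies on.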
Funding: supported by the National Natural Science Foundation of China (62276092, 62303167), the Postdoctoral Fellowship Program (Grade C) of the China Postdoctoral Science Foundation (GZC20230707), the Key Science and Technology Program of Henan Province, China (242102211051, 242102211042, 212102310084), the Key Scientific Research Projects of Colleges and Universities in Henan Province, China (25A520009), the China Postdoctoral Science Foundation (2024M760808), and the Henan Province Medical Science and Technology Research Plan Joint Construction Project (LHGJ2024069).
Abstract: Feature fusion is an important technique in medical image classification that can improve diagnostic accuracy by integrating complementary information from multiple sources. Recently, Deep Learning (DL) has been widely used in pulmonary disease diagnosis, such as pneumonia and tuberculosis. However, traditional feature fusion methods often suffer from feature disparity, information loss, redundancy, and increased complexity, hindering the further extension of DL algorithms. To address these limitations, we propose a Graph-Convolution Fusion Network with Self-Supervised Feature Alignment (Self-FAGCFN) for deep learning-based medical image classification of respiratory diseases such as pneumonia and tuberculosis. The network integrates Convolutional Neural Networks (CNNs) for robust feature extraction from two-dimensional grid structures and Graph Convolutional Networks (GCNs) within a Graph Neural Network branch to capture features based on graph structure, focusing on significant node representations. Additionally, an Attention-Embedding Ensemble Block is included to capture critical features from GCN outputs. To ensure effective feature alignment between pre- and post-fusion stages, we introduce a feature alignment loss that minimizes disparities. Moreover, to address remaining limitations, namely inappropriate centroid discrepancies during feature alignment and class imbalance in the dataset, we develop a Feature-Centroid Fusion (FCF) strategy and a Multi-Level Feature-Centroid Update (MLFCU) algorithm, respectively. Extensive experiments on the public datasets LungVision and Chest-Xray demonstrate that the Self-FAGCFN model significantly outperforms existing methods in diagnosing pneumonia and tuberculosis, highlighting its potential for practical medical applications.
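A class-centroid update and an alignment loss of the general kind described above can be sketched as follows. The exponential moving-average rule and mean-squared loss form are illustrative assumptions, not the exact FCF/MLFCU formulation:

```python
def update_centroid(centroid, features, momentum=0.9):
    """EMA class-centroid update: blend the old centroid with the mean of the
    current batch features (one common way to track a class centroid)."""
    mean = [sum(col) / len(features) for col in zip(*features)]
    return [momentum * c + (1 - momentum) * m for c, m in zip(centroid, mean)]

def alignment_loss(features, centroid):
    """Mean squared distance of features to their class centroid; minimizing
    this pulls pre- and post-fusion features toward a shared center."""
    return sum(sum((f - c) ** 2 for f, c in zip(feat, centroid))
               for feat in features) / len(features)

feats = [[1.0, 2.0], [3.0, 4.0]]          # batch mean is [2.0, 3.0]
centroid = update_centroid([0.0, 0.0], feats)
loss = alignment_loss(feats, centroid)
```

A multi-level variant would maintain one such centroid per class and per feature stage, updating each with its own batch statistics.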
Funding: supported in part by the National Natural Science Foundation of China under Grant 62071345.
Abstract: Self-supervised monocular depth estimation has emerged as a major research focus in recent years, primarily due to the elimination of ground-truth depth dependence. However, the prevailing architectures in this domain suffer from inherent limitations: existing pose network branches infer camera ego-motion exclusively under static-scene and Lambertian-surface assumptions. These assumptions are often violated in real-world scenarios due to dynamic objects, non-Lambertian reflectance, and unstructured background elements, leading to pervasive artifacts such as depth discontinuities (“holes”), structural collapse, and ambiguous reconstruction. To address these challenges, we propose a novel framework that integrates scene dynamic pose estimation into the conventional self-supervised depth network, enhancing its ability to model complex scene dynamics. Our contributions are threefold: (1) a pixel-wise dynamic pose estimation module that jointly resolves the pose transformations of moving objects and localized scene perturbations; (2) a physically informed loss function that couples dynamic pose and depth predictions, designed to mitigate depth errors arising from high-speed distant objects and geometrically inconsistent motion profiles; (3) an efficient SE(3) transformation parameterization that streamlines network complexity and temporal pre-processing. Extensive experiments on the KITTI and NYU-V2 benchmarks show that our framework achieves state-of-the-art performance in both quantitative metrics and qualitative visual fidelity, significantly improving the robustness and generalization of monocular depth estimation under dynamic conditions.
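A common minimal parameterization of the rotation part of an SE(3) transform is axis-angle via Rodrigues' formula. This generic sketch illustrates the idea; the paper's exact parameterization may differ:

```python
import math

def axis_angle_to_matrix(axis, theta):
    """Rodrigues' formula: 3x3 rotation matrix from a unit axis and angle.
    Axis-angle uses only 3 parameters, one reason such parameterizations
    keep pose networks compact."""
    x, y, z = axis
    c, s, C = math.cos(theta), math.sin(theta), 1 - math.cos(theta)
    return [
        [c + x * x * C,     x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, c + y * y * C,     y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, c + z * z * C],
    ]

# A 90-degree rotation about the z-axis maps the x-axis onto the y-axis.
R = axis_angle_to_matrix((0.0, 0.0, 1.0), math.pi / 2)
v = [sum(R[i][j] * u for j, u in enumerate([1.0, 0.0, 0.0])) for i in range(3)]
```

A full SE(3) transform adds a 3-vector translation, giving 6 parameters per pixel or per object in a pixel-wise dynamic pose module.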
Funding: funded by the National Science and Technology Council (NSTC), Taiwan, under grant numbers NSTC 112-2634-F-019-001 and NSTC 113-2634-F-A49-007.
Abstract: Few-shot learning has emerged as a crucial technique for coral species classification, addressing the challenge of limited labeled data in underwater environments. This study introduces an optimized few-shot learning model that enhances classification accuracy while minimizing reliance on extensive data collection. The proposed model integrates a hybrid similarity measure combining Euclidean distance and cosine similarity, effectively capturing both feature magnitude and directional relationships. This approach achieves a notable accuracy of 71.8% under a 5-way 5-shot evaluation, outperforming state-of-the-art models such as Prototypical Networks, FEAT, and ESPT by up to 10%. Notably, the model demonstrates high precision in classifying Siderastreidae (87.52%) and Fungiidae (88.95%), underscoring its effectiveness in distinguishing subtle morphological differences. To further enhance performance, we incorporate a self-supervised learning mechanism based on contrastive learning, enabling the model to extract robust representations by leveraging local structural patterns in corals. This enhancement significantly improves classification accuracy, particularly for species with high intra-class variation, leading to an overall accuracy of 76.52% under a 5-way 10-shot evaluation. Additionally, the model exploits the repetitive structures inherent in corals, introducing a local feature aggregation strategy that refines classification through spatial information integration. Beyond its technical contributions, this study presents a scalable and efficient approach for automated coral reef monitoring, reducing annotation costs while maintaining high classification accuracy. By improving few-shot learning performance in underwater environments, our model enhances monitoring accuracy by up to 15% compared to traditional methods, offering a practical solution for large-scale coral conservation efforts.
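A hybrid similarity measure of the kind described above, Euclidean distance capturing magnitude plus cosine similarity capturing direction, can be sketched with an assumed convex weighting. The blend below is illustrative, not the paper's exact formula:

```python
import math

def hybrid_similarity(a, b, alpha=0.5):
    """Blend of negated Euclidean distance (magnitude-sensitive) and cosine
    similarity (direction-sensitive). `alpha` weights the two terms; the
    weighting scheme is an illustrative assumption."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    cos = sum(x * y for x, y in zip(a, b)) / (na * nb)
    return alpha * (-dist) + (1 - alpha) * cos

same = hybrid_similarity([1.0, 0.0], [1.0, 0.0])  # distance 0, cosine 1
far  = hybrid_similarity([1.0, 0.0], [0.0, 1.0])  # distance sqrt(2), cosine 0
```

In a prototypical-network-style classifier, a query embedding would be assigned to the class prototype maximizing this score.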
Funding: supported by the National Program on Key Basic Research Project (2020YFA0713600) and the National Natural Science Foundation of China (62272214).
Abstract: In the era of the Internet of Things, distributed computing alleviates the problem of insufficient terminal computing power by integrating the idle resources of heterogeneous devices. However, the imbalance between task execution delay and node energy consumption, and the scheduling and adaptation challenges brought about by device heterogeneity, urgently need to be addressed. To tackle this problem, this paper constructs a multi-objective real-time task scheduling model that considers task real-time performance, execution delay, system energy consumption, and node interests. The model aims to minimize the delay upper bound and total energy consumption while maximizing system satisfaction. A real-time task scheduling algorithm based on a bilateral matching game is proposed. By designing a bidirectional preference mechanism between tasks and computing nodes, combined with a multi-round stable matching strategy, accurate matching between tasks and nodes is achieved. Simulation results show that, compared with the baseline scheme, the proposed algorithm significantly reduces the total execution cost, effectively balances task execution delay and the energy consumption of compute nodes, and takes into account the interests of each network compute node.
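Bilateral matching with bidirectional preferences is classically solved by deferred acceptance (Gale-Shapley). A minimal sketch of that building block follows; the paper's multi-round mechanism is more elaborate, and the task/node names are illustrative:

```python
def stable_match(task_prefs, node_prefs):
    """Deferred-acceptance (Gale-Shapley) matching: tasks propose in preference
    order; each node keeps its best proposer so far. Yields a stable matching:
    no task-node pair would both rather be with each other."""
    rank = {n: {t: i for i, t in enumerate(prefs)} for n, prefs in node_prefs.items()}
    free = list(task_prefs)          # tasks still unmatched
    next_pick = {t: 0 for t in task_prefs}
    engaged = {}                     # node -> task
    while free:
        t = free.pop(0)
        n = task_prefs[t][next_pick[t]]
        next_pick[t] += 1
        if n not in engaged:
            engaged[n] = t
        elif rank[n][t] < rank[n][engaged[n]]:
            free.append(engaged[n])  # node trades up; old task re-enters
            engaged[n] = t
        else:
            free.append(t)           # rejected; t proposes to its next choice
    return {t: n for n, t in engaged.items()}

match = stable_match(
    task_prefs={"t1": ["n1", "n2"], "t2": ["n1", "n2"]},
    node_prefs={"n1": ["t2", "t1"], "n2": ["t1", "t2"]},
)
```

Both tasks prefer n1, but n1 prefers t2, so t1 ends up with n2; neither side can improve by deviating.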
Abstract: This study compares the relative efficacy of the continuation task and the model-as-feedback writing (MAFW) task in EFL writing development. Ninety intermediate-level Chinese EFL learners were randomly assigned to a continuation group, a MAFW group, and a control group, each with 30 learners. A pretest and a posttest were used to gauge L2 writing development. Results showed that the continuation task outperformed the MAFW task not only in enhancing the overall quality of L2 writing, but also in promoting the quality of three components of L2 writing, namely, content, organization, and language. The finding has important implications for L2 writing teaching and learning.
Funding: supported by the National Key R&D Program of China under Grant Nos. 2024YFD2400200 and 2024YFD2400204, and in part by the Science and Technology Development Program for the Two Zones under Grant No. 2023LQ02004.
Abstract: With the widespread deployment of assembly robots in smart manufacturing, efficiently offloading tasks and allocating resources in highly dynamic industrial environments has become a critical challenge for Mobile Edge Computing (MEC). To address this challenge, this paper constructs a cloud-edge-end collaborative MEC system that enables assembly robots to offload complex workflow tasks via multiple paths (horizontal, vertical, and hybrid collaboration). To mitigate uncertainties arising from mobility, a location prediction module is employed. This enables proactive channel-quality estimation, providing forward-looking insights for offloading decisions. Furthermore, we propose a fairness-aware joint optimization framework. Utilizing an improved Multi-Agent Deep Reinforcement Learning (MADRL) algorithm whose reward function incorporates total system cost, positional reliability, and timeout penalties, the framework aims to balance resource distribution among assembly robots while maximizing system utility. Simulation results demonstrate that the proposed framework outperforms traditional offloading strategies. By integrating predictive mobility management with fairness-aware optimization, the framework offers a robust solution for dynamic industrial MEC environments.
Abstract: Task scheduling in cloud computing is a multi-objective optimization problem, often involving conflicting objectives such as minimizing execution time, reducing operational cost, and maximizing resource utilization. However, traditional approaches frequently rely on single-objective optimization methods, which are insufficient for capturing the complexity of such problems. To address this limitation, we introduce MDMOSA (Multi-objective Dwarf Mongoose Optimization with Simulated Annealing), a hybrid algorithm that applies multi-objective optimization to task scheduling in Infrastructure-as-a-Service (IaaS) cloud environments. MDMOSA harmonizes the exploration capabilities of the biologically inspired Dwarf Mongoose Optimization (DMO) with the exploitation strengths of Simulated Annealing (SA), achieving a balanced search process. The algorithm aims to optimize task allocation by reducing makespan and financial cost while improving system resource utilization. We evaluate MDMOSA through extensive simulations using the real-world Google Cloud Jobs (GoCJ) dataset within the CloudSim environment. Comparative analysis against benchmark algorithms such as SMOACO, MOTSGWO, and MFPAGWO reveals that MDMOSA consistently achieves superior performance in terms of scheduling efficiency, cost-effectiveness, and scalability. These results confirm the potential of MDMOSA as a robust and adaptable solution for resource scheduling in dynamic and heterogeneous cloud computing infrastructures.
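Two ingredients of such a hybrid can be sketched generically: the makespan objective and the simulated-annealing acceptance rule. Both below are standard textbook forms, not MDMOSA's exact implementation, and the task/VM values are illustrative:

```python
import math
import random

def makespan(assignment, task_len, vm_speed):
    """Finish time of the busiest VM for a task -> VM assignment."""
    load = {}
    for task, vm in assignment.items():
        load[vm] = load.get(vm, 0.0) + task_len[task] / vm_speed[vm]
    return max(load.values())

def sa_accept(current, candidate, temperature, rng):
    """Metropolis criterion: always accept improvements; sometimes accept a
    worse schedule (with probability exp(-delta/T)) to escape local optima."""
    if candidate <= current:
        return True
    return rng.random() < math.exp((current - candidate) / temperature)

lens = {"a": 4.0, "b": 2.0, "c": 2.0}
speeds = {"vm1": 1.0, "vm2": 1.0}
bad  = makespan({"a": "vm1", "b": "vm1", "c": "vm1"}, lens, speeds)   # all on vm1
good = makespan({"a": "vm1", "b": "vm2", "c": "vm2"}, lens, speeds)   # balanced
accepted_worse = sa_accept(good, bad, temperature=0.1, rng=random.Random(0))
```

At low temperature the much worse schedule (makespan 8 vs. 4) is essentially never accepted, which is the exploitation behavior SA contributes to the hybrid.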
Abstract: The iterative continuation task (ICT) requires English as a foreign language (EFL) learners to read a segment of an English novel and write a continuation that aligns with the preceding segment over successive turns, offering exposure to diverse grammatical structures and opportunities for contextualized usage. Given the importance of integrating technology into second language (L2) writing and the critical role that grammar plays in L2 writing development, automated written corrective feedback provided by Grammarly has gained significant attention. This study investigates the impact of Grammarly on grammar learning strategies, grammar grit, and grammar competence among EFL college students engaged in ICT. This study employed a mixed-methods sequential exploratory design; 56 participants were divided into an experimental group (n=28), receiving Grammarly feedback for ICT, and a control group (n=28), completing ICT without Grammarly feedback. Quantitative results revealed that both groups showed improvements in L2 grammar learning strategies, grit, and competence. For the experimental group, significant differences were observed across all variables of L2 grammar learning strategies, grit, and competence between pre- and post-tests. For the control group, significant differences were only observed in the affective dimension of grammar learning strategies, Consistency of Interest (COI) of grammar grit, and grammar competence. However, the control group presented a significantly higher improvement in grammar competence. Qualitative analysis showed both positive and negative perceptions of Grammarly. The pedagogical implications of integrating Grammarly and ICT for L2 grammar development are discussed.
Funding: supported by the National Natural Science Foundation of China under Grant No. 61701100.
Abstract: In scenarios where ground-based cloud computing infrastructure is unavailable, unmanned aerial vehicles (UAVs) act as mobile edge computing (MEC) servers to provide on-demand computation services for ground terminals. To address the challenge of jointly optimizing task scheduling and UAV trajectory under limited resources and the high mobility of UAVs, this paper presents PER-MATD3, a multi-agent deep reinforcement learning algorithm that incorporates prioritized experience replay (PER) into the Centralized Training with Decentralized Execution (CTDE) framework. Specifically, PER-MATD3 enables each agent to learn a decentralized policy using only local observations during execution, while leveraging a shared replay buffer with prioritized sampling and a centralized critic during training to accelerate convergence and improve sample efficiency. Simulation results show that PER-MATD3 reduces average task latency by up to 23%, improves energy efficiency by 21%, and enhances service coverage compared to state-of-the-art baselines, demonstrating its effectiveness and practicality in scenarios without terrestrial networks.
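The prioritized experience replay (PER) component samples transitions in proportion to |TD error|^alpha and corrects the resulting bias with importance-sampling weights. This is the standard PER rule; the hyperparameters below are conventional defaults, not necessarily the paper's:

```python
def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probability proportional to (|TD error| + eps)^alpha.
    alpha=0 recovers uniform sampling; eps keeps zero-error transitions alive."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

def importance_weights(probs, buffer_size, beta=0.4):
    """Importance-sampling weights (N * p)^(-beta), normalized by the max so
    the largest weight is 1, correcting the non-uniform sampling bias."""
    ws = [(buffer_size * p) ** (-beta) for p in probs]
    m = max(ws)
    return [w / m for w in ws]

probs = per_probabilities([0.1, 0.5, 2.0])
weights = importance_weights(probs, buffer_size=3)
```

High-error transitions are replayed more often but their gradient contribution is down-weighted, keeping the update unbiased in expectation as beta anneals toward 1.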
Funding: funded by Taif University, Taif, Saudi Arabia, project number (TU-DSPP-2024-17).
Abstract: Advanced technologies like Cyber-Physical Systems (CPS) and the Internet of Things (IoT) have supported modernizing and automating the transportation sector through the introduction of Intelligent Transportation Systems (ITS). Integrating CPS-ITS and IoT provides real-time Vehicle-to-Infrastructure (V2I) communication, supporting better traffic management, safety, and efficiency. These technological innovations generate complex problems that need to be addressed, particularly regarding data routing and Task Scheduling (TS) in ITS. Previous attempts to solve these problems were based primarily on traditional and experimental methods, and the solutions were largely unsuccessful due to the dynamic nature of ITS. This is where Machine Learning (ML) and Swarm Intelligence (SI) have significantly impacted these challenges; in this line, this research paper presents a novel method for TS and data routing in CPS-ITS. This paper proposes a cutting-edge ML algorithm, Gated Linear Unit-approximated Reinforcement Learning (GLRL), for data transmission in CPS-ITS, and Greedy Iterative-Particle Swarm Optimization (GI-PSO), which extends Particle Swarm Optimization (PSO), for TS. The primary objective of this study is to enhance the security and effectiveness of ITS systems that utilize CPS-ITS. This study trained and validated the models using a network simulation dataset of 50 nodes from numerous ITS environments. The experiments demonstrate that the proposed GLRL reduces End-to-End Delay (EED) by 12%, enhances data size utilization from 83.6% to 88.6%, and achieves higher bandwidth allocation, particularly in high-demand scenarios such as multimedia data streams, where adherence improved to 98.15%. Furthermore, GLRL reduced Network Congestion (NC) by 5.5%, demonstrating its efficiency in managing complex traffic conditions across several environments. The model passed simulation tests in three different environments: urban (UE), suburban (SE), and rural (RE). It met high bandwidth requirements, made task scheduling more efficient, and increased network throughput (NT), demonstrating that it is robust and flexible enough for scalable ITS applications. These innovations provide robust, scalable solutions for real-time traffic management, ultimately improving safety, reducing NC, and increasing overall NT. This study can advance ITS by making it more responsive, safe, and effective, and by providing an effective deployment method across UE, SE, and RE.
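GI-PSO builds on the canonical PSO velocity/position update, which can be sketched as follows. The greedy iterative refinement is not shown, and the inertia/acceleration coefficients are conventional defaults, not the paper's tuned values:

```python
import random

def pso_step(pos, vel, pbest, gbest, rng, w=0.7, c1=1.5, c2=1.5):
    """Canonical PSO update for one particle: inertia term plus stochastic
    pulls toward the particle's personal best (pbest) and the swarm's
    global best (gbest); the position then moves by the new velocity."""
    new_vel = [
        w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        for x, v, pb, gb in zip(pos, vel, pbest, gbest)
    ]
    new_pos = [x + v for x, v in zip(pos, new_vel)]
    return new_pos, new_vel

# A particle at the origin, at rest, pulled toward bests at (1,1) and (2,2).
rng = random.Random(42)
pos, vel = pso_step(pos=[0.0, 0.0], vel=[0.0, 0.0],
                    pbest=[1.0, 1.0], gbest=[2.0, 2.0], rng=rng)
```

For task scheduling, each particle would encode a candidate task-to-resource assignment, with fitness derived from delay, bandwidth, and congestion objectives like those reported above.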