With the increasing complexity of vehicular networks and the proliferation of connected vehicles, Federated Learning (FL) has emerged as a critical framework for decentralized model training while preserving data privacy. However, efficient client selection and adaptive weight allocation in heterogeneous and non-IID environments remain challenging. To address these issues, we propose Federated Learning with Client Selection and Adaptive Weighting (FedCW), a novel algorithm that leverages adaptive client selection and dynamic weight allocation for optimizing model convergence in real-time vehicular networks. FedCW selects clients based on their Euclidean distance from the global model and dynamically adjusts aggregation weights to optimize both data diversity and model convergence. Experimental results show that FedCW significantly outperforms existing FL algorithms such as FedAvg, FedProx, and SCAFFOLD, particularly in non-IID settings, achieving faster convergence, higher accuracy, and reduced communication overhead. These findings demonstrate that FedCW provides an effective solution for enhancing the performance of FL in heterogeneous, edge-based computing environments.
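The abstract does not state FedCW's exact selection rule or weight formula, so the following is only a minimal sketch of one plausible reading: keep the k clients whose parameter vectors are closest to the global model (Euclidean distance) and weight them by normalized inverse distance. The function names, the choice of "closest", and the inverse-distance weighting are all assumptions made for illustration, not the authors' method.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two flat parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_and_weight(global_model, client_models, k):
    """Return {client_id: weight} for the k clients nearest the global model.

    ASSUMPTION: "nearest" selection and inverse-distance weights are
    illustrative choices; the abstract only says distance drives both steps.
    """
    distances = {
        cid: euclidean_distance(global_model, params)
        for cid, params in client_models.items()
    }
    chosen = sorted(distances, key=distances.get)[:k]
    eps = 1e-12  # guards against division by zero for an identical model
    raw = {cid: 1.0 / (distances[cid] + eps) for cid in chosen}
    total = sum(raw.values())
    return {cid: value / total for cid, value in raw.items()}


def aggregate(client_models, weights):
    """Weighted average of the selected clients' parameter vectors."""
    dim = len(next(iter(client_models.values())))
    return [
        sum(weights[cid] * client_models[cid][i] for cid in weights)
        for i in range(dim)
    ]


# Hypothetical toy data: three clients with two parameters each.
clients = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [10.0, 10.0]}
weights = select_and_weight([0.0, 0.0], clients, k=2)
new_global = aggregate(clients, weights)
```

In this toy run the distant client "c" is excluded, and client "a" receives twice the weight of client "b" because it is half as far from the global model.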
1. Introduction The field of exercise science is experiencing a renaissance, with recent research illuminating the molecular, cellular, and systemic effects of physical activity. This is largely due to the now unequivocal evidence that a lack of physical activity not only has direct effects on the prevalence of non-contagious diseases (NCDs) but also has profound additive effects on other risk factors for NCDs, such as obesity and hypertension.¹ The articles in this special topic of the Journal of Sport and Health Science (JSHS) are dedicated to research on exercise biochemistry and metabolism.
Qingke, a staple crop grown on the high-altitude Tibetan Plateau, has evolved a metabolomic profile providing both environmental stress resilience and human nutrition. We review the hypothesis that the metabolites that confer cold and UV resistance on the crop also facilitate human adaptation to high-altitude stresses. Specifically, β-glucans regulate blood glucose primarily via short-chain fatty acids (SCFAs) produced through gut microbiota fermentation, which directly mediate glucose homeostasis. Phenolamides accumulate via the phenylpropanoid pathway, with chalcone isomerase (CHI) serving as a key enzyme in flavonoid biosynthesis and enhancing UV-B resistance. Under low temperatures, β-glucans improve frost tolerance by modulating osmotic balance and inhibiting ice-nucleating proteins, while lipids maintain membrane fluidity to sustain cellular function during cold stress. Importantly, we explore the hypothesis that these same metabolites, upon consumption, may facilitate human adaptation to high-altitude stresses. This hypothesis is supported by preliminary epidemiological associations between Qingke consumption and favorable health outcomes in high-altitude populations, as well as established bioactivities of the implicated metabolites in vitro and in animal models. However, direct causal evidence in humans and a comprehensive understanding of the underlying molecular mechanisms remain key knowledge gaps that warrant future investigation. Qingke is a unique resource at the interface of agricultural resilience and human nutrition. Understanding its metabolic blueprint will inform the development of functional foods and climate-resilient crops.
The Financial Technology (FinTech) sector has witnessed rapid growth, resulting in increasingly complex and high-volume digital transactions. Although this expansion improves efficiency and accessibility, it also introduces significant vulnerabilities, including fraud, money laundering, and market manipulation. Traditional anomaly detection techniques often fail to capture the relational and dynamic characteristics of financial data. Graph Neural Networks (GNNs), capable of modeling intricate interdependencies among entities, have emerged as a powerful framework for detecting subtle and sophisticated anomalies. However, the high dimensionality and inherent noise of FinTech datasets demand robust feature selection strategies to improve model scalability, performance, and interpretability. This paper presents a comprehensive survey of GNN-based approaches for anomaly detection in FinTech, with an emphasis on the synergistic role of feature selection. We examine the theoretical foundations of GNNs, review state-of-the-art feature selection techniques, analyze their integration with GNNs, and categorize prevalent anomaly types in FinTech applications. In addition, we discuss practical implementation challenges, highlight representative case studies, and propose future research directions to advance the field of graph-based anomaly detection in financial systems.
High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture non-linear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
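Feature redundancy, feature-label relevance, and inter-label dependency are all pairwise mutual information values in the abstract above. As a rough illustration, the plug-in estimate of mutual information for two discrete variables can be sketched as follows. The abstract does not say which estimator or discretization the authors use, so the function below is an assumption: the standard empirical estimate I(X;Y) = Σ p(x,y) log[p(x,y) / (p(x) p(y))] in nats.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences.

    ASSUMPTION: inputs are already discrete; continuous features would
    need to be binned first.
    """
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        p_indep = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * math.log(p_joint / p_indep)
    return mi


# Hypothetical toy columns: a feature, an identical label, an unrelated label.
feature = [0, 0, 1, 1]
label_same = [0, 0, 1, 1]
label_indep = [0, 1, 0, 1]
relevance_high = mutual_information(feature, label_same)   # equals ln 2
relevance_zero = mutual_information(feature, label_indep)  # equals 0
```

The same function applied to two feature columns would give a redundancy term, and applied to two label columns an inter-label dependency term.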
Populus species, important economic species combining rapid growth with broad ecological adaptability, play a critical role in sustainable forestry and bioenergy production. In this study, we performed whole-genome resequencing of 707 individuals from a full-sib family to develop comprehensive single nucleotide polymorphism (SNP) markers and constructed a high-density genetic linkage map of 19 linkage groups. The total genetic length of the map reached 3623.65 cM, with an average marker interval of 0.34 cM. By integrating multidimensional phenotypic data, 89 quantitative trait loci (QTL) associated with growth, wood physical and chemical properties, disease resistance, and leaf morphology traits were identified, with logarithm of odds (LOD) scores ranging from 3.13 to 21.72 and phenotypic variance explained between 1.7% and 11.6%. Notably, pleiotropic analysis revealed significant colocalization hotspots on chromosomes LG1, LG5, LG6, LG8, and LG14, with epistatic interaction network analysis confirming the genetic basis of coordinated regulation across multiple traits. Functional annotation of 207 candidate genes showed that R2R3-MYB and bHLH transcription factors and pyruvate kinase-encoding genes were significantly enriched, suggesting crucial roles in lignin biosynthesis and carbon metabolic pathways. Allelic effect analysis indicated that the frequency of favorable alleles associated with target traits ranged from 0.20 to 0.55. Incorporation of QTL-derived favorable alleles as random effects into Bayesian-based genomic selection models led to an increase in prediction accuracy ranging from 1% to 21%, with Bayesian ridge regression as the best predictive model. This study provides valuable genomic resources and genetic insights for deciphering complex trait architecture and advancing molecular breeding in poplar.
Rheumatoid arthritis (RA) patients face significant psychological challenges alongside physical symptoms, necessitating a comprehensive understanding of how psychological vulnerability and adaptation patterns evolve throughout the disease course. This review examined 95 studies (2000-2025) from the PubMed, Web of Science, and CNKI databases, including longitudinal cohorts, randomized controlled trials, and mixed-methods research, to characterize the complex interplay between biological, psychological, and social factors affecting RA patients' mental health. Findings revealed three distinct vulnerability trajectories (45% persistently low, 30% fluctuating improvement, 25% persistently high) and four adaptation stages, with critical intervention periods occurring 3-6 months post-diagnosis and during disease flares. Multiple factors significantly influence psychological outcomes, including gender (females showing a 1.8-fold increased risk), age (younger patients experiencing 42% higher vulnerability), pain intensity, inflammatory markers, and neuroendocrine dysregulation (48% showing cortisol rhythm disruption). Early psychological intervention (within 3 months of diagnosis) demonstrated robust benefits, reducing depression incidence by 42% with effects persisting 24-36 months, while different modalities showed complementary advantages: cognitive behavioral therapy for depression (Cohen's d = 0.68), mindfulness for pain acceptance (38% improvement), and peer support for meaning reconstruction (25.6% increase). These findings underscore the importance of integrating routine psychological assessment into standard RA care, developing stage-appropriate interventions, and advancing research toward personalized biopsychosocial approaches that address the dynamic psychological dimensions of the disease.
The advantages of genomic selection (GS) in animal and plant breeding are self-evident. Traditional parametric models struggle to fit increasingly large sequencing datasets and to capture complex effects accurately. Machine learning models have demonstrated remarkable potential in addressing these challenges. In this study, we introduced the concept of mixed kernel functions to explore the performance of support vector machine regression (SVR) in GS. Six single kernel functions (SVR_L, SVR_C, SVR_G, SVR_P, SVR_S, SVR_L) and four mixed kernel functions (SVR_GS, SVR_GP, SVR_LS, SVR_LP) were used to predict genomic breeding values. Prediction accuracy, mean squared error (MSE), and mean absolute error (MAE) were used as evaluation indicators for comparison with two traditional parametric models (GBLUP, BayesB) and two popular machine learning models (RF, KcRR). The results indicate that in most cases the mixed kernel function models significantly outperform GBLUP, BayesB, and the single kernel function models. For instance, for T1 in the pig dataset, the predictive accuracy of SVR_GS is improved by 10% compared to GBLUP, and by approximately 4.4% and 18.6% compared to SVR_G and SVR_S, respectively. For E1 in the wheat dataset, SVR_GS achieves 13.3% higher prediction accuracy than GBLUP. Among single kernel functions, the Laplacian and Gaussian kernel functions yield similar results, with the Gaussian kernel function performing better. The mixed kernel functions notably reduce the MSE and MAE when compared to all single kernel functions. Furthermore, regarding runtime, the SVR_GS and SVR_GP mixed kernel models run approximately three times faster than GBLUP on the pig dataset, with only a slight increase in runtime compared to the single kernel function models. In summary, the mixed kernel function SVR models are competitive in both speed and accuracy, and models such as SVR_GS have important application potential for GS.
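A mixed kernel such as the one behind SVR_GS can be pictured as a weighted combination of two single kernels. The abstract does not give the combination rule or any hyperparameters, so the sketch below assumes a convex combination of a Gaussian and a sigmoid kernel; `lam`, `gamma`, `alpha`, and `c` are hypothetical parameters chosen only for illustration.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def sigmoid_kernel(x, y, alpha=0.01, c=0.0):
    """Sigmoid kernel: tanh(alpha * <x, y> + c)."""
    return math.tanh(alpha * sum(a * b for a, b in zip(x, y)) + c)


def mixed_kernel(x, y, lam=0.7):
    """ASSUMED mixing rule: lam * Gaussian + (1 - lam) * sigmoid, 0 <= lam <= 1."""
    return lam * gaussian_kernel(x, y) + (1.0 - lam) * sigmoid_kernel(x, y)


def gram_matrix(samples, lam=0.7):
    """Pairwise mixed-kernel matrix for a list of genotype vectors."""
    return [[mixed_kernel(a, b, lam) for b in samples] for a in samples]


# Hypothetical genotypes coded as 0/1/2 allele counts at four markers.
genotypes = [[0, 1, 2, 1], [0, 1, 2, 0], [2, 2, 0, 0]]
K = gram_matrix(genotypes)
```

A Gram matrix built this way could be supplied to any SVR implementation that accepts a precomputed kernel, for example scikit-learn's `SVR(kernel="precomputed")`. Note that the sigmoid kernel is not positive semi-definite for all parameter values, which is one reason the mixing weight is usually tuned together with the kernel parameters.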
Domain adaptation aims to reduce the distribution gap between the training data (source domain) and the target data. This enables effective predictions even for domains not seen during training. However, most conventional domain adaptation methods assume a single source domain, making them less suitable for modern deep learning settings that rely on diverse and large-scale datasets. To address this limitation, recent research has focused on Multi-Source Domain Adaptation (MSDA), which aims to learn effectively from multiple source domains. In this paper, we propose Efficient Domain Transition for Multi-source (EDTM), a novel and efficient framework designed to tackle two major challenges in existing MSDA approaches: (1) integrating knowledge across different source domains and (2) aligning label distributions between source and target domains. EDTM leverages an ensemble-based classifier expert mechanism to enhance the contribution of source domains that are more similar to the target domain. To further stabilize the learning process and improve performance, we incorporate imitation learning into the training of the target model. In addition, Maximum Classifier Discrepancy (MCD) is employed to align class-wise label distributions between the source and target domains. Experiments were conducted using Digits-Five, one of the most representative benchmark datasets for MSDA. The results show that EDTM consistently outperforms existing methods in terms of average classification accuracy. Notably, EDTM achieved significantly higher performance on target domains such as the Modified National Institute of Standards and Technology with blended background images (MNIST-M) and Street View House Numbers (SVHN) datasets, demonstrating enhanced generalization compared to baseline approaches. Furthermore, an ablation study analyzing the contribution of each loss component validated the effectiveness of the framework, highlighting the importance of each module in achieving optimal performance.
To address the issue of scarce labeled samples and operational condition variations that degrade the accuracy of fault diagnosis models in variable-condition gearbox fault diagnosis, this paper proposes a semi-supervised masked contrastive learning and domain adaptation (SSMCL-DA) method for gearbox fault diagnosis under variable conditions. Initially, during the unsupervised pre-training phase, a dual signal augmentation strategy is devised, which simultaneously applies random masking in the time domain and random scaling in the frequency domain to unlabeled samples, thereby constructing more challenging positive sample pairs to guide the encoder in learning intrinsic features robust to condition variations. Subsequently, a ConvNeXt-Transformer hybrid architecture is employed, integrating the superior local detail modeling capacity of ConvNeXt with the robust global perception capability of the Transformer to enhance feature extraction in complex scenarios. Thereafter, a contrastive learning model is constructed with the optimization objective of maximizing feature similarity across different masked instances of the same sample, enabling the extraction of consistent features from multiple masked perspectives and reducing reliance on labeled data. In the final supervised fine-tuning phase, a multi-scale attention mechanism is incorporated for feature rectification, and a domain adaptation module combining Local Maximum Mean Discrepancy (LMMD) with adversarial learning is proposed. This module embodies a dual mechanism: LMMD facilitates fine-grained class-conditional alignment, compelling features of identical fault classes to converge across varying conditions, while the domain discriminator utilizes adversarial training to guide the feature extractor toward learning domain-invariant features. Working in concert, they markedly diminish feature distribution discrepancies induced by changes in load, rotational speed, and other factors, thereby boosting the model's adaptability to cross-condition scenarios. Experimental evaluations on the WT planetary gearbox dataset and the Case Western Reserve University (CWRU) bearing dataset demonstrate that the SSMCL-DA model effectively identifies multiple fault classes in gearboxes, with diagnostic performance substantially surpassing that of conventional methods. Under cross-condition scenarios, the model attains fault diagnosis accuracies of 99.21% for the WT planetary gearbox and 99.86% for the bearings, respectively. Furthermore, the model exhibits stable generalization capability in cross-device settings.
Most predictive maintenance studies have emphasized accuracy but provide very little focus on interpretability or deployment readiness. This study improves on prior methods by developing a small yet robust system that can predict when turbofan engines will fail. It uses the NASA CMAPSS dataset, which has over 200,000 engine cycles from 260 engines. The process begins with systematic preprocessing, which includes imputation, outlier removal, scaling, and labelling of the remaining useful life. Dimensionality is reduced using a hybrid selection method that combines variance filtering, recursive elimination, and gradient-boosted importance scores, yielding a stable set of 10 informative sensors. To mitigate class imbalance, minority cases are oversampled, and class-weighted losses are applied during training. Benchmarking is carried out with logistic regression, gradient boosting, and a recurrent design that integrates gated recurrent units with long short-term memory networks. The Long Short-Term Memory–Gated Recurrent Unit (LSTM–GRU) hybrid achieved the strongest performance, with an F1 score of 0.92, precision of 0.93, recall of 0.91, Receiver Operating Characteristic–Area Under the Curve (ROC-AUC) of 0.97, and minority recall of 0.75. Interpretability testing using permutation importance and Shapley values indicates that sensors 13, 15, and 11 are the most important indicators of engine wear. The proposed system combines imbalance handling, feature reduction, and interpretability into a practical design suitable for real industrial settings.
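Two steps in the abstract above lend themselves to a short worked example: labelling the remaining useful life (RUL) of a run-to-failure engine, and checking that the reported F1 score agrees with the reported precision and recall. The sketch below assumes the common convention that RUL is the number of cycles left before the engine's last recorded cycle, and that a binary "near failure" class is obtained by thresholding RUL; the 30-cycle window is a hypothetical value, not one given in the abstract.

```python
def label_rul(cycles, window=30):
    """Return (rul, near_failure) pairs for one run-to-failure engine.

    ASSUMPTION: RUL = last observed cycle - current cycle, and the binary
    label marks cycles whose RUL is within `window` cycles of failure.
    """
    last_cycle = max(cycles)
    return [(last_cycle - c, int(last_cycle - c <= window)) for c in cycles]


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical five-cycle engine with a two-cycle warning window.
labels = label_rul([1, 2, 3, 4, 5], window=2)

# Consistency check on the reported metrics (precision 0.93, recall 0.91).
reported_f1 = round(f1_score(0.93, 0.91), 2)
```

With precision 0.93 and recall 0.91 the harmonic mean rounds to 0.92, which matches the F1 score reported for the LSTM–GRU hybrid.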
Feature selection serves as a critical preprocessing step in machine learning, focusing on identifying and preserving the most relevant features to improve the efficiency and performance of classification algorithms. Particle Swarm Optimization has demonstrated significant potential in addressing feature selection challenges. However, inherent limitations of Particle Swarm Optimization, such as the delicate balance between exploration and exploitation, susceptibility to local optima, and suboptimal convergence rates, hinder its performance. To tackle these issues, this study introduces a novel Leveraged Opposition-Based Learning method within Fitness Landscape Particle Swarm Optimization, tailored for wrapper-based feature selection. The proposed approach integrates: (1) a fitness-landscape adaptive strategy to dynamically balance exploration and exploitation, (2) the lever principle within Opposition-Based Learning to improve search efficiency, and (3) a Local Selection and Re-optimization mechanism combined with random perturbation to expedite convergence and enhance the quality of the optimal feature subset. The effectiveness of the proposed method is rigorously evaluated on 24 benchmark datasets and compared against 13 advanced metaheuristic algorithms. Experimental results demonstrate that the proposed method outperforms the compared algorithms in classification accuracy on over half of the datasets, whilst also significantly reducing the number of selected features. These findings demonstrate its effectiveness and robustness in feature selection tasks.
Existing feature selection methods for intrusion detection systems (IDS) in the Industrial Internet of Things often suffer from local optimality and high computational complexity. These challenges hinder traditional IDS from effectively extracting features while maintaining detection accuracy. This paper proposes an Industrial Internet of Things intrusion detection feature selection algorithm based on an improved whale optimization algorithm (GSLDWOA). The aim is to address the problems that feature selection algorithms are prone to under high-dimensional data, such as local optimality, long detection time, and reduced accuracy. First, the initial population's diversity is increased using the Gaussian Mutation mechanism. Then, a Non-linear Shrinking Factor balances global exploration and local exploitation, avoiding premature convergence. Lastly, a Variable-step Levy Flight operator and a Dynamic Differential Evolution strategy are introduced to improve the algorithm's search efficiency and convergence accuracy in high-dimensional feature space. Experiments on the NSL-KDD and WUSTL-IIoT-2021 datasets demonstrate that the feature subset selected by GSLDWOA significantly improves detection performance. Compared to the traditional WOA algorithm, the detection rate and F1-score increased by 3.68% and 4.12%, respectively. On the WUSTL-IIoT-2021 dataset, accuracy, recall, and F1-score all exceed 99.9%.
Existing elevator fault diagnosis algorithms have limited engineering applicability due to variations in working conditions and differences in equipment structures. To address this limitation, this study proposes an unsupervised subdomain adaptation method based on a time-frequency feature attention mechanism, LMMD-based subdomain alignment, and contrastive local alignment. This enables the application of the diagnosis model across different working conditions and equipment types. First, a novel time-frequency feature attention mechanism assigns weights to vibration signals of varying dimensions. Second, the time series is transformed to obtain a three-channel time-frequency diagram. This diagram is input into the proposed dimension-segmentation cross-channel multi-head self-attention framework to extract high-dimensional frequency-domain fault features. These features are concatenated with the time-domain features to obtain a global feature representation. Then, the extracted high-dimensional features are sent to the classification module to obtain the predicted labels for the source and target domains. Finally, after confidence filtering, the true labels from the source domain and the predicted labels from the target domain are fed into a dynamically weighted multilevel feature alignment module to promote proximity between similar fault features across domains while enhancing separation among different fault types. The validity and superiority of the proposed method were demonstrated through simulation experiments conducted on two types of manned escalator systems under multiple working conditions. For the most challenging transfer task, the proposed method achieved higher accuracy on the target domain test set than DANN, ADDA, C-CLCN, TFA-CCN, and TFA-LCN by 26.87%, 24.72%, 11.44%, 28.94%, and 16.85%, respectively.
Unconfined Compressive Strength (UCS) is a key parameter for the assessment of the stability and performance of stabilized soils, yet traditional laboratory testing is both time and resource intensive. In this study, an interpretable machine learning approach to UCS prediction is presented, pairing five models (Random Forest (RF), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), CatBoost, and K-Nearest Neighbors (KNN)) with SHapley Additive exPlanations (SHAP) for enhanced interpretability and to guide feature removal. A complete dataset of 12 geotechnical and chemical parameters, including Atterberg limits, compaction properties, stabilizer chemistry, dosage, and curing time, was used to train and test the models. R², RMSE, MSE, and MAE were used to assess performance. Initial results with all 12 features indicated that boosting-based models (GB, XGB, CatBoost) exhibited the highest predictive accuracy (R² = 0.93) with satisfactory generalization on test data, followed by RF and KNN. SHAP analysis consistently picked CaO content, curing time, stabilizer dosage, and compaction parameters as the most important features, aligning with established soil stabilization mechanisms. Models were then re-trained on the top 8 and top 5 SHAP-ranked features. Interestingly, GB, XGB, and CatBoost maintained comparable accuracy with reduced input sets, while RF was moderately sensitive and KNN improved somewhat owing to reduced dimensionality. The findings confirm that feature reduction through SHAP enables cost-effective UCS prediction by reducing laboratory test requirements without significant accuracy loss. The suggested hybrid approach offers an explainable, interpretable, and cost-effective tool for geotechnical engineering practice.
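Retraining on the "top 8" or "top 5" SHAP-ranked features presupposes a global ranking. A common way to obtain one, assumed here since the abstract does not specify it, is to average the absolute SHAP value of each feature over all samples and sort in descending order. The sketch below takes an already computed SHAP matrix as input; the feature names and numbers are invented for illustration only.

```python
def rank_features_by_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value (largest first).

    shap_values: list of rows, one per sample, one value per feature.
    ASSUMPTION: mean |SHAP| is used as the global importance score.
    """
    n_samples = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(importance, key=importance.get, reverse=True)


def top_k_features(shap_values, feature_names, k):
    """Keep only the k highest-ranked feature names."""
    return rank_features_by_shap(shap_values, feature_names)[:k]


# Hypothetical SHAP values for two samples and three features.
names = ["plastic_limit", "cao_content", "curing_time"]
values = [[0.1, -2.0, 0.5], [0.3, 1.0, -0.5]]
selected = top_k_features(values, names, k=2)
```

Taking absolute values before averaging matters: a feature whose contributions are large but alternate in sign would otherwise appear unimportant.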
Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language's complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES by combining text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, an optimal data efficiency training approach was introduced, where training data portions increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
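The two metrics quoted in the essay-scoring abstract, R² and accuracy within a 0.5-point tolerance, can be sketched as follows. The definitions are the standard ones (coefficient of determination, and the share of predictions whose absolute error does not exceed the tolerance); whether the authors treat the boundary case as a hit is not stated, so the inclusive comparison below is an assumption, and the scores are made up for illustration.

```python
def accuracy_within_tolerance(y_true, y_pred, tol=0.5):
    """Fraction of predictions with |error| <= tol (boundary counted as a hit)."""
    hits = sum(abs(t - p) <= tol for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical human scores and model predictions for four essays.
human = [1.0, 2.0, 3.0, 4.0]
model = [1.4, 2.0, 3.6, 4.5]
acc = accuracy_within_tolerance(human, model)  # three of four within 0.5
r2 = r_squared(human, model)
```

The two numbers answer different questions: R² rewards tracking the overall spread of scores, while tolerance accuracy counts how many individual essays land close enough to the human score.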
In the quest to enhance energy efficiency and reduce environmental impact in the transportation sector, the recovery of waste heat from diesel engines has become a critical area of focus. This study provided an exhaustive thermodynamic analysis optimizing Organic Rankine Cycle (ORC) systems for waste heat recovery from diesel engines. The study assessed the performance of five candidate working fluids (R11, R123, R113, R245fa, and R141b) under a range of operating conditions, specifically varying overheat temperatures and evaporation pressures. The results indicated that the choice of working fluid substantially influences the system's exergetic efficiency, net output power, and thermal efficiency. R245fa showed an outstanding net output power of 30.39 kW at high overheat conditions, outperforming R11, which is significant for high-temperature waste heat recovery. At lower temperatures, R11 and R113 demonstrated higher exergetic efficiencies, with R11 reaching a peak exergetic efficiency of 7.4% at an evaporation pressure of 10 bar and an overheat of 10 °C. The study also revealed that controlling the overheat and optimizing the evaporation pressure are crucial for enhancing the net output power of the ORC system. Specifically, at an evaporation pressure of 30 bar and an overheat of 0 °C, R113 exhibited the lowest exergetic destruction of 544.5 kJ/kg, making it a suitable choice for minimizing irreversible losses. These findings are instrumental for understanding the performance of ORC systems in waste heat recovery applications and offer valuable insights for the design and operation of more efficient and environmentally friendly diesel engine systems.
Multi-label feature selection (MFS) is a crucial dimensionality reduction technique aimed at identifying informative features associated with multiple labels. However, traditional centralized methods face significant challenges in privacy-sensitive and distributed settings, often neglecting label dependencies and suffering from low computational efficiency. To address these issues, we introduce a novel framework, Fed-MFSDHBCPSO: federated MFS via a dual-layer hybrid breeding cooperative particle swarm optimization algorithm with manifold and sparsity regularization (DHBCPSO-MSR). Leveraging the federated learning paradigm, Fed-MFSDHBCPSO allows clients to perform local feature selection (FS) using DHBCPSO-MSR. Locally selected feature subsets are encrypted with differential privacy (DP) and transmitted to a central server, where they are securely aggregated and refined through secure multi-party computation (SMPC) until global convergence is achieved. Within each client, DHBCPSO-MSR employs a dual-layer FS strategy. The inner layer constructs sample and label similarity graphs, generates Laplacian matrices to capture the manifold structure between samples and labels, and applies L2,1-norm regularization to sparsify the feature subset, yielding an optimized feature weight matrix. The outer layer uses a hybrid breeding cooperative particle swarm optimization algorithm to further refine the feature weight matrix and identify the optimal feature subset. The updated weight matrix is then fed back to the inner layer for further optimization. Comprehensive experiments on multiple real-world multi-label datasets demonstrate that Fed-MFSDHBCPSO consistently outperforms both centralized and federated baseline methods across several key evaluation metrics.
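The L2,1-norm used for sparsity in the inner layer has a simple definition: the sum of the Euclidean norms of the rows of the weight matrix. The sketch below assumes the usual feature selection convention that each row of the matrix holds one feature's weights across all labels, so penalizing the norm pushes entire rows toward zero and those features drop out. The matrix values are hypothetical.

```python
import math


def l21_norm(weight_matrix):
    """L2,1-norm: sum over rows of each row's Euclidean (L2) norm."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in weight_matrix)


def row_scores(weight_matrix):
    """Per-feature score (row L2 norm); zero rows mark discarded features."""
    return [math.sqrt(sum(v * v for v in row)) for row in weight_matrix]


# Hypothetical weight matrix: 3 features (rows) by 2 labels (columns).
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
penalty = l21_norm(W)
scores = row_scores(W)
```

Here the second feature has an all-zero row, contributes nothing to the penalty, and would be the first candidate for removal when features are ranked by row norm.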
Data collected in fields such as cybersecurity and biomedicine are often high-dimensional and class-imbalanced. To address the low classification accuracy for minority-class samples caused by numerous irrelevant and redundant features in high-dimensional imbalanced data, we propose a novel feature selection method named AMF-SGSK, based on adaptive multi-filter and subspace-based gaining-sharing knowledge. First, a balanced dataset is obtained by random under-sampling. Second, by combining the feature importance score with the AUC score for each filter method, we introduce a concept called feature hardness to judge the importance of each feature, which allows the essential features to be selected adaptively. Finally, the optimal feature subset is obtained by gaining-sharing knowledge in multiple subspaces. This approach effectively achieves dimensionality reduction for high-dimensional imbalanced data. Experimental results on 30 benchmark imbalanced datasets show that AMF-SGSK performs better than eight other commonly used algorithms, including BGWO and IG-SSO, in terms of F1-score, AUC, and G-mean. The mean values of F1-score, AUC, and G-mean for AMF-SGSK are 0.950, 0.967, and 0.965, respectively, the highest among all algorithms. The mean G-mean is higher than those of IG-PSO, ReliefF-GWO, and BGOA by 3.72%, 11.12%, and 20.06%, respectively. Furthermore, the selected feature ratio is below 0.01 across the selected ten datasets, further demonstrating the proposed method’s overall superiority over competing approaches. AMF-SGSK can adaptively remove irrelevant and redundant features and effectively improve the classification accuracy of high-dimensional imbalanced data, providing scientific and technological references for practical applications.
Emerging and powerful genome editing tools, particularly CRISPR/Cas9, are facilitating functional genomics research and accelerating crop improvement (Jiang et al. 2021; Cao et al. 2023; Chen C et al. 2023; Liu et al. 2023a). However, the detection and screening of transgenic lines remain major bottlenecks, being time-consuming, labor-intensive, and inefficient during transformation and subsequent mutation identification. A simple and efficient visual marker system plays a critical role in addressing these challenges. Recent studies demonstrated that the GmW1 and RUBY reporter systems were used to obtain visually identifiable transgenic soybean (Glycine max) plants (Chen L et al. 2023; Chen et al. 2024).
Abstract: With the increasing complexity of vehicular networks and the proliferation of connected vehicles, Federated Learning (FL) has emerged as a critical framework for decentralized model training while preserving data privacy. However, efficient client selection and adaptive weight allocation in heterogeneous and non-IID environments remain challenging. To address these issues, we propose Federated Learning with Client Selection and Adaptive Weighting (FedCW), a novel algorithm that leverages adaptive client selection and dynamic weight allocation for optimizing model convergence in real-time vehicular networks. FedCW selects clients based on their Euclidean distance from the global model and dynamically adjusts aggregation weights to optimize both data diversity and model convergence. Experimental results show that FedCW significantly outperforms existing FL algorithms such as FedAvg, FedProx, and SCAFFOLD, particularly in non-IID settings, achieving faster convergence, higher accuracy, and reduced communication overhead. These findings demonstrate that FedCW provides an effective solution for enhancing the performance of FL in heterogeneous, edge-based computing environments.
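The abstract above states only that clients are selected by their Euclidean distance from the global model and that aggregation weights are adjusted dynamically. As a rough illustration, the sketch below assumes that the `k` closest clients are kept and that weights are inversely proportional to distance; both rules, and the names `select_and_aggregate`, `k`, and `eps`, are hypothetical choices for illustration, not the published FedCW procedure.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two flat parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_and_aggregate(global_model, client_models, k, eps=1e-8):
    """Pick k clients by distance to the global model and average them.

    Assumptions (not taken from the abstract): the k *closest* clients
    are selected, and each weight is proportional to 1 / distance.
    """
    distances = {
        cid: euclidean_distance(params, global_model)
        for cid, params in client_models.items()
    }
    # Keep the k clients whose parameters lie nearest to the global model.
    selected = sorted(distances, key=distances.get)[:k]

    # Inverse-distance weights, normalised so that they sum to 1.
    raw = {cid: 1.0 / (distances[cid] + eps) for cid in selected}
    total = sum(raw.values())
    weights = {cid: value / total for cid, value in raw.items()}

    # Weighted average of the selected clients' parameters.
    new_global = [
        sum(weights[cid] * client_models[cid][i] for cid in selected)
        for i in range(len(global_model))
    ]
    return selected, weights, new_global


if __name__ == "__main__":
    clients = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [10.0, 10.0]}
    print(select_and_aggregate([0.0, 0.0], clients, k=2))
```

A different rule, such as favouring distant clients to increase data diversity, would fit the same function shape by changing the sort order and the weight formula.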
Abstract: 1. Introduction. The field of exercise science is experiencing a renaissance, with recent research illuminating the molecular, cellular, and systemic effects of physical activity. This is largely due to the now unequivocal evidence that a lack of physical activity not only has direct effects on the prevalence of non-contagious diseases (NCDs) but also has profound additive effects on other risk factors for NCDs, such as obesity and hypertension [1]. The articles in this special topic of Journal of Sport and Health Science (JSHS) are dedicated to research on exercise biochemistry and metabolism.
Funding: Supported by the Financial Special Fund (grant number XZ202401JD0027); the National Barley Industry Technology System (CARS-05-01A-08); the Xizang Agri-Tech Innovation Project (XZNKY-2025-CXGC-T01); the Joint Funds of the National Natural Science Foundation of China (No. U20A2026); the Financial Special Fund (grant numbers 32401784, 2017CZZX001/2, XZNKY-2018-C-021, and NYSTC202401); and the China Agriculture Research System of Barley (CARS-05).
Abstract: Qingke, a staple crop grown on the high-altitude Tibetan Plateau, has evolved a metabolomic profile providing both environmental stress resilience and human nutrition. We review the hypothesis that the metabolites that confer cold and UV resistance on the crop also facilitate human adaptation to high-altitude stresses. Specifically, β-glucans regulate blood glucose primarily via short-chain fatty acids (SCFAs) produced through gut microbiota fermentation, which directly mediate glucose homeostasis. Phenolamides accumulate via the phenylpropanoid pathway, with chalcone isomerase (CHI) serving as a key enzyme in flavonoid biosynthesis and enhancing UV-B resistance. Under low temperatures, β-glucans improve frost tolerance by modulating osmotic balance and inhibiting ice-nucleating proteins, while lipids maintain membrane fluidity to sustain cellular function during cold stress. Importantly, we explore the hypothesis that these same metabolites, upon consumption, may facilitate human adaptation to high-altitude stresses. This hypothesis is supported by preliminary epidemiological associations between Qingke consumption and favorable health outcomes in high-altitude populations, as well as established bioactivities of the implicated metabolites in vitro and in animal models. However, direct causal evidence in humans and a comprehensive understanding of the underlying molecular mechanisms remain key knowledge gaps that warrant future investigation. Qingke is a unique resource at the interface of agricultural resilience and human nutrition. Understanding its metabolic blueprint will inform the development of functional foods and climate-resilient crops.
Funding: Supported by Ho Chi Minh City Open University, Vietnam, under grant number E2024.02.1CD, and by Suan Sunandha Rajabhat University, Thailand.
Abstract: The Financial Technology (FinTech) sector has witnessed rapid growth, resulting in increasingly complex and high-volume digital transactions. Although this expansion improves efficiency and accessibility, it also introduces significant vulnerabilities, including fraud, money laundering, and market manipulation. Traditional anomaly detection techniques often fail to capture the relational and dynamic characteristics of financial data. Graph Neural Networks (GNNs), capable of modeling intricate interdependencies among entities, have emerged as a powerful framework for detecting subtle and sophisticated anomalies. However, the high dimensionality and inherent noise of FinTech datasets demand robust feature selection strategies to improve model scalability, performance, and interpretability. This paper presents a comprehensive survey of GNN-based approaches for anomaly detection in FinTech, with an emphasis on the synergistic role of feature selection. We examine the theoretical foundations of GNNs, review state-of-the-art feature selection techniques, analyze their integration with GNNs, and categorize prevalent anomaly types in FinTech applications. In addition, we discuss practical implementation challenges, highlight representative case studies, and propose future research directions to advance the field of graph-based anomaly detection in financial systems.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2020-NR049579).
Abstract: High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, complexity grows further with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
Funding: Supported by the National Key Research and Development Plan of China (2021YFD2200202) and the Key Research and Development Project of Jiangsu Province, China (BE2021366).
Abstract: Populus species, important economic species combining rapid growth with broad ecological adaptability, play a critical role in sustainable forestry and bioenergy production. In this study, we performed whole-genome resequencing of 707 individuals from a full-sib family to develop comprehensive single nucleotide polymorphism (SNP) markers and constructed a high-density genetic linkage map of 19 linkage groups. The total genetic length of the map reached 3623.65 cM with an average marker interval of 0.34 cM. By integrating multidimensional phenotypic data, 89 quantitative trait loci (QTL) associated with growth, wood physical and chemical properties, disease resistance, and leaf morphology traits were identified, with logarithm of odds (LOD) scores ranging from 3.13 to 21.72 and phenotypic variance explained between 1.7% and 11.6%. Notably, pleiotropic analysis revealed significant colocalization hotspots on chromosomes LG1, LG5, LG6, LG8, and LG14, with epistatic interaction network analysis confirming the genetic basis of coordinated regulation across multiple traits. Functional annotation of 207 candidate genes showed that R2R3-MYB and bHLH transcription factors and pyruvate kinase-encoding genes were significantly enriched, suggesting crucial roles in lignin biosynthesis and carbon metabolic pathways. Allelic effect analysis indicated that the frequency of favorable alleles associated with target traits ranged from 0.20 to 0.55. Incorporation of QTL-derived favorable alleles as random effects into Bayesian-based genomic selection models led to an increase in prediction accuracy ranging from 1% to 21%, with Bayesian ridge regression as the best predictive model. This study provides valuable genomic resources and genetic insights for deciphering complex trait architecture and advancing molecular breeding in poplar.
Funding: Supported by Chongqing Health Commission and Chongqing Science and Technology Bureau, No. 2023MSXM182.
Abstract: Rheumatoid arthritis (RA) patients face significant psychological challenges alongside physical symptoms, necessitating a comprehensive understanding of how psychological vulnerability and adaptation patterns evolve throughout the disease course. This review examined 95 studies (2000-2025) from the PubMed, Web of Science, and CNKI databases, including longitudinal cohorts, randomized controlled trials, and mixed-methods research, to characterize the complex interplay between biological, psychological, and social factors affecting RA patients’ mental health. Findings revealed three distinct vulnerability trajectories (45% persistently low, 30% fluctuating improvement, 25% persistently high) and four adaptation stages, with critical intervention periods occurring 3-6 months post-diagnosis and during disease flares. Multiple factors significantly influence psychological outcomes, including gender (females showing 1.8-fold increased risk), age (younger patients experiencing 42% higher vulnerability), pain intensity, inflammatory markers, and neuroendocrine dysregulation (48% showing cortisol rhythm disruption). Early psychological intervention (within 3 months of diagnosis) demonstrated robust benefits, reducing depression incidence by 42% with effects persisting 24-36 months, while different modalities showed complementary advantages: cognitive behavioral therapy for depression (Cohen’s d = 0.68), mindfulness for pain acceptance (38% improvement), and peer support for meaning reconstruction (25.6% increase). These findings underscore the importance of integrating routine psychological assessment into standard RA care, developing stage-appropriate interventions, and advancing research toward personalized biopsychosocial approaches that address the dynamic psychological dimensions of the disease.
Funding: Supported by the China Agriculture Research System of MOF and MARA; the National Natural Science Foundation of China (31872337 and 31501919); and the Agricultural Science and Technology Innovation Project, China (ASTIP-IAS02).
Abstract: The advantages of genome selection (GS) in animal and plant breeding are self-evident. Traditional parametric models struggle to fit increasingly large sequencing data and to capture complex effects accurately. Machine learning models have demonstrated remarkable potential in addressing these challenges. In this study, we introduced the concept of mixed kernel functions to explore the performance of support vector machine regression (SVR) in GS. Six single kernel functions (SVR_L, SVR_C, SVR_G, SVR_P, SVR_S, SVR_L) and four mixed kernel functions (SVR_GS, SVR_GP, SVR_LS, SVR_LP) were used to predict genome breeding values. Prediction accuracy, mean squared error (MSE), and mean absolute error (MAE) were used as evaluation indicators for comparison with two traditional parametric models (GBLUP, BayesB) and two popular machine learning models (RF, KcRR). The results indicate that, in most cases, the mixed kernel function models significantly outperform GBLUP, BayesB, and the single kernel function models. For instance, for T1 in the pig dataset, the predictive accuracy of SVR_GS is improved by 10% compared to GBLUP, and by approximately 4.4% and 18.6% compared to SVR_G and SVR_S, respectively. For E1 in the wheat dataset, SVR_GS achieves 13.3% higher prediction accuracy than GBLUP. Among single kernel functions, the Laplacian and Gaussian kernel functions yield similar results, with the Gaussian kernel function performing better. The mixed kernel function notably reduces the MSE and MAE when compared to all single kernel functions. Furthermore, regarding runtime, the SVR_GS and SVR_GP mixed kernel functions run approximately three times faster than GBLUP in the pig dataset, with only a slight increase in runtime compared to the single kernel function model. In summary, the mixed kernel function model of SVR is competitive in both speed and accuracy, and models such as SVR_GS have important application potential for GS.
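The abstract above does not say how the mixed kernels are constructed. One common construction is a convex combination of two base kernels, sketched below for a Gaussian plus sigmoid pair. Reading SVR_GS as "Gaussian + sigmoid", the mixing weight `lam`, and the hyperparameters `gamma`, `alpha`, and `c` are all assumptions made for illustration, not values from the study.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, alpha=0.01, c=0.0):
    """Sigmoid kernel: tanh(alpha * <x, y> + c)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(alpha * dot + c)


def mixed_kernel(x, y, lam=0.7):
    """Convex combination of the two base kernels (assumed construction)."""
    return lam * gaussian_kernel(x, y) + (1.0 - lam) * sigmoid_kernel(x, y)


def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of genotype vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    # Toy genotype vectors coded as 0/1/2 allele counts.
    genotypes = [[0.0, 1.0, 2.0], [1.0, 1.0, 0.0], [2.0, 0.0, 1.0]]
    for row in gram_matrix(genotypes, mixed_kernel):
        print([round(value, 4) for value in row])
```

A Gram matrix built this way could be handed to an SVR implementation that accepts precomputed kernels. Note that the sigmoid kernel is not guaranteed to be positive semidefinite, so the choice of `alpha` and `c` matters in practice.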
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2024-00406320) and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) Innovative Human Resource Development for Local Intellectualization Program grant funded by the Korea government (MSIT) (IITP-2026-RS-2023-00259678).
Abstract: Domain adaptation aims to reduce the distribution gap between the training data (source domain) and the target data. This enables effective predictions even for domains not seen during training. However, most conventional domain adaptation methods assume a single source domain, making them less suitable for modern deep learning settings that rely on diverse and large-scale datasets. To address this limitation, recent research has focused on Multi-Source Domain Adaptation (MSDA), which aims to learn effectively from multiple source domains. In this paper, we propose Efficient Domain Transition for Multi-source (EDTM), a novel and efficient framework designed to tackle two major challenges in existing MSDA approaches: (1) integrating knowledge across different source domains and (2) aligning label distributions between source and target domains. EDTM leverages an ensemble-based classifier expert mechanism to enhance the contribution of source domains that are more similar to the target domain. To further stabilize the learning process and improve performance, we incorporate imitation learning into the training of the target model. In addition, Maximum Classifier Discrepancy (MCD) is employed to align class-wise label distributions between the source and target domains. Experiments were conducted using Digits-Five, one of the most representative benchmark datasets for MSDA. The results show that EDTM consistently outperforms existing methods in terms of average classification accuracy. Notably, EDTM achieved significantly higher performance on target domains such as the Modified National Institute of Standards and Technology with blended background images (MNIST-M) and Street View House Numbers (SVHN) datasets, demonstrating enhanced generalization compared to baseline approaches. Furthermore, an ablation study analyzing the contribution of each loss component validated the effectiveness of the framework, highlighting the importance of each module in achieving optimal performance.
Funding: Supported by the National Natural Science Foundation of China (project name: Research on Robust Adaptive Allocation Mechanism of Human Machine Co-Driving System Based on NMS Features; project approval number: 52172381).
Abstract: To address the issue of scarce labeled samples and operational condition variations that degrade the accuracy of fault diagnosis models in variable-condition gearbox fault diagnosis, this paper proposes a semi-supervised masked contrastive learning and domain adaptation (SSMCL-DA) method for gearbox fault diagnosis under variable conditions. Initially, during the unsupervised pre-training phase, a dual signal augmentation strategy is devised, which simultaneously applies random masking in the time domain and random scaling in the frequency domain to unlabeled samples, thereby constructing more challenging positive sample pairs to guide the encoder in learning intrinsic features robust to condition variations. Subsequently, a ConvNeXt-Transformer hybrid architecture is employed, integrating the superior local detail modeling capacity of ConvNeXt with the robust global perception capability of Transformer to enhance feature extraction in complex scenarios. Thereafter, a contrastive learning model is constructed with the optimization objective of maximizing feature similarity across different masked instances of the same sample, enabling the extraction of consistent features from multiple masked perspectives and reducing reliance on labeled data. In the final supervised fine-tuning phase, a multi-scale attention mechanism is incorporated for feature rectification, and a domain adaptation module combining Local Maximum Mean Discrepancy (LMMD) with adversarial learning is proposed. This module embodies a dual mechanism: LMMD facilitates fine-grained class-conditional alignment, compelling features of identical fault classes to converge across varying conditions, while the domain discriminator utilizes adversarial training to guide the feature extractor toward learning domain-invariant features. Working in concert, they markedly diminish feature distribution discrepancies induced by changes in load, rotational speed, and other factors, thereby boosting the model’s adaptability to cross-condition scenarios. Experimental evaluations on the WT planetary gearbox dataset and the Case Western Reserve University (CWRU) bearing dataset demonstrate that the SSMCL-DA model effectively identifies multiple fault classes in gearboxes, with diagnostic performance substantially surpassing that of conventional methods. Under cross-condition scenarios, the model attains fault diagnosis accuracies of 99.21% for the WT planetary gearbox and 99.86% for the bearings, respectively. Furthermore, the model exhibits stable generalization capability in cross-device settings.
Funding: Supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia, Grant No. KFU253765.
Abstract: Most predictive maintenance studies have emphasized accuracy but paid very little attention to interpretability or deployment readiness. This study improves on prior methods by developing a small yet robust system that can predict when turbofan engines will fail. It uses the NASA CMAPSS dataset, which has over 200,000 engine cycles from 260 engines. The process begins with systematic preprocessing, which includes imputation, outlier removal, scaling, and labelling of the remaining useful life. Dimensionality is reduced using a hybrid selection method that combines variance filtering, recursive elimination, and gradient-boosted importance scores, yielding a stable set of 10 informative sensors. To mitigate class imbalance, minority cases are oversampled, and class-weighted losses are applied during training. Benchmarking is carried out with logistic regression, gradient boosting, and a recurrent design that integrates gated recurrent units with long short-term memory networks. The Long Short-Term Memory–Gated Recurrent Unit (LSTM–GRU) hybrid achieved the strongest performance, with an F1 score of 0.92, precision of 0.93, recall of 0.91, Receiver Operating Characteristic–Area Under the Curve (ROC-AUC) of 0.97, and minority recall of 0.75. Interpretability testing using permutation importance and Shapley values indicates that sensors 13, 15, and 11 are the most important indicators of engine wear. The proposed system combines imbalance handling, feature reduction, and interpretability into a practical design suitable for real industrial settings.
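The remaining-useful-life labelling step mentioned in the abstract above can be illustrated with a small sketch. It assumes run-to-failure records, so the last observed cycle of each engine is treated as its failure point, and it derives a binary "near failure" class from a hypothetical `failure_window` threshold. The abstract gives neither the record layout nor the threshold, so both are illustrative assumptions.

```python
from collections import defaultdict


def label_rul(records, failure_window=30):
    """Attach remaining useful life (RUL) and a binary class to each record.

    records: iterable of (engine_id, cycle) pairs from run-to-failure data.
    failure_window: hypothetical threshold; RUL <= window is labelled 1.
    """
    records = list(records)

    # The last observed cycle of each engine is taken as its failure cycle.
    last_cycle = defaultdict(int)
    for engine_id, cycle in records:
        last_cycle[engine_id] = max(last_cycle[engine_id], cycle)

    labelled = []
    for engine_id, cycle in records:
        rul = last_cycle[engine_id] - cycle
        near_failure = int(rul <= failure_window)
        labelled.append((engine_id, cycle, rul, near_failure))
    return labelled


if __name__ == "__main__":
    toy = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
    for row in label_rul(toy, failure_window=1):
        print(row)
```

With long engine histories and a short window, the positive class is rare, which is the kind of imbalance that oversampling and class-weighted losses are meant to offset.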
Funding: Supported by the National Natural Science Foundation of China (62106092), the Natural Science Foundation of Fujian Province (2024J01822, 2024J01820, 2022J01916), and the Natural Science Foundation of Zhangzhou City (ZZ2024J28).
Abstract: Feature selection serves as a critical preprocessing step in machine learning, focusing on identifying and preserving the most relevant features to improve the efficiency and performance of classification algorithms. Particle Swarm Optimization has demonstrated significant potential in addressing feature selection challenges. However, inherent limitations of Particle Swarm Optimization, such as the delicate balance between exploration and exploitation, susceptibility to local optima, and suboptimal convergence rates, hinder its performance. To tackle these issues, this study introduces a novel Leveraged Opposition-Based Learning method within Fitness Landscape Particle Swarm Optimization, tailored for wrapper-based feature selection. The proposed approach integrates: (1) a fitness-landscape adaptive strategy to dynamically balance exploration and exploitation, (2) the lever principle within Opposition-Based Learning to improve search efficiency, and (3) a Local Selection and Re-optimization mechanism combined with random perturbation to expedite convergence and enhance the quality of the optimal feature subset. The effectiveness of the proposed method is rigorously evaluated on 24 benchmark datasets and compared against 13 advanced metaheuristic algorithms. Experimental results demonstrate that the proposed method outperforms the compared algorithms in classification accuracy on over half of the datasets, while also significantly reducing the number of selected features. These findings demonstrate its effectiveness and robustness in feature selection tasks.
Funding: Supported by the Major Science and Technology Programs in Henan Province (No. 241100210100); the Henan Provincial Science and Technology Research Project (No. 252102211085, No. 252102211105); the Endogenous Security Cloud Network Convergence R&D Center (No. 602431011PQ1); the Special Project for Research and Development in Key Areas of Guangdong Province (No. 2021ZDZX1098); the Stabilization Support Program of the Science, Technology and Innovation Commission of Shenzhen Municipality (No. 20231128083944001); and the Key Scientific Research Projects of Henan Higher Education Institutions (No. 24A520042).
Abstract: Existing feature selection methods for intrusion detection systems (IDS) in the Industrial Internet of Things often suffer from local optimality and high computational complexity. These challenges hinder traditional IDS from effectively extracting features while maintaining detection accuracy. This paper proposes a feature selection algorithm for Industrial Internet of Things intrusion detection based on an improved whale optimization algorithm (GSLDWOA). The aim is to address the problems to which feature selection algorithms are prone under high-dimensional data, such as local optimality, long detection time, and reduced accuracy. First, the diversity of the initial population is increased using the Gaussian Mutation mechanism. Then, a Non-linear Shrinking Factor balances global exploration and local exploitation, avoiding premature convergence. Lastly, a Variable-step Levy Flight operator and a Dynamic Differential Evolution strategy are introduced to improve the algorithm’s search efficiency and convergence accuracy in high-dimensional feature space. Experiments on the NSL-KDD and WUSTL-IIoT-2021 datasets demonstrate that the feature subset selected by GSLDWOA significantly improves detection performance. Compared to the traditional WOA algorithm, the detection rate and F1-score increased by 3.68% and 4.12%, respectively. On the WUSTL-IIoT-2021 dataset, accuracy, recall, and F1-score all exceed 99.9%.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52375255, 51935007) and the Shanghai Rising-Star Program (Grant No. 24QB2705000).
Abstract: Existing elevator fault diagnosis algorithms have limited engineering applicability due to variations in working conditions and differences in equipment structures. To address this limitation, this study proposes an unsupervised subdomain adaptation method based on a time-frequency feature attention mechanism, LMMD-based subdomain alignment, and contrastive local alignment. This enables the application of the diagnosis model across different working conditions and equipment types. First, a novel time-frequency feature attention mechanism assigns weights to vibration signals of varying dimensions. Second, the time series is transformed to obtain a three-channel time-frequency diagram. This diagram is input into the proposed dimension-segmentation cross-channel multihead self-attention framework to extract high-dimensional frequency-domain fault features. These features are concatenated with the time-domain features to obtain a global feature representation. Then, the extracted high-dimensional features are sent to the classification module to obtain the predicted labels for the source and target domains. Finally, after confidence filtering, the true labels from the source domain and the predicted labels from the target domain are fed into a dynamically weighted multilevel feature alignment module to promote proximity between similar fault features across domains while enhancing separation among different fault types. The validity and superiority of the proposed method were demonstrated through simulation experiments conducted on two types of manned escalator systems under multiple working conditions. For the most challenging transfer task, the proposed method achieved higher accuracy on the target domain test set than DANN, ADDA, C-CLCN, TFA-CCN, and TFA-LCN by 26.87%, 24.72%, 11.44%, 28.94%, and 16.85%, respectively.
Abstract: Unconfined Compressive Strength (UCS) is a key parameter for assessing the stability and performance of stabilized soils, yet traditional laboratory testing is both time- and resource-intensive. In this study, an interpretable machine learning approach to UCS prediction is presented, pairing five models (Random Forest (RF), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), CatBoost, and K-Nearest Neighbors (KNN)) with SHapley Additive exPlanations (SHAP) for enhanced interpretability and to guide feature removal. A complete dataset of 12 geotechnical and chemical parameters, including Atterberg limits, compaction properties, stabilizer chemistry, dosage, and curing time, was used to train and test the models. R², RMSE, MSE, and MAE were used to assess performance. Initial results with all 12 features indicated that boosting-based models (GB, XGB, CatBoost) exhibited the highest predictive accuracy (R² = 0.93) with satisfactory generalization on test data, followed by RF and KNN. SHAP analysis consistently picked CaO content, curing time, stabilizer dosage, and compaction parameters as the most important features, aligning with established soil stabilization mechanisms. Models were then re-trained on the top 8 and top 5 SHAP-ranked features. Interestingly, GB, XGB, and CatBoost maintained comparable accuracy with reduced input sets, while RF was moderately sensitive and KNN was somewhat better owing to reduced dimensionality. The findings confirm that feature reduction through SHAP enables cost-effective UCS prediction by reducing laboratory test requirements without significant accuracy loss. The suggested hybrid approach offers an explainable, interpretable, and cost-effective tool for geotechnical engineering practice.
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. (DGSSR-2024-02-01264).
Abstract: Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language’s complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES by combining text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, an optimal data efficiency training approach was introduced, where training data portions increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
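The two headline metrics in the abstract above, R² and accuracy within a 0.5-point tolerance, can be made concrete with a short sketch. The function names and the toy scores are illustrative assumptions, and the study's exact definitions may differ (for example, in how boundary cases are counted).

```python
def tolerance_accuracy(predicted, actual, tolerance=0.5):
    """Fraction of predictions within `tolerance` points of the true score."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


def r_squared(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true scores are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical essay scores, for illustration only.
    predicted_scores = [3.0, 4.5, 2.0, 5.0]
    true_scores = [3.5, 4.0, 3.0, 5.0]
    print(tolerance_accuracy(predicted_scores, true_scores))
    print(r_squared(predicted_scores, true_scores))
```

Tolerance accuracy rewards predictions that land close to the human score even when they are not exact, which suits essay grades given in half-point steps.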
Funding: Funded by the Huaiyin Institute of Technology, Institute of Smart Energy.
Abstract: In the quest to enhance energy efficiency and reduce environmental impact in the transportation sector, the recovery of waste heat from diesel engines has become a critical area of focus. This study provides a thermodynamic analysis for optimizing Organic Rankine Cycle (ORC) systems for waste heat recovery from diesel engines. It assessed the performance of five candidate working fluids (R11, R123, R113, R245fa, and R141b) under a range of operating conditions, specifically varying overheat temperatures and evaporation pressures. The results indicated that the choice of working fluid substantially influences the system's exergetic efficiency, net output power, and thermal efficiency. R245fa delivered a net output power of 30.39 kW at high overheat conditions, outperforming R11, which is significant for high-temperature waste heat recovery. At lower temperatures, R11 and R113 demonstrated higher exergetic efficiencies, with R11 reaching a peak exergetic efficiency of 7.4% at an evaporation pressure of 10 bar and an overheat of 10 °C. The study also revealed that controlling the overheat and optimizing the evaporation pressure are crucial for enhancing the net output power of the ORC system. Specifically, at an evaporation pressure of 30 bar and an overheat of 0 °C, R113 exhibited the lowest exergetic destruction of 544.5 kJ/kg, making it a suitable choice for minimizing irreversible losses. These findings help explain the performance of ORC systems in waste heat recovery applications and offer valuable insights for the design and operation of more efficient and environmentally friendly diesel engine systems.
Abstract: Multi-label feature selection (MFS) is a crucial dimensionality reduction technique aimed at identifying informative features associated with multiple labels. However, traditional centralized methods face significant challenges in privacy-sensitive and distributed settings, often neglecting label dependencies and suffering from low computational efficiency. To address these issues, we introduce a novel framework, Fed-MFSDHBCPSO: federated MFS via a dual-layer hybrid breeding cooperative particle swarm optimization algorithm with manifold and sparsity regularization (DHBCPSO-MSR). Leveraging the federated learning paradigm, Fed-MFSDHBCPSO allows clients to perform local feature selection (FS) using DHBCPSO-MSR. Locally selected feature subsets are encrypted with differential privacy (DP) and transmitted to a central server, where they are securely aggregated and refined through secure multi-party computation (SMPC) until global convergence is achieved. Within each client, DHBCPSO-MSR employs a dual-layer FS strategy. The inner layer constructs sample and label similarity graphs, generates Laplacian matrices to capture the manifold structure between samples and labels, and applies L2,1-norm regularization to sparsify the feature subset, yielding an optimized feature weight matrix. The outer layer uses a hybrid breeding cooperative particle swarm optimization algorithm to further refine the feature weight matrix and identify the optimal feature subset. The updated weight matrix is then fed back to the inner layer for further optimization. Comprehensive experiments on multiple real-world multi-label datasets demonstrate that Fed-MFSDHBCPSO consistently outperforms both centralized and federated baseline methods across several key evaluation metrics.
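To make the L2,1-norm regularizer mentioned in this abstract concrete, the sketch below computes it for a small feature weight matrix, using the standard definition: the sum of the Euclidean norms of the rows. It assumes rows correspond to features and columns to labels, which is the usual layout but is not stated in the abstract. The matrix values are invented.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of W (one row per feature, by assumption)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """L2,1-norm: the sum of the row-wise Euclidean norms."""
    return sum(row_norms(W))


# Invented weight matrix: 3 features x 2 labels.
W = [
    [3.0, 4.0],  # row norm 5: feature matters for both labels
    [0.0, 0.0],  # row norm 0: feature is effectively dropped
    [1.0, 0.0],  # row norm 1
]
norms = row_norms(W)
penalty = l21_norm(W)
print(norms, penalty)
```

Because the penalty acts on whole rows, minimizing it drives entire rows toward zero, which is why row norms are commonly used as per-feature importance scores.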
Funding: Supported by the Fundamental Research Program of Shanxi Province (Nos. 202203021211088, 202403021212254, 202403021221109) and the Graduate Research Innovation Project in Shanxi Province (No. 2024KY616).
Abstract: Data collected in fields such as cybersecurity and biomedicine are often high-dimensional and class-imbalanced. To address the low classification accuracy for minority-class samples caused by numerous irrelevant and redundant features in high-dimensional imbalanced data, we propose a novel feature selection method named AMF-SGSK, based on adaptive multi-filter and subspace-based gaining sharing knowledge. First, a balanced dataset is obtained by random under-sampling. Second, by combining the feature importance score with the AUC score for each filter method, we propose a concept called feature hardness to judge the importance of each feature, which allows the essential features to be selected adaptively. Finally, the optimal feature subset is obtained by gaining sharing knowledge in multiple subspaces. This approach effectively achieves dimensionality reduction for high-dimensional imbalanced data. Experimental results on 30 benchmark imbalanced datasets showed that AMF-SGSK performed better than eight other commonly used algorithms, including BGWO and IG-SSO, in terms of F1-score, AUC, and G-mean. The mean values of F1-score, AUC, and G-mean for AMF-SGSK are 0.950, 0.967, and 0.965, respectively, the highest among all algorithms. Its mean G-mean is higher than those of IG-PSO, ReliefF-GWO, and BGOA by 3.72%, 11.12%, and 20.06%, respectively. Furthermore, the selected feature ratio is below 0.01 across the ten selected datasets, further demonstrating the proposed method's overall superiority over competing approaches. AMF-SGSK can adaptively remove irrelevant and redundant features and effectively improve the classification accuracy of high-dimensional imbalanced data, providing a reference for practical applications.
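For readers unfamiliar with the imbalance-aware metrics named in this abstract, the sketch below computes F1-score and G-mean from binary confusion-matrix counts using their standard definitions. The counts are invented and have no connection to the reported results. The helper names are illustrative.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive (minority) class.

    Assumes tp + fp > 0 and tp + fn > 0; no zero-division guard in this sketch.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Invented counts: 10 minority-class and 90 majority-class samples.
tp, fn, tn, fp = 8, 2, 81, 9

f1 = f1_score(tp, fp, fn)
gm = g_mean(tp, fn, tn, fp)
print(f"F1 = {f1:.4f}, G-mean = {gm:.4f}")
```

G-mean drops to zero when either class is never predicted correctly, which is why it is often preferred over plain accuracy on imbalanced data.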
Funding: Supported by the Jilin Science and Technology Development Program, China (20240602032RC); the Jilin Agricultural Science and Technology Innovation Project, China (CXGC2024ZD001); the Jilin Agricultural Science and Technology Innovation Project, China (CXGC2024ZY012); and the Jilin Province Development and Reform Commission Project for Improving the Independent Innovation Capacity of Major Grain Crops, China (2024C002).
Abstract: Emerging and powerful genome editing tools, particularly CRISPR/Cas9, are facilitating functional genomics research and accelerating crop improvement (Jiang et al. 2021; Cao et al. 2023; Chen C et al. 2023; Liu et al. 2023a). However, the detection and screening of transgenic lines remain major bottlenecks, being time-consuming, labor-intensive, and inefficient during transformation and subsequent mutation identification. A simple and efficient visual marker system plays a critical role in addressing these challenges. Recent studies used the GmW1 and RUBY reporter systems to obtain visually identifiable transgenic soybean (Glycine max) plants (Chen L et al. 2023; Chen et al. 2024).