Gastrointestinal hemangioma (GIH) is clinically rare, accounting for 7%-10% of benign gastrointestinal tumors and 0.5% of systemic hemangiomas. GIH can occur as either solitary or multiple lesions, with gastrointestinal bleeding as a significant clinical manifestation. Understanding the clinical and endoscopic features of GIH is essential for improving diagnostic accuracy, particularly through endoscopy and selective arteriography, which are highly effective in diagnosing GIH and preventing misdiagnosis and inappropriate treatment. Upon confirmed diagnosis, it is essential to thoroughly evaluate the patient's condition to determine the most suitable treatment modality, whether surgical, endoscopic, or minimally invasive intervention. The minimally invasive interventional partial embolization therapy using polyvinyl alcohol particles, proposed and implemented by Pospisilova et al, has achieved excellent clinical outcomes. This approach reduces surgical trauma and the inherent risks of traditional surgical treatments.
The Financial Technology (FinTech) sector has witnessed rapid growth, resulting in increasingly complex and high-volume digital transactions. Although this expansion improves efficiency and accessibility, it also introduces significant vulnerabilities, including fraud, money laundering, and market manipulation. Traditional anomaly detection techniques often fail to capture the relational and dynamic characteristics of financial data. Graph Neural Networks (GNNs), capable of modeling intricate interdependencies among entities, have emerged as a powerful framework for detecting subtle and sophisticated anomalies. However, the high dimensionality and inherent noise of FinTech datasets demand robust feature selection strategies to improve model scalability, performance, and interpretability. This paper presents a comprehensive survey of GNN-based approaches for anomaly detection in FinTech, with an emphasis on the synergistic role of feature selection. We examine the theoretical foundations of GNNs, review state-of-the-art feature selection techniques, analyze their integration with GNNs, and categorize prevalent anomaly types in FinTech applications. In addition, we discuss practical implementation challenges, highlight representative case studies, and propose future research directions to advance the field of graph-based anomaly detection in financial systems.
With the increasing complexity of vehicular networks and the proliferation of connected vehicles, Federated Learning (FL) has emerged as a critical framework for decentralized model training while preserving data privacy. However, efficient client selection and adaptive weight allocation in heterogeneous and non-IID environments remain challenging. To address these issues, we propose Federated Learning with Client Selection and Adaptive Weighting (FedCW), a novel algorithm that leverages adaptive client selection and dynamic weight allocation for optimizing model convergence in real-time vehicular networks. FedCW selects clients based on their Euclidean distance from the global model and dynamically adjusts aggregation weights to optimize both data diversity and model convergence. Experimental results show that FedCW significantly outperforms existing FL algorithms such as FedAvg, FedProx, and SCAFFOLD, particularly in non-IID settings, achieving faster convergence, higher accuracy, and reduced communication overhead. These findings demonstrate that FedCW provides an effective solution for enhancing the performance of FL in heterogeneous, edge-based computing environments.
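The abstract states only that FedCW selects clients by Euclidean distance from the global model and adjusts aggregation weights dynamically. The following is a minimal sketch of that idea, not the authors' algorithm: the function names, the choice to prefer the closest clients, and the inverse-distance weighting rule `1 / (1 + distance)` are all assumptions made for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two flat parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_and_weight(global_model, client_models, k):
    """Pick k clients and return normalized aggregation weights.

    Assumption (not from the paper): the k clients closest to the global
    model are chosen, with weights proportional to 1 / (1 + distance).
    """
    distances = {cid: euclidean_distance(global_model, params)
                 for cid, params in client_models.items()}
    chosen = sorted(distances, key=distances.get)[:k]
    raw = {cid: 1.0 / (1.0 + distances[cid]) for cid in chosen}
    total = sum(raw.values())
    return {cid: w / total for cid, w in raw.items()}


def aggregate(client_models, weights):
    """Weighted average of the selected clients' parameter vectors."""
    dim = len(next(iter(client_models.values())))
    return [sum(weights[cid] * client_models[cid][i] for cid in weights)
            for i in range(dim)]


if __name__ == "__main__":
    # Hypothetical two-parameter models for three vehicles.
    global_model = [0.0, 0.0]
    clients = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [10.0, 0.0]}
    weights = select_and_weight(global_model, clients, k=2)
    print(weights, aggregate(clients, weights))
```

Whether near or distant clients should be favored is a design choice that the abstract leaves open; a diversity-oriented variant could invert the ordering.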
High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
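The three mutual-information quantities named in this abstract (feature redundancy, feature-label relevance, inter-label dependency) can be illustrated for discrete variables as follows. This is a sketch of the standard mutual information estimate from empirical counts only; it does not reproduce the paper's regression objective or its optimizer, and the toy feature and label vectors are invented for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts.
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


if __name__ == "__main__":
    # Hypothetical binary features and labels over four samples.
    f1, f2, f3 = [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]
    l1, l2 = [0, 0, 1, 1], [0, 1, 0, 1]
    print("redundancy  I(f1; f2):", mutual_information(f1, f2))
    print("relevance   I(f1; l1):", mutual_information(f1, l1))
    print("label dep.  I(l1; l2):", mutual_information(l1, l2))
```

Here `f1` and `f2` are identical, so their redundancy equals the entropy of a fair binary variable (ln 2), while `l1` and `l2` are independent and share zero information.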
Populus species, important economic species combining rapid growth with broad ecological adaptability, play a critical role in sustainable forestry and bioenergy production. In this study, we performed whole-genome resequencing of 707 individuals from a full-sib family to develop comprehensive single nucleotide polymorphism (SNP) markers and constructed a high-density genetic linkage map of 19 linkage groups. The total genetic length of the map reached 3623.65 cM with an average marker interval of 0.34 cM. By integrating multidimensional phenotypic data, 89 quantitative trait loci (QTL) associated with growth, wood physical and chemical properties, disease resistance, and leaf morphology traits were identified, with logarithm of odds (LOD) scores ranging from 3.13 to 21.72 and phenotypic variance explained between 1.7% and 11.6%. Notably, pleiotropic analysis revealed significant colocalization hotspots on chromosomes LG1, LG5, LG6, LG8, and LG14, with epistatic interaction network analysis confirming the genetic basis of coordinated regulation across multiple traits. Functional annotation of 207 candidate genes showed that R2R3-MYB and bHLH transcription factors and pyruvate kinase-encoding genes were significantly enriched, suggesting crucial roles in lignin biosynthesis and carbon metabolic pathways. Allelic effect analysis indicated that the frequency of favorable alleles associated with target traits ranged from 0.20 to 0.55. Incorporation of QTL-derived favorable alleles as random effects into Bayesian-based genomic selection models led to an increase in prediction accuracy ranging from 1% to 21%, with Bayesian ridge regression as the best predictive model. This study provides valuable genomic resources and genetic insights for deciphering complex trait architecture and advancing molecular breeding in poplar.
The advantages of genomic selection (GS) in animal and plant breeding are well established. Traditional parametric models struggle to fit increasingly large sequencing data and to capture complex effects accurately. Machine learning models have demonstrated remarkable potential in addressing these challenges. In this study, we introduced the concept of mixed kernel functions to explore the performance of support vector machine regression (SVR) in GS. Six single kernel functions (SVR_L, SVR_C, SVR_G, SVR_P, SVR_S, SVR_L) and four mixed kernel functions (SVR_GS, SVR_GP, SVR_LS, SVR_LP) were used to predict genomic breeding values. Prediction accuracy, mean squared error (MSE), and mean absolute error (MAE) were used as evaluation indicators for comparison with two traditional parametric models (GBLUP, BayesB) and two popular machine learning models (RF, KcRR). The results indicate that in most cases the mixed kernel function models significantly outperform GBLUP, BayesB, and the single kernel functions. For instance, for T1 in the pig dataset, the predictive accuracy of SVR_GS is improved by 10% compared to GBLUP, and by approximately 4.4% and 18.6% compared to SVR_G and SVR_S, respectively. For E1 in the wheat dataset, SVR_GS achieves 13.3% higher prediction accuracy than GBLUP. Among single kernel functions, the Laplacian and Gaussian kernel functions yield similar results, with the Gaussian kernel function performing better. The mixed kernel functions notably reduce the MSE and MAE compared to all single kernel functions. Furthermore, regarding runtime, the SVR_GS and SVR_GP mixed kernel functions run approximately three times faster than GBLUP in the pig dataset, with only a slight increase in runtime compared to the single kernel function models. In summary, the mixed kernel function SVR model is competitive in both speed and accuracy, and models such as SVR_GS have important application potential for GS.
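A mixed kernel of the kind evaluated above can be sketched as a weighted sum of two base kernels. The sketch below assumes that "GS" denotes a Gaussian plus sigmoid combination and that the mix is a convex combination with weight `lam`; the abstract specifies neither the combination rule nor the hyperparameters, so `gamma`, `alpha`, `c`, and `lam` are hypothetical values.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, alpha=0.1, c=0.0):
    """Sigmoid kernel: tanh(alpha * <x, y> + c)."""
    return math.tanh(alpha * sum(a * b for a, b in zip(x, y)) + c)


def mixed_kernel(x, y, lam=0.7):
    """Assumed mix: lam * Gaussian + (1 - lam) * sigmoid."""
    return lam * gaussian_kernel(x, y) + (1.0 - lam) * sigmoid_kernel(x, y)


def gram_matrix(samples, kernel):
    """Pairwise kernel matrix over a list of samples."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    # Hypothetical genotypes coded 0/1/2 at three markers.
    genotypes = [[0, 1, 2], [0, 1, 2], [2, 1, 0]]
    for row in gram_matrix(genotypes, mixed_kernel):
        print([round(v, 4) for v in row])
```

A Gram matrix built this way could be handed to an SVR implementation that accepts precomputed kernels, for example scikit-learn's `SVR(kernel="precomputed")`.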
Most predictive maintenance studies have emphasized accuracy but paid little attention to interpretability or deployment readiness. This study improves on prior methods by developing a small yet robust system that can predict when turbofan engines will fail. It uses the NASA CMAPSS dataset, which has over 200,000 engine cycles from 260 engines. The process begins with systematic preprocessing, which includes imputation, outlier removal, scaling, and labelling of the remaining useful life. Dimensionality is reduced using a hybrid selection method that combines variance filtering, recursive elimination, and gradient-boosted importance scores, yielding a stable set of 10 informative sensors. To mitigate class imbalance, minority cases are oversampled, and class-weighted losses are applied during training. Benchmarking is carried out with logistic regression, gradient boosting, and a recurrent design that integrates gated recurrent units with long short-term memory networks. The Long Short-Term Memory-Gated Recurrent Unit (LSTM-GRU) hybrid achieved the strongest performance, with an F1 score of 0.92, precision of 0.93, recall of 0.91, Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) of 0.97, and minority recall of 0.75. Interpretability testing using permutation importance and Shapley values indicates that sensors 13, 15, and 11 are the most important indicators of engine wear. The proposed system combines imbalance handling, feature reduction, and interpretability into a practical design suitable for real industrial settings.
Feature selection serves as a critical preprocessing step in machine learning, focusing on identifying and preserving the most relevant features to improve the efficiency and performance of classification algorithms. Particle Swarm Optimization has demonstrated significant potential in addressing feature selection challenges. However, inherent limitations of Particle Swarm Optimization, such as the delicate balance between exploration and exploitation, susceptibility to local optima, and suboptimal convergence rates, hinder its performance. To tackle these issues, this study introduces a novel Leveraged Opposition-Based Learning method within Fitness Landscape Particle Swarm Optimization, tailored for wrapper-based feature selection. The proposed approach integrates: (1) a fitness-landscape adaptive strategy to dynamically balance exploration and exploitation, (2) the lever principle within Opposition-Based Learning to improve search efficiency, and (3) a Local Selection and Re-optimization mechanism combined with random perturbation to expedite convergence and enhance the quality of the optimal feature subset. The effectiveness of the proposed method is rigorously evaluated on 24 benchmark datasets and compared against 13 advanced metaheuristic algorithms. Experimental results demonstrate that the proposed method outperforms the compared algorithms in classification accuracy on over half of the datasets, while also significantly reducing the number of selected features. These findings demonstrate its effectiveness and robustness in feature selection tasks.
Existing feature selection methods for intrusion detection systems (IDS) in the Industrial Internet of Things often suffer from local optimality and high computational complexity. These challenges hinder traditional IDS from effectively extracting features while maintaining detection accuracy. This paper proposes an Industrial Internet of Things intrusion detection feature selection algorithm based on an improved whale optimization algorithm (GSLDWOA). The aim is to address the problems to which feature selection algorithms are prone under high-dimensional data, such as local optimality, long detection time, and reduced accuracy. First, the initial population's diversity is increased using the Gaussian Mutation mechanism. Then, a Non-linear Shrinking Factor balances global exploration and local exploitation, avoiding premature convergence. Lastly, a Variable-step Levy Flight operator and a Dynamic Differential Evolution strategy are introduced to improve the algorithm's search efficiency and convergence accuracy in high-dimensional feature space. Experiments on the NSL-KDD and WUSTL-IIoT-2021 datasets demonstrate that the feature subset selected by GSLDWOA significantly improves detection performance. Compared to the traditional WOA algorithm, the detection rate and F1-score increased by 3.68% and 4.12%, respectively. On the WUSTL-IIoT-2021 dataset, accuracy, recall, and F1-score all exceed 99.9%.
Unconfined Compressive Strength (UCS) is a key parameter for assessing the stability and performance of stabilized soils, yet traditional laboratory testing is both time and resource intensive. In this study, an interpretable machine learning approach to UCS prediction is presented, pairing five models (Random Forest (RF), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), CatBoost, and K-Nearest Neighbors (KNN)) with SHapley Additive exPlanations (SHAP) for enhanced interpretability and to guide feature removal. A complete dataset of 12 geotechnical and chemical parameters, including Atterberg limits, compaction properties, stabilizer chemistry, dosage, and curing time, was used to train and test the models. R², RMSE, MSE, and MAE were used to assess performance. Initial results with all 12 features indicated that boosting-based models (GB, XGB, CatBoost) exhibited the highest predictive accuracy (R² = 0.93) with satisfactory generalization on test data, followed by RF and KNN. SHAP analysis consistently picked CaO content, curing time, stabilizer dosage, and compaction parameters as the most important features, aligning with established soil stabilization mechanisms. Models were then re-trained on the top 8 and top 5 SHAP-ranked features. Interestingly, GB, XGB, and CatBoost maintained comparable accuracy with reduced input sets, while RF was moderately sensitive and KNN improved somewhat owing to reduced dimensionality. The findings confirm that feature reduction through SHAP enables cost-effective UCS prediction by reducing laboratory test requirements without significant accuracy loss. The suggested hybrid approach offers an explainable, interpretable, and cost-effective tool for geotechnical engineering practice.
Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language's complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES that combines text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, a data-efficient training approach was introduced, in which the training data portion increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
In the quest to enhance energy efficiency and reduce environmental impact in the transportation sector, the recovery of waste heat from diesel engines has become a critical area of focus. This study provided an exhaustive thermodynamic analysis optimizing Organic Rankine Cycle (ORC) systems for waste heat recovery from diesel engines. The study assessed the performance of five candidate working fluids (R11, R123, R113, R245fa, and R141b) under a range of operating conditions, specifically varying overheat temperatures and evaporation pressures. The results indicated that the choice of working fluid substantially influences the system's exergetic efficiency, net output power, and thermal efficiency. R245fa showed an outstanding net output power of 30.39 kW at high overheat conditions, outperforming R11, which is significant for high-temperature waste heat recovery. At lower temperatures, R11 and R113 demonstrated higher exergetic efficiencies, with R11 reaching a peak exergetic efficiency of 7.4% at an evaporation pressure of 10 bar and an overheat of 10 °C. The study also revealed that controlling the overheat and optimizing the evaporation pressure are crucial for enhancing the net output power of the ORC system. Specifically, at an evaporation pressure of 30 bar and an overheat of 0 °C, R113 exhibited the lowest exergetic destruction of 544.5 kJ/kg, making it a suitable choice for minimizing irreversible losses. These findings are instrumental for understanding the performance of ORC systems in waste heat recovery applications and offer valuable insights for the design and operation of more efficient and environmentally friendly diesel engine systems.
Multi-label feature selection (MFS) is a crucial dimensionality reduction technique aimed at identifying informative features associated with multiple labels. However, traditional centralized methods face significant challenges in privacy-sensitive and distributed settings, often neglecting label dependencies and suffering from low computational efficiency. To address these issues, we introduce a novel framework, Fed-MFSDHBCPSO: federated MFS via a dual-layer hybrid breeding cooperative particle swarm optimization algorithm with manifold and sparsity regularization (DHBCPSO-MSR). Leveraging the federated learning paradigm, Fed-MFSDHBCPSO allows clients to perform local feature selection (FS) using DHBCPSO-MSR. Locally selected feature subsets are encrypted with differential privacy (DP) and transmitted to a central server, where they are securely aggregated and refined through secure multi-party computation (SMPC) until global convergence is achieved. Within each client, DHBCPSO-MSR employs a dual-layer FS strategy. The inner layer constructs sample and label similarity graphs, generates Laplacian matrices to capture the manifold structure between samples and labels, and applies L2,1-norm regularization to sparsify the feature subset, yielding an optimized feature weight matrix. The outer layer uses a hybrid breeding cooperative particle swarm optimization algorithm to further refine the feature weight matrix and identify the optimal feature subset. The updated weight matrix is then fed back to the inner layer for further optimization. Comprehensive experiments on multiple real-world multi-label datasets demonstrate that Fed-MFSDHBCPSO consistently outperforms both centralized and federated baseline methods across several key evaluation metrics.
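The L2,1-norm mentioned in the inner layer is the sum of the Euclidean norms of the rows of the feature weight matrix, so penalizing it drives whole rows (features) toward zero. The sketch below shows only that norm and a row-norm ranking, assuming rows index features and columns index labels; the helper names and the toy matrix are hypothetical, and nothing here reproduces DHBCPSO-MSR itself.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of a weight matrix (list of lists)."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]


def l21_norm(W):
    """L2,1-norm: the sum of the row-wise Euclidean norms."""
    return sum(row_norms(W))


def rank_features(W):
    """Feature indices ordered from largest to smallest row norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical weights: 3 features (rows) by 2 labels (columns).
    W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
    print("L2,1-norm:", l21_norm(W))
    print("feature ranking:", rank_features(W))
```

A row that is entirely zero, such as the second feature here, contributes nothing to any label and would be the first to be dropped.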
Emerging and powerful genome editing tools, particularly CRISPR/Cas9, are facilitating functional genomics research and accelerating crop improvement (Jiang et al. 2021; Cao et al. 2023; Chen C et al. 2023; Liu et al. 2023a). However, the detection and screening of transgenic lines remain major bottlenecks, being time-consuming, labor-intensive, and inefficient during transformation and subsequent mutation identification. A simple and efficient visual marker system plays a critical role in addressing these challenges. Recent studies demonstrated that the GmW1 and RUBY reporter systems were used to obtain visual transgenic soybean (Glycine max) plants (Chen L et al. 2023; Chen et al. 2024).
Massive MIMO is one of the key technologies in future 5G communications, as it can satisfy the requirements of high speed and large capacity. This paper considers antenna selection and power allocation design to promote energy conservation and provide good quality of service (QoS) for the whole massive MIMO uplink network. Unlike previous related works, hardware impairment, transmission efficiency, and energy consumption at the circuit and antennas are taken into account. To ensure QoS, we consider minimum rate constraints for each user and for the system, which increases the complexity of the power allocation problem of maximizing energy and spectral efficiency in a massive MIMO system. To this end, a quantum-inspired social emotional optimization (QSEO) algorithm is proposed to obtain the optimal power control strategy in massive MIMO uplink networks. Simulation results demonstrate the advantages of QSEO over previous strategies.
In 5G systems, massive multiple-input multiple-output (MIMO) has been adopted in base stations (BSs) to improve spectral efficiency and coverage. Traditional conducted performance test techniques are challenging due to the unaffordable cost and high complexity of testing a large number of antennas. To solve this problem, the over-the-air (OTA) test has been proposed, in which probe selection is the key to reducing the number of channel emulators and probes. In this paper, a novel artificial bee colony (ABC) algorithm is introduced to enhance the efficiency and accuracy of the probe selection procedure. A sectoring-based multi-probe anechoic chamber (MPAC) is built to evaluate the throughput performance of massive MIMO deployed in 5G BSs. In addition, link-level simulation is carried out to evaluate the proposal's performance gain under commercial network assumptions, where the average throughput at three velocities is given for different SNR regions. The results suggest that the OTA chamber and multi-probe wall are suitable not only for 5G BSs, but also for user equipment (UE) with end-to-end communication.
The Pacific oyster, Crassostrea gigas, naturally distributed along the coast of the northwest Pacific, is one of the most important bivalve species due to its high economic value and fecundity. In China, we initiated a selective breeding program for both shell color and growth rate of C. gigas in 2010. A black shell line was obtained through four generations of family selection. In this study, mass selection for growth improvement was conducted in the sixth and seventh generations of the black shell line. To assess the progress of genetic improvement, the progeny of the two generations were evaluated for shell height in a 450-day farming experiment. After growing for 450 days, the sixth and seventh generations of the selected lines were 9.03% and 11.42% larger than the control lines, respectively. During the grow-out stage, the genetic gain of the two generations was 8.82%±0.18% and 11.54%±0.43%, respectively, and the corresponding realized heritability was 0.45±0.04 and 0.41±0.04, respectively. These results indicate that mass selection for shell height achieved steady progress over the two generations of C. gigas.
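The genetic gain and realized heritability reported above are conventionally derived from the selection response. The sketch below uses the textbook definitions, which the abstract does not spell out: gain as the percentage difference between selected-line and control-line progeny means, and realized heritability as response divided by selection differential (the breeder's equation, R = h²S). The shell-height values are invented for illustration and are not the study's data.

```python
def genetic_gain_percent(selected_mean, control_mean):
    """Percentage by which selected-line progeny exceed the control line."""
    return (selected_mean - control_mean) / control_mean * 100.0


def realized_heritability(response, selection_differential):
    """Breeder's equation rearranged: h^2 = R / S.

    response: progeny mean of the selected line minus the control mean.
    selection_differential: mean of the selected parents minus the mean
    of the population they were chosen from.
    """
    return response / selection_differential


if __name__ == "__main__":
    # Hypothetical shell heights in millimetres.
    control_progeny, selected_progeny = 50.0, 55.0
    base_population, selected_parents = 48.0, 60.0
    R = selected_progeny - control_progeny
    S = selected_parents - base_population
    print("genetic gain (%):", genetic_gain_percent(selected_progeny, control_progeny))
    print("realized heritability:", realized_heritability(R, S))
```

Some studies standardize the response by the selection intensity and the control line's standard deviation instead; which form the authors used is not stated here.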
Rice with low glutelin content is suitable as a functional food for patients with diabetes and kidney failure. Fine mapping of the gene(s) responsible for low glutelin content will provide information on the distribution of glutelin-related genes in the rice genome and will generate markers for the selection of low-glutelin rice varieties. Following an SDS-PAGE screen of rice germplasm from the Taihu Valley of China, the Japonica selection W3660 was identified as a novel mutant characterized by low glutelin content. For fine mapping of the mutant gene for low glutelin content, F2 and F3 populations were derived from a cross between W3660 and Jingrennuo. SDS-PAGE analysis of the total endosperm protein showed that the low glutelin content trait was controlled by a single dominant nuclear gene. Genetic mapping using SSRs located this gene on chromosome 2, in the region between SSR2-001/SSR2-004 and RM1358. The distances of the two markers to the target gene were 1.1 cM and 3.8 cM, respectively. In semi-quantitative RT-PCR analysis, the transcripts of the GluB4/GluB5 genes located within the region did not change. However, the GluB5 gene located proximal to SSR2-001/SSR2-004 was specifically reduced. SSR profiles of seven Japonica varieties were compared with that of W3660 for loci in the relevant genetic region. The markers SSR2-004 and RM1358 were used for marker-assisted selection. The selection efficiencies of SSR2-004 and RM1358 were 96.8% and 92.7%, respectively. This provides a standard starting point for the breeding of low glutelin content rice varieties in China.
Funding (gastrointestinal hemangioma article): Supported by the Science and Technology Plan of Qinghai Province, No. 2023-ZJ-787.
Funding (FinTech anomaly detection survey): Supported by Ho Chi Minh City Open University, Vietnam, under grant number E2024.02.1CD, and by Suan Sunandha Rajabhat University, Thailand.
文摘The Financial Technology(FinTech)sector has witnessed rapid growth,resulting in increasingly complex and high-volume digital transactions.Although this expansion improves efficiency and accessibility,it also introduces significant vulnerabilities,including fraud,money laundering,and market manipulation.Traditional anomaly detection techniques often fail to capture the relational and dynamic characteristics of financial data.Graph Neural Networks(GNNs),capable of modeling intricate interdependencies among entities,have emerged as a powerful framework for detecting subtle and sophisticated anomalies.However,the high-dimensionality and inherent noise of FinTech datasets demand robust feature selection strategies to improve model scalability,performance,and interpretability.This paper presents a comprehensive survey of GNN-based approaches for anomaly detection in FinTech,with an emphasis on the synergistic role of feature selection.We examine the theoretical foundations of GNNs,review state-of-the-art feature selection techniques,analyze their integration with GNNs,and categorize prevalent anomaly types in FinTech applications.In addition,we discuss practical implementation challenges,highlight representative case studies,and propose future research directions to advance the field of graph-based anomaly detection in financial systems.
Abstract: With the increasing complexity of vehicular networks and the proliferation of connected vehicles, Federated Learning (FL) has emerged as a critical framework for decentralized model training while preserving data privacy. However, efficient client selection and adaptive weight allocation in heterogeneous and non-IID environments remain challenging. To address these issues, we propose Federated Learning with Client Selection and Adaptive Weighting (FedCW), a novel algorithm that leverages adaptive client selection and dynamic weight allocation to optimize model convergence in real-time vehicular networks. FedCW selects clients based on their Euclidean distance from the global model and dynamically adjusts aggregation weights to optimize both data diversity and model convergence. Experimental results show that FedCW significantly outperforms existing FL algorithms such as FedAvg, FedProx, and SCAFFOLD, particularly in non-IID settings, achieving faster convergence, higher accuracy, and reduced communication overhead. These findings demonstrate that FedCW provides an effective solution for enhancing the performance of FL in heterogeneous, edge-based computing environments.
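The distance-based selection and weighting described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact rules, so two assumptions are made here: the k clients closest to the global model are selected, and aggregation weights are proportional to inverse distance. The function names are hypothetical and are not taken from the FedCW paper.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two flat parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_clients(global_model, client_models, k):
    """Pick the k clients closest to the global model (assumed rule).

    Returns the selected client ids and the distance of every client.
    """
    distances = {
        cid: euclidean_distance(weights, global_model)
        for cid, weights in client_models.items()
    }
    # Sort by distance, breaking ties by client id for determinism.
    ranked = sorted(distances, key=lambda cid: (distances[cid], cid))
    return ranked[:k], distances


def aggregate(client_models, selected, distances, eps=1e-8):
    """Weighted average of the selected clients' parameters.

    Weights are inverse distances normalised to sum to 1 (assumed rule).
    """
    raw = {cid: 1.0 / (distances[cid] + eps) for cid in selected}
    total = sum(raw.values())
    weights = {cid: value / total for cid, value in raw.items()}
    dim = len(client_models[selected[0]])
    new_global = [
        sum(weights[cid] * client_models[cid][i] for cid in selected)
        for i in range(dim)
    ]
    return new_global, weights


if __name__ == "__main__":
    clients = {"a": [1.0, 0.0], "b": [0.0, 3.0], "c": [10.0, 10.0]}
    chosen, dists = select_clients([0.0, 0.0], clients, k=2)
    print(chosen, aggregate(clients, chosen, dists))
```

Other choices, such as favouring distant clients to increase data diversity, would fit the same interface by changing only the ranking and weighting rules.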
Funding: supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2020-NR049579).
Abstract: High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
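As an illustration of the quantity underlying all three relationships in this abstract, the sketch below computes the empirical mutual information between two discrete variables. It is a generic estimator, not the paper's objective function. Applied to a pair of features it measures redundancy, to a feature and a label it measures relevance, and to a pair of labels it measures inter-label dependency. The assumption that features have been discretised beforehand is ours.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), joint_count in count_xy.items():
        p_xy = joint_count / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        # Only observed pairs contribute, so p_xy is never zero here.
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


if __name__ == "__main__":
    feature = [0, 0, 1, 1]
    label_same = [0, 0, 1, 1]
    label_independent = [0, 1, 0, 1]
    print(mutual_information(feature, label_same))
    print(mutual_information(feature, label_independent))
```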
Funding: supported by the National Key Research and Development Plan of China (2021YFD2200202) and the Key Research and Development Project of Jiangsu Province, China (BE2021366).
Abstract: Populus species, important economic species combining rapid growth with broad ecological adaptability, play a critical role in sustainable forestry and bioenergy production. In this study, we performed whole-genome resequencing of 707 individuals from a full-sib family to develop comprehensive single nucleotide polymorphism (SNP) markers and constructed a high-density genetic linkage map of 19 linkage groups. The total genetic length of the map reached 3623.65 cM with an average marker interval of 0.34 cM. By integrating multidimensional phenotypic data, 89 quantitative trait loci (QTL) associated with growth, wood physical and chemical properties, disease resistance, and leaf morphology traits were identified, with logarithm of odds (LOD) scores ranging from 3.13 to 21.72 and phenotypic variance explained between 1.7% and 11.6%. Notably, pleiotropic analysis revealed significant colocalization hotspots on chromosomes LG1, LG5, LG6, LG8, and LG14, with epistatic interaction network analysis confirming the genetic basis of coordinated regulation across multiple traits. Functional annotation of 207 candidate genes showed that R2R3-MYB and bHLH transcription factors and pyruvate kinase-encoding genes were significantly enriched, suggesting crucial roles in lignin biosynthesis and carbon metabolic pathways. Allelic effect analysis indicated that the frequency of favorable alleles associated with target traits ranged from 0.20 to 0.55. Incorporation of QTL-derived favorable alleles as random effects into Bayesian-based genomic selection models led to an increase in prediction accuracy ranging from 1% to 21%, with Bayesian ridge regression as the best predictive model. This study provides valuable genomic resources and genetic insights for deciphering complex trait architecture and advancing molecular breeding in poplar.
Funding: supported by the China Agriculture Research System of MOF and MARA, the National Natural Science Foundation of China (31872337 and 31501919), and the Agricultural Science and Technology Innovation Project, China (ASTIP-IAS02).
Abstract: The advantages of genomic selection (GS) in animal and plant breeding are self-evident. Traditional parametric models struggle to fit increasingly large sequencing datasets and to capture complex effects accurately. Machine learning models have demonstrated remarkable potential in addressing these challenges. In this study, we introduced the concept of mixed kernel functions to explore the performance of support vector machine regression (SVR) in GS. Six single kernel functions (SVR_L, SVR_C, SVR_G, SVR_P, SVR_S, SVR_L) and four mixed kernel functions (SVR_GS, SVR_GP, SVR_LS, SVR_LP) were used to predict genomic breeding values. The prediction accuracy, mean squared error (MSE) and mean absolute error (MAE) were used as evaluation indicators for comparison with two traditional parametric models (GBLUP, BayesB) and two popular machine learning models (RF, KcRR). The results indicate that in most cases the mixed kernel function models significantly outperform GBLUP, BayesB and the single kernel function models. For instance, for T1 in the pig dataset, the predictive accuracy of SVR_GS is improved by 10% compared to GBLUP, and by approximately 4.4% and 18.6% compared to SVR_G and SVR_S, respectively. For E1 in the wheat dataset, SVR_GS achieves 13.3% higher prediction accuracy than GBLUP. Among single kernel functions, the Laplacian and Gaussian kernel functions yield similar results, with the Gaussian kernel function performing better. The mixed kernel function notably reduces the MSE and MAE when compared to all single kernel functions. Furthermore, regarding runtime, the SVR_GS and SVR_GP mixed kernel functions run approximately three times faster than GBLUP in the pig dataset, with only a slight increase in runtime compared to the single kernel function model. In summary, the mixed kernel function model of SVR is competitive in both speed and accuracy, and models such as SVR_GS have important application potential for GS.
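A mixed kernel of the kind named in this abstract can be sketched as a weighted sum of two single kernels. The abstract does not give the combination rule or the hyperparameters, so this example assumes a convex combination of a Gaussian and a sigmoid kernel (reading SVR_GS as Gaussian plus sigmoid); the parameter names weight, gamma and coef0 are illustrative.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=0.5, coef0=0.0):
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


def mixed_kernel(x, y, weight=0.7, gamma=0.5, coef0=0.0):
    """Convex combination of the two kernels, with 0 <= weight <= 1 (assumed form)."""
    return (weight * gaussian_kernel(x, y, gamma)
            + (1.0 - weight) * sigmoid_kernel(x, y, gamma, coef0))


def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of samples."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    genotypes = [[0.0, 1.0, 2.0], [1.0, 1.0, 0.0], [2.0, 0.0, 1.0]]
    for row in gram_matrix(genotypes, mixed_kernel):
        print([round(value, 4) for value in row])
```

A Gram matrix built this way can be passed to an SVR implementation that accepts precomputed kernels, for example scikit-learn's `SVR(kernel="precomputed")`. Note that the sigmoid kernel is not positive semidefinite for all parameter settings, so the mixing weight and kernel parameters would need tuning in practice.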
Funding: supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia, Grant No. KFU253765.
Abstract: Most predictive maintenance studies have emphasized accuracy but paid very little attention to interpretability or deployment readiness. This study improves on prior methods by developing a small yet robust system that can predict when turbofan engines will fail. It uses the NASA CMAPSS dataset, which has over 200,000 engine cycles from 260 engines. The process begins with systematic preprocessing, which includes imputation, outlier removal, scaling, and labelling of the remaining useful life. Dimensionality is reduced using a hybrid selection method that combines variance filtering, recursive elimination, and gradient-boosted importance scores, yielding a stable set of 10 informative sensors. To mitigate class imbalance, minority cases are oversampled, and class-weighted losses are applied during training. Benchmarking is carried out with logistic regression, gradient boosting, and a recurrent design that integrates gated recurrent units with long short-term memory networks. The Long Short-Term Memory-Gated Recurrent Unit (LSTM-GRU) hybrid achieved the strongest performance with an F1 score of 0.92, precision of 0.93, recall of 0.91, Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) of 0.97, and minority recall of 0.75. Interpretability testing using permutation importance and Shapley values indicates that sensors 13, 15, and 11 are the most important indicators of engine wear. The proposed system combines imbalance handling, feature reduction, and interpretability into a practical design suitable for real industrial settings.
Funding: supported by National Natural Science Foundation of China (62106092), Natural Science Foundation of Fujian Province (2024J01822, 2024J01820, 2022J01916), and Natural Science Foundation of Zhangzhou City (ZZ2024J28).
Abstract: Feature selection serves as a critical preprocessing step in machine learning, focusing on identifying and preserving the most relevant features to improve the efficiency and performance of classification algorithms. Particle Swarm Optimization has demonstrated significant potential in addressing feature selection challenges. However, inherent limitations of Particle Swarm Optimization, such as the delicate balance between exploration and exploitation, susceptibility to local optima, and suboptimal convergence rates, hinder its performance. To tackle these issues, this study introduces a novel Leveraged Opposition-Based Learning method within Fitness Landscape Particle Swarm Optimization, tailored for wrapper-based feature selection. The proposed approach integrates: (1) a fitness-landscape adaptive strategy to dynamically balance exploration and exploitation, (2) the lever principle within Opposition-Based Learning to improve search efficiency, and (3) a Local Selection and Re-optimization mechanism combined with random perturbation to expedite convergence and enhance the quality of the optimal feature subset. The effectiveness of the proposed method is rigorously evaluated on 24 benchmark datasets and compared against 13 advanced metaheuristic algorithms. Experimental results demonstrate that the proposed method outperforms the compared algorithms in classification accuracy on over half of the datasets, while also significantly reducing the number of selected features. These findings demonstrate its effectiveness and robustness in feature selection tasks.
Funding: supported by the Major Science and Technology Programs in Henan Province (No. 241100210100), Henan Provincial Science and Technology Research Project (No. 252102211085, No. 252102211105), Endogenous Security Cloud Network Convergence R&D Center (No. 602431011PQ1), the Special Project for Research and Development in Key Areas of Guangdong Province (No. 2021ZDZX1098), the Stabilization Support Program of Science, Technology and Innovation Commission of Shenzhen Municipality (No. 20231128083944001), and the Key scientific research projects of Henan higher education institutions (No. 24A520042).
Abstract: Existing feature selection methods for intrusion detection systems (IDS) in the Industrial Internet of Things often suffer from local optimality and high computational complexity. These challenges hinder traditional IDS from effectively extracting features while maintaining detection accuracy. This paper proposes an Industrial Internet of Things intrusion detection feature selection algorithm based on an improved whale optimization algorithm (GSLDWOA). The aim is to address the problems to which feature selection algorithms are prone on high-dimensional data, such as local optimality, long detection time, and reduced accuracy. First, the initial population's diversity is increased using the Gaussian Mutation mechanism. Then, a Non-linear Shrinking Factor balances global exploration and local exploitation, avoiding premature convergence. Lastly, a Variable-step Levy Flight operator and a Dynamic Differential Evolution strategy are introduced to improve the algorithm's search efficiency and convergence accuracy in high-dimensional feature space. Experiments on the NSL-KDD and WUSTL-IIoT-2021 datasets demonstrate that the feature subset selected by GSLDWOA significantly improves detection performance. Compared to the traditional WOA algorithm, the detection rate and F1-score increased by 3.68% and 4.12%, respectively. On the WUSTL-IIoT-2021 dataset, accuracy, recall, and F1-score all exceed 99.9%.
Abstract: Unconfined Compressive Strength (UCS) is a key parameter for the assessment of the stability and performance of stabilized soils, yet traditional laboratory testing is both time and resource intensive. In this study, an interpretable machine learning approach to UCS prediction is presented, pairing five models (Random Forest (RF), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), CatBoost, and K-Nearest Neighbors (KNN)) with SHapley Additive exPlanations (SHAP) for enhanced interpretability and to guide feature removal. A complete dataset of 12 geotechnical and chemical parameters, i.e., Atterberg limits, compaction properties, stabilizer chemistry, dosage, and curing time, was used to train and test the models. R², RMSE, MSE, and MAE were used to assess performance. Initial results with all 12 features indicated that boosting-based models (GB, XGB, CatBoost) exhibited the highest predictive accuracy (R² = 0.93) with satisfactory generalization on test data, followed by RF and KNN. SHAP analysis consistently picked CaO content, curing time, stabilizer dosage, and compaction parameters as the most important features, aligning with established soil stabilization mechanisms. Models were then re-trained on the top 8 and top 5 SHAP-ranked features. Interestingly, GB, XGB, and CatBoost maintained comparable accuracy with reduced input sets, while RF was moderately sensitive and KNN performed somewhat better owing to reduced dimensionality. The findings confirm that feature reduction through SHAP enables cost-effective UCS prediction by reducing laboratory test requirements without significant accuracy loss. The suggested hybrid approach offers an explainable, interpretable, and cost-effective tool for geotechnical engineering practice.
Funding: funded by Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. (DGSSR-2024-02-01264).
Abstract: Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language's complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES by combining text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, an optimal data efficiency training approach was introduced, where training data portions increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
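The two evaluation measures quoted in this abstract can be made concrete with a short sketch. It assumes the standard coefficient of determination and reads "accuracy within a 0.5-point tolerance" as the share of essays whose predicted score lies within 0.5 points of the human score; the paper may define the tolerance differently, and the sample scores below are invented for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy_within_tolerance(y_true, y_pred, tolerance=0.5):
    """Fraction of predictions within `tolerance` points of the true score."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)


if __name__ == "__main__":
    human_scores = [1.0, 2.0, 3.0, 4.0]      # illustrative values only
    model_scores = [1.5, 2.0, 2.0, 4.0]
    print(r_squared(human_scores, model_scores))
    print(accuracy_within_tolerance(human_scores, model_scores))
```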
Funding: funded by the Huaiyin Institute of Technology, Institute of Smart Energy.
Abstract: In the quest to enhance energy efficiency and reduce environmental impact in the transportation sector, the recovery of waste heat from diesel engines has become a critical area of focus. This study provided an exhaustive thermodynamic analysis optimizing Organic Rankine Cycle (ORC) systems for waste heat recovery from diesel engines. The study assessed the performance of five candidate working fluids (R11, R123, R113, R245fa, and R141b) under a range of operating conditions, specifically varying overheat temperatures and evaporation pressures. The results indicated that the choice of working fluid substantially influences the system's exergetic efficiency, net output power, and thermal efficiency. R245fa showed an outstanding net output power of 30.39 kW at high overheat conditions, outperforming R11, which is significant for high-temperature waste heat recovery. At lower temperatures, R11 and R113 demonstrated higher exergetic efficiencies, with R11 reaching a peak exergetic efficiency of 7.4% at an evaporation pressure of 10 bar and an overheat of 10 °C. The study also revealed that controlling the overheat and optimizing the evaporation pressure are crucial for enhancing the net output power of the ORC system. Specifically, at an evaporation pressure of 30 bar and an overheat of 0 °C, R113 exhibited the lowest exergetic destruction of 544.5 kJ/kg, making it a suitable choice for minimizing irreversible losses. These findings are instrumental for understanding the performance of ORC systems in waste heat recovery applications and offer valuable insights for the design and operation of more efficient and environmentally friendly diesel engine systems.
Abstract: Multi-label feature selection (MFS) is a crucial dimensionality reduction technique aimed at identifying informative features associated with multiple labels. However, traditional centralized methods face significant challenges in privacy-sensitive and distributed settings, often neglecting label dependencies and suffering from low computational efficiency. To address these issues, we introduce a novel framework, Fed-MFSDHBCPSO: federated MFS via a dual-layer hybrid breeding cooperative particle swarm optimization algorithm with manifold and sparsity regularization (DHBCPSO-MSR). Leveraging the federated learning paradigm, Fed-MFSDHBCPSO allows clients to perform local feature selection (FS) using DHBCPSO-MSR. Locally selected feature subsets are encrypted with differential privacy (DP) and transmitted to a central server, where they are securely aggregated and refined through secure multi-party computation (SMPC) until global convergence is achieved. Within each client, DHBCPSO-MSR employs a dual-layer FS strategy. The inner layer constructs sample and label similarity graphs, generates Laplacian matrices to capture the manifold structure between samples and labels, and applies L2,1-norm regularization to sparsify the feature subset, yielding an optimized feature weight matrix. The outer layer uses a hybrid breeding cooperative particle swarm optimization algorithm to further refine the feature weight matrix and identify the optimal feature subset. The updated weight matrix is then fed back to the inner layer for further optimization. Comprehensive experiments on multiple real-world multi-label datasets demonstrate that Fed-MFSDHBCPSO consistently outperforms both centralized and federated baseline methods across several key evaluation metrics.
Funding: supported by the Jilin Science and Technology Development Program, China (20240602032RC), the Jilin Agricultural Science and Technology Innovation Project, China (CXGC2024ZD001), the Jilin Agricultural Science and Technology Innovation Project, China (CXGC2024ZY012), and the Jilin Province Development and Reform Commission-Project for Improving the Independent Innovation Capacity of Major Grain Crops, China (2024C002).
Abstract: Emerging and powerful genome editing tools, particularly CRISPR/Cas9, are facilitating functional genomics research and accelerating crop improvement (Jiang et al. 2021; Cao et al. 2023; Chen C et al. 2023; Liu et al. 2023a). However, the detection and screening of transgenic lines remain major bottlenecks, being time-consuming, labor-intensive, and inefficient during transformation and subsequent mutation identification. A simple and efficient visual marker system plays a critical role in addressing these challenges. Recent studies used the GmW1 and RUBY reporter systems to obtain visually identifiable transgenic soybean (Glycine max) plants (Chen L et al. 2023; Chen et al. 2024).
Funding: supported by the National Natural Science Foundation of China (No. 61571149), the Special China Postdoctoral Science Foundation (2015T80325), the Fundamental Research Funds for the Central Universities (HEUCFP201808), and the China Postdoctoral Science Foundation (2013M530148).
Abstract: Massive MIMO is one of the key technologies in future 5G communications and can satisfy the requirement of high speed and large capacity. This paper considers antenna selection and power allocation design to promote energy conservation while providing good quality of service (QoS) for the whole massive MIMO uplink network. Unlike previous related works, hardware impairment, transmission efficiency, and energy consumption at the circuit and antennas are taken into account in massive MIMO networks. In order to ensure the QoS, we consider the minimum rate constraint for each user and the system, which increases the complexity of the power allocation problem for maximizing energy and spectral efficiency in the massive MIMO system. To this end, a quantum-inspired social emotional optimization (QSEO) algorithm is proposed to obtain the optimal power control strategy in massive MIMO uplink networks. Simulation results demonstrate the advantages of QSEO over previous strategies.
Funding: supported by the State Major Science and Technology Special Projects under Grant No. 2018ZX03001028-003.
Abstract: In 5G systems, massive multiple-input multiple-output (MIMO) has been adopted in base stations (BSs) to improve spectral efficiency and coverage. The traditional conductive performance test techniques are challenging due to the unaffordable cost and high complexity when testing a large number of antennas. To solve this problem, the over-the-air (OTA) test has been presented, in which probe selection is the key to reducing the number of channel emulators and probes. In this paper, a novel artificial bee colony (ABC) algorithm is introduced to enhance the efficiency and accuracy of the probe selection procedure. A sectoring-based multi-probe anechoic chamber (MPAC) is built to evaluate the throughput performance of massive MIMO equipped in 5G BSs. In addition, link level simulation is carried out to evaluate the proposal's performance gain under the commercial network assumptions, where the average throughput at three velocities is given for different SNR regions. The results suggest that the OTA chamber and multi-probe wall can be used not only for 5G BSs, but also for user equipment (UEs) with end-to-end communication.
Funding: supported by grants from the National Natural Science Foundation of China (Nos. 31772843, 31741122), the Earmarked Fund for Agriculture Seed Improvement Project of Shandong Province (No. 2017LZGC009), and the Fundamental Research Funds for the Central Universities (No. 201762014).
Abstract: The Pacific oyster, Crassostrea gigas, naturally distributed along the coast of the northwest Pacific, is one of the most important bivalve species due to its high economic value and fecundity. In China, we initiated a selective breeding program on both shell color and growth rate of C. gigas in 2010. A black shell line was obtained through four-generation family selection. In this study, mass selection for growth improvement was conducted in the sixth and seventh generations of black shell lines. To assess the progress of potential genetic improvement, the progeny of two generations of black shell lines were selected to evaluate their shell heights via a 450-day farming experiment. After growing for 450 days, the sixth and seventh generations of selected lines were 9.03% and 11.42% larger than the control lines, respectively. During the grow-out stage, the genetic gain of the two generations was 8.82%±0.18% and 11.54%±0.43%, respectively, and the corresponding realized heritability was 0.45±0.04 and 0.41±0.04, respectively. These results indicated that mass selection for shell height achieved steady progress in the two generations of C. gigas.
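The two selection statistics reported in this abstract can be illustrated with formulas commonly used in mass-selection studies. The abstract does not state which formulas the authors applied, so the sketch assumes genetic gain as the percentage difference between selected and control line means, and realized heritability as the response divided by the product of selection intensity and the control line's standard deviation. All input values below are hypothetical, not data from the study.

```python
def genetic_gain_percent(selected_mean, control_mean):
    """Genetic gain as a percentage of the control line mean (assumed formula)."""
    return (selected_mean - control_mean) / control_mean * 100.0


def realized_heritability(selected_mean, control_mean, control_sd, selection_intensity):
    """Realized heritability: response / (intensity * control SD) (assumed formula)."""
    response = selected_mean - control_mean
    return response / (selection_intensity * control_sd)


if __name__ == "__main__":
    # Hypothetical shell heights in mm and a hypothetical selection intensity.
    print(genetic_gain_percent(selected_mean=55.0, control_mean=50.0))
    print(realized_heritability(55.0, 50.0, control_sd=5.0, selection_intensity=2.0))
```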
Funding: supported by grants from the Hi-Tech Research and Development Program of China ("863" Program, No. 2003AA222131, 2003AA207020), the National Natural Science Foundation of China (No. 30170570), and the Special Program for gene-transferring (No. JY03-B-07, JY03-A-07-02).
Abstract: Rice with low glutelin content is suitable as functional food for patients with diabetes and kidney failure. The fine mapping of the gene(s) responsible for low glutelin content will provide information regarding the distribution of glutelin-related genes in the rice genome and will generate markers for the selection of low glutelin rice varieties. Following an SDS-PAGE screen of rice germplasm from the Taihu Valley of China, Japonica selection W3660 was identified as a novel mutant characterized by low glutelin content. For fine mapping of the mutant gene for low glutelin content, F2 and F3 populations were derived from a cross between W3660 and Jingrennuo. SDS-PAGE analysis of the total endosperm protein showed that the low glutelin content trait was controlled by a single dominant nuclear gene. Genetic mapping, using SSRs, located this gene to chromosome 2, in the region between SSR2-001/SSR2-004 and RM1358. The distances of the two markers to the target gene were 1.1 cM and 3.8 cM respectively. By semi-quantitative RT-PCR analysis, the transcripts of GluB4/GluB5 genes located within the region do not change. However, GluB5 gene located proximal to SSR2-001/SSR2-004 was specifically reduced. SSR profiles of seven Japonica varieties were compared with that of W3660 for loci in the relevant genetic region. The markers SSR2-004 and RM1358 were used for marker-assisted selection. The selection efficiencies of SSR2-004 and RM1358 were 96.8% and 92.7% respectively. This provides a standard starting point for the breeding of low glutelin content rice varieties in China.