In the realm of Multi-Label Text Classification(MLTC),the dual challenges of extracting rich semantic features from text and discerning inter-label relationships have spurred innovative approaches.Many studies in sema...In the realm of Multi-Label Text Classification(MLTC),the dual challenges of extracting rich semantic features from text and discerning inter-label relationships have spurred innovative approaches.Many studies in semantic feature extraction have turned to external knowledge to augment the model’s grasp of textual content,often overlooking intrinsic textual cues such as label statistical features.In contrast,these endogenous insights naturally align with the classification task.In our paper,to complement this focus on intrinsic knowledge,we introduce a novel Gate-Attention mechanism.This mechanism adeptly integrates statistical features from the text itself into the semantic fabric,enhancing the model’s capacity to understand and represent the data.Additionally,to address the intricate task of mining label correlations,we propose a Dual-end enhancement mechanism.This mechanism effectively mitigates the challenges of information loss and erroneous transmission inherent in traditional long short term memory propagation.We conducted an extensive battery of experiments on the AAPD and RCV1-2 datasets.These experiments serve the dual purpose of confirming the efficacy of both the Gate-Attention mechanism and the Dual-end enhancement mechanism.Our final model unequivocally outperforms the baseline model,attesting to its robustness.These findings emphatically underscore the imperativeness of taking into account not just external knowledge but also the inherent intricacies of textual data when crafting potent MLTC models.展开更多
The rapid advancement of large language models(LLMs)has driven the pervasive adoption of AI-generated content(AIGC),while also raising concerns about misinformation,academic misconduct,biased or harmful content,and ot...The rapid advancement of large language models(LLMs)has driven the pervasive adoption of AI-generated content(AIGC),while also raising concerns about misinformation,academic misconduct,biased or harmful content,and other risks.Detecting AI-generated text has thus become essential to safeguard the authenticity and reliability of digital information.This survey reviews recent progress in detection methods,categorizing approaches into passive and active categories based on their reliance on intrinsic textual features or embedded signals.Passive detection is further divided into surface linguistic feature-based and language model-based methods,whereas active detection encompasses watermarking-based and semantic retrieval-based approaches.This taxonomy enables systematic comparison of methodological differences in model dependency,applicability,and robustness.A key challenge for AI-generated text detection is that existing detectors are highly vulnerable to adversarial attacks,particularly paraphrasing,which substantially compromises their effectiveness.Addressing this gap highlights the need for future research on enhancing robustness and cross-domain generalization.By synthesizing current advances and limitations,this survey provides a structured reference for the field and outlines pathways toward more reliable and scalable detection solutions.展开更多
High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements.In particular,in amulti-label environment,higher complexity is required asmuch as the number of ...High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements.In particular,in amulti-label environment,higher complexity is required asmuch as the number of labels.Moreover,an optimization problem that fully considers all dependencies between features and labels is difficult to solve.In this study,we propose a novel regression-basedmulti-label feature selectionmethod that integrates mutual information to better exploit the underlying data structure.By incorporating mutual information into the regression formulation,the model captures not only linear relationships but also complex non-linear dependencies.The proposed objective function simultaneously considers three types of relationships:(1)feature redundancy,(2)featurelabel relevance,and(3)inter-label dependency.These three quantities are computed usingmutual information,allowing the proposed formulation to capture nonlinear dependencies among variables.These three types of relationships are key factors in multi-label feature selection,and our method expresses them within a unified formulation,enabling efficient optimization while simultaneously accounting for all of them.To efficiently solve the proposed optimization problem under non-negativity constraints,we develop a gradient-based optimization algorithm with fast convergence.Theexperimental results on sevenmulti-label datasets show that the proposed method outperforms existingmulti-label feature selection techniques.展开更多
The increasing significance of text data in power system intelligence has highlighted the out-of-distribution(OOD)problem as a critical challenge,hindering the deployment of artificial intelligence(AI)models.In a clos...The increasing significance of text data in power system intelligence has highlighted the out-of-distribution(OOD)problem as a critical challenge,hindering the deployment of artificial intelligence(AI)models.In a closed-world setting,most AI models cannot detect and reject unexpected data,which exacerbates the harmful impact of the OOD problem.The high similarity between OOD and indistribution(IND)samples in the power system presents challenges for existing OOD detection methods in achieving effective results.This study aims to elucidate and address the OOD problem in power systems through a text classification task.First,the underlying causes of OOD sample generation are analyzed,highlighting the inherent nature of the OOD problem in the power system.Second,a novel method integrating the enhanced Mahalanobis distance with calibration strategies is introduced to improve OOD detection for text data in power system applications.Finally,the case study utilizing the actual text data from power system field operation(PSFO)is conducted,demonstrating the effectiveness of the proposed OOD detection method.Experimental results indicate that the proposed method outperformed existing methods in text OOD detection tasks within the power system,achieving a remarkable 21.03%enhancement of metric in the false positive rate at 95%true positive recall(FPR95)and a 12.97%enhancement in classi-fication accuracy for the mixed IND-OOD scenarios.展开更多
Graph neural networks(GNN)have shown strong performance in node classification tasks,yet most existing models rely on uniform or shared weight aggregation,lacking flexibility in modeling the varying strength of relati...Graph neural networks(GNN)have shown strong performance in node classification tasks,yet most existing models rely on uniform or shared weight aggregation,lacking flexibility in modeling the varying strength of relationships among nodes.This paper proposes a novel graph coupling convolutional model that introduces an adaptive weighting mechanism to assign distinct importance to neighboring nodes based on their similarity to the central node.Unlike traditional methods,the proposed coupling strategy enhances the interpretability of node interactions while maintaining competitive classification performance.The model operates in the spatial domain,utilizing adjacency list structures for efficient convolution and addressing the limitations of weight sharing through a coupling-based similarity computation.Extensive experiments are conducted on five graph-structured datasets,including Cora,Citeseer,PubMed,Reddit,and BlogCatalog,as well as a custom topology dataset constructed from the Open University Learning Analytics Dataset(OULAD)educational platform.Results demonstrate that the proposed model achieves good classification accuracy,while significantly reducing training time through direct second-order neighbor fusion and data preprocessing.Moreover,analysis of neighborhood order reveals that considering third-order neighbors offers limited accuracy gains but introduces considerable computational overhead,confirming the efficiency of first-and second-order convolution in practical applications.Overall,the proposed graph coupling model offers a lightweight,interpretable,and effective framework for multi-label node classification in complex networks.展开更多
Objective To develop a dual-branch deep learning framework for accurate multi-label classification of fundus diseases,addressing the key limitations of insufficient complementary feature extraction and inadequate cros...Objective To develop a dual-branch deep learning framework for accurate multi-label classification of fundus diseases,addressing the key limitations of insufficient complementary feature extraction and inadequate cross-modal feature fusion in existing automated diagnostic methods.Methods The fundus multi-label classification dataset with 12 disease categories(FMLC-12)dataset was constructed by integrating complementary samples from Ocular Disease Intelligent Recognition(ODIR)and Retinal Fundus Multi-Disease Image Dataset(RFMiD),yielding 6936 fundus images across 12 retinal pathology categories,and the framework was validated on both FMLC-12 and ODIR.Inspired by the holistic multi-regional assessment principle of the Five Wheels theory in traditional Chinese medicine(TCM)ophthalmology,the dualbranch multi-label network(DBMNet)was developed as a novel framework integrating complementary visual feature extraction with pathological correlation modeling.The architecture employed a TransNeXt backbone within a dual-branch design:one branch processed redgreen-blue(RGB)images to capture color-dependent features,such as vascular patterns and lesion morphology,while the other processed grayscale-converted images to enhance subtle textural details and contrast variations.A feature interaction module(FIM)effectively integrated the multi-scale features from both branches.Comprehensive ablation studies were conducted to evaluate the contributions of the dual-branch architecture and the FIM.The performance of DBMNet was compared against four state-of-the-art methods,including EfficientNet Ensemble,transfer learning-based convolutional neural network(CNN),BFENet,and EyeDeep-Net,using mean average precision(mAP),F1-score,and Cohen's kappa coefficient.Results The dual-branch architecture improved mAP by 15.44 percentage points over the single-branch TransNeXt baseline,increasing from 34.41%to 44.24%,and the addition of FIM further boosted mAP to 49.85%.On FMLC-12,DBMNet achieved an mAP of 49.85%,a Cohen’s kappa coefficient of 62.14%,and an F1-score of 70.21%.Compared with BFENet(mAP:45.42%,kappa:46.64%,F1-score:71.34%),DBMNet outperformed it by 4.43 percentage points in mAP and 15.50 percentage points in kappa,while BFENet achieved a marginally higher F1-score.On ODIR,DBMNet achieved an F1-score of 85.50%,comparable to state-of-the-art methods.Conclusion DBMNet effectively integrates RGB and grayscale visual modalities through a dual-branch architecture,significantly improving multi-label fundus disease classification.The framework not only addresses the issue of insufficient feature fusion in existing methods but also demonstrates outstanding performance in balancing detection across both common and rare diseases,providing a promising and clinically applicable pathway for standardized,intelligent fundus disease classification.展开更多
Spam emails remain one of the most persistent threats to digital communication,necessitating effective detection solutions that safeguard both individuals and organisations.We propose a spam email classification frame...Spam emails remain one of the most persistent threats to digital communication,necessitating effective detection solutions that safeguard both individuals and organisations.We propose a spam email classification frame-work that uses Bidirectional Encoder Representations from Transformers(BERT)for contextual feature extraction and a multiple-window Convolutional Neural Network(CNN)for classification.To identify semantic nuances in email content,BERT embeddings are used,and CNN filters extract discriminative n-gram patterns at various levels of detail,enabling accurate spam identification.The proposed model outperformed Word2Vec-based baselines on a sample of 5728 labelled emails,achieving an accuracy of 98.69%,AUC of 0.9981,F1 Score of 0.9724,and MCC of 0.9639.With a medium kernel size of(6,9)and compact multi-window CNN architectures,it improves performance.Cross-validation illustrates stability and generalization across folds.By balancing high recall with minimal false positives,our method provides a reliable and scalable solution for current spam detection in advanced deep learning.By combining contextual embedding and a neural architecture,this study develops a security analysis method.展开更多
Multi-label feature selection(MFS)is a crucial dimensionality reduction technique aimed at identifying informative features associated with multiple labels.However,traditional centralized methods face significant chal...Multi-label feature selection(MFS)is a crucial dimensionality reduction technique aimed at identifying informative features associated with multiple labels.However,traditional centralized methods face significant challenges in privacy-sensitive and distributed settings,often neglecting label dependencies and suffering from low computational efficiency.To address these issues,we introduce a novel framework,Fed-MFSDHBCPSO—federated MFS via dual-layer hybrid breeding cooperative particle swarm optimization algorithm with manifold and sparsity regularization(DHBCPSO-MSR).Leveraging the federated learning paradigm,Fed-MFSDHBCPSO allows clients to perform local feature selection(FS)using DHBCPSO-MSR.Locally selected feature subsets are encrypted with differential privacy(DP)and transmitted to a central server,where they are securely aggregated and refined through secure multi-party computation(SMPC)until global convergence is achieved.Within each client,DHBCPSO-MSR employs a dual-layer FS strategy.The inner layer constructs sample and label similarity graphs,generates Laplacian matrices to capture the manifold structure between samples and labels,and applies L2,1-norm regularization to sparsify the feature subset,yielding an optimized feature weight matrix.The outer layer uses a hybrid breeding cooperative particle swarm optimization algorithm to further refine the feature weight matrix and identify the optimal feature subset.The updated weight matrix is then fed back to the inner layer for further optimization.Comprehensive experiments on multiple real-world multi-label datasets demonstrate that Fed-MFSDHBCPSO consistently outperforms both centralized and federated baseline methods across several key evaluation metrics.展开更多
This study compares the relative efficacy of the continuation task and the model-as-feedbackwriting (MAFW) task in EFL writing development. Ninety intermediate-level Chinese EFL learnerswere randomly assigned to a con...This study compares the relative efficacy of the continuation task and the model-as-feedbackwriting (MAFW) task in EFL writing development. Ninety intermediate-level Chinese EFL learnerswere randomly assigned to a continuation group, a MAFW group, and a control group, each with30 learners. A pretest and a posttest were used to gauge L2 writing development. Results showedthat the continuation task outperformed the MAFW task not only in enhancing the overall qualityof L2 writing, but also in promoting the quality of three components of L2 writing, namely, content,organization, and language. The finding has important implications for L2 writing teaching andlearning.展开更多
China’s environmental governance strategy provides a distinctive pathway for integrating sustainable development into national policy.Understanding its policy trajectory is essential for assessing China’s contributi...China’s environmental governance strategy provides a distinctive pathway for integrating sustainable development into national policy.Understanding its policy trajectory is essential for assessing China’s contribution to global sustainable development and the United Nations Sustainable Development Goals(SDGs).This study constructs a comprehensive database of 425 national environmental governance policy documents issued between 1978 and 2022 and applies Latent Dirichlet Allocation(LDA)modeling to examine the evolution of policy themes and discourse.The results show that China’s environmental governance has undergone four stages-initial exploration,detailed development,transformative leap,and diverse prosperity-reflecting a progressive shift toward more integrated and coordinated governance.Policy priorities have evolved from a primary focus on pollution control and energy transition to an emphasis on institutional construction and organizational reform,thereby strengthening alignment with the SDGs.This transformation is characterized by recurring developmental themes and increasingly preventive,forward-looking,and system-oriented governance approaches.Moreover,the co-evolution of policy concepts and implementation has driven a transition from localized,end-of-pipe responses to comprehensive governance frameworks,alongside a shift from normative guidance towards effectiveness-oriented policy design.By employing a data-driven text analysis approach,this study offers a systematic framework for tracing long-term policy evolution and assessing its implications for sustainable development.展开更多
With the rapid development of digital culture,a large number of cultural texts are presented in the form of digital and network.These texts have significant characteristics such as sparsity,real-time and non-standard ...With the rapid development of digital culture,a large number of cultural texts are presented in the form of digital and network.These texts have significant characteristics such as sparsity,real-time and non-standard expression,which bring serious challenges to traditional classification methods.In order to cope with the above problems,this paper proposes a new ASSC(ALBERT,SVD,Self-Attention and Cross-Entropy)-TextRCNN digital cultural text classification model.Based on the framework of TextRCNN,the Albert pre-training language model is introduced to improve the depth and accuracy of semantic embedding.Combined with the dual attention mechanism,the model’s ability to capture and model potential key information in short texts is strengthened.The Singular Value Decomposition(SVD)was used to replace the traditional Max pooling operation,which effectively reduced the feature loss rate and retained more key semantic information.The cross-entropy loss function was used to optimize the prediction results,making the model more robust in class distribution learning.The experimental results indicate that,in the digital cultural text classification task,as compared to the baseline model,the proposed ASSC-TextRCNN method achieves an 11.85%relative improvement in accuracy and an 11.97%relative increase in the F1 score.Meanwhile,the relative error rate decreases by 53.18%.This achievement not only validates the effectiveness and advanced nature of the proposed approach but also offers a novel technical route and methodological underpinnings for the intelligent analysis and dissemination of digital cultural texts.It holds great significance for promoting the in-depth exploration and value realization of digital culture.展开更多
Text classification means to assign a document to one or more classes or categories according to content. Text classification provides convenience for users to obtain data. Because of the polysemy of text data, multi-...Text classification means to assign a document to one or more classes or categories according to content. Text classification provides convenience for users to obtain data. Because of the polysemy of text data, multi-label classification can handle text data more comprehensively. Multi-label text classification become the key problem in the data mining. To improve the performances of multi-label text classification, semantic analysis is embedded into the classification model to complete label correlation analysis, and the structure, objective function and optimization strategy of this model is designed. Then, the convolution neural network(CNN) model based on semantic embedding is introduced. In the end, Zhihu dataset is used for evaluation. The result shows that this model outperforms the related work in terms of recall and area under curve(AUC) metrics.展开更多
基金supported by National Natural Science Foundation of China(NSFC)(Grant Nos.62162022,62162024)the Key Research and Development Program of Hainan Province(Grant Nos.ZDYF2020040,ZDYF2021GXJS003)+2 种基金the Major Science and Technology Project of Hainan Province(Grant No.ZDKJ2020012)Hainan Provincial Natural Science Foundation of China(Grant Nos.620MS021,621QN211)Science and Technology Development Center of the Ministry of Education Industry-University-Research Innovation Fund(2021JQR017).
文摘In the realm of Multi-Label Text Classification(MLTC),the dual challenges of extracting rich semantic features from text and discerning inter-label relationships have spurred innovative approaches.Many studies in semantic feature extraction have turned to external knowledge to augment the model’s grasp of textual content,often overlooking intrinsic textual cues such as label statistical features.In contrast,these endogenous insights naturally align with the classification task.In our paper,to complement this focus on intrinsic knowledge,we introduce a novel Gate-Attention mechanism.This mechanism adeptly integrates statistical features from the text itself into the semantic fabric,enhancing the model’s capacity to understand and represent the data.Additionally,to address the intricate task of mining label correlations,we propose a Dual-end enhancement mechanism.This mechanism effectively mitigates the challenges of information loss and erroneous transmission inherent in traditional long short term memory propagation.We conducted an extensive battery of experiments on the AAPD and RCV1-2 datasets.These experiments serve the dual purpose of confirming the efficacy of both the Gate-Attention mechanism and the Dual-end enhancement mechanism.Our final model unequivocally outperforms the baseline model,attesting to its robustness.These findings emphatically underscore the imperativeness of taking into account not just external knowledge but also the inherent intricacies of textual data when crafting potent MLTC models.
基金supported in part by the Science and Technology Innovation Program of Hunan Province under Grant 2025RC3166the National Natural Science Foundation of China under Grant 62572176the National Key R&D Program of China under Grant 2024YFF0618800.
文摘The rapid advancement of large language models(LLMs)has driven the pervasive adoption of AI-generated content(AIGC),while also raising concerns about misinformation,academic misconduct,biased or harmful content,and other risks.Detecting AI-generated text has thus become essential to safeguard the authenticity and reliability of digital information.This survey reviews recent progress in detection methods,categorizing approaches into passive and active categories based on their reliance on intrinsic textual features or embedded signals.Passive detection is further divided into surface linguistic feature-based and language model-based methods,whereas active detection encompasses watermarking-based and semantic retrieval-based approaches.This taxonomy enables systematic comparison of methodological differences in model dependency,applicability,and robustness.A key challenge for AI-generated text detection is that existing detectors are highly vulnerable to adversarial attacks,particularly paraphrasing,which substantially compromises their effectiveness.Addressing this gap highlights the need for future research on enhancing robustness and cross-domain generalization.By synthesizing current advances and limitations,this survey provides a structured reference for the field and outlines pathways toward more reliable and scalable detection solutions.
基金supported by Basic Science Research Program through the National Research Foundation of Korea(NRF)funded by the Ministry of Education(RS-2020-NR049579).
文摘High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements.In particular,in amulti-label environment,higher complexity is required asmuch as the number of labels.Moreover,an optimization problem that fully considers all dependencies between features and labels is difficult to solve.In this study,we propose a novel regression-basedmulti-label feature selectionmethod that integrates mutual information to better exploit the underlying data structure.By incorporating mutual information into the regression formulation,the model captures not only linear relationships but also complex non-linear dependencies.The proposed objective function simultaneously considers three types of relationships:(1)feature redundancy,(2)featurelabel relevance,and(3)inter-label dependency.These three quantities are computed usingmutual information,allowing the proposed formulation to capture nonlinear dependencies among variables.These three types of relationships are key factors in multi-label feature selection,and our method expresses them within a unified formulation,enabling efficient optimization while simultaneously accounting for all of them.To efficiently solve the proposed optimization problem under non-negativity constraints,we develop a gradient-based optimization algorithm with fast convergence.Theexperimental results on sevenmulti-label datasets show that the proposed method outperforms existingmulti-label feature selection techniques.
基金supported in part by the Science and Technology Project of the State Grid East China Branch(No.520800230008).
文摘The increasing significance of text data in power system intelligence has highlighted the out-of-distribution(OOD)problem as a critical challenge,hindering the deployment of artificial intelligence(AI)models.In a closed-world setting,most AI models cannot detect and reject unexpected data,which exacerbates the harmful impact of the OOD problem.The high similarity between OOD and indistribution(IND)samples in the power system presents challenges for existing OOD detection methods in achieving effective results.This study aims to elucidate and address the OOD problem in power systems through a text classification task.First,the underlying causes of OOD sample generation are analyzed,highlighting the inherent nature of the OOD problem in the power system.Second,a novel method integrating the enhanced Mahalanobis distance with calibration strategies is introduced to improve OOD detection for text data in power system applications.Finally,the case study utilizing the actual text data from power system field operation(PSFO)is conducted,demonstrating the effectiveness of the proposed OOD detection method.Experimental results indicate that the proposed method outperformed existing methods in text OOD detection tasks within the power system,achieving a remarkable 21.03%enhancement of metric in the false positive rate at 95%true positive recall(FPR95)and a 12.97%enhancement in classi-fication accuracy for the mixed IND-OOD scenarios.
基金Support by Sichuan Science and Technology Program[2023YFSY0026,2023YFH0004]Guangzhou Huashang University[2024HSZD01,HS2023JYSZH01].
文摘Graph neural networks(GNN)have shown strong performance in node classification tasks,yet most existing models rely on uniform or shared weight aggregation,lacking flexibility in modeling the varying strength of relationships among nodes.This paper proposes a novel graph coupling convolutional model that introduces an adaptive weighting mechanism to assign distinct importance to neighboring nodes based on their similarity to the central node.Unlike traditional methods,the proposed coupling strategy enhances the interpretability of node interactions while maintaining competitive classification performance.The model operates in the spatial domain,utilizing adjacency list structures for efficient convolution and addressing the limitations of weight sharing through a coupling-based similarity computation.Extensive experiments are conducted on five graph-structured datasets,including Cora,Citeseer,PubMed,Reddit,and BlogCatalog,as well as a custom topology dataset constructed from the Open University Learning Analytics Dataset(OULAD)educational platform.Results demonstrate that the proposed model achieves good classification accuracy,while significantly reducing training time through direct second-order neighbor fusion and data preprocessing.Moreover,analysis of neighborhood order reveals that considering third-order neighbors offers limited accuracy gains but introduces considerable computational overhead,confirming the efficiency of first-and second-order convolution in practical applications.Overall,the proposed graph coupling model offers a lightweight,interpretable,and effective framework for multi-label node classification in complex networks.
基金Natural Science Foundation of Hunan Province(2025JJ90031)Key Research and Development Program of Hunan Province of China(23A0273)Hunan Provincial Administration of Traditional Chinese Medicine(A2023048).
文摘Objective To develop a dual-branch deep learning framework for accurate multi-label classification of fundus diseases,addressing the key limitations of insufficient complementary feature extraction and inadequate cross-modal feature fusion in existing automated diagnostic methods.Methods The fundus multi-label classification dataset with 12 disease categories(FMLC-12)dataset was constructed by integrating complementary samples from Ocular Disease Intelligent Recognition(ODIR)and Retinal Fundus Multi-Disease Image Dataset(RFMiD),yielding 6936 fundus images across 12 retinal pathology categories,and the framework was validated on both FMLC-12 and ODIR.Inspired by the holistic multi-regional assessment principle of the Five Wheels theory in traditional Chinese medicine(TCM)ophthalmology,the dualbranch multi-label network(DBMNet)was developed as a novel framework integrating complementary visual feature extraction with pathological correlation modeling.The architecture employed a TransNeXt backbone within a dual-branch design:one branch processed redgreen-blue(RGB)images to capture color-dependent features,such as vascular patterns and lesion morphology,while the other processed grayscale-converted images to enhance subtle textural details and contrast variations.A feature interaction module(FIM)effectively integrated the multi-scale features from both branches.Comprehensive ablation studies were conducted to evaluate the contributions of the dual-branch architecture and the FIM.The performance of DBMNet was compared against four state-of-the-art methods,including EfficientNet Ensemble,transfer learning-based convolutional neural network(CNN),BFENet,and EyeDeep-Net,using mean average precision(mAP),F1-score,and Cohen's kappa coefficient.Results The dual-branch architecture improved mAP by 15.44 percentage points over the single-branch TransNeXt baseline,increasing from 34.41%to 44.24%,and the addition of FIM further boosted mAP to 49.85%.On FMLC-12,DBMNet achieved an mAP of 49.85%,a Cohen’s kappa coefficient of 62.14%,and an F1-score of 70.21%.Compared with BFENet(mAP:45.42%,kappa:46.64%,F1-score:71.34%),DBMNet outperformed it by 4.43 percentage points in mAP and 15.50 percentage points in kappa,while BFENet achieved a marginally higher F1-score.On ODIR,DBMNet achieved an F1-score of 85.50%,comparable to state-of-the-art methods.Conclusion DBMNet effectively integrates RGB and grayscale visual modalities through a dual-branch architecture,significantly improving multi-label fundus disease classification.The framework not only addresses the issue of insufficient feature fusion in existing methods but also demonstrates outstanding performance in balancing detection across both common and rare diseases,providing a promising and clinically applicable pathway for standardized,intelligent fundus disease classification.
基金funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2026R234)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Spam emails remain one of the most persistent threats to digital communication,necessitating effective detection solutions that safeguard both individuals and organisations.We propose a spam email classification frame-work that uses Bidirectional Encoder Representations from Transformers(BERT)for contextual feature extraction and a multiple-window Convolutional Neural Network(CNN)for classification.To identify semantic nuances in email content,BERT embeddings are used,and CNN filters extract discriminative n-gram patterns at various levels of detail,enabling accurate spam identification.The proposed model outperformed Word2Vec-based baselines on a sample of 5728 labelled emails,achieving an accuracy of 98.69%,AUC of 0.9981,F1 Score of 0.9724,and MCC of 0.9639.With a medium kernel size of(6,9)and compact multi-window CNN architectures,it improves performance.Cross-validation illustrates stability and generalization across folds.By balancing high recall with minimal false positives,our method provides a reliable and scalable solution for current spam detection in advanced deep learning.By combining contextual embedding and a neural architecture,this study develops a security analysis method.
文摘Multi-label feature selection(MFS)is a crucial dimensionality reduction technique aimed at identifying informative features associated with multiple labels.However,traditional centralized methods face significant challenges in privacy-sensitive and distributed settings,often neglecting label dependencies and suffering from low computational efficiency.To address these issues,we introduce a novel framework,Fed-MFSDHBCPSO—federated MFS via dual-layer hybrid breeding cooperative particle swarm optimization algorithm with manifold and sparsity regularization(DHBCPSO-MSR).Leveraging the federated learning paradigm,Fed-MFSDHBCPSO allows clients to perform local feature selection(FS)using DHBCPSO-MSR.Locally selected feature subsets are encrypted with differential privacy(DP)and transmitted to a central server,where they are securely aggregated and refined through secure multi-party computation(SMPC)until global convergence is achieved.Within each client,DHBCPSO-MSR employs a dual-layer FS strategy.The inner layer constructs sample and label similarity graphs,generates Laplacian matrices to capture the manifold structure between samples and labels,and applies L2,1-norm regularization to sparsify the feature subset,yielding an optimized feature weight matrix.The outer layer uses a hybrid breeding cooperative particle swarm optimization algorithm to further refine the feature weight matrix and identify the optimal feature subset.The updated weight matrix is then fed back to the inner layer for further optimization.Comprehensive experiments on multiple real-world multi-label datasets demonstrate that Fed-MFSDHBCPSO consistently outperforms both centralized and federated baseline methods across several key evaluation metrics.
文摘This study compares the relative efficacy of the continuation task and the model-as-feedbackwriting (MAFW) task in EFL writing development. Ninety intermediate-level Chinese EFL learnerswere randomly assigned to a continuation group, a MAFW group, and a control group, each with30 learners. A pretest and a posttest were used to gauge L2 writing development. Results showedthat the continuation task outperformed the MAFW task not only in enhancing the overall qualityof L2 writing, but also in promoting the quality of three components of L2 writing, namely, content,organization, and language. The finding has important implications for L2 writing teaching andlearning.
基金supported by the Key Project of Jiangsu Social Science Fund and the Key Project of Jiangsu Research Center for Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era(Grant No.26ZXZA017).
文摘China’s environmental governance strategy provides a distinctive pathway for integrating sustainable development into national policy.Understanding its policy trajectory is essential for assessing China’s contribution to global sustainable development and the United Nations Sustainable Development Goals(SDGs).This study constructs a comprehensive database of 425 national environmental governance policy documents issued between 1978 and 2022 and applies Latent Dirichlet Allocation(LDA)modeling to examine the evolution of policy themes and discourse.The results show that China’s environmental governance has undergone four stages-initial exploration,detailed development,transformative leap,and diverse prosperity-reflecting a progressive shift toward more integrated and coordinated governance.Policy priorities have evolved from a primary focus on pollution control and energy transition to an emphasis on institutional construction and organizational reform,thereby strengthening alignment with the SDGs.This transformation is characterized by recurring developmental themes and increasingly preventive,forward-looking,and system-oriented governance approaches.Moreover,the co-evolution of policy concepts and implementation has driven a transition from localized,end-of-pipe responses to comprehensive governance frameworks,alongside a shift from normative guidance towards effectiveness-oriented policy design.By employing a data-driven text analysis approach,this study offers a systematic framework for tracing long-term policy evolution and assessing its implications for sustainable development.
基金funded by China National Innovation and Entrepreneurship Project Fund Innovation Training Program(202410451009).
文摘With the rapid development of digital culture,a large number of cultural texts are presented in the form of digital and network.These texts have significant characteristics such as sparsity,real-time and non-standard expression,which bring serious challenges to traditional classification methods.In order to cope with the above problems,this paper proposes a new ASSC(ALBERT,SVD,Self-Attention and Cross-Entropy)-TextRCNN digital cultural text classification model.Based on the framework of TextRCNN,the Albert pre-training language model is introduced to improve the depth and accuracy of semantic embedding.Combined with the dual attention mechanism,the model’s ability to capture and model potential key information in short texts is strengthened.The Singular Value Decomposition(SVD)was used to replace the traditional Max pooling operation,which effectively reduced the feature loss rate and retained more key semantic information.The cross-entropy loss function was used to optimize the prediction results,making the model more robust in class distribution learning.The experimental results indicate that,in the digital cultural text classification task,as compared to the baseline model,the proposed ASSC-TextRCNN method achieves an 11.85%relative improvement in accuracy and an 11.97%relative increase in the F1 score.Meanwhile,the relative error rate decreases by 53.18%.This achievement not only validates the effectiveness and advanced nature of the proposed approach but also offers a novel technical route and methodological underpinnings for the intelligent analysis and dissemination of digital cultural texts.It holds great significance for promoting the in-depth exploration and value realization of digital culture.
文摘Text classification means to assign a document to one or more classes or categories according to content. Text classification provides convenience for users to obtain data. Because of the polysemy of text data, multi-label classification can handle text data more comprehensively. Multi-label text classification become the key problem in the data mining. To improve the performances of multi-label text classification, semantic analysis is embedded into the classification model to complete label correlation analysis, and the structure, objective function and optimization strategy of this model is designed. Then, the convolution neural network(CNN) model based on semantic embedding is introduced. In the end, Zhihu dataset is used for evaluation. The result shows that this model outperforms the related work in terms of recall and area under curve(AUC) metrics.