Current shipping, tourism, and resource development requirements call for more accurate predictions of Arctic sea-ice concentration (SIC). However, due to the complex physical processes involved, predicting the spatiotemporal distribution of Arctic SIC is more challenging than predicting its total extent. In this study, spatiotemporal prediction models for monthly Arctic SIC at 1- to 3-month leads are developed based on U-Net, an effective convolutional deep-learning approach. Based on explicit Arctic sea-ice-atmosphere interactions, 11 variables associated with Arctic sea-ice variations are selected as predictors, including observed Arctic SIC and atmospheric, oceanic, and heat flux variables at 1- to 3-month leads. Prediction skill for monthly Arctic SIC on the test set (January 2018 to December 2022) is evaluated using the mean absolute error (MAE) and binary accuracy (BA). The U-Net model had lower MAE and higher BA for Arctic SIC than two dynamic climate prediction systems (CFSv2 and NorCPM). An analysis of the relative importance of each predictor shows that prediction accuracy relies most on SIC at the 1-month lead, but on the surface net solar radiation flux at 2- to 3-month leads. However, dynamic models show limited prediction skill for surface net solar radiation flux and other physical processes, especially in autumn. Therefore, the U-Net model can capture the connections among these key physical processes associated with Arctic sea ice and thus offers a significant advantage in predicting Arctic SIC.
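The two scores named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: SIC values are assumed to be fractions in [0, 1] on a flattened grid, and binary accuracy is assumed to compare ice/no-ice classifications at a 0.15 (15%) concentration threshold, a common sea-ice-edge convention that the abstract does not state. The function names are hypothetical.

```python
def mean_absolute_error(pred, obs):
    """Average absolute difference between predicted and observed SIC."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def binary_accuracy(pred, obs, threshold=0.15):
    """Fraction of grid cells whose ice/no-ice class matches the observation.

    The 0.15 threshold is an assumption made for this sketch.
    """
    hits = sum((p >= threshold) == (o >= threshold) for p, o in zip(pred, obs))
    return hits / len(obs)


# Toy example: four grid cells, SIC as fractions.
pred = [0.0, 0.2, 0.9, 0.1]
obs = [0.0, 0.1, 1.0, 0.3]
mae = mean_absolute_error(pred, obs)
ba = binary_accuracy(pred, obs)
```

MAE rewards getting the concentration value right everywhere, while BA only checks whether each cell falls on the correct side of the ice edge, so the two scores can disagree.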
Mango is a crop of high economic value in the agricultural industry, so maximizing the productivity of mango plants is worthwhile, and artificial intelligence can support this. In this study, a lightweight object detection model is developed that detects mango plant conditions based on disease potential, serving as an early-warning system that can help increase agricultural productivity. The proposed lightweight model integrates YOLOv7-Tiny with a proposed module, the C2S module, which consists of three sub-modules: the convolutional block attention module (CBAM), the coordinate attention (CA) module, and the squeeze-and-excitation (SE) module. The dataset comprises eight classes: seven disease conditions and one healthy condition. Experimental results show that the proposed lightweight model performs best, improving mAP50 by 13.15% over the original YOLOv7-Tiny. Its mAP50:95 is also the highest among the compared models, including YOLOv3-Tiny, YOLOv4-Tiny, YOLOv5, and YOLOv7-Tiny. The advantage of the proposed lightweight model is its adaptability to constrained environments such as edge computing systems. The proposed model can support a robust, precise, and convenient precision agriculture system for the user.
The effective channeling of fluid flow by fractures is a liability for enhanced oil recovery (EOR) methods like CO₂ flooding or CO₂ storage. Developing a distributed fracture model to understand the heterogeneity of the fracture network is essential in characterizing tight and low-permeability reservoirs. In the Ordos Basin, the Chang 8-1-2 layer of the Yanchang Formation is a typical tight and low-permeability reservoir in the JH17 wellblock. The strong heterogeneity of distributed fractures and the differing fracture scales and types make it difficult to characterize the fracture distribution within the Chang 8-1-2 layer effectively. In this paper, multi-source and multi-attribute methods are used to integrate data into a neural network at different scales, and fuzzy logic control is used to judge the correlation of various attributes. The results suggest that the attribute correlation between coherence and fracture indication is the best, followed by the correlations with fault distance, north–south slope, and north–south curvature. Advantageous attributes from the target area are used to train the neural network, and the fracture density model and discrete fracture network (DFN) model are built at different scales. This method can effectively predict the distribution characteristics of fractures in the study area, and what the neural network learns from this case study can be applied to fracture network modeling for reservoirs of the same type.
This study compares the relative efficacy of the continuation task and the model-as-feedback writing (MAFW) task in EFL writing development. Ninety intermediate-level Chinese EFL learners were randomly assigned to a continuation group, a MAFW group, or a control group, each with 30 learners. A pretest and a posttest were used to gauge L2 writing development. Results showed that the continuation task outperformed the MAFW task not only in enhancing the overall quality of L2 writing, but also in promoting the quality of three components of L2 writing, namely content, organization, and language. The findings have important implications for L2 writing teaching and learning.
Objective: To develop QingNangTCM, a specialized large language model (LLM) tailored for expert-level traditional Chinese medicine (TCM) question-answering and clinical reasoning, addressing the scarcity of domain-specific corpora and specialized alignment. Methods: We constructed QnTCM_Dataset, a corpus of 100,000 entries, by integrating data from ShenNong_TCM_Dataset and SymMap v2.0 and synthesizing additional samples via retrieval-augmented generation (RAG) and persona-driven generation. The dataset covers diagnostic inquiries, prescriptions, and herbal knowledge. Using P-Tuning v2, we fine-tuned the GLM-4-9B-Chat backbone to develop QingNangTCM. A multidimensional evaluation framework assessing accuracy, coverage, consistency, safety, professionalism, and fluency was established using metrics such as bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), metric for evaluation of translation with explicit ordering (METEOR), and LLM-as-a-Judge with expert review. Qualitative analysis was conducted across four simulated clinical scenarios: symptom analysis, disease treatment, herb inquiry, and failure cases. Baseline models included GLM-4-9B-Chat, DeepSeek-V2, HuatuoGPT-II (7B), and GLM-4-9B-Chat (freeze-tuning). Results: QingNangTCM achieved the highest scores in BLEU-1/2/3/4 (0.425/0.298/0.137/0.064), ROUGE-1/2 (0.368/0.157), and METEOR (0.218), demonstrating a balanced and superior normalized performance profile of 0.900 across the dimensions of accuracy, coverage, and consistency. Although its ROUGE-L score (0.299) was lower than that of HuatuoGPT-II (7B) (0.351), it significantly outperformed domain-specific models in expert-validated win rates for professionalism (86%) and safety (73%). Qualitative analysis confirmed that the model strictly adheres to the "symptom-syndrome-pathogenesis-treatment" reasoning chain, though occasional misclassifications and hallucinations persisted when dealing with rare medicinal materials and uncommon syndromes.
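As a rough illustration of what a BLEU-1 score measures, the sketch below computes clipped unigram precision with a brevity penalty for one candidate answer against one reference. It assumes whitespace tokenization and a single reference, and the function name `bleu1` is hypothetical; the abstract does not say which BLEU implementation or tokenizer was used.

```python
import math
from collections import Counter


def bleu1(candidate, reference):
    """Clipped unigram precision times the brevity penalty (single reference)."""
    if not candidate:
        return 0.0
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    # Clip each candidate token count by its count in the reference,
    # so repeating a correct word cannot inflate the score.
    clipped = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(candidate)
    # Brevity penalty: candidates shorter than the reference are penalized.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * precision


repeated = bleu1("the the the cat".split(), "the cat sat".split())
short = bleu1("the cat".split(), "the cat sat".split())
```

Higher-order scores (BLEU-2 to BLEU-4) apply the same clipping idea to longer n-grams, which is one reason they are typically lower than BLEU-1 for free-form answers.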
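Monitoring and identifying the cuttings returned to the surface during drilling is a key means of sensing formation changes, detecting cavings promptly, and mitigating the risk of wellbore instability. Fast, objective, and automated cuttings identification is important for ensuring drilling safety and improving drilling efficiency. At present, cuttings identification relies mainly on manual, experience-based judgment, which is subjective, time-consuming, and labor-intensive. Based on cuttings images collected in the field, a cuttings identification model built on Segment Anything Model 2 (SAM2) and the KMeans clustering algorithm is proposed, achieving precise segmentation and automatic clustering of cuttings particles. In addition, an interactive selection function is designed that lets engineers quickly pick target cuttings pieces, significantly improving the efficiency of cuttings visualization and identification. Experimental results show that SAM2 performs well on the cuttings image segmentation task, with segmentation accuracy 3% to 6% higher than existing mainstream methods. In tests on real cuttings images from the SX well at Weiyuan, Sichuan, the model's clustering identification accuracy reached 83.9%, highly consistent with manual annotation. In an application to a typical well section, the model identified four main classes of cuttings, and the proportion of each class differed little from manual assessment. The results indicate that the proposed model can effectively separate cuttings pieces of different particle sizes and reasonably predict the proportion of each lithology, helping engineers quickly determine formation lithology and improving the objectivity and timeliness of drilling process monitoring.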
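The clustering step can be pictured with a small k-means sketch. It assumes each segmented cuttings piece has already been reduced to a numeric feature vector; the two features used here (equivalent diameter and mean gray level) and the function name are hypothetical, since the abstract does not list the features fed to KMeans. Initialization simply takes the first k points to keep the example deterministic, whereas library implementations typically use smarter seeding.

```python
import math


def kmeans(points, k, iters=50):
    """Plain k-means: alternate assignment and centroid update until stable."""
    # Deterministic initialization (a simplification for this sketch).
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical features per cuttings piece: (equivalent diameter, mean gray level),
# both rescaled to comparable ranges beforehand.
pieces = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
labels, centroids = kmeans(pieces, k=2)
```

Because k-means relies on Euclidean distance, features on very different scales should be normalized first, otherwise the largest-valued feature dominates the grouping.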
Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels from a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC's generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
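To see why a roughly 239:1 class ratio is hard, the short sketch below scores a trivial classifier that labels every protein as a non-ion-channel. The data are synthetic and the helper name is hypothetical; the point is only that plain accuracy looks excellent while recall on the rare class is zero, which is why imbalance-aware evaluation matters for this kind of task.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall on the positive class) for 0/1 labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


# Synthetic set: one ion channel (label 1) per 239 other proteins (label 0).
y_true = [1] + [0] * 239
always_negative = [0] * 240
acc, rec = accuracy_and_recall(y_true, always_negative)
```

Here the do-nothing baseline is right for 239 of 240 proteins yet finds no ion channels at all.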
The high porosity and tunable chemical functionality of metal-organic frameworks (MOFs) make them a promising catalyst design platform. High-throughput screening of catalytic performance is feasible because a large MOF structure database is available. In this study, we report a machine learning model for high-throughput screening of MOF catalysts for the CO₂ cycloaddition reaction. The descriptors for model training were chosen according to the reaction mechanism, which leads to an accuracy of up to 97% when the 75% quantile of the training set is used as the classification criterion. The feature contributions were further evaluated with SHAP and PDP analysis to provide some physical insight. A total of 12,415 hypothetical MOF structures and 100 reported MOFs were evaluated at 100 °C and 1 bar within one day using the model, and 239 potentially efficient catalysts were discovered. Among them, MOF-76(Y) achieved the top experimental performance among reported MOFs, in good agreement with the prediction.
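One plausible reading of "the 75% quantile of the training set as the classification criterion" is that a catalyst counts as high-performing when its target value exceeds the upper quartile of the training data. The sketch below illustrates that reading only; the target values are made up, the function name is hypothetical, and the abstract does not specify which performance quantity or quantile method was used.

```python
import statistics


def label_by_upper_quartile(train_values, values):
    """Label values 1 if they exceed the 75% quantile of the training values."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the 75% quantile.
    threshold = statistics.quantiles(train_values, n=4)[2]
    labels = [1 if v > threshold else 0 for v in values]
    return threshold, labels


# Made-up performance values for eight training structures.
train_performance = [10, 20, 30, 40, 50, 60, 70, 80]
threshold, labels = label_by_upper_quartile(train_performance, train_performance)
```

With this criterion about a quarter of the training structures fall in the positive class, so a classifier trained on these labels is effectively learning to flag the top performers.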
With the rapid development of Internet of Things technology, the sharp increase in network devices, together with their inherent security vulnerabilities, brings unprecedented challenges to network security, especially in identifying malicious attacks. However, because network traffic data are unevenly distributed, particularly between attack and normal traffic and between minority-class and majority-class attacks, traditional machine learning detection algorithms have significant limitations when dealing with sparse network traffic data. To tackle this challenge, we design a lightweight intrusion detection model based on diffusion mechanisms, named Diff-IDS, whose core objective is to make the model more efficient at parsing complex network traffic features, thereby significantly improving its detection speed and training efficiency. The model first filters network traffic features and converts them into grayscale images, and applies image flipping for data augmentation. These preprocessed images are then fed into a diffusion model based on the U-Net architecture for training. Once the model is trained, we fix the weights of the U-Net network and propose a feature enhancement algorithm based on feature masking to further boost the model's expressiveness. Finally, we devise an end-to-end lightweight detection strategy to streamline the model, enabling efficient lightweight detection of imbalanced samples. The method was tested in multiple experiments on well-known network intrusion detection benchmarks, including CICIDS 2017, KDD 99, and NSL-KDD. The experimental results indicate that Diff-IDS leads current state-of-the-art models in detection accuracy, training efficiency, and lightweight metrics, demonstrating strong detection capability and robustness.
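The preprocessing described in this abstract (features to grayscale image, then flipping) might look roughly like the sketch below. Everything specific here is an assumption made for illustration: min-max scaling to 0-255, zero padding up to the next square size, row-major layout, and a horizontal flip. The abstract does not describe the actual encoding, and the function names are hypothetical.

```python
import math


def features_to_gray_image(features):
    """Scale a flow's feature vector to 0-255 and lay it out as a square image."""
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0  # avoid division by zero for constant vectors
    pixels = [round(255 * (x - lo) / span) for x in features]
    side = math.isqrt(len(pixels) - 1) + 1  # smallest side with side*side >= len
    pixels += [0] * (side * side - len(pixels))  # zero padding (assumption)
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def flip_horizontal(image):
    """Mirror each row; used here as a simple augmentation."""
    return [row[::-1] for row in image]


image = features_to_gray_image([0.0, 10.0, 4.0, 2.0])
augmented = flip_horizontal(image)
```

Flipping doubles the number of training images per flow, which is one inexpensive way to add samples for under-represented attack classes.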
Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information for medical imaging and loses local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. First, a local multi-scale feature encoder is added to SAM to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce quadratic complexity and noise interference. Next, a cross-branch balancing adapter is designed to balance local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to mitigate overlapping in patch embeddings, the input image size is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates information from small medical images into SAM, enhancing its suitability for clinical deployment. Compared with eight other representative image segmentation models, the proposed model achieves average improvements ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
Background: Spinocerebellar ataxia type 2 (SCA2) is a neurodegenerative disease marked by significant clinical and genetic heterogeneity, primarily caused by expanded CAG repeats in the ATXN2 gene. The unstable expansion of CAG repeats disrupts the genetic stability of animal models, which is detrimental to disease research. Methods: In this study, we established a mouse model in which CAG repeats do not undergo microsatellite instability (MSI) across generations. A humanized ATXN2 cDNA with four CAA interruptions within 73 CAG expansions was inserted into the Rosa26 locus of C57BL/6J mice. A 23 CAG control mouse model was also generated to verify ATXN2 integration and expression. Results: In our model, the number of CAG repeats remained stable during transmission, with no CAG repeat expansion observed in 64 parent-to-offspring transmissions. Compared with SCA2-Q23 mice, SCA2-Q73 mice exhibited progressive motor impairment, reduced Purkinje cell count and volume (indicative of cell atrophy), and muscle atrophy. These observations suggest that the behavioral and neuropathological phenotypes of the mice may reflect the features of SCA2 patients. RNA-seq analysis of the gastrocnemius muscle in SCA2-Q73 mice showed significant changes in the expression of muscle differentiation and development genes at 56 weeks, with no significant differences at 16 weeks compared to SCA2-Q23 mice. The expression level of the Myf6 gene changed significantly in the muscles of aged mice. Conclusion: The establishment of this model not only provides a stable animal model for studying CAG transmission in SCA2 but also indicates that the lack of long-term neural stimulation leads to muscle atrophy.
Background: New variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continue to drive global epidemics and pose significant health risks. The pathogenicity of these variants evolves under immune pressure and host factors. Understanding these changes is crucial for epidemic control and variant research. Methods: Human angiotensin-converting enzyme 2 (hACE2) transgenic mice were intranasally challenged with the original strain WH-09 and the variants Delta, Beta, and Omicron BA.1, while BALB/c mice were challenged with Omicron subvariants BA.5, BF.7, and XBB.1. To compare pathogenicity among variants, we conducted a comprehensive analysis that included clinical symptom observation, measurement of viral loads in the trachea and lungs, evaluation of pulmonary pathology, analysis of immune cell infiltration, and quantification of cytokine levels. Results: In hACE2 mice, the Beta variant caused significant weight loss, severe lung inflammation, increased inflammatory and chemotactic factor secretion, greater macrophage and neutrophil infiltration in the lungs, and higher viral loads with prolonged shedding duration. In contrast, BA.1 showed a significant reduction in pathogenicity. The BA.5, BF.7, and XBB.1 variants were less pathogenic than the WH-09, Beta, and Delta variants when infecting BALB/c mice. This was evidenced by reduced weight loss, diminished pulmonary pathology, decreased secretion of inflammatory factors and chemokines, reduced macrophage and neutrophil infiltration, and lower viral loads in both the trachea and lungs. Conclusion: In hACE2 mice, the Omicron variant demonstrated the lowest pathogenicity, while the Beta variant exhibited the highest. The pathogenicity of the Delta variant was comparable to that of the original WH-09 strain. In BALB/c mice, Omicron subvariants BA.5, BF.7, and XBB.1 showed no statistically significant differences in virulence.
Cyberbullying on social media poses significant psychological risks, yet most detection systems oversimplify the task by focusing on binary classification, ignoring nuanced categories like passive-aggressive remarks or indirect slurs. To address this gap, we propose a hybrid framework combining Term Frequency-Inverse Document Frequency (TF-IDF), word-to-vector (Word2Vec), and Bidirectional Encoder Representations from Transformers (BERT) based models for multi-class cyberbullying detection. Our approach integrates TF-IDF for lexical specificity and Word2Vec for semantic relationships, fused with BERT's contextual embeddings to capture syntactic and semantic complexities. We evaluate the framework on a publicly available dataset of 47,000 annotated social media posts across five cyberbullying categories: age, ethnicity, gender, religion, and indirect aggression. Among the BERT variants tested, BERT Base Uncased achieved the highest performance, with 93% accuracy (standard deviation ±1% across 5-fold cross-validation) and an average AUC of 0.96, outperforming standalone TF-IDF (78%) and Word2Vec (82%) models. Notably, it achieved near-perfect AUC scores (0.99) for age- and ethnicity-based bullying. A comparative analysis with state-of-the-art benchmarks, including Generative Pre-trained Transformer 2 (GPT-2) and Text-to-Text Transfer Transformer (T5) models, highlights BERT's superiority in handling ambiguous language. This work advances cyberbullying detection by demonstrating how hybrid feature extraction and transformer models improve multi-class classification, offering a scalable solution for moderating nuanced harmful content.
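The "lexical specificity" that TF-IDF contributes can be illustrated with a small from-scratch sketch. It uses the plain, unsmoothed textbook weighting (term frequency times log of inverse document frequency) on pre-tokenized toy posts; the authors' actual tokenization, smoothing, and library are not stated in the abstract, and the function name is hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using unsmoothed TF-IDF."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Two toy posts, already lowercased and tokenized.
docs = [["you", "are", "old"], ["you", "are", "kind"]]
scores = tfidf(docs)
```

Words shared by every post ("you", "are") get weight zero, while the distinguishing word in each post keeps a positive weight, which is the signal a downstream classifier can combine with semantic and contextual embeddings.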
The identification and optimization of mutations in nanobodies are crucial for enhancing their therapeutic potential in disease prevention and control. However, this process is often complex and time-consuming, which limits its widespread application in practice. In this study, we developed a workflow, named Evolutionary-Nanobody (EvoNB), to predict key mutation sites of nanobodies by combining protein language models (PLMs) and molecular dynamics (MD) simulations. Fine-tuning the ESM2 model on a large-scale nanobody dataset significantly enhanced the ability of EvoNB to capture nanobody-specific sequence features. The fine-tuned EvoNB model demonstrated higher predictive accuracy in both the conserved framework regions and the highly variable complementarity-determining regions of nanobodies. Additionally, we selected four widely representative nanobody–antigen complexes to verify the predicted effects of mutations. MD simulations were used to analyze the energy changes caused by these mutations and to predict their impact on binding affinity to the targets. The results showed that multiple mutations screened by EvoNB significantly enhanced the binding affinity between the nanobody and its target, further validating the potential of this workflow for designing and optimizing nanobody mutations. Additionally, sequence-based predictions are generally less affected by missing structural information, allowing them to be integrated more easily with structure prediction tools such as AlphaFold 3. Through mutation prediction and systematic analysis of key sites, the most promising variants can be identified quickly for experimental validation without relying on traditional evolutionary or selection processes. The EvoNB workflow provides an effective tool for the rapid optimization of nanobodies and facilitates the application of PLMs in the biomedical field.
Funding (Arctic SIC prediction study): Supported by the National Key Research and Development Program of China [grant number 2022YFE0106800]; an Innovation Group Project of the Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) [grant number 311024001]; a project supported by the Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) [grant number SML2023SP209]; a Research Council of Norway funded project (MAPARC) [grant number 328943]; the Nansen Center's basic institutional funding [grant number 342624]; and high-performance computing support from the School of Atmospheric Science at Sun Yat-sen University.
Funding (mango disease detection study): Supported by the National Science and Technology Council (NSTC), Taiwan, Grant No. NSTC 113-2221-E-167-023.
Funding (fracture network modeling study): Supported by the National Science and Technology Project of China (No. 2024ZD1004300).
Funding (QingNangTCM study): Hebei Province Higher Education Scientific Research Project (QN2025367); Zhangjiakou City 2022 Municipal Science and Technology Plan Self-raised Fund Project (221105D); Hebei Province Education Science "14th Five-Year Plan" Project (2404224).
Funding: Funded by grants from the National Key Research and Development Program of China (Grant Nos.: 2022YFE0205600 and 2022YFC3400504), the National Natural Science Foundation of China (Grant Nos.: 82373792 and 82273857), the Fundamental Research Funds for the Central Universities, China, and the East China Normal University Medicine and Health Joint Fund, China (Grant No.: 2022JKXYD07001).
Abstract: Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels from a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC’s generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
Funding: Financial support from the National Key Research and Development Program of China (2021YFB3501501), the National Natural Science Foundation of China (Nos. 22225803, 22038001, 22108007 and 22278011), Beijing Natural Science Foundation (No. Z230023), and Beijing Science and Technology Commission (No. Z211100004321001).
Abstract: The high porosity and tunable chemical functionality of metal-organic frameworks (MOFs) make them a promising catalyst design platform. High-throughput screening of catalytic performance is feasible because large MOF structure databases are available. In this study, we report a machine learning model for high-throughput screening of MOF catalysts for the CO₂ cycloaddition reaction. The descriptors for model training were judiciously chosen according to the reaction mechanism, yielding an accuracy of up to 97% when the 75% quantile of the training set is used as the classification criterion. The feature contribution was further evaluated with SHAP and PDP analysis to provide some physical understanding. Using the model, 12,415 hypothetical MOF structures and 100 reported MOFs were evaluated at 100 ℃ and 1 bar within one day, and 239 potentially efficient catalysts were discovered. Among them, MOF-76(Y) achieved the top experimental performance among reported MOFs, in good agreement with the prediction.
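One plausible reading of "the 75% quantile of the training set as the classification criterion" is that a continuous performance value is turned into a binary label by thresholding at the 75th percentile. The sketch below illustrates that reading only. The function name, the example values, and the choice of the inclusive quantile method are all assumptions rather than details from the abstract.

```python
from statistics import quantiles


def label_by_upper_quartile(values):
    """Return (cutoff, labels), labelling values at or above the
    75th percentile as 1 ("efficient") and all others as 0.

    Assumption: the cutoff uses the 'inclusive' quantile method
    (linear interpolation between sorted sample points).
    """
    # quantiles(..., n=4) returns the three quartile cut points.
    cutoff = quantiles(values, n=4, method="inclusive")[2]
    labels = [1 if v >= cutoff else 0 for v in values]
    return cutoff, labels


# Hypothetical catalytic yields (%), not data from the paper.
yields = [10, 20, 30, 40, 50, 60, 70, 80]
cutoff, labels = label_by_upper_quartile(yields)
```

A threshold at the upper quartile makes the positive class a minority by construction (roughly one sample in four), which is worth keeping in mind when reading a headline accuracy figure.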
Funding: Supported by the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2024GXJS014, ZDYF2023GXJS163), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024), and the Collaborative Innovation Project of Hainan University (XTCX2022XXB02).
Abstract: With the rapid development of Internet of Things technology, the sharp increase in network devices and their inherent security vulnerabilities present a stark contrast, bringing unprecedented challenges to the field of network security, especially in identifying malicious attacks. However, due to the uneven distribution of network traffic data, particularly the imbalance between attack traffic and normal traffic, as well as the imbalance between minority class attacks and majority class attacks, traditional machine learning detection algorithms have significant limitations when dealing with sparse network traffic data. To effectively tackle this challenge, we have designed a lightweight intrusion detection model based on diffusion mechanisms, named Diff-IDS, with the core objective of enhancing the model’s efficiency in parsing complex network traffic features, thereby significantly improving its detection speed and training efficiency. The model begins by finely filtering network traffic features and converting them into grayscale images, while also employing image-flipping techniques for data augmentation. Subsequently, these preprocessed images are fed into a diffusion model based on the Unet architecture for training. Once the model is trained, we fix the weights of the Unet network and propose a feature enhancement algorithm based on feature masking to further boost the model’s expressiveness. Finally, we devise an end-to-end lightweight detection strategy to streamline the model, enabling efficient lightweight detection of imbalanced samples. Our method has been subjected to multiple experimental tests on renowned network intrusion detection benchmarks, including CICIDS 2017, KDD 99, and NSL-KDD. The experimental results indicate that Diff-IDS leads in terms of detection accuracy, training efficiency, and lightweight metrics compared to the current state-of-the-art models, demonstrating exceptional detection capabilities and robustness.
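The preprocessing step described in this abstract (traffic features converted to grayscale images, then flipped for augmentation) can be sketched roughly as follows. This is not the Diff-IDS implementation: the per-sample min-max scaling, the zero padding to a square, and both function names are assumptions, since the abstract does not give these details.

```python
import math


def features_to_image(features):
    """Map a flat feature vector to a square grayscale image (list of rows).

    Assumptions: values are min-max scaled per sample to 0..255, and the
    vector is zero-padded up to the next perfect square.
    """
    if not features:
        raise ValueError("features must be non-empty")
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0  # avoid division by zero for constant vectors
    pixels = [round(255 * (x - lo) / span) for x in features]
    side = math.isqrt(len(pixels) - 1) + 1  # ceil(sqrt(n)) for n >= 1
    pixels += [0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def flip_horizontal(image):
    """Mirror each row, giving one extra augmented sample."""
    return [row[::-1] for row in image]


image = features_to_image([0.0, 2.0, 4.0, 10.0])
augmented = flip_horizontal(image)
```

In practice the scaling is more often fitted per feature across the whole training set, so that a given pixel intensity means the same thing in every image; the per-sample version here only keeps the sketch self-contained.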
Funding: Supported by the Natural Science Foundation Programme of Gansu Province (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and the Gansu Provincial Science and Technology Plan Key Research and Development Program Project (No. 24YFFA024).
Abstract: Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address the issues above. Firstly, based on the SAM, a local multi-scale feature encoder is introduced to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce the quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to obtain a smaller input image size and to mitigate overlapping in patch embeddings, the size of the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. The proposed model demonstrates an average enhancement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets compared to eight other representative image segmentation models. This significantly enhances the performance of the SAM on medical images, providing clinicians with a powerful tool in clinical diagnosis.
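To give a sense of what shrinking the input from 1024×1024 to 256×256 pixels means for a ViT encoder, the short calculation below counts patch tokens at both sizes. The 16-pixel patch size is an assumption (the abstract does not state it), and the quadratic cost comparison applies to global self-attention only.

```python
def patch_tokens(side: int, patch: int = 16) -> int:
    """Number of non-overlapping patch tokens for a square image.

    Assumption: square patches of `patch` pixels that tile the image exactly.
    """
    if side % patch:
        raise ValueError("image side must be a multiple of the patch size")
    return (side // patch) ** 2


tokens_large = patch_tokens(1024)   # 64 x 64 grid of patches
tokens_small = patch_tokens(256)    # 16 x 16 grid of patches
# Global self-attention compares every token with every other token,
# so its cost scales with the square of the token count.
attention_ratio = (tokens_large / tokens_small) ** 2
```

Under these assumptions the token count drops by a factor of 16 and the pairwise attention cost by a factor of 256.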
Funding: CAMS Innovation Fund for Medical Sciences, Grant/Award Number: CIFMS, 2021-I2M-1-024; The Joint Fund for the Department of Science and Technology of Yunnan Province-Kunming Medical University, Grant/Award Number: 202201AY070001-007; Open Research Fund Project of Yunnan Provincial Key Laboratory of Pharmacology of Natural Medicines, Grant/Award Number: YKLPNP-G2403; The Science and Technology Leading Talent Program of Yunnan Province, Grant/Award Number: 202405AB350002.
Abstract: Background: Spinocerebellar ataxia type 2 (SCA2) is a neurodegenerative disease marked by significant clinical and genetic heterogeneity, primarily caused by expanded CAG mutations in the ATXN2 gene. The unstable expansion of CAG repeats disrupts the genetic stability of animal models, which is detrimental to disease research. Methods: In this study, we established a mouse model in which CAG repeats do not undergo microsatellite instability (MSI) across generations. A humanized ATXN2 cDNA with four CAA interruptions within 73 CAG expansions was inserted into the Rosa26 locus of C57BL/6J mice. A 23 CAG control mouse model was also generated to verify ATXN2 integration and expression. Results: In our model, the number of CAG repeats remained stable during transmission, with no CAG repeat expansion observed in 64 parent-to-offspring transmissions. Compared with SCA2-Q23 mice, SCA2-Q73 mice exhibited progressive motor impairment, reduced Purkinje cell count and volume (indicative of cell atrophy), and muscle atrophy. These observations in the mice suggest that the behavioral and neuropathological phenotypes may reflect the features of SCA2 patients. RNA-seq analysis of the gastrocnemius muscle in SCA2-Q73 mice showed significant changes in muscle differentiation and development gene expression at 56 weeks, with no significant differences at 16 weeks compared to SCA2-Q23 mice. The expression level of the Myf6 gene significantly changed in the muscles of aged mice. Conclusion: In summary, the establishment of this model not only provides a stable animal model for studying CAG transmission in SCA2 but also indicates that the lack of long-term neural stimulation leads to muscle atrophy.
Funding: National Science and Technology Infrastructure of China, Grant/Award Number: National Pathogen Resource Center-NPRC-32; National Key Research and Development Program of China, Grant/Award Number: 2023YFF0724800; CAMS Innovation Fund for Medical Sciences, Grant/Award Number: 2021-I2M-1-035.
Abstract: Background: New variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continue to drive global epidemics and pose significant health risks. The pathogenicity of these variants evolves under immune pressure and host factors. Understanding these changes is crucial for epidemic control and variant research. Methods: Human angiotensin-converting enzyme 2 (hACE2) transgenic mice were intranasally challenged with the original strain WH-09 and the variants Delta, Beta, and Omicron BA.1, while BALB/c mice were challenged with Omicron subvariants BA.5, BF.7, and XBB.1. To compare the pathogenicity differences among variants, we conducted a comprehensive analysis that included clinical symptom observation, measurement of viral loads in the trachea and lungs, evaluation of pulmonary pathology, analysis of immune cell infiltration, and quantification of cytokine levels. Results: In hACE2 mice, the Beta variant caused significant weight loss, severe lung inflammation, increased inflammatory and chemotactic factor secretion, greater macrophage and neutrophil infiltration in the lungs, and higher viral loads with prolonged shedding duration. In contrast, BA.1 showed a significant reduction in pathogenicity. The BA.5, BF.7, and XBB.1 variants were less pathogenic than the WH-09, Beta, and Delta variants when infected in BALB/c mice. This was evidenced by reduced weight loss, diminished pulmonary pathology, decreased secretion of inflammatory factors and chemokines, reduced macrophage and neutrophil infiltration, as well as lower viral loads in both the trachea and lungs. Conclusion: In hACE2 mice, the Omicron variant demonstrated the lowest pathogenicity, while the Beta variant exhibited the highest. Pathogenicity of the Delta variant was comparable to the original WH-09 strain. Among BALB/c mice, Omicron subvariants BA.5, BF.7, and XBB.1 showed no statistically significant differences in virulence.
Funding: Funded by the Scientific Research Deanship at the University of Hail, Saudi Arabia, through Project Number RG-23092.
Abstract: Cyberbullying on social media poses significant psychological risks, yet most detection systems over-simplify the task by focusing on binary classification, ignoring nuanced categories like passive-aggressive remarks or indirect slurs. To address this gap, we propose a hybrid framework combining Term Frequency-Inverse Document Frequency (TF-IDF), word-to-vector (Word2Vec), and Bidirectional Encoder Representations from Transformers (BERT)-based models for multi-class cyberbullying detection. Our approach integrates TF-IDF for lexical specificity and Word2Vec for semantic relationships, fused with BERT’s contextual embeddings to capture syntactic and semantic complexities. We evaluate the framework on a publicly available dataset of 47,000 annotated social media posts across five cyberbullying categories: age, ethnicity, gender, religion, and indirect aggression. Among the BERT variants tested, BERT Base Un-Cased achieved the highest performance with 93% accuracy (standard deviation ±1% across 5-fold cross-validation) and an average AUC of 0.96, outperforming standalone TF-IDF (78%) and Word2Vec (82%) models. Notably, it achieved near-perfect AUC scores (0.99) for age and ethnicity-based bullying. A comparative analysis with state-of-the-art benchmarks, including Generative Pre-trained Transformer 2 (GPT-2) and Text-to-Text Transfer Transformer (T5) models, highlights BERT’s superiority in handling ambiguous language. This work advances cyberbullying detection by demonstrating how hybrid feature extraction and transformer models improve multi-class classification, offering a scalable solution for moderating nuanced harmful content.
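The "lexical specificity" that TF-IDF contributes to this hybrid framework can be seen in a small worked sketch. The version below uses the textbook weighting (relative term frequency multiplied by the log of inverse document frequency); the function name, the whitespace tokenizer, and the example posts are assumptions, and the abstract does not say which TF-IDF variant the authors used.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumptions: lowercase whitespace tokenization, tf = count / doc length,
    idf = ln(N / df) with no smoothing.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append({
            term: (c / len(doc)) * math.log(n_docs / df[term])
            for term, c in counts.items()
        })
    return vectors


# Hypothetical posts, not taken from the dataset in the abstract.
vectors = tfidf(["you are old", "you are kind"])
```

Terms that occur in every post ("you", "are") receive weight zero, while the distinguishing words ("old", "kind") carry all the weight. This is the lexical signal that a fusion model could combine with Word2Vec and BERT embeddings.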
Funding: Supported by the National Natural Science Foundation of China (Grant Nos.: 92477103, 22273023, 12474285 and 22373116), the National Key R&D Program of China (Grant No.: 2019YFA0905200), Shanghai Municipal Natural Science Foundation (Grant No.: 23ZR1418200), Natural Science Foundation of Chongqing, China (Grant No.: CSTB2023NSCQ-MSX0616), Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, Shanghai Future Discipline Program (Quantum Science and Technology), Shanghai Municipal Education Commission’s “Artificial Intelligence-Driven Research Paradigm Reform and Discipline Advancement Program”, and the Fundamental Research Funds for the Central Universities.
Abstract: The identification and optimization of mutations in nanobodies are crucial for enhancing their therapeutic potential in disease prevention and control. However, this process is often complex and time-consuming, which limits its widespread application in practice. In this study, we developed a workflow, named Evolutionary-Nanobody (EvoNB), to predict key mutation sites of nanobodies by combining protein language models (PLMs) and molecular dynamics (MD) simulations. By fine-tuning the ESM2 model on a large-scale nanobody dataset, the ability of EvoNB to capture specific sequence features of nanobodies was significantly enhanced. The fine-tuned EvoNB model demonstrated higher predictive accuracy in the conserved framework and highly variable complementarity-determining regions of nanobodies. Additionally, we selected four widely representative nanobody-antigen complexes to verify the predicted effects of mutations. MD simulations analyzed the energy changes caused by these mutations to predict their impact on binding affinity to the targets. The results showed that multiple mutations screened by EvoNB significantly enhanced the binding affinity between the nanobody and its target, further validating the potential of this workflow for designing and optimizing nanobody mutations. Additionally, sequence-based predictions are generally less dependent on structural absence, allowing them to be more easily integrated with tools for structural predictions, such as AlphaFold 3. Through mutation prediction and systematic analysis of key sites, we can quickly predict the most promising variants for experimental validation without relying on traditional evolutionary or selection processes. The EvoNB workflow provides an effective tool for the rapid optimization of nanobodies and facilitates the application of PLMs in the biomedical field.