Abstract: Although Named Entity Recognition (NER) in cybersecurity has historically concentrated on threat intelligence, vital security data can be found in a variety of sources, such as open-source intelligence and unprocessed tool outputs. When dealing with technical language, the coexistence of structured and unstructured data poses serious issues for traditional BERT-based techniques. We introduce a three-phase approach for improved NER in multi-source cybersecurity data that makes use of large language models (LLMs). To ensure thorough entity coverage, our method starts with an identification module that uses dynamic prompting techniques. To lessen hallucinations, the extraction module uses confidence-based self-assessment and cross-checking using regex validation. The tagging module links to knowledge bases for contextual validation and uses SecureBERT in conjunction with conditional random fields to detect entity boundaries precisely. Our framework creates efficient natural language segments by utilizing decoder-based LLMs with 10B parameters. When compared to baseline SecureBERT implementations, evaluation across four cybersecurity data sources shows notable gains, with a 9.4%–25.21% greater recall and a 6.38%–17.3% better F1-score. Our refined model matches larger models and achieves a 2.6%–4.9% better F1-score for technical phrase recognition than the state-of-the-art alternatives Claude 3.5 Sonnet, Llama3-8B, and Mixtral-7B. The three-stage identification-extraction-tagging pipeline tackles important cybersecurity NER issues. Through effective architectures, these developments preserve deployability while setting a new standard for entity extraction in challenging security scenarios. The findings show how specific enhancements in hybrid recognition, validation procedures, and prompt engineering raise NER performance above monolithic LLM approaches in cybersecurity applications, especially for technical entity extraction from heterogeneous sources where conventional techniques fall short. Because of its modular nature, the framework can be upgraded at the component level as new methods are developed.
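The extraction step described in the abstract above (confidence-based self-assessment cross-checked by regex validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the entity labels, the regex patterns, the `validate` function, and the 0.5 threshold are all assumptions chosen for illustration, and the check that a candidate occurs verbatim in the source text is a simple stand-in for hallucination filtering.

```python
import re

# Hypothetical patterns: the abstract does not list entity types or regexes.
PATTERNS = {
    "IPV4": re.compile(
        r"(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)"
    ),
    "CVE": re.compile(r"CVE-\d{4}-\d{4,}"),
}


def validate(candidates, source_text, min_confidence=0.5):
    """Filter (text, label, confidence) candidates proposed by an LLM.

    A candidate is kept only if its self-assessed confidence meets the
    threshold, its text occurs verbatim in the source, and, when a regex
    exists for its label, the text fully matches that regex.
    """
    kept = []
    for text, label, confidence in candidates:
        if confidence < min_confidence:
            continue  # the model itself was unsure
        if text not in source_text:
            continue  # not grounded in the input, likely hallucinated
        pattern = PATTERNS.get(label)
        if pattern is not None and not pattern.fullmatch(text):
            continue  # structured entity with an invalid format
        kept.append((text, label, confidence))
    return kept


source = "Host 10.0.0.5 exploited CVE-2021-44228 via log4j."
candidates = [
    ("10.0.0.5", "IPV4", 0.9),
    ("CVE-2021-44228", "CVE", 0.8),
    ("CVE-2020-0001", "CVE", 0.95),  # absent from the source text
    ("10.0.0", "IPV4", 0.9),         # present, but not a full IPv4 address
    ("log4j", "TOOL", 0.3),          # below the confidence threshold
]
result = validate(candidates, source)
```

In this sketch, labels without a regex (such as the hypothetical `TOOL`) pass on confidence and grounding alone, which reflects the idea that regex checks only apply to structured entity types.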
Funding: Sichuan Provincial Science and Technology Support Program, Grant/Award Number: 2023YFG0151; National Natural Science Foundation of China, Grant/Award Numbers: U22B2061, U2336204.
Abstract: Existing Chinese named entity recognition (NER) research utilises 1D lexicon-based sequence labelling frameworks, which can only recognise flat entities. While lexicons serve as prior knowledge and enhance semantic information, they also impose limitations in completeness and resource requirements. This paper proposes a template-based classification (TC) model to avoid lexicon issues and to identify nested entities. Template-based classification provides a template word for each entity type and utilises contrastive learning to integrate the common characteristics of entities in the same category. Contrastive learning makes template words the centre points of their category in the vector space, thus improving generalisation ability. Additionally, TC presents a 2D table-filling label scheme that classifies entities based on the attention distribution of template words. The proposed novel decoder algorithm enables TC to recognise both flat and nested entities simultaneously. Experimental results show that TC achieves state-of-the-art performance on five Chinese datasets.
Abstract: Background: Platinum chemotherapy (CT) remains the backbone of systemic therapy for patients with small-cell lung cancer (SCLC). The nucleotide excision repair (NER) pathway plays a central role in the repair of the DNA damage exerted by platinum agents. Alteration in this repair mechanism may affect patients' survival. Materials and Methods: We conducted a retrospective analysis of data from 38 patients with extensive disease (ED)-SCLC who underwent platinum-CT at the Clinical Oncology Unit, Careggi University Hospital, Florence (Italy), from 2015 to 2020. mRNA expression analysis and single nucleotide polymorphism (SNP) characterization of three NER pathway genes (ERCC1, ERCC2, and ERCC5) were performed on patient tumor samples. Results: Overall, elevated expression of ERCC genes was observed in SCLC patients compared to healthy controls. Patients with low ERCC1 and ERCC5 expression levels exhibited a better median progression-free survival (mPFS = 7.1 vs. 4.9 months, p = 0.39 for ERCC1 and mPFS = 6.9 vs. 4.8 months, p = 0.093 for ERCC5) and overall survival (mOS = 8.7 vs. 6.0 months, p = 0.4 for ERCC1 and mOS = 7.2 vs. 6.2 months, p = 0.13 for ERCC5). Genotyping analysis of five SNPs of ERCC genes showed a longer survival in patients harboring the wild-type genotype or the heterozygous variant of the ERCC1 rs11615 SNP (p = 0.24 for PFS and p = 0.14 for OS) and of the rs13181 and rs1799793 ERCC2 SNPs (p = 0.43 and p = 0.26 for PFS and p = 0.21 and p = 0.16 for OS, respectively) compared to patients with homozygous mutant genotypes. Conclusions: The comprehensive analysis of ERCC gene expression and SNP variants appears to identify patients who derive greater survival benefits from platinum-CT.
Funding: Supported by the Foundation of Gansu Technology Committee (GKC-97-27-5) and the Youth Foundation of Tianshui Normal University (X4-25).
Abstract: [Objective] The aim was to study blood protein polymorphism in the Tibetan Mastiff and to provide a theoretical basis for the protection, reasonable development, and utilization of Tibetan Mastiff varieties. [Method] A total of 103 blood samples were taken from four populations: Hequ Tibetan Mastiff, Qinghai Tibetan Mastiff, Tibetan Spaniel, and native dogs of Qinghai. Seven blood protein loci (Tf, Po, Sα2, Hb, Alb, Pr, and Amy) were investigated using vertical polyacrylamide gel electrophoresis with a discontinuous buffer system. The genetic variation among the populations was then analyzed. [Result] Genetic variation was observed at Tf, Sα2, and Po in all four populations; the other loci were not polymorphic. There were three alleles at each of the Tf and Po loci and two alleles at the Sα2 locus. The effective number of alleles and Nei's average expected heterozygosity were 1.5324 and 0.2303, respectively, both higher in Tibetan Mastiff than in the other populations. [Conclusion] Genetic variation exists at blood protein loci in the Tibetan Mastiff.
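The two statistics reported in the abstract above have standard single-locus definitions: for allele frequencies p_i, the effective number of alleles is 1 / Σ p_i² and the expected heterozygosity is 1 − Σ p_i². The sketch below computes both in Python. The function names and the allele frequencies are hypothetical (the abstract gives no frequencies), and Nei's small-sample correction is omitted, so this is an illustration of the formulas rather than a reproduction of the study's calculation.

```python
def effective_alleles(freqs):
    """Effective number of alleles at one locus: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2).

    The small-sample correction factor is deliberately left out.
    """
    return 1.0 - sum(p * p for p in freqs)


def mean(values):
    """Arithmetic mean, used to average a statistic over loci."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical allele frequencies for three loci (not data from the study).
loci = {
    "polymorphic_two_alleles": [0.5, 0.5],
    "polymorphic_three_alleles": [0.5, 0.25, 0.25],
    "monomorphic": [1.0],
}

avg_ne = mean(effective_alleles(f) for f in loci.values())
avg_he = mean(expected_heterozygosity(f) for f in loci.values())
```

A monomorphic locus contributes an effective allele number of 1 and a heterozygosity of 0, which is why averages over several loci, most of them monomorphic, can be low even when a few loci are clearly polymorphic.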
Funding: Funded by the Advanced Research Project (30209040702).
Abstract: Named Entity Recognition (NER) is vital in natural language processing for the analysis of news texts, as it accurately identifies entities such as locations, persons, and organizations, which is crucial for applications like news summarization and event tracking. However, NER in the news domain faces challenges due to insufficient annotated data, complex entity structures, and strong context dependencies. To address these issues, we propose a new Chinese named entity recognition method that integrates transfer learning with word embeddings. Our approach leverages the ERNIE pre-trained model for transfer learning to obtain general language representations and incorporates the Soft-lexicon word embedding technique to handle varied entity structures. This dual strategy enhances the model's understanding of context and boosts its ability to process complex texts. Experimental results show that our method achieves an F1 score of 94.72% on a news dataset, surpassing baseline methods by 3%–4%, thereby confirming its effectiveness for Chinese named entity recognition in the news domain.
Funding: Supported by the Science and Technology Development Foundation of South China Sea Bureau, Ministry of Natural Resources, China (No. 230204); the National Program on Global Change and Air-Sea Interaction (No. GASI-02-IND-CJ04); the Natural Science Foundation of Guangdong Province, China (No. 2021A1515012589); and the Key Technologies Research and Development Program of Guangzhou, Guangdong Province, China (No. 2023B03J1379).
Abstract: The Ninety-East Ridge (NER) is located in the semioceanic to oceanic region of the southern Bengal Fan in the Northeast Indian Ocean. Its sedimentary environment, ocean currents, and scientific issues related to climate change have long been a focus of research. To better understand the sedimentary environment of this sea area, we studied the modern sedimentary environment of the NER, and the mechanism of its formation, by analyzing redox-sensitive trace elements (RSEs) and biomarkers in surface sediments from the northern region and both sides of the NER. The ratios of Mo/U (average 2.22) and (Cu+Mo)/Zn (average 1.51), together with δCe < 1 in the sediment samples, all indicate a reducing sedimentary environment. In addition, the ratios of pristane (Pr) to phytane (Ph), C30 diahopane to C30 hopane, and diasterane to sterane were low in all samples, averaging 1.03, 0.9, and 0.33, respectively. The analysis of RSE and biomarker data revealed that the sedimentary environment on the seabed of the NER is generally a rare low-oxygen reducing environment. Through the analysis of sediment characteristics, material sources, and ocean currents, we preliminarily constructed a genetic model for the low-oxygen reducing environment of surface sediments in the NER. We believe that this low-oxygen reducing environment could be influenced by multiple factors, such as terrestrial input of materials, sea-surface productivity, and sediment particle size.
Abstract: DNA damage refers to the permanent alteration of nucleotide sequences during DNA replication, leading to modifications in genetic characteristics. Cells can rectify the majority of such damage through DNA damage repair (DDR) mechanisms, including base excision repair (BER), nucleotide excision repair (NER), mismatch repair (MMR), homologous recombination (HR), canonical non-homologous end joining (NHEJ), and alternative non-homologous end joining (alt-NHEJ), thereby maintaining genomic stability [1]. Extrachromosomal DNA (ecDNA) refers to circular DNA molecules existing outside chromosomes, which have been demonstrated to play a critical role in tumor progression and evolution [2]. EcDNA has been considered a marker of genomic instability, as ecDNA-positive tumors have been found to exhibit elevated DNA replication stress and higher levels of DNA double-strand breaks (DSBs) [3]. However, the precise relationship between ecDNA and the DDR, as well as the specific mechanisms governing ecDNA replication and maintenance, remains to be elucidated.
Funding: Supported by the National 863 Project of the Tenth Five-Year Plan (2001AA241104, 2004AA241104), the Key Breeding Project of Sichuan Province (200107001-16-01), and the Key Quality Project of Sichuan Province (200107001-1-7-4).
Abstract: Some factors influencing anther culture were studied preliminarily by conducting anther culture of the restorers of new cytoplasmic male sterile (NER). The results were as follows: ① MS culture medium supplemented with 2,4-D 2 mg/L, 6-BA 0.5 mg/L, and NAA 0.5 mg/L was the most suitable for callus induction of NER. ② The induction rate differed significantly between plant age groups. From the 110th day to the 141st day, the induction rate increased with age, and the differences were significant at the 0.01 level. The induction rate peaked on the 141st day and then declined gradually. ③ The combined use of 2,4-D and 6-BA, with a proper increase of 2,4-D, favored callus induction. ④ The green plantlet induction rate of NER increased when the concentration of 6-BA was raised from 2 mg/L to 4 mg/L. Adding ZT (0.5 mg/L) to 2 mg/L 6-BA led to a 2.47% increase in the green plantlet induction rate.