Journal articles
17 articles found
Knowledge-Empowered, Collaborative, and Co-Evolving AI Models: The Post-LLM Roadmap (cited by: 1)
1
Authors: Fei Wu, Tao Shen, Thomas Back, Jingyuan Chen, Gang Huang, Yaochu Jin, Kun Kuang, Mengze Li, Cewu Lu, Jiaxu Miao, Yongwei Wang, Ying Wei, Fan Wu, Junchi Yan, Hongxia Yang, Yi Yang, Shengyu Zhang, Zhou Zhao, Yueting Zhuang, Yunhe Pan. Source: Engineering, 2025, No. 1, pp. 87-100 (14 pages).
Large language models (LLMs) have significantly advanced artificial intelligence (AI) by excelling in tasks such as understanding, generation, and reasoning across multiple modalities. Despite these achievements, LLMs have inherent limitations including outdated information, hallucinations, inefficiency, lack of interpretability, and challenges in domain-specific accuracy. To address these issues, this survey explores three promising directions in the post-LLM era: knowledge empowerment, model collaboration, and model co-evolution. First, we examine methods of integrating external knowledge into LLMs to enhance factual accuracy, reasoning capabilities, and interpretability, including incorporating knowledge into training objectives, instruction tuning, retrieval-augmented inference, and knowledge prompting. Second, we discuss model collaboration strategies that leverage the complementary strengths of LLMs and smaller models to improve efficiency and domain-specific performance through techniques such as model merging, functional model collaboration, and knowledge injection. Third, we delve into model co-evolution, in which multiple models collaboratively evolve by sharing knowledge, parameters, and learning strategies to adapt to dynamic environments and tasks, thereby enhancing their adaptability and continual learning. We illustrate how the integration of these techniques advances AI capabilities in science, engineering, and society, particularly in hypothesis development, problem formulation, problem-solving, and interpretability across various domains. We conclude by outlining future pathways for further advancement and applications.
Keywords: Artificial intelligence; Large language models; Knowledge empowerment; Model collaboration; Model co-evolution
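Hierarchical Shape Feature Extraction and Matching Algorithm Based on Cross-Ratio Context (cited by: 3)
2
Authors: Jia Qi (贾棋), Liu Yu (刘宇), Fan Xin (樊鑫), Guo He (郭禾), Gao Xinkai (高新凯). Source: Journal of Computer-Aided Design & Computer Graphics (计算机辅助设计与图形学学报), 2015, No. 12, pp. 2247-2255 (9 pages). Indexed in: EI, CSCD, PKU Core.
To address shape recognition under viewpoint changes, a hierarchical shape feature extraction and matching algorithm based on cross-ratio context is proposed. First, a hierarchical shape descriptor is built by coarse-to-fine sampling, so that the shape is described from the global level down to the local level. Second, the traditional cross-ratio invariant is extended to establish a projective-invariant relation among every 5 sampled points. Finally, for shape matching, a dynamic programming algorithm computes the similarity between shapes. Experimental results show that the algorithm recognizes deformed shapes well, with low computational complexity and a small feature dimension. In addition, the hierarchical approach also applies to other invariant features and is easy to fuse with traditional shape representations, so that the two kinds of descriptors can each contribute their strengths, which gives the method a degree of extensibility.
Keywords: shape descriptor; cross-ratio invariant; hierarchical representation; shape matching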
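The abstract above mentions extending the cross-ratio to a projective-invariant relation among five sampled points. As a minimal sketch, and not the authors' descriptor, the following Python uses the classical five-point invariant built from ratios of 3x3 determinants of homogeneous coordinates. The function names, the sample points, and the homography are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = Sequence[Sequence[float]]


def det3(a: Point, b: Point, c: Point) -> float:
    """Determinant of the 3x3 matrix with columns (x, y, 1) for a, b, c."""
    return a[0] * (b[1] - c[1]) - b[0] * (a[1] - c[1]) + c[0] * (a[1] - b[1])


def five_point_invariant(points: Sequence[Point]) -> float:
    """Classical projective invariant of five coplanar points.

    Each point appears equally often in numerator and denominator, so the
    scale factors introduced by a homography cancel.
    """
    p1, p2, p3, p4, p5 = points
    numerator = det3(p4, p3, p1) * det3(p5, p2, p1)
    denominator = det3(p4, p2, p1) * det3(p5, p3, p1)
    return numerator / denominator


def apply_homography(h: Matrix, point: Point) -> Point:
    """Map a 2D point through a 3x3 homography and dehomogenize."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


# Hypothetical sample points (no three collinear) and an arbitrary homography.
samples: List[Point] = [(0.0, 0.0), (2.0, 0.5), (3.0, 2.0), (1.0, 3.0), (-1.0, 1.5)]
homography = [[1.1, 0.2, 3.0], [-0.1, 0.9, 1.0], [0.001, 0.002, 1.0]]

before = five_point_invariant(samples)
after = five_point_invariant([apply_homography(homography, p) for p in samples])
print(before, after)  # the two values should agree up to rounding error
```

The invariance is what makes such a quantity usable as a feature under viewpoint change; how the paper samples points and assembles them into a hierarchical descriptor is not specified in the abstract and is not modeled here.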
Data-Driven Heuristic Assisted Memetic Algorithm for Efficient Inter-Satellite Link Scheduling in the BeiDou Navigation Satellite System (cited by: 8)
3
Authors: Yonghao Du, Ling Wang, Lining Xing, Jungang Yan, Mengsi Cai. Source: IEEE/CAA Journal of Automatica Sinica, 2021, No. 11, pp. 1800-1816 (17 pages). Indexed in: SCIE, EI, CSCD.
Inter-satellite link (ISL) scheduling is required by the BeiDou Navigation Satellite System (BDS) to guarantee the system's ranging and communication performance. In the BDS, a great number of ISL scheduling instances must be addressed every day, which takes a lot of time with normal metaheuristics and can hardly meet the quick-response requirements that often occur in real-world applications. To address the dual requirements of normal and quick-response ISL scheduling, a data-driven heuristic assisted memetic algorithm (DHMA) is proposed in this paper, which includes a high-performance memetic algorithm (MA) and a data-driven heuristic. In normal situations, the high-performance MA that hybridizes parallelism, competition, and evolution strategies is performed for high-quality ISL scheduling solutions over time. In quick-response situations, the data-driven heuristic is performed to quickly schedule high-probability ISLs according to a prediction model, which is trained from the high-quality MA solutions. The main idea of the DHMA is to address normal and quick-response scheduling separately, while high-quality normal scheduling data are used to train the model for quick-response use. In addition, this paper also presents an easy-to-understand ISL scheduling model and its NP-completeness. A seven-day experimental study with 10080 one-minute ISL scheduling instances shows the efficient performance of the DHMA in addressing ISL scheduling in normal (in 84 hours) and quick-response (in 0.62 hours) situations, which can well meet the dual scheduling requirements in real-world BDS applications.
Keywords: BeiDou Navigation Satellite System (BDS); data-driven heuristic; inter-satellite link (ISL) scheduling; memetic algorithm; metaheuristic; quick-response
MSIsensor-pro: Fast, Accurate, and Matched-normal-sample-free Detection of Microsatellite Instability (cited by: 4)
4
Authors: Peng Jia, Xiaofei Yang, Li Guo, Bowen Liu, Jiadong Lin, Hao Liang, Jianyong Sun, Chengsheng Zhang, Kai Ye. Source: Genomics, Proteomics & Bioinformatics, 2020, No. 1, pp. 65-71 (7 pages). Indexed in: SCIE, CAS, CSCD.
Microsatellite instability (MSI) is a key biomarker for cancer therapy and prognosis. Traditional experimental assays are laborious and time-consuming, and next-generation sequencing-based computational methods do not work on leukemia samples, paraffin-embedded samples, or patient-derived xenografts/organoids, due to the requirement of matched normal samples. Herein, we developed MSIsensor-pro, an open-source single sample MSI scoring method for research and clinical applications. MSIsensor-pro introduces a multinomial distribution model to quantify polymerase slippages for each tumor sample and a discriminative site selection method to enable MSI detection without matched normal samples. We demonstrate that MSIsensor-pro is an ultrafast, accurate, and robust MSI calling method. Using samples with various sequencing depths and tumor purities, MSIsensor-pro significantly outperformed the current leading methods in both accuracy and computational cost. MSIsensor-pro is available at https://github.com/xjtu-omics/msisensor-pro and free for non-commercial use, while a commercial license is provided upon request.
Keywords: Microsatellite; Polymerase slippage; Multinomial distribution; Microsatellite instability; Tumor
Identification of driving factors of algal growth in the South-to-North Water Diversion Project by Transformer-based deep learning (cited by: 1)
5
Authors: Jing Qian, Nan Pu, Li Qian, Xiaobai Xue, Yonghong Bi, Stefan Norra. Source: Water Biology and Security, 2023, No. 3, pp. 47-56 (10 pages).
Accurate and credible identification of the drivers of algal growth is essential for sustainable utilization and scientific management of freshwater. In this study, we developed a deep learning-based Transformer model, named Bloomformer-1, for end-to-end identification of the drivers of algal growth without needing extensive a priori knowledge or prior experiments. The Middle Route of the South-to-North Water Diversion Project (MRP) was used as the study site to demonstrate that Bloomformer-1 exhibited more robust performance (with the highest R², 0.80 to 0.94, and the lowest RMSE, 0.22 to 0.43 μg/L) compared to four widely used traditional machine learning models, namely extra trees regression (ETR), gradient boosting regression tree (GBRT), support vector regression (SVR), and multiple linear regression (MLR). In addition, Bloomformer-1 had higher interpretability (including higher transferability and understandability) than the four traditional machine learning models, which meant that it was trustworthy and the results could be directly applied to real scenarios. Finally, it was determined that total phosphorus (TP) was the most important driver for the MRP, especially in the Henan section of the canal, although total nitrogen (TN) had the highest effect on algal growth in the Hebei section. Based on these results, controlling phosphorus loading across the whole MRP was proposed as an algal control strategy.
Keywords: Algal growth; Deep learning; Driving factor determination; Model interpretability; Transformer
Implementation of FAIR Guidelines in Selected Non-Western Geographies (cited by: 1)
6
Authors: Yi Lin, Putu Hadi Purnama Jati, Aliya Aktau, Mariem Ghardallou, Sara Nodehi, Mirjam van Reisen. Source: Data Intelligence, 2022, No. 4, pp. 747-770, 1051, 1053 (26 pages). Indexed in: EI.
This study provides an analysis of the implementation of FAIR Guidelines in selected non-Western geographies. The analysis was based on a systematic literature review to determine if the findability, accessibility, interoperability, and reusability of data is seen as an issue, if the adoption of the FAIR Guidelines is seen as a solution, and if the climate is conducive to the implementation of the FAIR Guidelines. The results show that the FAIR Guidelines have been discussed in most of the countries studied, which have identified data sharing and the reusability of research data as an issue (e.g., Kazakhstan, Russia, countries in the Middle East), and partially introduced in others (e.g., Indonesia). In Indonesia, a FAIR equivalent system has been introduced, although certain functions need to be added for data to be entirely FAIR. In Japan, both FAIR equivalent systems and FAIR-based systems have been adopted and created, and the acceptance of FAIR-based systems is recommended by the Government of Japan. In a number of African countries, the FAIR Guidelines are in the process of being implemented and the implementation of FAIR is well supported. In conclusion, a window of opportunity for implementing the FAIR Guidelines is open in most of the countries studied; however, more awareness needs to be raised about the benefits of FAIR in Russia and Kazakhstan to place it firmly on the policy agenda.
Keywords: FAIR Guidelines; FAIR implementation; non-Western geographies
Information Streams in Health Facilities: The Case of Uganda
7
Authors: Mariam Basajja, Mutwalibi Nambobi. Source: Data Intelligence, 2022, No. 4, pp. 882-898, 1048 (18 pages). Indexed in: EI.
With the prevailing COVID-19 pandemic, the lack of digitally-recorded and connected health data poses a challenge for analysing the situation. Virus outbreaks, such as the current pandemic, allow for the optimisation and reuse of data, which can be beneficial in managing future outbreaks. However, there is a general lack of knowledge about the actual flow of information in health facilities, which is also the case in Uganda. In Uganda, where this case study was conducted, there is no comprehensive knowledge about what type of data is collected or how it is collected along the journey of a patient through a health facility. This study investigates information flows of clinical patient data in health facilities in Uganda. The study found that almost all health facilities in Uganda store patient information in paper files on shelves. Hospitals in Uganda are provided with paper tools, such as reporting forms, registers and manuals, in which district data is collected as aggregate data and submitted in the form of digital reports to the Ministry of Health Resource Center. These reporting forms are not digitised and, thus, not machine-actionable. Hence, it is not easy for health facilities, researchers, and others to find and access patient and research data. It is also not easy to reuse and connect this data with other digital health data worldwide, leading to the incorrect conclusion that there is less health data in Uganda. A FAIR architecture has the potential to solve such problems and facilitate the transition from paper to digital records in the Uganda health system.
Keywords: Data management; Information flow; FAIR Guidelines; Hospital information system (HIS); Electronic medical record (EMR); Electronic health record (EHR); Patient health record (PHR)
Possibility of Enhancing Digital Health Interoperability in Uganda through FAIR Data
8
Authors: Mariam Basajja, Mutwalibi Nambobi, Katy Wolstencroft. Source: Data Intelligence, 2022, No. 4, pp. 899-916, 1043 (19 pages). Indexed in: EI.
The digital health landscape in Uganda is plagued by problems with interoperability and sustainability, due to fragmentation and a lack of integrated digital health solutions. This can be partly attributed to the absence of policies on the interoperability of data, as well as the fact that there is no common goal to make digital data and data infrastructure interoperable across the data ecosystem. The promulgation of the FAIR Guidelines in 2016 brought together various data stewards and stakeholders to adopt a common vision on data management and enable greater interoperability. This article explores the potential of enhancing digital health interoperability through FAIR by analysing the digital solutions piloted in Uganda and their sustainability. It looks at the factors that are currently hindering interoperability by examining existing digital health solutions in Uganda, such as the Digital Health Atlas Uganda (DHA-U) and Uganda Digital Health Dashboard (UDHD). The level of FAIRness of the two dashboards was determined using the FAIR Evaluation Services tool. Analysis was also carried out to discover the level of FAIRness of the digital health solutions within the dashboards and the most frequently used software applications and data standards by the different digital health interventions in Uganda.
Keywords: FAIR; FAIRness level; Interoperability; Integration; Common standards; Health information exchange
Proof of Concept and Horizons on Deployment of FAIR Data Points in the COVID-19 Pandemic
9
Authors: Mariam Basajja, Marek Suchanek, Getu Tadele Taye, Samson Yohannes Amare, Mutwalibi Nambobi, Sakinat Folorunso, Ruduan Plug, Francisca Oladipo, Mirjam van Reisen. Source: Data Intelligence, 2022, No. 4, pp. 917-937, 1045 (22 pages). Indexed in: EI.
Rapid and effective data sharing is necessary to control disease outbreaks, such as the current coronavirus pandemic. Despite the existence of data sharing agreements, data silos, lack of interoperable data infrastructures, and different institutional jurisdictions hinder data sharing and accessibility. To overcome these challenges, the Virus Outbreak Data Network (VODAN)-Africa initiative is championing an approach in which data never leaves the institution where it was generated, but, instead, algorithms can visit the data and query multiple datasets in an automated way. To make this possible, FAIR Data Points, which are distributed data repositories that host machine-actionable data and metadata adhering to the FAIR Guidelines (that data should be Findable, Accessible, Interoperable and Reusable), have been deployed in participating institutions using a dockerised bundle of tools called VODAN in a Box (ViB). ViB is a set of multiple FAIR-enabling and open-source services with a single goal: to support the gathering of World Health Organization (WHO) electronic case report forms (eCRFs) as FAIR data in a machine-actionable way, but without exposing or transferring the data outside the facility. Following the execution of a proof of concept, ViB was deployed in Uganda and Leiden University. The proof of concept generated a first query which was implemented across two continents. A SWOT (strengths, weaknesses, opportunities and threats) analysis of the architecture was carried out and established the changes needed for specifications and requirements for the future development of the solution.
Keywords: Digital health; Data in residence; FAIR Guidelines; Machine-actionable; VODAN-Africa
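Research on a Grammatical Evolution Algorithm Based on Formal Description Methods (in English)
10
Authors: Liu Guoxiang (刘国祥), Dong Ying (董颖), Liu Mingyin (刘明银). Source: Computers and Applied Chemistry (计算机与应用化学), 2006, No. 6, pp. 511-514 (4 pages). Indexed in: CAS, CSCD, PKU Core.
This paper applies a formal-grammar-based method, combined with evolutionary computation, an emerging and effective optimization approach in computer science, to explore the effectiveness of the grammatical evolution algorithm. Using the four-tuple of a Backus-Naur form (BNF) grammar, it emphasizes the design of BNF production rules and makes full use of genetic operations to optimize the solution of the problem. Taking the decomposition of organic compounds as an example (with fitness evaluated by the dissociation energy), an algorithm was designed, and experimental results were obtained through fitness-based selection together with reproduction, crossover, and mutation operations. Analysis and comparison of the results demonstrate the effectiveness of the algorithm. The work can be regarded as a successful exploration of the grammatical evolution algorithm.
Keywords: formal grammar; evolutionary algorithm; grammar algorithm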
Towards the Tipping Point for FAIR Implementation (cited by: 4)
11
Authors: Mirjam van Reisen, Mia Stokmans, Mariam Basajja, Antony Otieno Ong’ayo, Christine Kirkpatrick, Barend Mons. Source: Data Intelligence, 2020, No. 1, pp. 264-275 (12 pages).
This article explores the global implementation of the FAIR Guiding Principles for scientific management and data stewardship, which provide that data should be findable, accessible, interoperable and reusable. The implementation of these principles is designed to lead to the stewardship of data as FAIR digital objects and the establishment of the Internet of FAIR Data and Services (IFDS). If implementation reaches a tipping point, IFDS has the potential to revolutionize how data is managed by making machine and human readable data discoverable for reuse. Accordingly, this article examines the expansion of the implementation of FAIR Guiding Principles, especially how and in which geographies (locations) and areas (topic domains) implementation is taking place. A literature review of academic articles published between 2016 and 2019 on the use of FAIR Guiding Principles is presented. The investigation also includes an analysis of the domains in the IFDS Implementation Networks (INs). Its uptake has been mainly in the Western hemisphere. The investigation found that implementation of FAIR Guiding Principles has taken firm hold in the domain of bio and natural sciences. To achieve a tipping point for FAIR implementation, it is now time to ensure the inclusion of non-European ascendants and of other scientific domains. Apart from equal opportunity and genuine global partnership issues, a permanent European bias poses challenges with regard to the representativeness and validity of data and could limit the potential of IFDS to reach across continental boundaries. The article concludes that, despite efforts to be inclusive, acceptance of the FAIR Guiding Principles and IFDS in different scientific communities is limited and there is a need to act now to prevent dampening of the momentum in the development and implementation of the IFDS. It is further concluded that policy entrepreneurs and the GO FAIR INs may contribute to making the FAIR Guiding Principles more flexible in including different research epistemologies, especially through its GO CHANGE pillar.
Keywords: FAIR Data; Health; Digital Health; mHealth; data-driven science; FAIR Implementation Networks; GO-FAIR
FAIR Practices in Africa (cited by: 4)
12
Authors: Mirjam van Reisen, Mia Stokmans, Munyaradzi Mawere, Mariam Basajja, Antony Otieno Ong’ayo, Primrose Nakazibwe, Christine Kirkpatrick, Kudakwashe Chindoza. Source: Data Intelligence, 2020, No. 1, pp. 246-256, 318, 319 (13 pages).
This article investigates expansion of the Internet of FAIR Data and Services (IFDS) to Africa, through the three GO FAIR pillars: GO CHANGE, GO BUILD and GO TRAIN. Introduction of the IFDS in Africa has a focus on digital health. Two examples of introducing FAIR are compared: a regional initiative for digital health by governments in the East Africa Community (EAC) and an initiative by a local health provider (Solidarmed) in collaboration with Great Zimbabwe University in Zimbabwe. The obstacles to introducing FAIR are identified as underrepresentation of data from Africa in IFDS at this moment, the lack of explicit recognition of situational context of research in FAIR at present and the lack of acceptability of FAIR as a foreign and European invention which affects acceptance. It is envisaged that FAIR can make an important contribution to solving fragmentation in digital health in Africa, and that any obstacles concerning African participation, context relevance and acceptance of IFDS need to be removed. This will require involvement of African researchers and ICT-developers so that it is driven by local ownership. Assessment of ecological validity in FAIR principles would ensure that the context specificity of research is reflected in the FAIR principles. This will help enhance the acceptance of the FAIR Guidelines in Africa and will help strengthen digital health research and services.
Keywords: FAIR data; Health; Digital Health; digital health in Africa; GO FAIR Implementation Network
Learning to select the recombination operator for derivative-free optimization (cited by: 1)
13
Authors: Haotian Zhang, Jianyong Sun, Thomas Back, Zongben Xu. Source: Science China Mathematics, 2024, No. 6, pp. 1457-1480 (24 pages). Indexed in: SCIE, CSCD.
Extensive studies on selecting recombination operators adaptively, namely, adaptive operator selection (AOS), during the search process of an evolutionary algorithm (EA), have shown that AOS is promising for improving EA's performance. A variety of heuristic mechanisms for AOS have been proposed in recent decades, which usually contain two main components: the feature extraction and the policy setting. The feature extraction refers to extracting relevant features from the information collected during the search process. The policy setting means setting a strategy (or policy) on how to select an operator from a pool of operators based on the extracted feature. Both components are designed by hand in existing studies, which may not be efficient for adapting to optimization problems. In this paper, a generalized framework is proposed for learning the components of AOS for one of the main streams of EAs, namely, differential evolution (DE). In the framework, the feature extraction is parameterized as a deep neural network (DNN), while a Dirichlet distribution is considered to be the policy. A reinforcement learning method, named policy gradient, is used to train the DNN. As case studies, the proposed framework is applied to two DEs including the classic DE and a recently-proposed DE, which result in two new algorithms named PG-DE and PG-MPEDE, respectively. Experiments on the Congress on Evolutionary Computation (CEC) 2018 test suite show that the proposed new algorithms perform significantly better than their counterparts. Finally, we prove theoretically that the considered classic methods are the special cases of the proposed framework.
Keywords: evolutionary algorithm; differential evolution; adaptive operator selection; reinforcement learning; deep learning
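The abstract above describes a Dirichlet distribution acting as the operator-selection policy. A minimal sketch of that single step, assuming the concentration parameters are already given, might look as follows. In the paper they would come from the trained DNN, whereas here `alpha` is a fixed placeholder, and the operator pool and function names are hypothetical rather than taken from PG-DE or PG-MPEDE.

```python
import random
from typing import List, Sequence, Tuple


def sample_dirichlet(alpha: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one probability vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def select_operator(alpha: Sequence[float], rng: random.Random) -> Tuple[int, List[float]]:
    """Sample selection probabilities from the policy, then pick one operator index."""
    probs = sample_dirichlet(alpha, rng)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


# Hypothetical operator pool; the names are standard DE strategy labels.
OPERATORS = ["DE/rand/1", "DE/best/1", "DE/current-to-best/1"]

rng = random.Random(0)
alpha = [2.0, 1.0, 0.5]  # placeholder for the network output (assumption)
index, probs = select_operator(alpha, rng)
print(OPERATORS[index], [round(p, 3) for p in probs])
```

Larger entries of `alpha` make the corresponding operator more likely on average, which is the quantity a policy-gradient update would adjust; the training loop itself is outside the scope of this sketch.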
FAIR Equivalency in Indonesia's Digital Health Framework (cited by: 1)
14
Authors: Putu Hadi Purnama Jati. Source: Data Intelligence, 2022, No. 4, pp. 798-812 (15 pages). Indexed in: EI.
The objective of this study was to assess the regulatory framework for health data in Indonesia in order to understand the policy context and explore the possibility of expanding the adoption and implementation of the FAIR Guidelines, which state that data should be Findable, Accessible, Interoperable and Reusable (FAIR), in Indonesia. Although the FAIR Guidelines were not explicitly mentioned in any of the policy documents relevant to the Indonesian digital health sector, six out of the eight documents analysed contained FAIR Equivalent principles. In particular, Indonesia’s Population Identification Number (NIK) has the potential, as a unique identifier, to support the integration and interoperability (findability) of data, which is crucial to all other aspects of the FAIR Guidelines. There is also a plan to build standards and protocols into the implementation of information systems in each ministry and government agency to improve data accessibility (accessibility), the integration of the various information systems is planned/ongoing (interoperability), and the need for a standardised arrangement for health information systems related to health data following the community standard is recognised (reusability). The documents at the core of Indonesia’s digital health/eHealth policy have the highest FAIR Equivalency Score (FE-Score), showing some degree of alignment between the Indonesian digital health implementation vision and the FAIR Guidelines. This indicates that Indonesia’s digital health sector is open to using the FAIR Guidelines.
Keywords: eHealth; FAIR Guidelines; Findability; Accessibility; Interoperability; Reusability; Indonesia’s digital health
FAIR Principles: Interpretations and Implementation Considerations (cited by: 35)
15
Authors: Annika Jacobsen, Ricardo de Miranda Azevedo, Nick Juty, Dominique Batista, Simon Coles, Ronald Cornet, Melanie Courtot, Merce Crosas, Michel Dumontier, Chris T. Evelo, Carole Goble, Giancarlo Guizzardi, Karsten Kryger Hansen, Ali Hasnain, Kristina Hettne, Jaap Heringa, Rob W. W. Hooft, Melanie Imming, Keith G. Jeffery, Rajaram Kaliyaperumal, Martijn G. Kersloot, Christine R. Kirkpatrick, Tobias Kuhn, Ignasi Labastida, Barbara Magagna, Peter McQuilton, Natalie Meyers, Annalisa Montesanti, Mirjam van Reisen, Philippe Rocca-Serra, Robert Pergl, Susanna-Assunta Sansone, Luiz Olavo Bonino da Silva Santos, Juliane Schneider, George Strawn, Mark Thompson, Andra Waagmeester, Tobias Weigel, Mark D. Wilkinson, Egon L. Willighagen, Peter Wittenburg, Marco Roos, Barend Mons, Erik Schultes. Source: Data Intelligence, 2020, No. 1, pp. 10-29, 293-302, 322 (31 pages).
The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in 2016. By intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving Findability, Accessibility, Interoperability and Reusability of digital resources. This has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR solutions. However, it has also resulted in inconsistent interpretations that carry the risk of leading to incompatible implementations. Thus, while the FAIR principles are formulated on a high level and may be interpreted and implemented in different ways, for true interoperability we need to support convergence in implementation choices that are widely accessible and (re)-usable. We introduce the concept of FAIR implementation considerations to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR implementations. Any self-identified stakeholder community may either choose to reuse solutions from existing implementations, or when they spot a gap, accept the challenge to create the needed solution, which, ideally, can be used again by other communities in the future. Here, we provide interpretations and implementation considerations (choices and challenges) for each FAIR principle.
Keywords: FAIR guiding principles; FAIR implementation; FAIR convergence; FAIR communities; choices and challenges
Mako: A Graph-based Pattern Growth Approach to Detect Complex Structural Variants (cited by: 1)
16
Authors: Jiadong Lin, Xiaofei Yang, Walter Kosters, Tun Xu, Yanyan Jia, Songbo Wang, Qihui Zhu, Mallory Ryan, Li Guo, Chengsheng Zhang, The Human Genome Structural Variation Consortium, Charles Lee, Scott E. Devine, Evan E. Eichler, Kai Ye. Source: Genomics, Proteomics & Bioinformatics, 2022, No. 1, pp. 205-218 (14 pages). Indexed in: SCIE, CAS, CSCD.
Complex structural variants (CSVs) are genomic alterations that have more than two breakpoints and are considered as the simultaneous occurrence of simple structural variants. However, detecting the compounded mutational signals of CSVs is challenging through a commonly used model-match strategy. As a result, there has been limited progress for CSV discovery compared with simple structural variants. Here, we systematically analyzed the multi-breakpoint connection feature of CSVs, and proposed Mako, utilizing a bottom-up guided model-free strategy, to detect CSVs from paired-end short-read sequencing. Specifically, we implemented a graph-based pattern growth approach, where the graph depicts potential breakpoint connections, and pattern growth enables CSV detection without pre-defined models. Comprehensive evaluations on both simulated and real datasets revealed that Mako outperformed other algorithms. Notably, validation rates of CSVs on real data based on experimental and computational validations as well as manual inspections are around 70%, where the medians of experimental and computational breakpoint shift are 13 bp and 26 bp, respectively. Moreover, the Mako CSV subgraph effectively characterized the breakpoint connections of a CSV event and uncovered a total of 15 CSV types, including two novel types of adjacent segment swap and tandem dispersed duplication. Further analysis of these CSVs also revealed the impact of sequence homology on the formation of CSVs. Mako is publicly available at https://github.com/xjtu-omics/Mako.
Keywords: Next-generation sequencing; Complex structural variant; Pattern growth; Graph mining; Formation mechanism
Finding Dutch natives in online forums
17
Authors: Bernard van den Boom, Cor J. Veenman. Source: Forensic Sciences Research, 2018, No. 3, pp. 230-239 (10 pages).
Law enforcement agencies have a restricted area in which their powers apply, which is called their jurisdiction. These restrictions also apply to the Internet. However, on the Internet, the physical borders of the jurisdiction, typically country borders, are hard to discover. In our case, it is hard to establish whether someone involved in criminal online behavior is indeed a Dutch citizen. We propose a way to overcome the arduous task of manually investigating whether a user on an Internet forum is Dutch or not. More precisely, we aim to detect that a given English text is written by a Dutch native author. To develop a detector, we follow a machine learning approach. Therefore, we need to prepare a specific training corpus. To obtain a corpus that is representative for online forums, we collected a large amount of English forum posts from Dutch and non-Dutch authors on Reddit. To learn a detection model, we used a bag-of-words representation to capture potential misspellings, grammatical errors or unusual turns of phrases that are characteristic of the mother tongue of the authors. For this learning task, we compare the linear support vector machine and regularized logistic regression using the appropriate performance metrics F1 score, precision, and average precision. Our results show logistic regression with frequency-based feature selection performs best at predicting Dutch natives. Further study should be directed to the general applicability of the results, that is, to find out if the developed models are applicable to other forums with comparable high performance.
Keywords: Forensic data science; text mining; author profiling; corpus creation; big data; open source intelligence; native language verification