With the rise of data-intensive research, data literacy has become a critical capability for improving scientific data quality and achieving artificial intelligence (AI) readiness. In the biomedical domain, data are characterized by high complexity and privacy sensitivity, calling for robust and systematic data management skills. This paper reviews current trends in scientific data governance and the evolving policy landscape, highlighting persistent challenges such as inconsistent standards, semantic misalignment, and limited awareness of compliance. These issues are largely rooted in the lack of structured training and practical support for researchers. In response, this study builds on existing data literacy frameworks and integrates the specific demands of biomedical research to propose a comprehensive, lifecycle-oriented data literacy competency model with an emphasis on ethics and regulatory awareness. Furthermore, it outlines a tiered training strategy tailored to different research stages (undergraduate, graduate, and professional), offering theoretical foundations and practical pathways for universities and research institutions to advance data literacy education.
Hefei Light Source (HLS) is a synchrotron radiation light source that primarily produces vacuum ultraviolet and soft X-rays. It currently consists of ten experimental stations, including a soft X-ray microscopy station. As part of its ongoing efforts to establish a centralized scientific data management platform, HLS is developing a test system that covers the entire lifecycle of scientific data, including data generation, acquisition, processing, analysis, and destruction. However, the instruments used in the soft X-ray microscopy station rely on commercial proprietary software for data acquisition and processing. We developed a semi-automatic data acquisition program to facilitate the integration of the soft X-ray microscopy station into the centralized scientific data management platform. Additionally, we created an online data processing platform to assist users in analyzing their scientific data. The system we developed and deployed meets the design requirements, successfully integrating the soft X-ray microscopy station into the full lifecycle management of scientific data.
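The semi-automatic acquisition step described above can be pictured as a polling pass that picks up output files written by the proprietary software and wraps each one in a lifecycle metadata record for the central platform. This is a minimal illustrative sketch, not HLS's actual program; the function name, record fields, and station label are all assumptions.

```python
import hashlib
import time
from pathlib import Path

def scan_new_files(directory, seen, station="soft-xray-microscopy"):
    """One polling pass: find files the proprietary software has written
    since the last pass and wrap each in a lifecycle metadata record.
    `seen` is the set of file names already registered."""
    records = []
    for path in sorted(Path(directory).glob("*")):
        if path.name in seen or not path.is_file():
            continue
        data = path.read_bytes()
        records.append({
            "station": station,                                  # hypothetical label
            "file": path.name,
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),          # integrity check for later stages
            "acquired_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
            "lifecycle_stage": "acquisition",
        })
        seen.add(path.name)
    return records
```

A scheduler (or an operator pressing a button, hence "semi-automatic") would call this periodically and forward the records to the platform's ingest API.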
The main obstacle to the open sharing of scientific data is the lack of a legal protection system for intellectual property. This article analyzes the progress of research papers on intellectual property in scientific data in China through literature search and statistics. Currently, research subjects are unbalanced, research content is uneven, research methods are relatively homogeneous, and research depth is insufficient. It is recommended that different stakeholders engage in deep cross-disciplinary cooperation, further improve China's legal and policy protection system for scientific data intellectual property, and promote the open sharing of scientific data.
At present, data security has become the most urgent and primary issue in the era of the digital economy, and marine scientific data security is the core issue of marine data resource management and sharing services. This paper analyzes the needs of marine scientific data security governance, develops governance approaches and methods in depth, and applies them in practice at the National Marine Scientific Data Center, optimizing the existing data management model, ensuring the safety of marine scientific data, and fully releasing the value of the data.
Scientific data refers to the data or data sets generated in the scientific research process through observations, experiments, calculations, and analyses. These data are fundamental components for developing new knowledge, advancing technological progress, and creating wealth. In recent years, scientific data has attracted increasing attention with regard to its preservation, archiving, and sharing.
Feature representation is one of the key issues in data clustering. The existing feature representation of scientific data is not sufficient, which to some extent affects the results of scientific data clustering. Therefore, this paper proposes the concept of a composite text description (CTD) and a CTD-based feature representation method for biomedical scientific data. The method uses different feature weighting algorithms to represent candidate features based on two types of data sources, then combines and strengthens the two feature sets. Experiments show that the feature representation method is more effective than traditional methods and can significantly improve the performance of biomedical data clustering.
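The CTD idea of weighting candidate features from two data sources and then strengthening the combined set can be sketched roughly as follows. This is a hedged illustration: the abstract does not specify its weighting algorithms, so plain TF-IDF and a simple boost factor for terms supported by both sources stand in for them.

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF weights per document; each doc is a list of terms."""
    df = Counter(t for d in docs for t in set(d))   # document frequency
    n = len(docs)
    out = []
    for d in docs:
        tf = Counter(d)
        out.append({t: (c / len(d)) * math.log((1 + n) / (1 + df[t]))
                    for t, c in tf.items()})
    return out

def combine(features_a, features_b, boost=2.0):
    """Merge two per-document feature sets; terms supported by both
    sources are strengthened by `boost` (the CTD intuition)."""
    merged = dict(features_a)
    for t, w in features_b.items():
        if t in merged:
            merged[t] = (merged[t] + w) * boost
        else:
            merged[t] = w
    return merged
```

The merged vectors would then feed an ordinary clustering algorithm such as k-means.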
National Population Health Data Center (NPHDC) is one of China's 20 national-level science data centers, jointly designated by the Ministry of Science and Technology and the Ministry of Finance. Operated by the Chinese Academy of Medical Sciences under the oversight of the National Health Commission, NPHDC adheres to national regulations including the Scientific Data Management Measures and the National Science and Technology Infrastructure Service Platform Management Measures, and is committed to collecting, integrating, managing, and sharing biomedical and health data through an open-access platform, fostering open sharing and engaging in international cooperation.
As the Earth enters the Anthropocene, global sustainable development requires ecological research to evolve into a large-scale, quantitative, and predictive era. This necessitates a revolution in ecological observation technology and a long-term accumulation of scientific data. Ecosystem flux tower observation technology is well suited to meet this requirement. However, the unique advantages and potential values of global-scale flux tower observation are still not fully appreciated. By reviewing the development history of global meteorological observation and its scientific contributions to society, we can draw important lessons for recognizing the scientific mission of flux observation.
Purpose: In the open science era, it is typical to share project-generated scientific data by depositing it in an open and accessible database. Moreover, scientific publications are preserved in digital library archives. It is challenging to identify the data usage mentioned in the literature and associate it with its source. Here, we investigated the data usage of a government-funded cancer genomics project, The Cancer Genome Atlas (TCGA), via a full-text literature analysis.
Design/methodology/approach: We focused on identifying articles using the TCGA dataset and constructing linkages between the articles and the specific TCGA data. First, we collected 5,372 TCGA-related articles from PubMed Central (PMC). Second, we constructed a benchmark set of 25 full-text articles that truly used TCGA data in their studies, and we summarized the key features of the benchmark set. Third, the key features were applied to the remaining PMC full-text articles.
Findings: The number of publications that use TCGA data has increased significantly since 2011, although the TCGA project was launched in 2005. Additionally, we found that the critical areas of focus in studies that use TCGA data were glioblastoma multiforme, lung cancer, and breast cancer; meanwhile, data from the RNA-sequencing (RNA-seq) platform are the most preferred.
Research limitations: The current workflow to identify articles that truly used TCGA data is labor-intensive. An automatic method is expected to improve performance.
Practical implications: This study will help cancer genomics researchers determine the latest advancements in cancer molecular therapy, and it will promote data sharing and data-intensive scientific discovery.
Originality/value: Few studies have investigated data usage by government-funded projects/programs since their launch. In this preliminary study, we extracted articles that use TCGA data from PMC, and we created a link between the full-text articles and the source data.
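The step of applying "key features" from the benchmark set to the remaining full-text articles can be approximated by a rule-based matcher over the article text. The patterns below are hypothetical stand-ins for the features the study distilled, not its actual rules.

```python
import re

# Hypothetical key features: phrases that signal actual data use
# rather than a passing mention of the TCGA project.
USAGE_PATTERNS = [
    r"\bdownloaded from (the )?TCGA\b",
    r"\bTCGA (data ?set|cohort|samples?) (was|were) (obtained|analyz|used)",
    r"\bRNA-?seq data from (the )?TCGA\b",
]

def uses_tcga_data(full_text):
    """True if any data-usage pattern matches (case-insensitive)."""
    return any(re.search(p, full_text, re.IGNORECASE) for p in USAGE_PATTERNS)
```

The abstract's stated limitation applies here too: such hand-built rules are labor-intensive to curate, which motivates the automatic method it calls for.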
In the era of big data, national strategies and the rapid development of computer and storage technologies bring opportunities and challenges to libraries' data services. Based on a survey of the literature on scientific data services in university libraries in the United States, this paper analyzes the development of scientific data services from three aspects: service types, service modes, and service contents. The author also proposes ways to promote the development of library scientific data services from five aspects: policy support, strengthening publicity, self-learning, self-positioning, and relying on embedded subject librarians.
Scientific data citation is a common behavior in scientific research and academic writing under the data-intensive scientific research paradigm, and standardized citation of scientific data has received continuous attention from academia and policy management departments in recent years. To explore the characteristics and correlations of scientific data citations by Chinese researchers, this study used CNKI as the data source to extract 771 papers from 12 academic journals published between 2017 and 2019. Drawing on the Chinese national standard Information Technology - Scientific Data Citation (GB/T 35294-2017), a set of variables was defined to reflect citation characteristics. First, 4,992 citation records of scientific data were manually identified and coded one by one, and the citation characteristics were presented with the statistical distribution of data frequency. Then, the chi-square test, log-linear model analysis, and correspondence analysis were used to explore significant correlations among the characteristic variables. The study found that scientific data citation among Chinese researchers is widespread and the number of citations has increased year by year, but there are also a large number of irregular citations. At present, there are roughly two modes of citation labeling behavior, and the traditional document citation mode is still the mainstream method for data citation. Furthermore, the type of data distributor may affect how citations are marked. In addition, the completeness of the labeling elements differed across the bibliographic elements of scientific data. Irregular references to unique identifiers and resolution addresses are particularly prominent, which may be related to the type of distributor.
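The chi-square test of independence used on the citation characteristic variables can be computed directly from a contingency table of coded records. A minimal stdlib sketch (statistic and degrees of freedom only; the p-value lookup against the chi-square distribution is omitted):

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an
    r x c contingency table given as a list of rows of observed counts."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_sums) - 1)
    return stat, dof
```

For example, rows could be distributor types and columns the two citation labeling modes; a large statistic relative to the degrees of freedom indicates the association the study reports.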
The growing collection of scientific data in various web repositories is referred to as Scientific Big Data, as it fulfills the four "V's" of Big Data: volume, variety, velocity, and veracity. This phenomenon has created new opportunities for startups; for instance, the extraction of pertinent research papers from enormous knowledge repositories using innovative methods has become an important task for researchers and entrepreneurs. Traditionally, the content of the papers is compared to list the relevant papers from a repository. This conventional method results in a long list of papers that is often impossible to interpret productively, so a novel approach that intelligently utilizes the available data is needed. Moreover, the primary element of the scientific knowledge base is the research article, which consists of logical sections such as the Abstract, Introduction, Related Work, Methodology, Results, and Conclusion. This study utilizes these logical sections of research articles, because they hold significant potential for finding relevant papers. Comprehensive experiments were performed to determine the role of a logical-sections-based term indexing method in improving the quality of results (i.e., retrieving relevant papers). We therefore proposed, implemented, and evaluated the logical-sections-based content comparison method against a standard term indexing method. The section-based approach outperformed the standard content-based approach in identifying relevant documents across all classified topics of computer science, extracting 14% more relevant results from the entire dataset. As the experimental results suggest that employing a finer content similarity technique improves the quality of results, the proposed approach lays a foundation for knowledge-based startups.
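The section-based comparison can be sketched as computing a similarity per matching logical section and averaging, instead of one whole-document comparison. The term weighting (raw counts with cosine similarity) and the section keys are illustrative assumptions, not the paper's exact method.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bags of terms (lists of strings)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def section_similarity(query_sections, doc_sections):
    """Average similarity over shared logical sections (abstract,
    methodology, ...) rather than one whole-document comparison."""
    shared = set(query_sections) & set(doc_sections)
    if not shared:
        return 0.0
    return sum(cosine(query_sections[s], doc_sections[s]) for s in shared) / len(shared)
```

Ranking a repository by this score keeps a Methodology match from being drowned out by unrelated Introduction text, which is the intuition behind the reported 14% gain.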
In order to realize visualization of a three-dimensional data field (TDDF) in instruments, two methods of TDDF visualization and the usual manner of quick graphic and image processing are analyzed, and we describe how to use the OpenGL technique and the characteristics of the analyzed data to construct a TDDF, together with the methods of realistic processing and interactive processing. The medium geometric elements and a related realistic model are then constructed by means of the first algorithm, and models for attaching the third dimension in the three-dimensional data field are presented. An example of TDDF realization for machine measurement is provided. Analysis of the resulting graphics indicates that the three-dimensional graphics built by the method developed feature good realism, fast processing, and strong interactivity.
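Before geometric elements can be handed to OpenGL, a TDDF renderer must locate the cells where the surface of interest passes through the scalar field. A minimal sketch of that voxel-selection step follows; it is an illustration of the general idea, not the paper's specific algorithm.

```python
def boundary_voxels(field, iso):
    """Indices of voxels at or above `iso` that touch a voxel below it:
    the cells where a surface geometric element would be placed.
    `field` is a nested list indexed as field[x][y][z]."""
    nx, ny, nz = len(field), len(field[0]), len(field[0][0])
    out = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if field[x][y][z] < iso:
                    continue
                # check the six face neighbours for a crossing
                for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0),
                                   (0,-1,0), (0,0,1), (0,0,-1)):
                    i, j, k = x + dx, y + dy, z + dz
                    if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz \
                            and field[i][j][k] < iso:
                        out.append((x, y, z))
                        break
    return out
```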
Correction to: Radiation Detection Technology and Methods (2024) 8:1486-1495. https://doi.org/10.1007/s41605-024-00470-z. In the Methods section of the publication, the term "beamline scientific data acquisition system" was incorrectly written and redundant; it has been removed.
Providing knowledge graphs for materials science facilitates understanding of key data such as materials structure and properties and their relations. However, very little work has been devoted to this. Meanwhile, directly applying machine learning to materials computation still suffers from the lack of data and the cost of acquiring it. To tackle these problems, we propose literature-aided automatic entity and relation extraction using deliberately designed matching rules, especially for copper-based composites. Next, we fuse the knowledge by calculating semantic similarity. Finally, the materials knowledge graphs are constructed and visualized in the Neo4j graph database. The experimental results show that a total of 6,154 entities and 15,561 pairs of relations are extracted from 69,600 open-access documents on copper-based composites, with precision and accuracy rates over 80%. Further, we exemplify the effectiveness by building materials structure-property-value meta-paths and analyzing their impacts.
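Rule-based entity and relation extraction of the kind described can be illustrated with a single matching rule that pulls material-property-value triples (the meta-path elements) out of sentences. The pattern and the sentence shape it expects are invented for illustration; the study's actual rules for copper-based composites are more elaborate.

```python
import re

# Hypothetical matching rule: a copper-based material entity followed by
# a property and a value, e.g. "Cu-Al2O3 exhibits hardness of 120 HV".
PATTERN = re.compile(
    r"(?P<material>Cu[-/][A-Za-z0-9]+)\s+exhibits\s+"
    r"(?P<property>[a-z ]+?)\s+of\s+(?P<value>[\d.]+\s*\w+)"
)

def extract_triples(text):
    """Return (material, property, value) triples found in `text`."""
    return [(m["material"], m["property"], m["value"])
            for m in PATTERN.finditer(text)]
```

Each triple would become two nodes and a relation in the Neo4j graph; the similarity-based fusion step would then merge entities such as "Cu-Al2O3" and "Cu/Al2O3".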
Much research is dependent on Information and Communication Technologies (ICT). Researchers in different research domains have set up their own ICT systems (data labs) to support their research, from data collection (observation, experiment, simulation) through analysis (analytics, visualisation) to publication. However, too frequently the Digital Objects (DOs) upon which the research results are based are not curated and thus are neither available for reproduction of the research nor for utilization in other (e.g., multidisciplinary) research. The key to curation is rich metadata recording not only a description of the DO and the conditions of its use but also the provenance: the trail of actions performed on the DO along the research workflow. There are increasing real-world requirements for multidisciplinary research, yet with DOs held in domain-specific ICT systems (silos), commonly with inadequate metadata, such research is hindered. Despite wide agreement on principles for achieving FAIR (findable, accessible, interoperable, and reusable) utilization of research data, current practices fall short. FAIR DOs offer a way forward. The paradoxes, barriers, and possible solutions are examined. The key is persuading the researcher to adopt best practices, which implies decreasing the cost (easy-to-use autonomic tools) and increasing the benefit (incentives such as acknowledgement and citation) while maintaining researcher independence and flexibility.
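The provenance-rich metadata described above can be sketched as a minimal Digital Object record that accumulates a trail of workflow actions alongside the description and usage conditions. Field names and the example identifier are assumptions for illustration, not a standard FAIR schema.

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class DigitalObject:
    """Minimal FAIR-oriented metadata: identifier, description,
    usage conditions, and a provenance trail of workflow actions."""
    identifier: str
    description: str
    license: str
    provenance: list = field(default_factory=list)

    def record_action(self, actor, action):
        """Append one step of the research workflow to the trail."""
        self.provenance.append({
            "actor": actor,
            "action": action,
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        })

    def to_json(self):
        """Serialize for deposit in a repository or registry."""
        return json.dumps(asdict(self), indent=2)
```

An "easy to use autonomic tool" in the abstract's sense would call `record_action` automatically at each workflow step, so the researcher pays no extra cost for curation.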
Big data is a strategic highland in the era of knowledge-driven economies, and it is also a new type of strategic resource for all nations. Big data collected from space for Earth observation, so-called Big Earth Data, is creating new opportunities for the Earth sciences and revolutionizing the innovation of methodologies and thought patterns. It has the potential to advance the in-depth development of the Earth sciences and bring more exciting scientific discoveries. The Academic Divisions of the Chinese Academy of Sciences Forum on Frontiers of Science and Technology for Big Earth Data from Space was held in Beijing in June 2015. The forum analyzed the development of Earth observation technology and big data, explored the concepts and scientific connotations of Big Earth Data from space, discussed the correlation between Big Earth Data and Digital Earth, and dissected the potential of Big Earth Data from space to promote scientific discovery in the Earth sciences, especially concerning global change.
Big data is a revolutionary innovation that has allowed the development of many new methods in scientific research. This new way of thinking has encouraged the pursuit of new discoveries. Big data occupies the strategic high ground in the era of knowledge economies and also constitutes a new national and global strategic resource. "Big Earth Data", derived from, but not limited to, Earth observation, has macro-level capabilities that enable rapid and accurate monitoring of the Earth, and is becoming a new frontier contributing to the advancement of Earth science and significant scientific discoveries. Within the context of the development of big data, this paper analyzes the characteristics of scientific big data and recognizes its great potential for development, particularly with regard to the role that big Earth data can play in promoting the development of Earth science. On this basis, the paper outlines the Big Earth Data Science Engineering Project (CASEarth) of the Chinese Academy of Sciences Strategic Priority Research Program. Big data is at the forefront of the integration of geoscience, information science, and space science and technology, and it is expected that big Earth data will provide new prospects for the development of Earth science.
Animal models are crucial for the study of severe infectious diseases, which is essential for determining their pathogenesis and developing vaccines and drugs. Animal experiments involving risk group 3 agents such as SARS-CoV, HIV, M. tuberculosis, H7N9, and Brucella must be conducted in an Animal Biosafety Level 3 (ABSL-3) facility. Because of the in vivo work, the biosafety risk in ABSL-3 facilities is higher than that in BSL-3 facilities. Undoubtedly, management practices must be strengthened to ensure biosafety in ABSL-3 facilities. Meanwhile, we cannot ignore the need for reliable scientific results from the animal experiments conducted in ABSL-3 laboratories. It is of great practical significance to study overall biosafety concepts that can increase the quality of scientific data. Based on the management of animal experiments in the ABSL-3 Laboratory of Wuhan University, combined with relevant international and domestic literature, we identify the main safety issues and the factors affecting animal experiment results at ABSL-3 facilities. Based on these issues, management practices for animal experiments in ABSL-3 facilities are proposed, taking into account both biosafety and scientifically sound data.
Seeing Apollo 11 land on the Moon, downloading Pluto's pictures from New Horizons, receiving scientific data on comet 67P/Churyumov-Gerasimenko from Rosetta, and commanding Voyager 1 to turn its camera and photograph Earth from a record distance of about 6 billion kilometers: these and many other incredible achievements would have been impossible without increasingly advanced space communication and network technologies [1]. Nowadays, these emerging space technologies have made an ambitious move toward commercialization and are anticipated to be an indispensable enabling component of sixth-generation (6G) communications networks. For instance, numerous enterprises have envisaged the massive deployment of low Earth orbit (LEO) constellations to complement terrestrial networks and provide ubiquitous connectivity through their global footprint [2,3]. Nevertheless, intertwining terrestrial and aerospace components, the sophisticated 6G networks will involve new challenges and paradigm shifts in architectural, management, operational, and signal processing design [4-6].
Funding (Hefei Light Source study): supported by the Fundamental Research Funds for the Central Universities (WK2310000102).
Funding (study on scientific data preservation and sharing): Ministry of Science and Technology "National Science and Technology Platform Program" (2005DKA31800).
Funding (CTD feature representation study): supported by Agridata, a sub-program of the National Science and Technology Infrastructure Program (Grant No. 2005DKA31800).
Funding (flux tower observation study): Science and Technology Service Network Initiative of the Chinese Academy of Sciences (KFJ-SW-STS-169); National Natural Science Foundation of China (41671045 and 31600347).
Funding (TCGA data usage study): supported by the National Population and Health Scientific Data Sharing Program of China, the Knowledge Centre for Engineering Sciences and Technology (Medical Centre), and the Fundamental Research Funds for the Central Universities (Grant No. 13R0101).
Abstract: Purpose: In the open science era, it is typical to share project-generated scientific data by depositing it in an open and accessible database, while the resulting publications are preserved in digital library archives. It is challenging to identify the data usage mentioned in the literature and associate it with its source. Here, we investigated the data usage of a government-funded cancer genomics project, The Cancer Genome Atlas (TCGA), via a full-text literature analysis. Design/methodology/approach: We focused on identifying articles that used the TCGA dataset and constructing linkages between the articles and the specific TCGA data. First, we collected 5,372 TCGA-related articles from PubMed Central (PMC). Second, we constructed a benchmark set of 25 full-text articles that truly used TCGA data in their studies and summarized its key features. Third, these key features were applied to the remaining PMC full-text articles. Findings: The number of publications using TCGA data has increased significantly since 2011, although the TCGA project was launched in 2005. The critical areas of focus in studies using TCGA data were glioblastoma multiforme, lung cancer, and breast cancer, and data from the RNA-sequencing (RNA-seq) platform were the most frequently used. Research limitations: The current workflow for identifying articles that truly used TCGA data is labor-intensive; an automatic method is expected to improve performance. Practical implications: This study will help cancer genomics researchers track the latest advances in cancer molecular therapy, and it will promote data sharing and data-intensive scientific discovery. Originality/value: Few studies have investigated data usage by government-funded projects/programs since their launch. In this preliminary study, we extracted articles that use TCGA data from PMC and created links between the full-text articles and the source data.
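The benchmark-derived "key features" approach this abstract describes could be approximated by a small rule-based check. The specific patterns below (cohort codes, download phrasing, portal mentions) and the two-hit threshold are illustrative assumptions, not the features the study actually distilled.

```python
import re

# Hypothetical key features one might distill from a benchmark set:
# articles that truly used TCGA data tend to name a cohort code,
# describe downloading the data, or cite the data portal.
USAGE_PATTERNS = [
    re.compile(r"\bTCGA[- ][A-Z]{2,8}\b"),                 # cohort code, e.g. TCGA-GBM
    re.compile(r"download(ed)?\s+from\s+(the\s+)?TCGA", re.I),
    re.compile(r"TCGA\s+data\s+portal", re.I),
]

def uses_tcga_data(full_text, min_hits=2):
    """Flag an article as a true data user if enough patterns match."""
    hits = sum(bool(p.search(full_text)) for p in USAGE_PATTERNS)
    return hits >= min_hits
```

A threshold of two distinct feature hits trades recall for precision; merely mentioning the project name is not counted as data usage, which matches the distinction the study draws.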
Abstract: In the era of big data, national strategies and the rapid development of computing and storage technologies bring both opportunities and challenges to libraries' data services. Based on a survey of the literature on scientific data services in university libraries in the United States, the development of scientific data services is analyzed from three aspects: service types, service modes, and service contents. The paper also identifies opportunities and challenges in five areas, namely policy support, strengthening publicity, self-learning, self-positioning, and reliance on embedded subject librarians, to promote the development of library scientific data services.
Abstract: Scientific data citation is a common behavior in scientific research and academic writing under the data-intensive research paradigm, and standardized citation of scientific data has received sustained attention from academia and policy management departments in recent years. To explore the characteristics and correlations of scientific data citations by Chinese researchers, this study used CNKI as the data source to extract 771 papers from 12 academic journals published between 2017 and 2019. Drawing on the Chinese national standard Information Technology-Scientific Data Citation (GB/T 35294-2017), a set of variables was defined to reflect citation characteristics. First, 4,992 citation records of scientific data were manually identified and coded one by one, and citation characteristics were presented through the statistical distribution of frequencies. Then, chi-square tests, log-linear model analysis, and correspondence analysis were used to explore significant correlations among the characteristic variables. The study found that scientific data citation by Chinese researchers is widespread and that the number of citations has increased year by year, but many citations remain irregular. There are currently roughly two modes of citation labeling, and the traditional document citation mode is still the mainstream method for data citation. Furthermore, the type of data distributor may affect how citations are marked. In addition, the completeness of labeling differs across the bibliographic elements of scientific data; irregular references to unique identifiers and resolution addresses are particularly prominent, which may be related to the distributor type.
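The chi-square test the study applies to pairs of coded citation variables works on a contingency table of observed counts. A minimal sketch, with an entirely hypothetical 2x2 table (the study's actual variables and counts are not reproduced here):

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical cross-tabulation: citation labeling mode (rows)
# by distributor type (columns); counts are invented for illustration.
table = [[120, 30], [45, 60]]
stat = chi_square(table)
# df = (2-1)*(2-1) = 1; critical value at alpha = 0.05 is 3.841
significant = stat > 3.841
```

In practice one would use a library routine (e.g., SciPy's `chi2_contingency`) to obtain the p-value directly; the hand-rolled version above only shows the computation behind the test.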
Funding: Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (2020-0-01592) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2019R1F1A1058548)
Abstract: The growing collection of scientific data in various web repositories is referred to as Scientific Big Data, as it fulfills the four "V's" of Big Data: volume, variety, velocity, and veracity. This phenomenon has created new opportunities for startups; for instance, extracting pertinent research papers from enormous knowledge repositories using innovative methods has become an important task for researchers and entrepreneurs. Traditionally, the content of papers is compared to list the relevant papers in a repository, which yields a long list that is often impossible to interpret productively. A novel approach that intelligently utilizes the available data is therefore needed. The primary element of the scientific knowledge base is the research article, which consists of logical sections such as the Abstract, Introduction, Related Work, Methodology, Results, and Conclusion. This study utilizes these logical sections because they hold significant potential for finding relevant papers. Comprehensive experiments were performed to determine the role of section-based term indexing in improving the quality of results (i.e., retrieving relevant papers), and a section-based content comparison method was proposed, implemented, and evaluated against a standard term-indexing method. The section-based approach outperformed the standard content-based approach in identifying relevant documents across all classified topics of computer science, extracting 14% more relevant results from the entire dataset. As the experimental results suggest that a finer content similarity technique improves the quality of results, the proposed approach lays a foundation for knowledge-based startups.
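One way to picture section-based comparison, as opposed to whole-document comparison, is to score term overlap per logical section and weight the sections differently. The section weights and the Jaccard overlap below are assumptions for illustration; the paper's actual indexing and similarity scheme may differ.

```python
# Hypothetical section weights: terms in the abstract and methodology
# are assumed to carry a stronger relevance signal than other sections.
SECTION_WEIGHTS = {"abstract": 2.0, "methodology": 2.0,
                   "introduction": 1.0, "results": 1.5, "conclusion": 1.0}

def jaccard(a, b):
    """Set overlap between two term collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def section_similarity(query_terms, paper_sections):
    """Weighted average of per-section term overlap with the query.
    `paper_sections` maps section name -> list of index terms."""
    num = sum(w * jaccard(query_terms, paper_sections.get(s, []))
              for s, w in SECTION_WEIGHTS.items())
    return num / sum(SECTION_WEIGHTS.values())
```

Under this scheme, a paper matching the query in its methodology ranks above one matching only in its introduction, which is the kind of finer discrimination the abstract attributes to section-based indexing.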
Funding: Supported by the National Natural Science Foundation of China (No. 50405009)
Abstract: To realize visualization of a three-dimensional data field (TDDF) in instruments, two methods of TDDF visualization and the usual means of fast graphics and image processing are analyzed. The paper describes how to construct a TDDF using OpenGL and the characteristics of the analyzed data, along with approaches to realistic rendering and interactive processing. The intermediate geometric elements and a related realistic model are then constructed by means of the first algorithm, and models for attaching the third dimension in a three-dimensional data field are presented. An example of TDDF realization for machine measurement is provided. Analysis of the resulting graphics indicates that the three-dimensional graphics built by this method feature good realism, fast processing, and strong interactivity.
Abstract: Correction to: Radiation Detection Technology and Methods (2024) 8:1486-1495. https://doi.org/10.1007/s41605-024-00470-z. In the Methods section of this article, the term "beamline scientific data acquisition system" was incorrectly written and redundant; it has been removed.
Funding: Partially supported by the Natural Science Foundation of China (62062046)
Abstract: Providing knowledge graphs for materials science facilitates understanding of key data, such as material structures and properties, and their relations. However, very little work has been devoted to this, and immediately applying machine learning to materials computation still suffers from scarce and costly data acquisition. To tackle these problems, we propose literature-aided automatic entity and relation extraction using deliberately designed matching rules, especially for copper-based composites. We then fuse the extracted knowledge by calculating semantic similarity. Finally, the materials knowledge graphs are constructed and visualized in the Neo4j graph database. The experimental results show that 6,154 entities and 15,561 relation pairs are extracted from 69,600 open-access documents on copper-based composites, with precision and accuracy both above 80%. Further, we demonstrate the effectiveness of the approach by building materials structure-property-value meta-paths and analyzing their impacts.
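A rule-based extractor of the kind this abstract describes could look like the sketch below, which pulls material-property-value triples (the building blocks of the structure-property-value meta-paths mentioned) out of a sentence. The single regex rule is a made-up example; the paper's actual matching rules for copper-based composites are not reproduced here.

```python
import re

# Hypothetical matching rule: "<Material> has/exhibits/shows <property> of <value>"
TRIPLE_RULE = re.compile(
    r"(?P<material>[A-Z][\w/-]*(?:\s+composite)?)\s+"
    r"(?:has|exhibits|shows)\s+(?:a\s+)?"
    r"(?P<prop>[a-z ]+?)\s+of\s+"
    r"(?P<value>[\d.]+\s*\S+)"
)

def extract_triples(sentence):
    """Return (material, property, value) triples matched in a sentence."""
    m = TRIPLE_RULE.search(sentence)
    if not m:
        return []
    return [(m["material"], m["prop"].strip(), m["value"])]
```

Each extracted triple would then become two nodes and one relation in the graph store (e.g., Neo4j), with semantic-similarity fusion merging near-duplicate entities such as differently spelled material names.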
Abstract: Much research depends on Information and Communication Technologies (ICT). Researchers in different domains have set up their own ICT systems (data labs) to support their research, from data collection (observation, experiment, simulation) through analysis (analytics, visualization) to publication. However, too frequently the Digital Objects (DOs) on which research results are based are not curated and are thus available neither for reproducing the research nor for other (e.g., multidisciplinary) research purposes. The key to curation is rich metadata recording not only a description of the DO and the conditions of its use but also its provenance: the trail of actions performed on the DO along the research workflow. There are increasing real-world requirements for multidisciplinary research, yet with DOs held in domain-specific ICT systems (silos), commonly with inadequate metadata, such research is hindered. Despite wide agreement on principles for achieving FAIR (findable, accessible, interoperable, and reusable) utilization of research data, current practices fall short. FAIR DOs offer a way forward. The paradoxes, barriers, and possible solutions are examined; the key is persuading researchers to adopt best practices, which implies decreasing the cost (easy-to-use autonomic tools) and increasing the benefit (incentives such as acknowledgement and citation) while maintaining researcher independence and flexibility.
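The "rich metadata plus provenance trail" that this abstract identifies as the key to curation can be made concrete with a minimal record structure. The field names below are illustrative, not a FAIR standard or any real schema; a production system would use a registered metadata schema and persistent identifier service.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DigitalObject:
    """Minimal FAIR-style record: description, conditions of use,
    and a provenance trail of workflow actions (illustrative fields)."""
    pid: str               # persistent identifier (findable)
    description: str       # what the object is (interoperable/reusable)
    license: str           # conditions of use (reusable)
    provenance: list = field(default_factory=list)

    def record_action(self, actor, action):
        """Append one step of the research workflow to the trail."""
        self.provenance.append({
            "actor": actor,
            "action": action,
            "time": datetime.now(timezone.utc).isoformat(),
        })
```

The point of the sketch is that provenance is appended automatically as the workflow runs, which is exactly the low-cost, autonomic tooling the abstract argues researchers need in order to adopt curation at all.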
Funding: Supported by the Academic Divisions of the Chinese Academy of Sciences Forum on Frontiers of Science and Technology for Big Earth Data from Space
Abstract: Big data is a strategic highland in the era of knowledge-driven economies and a new type of strategic resource for all nations. Big data collected from space for Earth observation, so-called Big Earth Data, is creating new opportunities for the Earth sciences and revolutionizing methodologies and thought patterns, with the potential to advance the in-depth development of Earth sciences and bring exciting scientific discoveries. The Academic Divisions of the Chinese Academy of Sciences Forum on Frontiers of Science and Technology for Big Earth Data from Space was held in Beijing in June 2015. The forum analyzed the development of Earth observation technology and big data, explored the concepts and scientific connotations of Big Earth Data from space, discussed the relation between Big Earth Data and Digital Earth, and dissected the potential of Big Earth Data from space to promote scientific discovery in the Earth sciences, especially concerning global change.
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences: CASEarth (XDA19000000) and Digital Belt and Road (XDA19030000)
Abstract: Big data is a revolutionary innovation that has enabled the development of many new methods in scientific research, and this new way of thinking has encouraged the pursuit of new discoveries. Big data occupies the strategic high ground in the era of knowledge economies and constitutes a new national and global strategic resource. "Big Earth data", derived from, but not limited to, Earth observation, has macro-level capabilities that enable rapid and accurate monitoring of the Earth, and it is becoming a new frontier contributing to the advancement of Earth science and significant scientific discoveries. Within the context of big data development, this paper analyzes the characteristics of scientific big data and recognizes its great potential, particularly the role that big Earth data can play in promoting the development of Earth science. On this basis, the paper outlines the Big Earth Data Science Engineering Project (CASEarth) of the Chinese Academy of Sciences Strategic Priority Research Program. Big data is at the forefront of the integration of geoscience, information science, and space science and technology, and big Earth data is expected to open new prospects for the development of Earth science.
Funding: We are grateful for funding from the National Key Research and Development Program of China (Grant No. 2016YFC1202203)
Abstract: Animal models are crucial for the study of severe infectious diseases and essential for determining their pathogenesis and developing vaccines and drugs. Animal experiments involving risk group 3 agents such as SARS-CoV, HIV, M. tuberculosis, H7N9, and Brucella must be conducted in an Animal Biosafety Level 3 (ABSL-3) facility. Because of the in vivo work, the biosafety risk in ABSL-3 facilities is higher than in BSL-3 facilities, so management practices must undoubtedly be strengthened to ensure biosafety. At the same time, the reliability of scientific results obtained from animal experiments in ABSL-3 laboratories cannot be neglected, and studying overall biosafety concepts that can also improve scientific data quality is of great practical significance. Based on the management of animal experiments in the ABSL-3 Laboratory of Wuhan University, combined with relevant international and domestic literature, we identify the main safety issues and the factors affecting animal experiment results in ABSL-3 facilities. On this basis, we propose management practices for animal experiments in ABSL-3 facilities that take into account both biosafety and scientifically sound data.
Abstract: Seeing Apollo 11 land on the Moon, downloading Pluto's pictures from New Horizons, receiving scientific data on comet 67P/Churyumov-Gerasimenko from Rosetta, commanding Voyager 1 to turn its camera and photograph Earth from a record distance of about 6 billion kilometers: these and many other incredible achievements would have been impossible without increasingly advanced space communication and network technologies [1]. Nowadays, these emerging space technologies have made an ambitious move toward commercialization and are anticipated to be an indispensable enabling component of sixth-generation (6G) communication networks. For instance, numerous enterprises have envisaged massive deployment of low Earth orbit (LEO) constellations to complement terrestrial networks and provide ubiquitous connectivity through their global footprint [2,3]. Nevertheless, entangled with terrestrial and aerospace components, sophisticated 6G networks will involve new challenges and paradigm shifts in architectural, management, operational, and signal-processing design [4-6].