As a new type of production factor in healthcare, healthcare data elements have been rapidly integrated into various health production processes, such as clinical assistance, health management, biological testing, and operation and supervision [1,2]. Healthcare data elements include biological and clinical data that are related to disease, environmental health data that are associated with life, and operational and healthcare management data that are related to healthcare activities (Figure 1). Activities such as the construction of a data value assessment system, the development of a data circulation and sharing platform, and the authorization of data compliance and operation products support the strong growth momentum of the market for healthcare data elements in China [3].
The deployment of the Internet of Things (IoT) with smart sensors has facilitated the emergence of fog computing as an important technology for delivering services to smart environments such as campuses, smart cities, and smart transportation systems. Fog computing tackles a range of challenges, including processing, storage, bandwidth, latency, and reliability, by locally distributing secure information through end nodes. Consisting of endpoints, fog nodes, and back-end cloud infrastructure, it provides advanced capabilities beyond traditional cloud computing. In smart environments, particularly within smart city transportation systems, the abundance of devices and nodes poses significant challenges related to power consumption and system reliability. To address the challenges of latency, energy consumption, and fault tolerance in these environments, this paper proposes a latency-aware, fault-tolerant framework for resource scheduling and data management, referred to as the FORD framework, for smart cities in fog environments. This framework is designed to meet the demands of time-sensitive applications, such as those in smart transportation systems. The FORD framework incorporates latency-aware resource scheduling to optimize task execution in smart city environments, leveraging resources from both fog and cloud environments. Through simulation-based executions, tasks are allocated to the nearest available nodes with minimum latency. In the event of execution failure, a fault-tolerant mechanism is employed to ensure the successful completion of tasks. Upon successful execution, data is efficiently stored in the cloud data center, ensuring data integrity and reliability within the smart city ecosystem.
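The allocation rule described in this abstract (send each task to the nearest available node with minimum latency, and fall back to another node on failure) can be illustrated with a minimal sketch. This is not the FORD authors' implementation: the `Node` class, the `schedule` function, the `execute` callback, and all latency values are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    latency_ms: float       # assumed latency measured from the task source
    available: bool = True


def schedule(task: str, nodes: List[Node],
             execute: Callable[[str, Node], bool]) -> Optional[str]:
    """Try available nodes in order of increasing latency.

    Returns the name of the node that completed the task, or None if
    every available node failed.
    """
    candidates = sorted((n for n in nodes if n.available),
                        key=lambda n: n.latency_ms)
    for node in candidates:
        if execute(task, node):   # True means the task finished
            return node.name
        # On failure, fall through to the next-lowest-latency node.
    return None


# Hypothetical topology: three fog nodes and one cloud data center.
nodes = [
    Node("cloud-dc", latency_ms=80.0),
    Node("fog-a", latency_ms=5.0),
    Node("fog-b", latency_ms=12.0),
    Node("fog-c", latency_ms=3.0, available=False),
]


def flaky_execute(task: str, node: Node) -> bool:
    """Hypothetical executor in which fog-a always fails."""
    return node.name != "fog-a"


chosen = schedule("traffic-update", nodes, flaky_execute)
print(chosen)
```

In this sketch the cloud node acts as the last resort simply because it has the highest latency; a real scheduler would also weigh energy consumption and node load, which the abstract mentions but does not specify.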
With the rise of data-intensive research, data literacy has become a critical capability for improving scientific data quality and achieving artificial intelligence (AI) readiness. In the biomedical domain, data are characterized by high complexity and privacy sensitivity, calling for robust and systematic data management skills. This paper reviews current trends in scientific data governance and the evolving policy landscape, highlighting persistent challenges such as inconsistent standards, semantic misalignment, and limited awareness of compliance. These issues are largely rooted in the lack of structured training and practical support for researchers. In response, this study builds on existing data literacy frameworks and integrates the specific demands of biomedical research to propose a comprehensive, lifecycle-oriented data literacy competency model with an emphasis on ethics and regulatory awareness. Furthermore, it outlines a tiered training strategy tailored to different research stages (undergraduate, graduate, and professional), offering theoretical foundations and practical pathways for universities and research institutions to advance data literacy education.
We propose a Cross-Chain Mapping Blockchain (CCMB) for scalable data management in massive Internet of Things (IoT) networks. Specifically, CCMB aims to improve the scalability of securely storing, tracing, and transmitting IoT behavior and reputation data based on our proposed cross-mapped Behavior Chain (BChain) and Reputation Chain (RChain). To improve off-chain IoT data storage scalability, we show that our lightweight CCMB architecture efficiently utilizes available fog-cloud resources. The scalability of on-chain IoT data tracing is enhanced using our Mapping Smart Contract (MSC) and cross-chain mapping design to perform rapid Reputation-to-Behavior (R2B) traceability queries between BChain and RChain blocks. To maximize off-chain to on-chain throughput, we optimize the CCMB block settings and producers based on a general Poisson Point Process (PPP) network model. The constrained optimization problem is formulated as a Markov Decision Process (MDP), and solved using a dual-network Deep Reinforcement Learning (DRL) algorithm. Simulation results validate CCMB’s scalability advantages in storage, traceability, and throughput. In specific massive IoT scenarios, CCMB can reduce the storage footprint by 50% and traceability query time by 90%, while improving system throughput by 55% compared to existing benchmarks.
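To make the idea of a Reputation-to-Behavior (R2B) traceability query concrete, the sketch below keeps an explicit mapping from a reputation block to the behavior blocks it was derived from, so a trace is a direct lookup rather than a scan of the behavior chain. It is an illustrative assumption only: the `MappingIndex` class stands in for the role the abstract assigns to the Mapping Smart Contract, and the block fields, heights, and device names are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BehaviorBlock:
    height: int
    device_id: str
    behavior: str


@dataclass(frozen=True)
class ReputationBlock:
    height: int
    device_id: str
    score: float


class MappingIndex:
    """Hypothetical stand-in for a mapping contract: RChain height -> BChain heights."""

    def __init__(self) -> None:
        self._r2b: Dict[int, List[int]] = {}

    def record(self, r_height: int, b_heights: List[int]) -> None:
        # Store which behavior blocks a reputation block was computed from.
        self._r2b[r_height] = list(b_heights)

    def trace(self, r_height: int,
              bchain: Dict[int, BehaviorBlock]) -> List[BehaviorBlock]:
        # Direct lookup by height; unknown reputation blocks yield no evidence.
        return [bchain[h] for h in self._r2b.get(r_height, [])]


# Hypothetical behavior chain, keyed by block height.
bchain = {
    0: BehaviorBlock(0, "dev-1", "sent 10 packets"),
    1: BehaviorBlock(1, "dev-2", "dropped 3 packets"),
    2: BehaviorBlock(2, "dev-1", "sent 7 packets"),
}

rblock = ReputationBlock(0, "dev-1", 0.9)
index = MappingIndex()
index.record(rblock.height, [0, 2])

evidence = index.trace(rblock.height, bchain)
print([b.behavior for b in evidence])
```

The design choice illustrated here is only the indexing step: storing the cross-chain references at write time trades a small amount of extra storage for query time that does not grow with the length of the behavior chain.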
The National Population Health Data Center (NPHDC) is one of China's 20 national-level science data centers, jointly designated by the Ministry of Science and Technology and the Ministry of Finance. Operated by the Chinese Academy of Medical Sciences under the oversight of the National Health Commission, NPHDC adheres to national regulations including the Scientific Data Management Measures and the National Science and Technology Infrastructure Service Platform Management Measures, and is committed to collecting, integrating, managing, and sharing biomedical and health data through an open-access platform, fostering open sharing and engaging in international cooperation.
In the context of the rapid development of digital education, the security of educational data has become an increasing concern. This paper explores strategies for the classification and grading of educational data, and constructs a higher-education data security management and control model centered on the integration of medical and educational data. By implementing a multi-dimensional strategy of dynamic classification, real-time authorization, and secure execution through educational data security levels, dynamic access control is applied to effectively enhance the security and controllability of educational data, providing a secure foundation for data sharing and openness.
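The combination of dynamic classification and real-time authorization described above can be sketched as follows. The abstract does not give the paper's levels or rules, so everything here is an assumption for illustration: the three `Level` values, the record fields, and the `classify` and `authorize` functions are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical data security levels, ordered from least to most restricted."""
    PUBLIC = 1
    INTERNAL = 2
    SENSITIVE = 3


def classify(record: dict) -> Level:
    """Assign a level from the record's current content (assumed rules)."""
    if record.get("contains_health_info"):
        return Level.SENSITIVE
    if record.get("contains_student_id"):
        return Level.INTERNAL
    return Level.PUBLIC


def authorize(clearance: Level, record: dict) -> bool:
    """Decide at request time: the record is re-classified on every call."""
    return clearance >= classify(record)


record = {"contains_student_id": True}
before = authorize(Level.INTERNAL, record)   # INTERNAL user, INTERNAL record

# The record later gains medical data, so its level rises dynamically.
record["contains_health_info"] = True
after = authorize(Level.INTERNAL, record)

print(before, after)
```

Because the level is computed when access is requested rather than stored once, a change in a record's content changes the authorization decision without any separate re-labeling step.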
This article introduces the methodologies and instrumentation for data measurement and propagation at the Back-n white neutron facility of the China Spallation Neutron Source. The Back-n facility employs backscattering techniques to generate a broad spectrum of white neutrons. Equipped with advanced detectors such as the light particle detector array and the fission ionization chamber detector, the facility achieves high-precision data acquisition through a general-purpose electronics system. Data are managed and stored in a hierarchical system supported by the National High Energy Physics Science Data Center, ensuring long-term preservation and efficient access. The data from the Back-n experiments significantly contribute to nuclear physics, reactor design, astrophysics, and medical physics, enhancing the understanding of nuclear processes and supporting interdisciplinary research.
Mobile networks hold a wealth of information and are thus considered a gold mine for the research community. The call detail records (CDR) of a mobile network are used to assess the network’s efficacy and mobile users’ behavior. Recent literature shows that cyber-physical systems (CPS) have been used in the analytics and modeling of telecom data. In addition, CPS are used to provide valuable services in smart cities. A typical telecom company has millions of subscribers and thus generates massive amounts of data, so data storage, analysis, and processing are key concerns. To address these issues, we propose a multilevel cyber-physical social system (CPSS) for the analysis and modeling of large internet data. The proposed multilevel system has three levels, each with a specific functionality. At the first level, raw call detail records (CDR) are collected, and data preprocessing, cleaning, and error removal are performed. At the second level, data processing, cleaning, reduction, integration, and storage are performed, and the suggested internet activity record measures are applied. The proposed system first constructs a graph and then performs network analysis. The proposed CPSS thus accurately identifies the areas of peak internet usage in a city (Milan). Our research helps network operators plan effective network configuration, management, and resource optimization.
This research paper compares Excel and the R language for data analysis and concludes that R is more suitable for complex data analysis tasks. R’s open-source nature makes it accessible to everyone, and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks. It is also highly customizable, allowing users to create custom functions and packages to meet their specific needs. Additionally, R provides high reproducibility, making it easy to replicate and verify research results, and it has excellent collaboration capabilities, enabling multiple users to work on the same project simultaneously. These advantages make R a more suitable choice for complex data analysis tasks, particularly in scientific research and business applications. The findings of this study will help people understand that R is not just a language that can handle more data than Excel, and demonstrate that R is essential to the field of data analysis. They will also help users and organizations make informed decisions regarding their data analysis needs and software preferences.
This study aims to investigate the influence of social media on college choice among undergraduates majoring in Big Data Management and Application in China. The study attempts to reveal how information on social media platforms such as Weibo, WeChat, and Zhihu influences the cognition and choice process of prospective students. Using an online quantitative survey questionnaire, data were collected from the 2022 and 2023 classes of new students majoring in Big Data Management and Application at Guilin University of Electronic Technology. The aim was to evaluate the role of social media in their college choice process and to understand the features and information that most attract prospective students. Social media has become a key factor influencing the college choice decision-making of undergraduates majoring in Big Data Management and Application in China. Students tend to obtain school information through social media platforms and use this information as an important reference in their decision-making process. Higher education institutions should strengthen their social media information dissemination, providing accurate, timely, and attractive information. It is also necessary to ensure effective management of social media platforms, maintain a positive reputation for the school on social media, and increase the interest and trust of prospective students. At the same time, educational decision-makers should consider incorporating social media analysis into their recruitment strategies to better attract new students. This study provides a new perspective for understanding higher education choice behavior in the digital age, particularly by revealing the importance of social media in the educational decision-making process. This has important practical and theoretical implications for higher education institutions, policymakers, and social media platform operators.
As an introductory course for the emerging major of big data management and application, “Introduction to Big Data” does not yet have a widely accepted curriculum standard and implementation plan. To this end, we discuss some of our explorations and attempts in the construction and teaching of big data courses for the major of big data management and application from the perspectives of course planning, course implementation, and course summary. Interviews with students and questionnaire feedback indicate that students are highly satisfied with some of the teaching measures and programs currently adopted.
As technology and the internet develop, more data are generated every day. These data are large in size, high in dimension, and complex in structure. The combination of these three features is what defines “Big Data” [1]. Big data is revolutionizing all industries, with colossal impacts on them [2]. Many researchers have pointed out the huge impact that big data can have on our daily lives [3]. We can use the information we obtain to help us make decisions. The conclusions we draw from the big data we analyze can also be used to predict the future, helping us make more accurate and beneficial decisions earlier than others. If we apply these techniques in finance, for example to stocks, we can obtain detailed information about individual stocks. Moreover, we can use the analyzed data to make predictions about certain stocks. This can help people decide whether or not to buy a stock by providing predictions at a given confidence level, helping to protect them from potential losses.
The critical role of patient-reported outcome measures (PROMs) in enhancing clinical decision-making and promoting patient-centered care has gained profound significance in scientific research. PROMs capture a patient's health status directly from their perspective, encompassing various domains such as symptom severity, functional status, and overall quality of life. By integrating PROMs into routine clinical practice and research, healthcare providers can achieve a more nuanced understanding of patient experiences and tailor treatments accordingly. The deployment of PROMs supports dynamic patient-provider interactions, fostering better patient engagement and adherence to treatment plans. Moreover, PROMs are pivotal in clinical settings for monitoring disease progression and treatment efficacy, particularly in chronic and mental health conditions. However, challenges in implementing PROMs include data collection and management, integration into existing health systems, and acceptance by patients and providers. Overcoming these barriers necessitates technological advancements, policy development, and continuous education to enhance the acceptability and effectiveness of PROMs. The paper concludes with recommendations for future research and policy-making aimed at optimizing the use and impact of PROMs across healthcare settings.
On the basis of the definition and connotation of PDM (product data management), the factors that ensure implementation success are analyzed. The definition phase, analysis phase, design phase, build and test phase, and post-production phase of PDM implementation are described. The implementation is divided into ten processes, which span the above phases. The relationships between phases and processes are illustrated. Finally, a workflow is proposed to guide implementation at a fixed price.
Cloud-native data warehouses have revolutionized data analysis by enabling elasticity, high availability, and lower costs. The increasing popularity of artificial intelligence (AI) drives data warehouses to provide predictive analytics in addition to the existing descriptive analytics. Consequently, more vendors are starting to support training and inference of AI models in data warehouses, exploiting the benefits of near-data processing for fast model development and deployment. However, most existing solutions are limited by a complex syntax or slow data transportation across engines. In this paper, we present GaussDB-AISQL, a composable SQL system with AI capabilities. GaussDB-AISQL adopts a composable system design that decouples computing, storage, caching, the DB engine, and the AI engine. Our system offers all the functionality needed for end-to-end model training and inference during the model lifecycle. It also offers simplicity and efficiency through a SQL-like syntax and removes the burden of manual model management. When training an AI model, GaussDB-AISQL benefits from highly parallel data transportation by concurrently pulling data from the distributed shared memory. The feature selection algorithms in GaussDB-AISQL make the training more data-efficient. When running model inference, GaussDB-AISQL registers the trained model object in the local data warehouse as a user-defined function, which avoids moving inference data out of the data warehouse to an external AI engine. Experiments show that GaussDB-AISQL is up to 19× faster than baseline approaches.
The rapid advancement of artificial intelligence technology is driving transformative changes in medical diagnosis, treatment, and management systems through large-scale deep learning models, a process that brings both groundbreaking opportunities and multifaceted challenges. This study focuses on the medical and healthcare applications of large-scale deep learning architectures, conducting a comprehensive survey to categorize and analyze their diverse uses. The survey results reveal that current applications of large models in healthcare encompass medical data management, healthcare services, medical devices, and preventive medicine, among others. Concurrently, large models demonstrate significant advantages in the medical domain, especially in high-precision diagnosis and prediction, data analysis and knowledge discovery, and enhancing operational efficiency. Nevertheless, we identify several challenges that need urgent attention, including improving the interpretability of large models, strengthening privacy protection, and addressing issues related to handling incomplete data. This research is dedicated to systematically elucidating the deep collaborative mechanisms between artificial intelligence and the healthcare field, providing theoretical references and practical guidance for both academia and industry.
The Internet of Things (IoT) technology provides new impetus for the development of building intelligence. This research focuses on the intelligent design and management of buildings based on IoT engineering. It expounds on the system design principles such as sensor technology, communication network technology, and data storage and analysis, and analyzes the key points of design, including design requirement analysis, equipment layout, and system integration. Through specific cases, it demonstrates the application practice of the system in buildings, and presents the application effect of intelligent system management with multi-parameter values, providing theoretical and practical references for the development of building intelligence and helping to achieve efficient, energy-saving, and safe building operation.
The CifNet network multi-well data management system is developed for the 100MB or 1000MB local network environments used in the Chinese oil industry. The core techniques of the CifNet system include: (1) establishing a highly efficient, low-cost network multi-well data management architecture based on the General Logging Curve Theory and the Cif data format; (2) implementing efficient access to and transmission of multi-well data in a C/S local network based on the TCP/IP protocol; and (3) ensuring the safety of multi-well data in storage, access, and application based on Unix operating system security. Using the CifNet system, a researcher in the office or at home can access the curves of any borehole in any working area of any oilfield. The application prospects of the CifNet system are also discussed.
The mining industry faces a number of challenges that promote the adoption of new technologies. Big data, which is driven by the accelerating progress of information and communication technology, is one of the promising technologies that can reshape the entire mining landscape. Despite numerous attempts to apply big data in the mining industry, fundamental problems of big data, especially big data management (BDM), in the mining industry persist. This paper aims to fill the gap by presenting the basics of BDM. This work provides a brief introduction to big data and BDM, and it discusses the challenges encountered by the mining industry to indicate the necessity of implementing big data. It also summarizes data sources in the mining industry and presents the potential benefits of big data to the mining industry. This work also envisions a future in which a global database project is established and big data is used together with other technologies (i.e., automation), supported by government policies and following international standards. This paper also outlines the precautions for the utilization of BDM in the mining industry.
The wealth of user data acts as fuel for network intelligence toward sixth-generation wireless networks (6G). Due to data heterogeneity and dynamics, decentralized data management (DM) is desirable for achieving transparent data operations across network domains, and blockchain can be a promising solution. However, the increasing data volume and stringent data privacy-preservation requirements in 6G pose significant technical challenges in balancing transparency, efficiency, and privacy requirements in decentralized blockchain-based DM. In this paper, we investigate blockchain solutions to address these challenges. First, we explore the consensus protocols and scalability mechanisms in blockchains and discuss the roles of DM stakeholders in blockchain architectures. Second, we investigate the authentication and authorization requirements for DM stakeholders. Third, we categorize DM privacy requirements and study blockchain-based mechanisms for collaborative data processing. Subsequently, we present research issues and potential solutions for blockchain-based DM toward 6G from these three perspectives. Finally, we conclude this paper and discuss future research directions.
Funding: The study on healthcare data elements was supported by the National Natural Science Foundation of China (Grants 72474022, 71974011, 72174022, 71972012, 71874009) and the "BIT think tank" Promotion Plan of the Science and Technology Innovation Program of Beijing Institute of Technology (Grants 2024CX14017, 2023CX13029).
Funding: The study on the FORD framework was supported by the Deanship of Scientific Research and Graduate Studies at King Khalid University under research grant number R.G.P.2/93/45.
Funding: The study on CCMB was supported in part by the National Key Research and Development Program of China under Grant 2023YFB3106900, the National Natural Science Foundation of China under Grant 62171113, and the China Scholarship Council under Grant 202406080100.
Funding: Supported by the 2023 Basic Public Welfare Research Project of the Wenzhou Science and Technology Bureau, “Research on Multi-Source Data Classification and Grading Standards and Intelligent Algorithms for Higher Education Institutions” (Project No. G2023094); Major Humanities and Social Sciences Research Projects in Zhejiang higher education institutions (Grant/Award Number: 2024QN061); and the 2023 Basic Public Welfare Research Project of Wenzhou (No. S2023014).
Abstract: In the context of the rapid development of digital education, the security of educational data has become an increasing concern. This paper explores strategies for the classification and grading of educational data, and constructs a higher education data security management and control model centered on the integration of medical and educational data. By implementing a multi-dimensional strategy of dynamic classification, real-time authorization, and secure execution based on educational data security levels, dynamic access control is applied to enhance the security and controllability of educational data, providing a secure foundation for data sharing and openness.
Funding: Supported by the National Key Research and Development Plan (No. 2023YFA1606602).
Abstract: This article introduces the methodologies and instrumentation for data measurement and propagation at the Back-n white neutron facility of the China Spallation Neutron Source. The Back-n facility employs backscattering techniques to generate a broad spectrum of white neutrons. Equipped with advanced detectors such as the light particle detector array and the fission ionization chamber detector, the facility achieves high-precision data acquisition through a general-purpose electronics system. Data are managed and stored in a hierarchical system supported by the National High Energy Physics Science Data Center, ensuring long-term preservation and efficient access. The data from the Back-n experiments significantly contribute to nuclear physics, reactor design, astrophysics, and medical physics, enhancing the understanding of nuclear processes and supporting interdisciplinary research.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2021R1A6A1A03039493).
Abstract: Mobile networks hold significant information and are thus considered a gold mine for the research community. The call detail records (CDR) of a mobile network are used to assess the network’s efficacy and the mobile users’ behavior. Recent literature shows that cyber-physical systems (CPS) have been used in the analytics and modeling of telecom data. In addition, CPS are used to provide valuable services in smart cities. In general, a typical telecom company has millions of subscribers and thus generates massive amounts of data. Data storage, analysis, and processing are therefore the key concerns. To address these issues, we propose a multilevel cyber-physical social system (CPSS) for the analysis and modeling of large internet data. The proposed multilevel system has three levels, each with a specific functionality. At the first level, raw CDR data are collected, and data preprocessing, cleaning, and error removal are performed. At the second level, data processing, reduction, integration, and storage are performed, and the suggested internet activity record measures are applied. The proposed system first constructs a graph and then performs network analysis. The proposed CPSS thus accurately identifies different areas of peak internet usage in a city (Milan). Our research helps network operators plan effective network configuration, management, and optimization of resources.
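The abstract gives no detail on the record schema or on the graph construction, so the sketch below swaps in a much simpler technique, plain per-cell aggregation, purely to illustrate what "identifying areas of peak internet usage" from CDR-style data can look like. The record layout `(grid_cell_id, hour_of_day, internet_activity)` and the function name `peak_usage` are assumptions made for illustration, and the values are made up.

```python
from collections import defaultdict

# Hypothetical records: (grid_cell_id, hour_of_day, internet_activity)
records = [
    (101, 9, 12.0), (101, 18, 30.5),
    (102, 9, 4.0), (102, 18, 6.5),
    (103, 9, 20.0), (103, 18, 22.0),
]


def peak_usage(records):
    """Rank cells by total activity and find each cell's busiest hour."""
    totals = defaultdict(float)
    by_hour = defaultdict(lambda: defaultdict(float))
    for cell, hour, activity in records:
        totals[cell] += activity
        by_hour[cell][hour] += activity
    # Busiest hour per cell: the hour with the largest summed activity.
    peak_hour = {cell: max(hours, key=hours.get) for cell, hours in by_hour.items()}
    # Cells ordered from highest to lowest total activity.
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked, peak_hour


ranked_cells, peak_hour = peak_usage(records)
```

A graph-based analysis, as the abstract mentions, would add edges between cells (for example by adjacency or traffic exchange) on top of per-cell totals like these.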
Abstract: This research paper compares Excel and the R language for data analysis and concludes that R is more suitable for complex data analysis tasks. R’s open-source nature makes it accessible to everyone, and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks. It is also highly customizable, allowing users to create custom functions and packages to meet their specific needs. Additionally, R provides high reproducibility, making it easy to replicate and verify research results, and it has excellent collaboration capabilities, enabling multiple users to work on the same project simultaneously. These advantages make R a more suitable choice for complex data analysis tasks, particularly in scientific research and business applications. The findings of this study will help people understand that R is not just a language that can handle more data than Excel, and demonstrate that R is essential to the field of data analysis. They will also help users and organizations make informed decisions regarding their data analysis needs and software preferences.
Abstract: This study aims to investigate the influence of social media on college choice among undergraduates majoring in Big Data Management and Application in China. The study attempts to reveal how information on social media platforms such as Weibo, WeChat, and Zhihu influences the cognition and choice process of prospective students. By employing an online quantitative survey questionnaire, data were collected from the 2022 and 2023 classes of new students majoring in Big Data Management and Application at Guilin University of Electronic Technology. The aim was to evaluate the role of social media in their college choice process and understand the features and information that most attract prospective students. Social media has become a key factor influencing the college choice decision-making of undergraduates majoring in Big Data Management and Application in China. Students tend to obtain school information through social media platforms and use this information as an important reference in their decision-making process. Higher education institutions should strengthen their social media information dissemination, providing accurate, timely, and attractive information. It is also necessary to ensure effective management of social media platforms, maintain a positive reputation for the school on social media, and increase the interest and trust of prospective students. Simultaneously, educational decision-makers should consider incorporating social media analysis into their recruitment strategies to better attract new student enrollment. This study provides a new perspective for understanding higher education choice behavior in the digital age, particularly by revealing the importance of social media in the educational decision-making process. This has important practical and theoretical implications for higher education institutions, policymakers, and social media platform operators.
Abstract: As an introductory course for the emerging major of big data management and application, “Introduction to Big Data” does not yet have a widely accepted curriculum standard and implementation plan. To this end, we discuss some of our explorations and attempts in the construction and teaching of big data courses for the major of big data management and application from the perspective of course planning, course implementation, and course summary. Interviews with students and questionnaire feedback indicate that students are highly satisfied with some of the teaching measures and programs currently adopted.
Abstract: As technology and the internet develop, more data are generated every day. These data are large in size, high in dimension, and complex in structure. The combination of these three features is what is called “Big Data” [1]. Big data is revolutionizing all industries, with colossal impacts on them [2]. Many researchers have pointed out the huge impact that big data can have on our daily lives [3]. We can use the information we obtain to help us make decisions. The conclusions drawn from the analyzed big data can also be used to predict the future, helping us make more accurate and sound decisions earlier than others. If we apply these techniques in finance, for example to stocks, we can obtain detailed information about them. Moreover, we can use the analyzed data to make predictions for certain stocks. Providing predicted data at a certain level of confidence can help people decide whether to buy a stock, helping to protect them from potential losses.
Abstract: The critical role of patient-reported outcome measures (PROMs) in enhancing clinical decision-making and promoting patient-centered care has gained profound significance in scientific research. PROMs encapsulate a patient's health status directly from their perspective, encompassing various domains such as symptom severity, functional status, and overall quality of life. By integrating PROMs into routine clinical practice and research, healthcare providers can achieve a more nuanced understanding of patient experiences and tailor treatments accordingly. The deployment of PROMs supports dynamic patient-provider interactions, fostering better patient engagement and adherence to treatment plans. Moreover, PROMs are pivotal in clinical settings for monitoring disease progression and treatment efficacy, particularly in chronic and mental health conditions. However, challenges in implementing PROMs include data collection and management, integration into existing health systems, and acceptance by patients and providers. Overcoming these barriers necessitates technological advancements, policy development, and continuous education to enhance the acceptability and effectiveness of PROMs. The paper concludes with recommendations for future research and policy-making aimed at optimizing the use and impact of PROMs across healthcare settings.
Abstract: Based on the definition and connotation of product data management (PDM), the factors that ensure successful implementation are analyzed. The definition phase, analysis phase, design phase, build and test phase, and post-production phase of PDM implementation are described. The implementation is divided into ten processes, which span these phases. The relationships between phases and processes are illustrated. Finally, a workflow is proposed to guide implementation at a fixed price.
Funding: Supported by the fund for building world-class universities (disciplines) of Renmin University of China.
Abstract: Cloud-native data warehouses have revolutionized data analysis by enabling elasticity, high availability, and lower costs. The increasing popularity of artificial intelligence (AI) drives data warehouses to provide predictive analytics in addition to the existing descriptive analytics. Consequently, more vendors have started to support training and inference of AI models in data warehouses, exploiting the benefits of near-data processing for fast model development and deployment. However, most existing solutions are limited by complex syntax or slow data transportation across engines. In this paper, we present GaussDB-AISQL, a composable SQL system with AI capabilities. GaussDB-AISQL adopts a composable system design that decouples computing, storage, caching, the DB engine, and the AI engine. Our system offers all the functionality needed for end-to-end model training and inference during the model lifecycle. It also offers simplicity and efficiency by providing a SQL-like syntax and removing the burden of manual model management. When training an AI model, GaussDB-AISQL benefits from highly parallel data transportation through concurrent data pulling from the distributed shared memory. The feature selection algorithms in GaussDB-AISQL make training more data-efficient. When running model inference, GaussDB-AISQL registers the trained model object in the local data warehouse as a user-defined function, which avoids moving inference data out of the data warehouse to an external AI engine. Experiments show that GaussDB-AISQL is up to 19× faster than baseline approaches.
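The idea of registering a trained model as a user-defined function so that inference runs inside the database can be sketched with Python's built-in `sqlite3` module. This is a stand-in only: it does not use GaussDB-AISQL or its SQL syntax, which the abstract does not show. The table `samples`, the function name `predict_y`, and the tiny least-squares model are all assumptions chosen for illustration.

```python
import sqlite3


def train_linear(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# In-memory database standing in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, 3), (2, 5), (3, 7), (4, 9)],
)

# "Training": pull the rows and fit the model.
rows = conn.execute("SELECT x, y FROM samples").fetchall()
slope, intercept = train_linear([r[0] for r in rows], [r[1] for r in rows])

# Register the trained model as a SQL function taking one argument.
conn.create_function("predict_y", 1, lambda x: slope * x + intercept)

# Inference is now an ordinary query; no rows leave the database engine.
predictions = conn.execute(
    "SELECT x, predict_y(x) FROM samples ORDER BY x"
).fetchall()
```

Once the function is registered, any SQL statement can call it, which mirrors the stated benefit of not exporting inference data to an external AI engine.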
Funding: Funded by the National Natural Science Foundation of China (Grant No. 62272236) and the Natural Science Foundation of Jiangsu Province (Grant No. BK20201136).
Abstract: The rapid advancement of artificial intelligence technology is driving transformative changes in medical diagnosis, treatment, and management systems through large-scale deep learning models, a process that brings both groundbreaking opportunities and multifaceted challenges. This study focuses on the medical and healthcare applications of large-scale deep learning architectures, conducting a comprehensive survey to categorize and analyze their diverse uses. The survey results reveal that current applications of large models in healthcare encompass medical data management, healthcare services, medical devices, and preventive medicine, among others. Concurrently, large models demonstrate significant advantages in the medical domain, especially in high-precision diagnosis and prediction, data analysis and knowledge discovery, and enhancing operational efficiency. Nevertheless, we identify several challenges that need urgent attention, including improving the interpretability of large models, strengthening privacy protection, and addressing issues related to handling incomplete data. This research is dedicated to systematically elucidating the deep collaborative mechanisms between artificial intelligence and the healthcare field, providing theoretical references and practical guidance for both academia and industry.
Abstract: Internet of Things (IoT) technology provides new impetus for the development of building intelligence. This research focuses on the intelligent design and management of buildings based on IoT engineering. It expounds on system design principles such as sensor technology, communication network technology, and data storage and analysis, and analyzes the key points of design, including design requirement analysis, equipment layout, and system integration. Through specific cases, it demonstrates the application of the system in buildings and presents the effect of intelligent system management with multi-parameter values, providing theoretical and practical references for the development of building intelligence and helping to achieve efficient, energy-saving, and safe building operation.
Abstract: The CifNet network multi-well data management system is developed for the 100MB or 1000MB local network environments used in the Chinese oil industry. The core techniques of the CifNet system include: (1) establishing a highly efficient and low-cost network multi-well data management architecture based on the General Logging Curve Theory and the Cif data format; (2) implementing efficient access and transmission of multi-well data in a C/S local network based on the TCP/IP protocol; (3) ensuring the security of multi-well data in storage, access, and application based on Unix operating system security. Using the CifNet system, researchers in the office or at home can access curves of any borehole in any working area of any oilfield. The application prospects of the CifNet system are also discussed.
Abstract: The mining industry faces a number of challenges that promote the adoption of new technologies. Big data, which is driven by the accelerating progress of information and communication technology, is one of the promising technologies that can reshape the entire mining landscape. Despite numerous attempts to apply big data in the mining industry, fundamental problems of big data, especially big data management (BDM), in the mining industry persist. This paper aims to fill the gap by presenting the basics of BDM. This work provides a brief introduction to big data and BDM, and it discusses the challenges encountered by the mining industry to indicate the necessity of implementing big data. It also summarizes data sources in the mining industry and presents the potential benefits of big data to the mining industry. This work also envisions a future in which a global database project is established and big data is used together with other technologies (i.e., automation), supported by government policies and following international standards. This paper also outlines the precautions for the utilization of BDM in the mining industry.
Funding: Supported by research grants from Huawei Technologies Canada and from the Natural Sciences and Engineering Research Council (NSERC) of Canada.
Abstract: The wealth of user data acts as a fuel for network intelligence toward sixth-generation wireless networks (6G). Due to data heterogeneity and dynamics, decentralized data management (DM) is desirable for achieving transparent data operations across network domains, and blockchain can be a promising solution. However, the increasing data volume and stringent data privacy-preservation requirements in 6G pose significant technical challenges in balancing transparency, efficiency, and privacy requirements in decentralized blockchain-based DM. In this paper, we investigate blockchain solutions to address these challenges. First, we explore the consensus protocols and scalability mechanisms in blockchains and discuss the roles of DM stakeholders in blockchain architectures. Second, we investigate the authentication and authorization requirements for DM stakeholders. Third, we categorize DM privacy requirements and study blockchain-based mechanisms for collaborative data processing. Subsequently, we present research issues and potential solutions for blockchain-based DM toward 6G from these three perspectives. Finally, we conclude this paper and discuss future research directions.