Abstract: In recent years, China has paid increasing attention to the development of the marine economy and to the management and protection of fishery resources. Management departments at all levels regulate the fishing behavior of fishing vessels using fishing trajectory data. This paper predicts the distribution of shrimp farms in the East China Sea by studying the trajectories and behavior patterns of shrimp boats recorded in the fishing trajectory system. A shrimp farm distribution management system based on the Back Propagation algorithm is also established. It monitors the trajectories of fishing boats and the distribution of shrimp groups in real time, which improves the work efficiency and management mode of the management department. It also plays a positive role in regulating the behavior of fishing boats at sea.
Funding: Supported by the National Science and Technology Support Project (No. 2012BAH01F02) from the Ministry of Science and Technology of China and the Director Fund (No. IS201116002) from the Institute of Seismology, CEA.
Abstract: This paper designs and develops a framework on a distributed computing platform for massive multi-source spatial data using a column-oriented database (HBase). The platform consists of four layers: an ETL (extraction, transformation, loading) tier, a data processing tier, a data storage tier, and a data display tier. Together they provide long-term storage, real-time analysis, and querying of massive data. Finally, a real-dataset cluster is simulated, made up of 39 nodes (2 master nodes and 37 data nodes). It is used to run function tests of the data import and real-time query modules, and performance tests of HDFS I/O, the MapReduce cluster, batch loading, and real-time querying of massive data. The test results indicate that the platform achieves high performance in terms of response time and linear scalability.
Funding: Supported by the National Key Research and Development Program of China under Grant 2020YFB1708400.
Abstract: Background: A production line is the basic unit of smart factories and smart manufacturing. However, with the development of the industrial Internet of Things, sensors, and other technologies, more data are being collected, leading to a data explosion, and the heterogeneous nature of multiple sources makes it difficult to manage data in a unified manner. Methods: A production line data collection, storage, and management system based on cloud-fog-edge computing collaboration and a digital twin was designed. Multi-source heterogeneous data were collected and transmitted based on OPC UA, and an information model of the production line was established. Modules for data mapping, publishing, and receiving were developed to achieve unified data collection and transmission. The data storage and management platform was built with separated front-end and back-end technologies. Results: The developed data collection and management system was verified for functionality and performance on a digital twin production line. Functional tests show that the system supports data acquisition and transmission, device addition and viewing, device data querying and downloading, data and model visualization, and user rights setting. The average time for edge data collection and transmission is 183.6 ms. The average response time of the cloud to fog requests is less than 1 s. This shows that the system can satisfy the real-time requirements of a digital twin production line. Conclusions: The proposed system is real-time and stable, providing support for big data and virtual-reality interaction in digital twins.
Abstract: Efficient data management in healthcare is essential for providing timely and accurate patient care, yet traditional partitioning methods in relational databases often struggle with the high volume, heterogeneity, and regulatory complexity of healthcare data. This research introduces a tailored partitioning strategy leveraging the MD5 hashing algorithm to enhance data insertion, query performance, and load balancing in healthcare systems. By applying a consistent hash function to patient IDs, our approach achieves uniform distribution of records across partitions, optimizing retrieval paths and reducing access latency while ensuring data integrity and compliance. We evaluated the method through experiments focusing on partitioning efficiency, scalability, and fault tolerance. The partitioning efficiency analysis compared our MD5-based approach with standard round-robin methods, measuring insertion times, query latency, and data distribution balance. Scalability tests assessed system performance across increasing dataset sizes and varying partition counts, while fault tolerance experiments examined data integrity and retrieval performance under simulated partition failures. The experimental results demonstrate that the MD5-based partitioning strategy significantly reduces query retrieval times by optimizing data access patterns, achieving up to X% better performance compared to round-robin methods. It also scales effectively with larger datasets, maintaining low latency and ensuring robust resilience under failure scenarios. This novel approach offers a scalable, efficient, and fault-tolerant solution for healthcare systems, facilitating faster clinical decision-making and improved patient care in complex data environments.
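The partitioning idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the patient ID format, and the partition count are assumptions made for illustration. It hashes a patient ID with MD5 and takes the digest modulo the number of partitions, with a round-robin assignment shown alongside for comparison.

```python
import hashlib
from collections import Counter


def md5_partition(patient_id: str, num_partitions: int) -> int:
    """Map a patient ID to a partition index via its MD5 digest.

    Hypothetical helper: the same ID always maps to the same partition,
    so a lookup by ID touches exactly one partition.
    """
    digest = hashlib.md5(patient_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_partitions


def round_robin_partition(insert_index: int, num_partitions: int) -> int:
    """Baseline: assign by insertion order, ignoring the patient ID.

    Balanced by construction, but a lookup by ID cannot tell which
    partition holds the record without extra bookkeeping.
    """
    return insert_index % num_partitions


def partition_counts(patient_ids, num_partitions: int) -> Counter:
    """Count how many records the MD5 scheme places in each partition."""
    return Counter(md5_partition(pid, num_partitions) for pid in patient_ids)


# Assumed ID format, for illustration only.
sample_ids = [f"P{i:06d}" for i in range(10_000)]
counts = partition_counts(sample_ids, 8)
```

Inspecting `counts` shows how evenly the records spread across the eight partitions. Note that plain modulo hashing remaps most keys when the partition count changes; whether the paper addresses repartitioning is not stated in the abstract.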
Abstract: There is a great thrust in industry toward the development of more feasible and viable tools for storing the fast-growing volume, velocity, and diversity of data, termed 'big data'. The structural shift of the storage mechanism from traditional data management systems to NoSQL technology is driven by the need to fulfill big data storage requirements. However, the available big data storage technologies are inefficient at providing consistent, scalable, and available solutions for continuously growing heterogeneous data. Storage is the preliminary process of big data analytics for real-world applications such as scientific experiments, healthcare, social networks, and e-business. So far, Amazon, Google, and Apache are some of the industry standards in providing big data storage solutions, yet the literature does not report an in-depth survey of storage technologies available for big data that investigates the performance and magnitude gains of these technologies. The primary objective of this paper is to conduct a comprehensive investigation of state-of-the-art storage technologies available for big data. A well-defined taxonomy of big data storage technologies is presented to assist data analysts and researchers in understanding and selecting a storage mechanism that better fits their needs. To evaluate the performance of different storage architectures, we compare and analyze the existing approaches using Brewer's CAP theorem. The significance and applications of storage technologies and their support for other categories are discussed. Several future research challenges are highlighted with the intention of expediting the deployment of a reliable and scalable storage system.
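Abstract: Objective: To explore the reform of the hospital medical Supply-Processing-Distribution (SPD) supply chain service in support of refined management of medical consumables, and to strengthen refined management practice across all circulation stages of medical consumables through data integration. Methods: Data interfaces were opened to connect the Hospital Information System (HIS), the Hospital Resource Planning (HRP) system, the Electronic Medical Record (EMR) system, and other information systems, and a medical consumables supply management system was built. The system covers consumable coding, inventory management, clinical traceability, intelligent replenishment, and settlement integration. By monitoring and analyzing logistics data at each stage of use, it achieves data traceability management across the full life cycle of medical consumables supply. Results: After the SPD model went live, it supported more than 130,000 operations and delivered more than 980,000 consumable items in 2024. Compared with the traditional model, the SPD model shortened preoperative preparation, consumable retrieval, and settlement and billing times by 84.96%, 24.00%, and 62.50%, respectively (all P<0.001). In terms of cost and management optimization, zero-inventory management of consumables was achieved, 178 of 1101 consumable specifications had their prices adjusted, annual cost savings reached 3.3 million yuan, and surgical consumable costs in 2024 were significantly lower than in 2023 (P<0.001). Conclusion: The SPD reform meets the need for refined management of consumables and, for the first time, achieves closed-loop management of the full "clinical-logistics-finance" chain. It offers application integration and clinical adaptability, can improve management efficiency, and has practical significance and promotion value for the high-quality development of medical institutions.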