Funding: Supported in part by the National Basic Research Program (973 Program, No. 2015CB352400), NSFC under grant U1401258, and U.S. NSF under grant CCF-1016966.
Abstract: This paper describes the fundamentals of cloud computing and current key big-data technologies. We categorize big-data processing as batch-based, stream-based, graph-based, DAG-based, interactive-based, or visual-based according to the processing technique. We highlight the strengths and weaknesses of various big-data cloud processing techniques in order to help the big-data community select the appropriate processing technique. We also present big-data research challenges and future directions with respect to transportation management systems.
Funding: This study was supported by grants from the National Natural Science Foundation of China (81930065), the Natural Science Foundation of Guangdong Province (2014A030312015), the Science and Technology Program of Guangdong (2019B020227002), and the Science and Technology Program of Guangzhou (201904020046, 201803040019, and 201704020228).
Abstract: There is a lack of high-quality, large-scale, real-world evidence from patients with metastatic colorectal cancer (mCRC), especially in China. It remains unclear whether efforts to improve the quality of care for mCRC would improve patient survival outcomes in real-world practice. On the basis of an intelligent big-data platform, we established a large-scale retrospective cohort of mCRC patients. We investigated the temporal changes in the systemic and local treatment (resection, ablation, or radiation to liver, lung, or extrahepatic and/or extrapulmonary metastases) patterns of mCRC, and whether these changes were associated with improved overall survival (OS) over time. Between July 2012 and December 2018, 3403 eligible patients were included in this research. The median OS was 42.8 months (95% confidence interval (CI), 40.7–46.6) for the entire cohort, 25.6 months (95% CI, 24.7–26.9) for those treated with systemic therapy only, and not reached (95% CI, 78.6 months–not reached) for those receiving local therapy. The utility rate of local therapy increased continuously from 37.9% in 2012–2014 to 46.9% in 2017–2018. A dramatic increase in the utility rate of either cetuximab or bevacizumab was observed since 2017 (39.9%, 43.2%, and 60.3% in 2012–2014, 2015–2016, and 2017–2018, respectively). Compared with 2012–2014, the OS of the entire population significantly improved in 2015–2016 (hazard ratio (HR) = 0.87 (95% CI, 0.78–0.99); P = 0.034), but not for patients receiving systemic therapy only (HR = 0.99 (95% CI, 0.86–1.14); P = 0.889), whereas an improved OS was found in 2015–2018 for both the entire population (HR = 0.75 (95% CI, 0.70–0.81); P < 0.001) and for patients receiving systemic therapy only (HR = 0.83 (95% CI, 0.77–0.91); P < 0.001). In summary, the quality of care for mCRC, as indicated by the utility rate of targeted and local therapies, has been continuously improving over time in this study cohort, which is associated with continuously improving survival outcomes for these patients.
Funding: Supported by the Ministry of Education Humanities and Social Science Project (19YJAZH060), Study on Agglomeration Pattern, Development Quality and Spatial Optimization of Urban Leisure Industry in Guangdong-Hong Kong-Macao Greater Bay Area, and by the Guangdong Philosophical and Social Sciences Project (GD20SQ21).
Abstract: Tourism destination images, in terms of the gaps between the projected and perceived images, are of great significance in the development of destinations. Additionally, big data remains under-utilized in tourism studies despite the boom in big-data applications and the increasing amount of electronic User Generated Content (UGC). Aiming to take advantage of tourism UGC to fully understand the destination image gap between official promotion materials and tourist perception of Sanya City in China, this study employed a big-data analysis technique, the Tourism Sentiment Evaluation (TSE) model, and proposed a new analysis framework integrating the "cognitive-affective" model with the gap analysis of projected and perceived destination image to explore the destination image gap of Sanya. It is found that Sanya's perceived destination image is overall consistent with its official positioning; however, there also exist image gaps between the two in terms of the impact of festival events and tourists' attitude towards core scenic spots, among others. This study's findings are discussed in light of their methodological, theoretical, and practical implications for destination positioning, marketing, and management.
Abstract: Data governance is a subject that is becoming increasingly important in business and government. In fact, good data governance allows improved interactions between employees of one or more organizations. Data quality represents a great challenge because the cost of non-quality can be very high. Therefore, managing data quality becomes an absolute necessity within an organization. To improve the data quality in a Big-Data source, our purpose in this paper is to add semantics to data and help users recognize the Big-Data schema. The originality of this approach lies in the semantic aspect it offers. It detects issues in data and proposes a data schema by applying semantic data profiling.
Abstract: With the rapid development of the internet, internet of things, mobile internet, and cloud computing, the amount of data in circulation has grown rapidly. More social information has contributed to the growth of big data, and data has become a core asset. Big data is challenging in terms of effective storage, efficient computation and analysis, and deep data mining. In this paper, we discuss the significance of big data as well as key technologies and problems in big-data analytics. We also discuss the future prospects of big-data analytics.
Funding: Supported in part by the U.S. National Science Foundation (Nos. ECCS 1128209, CNS 1138963, CNS 1065444, and CCF 1028167).
Abstract: The buzzword "big data" refers to large-scale distributed data processing applications that operate on exceptionally large amounts of data. Google's MapReduce and Apache's Hadoop, its open-source implementation, are the de facto software systems for big-data applications. One observation about the MapReduce framework is that it generates a large amount of intermediate data. This abundant information is thrown away after the tasks finish, because MapReduce is unable to utilize it. In this paper, we propose Dache, a data-aware cache framework for big-data applications. In Dache, tasks submit their intermediate results to the cache manager. A task queries the cache manager before executing the actual computing work. A novel cache description scheme and a cache request and reply protocol are designed. We implement Dache by extending Hadoop. Testbed experiment results demonstrate that Dache significantly improves the completion time of MapReduce jobs.
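The query-before-compute pattern described in this abstract can be illustrated with a minimal, single-process sketch. This is not Dache's implementation or Hadoop's API: the `CacheManager` class, its `describe`, `query`, and `submit` methods, the `run_task` helper, and the hash-based cache description are all hypothetical names and choices made for illustration only.

```python
import hashlib
from typing import Any, Callable, Dict, List


class CacheManager:
    """Toy in-memory stand-in for a cache manager (hypothetical API)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def describe(input_id: str, operation: str) -> str:
        # Assumed description scheme: hash of input identifier plus operation name.
        return hashlib.sha256(f"{input_id}|{operation}".encode()).hexdigest()

    def query(self, key: str) -> Any:
        # Return the cached result, or None when nothing is stored under the key.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        return None

    def submit(self, key: str, result: Any) -> None:
        # Store an intermediate result so that later tasks can reuse it.
        self._store[key] = result


def run_task(cache: CacheManager, input_id: str, operation: str,
             data: List[str], compute: Callable[[List[str]], Any]) -> Any:
    """Query the cache first; compute and submit only on a miss."""
    key = cache.describe(input_id, operation)
    cached = cache.query(key)
    if cached is not None:
        return cached
    result = compute(data)
    cache.submit(key, result)
    return result


def word_count(lines: List[str]) -> Dict[str, int]:
    # Example map-style computation whose intermediate result is worth caching.
    counts: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


cache = CacheManager()
first = run_task(cache, "split-0", "word_count", ["a b a"], word_count)
second = run_task(cache, "split-0", "word_count", ["a b a"], word_count)
```

In this sketch the second call is served from the cache, so `word_count` runs only once. A real system would additionally need distributed storage, cache invalidation, and a request and reply protocol between workers and the manager, none of which are modeled here.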
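Abstract: [Objective] This study aims to systematically analyze innovations in cultural ecosystem service (CES) assessment driven by multi-source big data, and to clarify research progress and future directions. [Methods] Using "cultural ecosystem services" and "value assessment" as keywords, we searched the Web of Science and CNKI databases for literature published from 2000 to 2024. We organized the research findings along four dimensions (big-data type, CES value type, assessment object, and assessment method), systematically reviewed current research opportunities, challenges, and future trends, and summarized a workflow for CES assessment based on multi-source big data. [Results] 1) The CES assessment paradigm shows a trend of shifting from traditional economic accounting toward intelligent assessment. Statistics show that about 70% of the studies achieved paradigm innovation through the application of multi-source data, mainly in three respects: expansion of the dimensions of CES value types, refinement of assessment object types, and innovation in the application of assessment methods. 2) Big-data applications have broken through the traditional bottleneck in information acquisition, forming a diversified landscape that integrates open government data (ecological and environmental data, population and economic data, etc.) with user-generated data (social media data, map and point-of-interest data, location-based service data, etc.), significantly improving the precision, spatiotemporal coverage, and scenario applicability of CES value analysis. 3) Artificial intelligence techniques such as machine learning and deep learning, together with big-data analytics, have become emerging CES assessment methods; they can process massive data and mine information in depth, effectively improving assessment efficiency and accuracy. [Conclusions] The application of multi-source big data has shifted CES assessment from traditional economic accounting to intelligent perception analysis, providing a new basis for CES research. In the future, standardization of the assessment framework should be promoted to improve the scientific rigor and explanatory power of research results.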