Funding: The authors would like to thank the anonymous referees for their valuable comments and helpful suggestions. The work is supported by the National Key Research and Development Program of China (No. 2016YFB0800402) and the National Natural Science Foundation of China (No. U1405254, No. U1536207).
Abstract: To cope with privacy leakage caused by multimedia outsourcing and sharing, data provenance is used to analyze leaked multimedia and provide reactive accountability. Existing multimedia provenance schemes are based on watermarking protocols. In an outsourcing scenario, they face two severe challenges: 1) when data leakage occurs, there is a probability that the provenance results can be repudiated, in which case provenance tracking fails; and 2) when outsourced data are shared, transferring encrypted data imposes a key-management burden outside the scheme, and privacy leakage threatens users. In this paper, we propose a novel data provenance scheme with an improved LUT-based fingerprinting protocol, which integrates an asymmetric watermarking protocol, a robust watermarking algorithm, homomorphic encryption, and digital signatures to achieve fully non-repudiable provenance. We build an in-scheme stream cipher to protect outsourced multimedia data from privacy leakage and complicated key management. Our scheme is also lightweight and easy to deploy. Extensive security and performance analysis compares our scheme with the state of the art. The results show that our scheme offers not only better provenance security and data confidentiality but also higher efficiency for multimedia outsourcing, sharing, and provenance.
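To make the look-up-table (LUT) fingerprinting idea concrete, the toy sketch below simulates joint decryption and fingerprint embedding with per-buyer decryption LUTs and a simple correlation check for provenance. The sizes, the seeded index selector, and the detector are illustrative assumptions; the homomorphic encryption, digital signatures, and robust watermark embedding of the full scheme are deliberately omitted.

```python
import numpy as np

# Minimal sketch of LUT-based joint decryption and fingerprinting.
# All parameters below are illustrative assumptions, not the paper's values.
rng = np.random.default_rng(0)

L = 4096          # LUT length
S = 4             # LUT entries combined per content sample
N = 10_000        # number of content samples
sigma_lut = 1.0   # std of encryption-LUT entries
sigma_fp = 0.05   # fingerprint strength

# Server side: master encryption LUT and a keyed index selector
# (a seeded PRNG stands in for the key-driven selection function).
E = rng.normal(0.0, sigma_lut, L)
idx = rng.integers(0, L, size=(N, S))

# Server "encrypts" the content by adding selected LUT entries.
content = rng.uniform(-1.0, 1.0, N)
ciphertext = content + E[idx].sum(axis=1)

def make_decryption_lut(buyer_seed):
    # Per-buyer decryption LUT: cancels E and leaves a buyer-specific
    # fingerprint behind, so the fingerprint is embedded during decryption.
    w = np.random.default_rng(buyer_seed).normal(0.0, sigma_fp, L)
    return -E + w, w

D_alice, w_alice = make_decryption_lut(1)
D_bob, w_bob = make_decryption_lut(2)

# Buyer side: decryption yields content plus that buyer's fingerprint.
leaked = ciphertext + D_alice[idx].sum(axis=1)

# Provenance check: correlate the residual with each buyer's expected contribution.
residual = leaked - content
for name, w in [("alice", w_alice), ("bob", w_bob)]:
    expected = w[idx].sum(axis=1)
    score = residual @ expected / (np.linalg.norm(residual) * np.linalg.norm(expected))
    print(f"{name}: correlation = {score:.3f}")
```

Running the sketch shows a correlation near 1 for the buyer whose decryption LUT produced the leaked copy and near 0 for other buyers, which is the intuition behind tracing a leak back to a specific recipient.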
Abstract: Large models, such as large language models (LLMs), vision-language models (VLMs), and multimodal agents, have become key elements in artificial intelligence (AI) systems. Their rapid development has greatly improved perception, generation, and decision-making in various fields. However, their vast scale and complexity bring new security challenges. Issues such as backdoor vulnerabilities introduced during training, jailbreaking in multimodal reasoning, and data provenance and copyright auditing have made security a critical focus for both academia and industry.
Funding: Supported by the National Natural Science Foundation of China (No. U1736218) and the National Key R&D Program of China (No. 2018YFB0804704); partially supported by CNCERT/CC.
Abstract: To combat increasingly sophisticated cyber attacks, the security community has proposed and deployed a large body of threat detection approaches to discover malicious behaviors on host systems and attack payloads in network traffic. Several studies have begun to focus on threat detection methods based on provenance data from host-level event tracing. Meanwhile, with the rapid development of big data and artificial intelligence technologies, large-scale graph computing has been widely adopted. A number of studies therefore try to bridge the gap between threat detection based on host-log provenance data and graph algorithms, proposing threat detection algorithms built on system provenance graphs. These approaches usually generate the system provenance graph by tagging and tracking system events, and then leverage the characteristics of the graph for threat detection and attack investigation. To deeply understand the correctness, effectiveness, and efficiency of different graph-based threat detection algorithms, we focus on mainstream threat detection methods based on provenance graphs. From a large number of studies, we select and implement 5 state-of-the-art threat detection approaches as evaluation objects for further analysis. We collect about 40 GB of host-level raw log data in a real-world IT environment, and simulate 6 types of cyber attack scenarios in an isolated environment to obtain malicious provenance data for our evaluation datasets. The crosswise comparison and longitudinal assessment explain in detail which attack scenarios each approach detects well and why. Our empirical evaluation provides a solid foundation for identifying how these threat detection approaches can be improved.
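As a rough illustration of the pipeline these systems share, the sketch below builds a small provenance graph from (subject, operation, object) audit events and applies a toy detection rule. The event format and the "downloaded file later executed" heuristic are assumptions for illustration only, not the detection logic of any surveyed system.

```python
# Minimal sketch: host audit events -> system provenance graph -> simple rule.
import networkx as nx

# (subject, operation, object) event stream, e.g. parsed from audit logs.
events = [
    ("firefox", "write", "/tmp/payload.sh"),
    ("bash", "read", "/tmp/payload.sh"),
    ("bash", "exec", "/tmp/payload.sh"),
    ("bash", "connect", "203.0.113.7:443"),
]

G = nx.MultiDiGraph()
for subject, op, obj in events:
    # Provenance edges point from the influencing entity to the influenced one.
    if op in ("write", "connect"):
        G.add_edge(subject, obj, label=op)
    else:  # read / exec: the object influences the subject
        G.add_edge(obj, subject, label=op)

# Toy detection rule: a file tainted by a network-facing process that later
# reaches an exec edge is flagged as a possible drive-by download.
network_procs = {"firefox"}
for proc in network_procs:
    for node in nx.descendants(G, proc):
        for _, dst, data in G.out_edges(node, data=True):
            if data["label"] == "exec":
                print(f"ALERT: {proc} -> {node} -> executed by {dst}")
```

Real systems replace the hand-written rule with tag propagation, graph matching, or learned anomaly scores over much larger graphs, but the graph construction step follows the same influence-edge idea.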
Funding: Supported by the National Basic Research (973) Program of China (2011CB302505), the Natural Science Foundation of China (61373145, 61170210), the National High-Tech R&D (863) Program of China (2012AA012600, 2011AA01A203), and the Chinese Special Project of Science and Technology (2012ZX01039001).
Abstract: Despite the multifaceted advantages of cloud computing, concerns about data leakage or abuse impede its adoption for security-sensitive tasks. Recent investigations have revealed that the risk of unauthorized data access is one of the biggest concerns of users of cloud-based services. Transparency and accountability for data managed in the cloud are necessary. Specifically, when using a cloud-hosted service, a user typically has to trust both the cloud service provider and the cloud infrastructure provider to handle private data properly. This is a multi-party system. Three particular trust models can be used according to the credibility of these providers. This paper describes techniques for preventing data leakage that can be used with these different models.
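The sketch below illustrates, under assumed trust models, how a leakage-prevention technique might be selected. The three models and the client-side encryption fallback (using the Fernet primitive from the third-party `cryptography` package) are illustrative stand-ins, not the specific techniques described in the paper.

```python
# Minimal sketch: pick a data-leakage mitigation based on an assumed trust model.
from dataclasses import dataclass
from cryptography.fernet import Fernet

@dataclass
class TrustModel:
    trust_service_provider: bool
    trust_infrastructure_provider: bool

def choose_mitigation(model: TrustModel) -> str:
    if model.trust_service_provider and model.trust_infrastructure_provider:
        return "provider-side access control and audit logging"
    if model.trust_service_provider:
        return "encrypt data at rest so the infrastructure provider sees only ciphertext"
    return "encrypt client-side before upload; keys never leave the user"

# If neither provider is trusted, the user keeps the key and uploads ciphertext only.
model = TrustModel(trust_service_provider=False, trust_infrastructure_provider=False)
print(choose_mitigation(model))

key = Fernet.generate_key()                      # stays with the user
ciphertext = Fernet(key).encrypt(b"sensitive record")
# ... upload `ciphertext` to the cloud-hosted service ...
assert Fernet(key).decrypt(ciphertext) == b"sensitive record"
```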
Funding: This work was partially supported by Horizon 2020, INFRADEV-4-2014-2015, 654248, CORBEL (Coordinated Research Infrastructures Building Enduring Life-science services).
Abstract: One of the key goals of the FAIR guiding principles is defined by its final principle: to optimize data sets for reuse by both humans and machines. To do so, data providers need to implement and support consistent machine-readable metadata to describe their data sets. This can seem like a daunting task for data providers, whether it is determining what level of detail should be provided in the provenance metadata or figuring out which common shared vocabularies should be used. Additionally, for existing data sets it is often unclear what steps should be taken to enable maximal, appropriate reuse. Data citation already plays an important role in making data findable and accessible, providing persistent and unique identifiers plus metadata for over 16 million data sets. In this paper, we discuss how data citation and its underlying infrastructures, in particular the associated metadata, provide an important pathway for enabling FAIR data reuse.
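As one concrete pathway from a data citation to machine-readable metadata, the sketch below resolves a placeholder dataset DOI through DOI content negotiation and reads a few fields from the returned DataCite-style JSON record. The DOI is a made-up example, and whether a given DOI serves this media type depends on its registration agency.

```python
# Minimal sketch: fetch machine-readable metadata for a dataset DOI
# via DOI content negotiation (placeholder DOI; availability varies by registrar).
import requests

doi = "10.5061/dryad.example"   # placeholder dataset DOI
resp = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.datacite.datacite+json"},
    timeout=30,
)
resp.raise_for_status()
record = resp.json()

# Persistent identifier plus descriptive metadata are what enable FAIR reuse.
print(record.get("id"))
print(record.get("titles"))
print(record.get("creators"))
print(record.get("publicationYear"))
```

Requesting a different media type (for example a schema.org JSON-LD or BibTeX representation, where the registrar offers one) follows the same pattern with a different Accept header.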