Abstract: As technology develops, the amount of information in use has grown substantially. Companies analyze big data to provide customized services to their customers. Accordingly, collecting and analyzing the data of data subjects has become a core competency for companies. However, collecting and using such data may violate the rights of the data subject. Data often identifies its subject by itself, and even data that is not personal information infringing on an individual's rights can, once it is linked with other data, become important and sensitive personal information in ways no one anticipated. Therefore, recent privacy regulations such as the GDPR (General Data Protection Regulation) are changing to guarantee more rights to data subjects. The concept of de-identification was created so that data can be used effectively without infringing on the rights of the data subject. Through appropriate de-identification/pseudonymization, researchers and companies can make personal information less identifiable and use the data for statistical research. De-identification/pseudonymization techniques have been studied extensively, but it remains difficult for companies and researchers to know how to de-identify/pseudonymize data, and to understand clearly how and to what extent each organization should apply de-identification measures. Currently, organizations do not systematically analyze and handle their situation; they take only minimal action by following the guidelines distributed by each country. We addressed this problem from the perspective of risk management. Several steps are required to secure a dataset, from pre-processing through release: we analyze the dataset, analyze the risk, evaluate the risk, and treat the risk appropriately. The outcomes of each step can then be used to take appropriate action on the dataset to eliminate or reduce its risk, after which the dataset can be released for its intended purpose. This series of processes was reconstructed to fit the current situation by analyzing various standards, such as ISO/IEC (International Organization for Standardization/International Electrotechnical Commission) 20889, NIST IR (National Institute of Standards and Technology Interagency Report) 8053, NIST SP (National Institute of Standards and Technology Special Publication) 800-188, and ITU-T (International Telecommunication Union Telecommunication Standardization Sector) X.1148. We propose an integrated framework based on a situational awareness model and a risk management model. We found that this framework can be specialized for multiple domains, and that it is useful because it is based on a variety of cases and utility-based ROI calculations.
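As a rough illustration of the analyze, evaluate, and treat steps named in the abstract above, the following Python sketch measures re-identification risk as the inverse of the smallest equivalence-class size over a set of quasi-identifiers, then applies one treatment (generalizing exact ages into ranges) if the risk exceeds a threshold. The risk measure, the sample records, the field names, the threshold, and every function name are assumptions made for illustration; the abstract does not specify how the proposed framework computes or treats risk.

```python
from collections import Counter


def equivalence_classes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def reidentification_risk(records, quasi_identifiers):
    """Worst-case risk (an assumed measure): 1 / size of the smallest class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return 1.0 / min(classes.values())


def generalize_age(record, width=10):
    """Treat risk by replacing an exact age with a range of the given width."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


def treat_if_needed(records, quasi_identifiers, threshold):
    """Evaluate the risk against a threshold; generalize ages only if it is too high."""
    risk = reidentification_risk(records, quasi_identifiers)
    if risk <= threshold:
        return records, risk
    treated = [generalize_age(r) for r in records]
    return treated, reidentification_risk(treated, quasi_identifiers)


# Hypothetical sample dataset; "age" and "zip" are treated as quasi-identifiers.
records = [
    {"age": 23, "zip": "130", "diagnosis": "flu"},
    {"age": 27, "zip": "130", "diagnosis": "asthma"},
    {"age": 31, "zip": "148", "diagnosis": "flu"},
    {"age": 38, "zip": "148", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["age", "zip"]

risk_before = reidentification_risk(records, QUASI_IDENTIFIERS)
released, risk_after = treat_if_needed(records, QUASI_IDENTIFIERS, threshold=0.5)
print(risk_before, risk_after)
```

In this toy dataset every record is unique on (age, zip) before treatment, so the worst-case risk is 1.0. After generalization each class holds two records and the risk falls to 0.5. A real assessment would also weigh the release context and the utility lost to generalization.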
Abstract: According to BBC News, online hate speech increased by 20% during the COVID-19 pandemic. Hate speech from anonymous users can result in psychological harm, including depression and trauma, and can even lead to suicide. Malicious online comments are increasingly becoming a social and cultural problem. It is therefore critical to detect such comments at the national level and to detect malicious users at the corporate level. To achieve a healthy and safe Internet environment, studies should address both institutional and technical topics. Detecting toxic comments can help create a safe online environment. In this study, to detect malicious comments, we used approximately 9,400 examples of hate speech from a Korean corpus of entertainment news comments. We developed toxic comment classification models using supervised learning algorithms, including decision trees, random forests, a support vector machine, and K-nearest neighbors. The proposed model uses random forests to classify toxic words, achieving an F1-score of 0.94. We analyzed the trained model using permutation feature importance, an explainable machine learning method. Our experimental results confirmed that the toxic comment classifier properly classified hate words used in Korea. With this methodology, the proposed method can help create a healthy Internet environment by detecting malicious comments written in Korean.
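The two evaluation tools named in the abstract above, the F1-score and permutation feature importance, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the `predict` function is a hypothetical stand-in for the trained random forest, the two features are invented token counts, and the data are made up. None of it reproduces the authors' model or corpus.

```python
import random


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in F1 when one feature column is shuffled, for each feature."""
    rng = random.Random(seed)
    baseline = f1_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - f1_score(y, [predict(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def predict(row):
    """Hypothetical stand-in classifier: flags a comment if feature 0 is positive."""
    return 1 if row[0] > 0 else 0


# Invented features: column 0 = count of a toxic token, column 1 = count of a neutral token.
X = [[1, 0], [2, 1], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]

baseline, importances = permutation_importance(predict, X, y)
print(baseline, importances)
```

Because the stand-in classifier ignores column 1, shuffling that column never changes a prediction and its importance is exactly zero. Shuffling column 0 can only lower the F1-score from its baseline. In this way permutation importance shows which inputs a trained model actually relies on.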
Funding: This study was supported by a grant from the Korean Health Technology R&D Project, Ministry of Health and Welfare, Republic of Korea (HI19C0866).
Abstract: Damage caused by malicious software has been increasing owing to the COVID-19 pandemic, including ransomware attacks on information technology and operational technology systems in corporate networks and social infrastructure, and spear-phishing attacks on businesses and research institutes. Because malware attacks use email as their primary means of penetration, several recent studies have sought to prevent phishing emails in the workplace. However, according to the latest research, advanced blocking systems such as spam email filtering solutions and advanced persistent threat systems appear to be limited in their ability to block email spoofing. Experts therefore believe that, when malicious software does cause damage, restoring services immediately through resilience is more critical than advanced prevention programs. In line with this trend, we surveyed 100 employees working in information security about the factors that are effective in countering malware attacks delivered through email. We confirmed that resilience, backup, and restoration were effective factors in responding to phishing emails. In contrast, practical exercises and attack visualization were perceived as having little effect against malware attacks. In conclusion, our study reminds businesses and supervisory institutions to carefully examine their regular voluntary exercises and mandatory training programs, and it assists private corporations and public institutions in establishing counter-strategies for dealing with malware attacks.