Funding: Supported by the National Key Research and Development Program Project "Research on data-driven comprehensive quality accurate service technology for small medium and micro enterprises" under Grant No. 2019YFB1405303; the Project of Cultivation for Young Top-notch Talents of Beijing Municipal Institutions "Research on the comprehensive quality intelligent service and optimized technology for small medium and micro enterprises" under Grant No. BPHR202203233; and the National Natural Science Foundation of China "Research on the influence and governance strategy of online review manipulation with the perspective of E-commerce ecosystem" under Grant No. 72174018.
Abstract: As online product reviews grow in importance, accurately identifying fake reviews has become a concern for both enterprises and consumers. Contextual features encapsulate the semantic information of a review, while behavioral features reflect the behavioral patterns of reviewers. However, integrating contextual and behavioral features appropriately is challenging. This paper therefore proposes an end-to-end model for fake review detection based on Weighted Fusion of Contextual Features and Reviewer Behaviors (WF-CFRB). First, the categories of average cosine similarity and the review corpus are jointly fed into BERT to obtain contextual feature vectors. Then, the underlying patterns of reviewer behaviors are extracted by a CNN to construct behavioral feature vectors. Finally, a weighted fusion method fuses the contextual and behavioral features for fake review detection. WF-CFRB and each of its components are evaluated on the YELP dataset. WF-CFRB achieves an F1 score of 81.31% and an AUC score of 81.27%, and it also outperforms the baseline models in terms of accuracy and recall. Compared with the original BERT model, the experimental results indicate that cosine similarity provides BERT with more information, which is useful for constructing the contextual feature vectors. Through the weighted fusion of contextual and behavioral features, WF-CFRB yields excellent performance on fake review detection and is particularly suitable for scenarios where behavioral features can be captured.
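The abstract does not give the formulas behind "average cosine similarity" or "weighted fusion", so the following is only a minimal sketch under stated assumptions. It assumes the similarity is averaged over pairs of one reviewer's reviews using bag-of-words counts, that the score is bucketed with illustrative thresholds, and that fusion is a convex combination with a fixed weight `alpha` (in the paper the weighting may well be learned). All function names and thresholds here are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_cosine_similarity(reviews: list[str]) -> float:
    """Mean pairwise similarity over a set of reviews (0.0 if fewer than two).

    Assumption: the average is taken over all pairs of one reviewer's reviews.
    """
    pairs = [(i, j) for i in range(len(reviews)) for j in range(i + 1, len(reviews))]
    if not pairs:
        return 0.0
    return sum(cosine_similarity(reviews[i], reviews[j]) for i, j in pairs) / len(pairs)


def similarity_category(score: float) -> str:
    """Hypothetical three-way bucketing; the thresholds are illustrative only."""
    if score < 0.3:
        return "low"
    if score < 0.7:
        return "medium"
    return "high"


def weighted_fusion(contextual: list[float], behavioral: list[float], alpha: float) -> list[float]:
    """Convex combination alpha * contextual + (1 - alpha) * behavioral.

    Assumption: both vectors were already projected to the same length.
    """
    if len(contextual) != len(behavioral):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * c + (1.0 - alpha) * b for c, b in zip(contextual, behavioral)]
```

In this sketch the category string would be passed to the text encoder alongside the review, and the fused vector would feed a final classifier; neither of those stages is shown here.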
Abstract: Reviews have a significant impact on online businesses. Online consumers now rely heavily on other people’s reviews before purchasing a product, rather than on the product description. As technology has advanced, malicious online actors have begun using techniques such as Natural Language Processing (NLP) to generate large numbers of fake reviews and damage their competitors’ markets. Several studies have addressed this problem in the last few years. Most of them apply NLP techniques to preprocess the text before building Machine Learning (ML) or Deep Learning (DL) models to detect and filter fake reviews. However, the same NLP techniques are driving an exponential increase in machine-generated fake reviews. This work explores a powerful text representation technique, embedding models, to combat the proliferation of fake reviews in online marketplaces. These embedding structures can capture much more information from the data than other standard text representations. We tested this hypothesis with two Recurrent Neural Network (RNN) architectures, namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), using fake review data from Amazon and TripAdvisor. Our experimental results show that our best proposed model can distinguish between real and fake reviews with 91.44% accuracy. Furthermore, our results corroborate state-of-the-art research in this area and demonstrate some improvements over other approaches. Proper text representation therefore improves the accuracy of fake review detection.
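To make the embedding-plus-recurrent pipeline concrete, here is a rough pure-Python sketch of one GRU update applied over embedded tokens. It is not the authors' model: the parameter layout, the function names, and the gate convention (new state = (1 - z) * old state + z * candidate; some libraries swap the roles of z and 1 - z) are assumptions made for illustration, and a real system would use a deep learning framework with trained weights.

```python
import math


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, U, h, b):
    """Compute W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def gru_step(x, h, p):
    """One GRU update. p maps "z", "r", "h" to a hypothetical (W, U, b) triple."""
    Wz, Uz, bz = p["z"]
    Wr, Ur, br = p["r"]
    Wh, Uh, bh = p["h"]
    z = [sigmoid(v) for v in affine(Wz, x, Uz, h, bz)]  # update gate
    r = [sigmoid(v) for v in affine(Wr, x, Ur, h, br)]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in affine(Wh, x, Uh, gated, bh)]  # candidate state
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_review(tokens, embeddings, p, hidden_size):
    """Run the GRU over embedded tokens; out-of-vocabulary tokens are skipped."""
    h = [0.0] * hidden_size
    for tok in tokens:
        x = embeddings.get(tok)
        if x is None:
            continue
        h = gru_step(x, h, p)
    return h


def fake_probability(h, w, b):
    """Logistic output layer on the final hidden state."""
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)
```

The embedding table here is a plain dictionary from token to vector; in practice it would come from a pretrained or jointly trained embedding model, which is the representation the abstract credits for the accuracy gain.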
Abstract: Fake reviews, also known as deceptive opinions, are used to mislead people and have recently gained importance because of the rapid increase in online marketing transactions such as selling and purchasing. E-commerce platforms let customers post reviews and comments about a product or service after purchase, and new customers usually read these reviews before making a purchase decision. The current challenge is how new customers can distinguish truthful reviews from fake ones, since fake reviews deceive customers, inflict losses, and tarnish the reputation of companies. This paper attempts to develop an intelligent system that detects fake reviews on e-commerce platforms using n-grams of the review text and sentiment scores given by the reviewer. The proposed methodology uses a standard fake hotel review dataset for experimentation, data preprocessing methods, and a term frequency-inverse document frequency (TF-IDF) approach for feature extraction and representation. For detection and classification, n-grams of the review texts were input into the constructed models to be classified as fake or truthful. The experiments were carried out using four supervised machine-learning techniques, trained and tested on a dataset collected from the TripAdvisor website. The classification results showed that naïve Bayes (NB), support vector machine (SVM), adaptive boosting (AB), and random forest (RF) achieved 88%, 93%, 94%, and 95%, respectively, in terms of testing accuracy and the F1-score. The results were compared with existing works that used the same dataset, and the proposed methods outperformed the comparable methods in terms of accuracy.
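The feature extraction step described above can be illustrated with a small pure-Python sketch of n-gram TF-IDF. The abstract does not say which TF-IDF variant or n-gram range was used, so this assumes whitespace tokenization, unigrams plus bigrams, relative term frequency, and idf = log(N / df); the function names are hypothetical. Libraries such as scikit-learn provide a tuned equivalent (`TfidfVectorizer` with `ngram_range`) whose smoothing differs from this formula.

```python
import math
from collections import Counter


def ngrams(text: str, n_min: int = 1, n_max: int = 2) -> list[str]:
    """All word n-grams of length n_min..n_max from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs: list[str], n_min: int = 1, n_max: int = 2) -> list[dict[str, float]]:
    """Sparse TF-IDF vectors (one dict per document) over word n-grams."""
    counts_per_doc = [Counter(ngrams(d, n_min, n_max)) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(g for counts in counts_per_doc for g in counts)
    n_docs = len(docs)
    vectors = []
    for counts in counts_per_doc:
        total = sum(counts.values())
        vectors.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return vectors
```

The resulting sparse vectors are what a classifier such as naïve Bayes, an SVM, AdaBoost, or a random forest would consume. Note that an n-gram appearing in every document gets weight zero under this unsmoothed idf.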