Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2020JS005).
Abstract: Motif-based graph local clustering (MGLC) algorithms are generally designed with a two-phase framework, which computes the motif weight of each edge beforehand and then runs a local clustering algorithm on the weighted graph to output the result. Although correct, this framework has both practical and theoretical limitations and is less applicable in real interactive situations. This research develops a purely local and index-adaptive method, Index-adaptive Triangle-based Graph Local Clustering (TGLC+), to solve the MGLC problem with respect to triangles. TGLC+ adaptively combines the approximate Monte Carlo method Triangle-based Random Walk (TRW) and the deterministic brute-force method Triangle-based Forward Push (TFP) to estimate the Personalized PageRank (PPR) vector without calculating the exact triangle-weighted transition probability, and then outputs the clustering result by conducting the standard sweep procedure. This paper establishes the efficiency of TGLC+ through theoretical analysis and demonstrates its effectiveness through extensive experiments. To our knowledge, TGLC+ is the first method to solve the MGLC problem without computing the motif weight beforehand, thus achieving better efficiency with comparable effectiveness. TGLC+ is suitable for large-scale and interactive graph analysis tasks, including visualization, system optimization, and decision-making.
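For readers unfamiliar with the pipeline the abstract builds on, the following is a minimal sketch of the classical, edge-based version: forward push to approximate a PPR vector, followed by the standard sweep procedure. It is not TGLC+, TRW, or TFP; it uses plain unweighted transition probabilities rather than triangle-weighted ones, and the function names `forward_push` and `sweep_cut`, the parameters `alpha` and `eps`, and the toy graph are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def forward_push(adj, seed, alpha=0.15, eps=1e-4):
    """Approximate the PPR vector of `seed` on an undirected, unweighted graph.

    adj maps each node to a list of neighbours (no self-loops assumed).
    A node is pushed while its residual is at least eps * degree.
    """
    p = defaultdict(float)   # PPR estimate
    r = defaultdict(float)   # residual mass still to be distributed
    r[seed] = 1.0
    queue, queued = deque([seed]), {seed}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        deg_u = len(adj[u])
        if deg_u == 0 or r[u] < eps * deg_u:
            continue
        mass, r[u] = r[u], 0.0
        p[u] += alpha * mass                   # keep an alpha fraction at u
        share = (1.0 - alpha) * mass / deg_u   # spread the rest to neighbours
        for v in adj[u]:
            r[v] += share
            if v not in queued and r[v] >= eps * len(adj[v]):
                queue.append(v)
                queued.add(v)
    return dict(p)


def sweep_cut(adj, p):
    """Standard sweep: order nodes by p[u] / deg(u), return the best prefix."""
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    order = sorted((u for u in p if p[u] > 0 and adj[u]),
                   key=lambda u: p[u] / len(adj[u]), reverse=True)
    in_set, vol, cut = set(), 0, 0
    best_set, best_phi = set(), float("inf")
    for u in order:
        in_set.add(u)
        vol += len(adj[u])
        inside = sum(1 for v in adj[u] if v in in_set)
        cut += len(adj[u]) - 2 * inside        # new cut edges minus absorbed ones
        denom = min(vol, total_vol - vol)
        if denom == 0:
            break
        phi = cut / denom                      # conductance of the current prefix
        if phi < best_phi:
            best_phi, best_set = phi, set(in_set)
    return best_set, best_phi


# Toy graph (hypothetical): two triangles joined by the bridge 2-3.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
cluster, phi = sweep_cut(toy, forward_push(toy, seed=0))
```

On this toy graph the sweep separates the seed's triangle from the other one. The method described in the abstract differs in that the push and random-walk steps are triangle-based and avoid precomputing motif weights; that part is not reproduced here.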
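Abstract: Starting from fintech big data and using Gibbs sampling, a stochastic search algorithm from artificial intelligence, as its tool, this paper establishes, within a big-data framework, a general procedure for screening feature factors of corporate financial fraud risk, together with the inference principles for feature extraction. It then carries out an empirical analysis on the financial statement data of listed companies. Using a data sample from the China Securities Regulatory Commission (CSRC) on disclosure violations in listed companies' financial statements from January 2017 to December 2018, it screens out the feature factors that characterize financial fraud and validates them through testing, supporting the identification of financial fraud. The proposed framework and modeling method can strengthen the identification of financial fraud risk in listed companies and enable the detecting and predicting of corporate financial fraud.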