Abstract
This study examines whether machine learning algorithms can identify "specialized, refined, distinctive, and innovative" (专精特新) enterprises, using data on small and medium-sized enterprises (SMEs) listed on the Shanghai and Shenzhen A-share markets from 2018 to 2023. The work is motivated by the national tiered cultivation policy and by the subjectivity and time lags inherent in traditional identification methods. Financial and patent data are integrated to build an XGBoost prediction model. Empirically, the model reaches an AUC of 0.896, outperforming benchmark models such as logistic regression. Feature importance analysis identifies R&D intensity, patent stock, and return on equity as the core drivers, corroborating the policy logic of "innovation input-output-benefit." The findings provide data support for targeted government screening and for enterprises' self-improvement.
Author
韩文喆
HAN Wenzhe (School of Economics and Management, Southwest University, Chongqing 400715, China)
Keywords
"specialized, refined, distinctive, and innovative" (专精特新) enterprises
machine learning
XGBoost
feature importance