Abstract
Enzymes, as efficient and environmentally friendly biocatalysts, are widely applied in the pharmaceutical, food, chemical, and agricultural industries, among others. However, wild-type enzymes often suffer from poor thermostability, a narrow substrate spectrum, and insufficient activity. Protein language models (PLMs), which borrow methods from natural language processing, are trained by self-supervised learning on large-scale protein sequence and structure data. They can capture the evolutionary patterns linking sequence, structure, and function, and therefore show great potential in enzyme engineering. This review systematically summarizes the architectures and training strategies of representative PLMs and surveys their applications in zero-shot and few-shot prediction, enzyme function prediction, and de novo design. Specifically, PLMs have been used for mutation effect prediction and catalytic performance optimization, and to assist automated evolution platforms in accelerating iteration, thereby improving enzyme thermostability, activity, and substrate adaptability. Multimodal representations combined with few-shot learning have improved prediction accuracy on specific tasks, and PLMs have also been explored for the design of entirely new functional enzymes. Finally, the review discusses the challenges PLMs face in model scale, generalization, and integration with biophysical knowledge, and looks ahead to their prospects in controllable functional protein design and industrial applications.
Authors
SONG Zhongdi (宋中迪)
ZHOU Jianan (周佳楠)
CHEN Yan (陈妍)
JIANG Ling (江玲)
YU Haoran (于浩然)
Affiliations: Zhejiang Collaborative Innovation Center for Full-Process Monitoring and Green Governance of Emerging Contaminants, Interdisciplinary Research Academy, Zhejiang Shuren University, Hangzhou 310015; Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310027; ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou 311200
Source
Journal of Chinese Institute of Food Science and Technology (《中国食品学报》)
Listed in the Peking University Core Journals index (北大核心)
2025, Issue 11, pp. 1-12 (12 pages)
Funding
"Pioneer" and "Leading Goose" R&D Program of Zhejiang Province (浙江省“尖兵”“领雁”研发攻关计划项目), Grant No. 2025C01097.
Keywords
protein language models
enzyme engineering
zero-shot prediction
enzyme function
de novo design