One of the long-standing controversial arguments in protein folding is Levinthal's paradox. We have recently proposed a new nucleation hypothesis and shown that the nucleation residues are the most conserved sequences in proteins. To avoid the complicating effect of tertiary interactions, we limit our search for structural codes to the nucleation residues. Starting from the hypotheses of secondary structure nucleation and conservation of residues important for folding, we have analysed 762 folds classified as unique by SCOP. Segments of 17 residues around the top 20% most conserved amino acids are analysed, resulting in approximately 100 clusters each for the main secondary structure classes of helix, sheet and coil. Helical clusters have the longest correlation range and coils the shortest (four residues). Strong specific sequence-structure correlation is observed for coil but not for helix and sheet, suggesting a mapping relationship between sequence and structure for coil. We propose that the central sequences in these clusters form 'structural codes', a useful basis set for identifying nucleation sites, protein fragments stable in isolation, and secondary structural patterns in proteins (particularly turns and loops).
The family GH126, best represented by the amylolytic enzyme CPF_2247 from Clostridium perfringens, exclusively includes proteins of bacterial origin, covering predominantly the phylum Bacillota. Although all the members should adopt the catalytic (α/α)6-barrel domain, neither the catalytic machinery nor the reaction mechanism has been determined as yet. The limited biochemical characterization, especially some uncertainty concerning the endo- vs exo-mode of action and the retaining vs inverting mechanism, combined with the sequence-structural resemblance of GH126 members to inverting β-glucanases from families GH8 and GH48 (the clan GH-M), may lead to misclassification of putative proteins. The present study was therefore designed to identify unique sequence-structural features that would definitively differentiate family GH126 from both GH8 and GH48. To achieve this, a sequence logo representing the seven previously established GH126 conserved sequence regions (CSRs) was created using 1665 GH126 sequences. The logo was compared with GH8 and GH48 logos based on, respectively, 86 and 63 selected enzymes. An invariant tyrosine residue in CSR-6 was identified as a reliable marker for the family GH126. In addition, protein BLAST searches identified 87 putative proteins that taxonomically extend the family GH126 not only outside Bacillota, but also outside Bacteria to include representatives of archaea and eukaryotes (fungi). Evolutionary analysis of the 434 sequences representing all three families (GH126, GH8 and GH48), including the BLAST hits, revealed an intermediate group. In the future, this group may either define a new GH family closely related to GH126 or at least constitute a GH126 subfamily.
Funding: This work was financially supported by Grant No. 2/0146/21 from the Slovak Grant Agency VEGA and Grant No. FPPV-35-2024 from the University of Ss. Cyril and Methodius in Trnava.