Machine Learning in Enzyme Engineering
Mazurenko, S., Prokop, Z., Damborsky, J., 2019, ACS Catalysis XXX-XXX.
We analyse the state of the art in databases and machine learning methods used for training and validating predictors in enzyme engineering in this Perspective. We discuss current limitations and challenges which the community is facing and recent advancements in experimental and theoretical methods that have the potential to address those challenges. We also present our view on possible future directions for developing the applications to the design of efficient biocatalysts.
Caver Web 1.0: Identification of Tunnels and Channels in Proteins and Analysis of Ligand Transport
Stourac, J., Vavra, O., Kokkonen, P., Filipovic, J., Pinto, G., Brezovsky, J., Damborsky, J., Bednar, D., 2019, Nucleic Acids Research W1: W414–W422.
Caver Web 1.0 is a web server for comprehensive analysis of protein tunnels and channels and for studying ligand transport through these pathways. Caver Web is the first interactive tool that allows both analyses within a single graphical user interface. The tool is fast (2-20 min per job) and is applicable even to virtual screening. Its simple setup and comprehensive graphical user interface make the tool accessible to a broad scientific community. The server is freely available at https://loschmidt.chemi.muni.cz/caverweb.
Light-Emitting Dehalogenases: Reconstruction of Multifunctional Biocatalysts
Chaloupkova, R., Liskova, V., Toul, M., Markova, K., Sebestova, E., Hernychova, L., Marek, M., Pinto, G. P., Pluskal, D., Waterman, J., Prokop, Z., Damborsky, J., 2019, ACS Catalysis 9: 4810–4823.
To obtain structural insights into the emergence of new biological functions from catalytically promiscuous enzymes, we reconstructed an ancestor of catalytically distinct, but evolutionarily related, haloalkane dehalogenases (EC 3.8.1.5) and Renilla luciferase (EC 1.13.12.5). This ancestor has both hydrolase and monooxygenase activities. We demonstrate that a single substitution next to the catalytic pentad enables the emergence of a new activity at the enzyme-class level. Ancestral reconstruction has clear potential for obtaining multifunctional catalysts.
Exploring the Challenges of Computational Enzyme Design by Rebuilding the Active Site of a Dehalogenase
Jindal, G., Slanska, K., Kolev, V., Damborsky, J., Prokop, Z., Warshel, A., 2019, Proceedings of the National Academy of Sciences of the United States of America 116: 389–394.
The goal of rational computer-aided enzyme design is hampered by the lack of knowledge of the maximum possible rate enhancement. We address this problem by considering the enzyme DhlA, which is naturally adapted for the degradation of dihalogenated ethanes. Using empirical valence bond calculations, we determine the effect of finding mutations that reduce the catalysis and then introducing mutations that restore catalysis. One of our predicted cycles is confirmed experimentally, while the other attempt remains inconclusive. We believe that the proposed strategy provides a very powerful way of validating and refining approaches for computational enzyme design.
Molecular Gating of an Engineered Enzyme Captured in Real Time
Kokkonen, P., Sykora, J., Prokop, Z., Ghose, A., Bednar, D., Amaro, M., Beerens, K., Bidmanova, S., Slanska, M., Brezovsky, J., Damborsky, J., Hof, M., 2018, Journal of the American Chemical Society 140: 17999–18008.
Engineering dynamical molecular gates represents a widely applicable strategy for designing efficient biocatalysts. Here we analyzed the dynamics of a molecular gate artificially introduced into an access tunnel of the most efficient haloalkane dehalogenase using pre-steady-state kinetics, single-molecule fluorescence spectroscopy and molecular dynamics. Photoinduced electron-transfer – fluorescence correlation spectroscopy (PET-FCS) has enabled real-time observation of molecular gating at the single-molecule level, with rate constants (kon = 1822 s⁻¹, koff = 60 s⁻¹) corresponding well with those from the pre-steady-state kinetics (k₋₁ = 1100 s⁻¹, k₁ = 20 s⁻¹).
Evolutionary Analysis is a Powerful Complement to Energy Calculations for Protein Stabilization
Beerens, K., Mazurenko, S., Kunka, A., Marques, S. M., Hansen, N., Musil, M., Chaloupkova, R., Waterman, J., Brezovsky, J., Bednar, D., Prokop, Z., Damborsky, J., 2018, ACS Catalysis 8: 9420–9428.
Stability is one of the most important characteristics of proteins, and the role of computational approaches in modifying protein stability is rapidly expanding. Here we present a detailed mechanistic study of stabilizing mutations derived from phylogenetic analysis. We explain why these highly beneficial mutations can be easily missed by widely used force-field calculations. A hybrid approach to protein stabilization – combining both energy calculations and evolutionary analysis – is freely available to the broad scientific community via the web server application FireProt: https://loschmidt.chemi.muni.cz/fireprot/.
CalFitter: A Web Server for Analysis of Protein Thermal Denaturation Data
Mazurenko, S., Stourac, J., Kunka, A., Nedeljkovic, S., Bednar, D., Prokop, Z., Damborsky, J., 2018, Nucleic Acids Research 46: W344-W349.
The CalFitter web server is a unified platform for comprehensive fitting and analysis of protein thermal denaturation data. The server allows simultaneous global data fitting using any combination of input data types and offers twelve protein unfolding pathway models to select from. The data fitting produces optimal parameters, their confidence intervals, and statistical information to define unfolding pathways. The server provides an interactive and easy-to-use interface that allows users to directly analyse input datasets.
HotSpot Wizard 3.0: Web Server for Automated Design of Mutations and Smart Libraries Based on Sequence Input Information
Sumbalova, L., Stourac, J., Martinek, T., Bednar, D., Damborsky, J., 2018, Nucleic Acids Research 46: W356-W362.
HotSpot Wizard is a web server for automatic identification of ‘hot spots’ for the engineering of substrate specificity, activity or enantioselectivity of enzymes. Version 3.0 accepts the protein sequence as input data. The protein structure for the query sequence is obtained either from eight repositories of homology models or is modelled using Modeller and I-Tasser. A new module for the estimation of thermodynamic stabilities using the Rosetta suite has also been introduced to prevent the selection of destabilising mutations.
Exploration of Enzyme Diversity by Integrating Bioinformatics with Expression Analysis and Biochemical Characterization
Vanacek, P., Sebestova, E., Babkova, P., Bidmanova, S., Daniel, L., Dvorak, P., Stepankova, V., Chaloupkova, R., Brezovsky, J., Prokop, Z., Damborsky, J., 2018, ACS Catalysis 8: 2402–2412.
Millions of protein sequences are being discovered at an incredible pace, representing an inexhaustible source of biocatalysts. We present an integrated system for automated in silico screening and systematic characterization of diverse family members. The workflow consists of: (i) identification and computational characterization of relevant genes by sequence/structural bioinformatics, (ii) expression analysis and activity screening of selected proteins, and (iii) complete biochemical/biophysical characterization.
FireProt: Web Server for Automated Design of Thermostable Proteins
Musil, M., Stourac, J., Bendl, J., Brezovsky, J., Prokop, Z., Zendulka, J., Martinek, T., Bednar, D., Damborsky, J., 2017, Nucleic Acids Research 45: W393-W399.
FireProt is a web server for the automated design of multiple-point thermostable mutant proteins that combines structural and evolutionary information in its calculation core. FireProt utilizes sixteen tools and three protein engineering strategies for making reliable protein designs. The server is complemented with an interactive, easy-to-use interface that allows users to directly analyze and optionally modify designed thermostable mutants. FireProt is freely available at https://loschmidt.chemi.muni.cz/fireprot.
Different Structural Origins of the Enantioselectivity of Haloalkane Dehalogenases toward Linear β-Haloalkanes: Open–Solvated versus Occluded–Desolvated Active Sites
Liskova, V., Stepankova, V., Bednar, D., Brezovsky, J., Prokop, Z., Chaloupkova, R., Damborsky, J., 2017, Angewandte Chemie International Edition 56: 4719-4723.
The enzymatic enantiodiscrimination of linear β-haloalkanes is difficult because the simple structures of the substrates prevent directional interactions. Herein we describe two distinct molecular mechanisms for the enantiodiscrimination of the β-haloalkane 2-bromopentane by haloalkane dehalogenases. Highly enantioselective DbjA has an open, solvent-accessible active site, whereas the engineered enzyme DhaA31 has an occluded and less solvated cavity but shows similar enantioselectivity. The enantioselectivity of DhaA31 arises from steric hindrance imposed by two specific substitutions rather than hydration as in DbjA.
Enzyme Tunnels and Gates as Relevant Targets in Drug Design
Marques, S. M., Daniel, L., Buryska, T., Prokop, Z., Brezovsky, J., Damborsky, J., 2016, Medicinal Research Reviews 37: 1095-1139.
We described a set of general concepts relating to the structural properties, function, and classification of enzyme tunnels and gates. We highlighted the potential of enzyme tunnels and gates as targets for the binding of small molecules, the different types of their binding, and the potential pharmacological benefits. Twelve examples of ligands bound to the tunnels and/or gates of clinically relevant enzymes were used to illustrate the different binding modes and to explain some new strategies for drug design.
Engineering a de Novo Transport Tunnel
Brezovsky, J., Babkova, P., Degtjarik, O., Fortova, A., Gora, A., Iermak, I., Rezacova, P., Dvorak, P., Kuta Smatanova, I., Prokop, Z., Chaloupkova, R., Damborsky, J., 2016, ACS Catalysis 6: 7597-7610.
We described the computational design and directed evolution of a de novo transport tunnel in a haloalkane dehalogenase. Mutants with a blocked native tunnel and a newly opened auxiliary tunnel in a distinct part of the structure showed dramatically modified properties. The mutants with blocked tunnels acquired specificity never observed with native family members: up to 32 times increased substrate inhibition and 17 times reduced catalytic rates. Opening of the auxiliary tunnel resulted in specificity and substrate inhibition similar to those of the native enzyme and the most proficient haloalkane dehalogenase reported to date.
PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions
Bendl, J., Musil, M., Stourac, J., Zendulka, J., Damborsky, J., Brezovsky, J., 2016, PLOS Computational Biology 12: e1004962.
We have developed the web server PredictSNP2, which provides easy access to binary predictions and uniform confidence values for the five best-performing prediction tools and their consensus. These predictions are supplemented with information gathered from eight publicly available databases. PredictSNP2 extends the scope of genome analysis to the level of nucleotide substitutions, enabling the identification of disease-related variants across the whole genome.
HotSpot Wizard 2: Automated Design of Site-Specific Mutations and Smart Libraries in Protein Engineering
Bendl, J., Stourac, J., Sebestova, E., Vavra, O., Musil, M., Brezovsky, J., Damborsky, J., 2016, Nucleic Acids Research 44: W479-W487.
We developed HotSpot Wizard 2.0, a web server for automated identification of hot spots and design of smart libraries for engineering proteins’ stability, catalytic activity, substrate specificity and enantioselectivity. Compared to its predecessor, HotSpot Wizard 2.0 introduces several major improvements, extending the scope and quality of its analyses. It implements four different established protein engineering strategies, enabling the user to selectively target sites affecting the protein’s stability and catalytic properties. A new graphical interface provides an intuitive and comprehensive overview of the results of the analysis. The resulting pipeline of twenty integrated tools, including our in-house software Caver 3.0 for analysis of protein tunnels and channels, and three databases represents a unique one-stop solution that makes library design accessible even to users with no prior knowledge of bioinformatics.
FireProt: Energy- and Evolution-Based Computational Design of Thermostable Multiple-Point Mutants
Bednar, D., Beerens, K., Sebestova, E., Bendl, J., Khare, S., Chaloupkova, R., Prokop, Z., Brezovský, J., Baker, D., Damborsky, J., 2015, PLOS Computational Biology 11: e1004556.
FireProt is a robust computational strategy for predicting highly stable multiple-point mutants that combines energy- and evolution-based approaches with smart filtering to identify additive stabilizing mutations. We demonstrate that the thermostability of the model enzymes can be substantially increased (ΔTm = 24°C and 21°C) by constructing and characterizing only a handful of multiple-point mutants. FireProt’s reliability and applicability were demonstrated by validating its predictions against 656 mutations from the ProTherm database. FireProt can be applied to any protein for which a tertiary structure and homologous sequences are available.
Site-specific Analysis of Protein Hydration Based on Unnatural Amino Acid Fluorescence
Amaro, M., Brezovsky, J., Kovacova, S., Sykora, J., Bednar, D., Nemec, V., Liskova, V., Kurumbang, N., Beerens, K., Chaloupkova, R., Paruch, K., Hof, M., Damborsky, J., 2015, Journal of the American Chemical Society 137: 4988-4992.
Hydration of proteins profoundly affects their functions. We describe a simple and general method for a site-specific analysis of protein hydration based on the in vivo incorporation of fluorescent unnatural amino acids and their analysis by steady-state fluorescence spectroscopy. Using this method, we investigate the hydration of functionally important regions of dehalogenases DhaA and DbjA. The experimental results are compared to findings from molecular dynamics simulations. Given the ongoing development of unnatural amino acids technology, this method could potentially be used to analyze hydration at specific sites in a wide range of proteins.
Dynamics and Hydration Explain Failed Functional Transformation in Dehalogenase Design
Sykora, J., Brezovsky, J., Koudelakova, T., Lahoda, M., Fortova, A., Chernovets, T., Chaloupkova, R., Stepankova, V., Prokop, Z., Kuta Smatanova, I., Hof, M., Damborsky, J., 2014, Nature Chemical Biology 10: 428-430.
We emphasize the importance of dynamics and hydration for enzymatic catalysis and protein design by transplanting the active site from a haloalkane dehalogenase with high enantioselectivity to a nonselective dehalogenase. Protein crystallography confirms that the active site geometry of the redesigned dehalogenase matches that of the target, but its enantioselectivity remains low. Time-dependent fluorescence shifts and computer simulations revealed that dynamics and hydration at the tunnel mouth differ substantially between the redesigned and target dehalogenase.
PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations
Bendl, J., Stourac, J., Salanda, O., Pavelka, A., Wieben, E. D., Zendulka, J., Brezovsky, J., Damborsky, J., 2014, PLOS Computational Biology 10: e1003440.
We have constructed three independent datasets by removing duplicates, inconsistencies and mutations previously used in the training of the evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best-performing tools were combined into a consensus classifier, PredictSNP, resulting in significantly improved prediction performance and robustness. The web server and the datasets are freely available to the academic community at https://loschmidt.chemi.muni.cz/predictsnp.
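The idea of merging several binary predictors into one consensus call can be illustrated with a minimal sketch. The confidence-weighted vote below is an assumption made for illustration, not the scheme documented for PredictSNP; the function name, the tie-breaking rule and the example scores are all hypothetical.

```python
from typing import Dict, Tuple

def consensus_vote(predictions: Dict[str, Tuple[int, float]]) -> Tuple[int, float]:
    """Combine per-tool calls into one consensus call (illustrative only).

    Each value is (label, confidence): label is +1 (deleterious) or
    -1 (neutral), confidence is a weight in [0, 1].
    Returns (consensus_label, consensus_confidence).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    total_weight = sum(conf for _, conf in predictions.values())
    if total_weight == 0:
        # No tool expresses any confidence: fall back to "neutral" (hypothetical rule).
        return -1, 0.0
    # Weighted mean of the labels, a value in [-1, 1].
    score = sum(label * conf for label, conf in predictions.values()) / total_weight
    # Ties (score == 0) are resolved as "neutral" (hypothetical rule).
    label = 1 if score > 0 else -1
    return label, abs(score)

# Made-up scores for a single variant; the tool names are only labels here.
example = {
    "SIFT": (1, 0.8),
    "PolyPhen-2": (1, 0.6),
    "MAPP": (-1, 0.5),
}
label, confidence = consensus_vote(example)
```

In this sketch a dissenting tool lowers the consensus confidence rather than being ignored, which is one simple way to expose disagreement between predictors.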
Gates of Enzymes
Gora, A., Brezovsky, J., Damborsky, J., 2013, Chemical Reviews 113: 5871–5923.
This review highlights the importance of gates in enzymes. The gates control substrate access to the active site and product release, restrict solvent access to specific protein regions, and synchronize processes occurring in distinct parts of the enzyme. A survey of 129 gates in 71 enzymes enabled a rigorous definition of gates and the establishment of a new scheme for their classification. Gates were assigned to six distinct classes – wings, swinging doors, apertures, drawbridges, double drawbridges and shells. Summary statistics describe the propensity of specific amino acid residues in particular gate classes. The proposed classification scheme provides guidance for the analysis and engineering of gates in biomolecular systems.
Engineering Enzyme Stability and Resistance to an Organic Cosolvent by Modification of Residues in the Access Tunnel
Koudelakova, T., Chaloupkova, R., Brezovsky, J., Prokop, Z., Sebestova, E., Hesseler, M., Khabiri, M., Plevaka, M., Kulik, D., Kuta Smatanova, I., Rezacova, P., Ettrich, R., Bornscheuer, U. T., Damborsky, J., 2013, Angewandte Chemie International Edition 52: 1959-1963.
Mutations targeting as few as four residues lining the access tunnel extended the enzyme’s half-life in 40% dimethyl sulfoxide from minutes to weeks (4,000-fold) and increased its melting temperature by 19 °C. Protein crystallography and molecular dynamics revealed that the tunnel residue packing is a key determinant of protein stability and of the active-site accessibility for co-solvent molecules. The broad applicability of this concept was verified by analyzing twenty-six proteins with buried active sites from all six enzyme classes.
CAVER 3.0: A Tool for Analysis of Transport Pathways in Dynamic Protein Structures
Chovancova, E., Pavelka, A., Benes, P., Strnad, O., Brezovsky, J., Kozlikova, B., Gora, A., Sustr, V., Klvana, M., Medek, P., Biedermannova, L., Sochor, J., Damborsky, J., 2012, PLOS Computational Biology 8: e1002708.
Tunnels and channels facilitate the transport of small molecules, ions and water solvent in a large variety of proteins. CAVER is a software tool widely used for the identification and characterization of transport pathways in static macromolecular structures. A new version of CAVER was developed enabling automatic analysis of tunnels and channels in large ensembles of protein conformations. CAVER 3.0 implements new algorithms for the calculation and clustering of pathways. The software is freely available as a multiplatform command-line application at https://www.caver.cz.
Enantioselectivity of Haloalkane Dehalogenases and its Modulation by Surface Loop Engineering
Prokop, Z., Sato, Y., Brezovsky, J., Mozga, T., Chaloupkova, R., Koudelakova, T., Jerabek, P., Stepankova, V., Natsume, R., Leeuwen, J. G. E., Janssen, D. B., Florian, J., Nagata, Y., Senda, T., Damborsky, J., 2010, Angewandte Chemie International Edition 49: 6111-6115.
Engineering of the surface loop in haloalkane dehalogenases affects their enantiodiscrimination behavior. The temperature dependence of the enantioselectivity (lnE versus 1/T) of haloalkane dehalogenases toward β-bromoalkanes is reversed by deletion of the surface loop; the selectivity switches back when an additional single-point mutation is made. This behavior is not observed for α-bromoesters.
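For readers unfamiliar with the lnE versus 1/T representation, the standard transition-state relation behind such plots is sketched below. The symbols ΔΔH‡ and ΔΔS‡ (differential activation enthalpy and entropy between the two enantiomers) and the definition of E as a ratio of specificity constants are textbook assumptions, not values or notation taken from the paper.

```latex
% E: ratio of specificity constants of the preferred and non-preferred enantiomer
E = \frac{(k_{\mathrm{cat}}/K_{\mathrm{m}})_{\mathrm{fast}}}{(k_{\mathrm{cat}}/K_{\mathrm{m}})_{\mathrm{slow}}},
\qquad
\Delta\Delta G^{\ddagger} = -RT \ln E = \Delta\Delta H^{\ddagger} - T\,\Delta\Delta S^{\ddagger}

% Rearranged into the linear form plotted as ln E against 1/T
\ln E = -\frac{\Delta\Delta H^{\ddagger}}{R}\cdot\frac{1}{T} + \frac{\Delta\Delta S^{\ddagger}}{R}
```

Read in this framework, the slope of the plot is −ΔΔH‡/R and the intercept is ΔΔS‡/R, so a reversed temperature dependence would correspond to a change in the sign of the slope.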
Redesigning Dehalogenase Access Tunnels as a Strategy for Degrading an Anthropogenic Substrate
Pavlova, M., Klvana, M., Chaloupkova, R., Banas, P., Otyepka, M., Wade, R., Nagata, Y., Damborsky, J., 2009, Nature Chemical Biology 5: 727-733.
Engineering enzymes to degrade anthropogenic compounds efficiently is challenging. Using a new strategy, we obtained Rhodococcus rhodochrous haloalkane dehalogenase mutants with up to 32-fold higher activity than wild type toward the toxic, recalcitrant anthropogenic compound 1,2,3-trichloropropane (TCP). Key residues in access tunnels connecting the buried active site with bulk solvent were identified by rational design and randomized by directed evolution. The most active mutant has large aromatic residues at two out of three randomized positions and two positions modified by site-directed mutagenesis. These changes apparently enhance activity with TCP by decreasing accessibility of the active site for water molecules, thereby promoting activated complex formation.
HotSpot Wizard: a Web Server for Identification of Hot Spots in Protein Engineering
Pavelka, A., Chovancova, E., Damborsky, J., 2009, Nucleic Acids Research 37: W376-W383.
HotSpot Wizard is a web server for automatic identification of ‘hot spots’ for engineering of substrate specificity, activity or enantioselectivity of enzymes and for annotation of protein structures. The web server implements the protein engineering protocol, which targets evolutionarily variable amino acid positions located in the active site or lining the access tunnels. The ‘hot spots’ for mutagenesis are selected through the integration of structural, functional and evolutionary information obtained from: (i) the databases RCSB PDB, UniProt, PDBSWS, Catalytic Site Atlas and nr NCBI and (ii) the tools CASTp, CAVER, BLAST, CD-HIT, MUSCLE and Rate4Site. The HotSpot Wizard is freely available at https://loschmidt.chemi.muni.cz/hotspotwizard/.
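The selection rule described above (evolutionarily variable positions that sit in the active site or line an access tunnel) can be expressed as a simple filter. The sketch below is only an illustration of that rule: the `Residue` fields, the variability scale and the 0.5 threshold are hypothetical and do not reflect HotSpot Wizard's internal data model or scoring.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class Residue:
    position: int          # residue number in the protein sequence
    variability: float     # hypothetical score in [0, 1]; higher = more variable
    in_active_site: bool   # e.g. derived from a pocket-detection step
    lines_tunnel: bool     # e.g. derived from a tunnel-detection step

def select_hot_spots(residues: Iterable[Residue],
                     min_variability: float = 0.5) -> List[int]:
    """Return positions that are variable AND functionally located."""
    return [
        r.position
        for r in residues
        if r.variability >= min_variability
        and (r.in_active_site or r.lines_tunnel)
    ]

# Made-up residues for illustration.
residues = [
    Residue(position=10, variability=0.9, in_active_site=False, lines_tunnel=False),
    Residue(position=42, variability=0.7, in_active_site=True, lines_tunnel=False),
    Residue(position=57, variability=0.1, in_active_site=True, lines_tunnel=True),
    Residue(position=88, variability=0.6, in_active_site=False, lines_tunnel=True),
]
hot_spots = select_hot_spots(residues)
```

Requiring both conditions keeps highly conserved functional residues (such as position 57 in the made-up data) out of the candidate list, in line with the stated goal of targeting variable positions.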