Advanced Database Mining of Efficient Biocatalysts by Sequence and Structure Bioinformatics and Microfluidics
Vasina, M., Vanacek, P., Hon, J., Kovar, D., Faldynova, H., Kunka, A., Buryska, T., Badenhorst, C.P.S., Mazurenko, S., Bednar, D., Stavrakis, S., Bornscheuer, U.T., deMello, A., Damborsky, J., Prokop, Z.
CHEM CATALYSIS 2: 2704-2725 (2022)
Next-generation sequencing doubles genomic databases every 2.5 years. The accumulation of sequence data provides a unique opportunity to identify interesting biocatalysts directly in the databases without tedious and time-consuming engineering. Herein, we present a pipeline integrating sequence and structural bioinformatics with microfluidic enzymology for bioprospecting of efficient and robust haloalkane dehalogenases. The bioinformatic part identified 2,905 putative dehalogenases and prioritized a “small-but-smart” set of 45 genes, yielding 40 active enzymes, 24 of which were biochemically characterized by microfluidic enzymology techniques. Combining microfluidics with modern global data analysis provided valuable mechanistic insights into the high catalytic efficiency of selected enzymes. Overall, we have doubled the dehalogenation “toolbox” characterized over three decades, yielding biocatalysts that surpass the efficiency of currently available wild-type and engineered enzymes. This pipeline is generally applicable to other enzyme families and can accelerate the identification of efficient biocatalysts for industrial use.
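
The counts quoted in the abstract can be read as a selection funnel. The sketch below is only illustrative arithmetic on those reported numbers, not part of the authors' pipeline. The stage labels and the helper names `stage_fractions` and `growth_factor` are hypothetical, and the growth calculation assumes the 2.5-year doubling time holds steadily.

```python
# Counts as reported in the abstract; stage labels are illustrative paraphrases.
funnel = [
    ("putative dehalogenases identified", 2905),
    ("genes prioritized", 45),
    ("active enzymes obtained", 40),
    ("enzymes characterized by microfluidics", 24),
]


def stage_fractions(stages):
    """Return (label, count, fraction of the previous stage) for each stage after the first."""
    result = []
    for (_, previous_count), (label, count) in zip(stages, stages[1:]):
        result.append((label, count, count / previous_count))
    return result


def growth_factor(years, doubling_time=2.5):
    """Fold increase in database size after `years`, assuming steady doubling (an assumption)."""
    return 2 ** (years / doubling_time)


fractions = stage_fractions(funnel)
for label, count, fraction in fractions:
    print(f"{label}: {count} ({fraction:.1%} of previous stage)")

print(f"Database growth over 10 years: {growth_factor(10):.0f}-fold")
```

On these numbers, 40 of the 45 prioritized genes (about 89%) gave active enzymes, and 24 of those 40 (60%) were characterized further.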