Digital interventions revolutionizing enzyme engineering 

Artificial Intelligence and Machine Learning are extremely critical for leapfrogging the traditional enzyme evolution process, writes Naveen Kulkarni, CEO, Quantumzyme

About the Author: Naveen Kulkarni, Chief Executive Officer of Quantumzyme, has more than 20 years of experience covering a broad range of scientific, entrepreneurial and innovation strategies, in markets spanning Europe, the USA, Australia and India. Prior to founding Quantumzyme, he served as Director at Philips Research.

Enzyme engineering is the process of improving the efficiency of an existing enzyme, or creating a new enzyme activity, by altering its amino acid sequence. This is done to overcome the disadvantages of native enzymes as biocatalysts and to enhance their performance. The recent surge in high-throughput experimental methods, including Next Generation Sequencing, is bringing a paradigm shift in our approach to enzyme engineering.
Artificial intelligence (AI) and machine learning (ML) are closely related next-generation computing technologies. Although the two terms are sometimes used interchangeably, they do not mean the same thing. AI and ML have great potential to revolutionize smart enzyme engineering without the explicit need for a complete understanding of the underlying molecular system.
On a broad level, AI is the bigger concept: building intelligent machines that can simulate human thinking and behavior. Machine learning is a subset of AI that allows machines to learn from data without being explicitly programmed.
Why digital data matters
Both AI and ML are needed to enhance the capabilities of enzyme engineering and to make it possible to evaluate several hundred mutations in a short amount of time. The most essential component for building AI and ML models is the data set, and the field of enzyme engineering has recently started focusing on curating digital data. This effort is still at a nascent stage and will only get better. So, what kind of data do we need for AI and ML?

“The most essential component for building AI and ML models is the data set, and the field of enzyme engineering has recently started focusing on curating digital data”

Larger accumulations of data have proven very useful for building AI/ML models, because these algorithms thrive on learning from more and more relevant data. The widely used databases hold an abundance of protein sequence data but lack details such as protein structure, kinetic/performance data and reaction details.
The challenge with the available data is that most databases are not structured and cannot be used directly by machine learning models. As a result, data gathering, cleaning and preprocessing are time consuming, and at times impossible, because the data are mostly stored as images or in no defined structure. Combining data accumulated from different sources may provide enough data, but problems then arise with data consistency and comparability.
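The consistency and comparability problem can be illustrated with a minimal Python sketch. Everything in it is assumed for illustration: the field names (`sequence`, `kcat`, `kcat_unit`, `source`), the unit table, the 20% tolerance and the example records are hypothetical and do not come from any real database.

```python
from collections import defaultdict

# Hypothetical conversion factors to one common unit (per second).
KCAT_TO_PER_SECOND = {"1/s": 1.0, "1/min": 1.0 / 60.0}


def normalize_record(record):
    """Return a cleaned copy: sequence without whitespace, kcat in 1/s."""
    sequence = "".join(record["sequence"].split()).upper()
    factor = KCAT_TO_PER_SECOND[record["kcat_unit"]]
    return {
        "sequence": sequence,
        "kcat_per_s": record["kcat"] * factor,
        "source": record["source"],
    }


def merge_sources(records, tolerance=0.2):
    """Group records by sequence and flag groups whose values disagree.

    A group counts as consistent when the gap between its largest and
    smallest kcat is at most `tolerance`, relative to the largest value.
    """
    grouped = defaultdict(list)
    for record in records:
        clean = normalize_record(record)
        grouped[clean["sequence"]].append(clean)

    merged = {}
    for sequence, entries in grouped.items():
        values = [entry["kcat_per_s"] for entry in entries]
        largest = max(values)
        spread = (largest - min(values)) / largest if largest > 0 else 0.0
        merged[sequence] = {
            "mean_kcat_per_s": sum(values) / len(values),
            "sources": sorted(entry["source"] for entry in entries),
            "consistent": spread <= tolerance,
        }
    return merged


# Made-up example records, not real measurements.
records = [
    {"sequence": "mkt ayi", "kcat": 120.0, "kcat_unit": "1/min", "source": "db_a"},
    {"sequence": "MKTAYI", "kcat": 2.1, "kcat_unit": "1/s", "source": "db_b"},
    {"sequence": "MKTAYV", "kcat": 5.0, "kcat_unit": "1/s", "source": "db_b"},
]
merged = merge_sources(records)
```

In this sketch the first two records describe the same sequence in different formats and units; only after normalization can they be compared and either averaged or flagged as inconsistent.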
Evolving technologies
The ML umbrella also encompasses another term, deep learning. Deep models are large, complex networks that are able to model difficult data distributions, such as high-dimensional data. These models extract useful patterns directly from the raw data, without the need for manually created features or other human intervention. However, because deep models have a large number of parameters, extensive training data sets are required to train them. With more experimental data and more computational power available, ML efforts in the field of enzymology are growing and will continue to grow. That growth brings its own work and challenges, including the creation of new databases for training, data preprocessing, and the ethical concerns related to creating new biomolecules with unintended characteristics.
Various technologies are available for developing efficient models. TensorFlow, for example, helps with model tracking, performance monitoring and model retraining. These technologies reduce manual work through automation and help the team understand the problem better by providing more insights, visual illustrations of complex concepts and intelligent predictions.
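To give a sense of what "raw data" can mean for a protein, the following Python sketch turns an amino acid sequence, with an optional point mutation, into a one-hot matrix of the kind a deep model could take as input. One-hot encoding is just one common choice, used here as an assumption; the function names and the `A2V` mutation notation handling are illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
INDEX = {residue: i for i, residue in enumerate(AMINO_ACIDS)}


def apply_mutation(sequence, mutation):
    """Apply a point mutation written like 'A2V' (1-based position)."""
    original, replacement = mutation[0], mutation[-1]
    position = int(mutation[1:-1])
    if sequence[position - 1] != original:
        raise ValueError(f"expected {original} at position {position}")
    return sequence[:position - 1] + replacement + sequence[position:]


def one_hot_encode(sequence):
    """Encode a sequence as a list of 20-element rows, one per residue.

    Residues outside the 20 standard amino acids become all-zero rows.
    """
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in INDEX:
            row[INDEX[residue]] = 1
        rows.append(row)
    return rows


wild_type = "MAK"  # toy sequence, far shorter than a real enzyme
variant = apply_mutation(wild_type, "A2V")
encoded = one_hot_encode(variant)
```

No hand-crafted features are computed here; the model would be left to find patterns in the encoded sequence itself, which is why such approaches need large training sets.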
Way Forward
In conclusion, AI and ML are extremely critical for leapfrogging the traditional enzyme evolution process and achieving success in enzyme engineering. In the past decade there was no specific focus on this, but the turn of this decade has intensified the need for faster routes to newer and more efficient enzymes, and thereby a greener earth.

**Views expressed by the author are his own.