Using Computers to Improve Biofuel Production

Petroleum is the most-used energy source in the world. However, as you probably know, petroleum is a fossil fuel that is very harmful to the environment, in addition to not being renewable. Biofuels are a type of fuel produced from plant material. Biofuels are considered an excellent alternative energy source because they are less polluting than fossil fuels. However, biofuel production is expensive. Therefore, scientists are working on many strategies to reduce biofuel costs, particularly using computers to discover new biotechnological products, or improve existing ones, to produce more biofuel at lower cost. In this article, we will tell you how computers can be used to improve biofuel production.


BIOFUELS: AN ENVIRONMENTALLY FRIENDLY ALTERNATIVE
Petroleum-based products are widely used to power our cars, to heat our homes, to generate electricity, and to make plastics. Fossil fuels, like petroleum oil, are buried deep within the Earth. They take millions of years to form and their formation depends on high-pressure environments and dead organisms like plants, algae, bacteria, and animals (including dinosaurs). When burned, fossil fuels release carbon dioxide. Therefore, the more petroleum burned, the more carbon dioxide is released into the atmosphere, contributing to global warming [ ].
Biofuels may be a better source of energy. Biofuels can be produced from plants, such as corn, sugarcane, and soy. Because they come from plants that we can continue to grow, biofuels are considered renewable and sustainable, which means we can produce this kind of energy continuously [ ]. However, biofuel production is expensive. Many complex processes are needed to create biofuels from plant biomass. So, many people still believe that petroleum is the more cost-friendly choice, but they are ignoring the long-term environmental issues.
For years, scientists have been working to improve biofuel production. For instance, to produce biofuel from sugarcane, sugar is extracted from cane juice and used to produce bioethanol (a type of fuel), through a process called fermentation. However, lots of waste biomass and sugar are left over after the extraction process. A recent study in Brazil estimated that, if the leftover sugar was extracted, biofuel production could be doubled [ ]! Biofuel produced from sugarcane biomass is called second-generation biofuel.
Second-generation biofuel production involves many steps (Figure ). Saccharification is a crucial step: enzymes are used to break down the leftover sugarcane biomass. Enzymes are proteins, and like all proteins they are made of chains of subunits called amino acids (which are made of atoms). Enzymes speed up chemical reactions, like breaking down other substances. The sugarcane waste and enzymes are mixed together in a large tank, where the enzymes break down the waste to release sugar. Different enzymes have different abilities to release sugar from sugarcane waste [ ]. Improving the less-efficient enzymes could be a good strategy to improve this stage of biofuel production. Computers can be used to detect the most essential characteristics of efficient enzymes, and these characteristics can then help scientists design enzymes that are more efficient.

BIOFUEL
Fuels made from organic matter, such as plants. First-generation biofuels are generally those most easily obtained, such as from sugarcane juice. Second-generation biofuels are those produced from other organic matter, such as the material left over from first-generation production.

BIOMASS
Organic matter of plant or animal origin.

FERMENTATION
Chemical process of producing ethanol biofuel from sugars, generally carried out by enzymes from yeast or bacteria.

SACCHARIFICATION
Process of extracting sugars (glucose molecules) from organic matter. This process can be done in several ways. For example, enzymes can be used to perform chemical reactions to break down the plant's molecules, releasing glucose molecules.

ENZYME
A type of protein that speeds up chemical reactions. Also referred to as molecules or macromolecules.

Figure
Production of first- and second-generation biofuels. First-generation biofuels are easily produced (for example, from the sugarcane juice). Second-generation biofuels are produced from the plant biomass left over from making first-generation biofuels. This process uses enzymes to break down the leftover plant biomass to release sugar molecules (called saccharification). Converting sugar into bioethanol is called fermentation. Computer simulations, used together with genetic engineering, can produce enzymes that are good at saccharification, so that biofuels can be produced more easily.

HOW ARE COMPUTERS MAKING THE DIFFERENCE?
Scientists can use genetic engineering to generate enzymes that are more efficient at helping chemical reactions happen faster. Genetic engineering is complicated. It involves mutating the structure of an enzyme and studying how the mutations affect the enzyme's function. There are zillions of possible mutations and combinations of mutations. For instance, an enzyme with amino acids could be given amino acid mutations! A scientist could not possibly make and test all of these mutations! Instead, computers can be used to simulate mutations, pointing out the most promising ones to test in lab experiments. Special computer programs can simulate the structures of molecules like enzymes, based on their DNA sequences. Some software even uses graphics cards (traditionally used for running games) to show the functions of mutant enzymes as a movie [ ]!

GENETIC ENGINEERING
Process of modifying the structure of a biological molecule (mutation) through laboratory experiments.

kids.frontiersin.org June | Volume | Article |
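To get a feeling for why nobody can test every mutation by hand, here is a small counting sketch in Python. It is only an illustration under stated assumptions: the enzyme length of 300 amino acids is a made-up value, the function name `count_mutants` is ours, and we assume the 20 standard amino acids, so each position can be swapped for 19 alternatives.

```python
from math import comb

# Assumption: 20 standard amino acids, so every position has 19 alternatives.
ALTERNATIVES_PER_POSITION = 19


def count_mutants(length: int, n_mutations: int) -> int:
    """Count distinct variants with exactly n_mutations changed positions.

    We choose which positions change (a combination), then pick one of the
    19 alternative amino acids for each chosen position.
    """
    return comb(length, n_mutations) * ALTERNATIVES_PER_POSITION ** n_mutations


if __name__ == "__main__":
    length = 300  # hypothetical enzyme length, chosen only for illustration
    for k in (1, 2, 3):
        print(f"{k} mutation(s): {count_mutants(length, k)} possible variants")
```

Even for this toy enzyme, allowing just two changes at once already gives millions of variants, which is why a computer is used to narrow down the list before going to the lab.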

ALGORITHM: A COMPLEX WORD FOR A SIMPLE THING
Computers are powerful, but they also have limitations. If we want a computer to do something, a step-by-step procedure must be created. This procedure is called an algorithm. For example, to design a better enzyme, we must first understand the structure of the original enzyme. Every enzyme has a unique signature pattern, almost like a fingerprint, based on the types of atoms it contains. Enzymes are composed of many atoms connected by chemical interactions.

ALGORITHM
A step-by-step procedure used by a computer to solve a problem.

SIGNATURE PATTERN
In structural bioinformatics, a signature pattern is a set of characteristics obtained from computational analysis of a biomolecule. For example, counting the number of neighboring atoms around each atom gives a list of numbers, and that final list is the molecule's signature.

Atoms are very small particles, and the distances between them are also tiny. The types and locations of atoms in an enzyme determine the enzyme's shape, function, and how efficient it is at biofuel production. Thus, we created an algorithm to analyze each atom and its neighbors, to give us the enzyme's signature pattern. Using computers, we can represent the signature patterns of enzymes mathematically, using a list of numbers [ ]. Then, using simple equations, we can calculate the distance between the numbers to determine how similar enzymes are to each other. Similar enzymes will have similar signature patterns and might have similar functions.
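The idea of turning an enzyme into a list of numbers can be sketched in a few lines of Python. This is a simplified illustration, not the exact algorithm from our research: the atom list, the four element types, the cutoff distance, and the function names `signature` and `signature_distance` are all assumptions made for this example. The sketch counts pairs of neighboring atoms by type and then measures the straight-line (Euclidean) distance between two signatures.

```python
from itertools import combinations, combinations_with_replacement
from math import dist

# Assumption: we only track four element types, kept in alphabetical order.
ELEMENTS = ("C", "N", "O", "S")
# Every unordered pair of element types, e.g. ("C", "C"), ("C", "N"), ...
PAIR_TYPES = list(combinations_with_replacement(ELEMENTS, 2))


def signature(atoms, cutoff=1.7):
    """Count neighboring atom pairs of each type.

    atoms is a list of (element, (x, y, z)) tuples. Two atoms are treated
    as neighbors when they are at most `cutoff` apart (a made-up threshold).
    """
    counts = {pair: 0 for pair in PAIR_TYPES}
    for (el_a, pos_a), (el_b, pos_b) in combinations(atoms, 2):
        pair = tuple(sorted((el_a, el_b)))
        if pair in counts and dist(pos_a, pos_b) <= cutoff:
            counts[pair] += 1
    # The signature is the list of counts in a fixed order.
    return [counts[pair] for pair in PAIR_TYPES]


def signature_distance(sig_a, sig_b):
    """Euclidean distance between two signatures of equal length."""
    return dist(sig_a, sig_b)


if __name__ == "__main__":
    # A toy "molecule" with four atoms; coordinates are invented.
    toy = [
        ("C", (0.0, 0.0, 0.0)),
        ("N", (1.4, 0.0, 0.0)),
        ("O", (0.0, 1.2, 0.0)),
        ("C", (5.0, 5.0, 5.0)),
    ]
    print(signature(toy))
```

A small distance between two signatures suggests that the two molecules have similar neighborhoods of atoms. A real enzyme has thousands of atoms, so real signatures are built from far more counts than this toy example.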
Then, we proposed possible mutations for our ineffective enzymes to make their signature patterns more like those of the efficient enzymes. To do this, we calculated the distances among the enzymes' signatures. This can be a little confusing. Look at Figure to understand better. Remember: similar enzymes will have similar signatures. Also, enzymes with similar signatures will be closer together than enzymes with different signatures. Therefore, we can use the distances to compare efficient and inefficient enzymes. For example, imagine that the blue enzyme is a known efficient enzyme for biofuel production (do not worry, many other scientists proved that before). Additionally, the pink and green enzymes are two mutants produced by genetic engineering (unfortunately, we do not have additional information about them). Based on our algorithm, we could suppose that the pink enzyme is the more efficient mutant for biofuel production because its signature point is closer to the blue enzyme's signature point (the efficient one) than the green enzyme's point (the other mutant).
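The blue, pink, and green comparison can be written as a tiny Python sketch. The three signatures below are invented numbers used only for illustration, and the helper name `closest_to_reference` is ours. The sketch picks the mutant whose signature lies closest to the signature of the known efficient enzyme.

```python
from math import dist


def closest_to_reference(reference, candidates):
    """Return the name of the candidate whose signature is nearest to reference.

    candidates maps a name to a signature (a list of numbers with the same
    length as reference).
    """
    return min(candidates, key=lambda name: dist(reference, candidates[name]))


if __name__ == "__main__":
    blue = [12, 7, 3]  # known efficient enzyme (made-up signature)
    mutants = {
        "pink": [11, 8, 3],   # made-up signature
        "green": [2, 15, 9],  # made-up signature
    }
    print(closest_to_reference(blue, mutants))  # prints "pink"
```

Being closest is only a hint that the pink mutant might behave like the blue enzyme. The prediction still has to be checked in the laboratory.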

HOW DO COMPUTERS COMPARE MUTANT ENZYMES?
Suppose that each enzyme's signature is like a star in the sky. Stars can be grouped in constellations. How do we know which stars belong to the same constellation? They will be closer, and their positions and alignment will form some (slightly) recognizable shapes. We can use shapes and distances between stars to detect which constellations stars belong to. If you know the constellations, you can easily see them by glancing at the night sky. But your brain already knows the forms of the constellations. A computer does not know them; therefore, we need to teach it. After learning the fundamentals, the computer just needs to repeat the math. And computers are particularly good at making calculations! Suppose you can detect a constellation in s. In that case, a computer with a good algorithm could detect billions of constellations in less than a second. Imagine we are looking at a shining star on a beautiful night and we see that it belongs to the Capricornus constellation. Now, imagine that we have the magic power to move stars, pushing them in random directions. Suppose we want to move our star to the Sagittarius constellation. So, using our power, we move our star several times, until it stops closer to Sagittarius (Figure ).
The same analogy can be made for computer simulations of enzyme mutations. Each star represents an enzyme used in biofuel production. The Capricornus constellation represents a set of enzymes that are inefficient for biofuel production. Sagittarius represents efficient enzymes. The "magic power" of moving stars represents the computer's ability to simulate mutation. Mutations can occur naturally, but this process depends on many factors and can take millions of years. Genetic engineering allows scientists to quickly insert mutations into the structure of a molecule, which can improve (or decrease) its activity.

Figure
(A) Visualization of three enzymes. Note that the blue and pink enzymes have a similar shape. Both are enzymes used for biofuel production, but the green enzyme, with a very different shape, is not. (B) Enzymes' representation as atoms. Some of the main atoms in each enzyme are shown connected to their neighbors by lines. Our algorithm counts the neighboring pairs and converts this into a set of numbers. All these numbers give us the signature pattern of the enzyme. (C) Here, each signature is represented as a sphere. Look how the spheres of similar proteins are closer. Therefore, the blue and pink enzymes have more similar signature patterns, different from the green enzyme.

Figure
Yellow lines connect the stars that are in the same constellation. Each panel of the figure represents a use of our "magic power" to move stars. These movements correspond to mutations of an enzyme. After each movement, we calculate the distance between the closest stars from each constellation. Note that the purple, blue, and red stars stay closer to Capricornus, while only the green star moves closer to Sagittarius. Therefore, only the green mutation event changes the star's constellation. When mutating enzymes, this would mean modifications in the structure of these molecules.

We simulate mutations by changing random parts of the enzyme. Then, we observe whether the mutated enzyme has a signature pattern closer to the known efficient or inefficient enzymes. If the mutation makes an enzyme's signature pattern more similar to that of the efficient enzymes, we can assume that the mutant enzyme will have characteristics similar to the efficient ones. Then, scientists can make and test this mutation in the lab, to see if the mutated enzyme actually is more efficient at biofuel production.
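The "move a star and check its constellation" loop can be sketched in Python as follows. This is a toy model under loud assumptions: instead of mutating a real enzyme and recomputing its signature, the hypothetical function `random_move` simply nudges one number of the signature at random, and the two groups of signatures are invented. All function names here are ours.

```python
import random
from math import dist


def nearest_distance(point, group):
    """Distance from point to the closest member of group."""
    return min(dist(point, member) for member in group)


def random_move(signature, rng, step=1.0):
    """Toy stand-in for a mutation: nudge one random entry of the signature."""
    moved = list(signature)
    i = rng.randrange(len(moved))
    moved[i] += rng.uniform(-step, step)
    return moved


def search(start, efficient, n_trials=1000, seed=0):
    """Try random moves and keep only those that get closer to the efficient group."""
    rng = random.Random(seed)
    current = list(start)
    for _ in range(n_trials):
        candidate = random_move(current, rng)
        if nearest_distance(candidate, efficient) < nearest_distance(current, efficient):
            current = candidate
    return current


def looks_efficient(signature, efficient, inefficient):
    """True when the signature is nearer to the efficient group than the inefficient one."""
    return nearest_distance(signature, efficient) < nearest_distance(signature, inefficient)


if __name__ == "__main__":
    sagittarius = [[10.0, 10.0]]  # made-up signatures of efficient enzymes
    capricornus = [[0.0, 0.0]]    # made-up signatures of inefficient enzymes
    star = [1.0, 1.0]             # our starting enzyme, near Capricornus
    moved = search(star, sagittarius)
    print(looks_efficient(star, sagittarius, capricornus))
    print(looks_efficient(moved, sagittarius, capricornus))
```

In a real study, each accepted move would correspond to a specific amino acid change that scientists could then build and test in the laboratory.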

CONCLUSION
Over the last several years, many studies have been done to improve enzymes for biofuel production. However, laboratory tests are expensive and take a lot of time. Using computer simulations, we can run millions of tests in seconds. Although they are not as accurate as laboratory tests, computer results can help scientists to figure out which laboratory tests are likely to give them positive results. Designing algorithms for biological purposes is not an extraordinarily complex task if you understand the biological problem well and know a programming language (we recommend Python).
Computers are one of humanity's most amazing technological advances. They are responsible for a revolution in life sciences, helping scientists to improve biotechnological products like biofuels.

CONFLICT OF INTEREST:
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
COPYRIGHT © Mariano, Santos, Meleiro, de Lima, Marins and de Melo-Minardi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

JOSÉ LUCAS, AGE:
José Lucas is a student within the Mizzou Academy high school program at Colegio Maxi and he is excited to take part in the scientific process.

MARCELLA, AGE:
Marcella is a student within the Mizzou Academy high school program at Colegio Maxi and she is excited to take part in the scientific process.

DIEGO MARIANO
Diego has a Ph.D. in bioinformatics from the Universidade Federal de Minas Gerais, working in the field of data science and machine learning applied to the improvement of enzymes used in the production of biofuels. He is currently doing a post-doctoral internship at the Department of Computer Science with a focus on developing web systems for bioinformatics, exploratory analysis, and data visualization. *diegomariano@ufmg.br

LUCIANNA HELENE SANTOS
Lucianna graduated in mathematics ( ) and received a master's in computational sciences from the Universidade Estadual do Rio de Janeiro ( ). She holds a Ph.D. from the Graduate Program in Computational Biology and Systems at Instituto Oswaldo Cruz (FIOCRUZ), where she worked with molecular modeling of biological systems. She has experience in structural biology, with an emphasis on molecular modeling, working mainly on the following topics: molecular dynamics, receptor-ligand interaction, computational free-energy calculations, protein-protein interaction.

LUANA PARRAS MELEIRO
Luana graduated in chemistry with qualification in technological chemistry, biotechnology and agroindustry from the Faculdade de Filosofia Ciências e Letras de Ribeirão Preto (FFCLRP), USP ( ). She has a Ph.D. in science, with a focus in enzymology, molecular biology and studies of the structure and function of proteins, with special attention to the enzymes involved in the saccharification process of lignocellulosic biomass, also from FFCLRP-USP.

LEONARDO HENRIQUE FRANÇA DE LIMA
Leonardo graduated in biological sciences from the Universidade Federal de Minas Gerais ( ), earned a master's in chemical engineering - biotechnological processes from the Universidade Estadual de Campinas ( ) and a Ph.D. in biomolecular physics from the Universidade de São Paulo ( ). He is currently an adjunct professor at the Universidade Federal de São João Del-Rei, Campus Sete Lagoas. His postdoctoral internship in computational biophysics took place at the Institut für Allgemeine, Anorganische und Theoretische Chemie of the Centrum für Chemie und Biomedizin (CCB) of the Universität Innsbruck, Innsbruck, Austria ( ).

LUIS FERNANDO MARINS
Luis graduated in oceanology from the Universidade Federal de Rio Grande - FURG ( ), earned a master's ( ) and a doctorate ( ) in biological oceanography (FURG), and did an internship at the University of Southampton, UK. He is currently an associate professor at the Institute of Biological Sciences (ICB-FURG), advisor of the Postgraduate Program in Physiological Sciences (PPGCF-FURG), advisor of the Postgraduate Program in Aquaculture (PPGAq-FURG), and President of the Internal Biosafety Commission (CIBio-FURG).

RAQUEL CARDOSO DE MELO-MINARDI
Raquel holds a Ph.D. in bioinformatics from the Universidade Federal de Minas Gerais ( ) and a degree in computer science from the same institution ( ). She did her postdoctoral research at the Commissariat à l'Énergie Atomique et aux Énergies Alternatives (CEA) in France ( / ). She is currently a professor at the Universidade Federal de Minas Gerais in the Department of Computer Science. She is an affiliate member of the Brazilian Academy of Sciences ( -).