Friday, 2nd April 2010
Analysing evolution II
I wanted to re-run my simulation with a couple of changes. First, I wanted to see whether different 'species' could co-exist, so I recorded the genome of every organism, rather than just the fittest organism. I also recorded the concentration of metabolites in the solution at the end of each generation, so I can see which are used most.
Second, in order to encourage different species to evolve, I doubled the population size to 128 cells and simplified the way the next generation is selected. Now, each cell in the fittest half of the population produces two daughter cells for the next generation. As a result, mutations now take longer to spread through the population, so a single species is less likely to dominate.
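The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the simulation's actual code: the names `next_generation`, `fitness` and `mutate` are hypothetical, and the default `mutate` simply copies the parent.

```python
def next_generation(population, fitness, mutate=lambda genome: genome):
    """Return the next generation under truncation selection.

    population: list of genomes (any objects)
    fitness:    hypothetical function mapping a genome to a number
    mutate:     hypothetical function returning a (possibly mutated) copy
    """
    # Rank cells from fittest to least fit.
    ranked = sorted(population, key=fitness, reverse=True)
    # Only the fittest half reproduces.
    parents = ranked[: len(ranked) // 2]
    daughters = []
    for parent in parents:
        # Each surviving cell leaves exactly two daughter cells,
        # so the population size stays constant (for an even-sized population).
        daughters.append(mutate(parent))
        daughters.append(mutate(parent))
    return daughters
```

With 128 cells, the 64 fittest each leave two daughters, so a new mutation can at most double its share of the population per generation.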
To further encourage different species to evolve, and in preparation for later updates, I created a more diverse and challenging metabolic environment. The first change was to create two potential sources of energy, metabolites FG and FK. These provide energy both in the form of a concentration gradient (they are more concentrated outside the cell than inside) and because they release energy when broken down (the reaction is far from equilibrium). I imagine FG and FK are the equivalent of two sugars (such as glucose and fructose). Since they both contain chemical F, this is likely to become a waste product.
While I decided to keep EH as an equivalent to ATP (because its hydrolysis reaction is the most favourable), I decided to make IH the equivalent to DNA, and changed the measure of fitness: now cells must maximise the amount of IH they accumulate. In the ancestral cell, IH formation is driven by EH hydrolysis. I also decided to make KG the equivalent to amino acids, though in this simulation, this still has no effect.
With twice the population size and about twice the number of genes in the ancestral cell, this simulation was a bit slower to run, but after a few days it still managed to reach 1920 generations. Below is a graph showing the maximum (light blue) and median (dark blue) fitness of cells in the population, where fitness is defined by the concentration of IH a cell accumulates by the end of a generation.
As in the previous simulation, fitness increased for the first 200 or so generations before levelling off. In generation 0, the fittest cell accumulated 1.99 units of IH; by generation 200, it accumulated 21.1 units, and by generation 1920, 20.9 units. After the first 200 generations, there was little difference between the fittest and the median organisms, suggesting that the successful genes spread quickly through the population.
Below is a graph showing the concentration of various metabolites at the end of each generation, relative to the concentration in the 0th generation. Note that the concentration of metabolites is reset in each generation, so all changes are due to changes in the metabolism of the cells. The metabolites not shown did not change significantly, and there were no big changes after the 1000th generation.
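For anyone wanting to reproduce this kind of plot, the normalisation is just a division by the generation-0 value. Here is a minimal sketch, assuming the recorded data is a dictionary mapping each metabolite name to a list of end-of-generation concentrations (the function name and data layout are my own assumptions, not the simulation's format).

```python
def relative_concentrations(history):
    """Express each metabolite's concentration relative to generation 0.

    history: dict mapping metabolite name -> list of concentrations,
             where index i is the value at the end of generation i
             (an assumed layout, for illustration only).
    """
    relative = {}
    for metabolite, series in history.items():
        baseline = series[0]
        if baseline == 0:
            # The ratio is undefined if nothing was present in generation 0.
            continue
        relative[metabolite] = [value / baseline for value in series]
    return relative
```

A value above 1 then means the cells left more of that metabolite in the solution than the ancestral population did, and a value below 1 means they took up more of it.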
As predicted, F was a major waste product, and as the cells evolved they released more of it into the environment. Similarly, K and G, which are also waste products of the "sugars", increased. Conversely, E was an important metabolite for synthesising EH, so more was taken up from the environment as the cells evolved. Surprisingly, IL, which was not part of the original metabolic pathway, also became an important metabolite; later we'll see why.
There are a few large changes in metabolite concentrations around generations 80 and 200, which suggests that some important mutations occurred and took hold at these points.
Since I recorded the genome of each organism in this run of evolution (1920 generations, each with 128 cells, so nearly 250 000 genomes), I can also plot the fitness of, for example, the 64th fittest organism in each generation, which gives a measure of the median fitness. As you can see, the graph is very similar in general; only the position of the fluctuations is different.
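Picking out the 64th fittest organism is simply a matter of sorting each generation's fitness values. The sketch below is an illustration under my own naming (`fitness_at_rank` is hypothetical), and it also checks the genome count quoted above.

```python
def fitness_at_rank(fitnesses, rank):
    """Return the fitness of the rank-th fittest cell (rank 1 = fittest)."""
    ordered = sorted(fitnesses, reverse=True)
    return ordered[rank - 1]

# Number of genomes recorded: 1920 generations of 128 cells each.
GENERATIONS = 1920
POPULATION_SIZE = 128
total_genomes = GENERATIONS * POPULATION_SIZE  # 245760, i.e. nearly 250 000
```

With 128 cells, rank 64 sits just above the true median (which would average ranks 64 and 65), so it is a close stand-in rather than an exact median.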
A quick look at the proteomes suggests that the advantageous mutation in generation 1 is for an enzyme that catalyses the reaction EH + IL ⇌ EL + IH. This reaction is a more efficient way of using the EH gradient to drive IH production and uses the small amount of IL in the cell. The improved fitness of a cell in generation 14 is likely to be due to a mutation that creates a G/IL antiporter, which allows the cell to take up IL using the G gradient.