Previously, I talked about what fitness means. Now, it’s worth taking a moment to talk about how we calculate fitness in a few of our experimental systems, both biological and computational. Because what exactly we choose to measure can change the outcome, it is worth thinking critically about how we measure it.
E. coli Fitness
In biology land, while most of us can agree on the general concept, there are some practical limitations to how we can measure fitness. While we’d like to know how many copies of their genetic variants individuals get into future generations of the gene pool, physical organisms take a while to reproduce. Further, many of our measurements are noisy: if you are tracking a population of birds, and you can no longer find one of them, it can be hard to tell if it was eaten by a cat (and thus is helping either Fluffy’s or Mittens’ genetic legacy) or if it migrated to a new territory.
Because of these limitations, most often researchers will measure a proxy of fitness. How many eggs did each fish lay? How many chicks did the bird rear? How many seeds were produced by each plant? All of these are important components of fitness, but none of them capture the entirety of it. And most of them are plagued by the same issue I brought up in the cat example: sometimes having more offspring means having fewer grandoffspring because each of those offspring ends up with fewer resources. We’d really want to measure lineages over multiple generations to deal with that issue.
In my work with bacteria in the Long-Term Evolution Experiment, I’ve been fortunate to find a system where we can assay fitness more directly, by comparing the growth rates of two populations to get their relative fitness. We have a neutral genetic marker that allows us to tell whether a particular colony can or cannot use the sugar arabinose. This sugar is not present in the environment in which they’re evolving, or the environment in which they’re competing, so there’s no particular advantage to cells for being able to use it. But we can include this sugar on plates where we want to tell how many bacteria of each type we have. (Footnote 1)
With the ability to tell our two types of bacteria apart, we can perform what we call competition assays. I’ve included a schematic of this below. First, we take two different populations, which differ in this sugar usage, and grow them up separately. We transfer them at least once into the same growth medium in which they’re going to compete – most often, the one in which they’ve evolved – and allow them another growth cycle so that they are physiologically conditioned to this medium. Then, we start the competition by transferring some of both populations into the same flask. We take an initial sample of this mixture, dilute it and plate it so we can estimate how many of each type of bacteria were around at the start of the competition, and let the rest of the flask compete. At the end of the growth cycle, we take a new sample, dilute and plate that, and use it to find out how many of each type we have at the end. (We can also extend the competition across multiple growth cycles to get greater precision in our calculations.)
Knowing the population sizes at both the start and the end of the competition allows us to determine how many generations each of the populations experienced over the course of the competition. Because the populations grow by binary fission, and there is very little death over the course of the competition assay, we take the ratio of the final population size to the initial population size, and take the base-2 log of this ratio to get the number of generations. We then calculate fitness as the ratio of the number of generations of one type to the other. If they have the same number of generations, the fitness is 1; if the one we’re interested in has fewer generations than the one we’re measuring it against, it will have a fitness between 0 and 1; if the one we’re interested in has more generations, it will have a fitness greater than 1.
In the figure above, let’s say that the first day involved a dilution factor of 10^4, and the second day 10^6. The number of generations of the blue phenotype is therefore log2((8 * 10^6) / (6 * 10^4)) = 7.058894… The number of generations of the yellow phenotype is log2((3 * 10^6) / (6 * 10^4)) = 5.643856… And therefore the fitness of the blue phenotype relative to the yellow phenotype is 7.058894… / 5.643856… = 1.250722… That is, the blue population goes through about 5 generations in the time it takes the yellow population to go through 4 generations. If we were instead interested in the fitness of the yellow population, we’d invert the division, and end up with a fitness of 0.7995383… (Footnote 2)
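For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation described above. The function and variable names are my own illustration, not part of any lab software, and the colony counts (6 of each type at the start, 8 blue and 3 yellow at the end) are the ones used in the worked example.

```python
import math


def generations(initial_colonies, final_colonies, initial_dilution, final_dilution):
    """Doublings between the two samples, assuming binary fission and negligible death."""
    initial_size = initial_colonies * initial_dilution
    final_size = final_colonies * final_dilution
    return math.log2(final_size / initial_size)


def relative_fitness(focal_generations, reference_generations):
    """Ratio of generations of the type we care about to the type we compare against."""
    return focal_generations / reference_generations


# Dilution factors from the worked example: 10^4 at the start, 10^6 at the end.
blue = generations(6, 8, 10**4, 10**6)
yellow = generations(6, 3, 10**4, 10**6)

print(round(blue, 6))                            # 7.058894
print(round(yellow, 6))                          # 5.643856
print(round(relative_fitness(blue, yellow), 6))  # 1.250722
print(round(relative_fitness(yellow, blue), 6))  # 0.799538
```

Swapping the arguments to `relative_fitness` is the "invert the division" step: the two values are reciprocals of each other.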
Because we’re looking at competitions across multiple generations (typically between 6 and 7 for the simple ones, around 20 for higher-precision estimates, and more like 40 when we have to be sure two different versions of a bacterium are competitively equivalent), this gets around the issue of whether an organism having more, but inferior, offspring might be bad for its genetic lineage in the long run. Our way of calculating fitness looks at the fate of a lineage across generations, so all of those different components are integrated together.
Genetic Algorithms Fitness
In the world of computational science, many of the practical limitations talked about in biology fall away. We can have perfect information about all the organisms within a population, and we don’t have a lot of error in measuring it. The computational version of fitness can be defined in quite a few ways. One relatively common approach people take is to assign a fitness function to a problem. Because many computational problems have known optimal solutions, researchers can score population members on how close they come to the optimal solution.
There are also some confusing points of terminology: some people focus on costs and set their algorithms to minimize them, while others focus on fitness and set their algorithms to maximize it. As a biologist, I was completely baffled by the former approach when I first found out about it, and thus I was very confused by this xkcd until I read the alt text: https://xkcd.com/534/
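To make the fitness-versus-cost distinction concrete, here is a small Python sketch. Everything in it is a made-up toy (the target bit string, the population, and the function names are hypothetical), but it shows a score based on closeness to a known optimum, written once as a cost to minimize and once as a fitness to maximize.

```python
# Hypothetical known optimal solution for a toy problem.
TARGET = [1, 0, 1, 1, 0, 1, 0, 0]


def cost(candidate):
    """Number of positions that differ from the target (lower is better)."""
    return sum(c != t for c, t in zip(candidate, TARGET))


def fitness(candidate):
    """Number of positions that match the target (higher is better)."""
    return len(TARGET) - cost(candidate)


# A made-up population of candidate solutions.
population = [
    [1, 0, 1, 1, 0, 1, 0, 0],  # matches the target exactly
    [1, 1, 1, 1, 0, 1, 0, 1],  # two mismatches
    [0, 1, 0, 0, 1, 0, 1, 1],  # every position wrong
]

best_by_fitness = max(population, key=fitness)
best_by_cost = min(population, key=cost)

# Both framings pick out the same individual.
print(best_by_fitness == best_by_cost)
```

The two functions carry the same information; only the direction of "better" flips, which is the source of the terminology confusion.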
In Avida, we almost never have known optimal solutions, so we define fitness very differently. Fitness is actually one of the pieces of data automatically generated by the software, and included in the default output file. In this system, we define fitness as being merit divided by gestation time. In most environments, organisms can gain merit by doing whatever we are rewarding them for doing: performing logic tasks, navigating paths, coordinating behaviors, etc. Merit is awarded as additional CPU cycles, so organisms with higher merit can execute more of their instructions per unit time. This makes them run faster in everything they do, both the tasks, and the fundamental job of copying themselves. Gestation time measures the number of CPU cycles required for an organism to successfully copy itself. If two organisms are performing the same tasks, but one copies itself faster, it has an advantage in getting more copies of its genome into the population in a given amount of time.
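As a rough illustration of the merit-over-gestation-time definition, here is a tiny Python sketch. This is not Avida's source code, and the merit and gestation values are invented for the example; it only shows how the ratio trades off task performance against copying speed.

```python
def avida_style_fitness(merit, gestation_time):
    """Fitness as merit divided by gestation time (illustrative, not Avida's code)."""
    if gestation_time <= 0:
        raise ValueError("gestation_time must be positive")
    return merit / gestation_time


# Invented numbers: one organism copies itself quickly but earns no bonus merit,
# the other copies more slowly but has doubled its merit by performing a task.
fast_copier = avida_style_fitness(merit=1.0, gestation_time=400)
task_performer = avida_style_fitness(merit=2.0, gestation_time=500)

print(fast_copier < task_performer)  # True
```

Under these assumed numbers, the slower organism still comes out ahead because its extra merit more than makes up for the longer gestation time.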
There is a further pressure to replicate quickly in Avida: survival. In most setups, a new organism being born in a location kills whatever organism previously existed there. As such, organisms that replicate more slowly than their neighbors tend to die without leaving offspring. Slow replication gets eliminated from the population unless it comes with high enough merit that the organism can blaze through a bunch of instructions in the length of time its neighbors go through a few. Our concept of fitness in Avida is much closer to what it is in biology than to what it is in most genetic algorithms research.
Relative vs. Absolute
It’s important to realize that all of these things tell us about relative fitness. Other groups sometimes prefer to use measurements of absolute fitness. In microbes, this is most often in terms of the maximum growth rate of their organisms, which is one of the key (but not only) components of fitness. These two approaches tell us different things. Measurements of absolute fitness are very important for figuring out what will happen to a population as a whole – will it expand, contract, or stay stable? Relative fitness, on the other hand, is useful for telling us what will happen within a given population – will these particular genetic variants become more or less frequent within the population, regardless of whether the population itself is growing or contracting? As with many things in science, what questions you have determine how you should go about trying to find the answer.
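Here is a small Python sketch of how the two views can disagree. It assumes simple exponential change at constant per-hour rates, and all the numbers are invented: both variants are declining (an absolute-fitness question), yet one of them is still becoming more common within the population (a relative-fitness question).

```python
import math


def project(initial_size, rate, hours):
    """Population size after exponential change at a constant per-hour rate."""
    return initial_size * math.exp(rate * hours)


def trajectory(times, a0=100.0, b0=900.0, rate_a=-0.1, rate_b=-0.3):
    """Total population size and frequency of variant A at each time point.

    The starting sizes and rates are hypothetical; both rates are negative,
    so both variants shrink, but A shrinks more slowly than B.
    """
    rows = []
    for t in times:
        a = project(a0, rate_a, t)
        b = project(b0, rate_b, t)
        rows.append((t, a + b, a / (a + b)))
    return rows


for t, total, freq_a in trajectory([0, 5, 10]):
    print(f"t={t:>2}  total={total:8.1f}  freq_A={freq_a:.3f}")
```

In this toy case the total population contracts at every step while the frequency of variant A rises, so the answer you get depends on which of the two questions you asked.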
1: Chemistry interlude: The way we tell our colonies apart is actually all dependent on pH. Those bacteria that cannot use the sugar are forced to use the amino acids present, and form dark red colonies due to a dye on the plate; those that can use the sugar will preferentially use the sugar instead of the amino acids, and form pink colonies (which in the literature are somewhat confusingly called white colonies). The dye changes color because the bacteria using the amino acids end up creating ammonia (thereby changing the local pH and the color of the dye). I suspect, though I’ve never actually researched it, that the reason why the colonies we call white are actually pink is that when the bacteria run out of the sugar they turn to the amino acids as the only things left, and start building up some ammonia as well. They don’t end up making as much of it, thus the color change is much more subtle, and they end up light pink instead of dark red.
2: The number shown in this fitness calculation is far too precise for the data involved in calculating it. With so few colonies on each plate, we can’t be confident in that many decimal places; we would need something like 20x as many colonies on each plate before that much precision started to mean anything. It works as an illustration, but shouldn’t be taken too literally.