Rachel Thomas

Living through a pandemic, we all know by now how important the reproduction ratio, *R*, is to understanding the progress of a disease. But what do you need to know to calculate it?

How long does it take until one infection generates another?

Knowing *R*, the average number of people infected by a single infected person, helps us understand what is happening with the disease. If *R*>1 then the epidemic will grow, *R*=1 means we are plateauing, and *R*<1 means that the epidemic will decline. The *R* number also gives an intuitive way of predicting the strength of intervention needed to stop an epidemic.

But *R* is not easy to measure in practice, particularly during a live epidemic. It is much easier to estimate the growth rate of the disease directly from available data, and then use this to estimate the value of *R*. To do that, however, you need to understand another important aspect of the disease, the *generation time*: the time from when one person is infected to when they go on to infect another person.
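As a rough illustration of that first step, the growth rate can be estimated as the slope of a straight line fitted to the logarithm of daily case counts. The Python sketch below is a minimal example under strong assumptions, not the method used by any particular modelling group: the function name `estimate_growth_rate` is hypothetical, the case counts are made up, and it assumes clean exponential growth with no reporting delays or weekday effects.

```python
import math


def estimate_growth_rate(daily_cases):
    """Least-squares slope of log(cases) against day number, in units of 1/day.

    Assumes every count is positive and that growth is roughly exponential.
    """
    days = list(range(len(daily_cases)))
    logs = [math.log(count) for count in daily_cases]
    mean_day = sum(days) / len(days)
    mean_log = sum(logs) / len(logs)
    covariance = sum((d - mean_day) * (y - mean_log) for d, y in zip(days, logs))
    spread = sum((d - mean_day) ** 2 for d in days)
    return covariance / spread


# Hypothetical counts, NOT real data: exactly 100 * exp(0.1 * day) for 14 days.
synthetic_cases = [100 * math.exp(0.1 * day) for day in range(14)]

growth_rate = estimate_growth_rate(synthetic_cases)
print(f"estimated growth rate: {growth_rate:.3f} per day")
print(f"doubling time: {math.log(2) / growth_rate:.1f} days")
```

A positive slope corresponds to a growing epidemic and a negative slope to a declining one. Real data are noisier, so in practice the estimate would come with uncertainty that this sketch ignores.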

The generation time

The generation time isn't one single number for a particular virus. Instead it is defined on pairs of people: the infector and the infectee they go on to infect. The interval of time between when one person was infected (the infector) and when they went on to infect someone else (the infectee) will be different for different pairs of people.

The range of values the generation time takes can be well described by a probability distribution (epidemiologists often use a *gamma distribution*, which is commonly used to describe waiting times between events). Then statistical measures, such as the mean and variance of the generation times, can be used as parameters within our models of how the virus will behave.
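A gamma distribution is usually specified by a shape and a scale rather than by its mean and variance, so the two need to be converted before they can be used in a model. The Python sketch below shows that conversion and draws some sample generation times. The function names are hypothetical, and the mean of 4 days and variance of 4 are illustrative assumptions rather than estimates for any real disease.

```python
import random
import statistics


def gamma_shape_scale(mean, variance):
    """Convert a mean and variance into gamma shape and scale parameters.

    For a gamma distribution, mean = shape * scale and
    variance = shape * scale**2.
    """
    shape = mean ** 2 / variance
    scale = variance / mean
    return shape, scale


def sample_generation_times(mean, variance, n_samples, seed=0):
    """Draw n_samples generation times (in days) from a gamma distribution."""
    rng = random.Random(seed)
    shape, scale = gamma_shape_scale(mean, variance)
    return [rng.gammavariate(shape, scale) for _ in range(n_samples)]


# Illustrative assumption: mean 4 days, variance 4 days squared.
samples = sample_generation_times(mean=4.0, variance=4.0, n_samples=20_000)
print(f"sample mean: {statistics.mean(samples):.2f} days")
print(f"sample variance: {statistics.variance(samples):.2f}")
print(f"shortest sampled generation time: {min(samples):.2f} days")
```

Every sampled value is positive, which is one reason the gamma distribution suits a quantity like the generation time.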

This all sounds very promising but there's a hitch: we can't directly measure these generation times. We almost never know the moment, or even the day, that someone becomes infected. What is usually directly observed is when someone develops symptoms, or someone tests positive.

An infector (blue) and infectee (purple). The red interval represents the generation time and the orange interval the serial interval.

Estimating generation time

There is a related concept in epidemiology called the *serial interval*: the time from when an infector develops symptoms to when the infectee develops symptoms. The two concepts are so closely intertwined that the mean of the serial interval is likely to be the same as the mean of the generation time. But the distributions of these two measures will certainly be different. One obvious difference is that the serial interval can be negative, if the infector takes so much longer to develop symptoms than the infectee that the infectee's symptoms appear first, while the generation time must always be a positive number. And as we now know for COVID-19, some infectious people never go on to develop symptoms at all.
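A small simulation can make the difference between the two intervals concrete. The Python sketch below assumes, purely for illustration, that generation times and incubation periods (the time from infection to symptoms) follow gamma distributions with made-up parameters, and that they are independent of each other. Under those assumptions the serial interval is the generation time plus the infectee's incubation period minus the infector's.

```python
import random
import statistics


def simulate_intervals(n_pairs, seed=1):
    """Simulate generation times and serial intervals for infector/infectee pairs.

    All distributions are illustrative assumptions:
      generation time   ~ gamma(shape=4.0, scale=1.0), mean 4 days
      incubation period ~ gamma(shape=2.5, scale=2.0), mean 5 days
    and the three draws for each pair are assumed independent.
    """
    rng = random.Random(seed)
    generation_times = []
    serial_intervals = []
    for _ in range(n_pairs):
        generation_time = rng.gammavariate(4.0, 1.0)
        incubation_infector = rng.gammavariate(2.5, 2.0)
        incubation_infectee = rng.gammavariate(2.5, 2.0)
        # Symptom onset times, measured from the infector's infection at day 0.
        onset_infector = incubation_infector
        onset_infectee = generation_time + incubation_infectee
        generation_times.append(generation_time)
        serial_intervals.append(onset_infectee - onset_infector)
    return generation_times, serial_intervals


gt, si = simulate_intervals(20_000)
negative_share = sum(s < 0 for s in si) / len(si)
print(f"mean generation time: {statistics.mean(gt):.2f} days")
print(f"mean serial interval: {statistics.mean(si):.2f} days")
print(f"variance of generation time: {statistics.variance(gt):.1f}")
print(f"variance of serial interval: {statistics.variance(si):.1f}")
print(f"share of negative serial intervals: {negative_share:.1%}")
```

In this toy model the two means agree closely, but the serial intervals are far more spread out and some are negative, while every generation time is positive. If incubation periods and generation times are correlated, as they may well be in reality, the picture could differ.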

An infector (blue) and infectee (purple). In this example the serial interval is negative because the infectee develops symptoms before the infector.

The other problem is that it's very rare to be able to accurately identify the pairs of infectors and infectees. But where it's possible to trace cases accurately there is an opportunity to at least observe the serial intervals and estimate the generation time. An example is the detailed data obtained for certain clusters of COVID-19 infections in China and Singapore in early 2020 that enabled scientists to calculate the distribution of the serial intervals. From this, and from detailed information on contacts between the infectors and infectees, and assumptions about the incubation time of the virus, scientists estimate that the mean generation time for COVID-19 is between 3 and 5 days in those settings.

Why is it important?

Understanding the generation time of COVID-19 has real practical importance now because it affects the estimates of the reproduction ratio *R*. It is relatively straightforward to measure the growth rate directly from disease data, and then use this to estimate the value of *R*. "[But] how you translate between *R* and the growth rate depends on your assumptions about generation time," says Julia Gog, Professor of Mathematical Biology at the University of Cambridge and a founding member of the JUNIPER modelling consortium.

The relationship between the value of *R* and the growth rate, and the role generation time plays in it, isn't straightforward. You can see a more mathematical explanation of why in our article about the growth rate. But this relationship can be simplified when you are using specific probability distributions to describe the generation time.

For example, if the generation time follows a gamma distribution as we mentioned above, the relationship is

$$R = \left(1 + \frac{r\sigma^2}{\mu}\right)^{\mu^2/\sigma^2}.$$

Here $r$ is the growth rate, and $\mu$ and $\sigma^2$ are parameters describing the shape of the generation time distribution: $\mu$ is the mean and $\sigma^2$ is the variance. You can see from this equation that if the growth rate is zero, $r=0$, then $R=1$ no matter what is happening with the generation time. Similarly, if the growth rate is positive, $r>0$, then $R>1$, and if the growth rate is negative then $R<1$, regardless of the generation time.
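The relationship for a gamma-distributed generation time is easy to evaluate numerically. The Python sketch below uses the standard result for this case, stated in the docstring. The function name `reproduction_ratio` and the example values (a mean of 4 days and a variance of 4) are illustrative assumptions.

```python
def reproduction_ratio(growth_rate, mean, variance):
    """R for a gamma-distributed generation time.

    Uses R = (1 + r * variance / mean) ** (mean**2 / variance), where r is
    the growth rate per day and mean and variance describe the generation
    time in days. The formula only makes sense when the bracket is positive.
    """
    base = 1 + growth_rate * variance / mean
    if base <= 0:
        raise ValueError("need 1 + growth_rate * variance / mean > 0")
    return base ** (mean ** 2 / variance)


# Illustrative assumption: generation time with mean 4 days and variance 4.
for growth_rate in (-0.1, 0.0, 0.1):
    r_value = reproduction_ratio(growth_rate, mean=4.0, variance=4.0)
    print(f"growth rate {growth_rate:+.1f} per day -> R = {r_value:.2f}")
```

With these example values a growth rate of 0.1 per day gives an *R* of about 1.46, a growth rate of zero gives exactly 1, and a growth rate of -0.1 per day gives about 0.66.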

Although the broad direction of *R* is unaffected by generation time, the size of *R* will be affected by our assumptions about the generation time and this could impact our understanding of how the disease is behaving.

What does a change in generation time mean?

The generation time distribution can't be directly measured, so it is inevitable that researchers will need to make assumptions. Therefore it's really important to know what impact a wrong assumption would have on understanding the growth of a disease.

This figure shows how different assumptions on generation time affect *R*.

The graph above shows how a different assumption about the generation time (GT) can affect our estimate of *R*. We can see that if the growth rate is zero, then *R*=1 for all the different distributions of generation time. But for any non-zero growth rate, there are significant differences in the resulting estimates for *R* as you vary the mean generation time.

If the true generation time is longer than we assume, then the true values of *R* become more extreme, pushed higher for positive growth rates and lower for negative growth rates. For example, in order to account for any fixed positive growth rate, if the generation time is longer than assumed so generations are less frequent, then each case needs to infect more people on average. "An intuitive way to remember this is if the generations are less frequent, each generation needs to do more work," says Gog.
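The effect of the assumed generation time can be checked with a few lines of Python. The sketch below evaluates the gamma-distribution relationship between *R* and the growth rate for mean generation times of 3, 4 and 5 days. It assumes, purely for illustration, that the shape of the distribution stays fixed at 4 (so the variance is the mean squared divided by 4); the function names are hypothetical.

```python
def reproduction_ratio(growth_rate, mean, variance):
    """R = (1 + r * variance / mean) ** (mean**2 / variance) for a gamma
    generation time with the given mean and variance (days)."""
    base = 1 + growth_rate * variance / mean
    if base <= 0:
        raise ValueError("need 1 + growth_rate * variance / mean > 0")
    return base ** (mean ** 2 / variance)


ASSUMED_SHAPE = 4.0  # illustrative assumption: variance = mean**2 / 4


def r_for_mean(growth_rate, mean, shape=ASSUMED_SHAPE):
    """R for a given mean generation time, keeping the gamma shape fixed."""
    return reproduction_ratio(growth_rate, mean, mean ** 2 / shape)


for growth_rate in (-0.1, 0.0, 0.1):
    row = ", ".join(
        f"mean {mean:.0f} days: R = {r_for_mean(growth_rate, mean):.2f}"
        for mean in (3.0, 4.0, 5.0)
    )
    print(f"r = {growth_rate:+.1f} per day -> {row}")
```

Under these assumptions a growth rate of 0.1 per day corresponds to *R* of roughly 1.34, 1.46 and 1.60 as the mean generation time goes from 3 to 5 days, while a growth rate of -0.1 per day corresponds to roughly 0.73, 0.66 and 0.59. A longer generation time pushes *R* further from 1 in both directions.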

Conversely, a shorter generation time corresponds to values of *R* that are closer to 1. And this, Gog says, could change how we evaluate the possible impact of interventions to control the disease. If the true generation time is shorter than we have assumed, then the true value of *R* will be closer to 1 than we estimate, and even a small change in transmission might push the value over or under the critical threshold of *R*=1.

"For a growing epidemic the interventions that just reduce all transmission [such as working from home or a lockdown] might be more effective than we expect," says Gog. "Overestimating *R* means you make [the epidemic] seem harder to control than it is." But she has a note of caution: this goes both ways. "If cases are decreasing our *R* will be an underestimate and thus it will be less safe than we think to lift restrictions."

The other impact of the true generation time being shorter than we think is that interventions that depend on the progression of an infection, such as contact tracing or medical treatments that need to be used early, could be less effective. Exactly how much less effective will depend on the interplay between when an infected person becomes symptomatic, tests positive and is infectious.
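Returning to interventions that reduce all transmission, a short calculation shows how much an overestimated generation time can matter. If an intervention scales all transmission down by a fixed fraction and leaves the generation time distribution unchanged (a simplifying assumption), then *R* falls by the same fraction, so bringing *R* down to 1 needs a reduction of 1 - 1/*R*. The Python sketch below compares an assumed mean generation time of 5 days with a true one of 3 days for the same observed growth rate. All numbers and function names are illustrative assumptions.

```python
def reproduction_ratio(growth_rate, mean, variance):
    """R = (1 + r * variance / mean) ** (mean**2 / variance) for a gamma
    generation time with the given mean and variance (days)."""
    base = 1 + growth_rate * variance / mean
    if base <= 0:
        raise ValueError("need 1 + growth_rate * variance / mean > 0")
    return base ** (mean ** 2 / variance)


def required_reduction(r_value):
    """Fraction by which all transmission must fall to bring R down to 1.

    Assumes the intervention scales R proportionally; returns 0 if R <= 1.
    """
    return max(0.0, 1 - 1 / r_value)


growth_rate = 0.1  # observed growth rate per day (made-up value)
shape = 4.0        # illustrative assumption: variance = mean**2 / shape

r_assumed = reproduction_ratio(growth_rate, 5.0, 5.0 ** 2 / shape)
r_true = reproduction_ratio(growth_rate, 3.0, 3.0 ** 2 / shape)

for label, r_value in (("assumed (mean 5 days)", r_assumed), ("true (mean 3 days)", r_true)):
    print(f"{label}: R = {r_value:.2f}, "
          f"reduction needed = {required_reduction(r_value):.1%}")
```

In this made-up scenario the assumed generation time suggests that transmission must fall by about 38% to stop the epidemic growing, when about 25% would do. An intervention that cuts transmission by 30% would look insufficient on paper but would in fact bring *R* below 1.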

Thus, getting good estimates of the generation time, and knowing how an inaccurate estimate will affect our calculation of *R*, is essential when it comes to understanding where we are in the pandemic, and predicting our pandemic future.

About this article

This article was produced with Julia Gog, and is partly based on Julia Gog's talk at the Understanding the generation time for COVID-19 event that took place at the Newton Gateway for Mathematics in July 2021. You can read our report on the event here.

Gog is a founding member of the JUNIPER modelling consortium, and a member of SPI-M, a modelling group which feeds its results into the Scientific Advisory Group for Emergencies (SAGE), as well as of the steering committee of a national consortium, led by the Royal Society, to deal with the COVID-19 pandemic.

Rachel Thomas is Editor of *Plus*.

*This article was produced as part of our collaborations with JUNIPER, the Joint UNIversity Pandemic and Epidemic Response modelling consortium, and the Isaac Newton Institute for Mathematical Sciences (INI).*

*JUNIPER comprises academics from the universities of Cambridge, Warwick, Bristol, Exeter, Oxford, Manchester, and Lancaster, who are using a range of mathematical and statistical techniques to address pressing questions about the control of COVID-19. You can see more content produced with JUNIPER here.*

*The INI is an international research centre and our neighbour here on the University of Cambridge's maths campus. It attracts leading mathematical scientists from all over the world, and is open to all. Visit www.newton.ac.uk to find out more.*