Rt Comparison

Data and required packages

We attach some required packages

library(ggplot2)
library(dplyr)
library(grid)
library(EpiEstim)
library(EpiInvert)
library(EpiNow2)

Loading stored data on COVID-19 daily incidence up to 2022-05-05 for France, Germany, the USA and the UK:

data(incidence)
tail(incidence)
#>           date   FRA    DEU    USA    UK
#> 828 2022-04-30 49482  11718  23349     0
#> 829 2022-05-01 36726   4032  16153     0
#> 830 2022-05-02  8737 113522  81644    32
#> 831 2022-05-03 67017 106631  61743 35518
#> 832 2022-05-04 47925  96167 114308 16924
#> 833 2022-05-05 44225  85073  72158 12460

Loading some festive days for the same countries:

data(festives)
head(festives)
#>          USA        DEU        FRA         UK
#> 1 2020-01-01 2020-01-01 2020-01-01 2020-01-01
#> 2 2020-01-20 2020-04-10 2020-04-10 2020-04-10
#> 3 2020-02-17 2020-04-13 2020-04-13 2020-04-13
#> 4 2020-05-25 2020-05-01 2020-05-01 2020-05-08
#> 5 2020-06-21 2020-05-21 2020-05-08 2020-05-25
#> 6 2020-07-03 2020-06-01 2020-05-21 2020-06-21

Loading the serial interval proposed in Nishiura et al. which follows a log-normal distribution with a median of 4 days, a mean of 4.7 days and a standard deviation of 2.9 days:

si_distrib <- read.csv(url("https://www.ctim.es/covid19/nishiura.csv"),header=FALSE)
head(si_distrib)
#>           V1
#> 1 0.00000000
#> 2 0.04169583
#> 3 0.16207936
#> 4 0.20316240
#> 5 0.17512594
#> 6 0.13039113

In what follows we will estimate Rt using the different methods in the case of France. For the other countries, we can do it in the same way, changing, in the script ‘FRA’ by ‘DEU’, ‘USA’ or ‘UK’.

Rt estimation using the different methods

EpiEstim (EE)

We prepare the incidence data for EpiEstim:

  FRA_incid <- data.frame(
    dates = as.Date(incidence$date),
    I = incidence$FRA
  )
  FRA_incid$I[incidence$FRA < 0] <- 0

We compute the EpiEstim Rt estimate

  N<-length(FRA_incid$I)
  FRA_EpiEstim <- estimate_R(FRA_incid,
                            method="non_parametric_si",
                            config = make_config(list(
                            t_start = seq(30, N-6),
                            t_end = seq(30+6, N),
                            si_distr = si_distrib$V1))
  )
  tail(FRA_EpiEstim$R$Mean)
  [1] 0.7337554 0.7319374 0.7672001 0.7490753 0.7473938 0.7532675

Wallinga - Teunis (WT)

We compute the Wallinga-Teunis Rt estimate:

  FRA_Wall_Teun <- wallinga_teunis(FRA_incid,
                                   method="non_parametric_si",
                                   config = list(
                                   t_start = seq(30, N-6),
                                   t_end = seq(30+6, N),
                                   si_distr =si_distrib$V1,
                                   n_sim =0))
  tail(FRA_Wall_Teun$R$`Mean(R)`)
  [1] 0.6699915 0.6393988 0.6294066 0.5260823 0.4289424 0.3322501

EpiInvert (EI)

We compute the EpiInvert Rt estimate:

  FRA_EpiInvert <- EpiInvert(incidence$FRA,
                             "2022-05-05",
                             festives$FRA,
                             select_params(list(si_distr =si_distrib$V1,
                                                max_time_interval=9999)))
 tail(FRA_EpiInvert$Rt)
  [1] 0.7637433 0.7652671 0.7659675 0.7661722 0.7661985 0.7661985

EpiNow2 (EN)

EpiNow2 does not use the serial interval. Instead, it uses the generation time and the incidence is computed backward using distributions for the incubation period and reporting delay. In this way, EpiNow2 models the underlying latent process of infections. It is highly configurable and includes as an option a forecast of Rt and the incidence. We do not use the forecast option in this study. The reporting delay is country-depending and we have not information about it for the countries under study, so for this study we will not use the delay in the observations introduced by the incubation period and the reporting delay. We use the generative model R_t = GP which leads to a large speed up. For comparison purposes, we use a generation time distribution with the same mean and standard deviation that the serial interval used above:

generation_time <- list(
    mean = 4.7,
    mean_sd = 0.01,
    sd = 2.9,
    sd_sd = 0.01,
    max=15
    )

We prepare the incidence data to call to EpiNow2

  reported_cases <- data.frame(
    date = as.Date(incidence$date),
    confirm = incidence$FRA
  )
  reported_cases$confirm[reported_cases$confirm < 0] <- 0

We execute EpiNow2

FRA_EpiNow2 <- epinow(reported_cases = reported_cases,
                    generation_time = generation_time,
                    rt = rt_opts(gp_on="R0"),
                    horizon=0,
                    stan = stan_opts(cores = 4))

tail(summary(FRA_EpiNow2, type = "parameters", params = "R")$mean)
[1] 0.831922024 0.839024173 0.84640193 0.854011864 0.8618104 0.869754617

Note: We thank Prof. S.Abbott for his comments and suggestions on this study and on the use of EpiNow2.

Comparison results

Optimal shift between the different Rt estimates

First, we will compare the Rt estimations provided by the different methods by estimating the optimal shift between the time series in each pair of methods. We assume that the Rt estimates are defined in the time interval [0,T] where the value in T corresponds to the last value of Rt provided by the method. Let R and R’ be two estimations of Rt provided by two methods. We define the optimal shift, s(R,R’), by minimizing the average error d(R,R’:s) :

\((1) \ \ \ \ s(R,R^{\prime})=\underset{s\in\lbrack-25,25]}{\arg\min} \ \ d(R,R^{\prime}\!:\!s)\equiv\sum_{t=30}^{T-30}\frac{|R(t)-R^{\prime}(t+s)|+|R(t-s)-R^{\prime}(t)|}{2(T-60)}\)

The value 30 is introduced in the formula to avoid boundary effects in the comparison. The lower the value of d(R,R’:s) the better the match between R and R’ after a shift given by s. Notice that the optimal shift is anti-symmetric, that is s(R,R’)=-s(R’,R) and d(R,R’:s) is symmetric, that is d(R,R’:s)=d(R’,R:s). To evaluate R(t-s) or R’(t+s) we use linear interpolation.

In the following four figures, we show, for the different countries, the plot of Rt obtained by the methods EE (EpiEstim), EI (EpiInvert), WT (Wallinga-Teunis) and EN (EpiNow2) after applying the optimal shift estimated between EI and the other methods. That is, we compute s(EI,EE), s(EI,WT) and s(EI,EN).

We observe a quite good agreement between the different Rt estimates after applying the optimal shift. The obtained shifted Rt estimates that we show in the above figures can be downloaded from:

  FRA_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/FRA_Rt_shift.csv"),sep=";")
  DEU_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/DEU_Rt_shift.csv"),sep=";")
  USA_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/USA_Rt_shift.csv"),sep=";")
  UK_Rt_shift <- read.csv(url("https://www.ctim.es/covid19/UK_Rt_shift.csv"),sep=";")

Next, we present two tables with the mean and standard deviation of s(R,R’) and d(R,R’,s(R,R’)) for the different countries.

Table 1 : mean and standard deviation of s(R,R’) for the different countries

	EI	EE	WT	EN
EI	0 \(\pm\) 0	7.36 \(\pm\) 0.02	3.02 \(\pm\) 0.19	4.80 \(\pm\) 0.44
EE	-7.36 \(\pm\) 0.02	0 \(\pm\) 0	-4.32 \(\pm\) 0.03	-2.97 \(\pm\) 0.52
WT	-3.02 \(\pm\) 0.19	4.32 \(\pm\) 0.03	0 \(\pm\) 0	1.55 \(\pm\) 0.13
EN	-4.80 \(\pm\) 0.44	2.97 \(\pm\) 0.52	-1.55 \(\pm\) 0.13	0 \(\pm\) 0

Table 2 : mean and standard deviation of d(R,R’,s(R,R’)) for the different countries

	EI	EE	WT	EN
EI	0 \(\pm\) 0	0.034 \(\pm\) 0.011	0.026 \(\pm\) 0.009	0.060 \(\pm\) 0.022
EE	0.034 \(\pm\) 0.011	0 \(\pm\) 0	0.025 \(\pm\) 0.007	0.061 \(\pm\) 0.018
WT	0.026 \(\pm\) 0.009	0.025 \(\pm\) 0.007	0 \(\pm\) 0	0.046 \(\pm\) 0.015
EN	0.060 \(\pm\) 0.022	0.061 \(\pm\) 0.018	0.046 \(\pm\) 0.015	0 \(\pm\) 0

In PNAS (2021), we show that the expected value of s(EI,EE) is about 3 +m where m is closed to a central value (median or mean) of the serial interval distribution. Since the used serial interval has a median of 4 and a mean of 4.7, the expected value of s(EI,EE) is between 7 and 7.7. Notice that, as shown in table 1, the obtained mean value of s(EI,EE) for the four considered countries is 7.36, which confirms our expectations. On the other hand, in PNAS (2021), we also show that if R and R’ are the Rt estimations for WT and EE respectively, then

\(R(t)=\sum_k\Phi(k)R'(t+k).\)

The second term of this equation can be approximated in the following way:

\(\sum_k\Phi(k)R'(t+k) \approx R'(t+m)\)

where m is a central value of the serial interval \(\Phi\). Therefore, the expected value of s(WT,EE) is between 4 and 4.7 which is confirmed by the obtained value, 4.32, shown in table 1. On the other hand, as it is also expected, we have that, using the values of table 1, s(EI,EE)=7.36 is closed to s(EI,WT)+s(WT,EE)=3.02+4.32=7.34. Although the method EN is quite different from the other ones, the stability of the shift value between EN and the other methods for the different countries suggests that there must be a theoretical reason to justify such a shift.

In the following plot we show the optimal shift s(R,R’) versus the average error d(R,R’,s(R,R’)) for any pair of methods and the different countries. As stated numerically in tables 1 and 2, we can observe that the value of s(R,R’) is very stable for the different countries and that the largest values of d(R,R’,s(R,R’)) corresponds to the comparison of EN with the other methods and the lowest values to the comparison of WT with with the other methods.

Regularity of the different Rt estimates

To measure the regularity of the different Rt estimates, we use the following averages of the absolute values of the first and second derivatives of Rt.

\((2) \ \ \ \ \text{FIRST DERIVATIVE AVERAGE:} \ \ ||dR/dt||=\sum_{t=30}^{T-30}\frac{|R(t+h)-R(t-h)|}{2(T-60)}\)

\((3) \ \ \ \ \text{SECOND DERIVATIVE AVERAGE:} \ \ ||d^2R/dt^2||=\sum_{t=30}^{T-30}\frac{|R(t+h)-2R(t)+R(t-h)|}{T-60}\)

In the following plot we compare the regularity of the different Rt estimates. We observe that by far, EN provides the smoother estimate, EE provides the most irregular estimates and EI and WT provide similar results. Concerning the regularity we have to take into account that we can tune the regularity of the methods by changing the method’s parameters. For instance, in the case of WT and EE we can change the number of weeks to average the incidence and in the case of EI there exists a parameter to fit the regularity in the estimation. In this comparison we use the default parameters provided by each method.

Comparison of the Rt estimations when we approach the current time

Next, we study in more detail the behavior of the shifted Rt estimates when we approach the current time. In the next figure we show the shifted Rt estimates after March 1, 2022 for the different countries. We observe, as expected, that the precision of WT deteriorates as the current time approaches. The reason is that QT is designed to calculate Rt backward. To calculate Rt at time t it uses the incidence data up to time t+max(support(\(\Phi\))) (where support(\(\Phi\))=\(\{k: \Phi(k)>0\}\)). Therefore, the quality of the estimate deteriorates as we approach the current time. Regarding the methods EI, EE and EN, the estimation can be quite different in the three cases. In the case of Germany, we can observe that the estimates of EE and EI are decreasing until reaching the last value of EE , but then the estimate of EI grows in the following days approaching the value of EN. This may be due, as showed in PNAS, 2021, to the fact that EI reacts before EE to a trend change. As can be seen, the estimation of EN is very smooth. A smooth estimate has the advantage that it is more stable but on the other hand we can expect it to take longer to react to a change of trend. Therefore, it is necessary to seek a certain balance between the stability of the estimate and its ability to react to a trend change. In any case, as already mentioned, by modifying the parameters of the methods it is possible to tune the regularity of the estimation.