The influence of the early 2020s crises on the .cz domain

ADAM report 2/2025

Author

Dan Řezníček

Published

March 5, 2025

Abstract

To better understand how the .cz domain fared in the face of the challenges that emerged during the first half of the 2020s, I investigated whether the spread of COVID-19, the inflation rate, and the war in Ukraine were associated with the count of second-level domains under .cz, the count of Czech holders of .cz domains, and traffic under .cz. The analysis shows that all three outcomes were related, in different ways, to changes in the transmission of COVID-19 and to the rise and fall of the inflation rate in the Czech Republic. Domain and holder counts decreased during the first COVID-19 wave but increased during the second wave. Domain counts also decreased when the inflation rate was high. Traffic increased during both COVID-19 waves.

1 Introduction

The COVID-19 pandemic, the increased inflation rate, and the war in Ukraine have influenced many aspects of life in the early 2020s. The goal of this report was to investigate whether any of these crises were also related to trends within the .cz domain. Can we observe associations between monthly counts of domains, counts of holders, and average traffic on one side and the spread of COVID-19, the rise and fall of inflation, and the Russo-Ukrainian war on the other?

COVID-19. One popular notion holds that the COVID-19 pandemic lockdowns forced much of social life into the online space (Zanella et al. 2024). People increasingly turned to the internet for work, education, social life, and entertainment (Feldmann et al. 2021; Priyadarshini et al. 2022). If businesses and individuals moved their operations and interests online, the increased demand for online activity and presence might have influenced the count of domains in the .cz TLD. As early as April 2020, an ADAM report (Andziński et al. 2020) observed an increase in domain registrations connected to the pandemic, although it focused on second-level domains directly related to COVID-19. Here, I focus on the count of all second-level domains, because the pandemic probably prompted the creation of more than just domains explicitly related to COVID-19 (and because more time has passed since, allowing for such an examination). After all, many actors might have been outright forced by COVID-19 to move their interests online. Supporting this assertion, after a period of saturation and stagnation (2017–2019), the count of second-level domains in the .cz TLD started growing again when the COVID-19 pandemic was at its height (between 2020 and 2022). After 2022, the domain count seems to stagnate once again.

Besides the domain count, the count of Czech holders of .cz domains shows a larger increase in 2020 than the year-on-year increases observed between 2017 and 2019.¹ Additionally, traffic kept rising throughout the years (see Figure 3), providing yet another perspective on whether the COVID-19 pandemic influenced trends in the .cz domain. Taken together, we may hypothesize that the spread of COVID-19 influenced the count of domains, the count of Czech domain holders, and traffic; namely:

H1: Transmission of COVID-19 positively associates with the domain count.
H2: Transmission of COVID-19 positively associates with the count of Czech holders.
H3: Transmission of COVID-19 positively associates with QPS.

Inflation. Global supply chain problems, rising and volatile energy and commodity prices, the Russo-Ukrainian war, increased demand, fiscal policies reacting to the COVID-19 pandemic, and companies passing the emergent price shocks on to their customers all resulted in increased inflation (Komárek et al. 2024), which affected the Czech Republic rather harshly compared to other European states (Czech National Bank 2022). The monthly year-on-year inflation rates show an increasing trend from summer 2021, a peak in summer 2022, and a decrease throughout 2023, until pre-2019 rates were reached in 2024 (see Figure 5). Such increased consumer prices surely put pressure on individuals and businesses to reevaluate how they allocate resources. In turn, this pressure could have led to a diminished interest in creating new domains or renewing already registered ones, hypothetically reducing the counts of both domains and holders. (As for traffic, it is less clear whether inflation should be expected to have influenced it; however, this possibility is worth exploring too, without stating the direction of the proposed association.) This is, of course, an assertion that tries to paint a narrative for how inflation might have influenced these variables. The price of a single .cz domain is not high enough to burden most budgets. Moreover, it is not easy to discern whether inflation itself is the culprit or whether it “merely” coincides with other trends and processes that were “actually” responsible for such associations. Even so, the inflation rate is worth investigating because it serves as a useful proxy for a period of socio-economic hardship and its influence on the .cz domain. With that grain of salt, it seems worthwhile to explore the following hypotheses.

H4: Inflation negatively associates with the domain count.
H5: Inflation negatively associates with the count of Czech domain holders.
H6: Inflation associates with QPS.

Russo-Ukrainian war. Lastly, although already mentioned in the previous paragraph as one of the causes of increased inflation, the Russo-Ukrainian war² may have had an influence of its own on the .cz domain. Unlike the inflation crisis, which has already subsided, the conflict is still ongoing. Besides causing geopolitical instability and energy crises, the Russian aggression also brought a large wave of Ukrainian war refugees seeking asylum in the Czech Republic and an information war waged largely online. Consequently, new domains dedicated either to helping Ukrainian war refugees or to disseminating information in support of either side and its allies could have emerged, along with new holders and increased QPS. Another ADAM report (Quiros Segovia, Andziński, and Helebrant 2022) found a small increase in conflict-related domains around the time the invasion began. However, whether a similar increase can be detected over a longer period and in the total domain count remains an open question. Therefore, we can hypothesize that:

H7: The war in Ukraine positively associates with the domain count.
H8: The war in Ukraine positively associates with the count of Czech domain holders.
H9: The war in Ukraine positively associates with QPS.

Of course, the temporal dimension is important for such an analysis because the counts of domains, holders, and queries naturally change over time and depend on their own preceding values. Therefore, I also focused on overall temporal trends and seasonal patterns, which are known to associate with domain counts (see Quiros Segovia and Řezníček 2025).

Interactive features in this report

Note that most charts (or at least the focal ones) in this report are interactive: they can be zoomed and filtered (by clicking or double-clicking items in the legend), and hovering the mouse cursor over data points reveals further information.
Sections marked by “▶ Code” can be expanded to reveal the actual R code (R Core Team 2024) used for producing the statistics, graphs, and tables.
To keep the text lighter, additional information is often collapsed within callout blocks (such as this one) which can be expanded (or collapsed) by clicking on their headers.

Code

library(dplyr)
library(readr)
library(ggplot2)
library(plotly)
library(lubridate)
library(tidyr)
library(knitr)
library(kableExtra)
library(formattable)
library(jsonlite)
library(mgcv)
library(gratia)
library(forecast)
library(forcats)
library(colorspace)
library(countrycode)
library(ggthemes)
library(scales)
library(performance)
library(purrr)
library(GGally)
library(marginaleffects)
theme_set(theme_minimal())

Code

root_endpoint <- "https://stats.nic.cz"

api_endpoint <- function(endpoint, ...) {
  pars <- list(...)
  if (length(pars) == 0) {
    return(paste0(root_endpoint, endpoint))
  }
  pstr <- mapply(function(x, y) paste(x, URLencode(toString(y)), sep = "="),
                 names(pars), pars)
  paste0(root_endpoint, endpoint, "?", paste(pstr, collapse = "&"))
}

fromJSONwithToken <- function(url, token=Sys.getenv("API_TOKEN")) {
    curl_handle <- curl::new_handle()
    if (token != "") {
        curl::handle_setheaders(curl_handle,
            "Authorization" = paste("Bearer", token)
        )
    }
    request <- curl::curl_fetch_memory(url, handle = curl_handle)
    return(jsonlite::fromJSON(rawToChar(request$content)))
}
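As a rough usage sketch of the URL builder defined above: the function is repeated here only so the snippet runs on its own, and the query parameters `zone` and `limit` are hypothetical illustrations rather than documented options of the API.

```r
# Copy of the report's URL builder, repeated so this sketch is self-contained.
root_endpoint <- "https://stats.nic.cz"

api_endpoint <- function(endpoint, ...) {
  pars <- list(...)
  if (length(pars) == 0) {
    return(paste0(root_endpoint, endpoint))
  }
  pstr <- mapply(function(x, y) paste(x, URLencode(toString(y)), sep = "="),
                 names(pars), pars)
  paste0(root_endpoint, endpoint, "?", paste(pstr, collapse = "&"))
}

# Without extra arguments, only the root and the endpoint are concatenated.
url_plain <- api_endpoint("/fred_domains")

# Named arguments become a query string (hypothetical parameter names).
url_query <- api_endpoint("/fred_domains", zone = "cz", limit = 10)

print(url_plain)
print(url_query)
```

Nothing is downloaded here; the sketch only shows how named arguments are turned into `name=value` pairs joined by `&`.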

2 Data

All data used in this report are available as .csv files in this report’s repository and are also shown in full in a callout block in Section 2.7.

2.1 Domains

Data on the domain count were obtained using the /fred_domains endpoint from the ADAM project’s Postgres database and operationalized as domain counts at the end of each month.

Figure 1 illustrates the overall trend in the domain count and the seasonal patterns within individual years: besides the saturation of domain counts (left), we can observe the typical double-peaked seasonal pattern (right).

Code

#data imported and prepared
df_domains <- fromJSON(
  api_endpoint("/fred_domains")
  ) |>
  filter(zone == "cz") |>
  mutate(
    year  = year(ts),
    month = month(ts),
    day   = day(ts),
    ts    = as.Date(ts)
  ) |>
  group_by(year, month) |>
  filter(day == max(day)) |>
  select(ts, year, month, domains) |>
  ungroup() |>
  filter(year >= 2020) |>
  filter(ts < "2024-10-01")

#and saved as a csv
write_csv(df_domains, file = "domains.csv")

Code

df_domains <- read_csv("domains.csv", show_col_types = FALSE)

Code

plot_ly(
  data = df_domains,
  type = "scatter",
  mode = "lines",
  x    = ~ts,
  y    = ~domains
  ) |>
  layout(
    xaxis = list(title = "Date"),
    yaxis = list(title = "Domain count")
  )
plot_ly(
  data       = df_domains,
  type       = "scatter",
  mode       = "lines",
  x          = ~month,
  y          = ~domains,
  color      = ~as.factor(year)
  ) |>
  layout(
    xaxis = list(title = "Month"),
    yaxis = list(title = "Domain count")
  )

(a) Trend.

(b) Seasonality.

Figure 1: Domain count.

2.2 Czech holders

Data on the domain holders were obtained using the /fred_holders_by_cc endpoint. Again, the values were operationalized as counts at the end of each month, and only Czech holders were kept in the analysis. Figure 2 illustrates the overall trend in the holder count and the seasonal patterns within individual years. While the count of holders does not seem to saturate as the count of domains does, a similar seasonal pattern is present, although it is a little less pronounced than in the domain count.

Code

#data imported and prepared
df_holders <- fromJSON(
  api_endpoint("/fred_holders_by_cc")
  ) |>
  filter(zone == "cz" & cc == "CZ") |>
  mutate(
    year  = year(ts),
    month = month(ts),
    day   = day(ts),
    ts    = as.Date(ts)
  ) |>
  group_by(year, month) |>
  filter(day == max(day)) |>
  select(ts, year, month, holders) |>
  ungroup() |>
  filter(year >= 2020) |>
  filter(ts < "2024-10-01")

#and saved as a csv
write_csv(df_holders, file = "holders.csv")

Code

df_holders <- read_csv("holders.csv", show_col_types = FALSE)

Code

plot_ly(
  data = df_holders,
  type = "scatter",
  mode = "lines",
  x    = ~ts,
  y    = ~holders
  ) |>
  layout(
    xaxis = list(title = "Date"),
    yaxis = list(title = "CZ holder count")
  )
plot_ly(
  data       = df_holders,
  type       = "scatter",
  mode       = "lines",
  x          = ~month,
  y          = ~holders,
  color      = ~as.factor(year)
  ) |>
  layout(
    xaxis = list(title = "Month"),
    yaxis = list(title = "CZ holder count")
  )

(a) Trend.

(b) Seasonality.

Figure 2: Czech holders count.

2.3 Traffic

As for the last response variable modeled in this report, traffic data were obtained using the /cz_qps_total_1h endpoint and summarized into monthly mean QPS values. The series begins in September 2020, which excludes some unreliable earlier observations. Figure 3 illustrates the overall monthly trend in traffic and the seasonal patterns within individual years. The trend shows a slightly accelerating increase over time. The seasonal pattern is not as clear as for the domain and holder counts but seems to exhibit a summer depression of some sort (although not in 2024).

Code

#data imported and prepared
df_qps <- fromJSON(
  api_endpoint("/cz_qps_total_1h")
  ) |>
  mutate(
    year  = year(ts),
    month = month(ts),
    ts    = as.Date(ts)
  ) |>
  group_by(year, month) |>
  reframe(
    ts    = max(ts),
    year  = year,
    month = month,
    qps   = mean(qps)
  ) |>
  ungroup() |>
  unique() |>
  filter(ts >= "2020-09-01") |>
  filter(ts < "2024-10-01")

#and saved as a csv
write_csv(df_qps, file = "qps.csv")

Code

df_qps <- read_csv("qps.csv", show_col_types = FALSE)

Code

plot_ly(
  data = df_qps,
  type = "scatter",
  mode = "lines",
  x    = ~ts,
  y    = ~qps
  ) |>
  layout(
    xaxis = list(title = "Date"),
    yaxis = list(title = "QPS")
  )
plot_ly(
  data       = df_qps,
  type       = "scatter",
  mode       = "lines",
  x          = ~month,
  y          = ~qps,
  color      = ~as.factor(year)
  ) |>
  layout(
    xaxis = list(title = "Month"),
    yaxis = list(title = "QPS")
  )

(a) Trend.

(b) Seasonality.

Figure 3: Traffic.

2.4 COVID-19

For COVID-19, data from Our World in Data were used. I considered the count of new cases, the count of new deaths, and the stringency index as predictor variables, opting for new cases because they exhibited the smoothest seasonal pattern of the three (see Figure 4). Sub-figure a clearly shows two massive spikes of new COVID-19 cases in the winters of 2021 and 2022, and sub-figure b illustrates how this trend behaved seasonally, peaking in January and February (in 2021 and 2022) and gaining strength from September (in 2020 and 2021) after a seasonal decline during summer.

Code

df_covid <-
  read_csv("cz_covid.csv") |>
  select(date,
         stringency_index,
         new_cases,
         new_deaths,
         ) |>
  mutate(
    year  = year(date),
    month = month(date),
    ts    = as.Date(date)
    ) |>
  group_by(year, month) |>
  reframe(
    ts             = max(ts),
    max_stringency = max(stringency_index),
    new_cases      = sum(new_cases),
    new_deaths     = sum(new_deaths)
  ) |>
  ungroup() |>
  unique()

Code

plot_ly(
  data = df_covid,
  type = "scatter",
  mode = "lines",
  x    = ~ts,
  y    = ~new_cases
  ) |>
  layout(
    xaxis = list(title = "Date"),
    yaxis = list(title = "New cases per month")
  )
plot_ly(
  data       = df_covid,
  type       = "scatter",
  mode       = "lines",
  x          = ~month,
  y          = ~new_cases,
  color      = ~as.factor(year)
  ) |>
  layout(
    xaxis = list(title = "Month"),
    yaxis = list(title = "New cases per month")
  )

(a) Trend.

(b) Seasonality.

Figure 4: COVID-19—new cases per month.

2.5 Inflation

For the inflation rate, data from the Czech Statistical Office were used. I opted for the metric “increase in CPI compared with the corresponding month of the preceding year”, which expresses the percentage change in the price level between the reference month of a given year and the same month of the preceding year. Figure 5 shows the trend in inflation (and the irrelevance of seasonality).
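To make the year-on-year metric concrete, the following minimal sketch computes it from a consumer price index. The index values are made up for illustration and are not Czech Statistical Office data; the report itself reads the already computed rates from inflation.csv.

```r
# Made-up monthly CPI index values for two consecutive years (illustrative only).
cpi <- data.frame(
  year  = rep(c(2021, 2022), each = 3),
  month = rep(1:3, times = 2),
  index = c(100.0, 100.5, 101.0, 110.0, 111.5, 113.0)
)

prev <- cpi[cpi$year == 2021, ]
curr <- cpi[cpi$year == 2022, ]

# Align each month with the same month of the preceding year.
prev_index <- prev$index[match(curr$month, prev$month)]

# Year-on-year rate: percentage change against the same month a year earlier.
yoy <- data.frame(
  year      = curr$year,
  month     = curr$month,
  inflation = round(100 * (curr$index / prev_index - 1), 1)
)

print(yoy)
```

Because each month is compared with the same month a year earlier, regular within-year price patterns largely cancel out, which fits the weak seasonality noted for Figure 5.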

Code

df_inflation <- 
  read_csv("inflation.csv") |>
  pivot_longer(
    !year, names_to = "month", values_to = "inflation"
    ) |>
  filter(year > 2018) |>
  mutate(
    ts    = as.Date(paste0(year, "-", month, "-15")),
    year  = as.integer(year),
    month = as.integer(month)
  ) |>
  arrange(year, month)

Code

plot_ly(
  data = df_inflation,
  type = "scatter",
  mode = "lines",
  x    = ~ts,
  y    = ~inflation
  ) |>
  layout(
    xaxis = list(title = "Date"),
    yaxis = list(title = "Inflation rate")
  )
plot_ly(
  data       = df_inflation,
  type       = "scatter",
  mode       = "lines",
  x          = ~month,
  y          = ~inflation,
  color      = ~as.factor(year)
  ) |>
  layout(
    xaxis = list(title = "Month"),
    yaxis = list(title = "Inflation rate")
  )

(a) Trend.

(b) Seasonality.

Figure 5: Year-on-year monthly inflation (CZK).

2.6 Russo-Ukrainian war

Lastly, the Russo-Ukrainian war was specified as a factor variable with the values “Peace” (until February 23rd, 2022) and “War” (since February 24th, 2022).
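The coding of the war indicator can be sketched in base R as follows. The four dates are illustrative, chosen to sit around the cut-off; the variable names are assumptions made for this example.

```r
# Illustrative dates around the start of the invasion.
ts       <- as.Date(c("2022-02-15", "2022-02-23", "2022-02-24", "2022-03-15"))
invasion <- as.Date("2022-02-24")

# The cut-off date itself belongs to "War"; "Peace" is the reference level,
# so a model coefficient for this factor would be labelled "ukraineWar".
ukraine <- factor(
  ifelse(ts < invasion, "Peace", "War"),
  levels = c("Peace", "War")
)

print(data.frame(ts, ukraine))
```

With monthly data, only the month containing the cut-off is ambiguous, and its label depends on which day of the month the timestamp falls on.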

Code

df_crises <- df_qps |>
  left_join(df_domains,   by = c("year", "month")) |>
  left_join(df_holders,   by = c("year", "month")) |>
  left_join(df_covid,     by = c("year", "month")) |>
  left_join(df_inflation, by = c("year", "month")) |>
  select(!c(ts.x, ts.y, ts.x.x, ts.y.y)) #drop duplicates

Code

df_crises <- df_crises |>
    mutate(
    ukraine =
      case_when(ts <  "2022-02-24" ~ "Peace",
                ts >= "2022-02-24" ~ "War"), #the 24th itself counts as "War"
    ukraine = as.factor(ukraine),
    year_f  = as.factor(year),
    max_stringency =
      if_else(is.na(max_stringency), 0, max_stringency),
    new_cases =
      if_else(is.na(new_cases), 0, new_cases),
    new_deaths = 
      if_else(is.na(new_deaths), 0, new_deaths),
    time         = seq(from = 1,
                       to   = nrow(df_crises),
                       by   = 1),
    new_cases = new_cases/1000 #turn into thousands
    ) |>
  ungroup()

2.7 The dataset

The entire dataset is available for inspection in the callout block below. Note that new cases of COVID-19 were recalculated into thousands.

The dataset

Code

df_crises |>
  mutate(date = paste0(year, "-", month)) |>
  select(date, time,
         domains,
         holders,
         qps,
         new_cases,
         inflation,
         ukraine) |>
  kable(
    "html",
    digits = 2,
    col.names = c(
      "Date", "Time",
      "Domains",
      "CZ holders",
      "QPS",
      "New COVID cases",
      "Inflation",
      "War in Ukraine"
      )
  )

Date	Time	Domains	CZ holders	QPS	New COVID cases	Inflation	War in Ukraine
2020-9	1	1368781	654791	14894.78	39.25	3.2	Peace
2020-10	2	1364093	654611	13928.01	187.94	2.9	Peace
2020-11	3	1369032	656968	14671.87	269.55	2.7	Peace
2020-12	4	1370804	658071	15398.89	154.10	2.3	Peace
2021-1	5	1378195	661960	16585.94	318.62	2.2	Peace
2021-2	6	1387674	666320	16859.40	254.11	2.1	Peace
2021-3	7	1395295	669351	16143.66	284.34	2.3	Peace
2021-4	8	1399309	670642	14679.03	106.44	3.1	Peace
2021-5	9	1401016	670961	15334.59	42.92	2.9	Peace
2021-6	10	1401323	670946	15342.71	5.71	2.8	Peace
2021-7	11	1400490	670716	14853.23	5.57	3.4	Peace
2021-8	12	1403622	671094	15348.25	6.54	4.1	Peace
2021-9	13	1408476	672685	15507.51	10.86	4.9	Peace
2021-10	14	1413755	674090	15915.06	75.39	5.8	Peace
2021-11	15	1420857	675185	15788.99	369.83	6.0	Peace
2021-12	16	1424131	675342	15405.15	331.06	6.6	Peace
2022-1	17	1432454	676575	16097.25	617.10	9.9	Peace
2022-2	18	1441347	679048	15724.20	686.04	11.1	Peace
2022-3	19	1444213	679567	15897.77	253.42	12.7	War
2022-4	20	1442541	675817	15838.66	127.79	14.2	War
2022-5	21	1439607	674849	16318.36	28.96	16.0	War
2022-6	22	1438268	674863	16551.80	11.57	17.2	War
2022-7	23	1437182	674511	15555.57	75.01	17.5	War
2022-8	24	1440667	675363	17635.59	72.11	17.2	War
2022-9	25	1445559	677033	17168.33	81.06	18.0	War
2022-10	26	1452935	678306	17176.03	94.90	15.1	War
2022-11	27	1462843	679512	17342.93	21.35	16.2	War
2022-12	28	1463116	679025	17265.51	21.78	15.8	War
2023-1	29	1463084	680624	17330.86	11.38	17.5	War
2023-2	30	1470108	683378	17950.18	19.24	16.7	War
2023-3	31	1473706	684943	18565.98	22.70	15.0	War
2023-4	32	1468610	684686	19183.52	9.57	12.7	War
2023-5	33	1466240	684678	18901.38	1.64	11.1	War
2023-6	34	1467104	684544	18791.62	0.68	9.7	War
2023-7	35	1466822	683683	17519.86	0.34	8.8	War
2023-8	36	1468764	684997	18867.83	0.98	8.5	War
2023-9	37	1471167	686788	18599.48	5.38	6.9	War
2023-10	38	1472772	688821	19236.23	16.18	8.5	War
2023-11	39	1474194	690459	20969.48	24.73	7.3	War
2023-12	40	1468788	690326	19866.98	56.28	6.9	War
2024-1	41	1467715	692502	20806.68	9.49	2.3	War
2024-2	42	1471419	695908	20948.74	2.18	2.0	War
2024-3	43	1472117	696938	20194.06	0.66	2.0	War
2024-4	44	1469049	697046	19044.95	0.27	2.9	War
2024-5	45	1467388	697560	20573.34	0.27	2.6	War
2024-6	46	1465440	697782	21473.08	0.00	2.0	War
2024-7	47	1464980	698143	21833.11	0.00	2.2	War
2024-8	48	1465976	699412	21685.01	0.00	2.2	War
2024-9	49	1468911	701506	21624.75	0.00	2.6	War

3 Models

In this section, models estimating associations with domain counts are presented first (Section 3.1), followed by models of holder counts (Section 3.2) and models of traffic (Section 3.3). The same modelling approach was used for all of them, although some models are hidden in collapsed callout blocks (available to keen readers). The domain counts were modeled using the negative binomial distribution, the domain holders with the Poisson distribution, and the traffic with the normal distribution. All models were fit using the mgcv package’s gamm function, which provides tools for fitting generalized additive mixed models (“Mgcv: Mixed GAM Computation Vehicle with Automatic Smoothness Estimation” 2000, ver 1.9-1; Wood 2017; Simpson 2018a, 2018b; Pedersen et al. 2019) (see callout block below).

What are generalized additive (mixed) models?

Generalized additive models (GAMs) are often portrayed to be situated in a middle ground between interpretable but often inflexible linear models and flexible but black-boxish machine learning models. GAMs can be used to model nonlinear relationships (overcoming limits of linear models) while still providing inferential statistics and explanatory insights (avoiding the black-box nature of predictions made by machine learning models).

To capture these nonlinear relationships, GAMs use smooth functions, which are composed of smaller basis functions. Each basis function captures a small part of the relationship, and together they add up to the smooth function, which can describe a nonlinear relationship between the variables. In effect, the associations estimated by GAMs can wiggle, because the size of the relationship between variables need not be constant across the range of the predictor.
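The idea of basis functions adding up to a smooth can be sketched on simulated data. Everything here (data, object names, the choice of `k = 8`) is an assumption made for illustration, not one of the report's models.

```r
library(mgcv)

set.seed(1)
n <- 200
d <- data.frame(x = seq(0, 1, length.out = n))
d$y <- sin(2 * pi * d$x) + rnorm(n, sd = 0.3)

# A Gaussian GAM with one smooth built from a small set of basis functions.
m <- gam(y ~ s(x, k = 8), data = d, method = "REML")

# Each column of the linear predictor matrix is the intercept or one basis
# function evaluated at the observed x values.
Xp   <- predict(m, type = "lpmatrix")
beta <- coef(m)

# Weighting the basis functions by their coefficients and summing them
# reproduces the fitted curve (identity link, so no back-transformation).
manual_fit <- as.numeric(Xp %*% beta)
max_diff   <- max(abs(manual_fit - as.numeric(fitted(m))))

print(ncol(Xp))
print(max_diff)
```

The penalty selected by REML decides how strongly the individual coefficients are shrunk, which is what controls how wiggly the resulting smooth is.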

Additionally, generalized additive mixed models (GAMMs) offer further modeling possibilities. In this report, the relevant one is the specification of correlation structures, which helps account for residual autocorrelation in the data that has not been accounted for by the smooths.
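A minimal sketch of such a correlation structure, assuming simulated data with AR(1) noise around a smooth trend. The report's models use corARMA structures; the simpler corAR1 is swapped in here purely for illustration, and all object names are hypothetical.

```r
library(mgcv)  # gamm()
library(nlme)  # corAR1()

set.seed(42)
n     <- 120
time  <- seq_len(n)
# Autocorrelated noise: each value depends on the previous one.
noise <- as.numeric(arima.sim(model = list(ar = 0.7), n = n, sd = 0.5))
d     <- data.frame(time = time, y = 10 + 2 * sin(time / 15) + noise)

# Smooth trend plus an AR(1) structure for what the smooth leaves behind.
m_ar <- gamm(
  y ~ s(time),
  correlation = corAR1(form = ~ time),
  data        = d,
  method      = "REML"
)

# Estimated autoregressive parameter of the residual process.
phi <- unname(coef(m_ar$lme$modelStruct$corStruct, unconstrained = FALSE))
print(phi)

# Normalized residuals should show little remaining autocorrelation.
res_acf <- acf(residuals(m_ar$lme, type = "normalized"), plot = FALSE)
```

The fitted object has two parts: `$gam` for the smooths and `$lme` for the mixed-model side, which is why the residual checks in this report call `residuals()` on `$lme`.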

For an introduction to GAMs, an interactive course by Noam Ross or an introductory text by Michael Clark are recommended. Furthermore, introductory lectures by Noam Ross and Gavin Simpson are also freely available.

3.1 Domains

Domains - GAMM 1

To set an initial model specification, gamm_domains_1 estimates the domain count from a seasonal pattern (months within a year), an overall monthly trend, (thousands of) new COVID-19 cases, the rate of inflation, and whether the Russo-Ukrainian war was ongoing. I also specified a varying intercept for the respective years and a correlation structure accounting for the autocorrelation of observations. However, this initial model does not yet interact the focal predictors with the temporal variables; it provides a simpler perspective that sets the ground for the following models. Therefore, all of its results are hidden within collapsed callout blocks.
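The seasonal term is specified with a cyclic cubic basis (`bs = "cc"`), whose defining property is that the smooth takes the same value at both boundary knots. The sketch below illustrates this on simulated monthly data. The `knots` argument is an assumption added for the example and is not part of the report's models; without it, mgcv places the boundary knots at the observed minimum and maximum of the covariate.

```r
library(mgcv)

set.seed(7)
# Five simulated years of monthly observations with one seasonal wave.
d   <- data.frame(month = rep(1:12, times = 5))
d$y <- 100 + 5 * cos(2 * pi * (d$month - 3) / 12) + rnorm(nrow(d), sd = 1)

# Boundary knots at 0.5 and 12.5 let the end of December meet the start of
# January while January and December keep separate estimates.
m_cc <- gam(
  y ~ s(month, bs = "cc", k = 12),
  knots  = list(month = c(0.5, 12.5)),
  data   = d,
  method = "REML"
)

# The cyclic basis forces equal values at the two boundary knots.
edge     <- as.numeric(predict(m_cc, newdata = data.frame(month = c(0.5, 12.5))))
edge_gap <- abs(edge[1] - edge[2])
print(edge_gap)
```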

Code

gamm_domains_1 <- gamm(
  domains ~
    s(month, k = 12, bs = "cc") + 
    s(time) +
    s(new_cases) +
    s(inflation) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 2,
            q    = 2),
  data = df_crises,
  family = nb,
  method = "REML"
  )
saveRDS(gamm_domains_1, file = "gamm_domains_1.rds")

Code

gamm_domains_1 <- readRDS(file = "gamm_domains_1.rds")

Results

Code

summary(gamm_domains_1$gam)


Family: negative binomial 
Link function: log 

Formula:
domains ~ s(month, k = 12, bs = "cc") + s(time) + s(new_cases) + 
    s(inflation) + ukraine

Parametric coefficients:
              Estimate Std. Error   t value Pr(>|t|)    
(Intercept) 14.1814589  0.0006696 21177.710  < 2e-16 ***
ukraineWar  -0.0041888  0.0010041    -4.171 0.000277 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
               edf Ref.df      F p-value    
s(month)     9.663 10.000  25.06 < 2e-16 ***
s(time)      4.907  4.907 858.12 < 2e-16 ***
s(new_cases) 1.000  1.000  12.54 0.00147 ** 
s(inflation) 4.215  4.215  10.55   3e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.981   
  Scale est. = 6.8992e-06  n = 49

Code

k.check(gamm_domains_1$gam)

             k'      edf   k-index p-value
s(month)     10 9.663024 0.7460053  0.0400
s(time)       9 4.907299 0.2803317  0.0000
s(new_cases)  9 1.000000 0.9504532  0.3025
s(inflation)  9 4.215083 1.2467346  0.9550

Code

appraise(gamm_domains_1$gam,
         method    = "normal",
         point_col = "steelblue",
         line_col  = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_domains_1$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

Measure: worst
	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.99	0.55	0.51
s(time)	0	0.99	1.00	0.94	0.98
s(new_cases)	0	0.55	0.94	1.00	0.63
s(inflation)	0	0.51	0.98	0.63	1.00

Measure: observed
	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.04	0.27	0.02
s(time)	0	0.71	1.00	0.76	0.96
s(new_cases)	0	0.29	0.53	1.00	0.28
s(inflation)	0	0.38	0.34	0.31	1.00

Measure: estimate
	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.05	0.23	0.01
s(time)	0	0.21	1.00	0.60	0.95
s(new_cases)	0	0.21	0.53	1.00	0.28
s(inflation)	0	0.16	0.32	0.39	1.00

Code

acf(residuals(gamm_domains_1$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_domains_1$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Smooths

Code

fig1 <- draw(
  gamm_domains_1$gam, residuals = TRUE, select = 1)
fig1 <- ggplotly(fig1)
fig2 <- draw(
  gamm_domains_1$gam, residuals = TRUE, select = 2)
ticks <- df_crises |>
    select(time, year, month) |>
    mutate(
        ticktext = paste0(year, "-", month) 
    ) |>
    slice(which(row_number() %% 10 == 1))
fig2 <- ggplotly(fig2) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext
                 )
  )
fig3 <- draw(
  gamm_domains_1$gam, residuals = TRUE, select = 3)
fig3 <- ggplotly(fig3)
fig4 <- draw(
  gamm_domains_1$gam, residuals = TRUE, select = 4)
fig4 <- ggplotly(fig4)
subplot(fig1, fig2, fig3, fig4,
        nrows = 2,
        titleX = TRUE,
        titleY = TRUE,
        margin = 0.1) |>
  layout(title = "")

Figure 6: gamm_domains_1 partial effects.

In Figure 6, we can see that the domain count varies seasonally (top-left), with the typical spring and autumn peaks and a summer depression in between. The partial effect of the monthly temporal trend is at first negative and diminishing and later positive and increasing (i.e., the domain count increases over time; top-right). The model also estimates that the count of domains decreases with a higher count of new COVID-19 cases, with the size of the association increasing as the count of new cases rises, although the association is uncertain below some 230 thousand cases, which is where most observations lie (bottom-left). Note that the smooth fails to capture the two data points at the high end of the new COVID-19 cases. Next, inflation (bottom-right) was at first estimated to have a diminishing negative association with the domain count, which turned increasingly positive once the rate reached 6%. The war in Ukraine seems to be associated with a decrease in the domain count (see the Results callout block above).

While the results of gamm_domains_1 do not support any of the hypotheses (in fact, they seem rather contrary to them), it would be more interesting to see how exactly new COVID-19 cases and inflation associate with the domain count over time.
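Because gamm_domains_1 uses a log link, its parametric coefficients are easier to read after exponentiation. The sketch below copies the intercept and the ukraineWar estimate from the summary output above. The interval uses a normal approximation, which is an assumption of this example rather than something the report computes.

```r
# Values copied from the parametric coefficients of gamm_domains_1.
b0     <- 14.1814589   # intercept on the log scale
b_war  <- -0.0041888   # ukraineWar estimate on the log scale
se_war <- 0.0010041    # its standard error

# Baseline expected count for "Peace" when all smooth terms contribute zero.
baseline <- exp(b0)

# Multiplicative change in the expected count for "War" versus "Peace",
# expressed as a percentage.
pct_change <- 100 * (exp(b_war) - 1)

# Approximate 95% interval (normal approximation).
ci_pct <- 100 * (exp(b_war + c(-1.96, 1.96) * se_war) - 1)

print(round(baseline))
print(round(pct_change, 3))
print(round(ci_pct, 3))
```

On this reading, the estimate corresponds to a difference of roughly 0.4% in the expected domain count, which is statistically detectable but small in practical terms.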

Predictions

Code

fitted_noint <- fitted_values(
  gamm_domains_1,
  data = df_crises
  )

plot_ly(
  data = fitted_noint,
  name = "Predictions",
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " domains", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  )  |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~domains,
    line = list(color = "#66c2a5",
                dash = "solid"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$domains, big.mark = " ", scientific = FALSE), " domains"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "Domains"
    )
  )

Figure 7: Domain count predictions based on model gamm_domains_1.

3.1.1 Domains - GAMM 2

The model gamm_domains_2 estimates the domain count by a seasonal pattern (months within a year), an overall monthly trend, (thousands of) new COVID-19 cases, and whether the Russo-Ukrainian war was ongoing. Furthermore, I specified an interaction term between the monthly trend and new COVID-19 cases. I also specified a varying intercept for the respective years and a correlation structure accounting for the autocorrelation of observations. Due to concurvity issues, the inflation rate was dropped from the model (and is modeled separately in gamm_domains_3 below).

What is concurvity?

Concurvity is a generalization of collinearity to the framework of generalized additive models. Similarly to collinearity issues within the generalized linear models, concurvity describes a computational issue within a generalized additive model when one smooth term can be approximated by other smooth terms. Concurvity is estimated on a range from 0 (no overlap between the smooths) to 1 (complete overlap between the smooth functions). As stated by Simon Wood in the mgcv documentation, concurvity often becomes an issue when “… a smooth of space is included in a model, along with smooths of other covariates that also vary more or less smoothly in space. Similarly it tends to be an issue in models including a smooth of time, along with smooths of other time varying covariates. Concurvity can be viewed as a generalization of co-linearity, and causes similar problems of interpretation. It can also make estimates somewhat unstable (so that they become sensitive to apparently innocuous modelling details, for example).”
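Concurvity can be illustrated with a small simulation, assuming one covariate that drifts smoothly with time and one that is unrelated to it. The data and the names `x_trend` and `x_noise` are hypothetical and serve only this example.

```r
library(mgcv)

set.seed(123)
n    <- 200
time <- seq_len(n)
d <- data.frame(
  time    = time,
  x_trend = time / n + rnorm(n, sd = 0.02),  # varies smoothly with time
  x_noise = rnorm(n)                         # unrelated to time
)
d$y <- sin(d$time / 30) + 0.5 * d$x_trend + 0.3 * d$x_noise + rnorm(n, sd = 0.2)

m <- gam(y ~ s(time) + s(x_trend) + s(x_noise), data = d, method = "REML")

# With full = TRUE, each column shows how well one term can be approximated
# by all the other terms together (rows: worst, observed, estimate).
conc <- concurvity(m, full = TRUE)
print(round(conc, 2))
```

The smooth of `x_trend` should show much higher concurvity than the smooth of `x_noise`, mirroring the situation in which a time-varying predictor such as inflation competes with the smooth of time.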

Code

gamm_domains_2 <- gamm(
  domains ~
    s(month, k = 12, bs = "cc") +
    ti(time) +
    ti(new_cases) +
    ti(time, new_cases, k = c(5, 5)) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 4,
            q    = 0),
  data = df_crises,
  family = nb,
  method = "REML"
  )
saveRDS(gamm_domains_2, file = "gamm_domains_2.rds")

Code

gamm_domains_2 <- readRDS(file = "gamm_domains_2.rds")

Results

Code

summary(gamm_domains_2$gam)


Family: negative binomial 
Link function: log 

Formula:
domains ~ s(month, k = 12, bs = "cc") + ti(time) + ti(new_cases) + 
    ti(time, new_cases, k = c(5, 5)) + ukraine

Parametric coefficients:
             Estimate Std. Error   t value Pr(>|t|)    
(Intercept) 14.181070   0.001172 12103.456   <2e-16 ***
ukraineWar  -0.002957   0.001926    -1.536    0.136    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                     edf Ref.df       F  p-value    
s(month)           9.663 10.000  19.710  < 2e-16 ***
ti(time)           3.792  3.792 288.042  < 2e-16 ***
ti(new_cases)      1.000  1.000   6.336 0.017837 *  
ti(time,new_cases) 4.213  4.214   7.380 0.000367 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.987   
  Scale est. = 5.5794e-06  n = 49

Code

k.check(gamm_domains_2$gam)

                   k'       edf   k-index p-value
s(month)           10 9.6625805 0.9599449  0.3700
ti(time)            4 3.7916693 0.1982739  0.0000
ti(new_cases)       4 0.9999741 0.7857882  0.0425
ti(time,new_cases) 16 4.2135243 0.3527887  0.0000

Code

appraise(gamm_domains_2$gam,
         method      = "normal",
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_domains_2$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1.00	0.00	0.00	0.00	0.82
s(month)	0.00	1.00	0.10	0.45	0.82
ti(time)	0.00	0.10	1.00	0.63	1.00
ti(new_cases)	0.00	0.45	0.63	1.00	1.00
ti(time,new_cases)	0.82	0.82	1.00	1.00	1.00

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1.00	0.00	0.00	0.00	0.01
s(month)	0.00	1.00	0.04	0.27	0.07
ti(time)	0.00	0.04	1.00	0.29	0.46
ti(new_cases)	0.00	0.23	0.44	1.00	0.67
ti(time,new_cases)	0.82	0.50	0.77	0.95	1.00

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1.00	0.00	0.00	0.00	0.09
s(month)	0.00	1.00	0.02	0.27	0.19
ti(time)	0.00	0.02	1.00	0.25	0.18
ti(new_cases)	0.00	0.08	0.30	1.00	0.30
ti(time,new_cases)	0.82	0.25	0.79	0.92	1.00

Code

acf(residuals(gamm_domains_2$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "",
    lag.max = 30)
pacf(residuals(gamm_domains_2$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "",
     lag.max = 30)
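
The correlograms are read against approximate 95% bounds that R's default ACF plot draws at plus and minus `qnorm(0.975) / sqrt(n)`. The following is a small base-R sketch on simulated series (not the report's residuals); the helper `exceeds()` and the two toy series are hypothetical names introduced only for this illustration.

```r
set.seed(42)
n     <- 49                       # same number of monthly observations as in the models
bound <- qnorm(0.975) / sqrt(n)   # approximate 95% bounds used by the default ACF plot

white <- rnorm(n)                                            # independent noise
ar1   <- as.numeric(arima.sim(model = list(ar = 0.8), n = n)) # strongly autocorrelated

# Return the lags (1, 2, ...) whose sample autocorrelation leaves the bounds
exceeds <- function(x, lag.max = 30) {
  r <- acf(x, lag.max = lag.max, plot = FALSE)$acf[-1]  # drop lag 0
  which(abs(r) > bound)
}

bound
exceeds(white)
exceeds(ar1)
```

For the autocorrelated series, the first lags leave the bounds; this is the kind of pattern a correlation structure such as `corARMA()` is meant to absorb.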

Code

fig <- draw(
  gamm_domains_2$gam, residuals = TRUE, select = 1)
ggplotly(fig)
fig <- draw(
  gamm_domains_2$gam, residuals = TRUE, select = 2)
ggplotly(fig) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig <- draw(
  gamm_domains_2$gam, residuals = TRUE, select = 3)
ggplotly(fig)

(a) Seasonal pattern.

(b) Monthly trend.

Figure 8: gamm_domains_2 partial effects.

We can observe the typical double-peaked seasonal pattern and the monthly temporal trend plateauing since mid-2023. The smooth for new cases was significant, estimating a diminishing positive association with the domain count below some 97 thousand new cases and an increasingly negative one above. The term for the Russo-Ukrainian war was insignificant. The newly introduced interaction term was significant and is plotted in more detail below in Figure 9.

Code

#' Get CIs excluding zero and paste positive and negative associations to new columns to facilitate plotting of interaction terms fitted by mgcv GAMs.
#' 
#' Works only for interaction terms as it is useful to highlight such areas when building a graph for smooth interaction terms.
#' 
#' @param df is the df returned by gratia::smooth_estimates(), already containing CIs added with add_confint().
#' @param term is either "s", "te", or "ti" term used to specify the interaction in the GAM.
#' @param var1 is the first predictor in the interaction, e.g. "month".
#' @param var2 is the second predictor in the interaction, e.g. "new_cases".

get_ci_areas <- function(df, term, var1, var2){
  df |>
    mutate(
      var1_l_ci_neg = case_when(
        .smooth == paste0(term, "(", var1, ",", var2, ")") &
          .lower_ci < 0 &
          .upper_ci < 0
        ~ .lower_ci),
      var1_u_ci_neg = case_when(
        .smooth == paste0(term, "(", var1, ",", var2, ")") &
          .lower_ci < 0 &
          .upper_ci < 0
        ~ .upper_ci),
      var1_sig_neg = case_when(
        !is.na(var1_l_ci_neg) &
          !is.na(var1_u_ci_neg)
        ~ .data[[var1]]
        ),
      var2_sig_neg = case_when(
        !is.na(var1_l_ci_neg) &
          !is.na(var1_u_ci_neg)
        ~ .data[[var2]]
        ),
      var1_l_ci_pos = case_when(
        .smooth == paste0(term, "(", var1, ",", var2, ")") &
          .lower_ci > 0 &
          .upper_ci > 0
        ~ .lower_ci),
      var1_u_ci_pos = case_when(
        .smooth == paste0(term, "(", var1, ",", var2, ")") &
          .lower_ci > 0 &
          .upper_ci > 0
        ~ .upper_ci),
      var1_sig_pos = case_when(
        !is.na(var1_l_ci_pos) &
          !is.na(var1_u_ci_pos)
        ~ .data[[var1]]
        ),
      var2_sig_pos = case_when(
        !is.na(var1_l_ci_pos) &
          !is.na(var1_u_ci_pos)
        ~ .data[[var2]]
        )
      )
  }
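
The core idea of the helper, flagging grid cells whose confidence interval lies entirely below or entirely above zero, can be sketched in base R on a tiny made-up grid. The data frame `toy_grid` and its values are hypothetical and serve only to show the logic; the column names `.lower_ci` and `.upper_ci` mirror those used in the helper.

```r
# Four made-up grid cells with confidence bounds for an interaction smooth
toy_grid <- data.frame(
  time      = c(1, 2, 3, 4),
  new_cases = c(10, 20, 30, 40),
  .lower_ci = c(-0.30, -0.10, 0.05, 0.20),
  .upper_ci = c(-0.10,  0.20, 0.30, 0.50),
  check.names = FALSE
)

# A cell is highlighted only when the whole interval excludes zero
neg <- toy_grid$.lower_ci < 0 & toy_grid$.upper_ci < 0
pos <- toy_grid$.lower_ci > 0 & toy_grid$.upper_ci > 0

# Coordinates are kept for highlighted cells and set to NA otherwise,
# so that only these cells are drawn as markers
toy_grid$time_sig_neg <- ifelse(neg, toy_grid$time, NA)
toy_grid$time_sig_pos <- ifelse(pos, toy_grid$time, NA)
toy_grid
```

The second cell, whose interval spans zero, is flagged in neither direction and therefore receives no marker.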

Code

gam_smooth <-
  smooth_estimates(gamm_domains_2$gam) |>
  add_confint() |>
  filter(.smooth == "ti(time,new_cases)")
gam_smooth <- get_ci_areas(gam_smooth, "ti", "time", "new_cases")

Code

ticks <- df_crises |>
  select(time, year, month) |>
  mutate(
    ticktext = paste0(year, "-", month) 
  ) |>
  slice(which(row_number() %% 10 == 1))

plot_ly(
  data = gam_smooth,
  type = "contour",
  x    = ~time,
  y    = ~new_cases,
  z    = ~.estimate,
  contours = list(
    coloring = "heatmap",
    showlabels = TRUE
  ),
  colors   = "RdBu",
  reversescale = TRUE,
  lines     = list(color = "black"),
  hoverinfo = "text",
  text      = paste0(
    "<b>New cases</b>: ",
    round(gam_smooth$new_cases, 1),
    "<br>",
    "<b>Estimate</b>: ",
    round(gam_smooth$.estimate, 4),
    "<br>",
    "<b>95% CI</b> (",
    round(gam_smooth$.lower_ci, 4),
    ", ",
    round(gam_smooth$.upper_ci, 4),
    ")"
    )
  ) |>
  add_trace(
    name   = "Historical",
    x      = ~ df_crises$time,
    y      = ~ df_crises$new_cases,
    type   = "scatter",
    mode   = "lines",
    #marker = list(color = "black"),
    line   = list(color = "black"),
    hoverinfo = "text",
    text = paste0(
      "<b>Month</b>: ",
      round(df_crises$month, 0),
      " (", df_crises$year, ")",
      "<br>",
      "<b>New cases</b>: ",
      round(df_crises$new_cases, 1)
      )
    ) |>
  add_trace(
    name   = "Positive CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_pos,
    y      = ~ gam_smooth$var2_sig_pos,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-up-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  add_trace(
    name   = "Negative CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_neg,
    y      = ~ gam_smooth$var2_sig_neg,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-down-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  layout(
    title = "Associations with domain count",
    xaxis = list(title    = "Time",
                 tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext),
    yaxis = list(title = "New cases (thousands)")
    ) |>
  colorbar(title = "Estimate",
           limits = c(-max(abs(gam_smooth$.estimate)),
                      max(abs(gam_smooth$.estimate))))

Figure 9: Domain count association with the interaction between monthly temporal trend and new COVID-19 cases.

The interaction’s association was estimated as positive around the peak of the pandemic (and also later, from October 2022 onward, but mostly far from the historically observed new COVID-19 cases denoted by the black line). A triangle-like area of negative association was estimated between some 70 thousand and 325 thousand new cases between September 2020 and December 2021 (along with another area in the top right, far from any observations).

So many charts! How to make sense of them?

It is true that this report contains many charts describing results of many models. This can be overwhelming and hard to make sense of. However, when reporting the results of the models, I always follow the same approach by which I try to describe (A) diagnostics (Did the model fit well?), (B) partial smooth effects (Are there associations? How large?), and (C) predictions (What values can we predict from the estimated associations?).

Collapsed in the Results sections, the first four charts describe how well the models fit the data. These model diagnostic plots are not of primary interest, but it makes sense to be open about them for those interested. Furthermore, I also report correlograms (ACF and partial ACF), which show the autocorrelation of residuals. In time series models, it is important to check whether the residuals are significantly correlated (which they should not be) to ensure that the errors in the model predictions do not exhibit strong patterns of dependence on each other. Put simply, the spikes in these charts should not cross the significance levels denoted by the blue dotted lines. If the spikes remain within these levels, the residuals are relatively independent, which suggests that the model captures the underlying data structure adequately. In most models in this report, autocorrelations were observed; therefore, I specified correlation structures, which usually resolved these dependencies sufficiently.
I always plot smooths estimated by the GAMMs (e.g., Figure 6). These are partial effects on the link scale (the domain and holder GAMMs are fitted using the negative binomial and Poisson distributions and use a log-link; the traffic GAMMs are fitted using the Gaussian distribution and therefore do not need any transformation when fitting). These plots show the individual component effect of a smooth function on the link scale, conditional on all other terms in the model being set to zero. These plots, then, help us interpret whether and how the predictors associate with the response variable. Below zero, the association is negative (i.e., it decreases the value of the response); above zero, the association is positive. Keep in mind that in GAMMs, the size (or we can say “strength”) of these associations is not constant. Instead, GAMMs allow these associations to vary (to wiggle) depending on the values of the predictor variables. In other words, the size of the association might vary as the value of the predictor changes. Note that I also plot the interaction terms in further detail (e.g., Figure 9) as they are focal in this report.
Finally, I always plot predictions fitted from the smooths using the historically observed data. Note that I predict these values (a) from all terms included in the GAMMs, and also (b) from all terms but the interaction terms. Plotting both (a) and (b) predictions allows us to inspect how the interaction terms add to the predictions, as we can see the difference between them. Note that I predict only from the historically observed values of the predictors as I’m interested in how well the models captured these data. In the end, these predictions provide us with practical values: not the often unintelligible partial effects (part B), which may be on a log scale and do not translate linearly to the original scale, but, for example, predicted domain counts.

To restate, the procedure is to answer the following questions: (A) Has the model worked properly? (B) Has the model found any associations? (C) How should we understand the associations in comprehensible, practical values?
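
As a worked example of reading values on the link scale, the short sketch below back-transforms two parametric coefficients reported in the summary of gamm_domains_2 (the intercept and the ukraineWar term). It assumes only the log link stated in that summary; the variable names are introduced here for illustration. Under a log link, an additive effect on the link scale becomes a multiplicative factor on the count scale.

```r
# Coefficients copied from summary(gamm_domains_2$gam)
intercept <- 14.181070   # (Intercept)
war_coef  <- -0.002957   # ukraineWar

# Baseline domain count implied by the intercept alone
# (all smooth terms at zero, reference level of the war indicator)
baseline <- exp(intercept)

# Multiplicative factor and percentage change implied by the war coefficient
war_factor <- exp(war_coef)
pct_change <- (war_factor - 1) * 100

baseline
war_factor
pct_change
```

The same logic applies to the smooths: a partial effect of a given size on the link scale multiplies the expected count by the exponential of that size, which is why the partial-effect plots cannot be read directly as numbers of domains.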

In Figure 10 below, we can inspect a time series showing predictions made by the model and compare them to the historically observed domain counts. Importantly, the plot contains two traces for the predictions: one that represents predictions made from all the terms in the model (“All”) and one excluding the interaction term (“No interaction”). This allows us to inspect the practical impact of the associations described in Figure 9 above, as the plot shows how much the domain count predictions change once we take the interaction term into account. We can see that the interaction term improves predictions during the COVID-19 peaks (i.e., the predictions are closer to the historical observations than when we exclude the interaction term), as the model predicts fewer domains during the first wave and more domains during the second wave when the interaction is included. For example, in February 2022 (peak of the second COVID-19 wave), including the interaction term leads the model to predict 7 600 more domains than when it is excluded, providing a better prediction.
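
The comparison of predictions with and without an interaction term can be sketched on simulated data. Note that this sketch does not use the report's gratia-based code; it swaps in mgcv's `predict()` with its `exclude` argument as a stand-in, and the data frame `toy`, its columns, and `toy_fit` are made up for illustration.

```r
library(mgcv)

set.seed(7)
n   <- 200
toy <- data.frame(x1 = runif(n), x2 = runif(n))
toy$y <- rpois(n, exp(2 + sin(2 * pi * toy$x1) + toy$x1 * toy$x2))

toy_fit <- gam(y ~ ti(x1) + ti(x2) + ti(x1, x2),
               family = poisson, data = toy, method = "REML")

# Predictions on the link (log) scale from all terms ...
link_all   <- predict(toy_fit, newdata = toy, type = "link")
# ... and with the interaction smooth set to zero
link_noint <- predict(toy_fit, newdata = toy, type = "link",
                      exclude = "ti(x1,x2)")

# Back-transform to counts; the difference is the practical
# contribution of the interaction term to each prediction
pred_all   <- exp(link_all)
pred_noint <- exp(link_noint)
head(pred_all - pred_noint)

# The link-scale difference equals the interaction's partial effect
terms_mat <- predict(toy_fit, newdata = toy, type = "terms")
```

Because of the log link, the same partial effect translates into a larger or smaller difference in counts depending on the level of the other terms, which is why the differences are best inspected on the response scale.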

Code

fitted_noint <- fitted_values(
  gamm_domains_2,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(new_cases)",
    "ti(time)",
    "s(month)",
    "ukraine"
    )
  )
fitted_int <- fitted_values(
  gamm_domains_2,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(new_cases)",
    "ti(time)",
    "ti(time,new_cases)",
    "s(month)",
    "ukraine"
    )
  )

plot_ly(
  name = "No interaction",
  data = fitted_noint,
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " domains", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
  )
  ) |>
  add_ribbons(
    name = "No interaction CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "All",
    x = ~ts,
    y = ~fitted_int$.fitted,
    line = list(color = "#8da0cb",
                dash = "dot"),
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      format(fitted_int$.fitted, big.mark = " ", scientific = FALSE),
      " domains", "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "All CI",
    x = ~ts,
    ymin = fitted_int$.lower_ci,
    ymax = fitted_int$.upper_ci,
    line = list(color = "#8da0cb"),
    fillcolor = "#8da0cb",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~domains,
    line = list(color = "#66c2a5",
                dash = "solid"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$domains, big.mark = " ", scientific = FALSE), " domains"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "Domains"
    )
  )

Figure 10: Domain count predictions based on model gamm_domains_2.

3.1.2 Domains - GAMM 3

In gamm_domains_3, I dropped the focus on the new COVID-19 cases and instead specified an interaction term between the inflation rate and the monthly temporal trend.

Code

gamm_domains_3 <- gamm(
  domains ~
    s(month, k = 12, bs = "cc") + 
    ti(time) +
    ti(inflation) +
    ti(time, inflation, k = c(10, 5)) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 1,
            q    = 0),
  data = df_crises,
  family = nb,
  method = "REML"
  )
saveRDS(gamm_domains_3, file = "gamm_domains_3.rds")

Code

gamm_domains_3 <- readRDS(file = "gamm_domains_3.rds")

Results

Code

summary(gamm_domains_3$gam)


Family: negative binomial 
Link function: log 

Formula:
domains ~ s(month, k = 12, bs = "cc") + ti(time) + ti(inflation) + 
    ti(time, inflation, k = c(10, 5)) + ukraine

Parametric coefficients:
              Estimate Std. Error  t value Pr(>|t|)    
(Intercept) 14.1843627  0.0016711 8487.894   <2e-16 ***
ukraineWar  -0.0008321  0.0020331   -0.409    0.686    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                    edf Ref.df       F  p-value    
s(month)           8.35  10.00   4.693 0.000112 ***
ti(time)           1.00   1.00  84.764  < 2e-16 ***
ti(inflation)      1.00   1.00 121.891  < 2e-16 ***
ti(time,inflation) 9.24   9.24   6.554 3.67e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.994   
  Scale est. = 3.8031e-06  n = 49

Code

appraise(gamm_domains_3$gam,
         method      = "normal",
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_domains_3$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1	0.00	0.00	0.00	1
s(month)	0	1.00	0.10	0.39	1
ti(time)	0	0.10	1.00	0.97	1
ti(inflation)	0	0.39	0.97	1.00	1
ti(time,inflation)	1	1.00	1.00	1.00	1

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1	0.00	0.00	0.00	0.54
s(month)	0	1.00	0.04	0.00	0.02
ti(time)	0	0.01	1.00	0.97	0.38
ti(inflation)	0	0.25	0.23	1.00	0.30
ti(time,inflation)	1	0.92	1.00	1.00	1.00

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1	0.00	0.00	0.00	0.10
s(month)	0	1.00	0.02	0.14	0.14
ti(time)	0	0.02	1.00	0.47	0.28
ti(inflation)	0	0.07	0.51	1.00	0.22
ti(time,inflation)	1	0.70	1.00	1.00	1.00

Code

acf(residuals(gamm_domains_3$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_domains_3$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Code

fig <- draw(
  gamm_domains_3$gam, residuals = TRUE, select = 1)
ggplotly(fig)
fig <- draw(
  gamm_domains_3$gam, residuals = TRUE, select = 2)
ggplotly(fig) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig <- draw(
  gamm_domains_3$gam, residuals = TRUE, select = 3)
ggplotly(fig)

(a) Seasonal pattern.

(b) Monthly trend.

Figure 11: gamm_domains_3 partial effects.

With the exception of the term for the Russo-Ukrainian war, all the terms were significant.

Code

gam_smooth <-
  smooth_estimates(gamm_domains_3$gam) |>
  add_confint() |>
  filter(.smooth == "ti(time,inflation)")
gam_smooth <- get_ci_areas(gam_smooth, "ti", "time", "inflation")

Code

plot_ly(
  data = gam_smooth,
  type = "contour",
  x    = ~time,
  y    = ~inflation,
  z    = ~.estimate,
  contours = list(
    coloring = "heatmap",
    showlabels = TRUE
  ),
  colors   = "RdBu",
  reversescale = TRUE,
  lines     = list(color = "black"),
  hoverinfo = "text",
  text      = paste0(
    "<b>Inflation</b>: ",
    round(gam_smooth$inflation, 1),
    "<br>",
    "<b>Estimate</b>: ",
    round(gam_smooth$.estimate, 4),
    "<br>",
    "<b>95% CI</b> (",
    round(gam_smooth$.lower_ci, 4),
    ", ",
    round(gam_smooth$.upper_ci, 4),
    ")"
    )
  ) |>
  add_trace(
    name   = "Historical",
    x      = ~ df_crises$time,
    y      = ~ df_crises$inflation,
    type   = "scatter",
    mode   = "lines",
    line   = list(color = "black"),
    hoverinfo = "text",
    text = paste0(
      "<b>Month</b>: ",
      round(df_crises$month, 0),
      " (", df_crises$year, ")",
      "<br>",
      "<b>Inflation</b>: ",
      round(df_crises$inflation, 1)
      )
    ) |>
  add_trace(
    name   = "Positive CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_pos,
    y      = ~ gam_smooth$var2_sig_pos,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-up-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  add_trace(
    name   = "Negative CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_neg,
    y      = ~ gam_smooth$var2_sig_neg,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-down-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  layout(
    title = "Associations with domain count",
    xaxis = list(title    = "Time",
                 tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext),
    yaxis = list(title = "Inflation")
    ) |>
  colorbar(title = "Estimate",
           limits = c(-max(abs(gam_smooth$.estimate)),
                      max(abs(gam_smooth$.estimate))))

Figure 12: Domain count association with inflation rate in interaction with the monthly temporal trend.

In Figure 12, we can observe that the model estimates significant negative associations between the domain count and the rate of inflation. The associations are located around the peak of the inflation rate and at the beginning and the end of the investigated period (the model also estimates some positive associations, but these areas seem far from the historical observations).

In Figure 13 below, we can see that including the interaction term seems important for predicting domain counts when inflation was at its highest (the period between March 2022 and June 2023; and also at the start and the end of the investigated period). Indeed, when the interaction term is included, the model predicts 22 027 fewer domains in July 2022 (as an example) than when it is excluded, tracking the historical observations noticeably better.

Code

fitted_noint <- fitted_values(
  gamm_domains_3,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(inflation)",
    "ti(time)",
    "s(month)",
    "ukraine"
    )
  )
fitted_int <- fitted_values(
  gamm_domains_3,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(inflation)",
    "ti(time)",
    "ti(time,inflation)",
    "s(month)",
    "ukraine"
    )
  )

plot_ly(
  name = "No interaction",
  data = fitted_noint,
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " domains", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
  )
  ) |>
  add_ribbons(
    name = "No interaction CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "All",
    x = ~ts,
    y = ~fitted_int$.fitted,
    line = list(color = "#8da0cb",
                dash = "dot"),
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      format(fitted_int$.fitted, big.mark = " ", scientific = FALSE),
      " domains", "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "All CI",
    x = ~ts,
    ymin = fitted_int$.lower_ci,
    ymax = fitted_int$.upper_ci,
    line = list(color = "#8da0cb"),
    fillcolor = "#8da0cb",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~domains,
    line = list(color = "#66c2a5",
                dash = "solid"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$domains, big.mark = " ", scientific = FALSE), " domains"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "Domains"
    )
  )

Figure 13: Domain count predictions based on model gamm_domains_3.

AIC & BIC

Code

AIC(gamm_domains_1$lme,
    gamm_domains_2$lme,
    gamm_domains_3$lme)

                   df       AIC
gamm_domains_1$lme 15 -469.2533
gamm_domains_2$lme 16 -466.1264
gamm_domains_3$lme 13 -436.1309

Code

BIC(gamm_domains_1$lme,
    gamm_domains_2$lme,
    gamm_domains_3$lme)

                   df       BIC
gamm_domains_1$lme 15 -440.8760
gamm_domains_2$lme 16 -435.8573
gamm_domains_3$lme 13 -411.5372
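
The two tables can be cross-checked against each other with the standard definitions AIC = -2 logLik + 2 df and BIC = -2 logLik + log(n) df. The sketch below uses only the values printed above and n = 49 from the model summaries; that the BIC penalty is based on exactly these 49 observations is an assumption, which the check then confirms against the reported numbers. The data frame `ic` and its column names are introduced here for illustration.

```r
# Values copied from the AIC and BIC tables above
ic <- data.frame(
  model = c("gamm_domains_1", "gamm_domains_2", "gamm_domains_3"),
  df    = c(15, 16, 13),
  AIC   = c(-469.2533, -466.1264, -436.1309),
  BIC   = c(-440.8760, -435.8573, -411.5372)
)
n <- 49  # number of monthly observations reported in the model summaries

# Recover -2 * logLik from AIC, then rebuild BIC with the log(n) penalty
ic$minus2ll  <- ic$AIC - 2 * ic$df
ic$BIC_check <- ic$minus2ll + log(n) * ic$df

# Differences from the lowest (best) value of each criterion
ic$delta_AIC <- ic$AIC - min(ic$AIC)
ic$delta_BIC <- ic$BIC - min(ic$BIC)
ic
```

Both criteria rank the models in the same order, with gamm_domains_1 lowest and gamm_domains_3 highest.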

3.1.3 Discussion: Domains and 2020s crises

Two hypotheses focused on the domain count (H1 and H4) found support in models which utilized terms interacting the new COVID-19 cases and inflation rate with the monthly trend (see Section 3.1.1 and Section 3.1.2). However, the interactions for the new COVID-19 cases also estimated associations suggesting the opposite. Such divergences require contemplation.

Regarding H1 (Transmission of COVID-19 positively associates with the domain count.), model gamm_domains_2 (Section 3.1.1) estimated both positive and negative associations between the count of domains and the interaction between new COVID-19 cases and the monthly trend. The positive association suggests that during the peak of the COVID-19 pandemic (winter 2022), there was a significant increase in the number of domains. However, during the first wave, the interaction term estimated a negative relationship, meaning that an increase in new COVID-19 cases was also associated with a decrease in the count of domains. Taken together, while the negative association found in the interaction term suggests a decrease in domains specific to the first COVID-19 wave (i.e., slowing down the domain market, as COVID-19 has done for many other areas; see Czech National Bank (2020)), the interaction term also reveals a short-lived increase specific to the second COVID-19 wave. Perhaps this increase during the second wave of the pandemic was a result of necessity, as more individuals and companies opted (or were finally forced) to create domains to meet their needs. Alternatively, individuals and companies might have been better prepared to face the challenges posed by the pandemic during the second wave, moving their interests online more proficiently compared to the first wave. Conversely, the pressure might not have been large enough during the first wave (or the confusion was too large), possibly motivating rather austere and restrained measures, as it was not yet known what effects the pandemic would have on day-to-day life.

Regarding H4 (Inflation negatively associates with the domain count.), the modeling results estimate that there were fewer domains when the inflation rate reached its highest values, between 12.7% and 18%, in the period between April 2022 and April 2023 (based on the interaction term in gamm_domains_3, Section 3.1.2). For example, the model predicts 22 027 fewer domains in July 2022 than when the interaction term is excluded from the predictions. The results thus suggest that the domain count indeed suffered a notable decrease during this period of socio-economic hardship.³

As for H7 (The war in Ukraine positively associates with the domain count.), little support was found in the models. Once interaction terms were introduced to the models, the term was insignificant, and the only time it was estimated as significant (model gamm_domains_1), it pointed in the opposite direction to the one proposed in the hypothesis.

The models also observed the typical double-peaked pattern for the seasonal variation—peaking in March and November—which can be attributed to the vacation patterns of Czech citizens (see our previous report, Quiros Segovia and Řezníček 2025). Lastly, the monthly temporal trend smooths captured the slow-down in the domain count’s trajectory (since 2023), accounting for the temporal inertia and saturation in domain counts (however, this slow-down was not estimated in gamm_domains_3).

3.2 Holders

Next, I present the results of the models predicting the count of Czech domain holders. Apart from using the Poisson distribution and specifying different correlation structures and knots for the interaction terms, the models are the same as in the domains section; therefore, I do not describe the specifications and only report the results.

Holders - GAMM 1

Code

gamm_holders_1 <- gamm(
  holders ~
    s(month, k = 12, bs = "cc") + 
    s(time) +
    s(new_cases) +
    s(inflation) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 2,
            q    = 1),
  data = df_crises,
  family = poisson,
  method = "REML",
  control =
         nlmeControl(maxIter     = 1e8,
                     msMaxIter   = 1e8,
                     msMaxEval   = 1e8,
                     msVerbose   = FALSE,
                     optimMethod = "L-BFGS-B")
  )
saveRDS(gamm_holders_1, file = "gamm_holders_1.rds")

Code

gamm_holders_1 <- readRDS(file = "gamm_holders_1.rds")

Results

Code

summary(gamm_holders_1$gam)


Family: poisson 
Link function: log 

Formula:
holders ~ s(month, k = 12, bs = "cc") + s(time) + s(new_cases) + 
    s(inflation) + ukraine

Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 13.427022   0.001559  8610.1   <2e-16 ***
ukraineWar   0.003206   0.002465     1.3    0.205    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
               edf Ref.df       F  p-value    
s(month)     8.318 10.000   4.769 8.78e-05 ***
s(time)      6.636  6.636 196.592  < 2e-16 ***
s(new_cases) 4.308  4.308   1.781    0.146    
s(inflation) 1.000  1.000   0.091    0.765    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.988   
  Scale est. = 1         n = 49

Code

k.check(gamm_holders_1$gam)

             k'      edf   k-index p-value
s(month)     10 8.317928 1.0296393  0.5375
s(time)       9 6.636259 0.3633848  0.0000
s(new_cases)  9 4.308354 0.8349647  0.1025
s(inflation)  9 1.000011 0.9344576  0.3050

Code

appraise(gamm_holders_1$gam,
         method      = "simulate",
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_holders_1$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.99	0.55	0.51
s(time)	0	0.99	1.00	0.94	0.98
s(new_cases)	0	0.55	0.94	1.00	0.63
s(inflation)	0	0.51	0.98	0.63	1.00

	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.07	0.27	0.00
s(time)	0	0.73	1.00	0.70	0.97
s(new_cases)	0	0.33	0.61	1.00	0.30
s(inflation)	0	0.41	0.21	0.18	1.00

	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.05	0.23	0.01
s(time)	0	0.21	1.00	0.60	0.95
s(new_cases)	0	0.21	0.53	1.00	0.28
s(inflation)	0	0.16	0.32	0.39	1.00

Code

acf(residuals(gamm_holders_1$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_holders_1$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Smooths

The first model found associations only with the temporal predictors.

Code

fig1 <- draw(
  gamm_holders_1$gam, residuals = TRUE, select = 1)
fig1 <-  ggplotly(fig1)
fig2 <- draw(
  gamm_holders_1$gam, residuals = TRUE, select = 2)
fig2 <- ggplotly(fig2) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig3 <- draw(
  gamm_holders_1$gam, residuals = TRUE, select = 3)
fig3 <- ggplotly(fig3)
fig4 <- draw(
  gamm_holders_1$gam, residuals = TRUE, select = 4)
fig4 <- ggplotly(fig4)
subplot(fig1, fig2, fig3, fig4,
        nrows = 2,
        titleX = TRUE,
        titleY = TRUE,
        margin = 0.1) |>
  layout(title = "")

Figure 14: gamm_holders_1 partial effects.

Predictions

Code

fitted_noint <- fitted_values(
  gamm_holders_1,
  data = df_crises
  )

plot_ly(
  data = fitted_noint,
  name = "Predictions",
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " holders", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  )  |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~holders,
    line = list(color = "#66c2a5",
                dash = "full"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$holders, big.mark = " ", scientific = FALSE), " holders"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "Holders"
    )
  )

Figure 15: Holder count predictions based on model gamm_holders_1.

3.2.1 Holders - GAMM 2

The second model found a significant association between holder counts and the interaction of the monthly trend with new COVID-19 cases. The main effect of the monthly trend and the seasonal pattern were also significant; the remaining terms were not.

Code

gamm_holders_2 <- gamm(
  holders ~
    s(month, k = 12, bs = "cc") +
    ti(time) +
    ti(new_cases) +
    ti(time, new_cases, k = c(10, 6)) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 1,
            q    = 1),
  data = df_crises,
  family = poisson,
  method = "REML",
  control =
         nlmeControl(maxIter     = 1e8,
                     msMaxIter   = 1e8,
                     msMaxEval   = 1e8,
                     msVerbose   = FALSE,
                     optimMethod = "L-BFGS-B")
  )
saveRDS(gamm_holders_2, file = "gamm_holders_2.rds")

Results

Code

summary(gamm_holders_2$gam)


Family: poisson 
Link function: log 

Formula:
holders ~ s(month, k = 12, bs = "cc") + ti(time) + ti(new_cases) + 
    ti(time, new_cases, k = c(10, 6)) + ukraine

Parametric coefficients:
             Estimate Std. Error  t value Pr(>|t|)    
(Intercept) 13.430123   0.001534 8752.841   <2e-16 ***
ukraineWar  -0.002079   0.002149   -0.967    0.343    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                     edf Ref.df       F p-value    
s(month)           9.383 10.000  14.180 < 2e-16 ***
ti(time)           3.775  3.775 102.313 < 2e-16 ***
ti(new_cases)      1.004  1.004   0.515 0.48271    
ti(time,new_cases) 7.561  7.561   3.847 0.00356 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.992   
  Scale est. = 1         n = 49

Code

appraise(gamm_holders_2$gam,
         method      = "simulate",
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_holders_2$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1	0.00	0.00	0.00	1
s(month)	0	1.00	0.10	0.45	1
ti(time)	0	0.10	1.00	0.63	1
ti(new_cases)	0	0.45	0.63	1.00	1
ti(time,new_cases)	1	1.00	1.00	1.00	1

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1	0.00	0.00	0.00	0.04
s(month)	0	1.00	0.06	0.27	0.09
ti(time)	0	0.02	1.00	0.29	0.41
ti(new_cases)	0	0.11	0.56	1.00	0.47
ti(time,new_cases)	1	0.99	1.00	1.00	1.00

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1	0.00	0.00	0.00	0.05
s(month)	0	1.00	0.02	0.27	0.14
ti(time)	0	0.02	1.00	0.25	0.14
ti(new_cases)	0	0.08	0.30	1.00	0.17
ti(time,new_cases)	1	0.90	1.00	1.00	1.00

Code

acf(residuals(gamm_holders_2$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_holders_2$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Code

fig <- draw(
  gamm_holders_2$gam, residuals = TRUE, select = 1)
ggplotly(fig)
fig <- draw(
  gamm_holders_2$gam, residuals = TRUE, select = 2)
ggplotly(fig) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig <- draw(
  gamm_holders_2$gam, residuals = TRUE, select = 3)
ggplotly(fig)

(a) Seasonal pattern.

(b) Monthly trend.

Figure 16: gamm_holders_2 partial effects.

In Figure 16 above, the smooth for the seasonal pattern once again estimates a double-peaked association with a clear dip during summer, although the autumn peak does not seem to differ from zero (sub-figure a). The monthly trend shows an overall increase, with a slowdown somewhere between November 2021 and January 2023 (sub-figure b). The smooth for new COVID-19 cases was not significant.

Code

gam_smooth <-
  smooth_estimates(gamm_holders_2$gam) |>
  add_confint() |>
  filter(.smooth == "ti(time,new_cases)")
gam_smooth <- get_ci_areas(gam_smooth, "ti", "time", "new_cases")

Code

plot_ly(
  data = gam_smooth,
  type = "contour",
  x    = ~time,
  y    = ~new_cases,
  z    = ~.estimate,
  contours = list(
    coloring = "heatmap",
    showlabels = TRUE
  ),
  colors   = "RdBu",
  reversescale = TRUE,
  lines     = list(color = "black"),
  hoverinfo = "text",
  text      = paste0(
    "<b>New cases</b>: ",
    round(gam_smooth$new_cases, 1),
    "<br>",
    "<b>Estimate</b>: ",
    round(gam_smooth$.estimate, 4),
    "<br>",
    "<b>95% CI</b> (",
    round(gam_smooth$.lower_ci, 4),
    ", ",
    round(gam_smooth$.upper_ci, 4),
    ")"
    )
  ) |>
  add_trace(
    name   = "Historical",
    x      = ~ df_crises$time,
    y      = ~ df_crises$new_cases,
    type   = "scatter",
    mode   = "lines",
    line   = list(color = "black"),
    hoverinfo = "text",
    text = paste0(
      "<b>Month</b>: ",
      round(df_crises$month, 0),
      " (", df_crises$year, ")",
      "<br>",
      "<b>New cases</b>: ",
      round(df_crises$new_cases, 1)
      )
    ) |>
  add_trace(
    name   = "Positive CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_pos,
    y      = ~ gam_smooth$var2_sig_pos,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-up-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  add_trace(
    name   = "Negative CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_neg,
    y      = ~ gam_smooth$var2_sig_neg,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-down-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  layout(
    title = "Associations with holder count",
    xaxis = list(title    = "Time",
                 tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext),
    yaxis = list(title = "New cases")
    ) |>
  colorbar(title = "Estimate")

Figure 17: Holder count association with the interaction between monthly temporal trend and new COVID-19 cases.

The interaction term (Figure 17 above) estimates a positive association around March 2022, right after the peak of new COVID-19 cases. The smooth also estimates two areas with negative associations. The first one is located between October and December 2020, the second one between May and August 2022.
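The "Positive CI" and "Negative CI" markers in the contour plot rest on a simple rule: a grid point is flagged when its 95% confidence interval excludes zero. The helper get_ci_areas() is defined elsewhere in the report and is not reproduced here; the function below, flag_ci_areas(), is a hypothetical stand-in that only sketches one plausible way to apply that rule, and the three-point grid is made up for illustration.

```r
# Hypothetical helper, NOT the report's get_ci_areas(): it only illustrates
# the rule "the 95% CI excludes zero".
flag_ci_areas <- function(smooth_df, var1, var2) {
  pos <- smooth_df$.lower_ci > 0   # whole interval above zero
  neg <- smooth_df$.upper_ci < 0   # whole interval below zero
  # Keep coordinates only where the flag holds; NA elsewhere, so that
  # plotting functions skip those grid points.
  smooth_df$var1_sig_pos <- ifelse(pos, smooth_df[[var1]], NA)
  smooth_df$var2_sig_pos <- ifelse(pos, smooth_df[[var2]], NA)
  smooth_df$var1_sig_neg <- ifelse(neg, smooth_df[[var1]], NA)
  smooth_df$var2_sig_neg <- ifelse(neg, smooth_df[[var2]], NA)
  smooth_df
}

# Made-up grid of three points, for illustration only.
toy <- data.frame(
  time      = c(1, 2, 3),
  new_cases = c(10, 20, 30),
  .lower_ci = c(0.1, -0.2, -0.5),
  .upper_ci = c(0.4,  0.3, -0.1)
)
flagged <- flag_ci_areas(toy, "time", "new_cases")
flagged
```

In this toy grid, only the first point is flagged as positive and only the third as negative; the second point's interval straddles zero and is left unmarked.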

Below, Figure 18 shows that when the interaction term is included, the model predicts 2 134 more holders in March 2022 than when it is excluded. Note that Figure 17 also suggests two areas of negative association.

Code

fitted_noint <- fitted_values(
  gamm_holders_2,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(new_cases)",
    "ti(time)",
    "s(month)",
    "ukraine"
    )
  )
fitted_int <- fitted_values(
  gamm_holders_2,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(new_cases)",
    "ti(time)",
    "ti(time,new_cases)",
    "s(month)",
    "ukraine"
    )
  )

plot_ly(
  name = "No interaction",
  data = fitted_noint,
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " holders", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
  )
  ) |>
  add_ribbons(
    name = "No interaction CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "All",
    x = ~ts,
    y = ~fitted_int$.fitted,
    line = list(color = "#8da0cb",
                dash = "dot"),
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      format(fitted_int$.fitted, big.mark = " ", scientific = FALSE),
      " holders", "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "All CI",
    x = ~ts,
    ymin = fitted_int$.lower_ci,
    ymax = fitted_int$.upper_ci,
    line = list(color = "#8da0cb"),
    fillcolor = "#8da0cb",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~holders,
    line = list(color = "#66c2a5",
                dash = "full"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$holders, big.mark = " ", scientific = FALSE), " holders"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "Holders"
    )
  )

Figure 18: Holder count predictions based on model gamm_holders_2.
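What it means to predict with and without the interaction term can be illustrated on simulated data. The sketch below is not the report's model: the data, the variable names x, z, and y, and the Gaussian family are assumptions made for brevity. It relies on the fact that the linear predictor of an additive model is the intercept plus the sum of the per-term contributions, so leaving out the ti(x,z) column yields the no-interaction prediction.

```r
library(mgcv)

# Simulated data with a genuine interaction between x and z (assumption).
set.seed(1)
n   <- 200
sim <- data.frame(x = runif(n), z = runif(n))
sim$y <- sin(2 * pi * sim$x) + sim$z^2 + 2 * sim$x * sim$z +
  rnorm(n, sd = 0.2)

# Main effects and the pure interaction, in the same ti() layout as above.
m <- gam(y ~ ti(x) + ti(z) + ti(x, z), data = sim, method = "REML")

# Per-term contributions on the link scale; the intercept is stored
# separately in the "constant" attribute.
contrib   <- predict(m, type = "terms")
intercept <- attr(contrib, "constant")

pred_all   <- intercept + rowSums(contrib)                         # all terms
pred_noint <- intercept + rowSums(contrib[, c("ti(x)", "ti(z)")])  # no interaction

# What the interaction adds to each prediction.
gain <- pred_all - pred_noint
summary(gain)
```

With a log link, as in the Poisson holder models, these sums live on the log scale, so both predictions have to be exponentiated before a difference in counts is taken.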

3.2.2 Holders - GAMM 3

Lastly, the third model found significant associations between the holder count and the monthly trend, the seasonal pattern, and the interaction between the monthly trend and the inflation rate.

Code

gamm_holders_3 <- gamm(
  holders ~
    s(month, k = 12, bs = "cc") + 
    ti(time) +
    ti(inflation) +
    ti(time, inflation, k = c(10, 5)) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 2,
            q    = 1),
  data = df_crises,
  family = poisson,
  method = "REML",
  control =
         nlmeControl(maxIter     = 1e8,
                     msMaxIter   = 1e8,
                     msMaxEval   = 1e8,
                     msVerbose   = FALSE,
                     optimMethod = "L-BFGS-B")
  )
saveRDS(gamm_holders_3, file = "gamm_holders_3.rds")

Code

gamm_holders_3 <- readRDS(file = "gamm_holders_3.rds")

Results

Code

summary(gamm_holders_3$gam)


Family: poisson 
Link function: log 

Formula:
holders ~ s(month, k = 12, bs = "cc") + ti(time) + ti(inflation) + 
    ti(time, inflation, k = c(10, 5)) + ukraine

Parametric coefficients:
             Estimate Std. Error   t value Pr(>|t|)    
(Intercept) 1.343e+01  6.696e-04 20052.818   <2e-16 ***
ukraineWar  1.312e-03  1.018e-03     1.288    0.207    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                     edf Ref.df       F  p-value    
s(month)           8.977 10.000  12.413  < 2e-16 ***
ti(time)           3.907  3.907 646.238  < 2e-16 ***
ti(inflation)      1.002  1.002   0.047 0.830512    
ti(time,inflation) 2.094  2.094  11.023 0.000154 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.991   
  Scale est. = 1         n = 49

Code

appraise(gamm_holders_3$gam,
         method      = "simulate",
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_holders_3$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1	0.00	0.00	0.00	1
s(month)	0	1.00	0.10	0.39	1
ti(time)	0	0.10	1.00	0.97	1
ti(inflation)	0	0.39	0.97	1.00	1
ti(time,inflation)	1	1.00	1.00	1.00	1

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1	0.00	0.00	0.00	0.01
s(month)	0	1.00	0.06	0.00	0.08
ti(time)	0	0.02	1.00	0.97	0.87
ti(inflation)	0	0.33	0.18	1.00	0.21
ti(time,inflation)	1	0.96	1.00	1.00	1.00

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1	0.00	0.00	0.00	0.10
s(month)	0	1.00	0.02	0.14	0.14
ti(time)	0	0.02	1.00	0.47	0.28
ti(inflation)	0	0.07	0.51	1.00	0.22
ti(time,inflation)	1	0.70	1.00	1.00	1.00

Code

acf(residuals(gamm_holders_3$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_holders_3$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Code

fig <- draw(
  gamm_holders_3$gam, residuals = TRUE, select = 1)
ggplotly(fig)
fig <- draw(
  gamm_holders_3$gam, residuals = TRUE, select = 2)
ggplotly(fig) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig <- draw(
  gamm_holders_3$gam, residuals = TRUE, select = 3)
ggplotly(fig)

(a) Seasonal pattern.

(b) Monthly trend.

Figure 19: gamm_holders_3 partial effects.

In Figure 19 above, we can observe the usual temporal smooths. In Figure 20 below, the model estimates two positive and two negative areas. With the inflation rate below 8.3% and until mid-2022, the relationship is positive. Then, when inflation surges above 12.3%, the relationship becomes negative until September 2022. However, after the inflation rate peaks, the relationship suddenly turns positive again and remains different from zero above 12.3% inflation rate. Then, upon falling below 8.3% inflation rate, the relationship turns negative for the rest of the tested period.

Code

gam_smooth <-
  smooth_estimates(gamm_holders_3$gam) |>
  add_confint() |>
  filter(.smooth == "ti(time,inflation)")
gam_smooth <- get_ci_areas(gam_smooth, "ti", "time", "inflation")

Code

plot_ly(
  data = gam_smooth,
  type = "contour",
  x    = ~time,
  y    = ~inflation,
  z    = ~.estimate,
  contours = list(
    coloring = "heatmap",
    showlabels = TRUE
  ),
  colors   = "RdBu",
  reversescale = TRUE,
  lines     = list(color = "black"),
  hoverinfo = "text",
  text      = paste0(
    "<b>Inflation</b>: ",
    round(gam_smooth$inflation, 1),
    "<br>",
    "<b>Estimate</b>: ",
    round(gam_smooth$.estimate, 4),
    "<br>",
    "<b>95% CI</b> (",
    round(gam_smooth$.lower_ci, 4),
    ", ",
    round(gam_smooth$.upper_ci, 4),
    ")"
    )
  ) |>
  add_trace(
    name   = "Historical",
    x      = ~ df_crises$time,
    y      = ~ df_crises$inflation,
    type   = "scatter",
    mode   = "lines",
    line   = list(color = "black"),
    hoverinfo = "text",
    text = paste0(
      "<b>Month</b>: ",
      round(df_crises$month, 0),
      " (", df_crises$year, ")",
      "<br>",
      "<b>Inflation</b>: ",
      round(df_crises$inflation, 1)
      )
    ) |>
  add_trace(
    name   = "Positive CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_pos,
    y      = ~ gam_smooth$var2_sig_pos,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-up-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  add_trace(
    name   = "Negative CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_neg,
    y      = ~ gam_smooth$var2_sig_neg,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-down-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  layout(
    title = "Associations with holder count",
    xaxis = list(title    = "Time",
                 tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext),
    yaxis = list(title = "Inflation")
    ) |>
  colorbar(title = "Estimate",
           limits = c(-max(abs(gam_smooth$.estimate)),
                      max(abs(gam_smooth$.estimate))))

Figure 20: Holder count association with inflation rate in interaction with the monthly temporal trend.

In Figure 21 below, predictions are plotted.

Code

fitted_noint <- fitted_values(
  gamm_holders_3,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(inflation)",
    "ti(time)",
    "s(month)",
    "ukraine"
    )
  )
fitted_int <- fitted_values(
  gamm_holders_3,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(inflation)",
    "ti(time)",
    "ti(time,inflation)",
    "s(month)",
    "ukraine"
    )
  )

plot_ly(
  name = "No interaction",
  data = fitted_noint,
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " holders", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
  )
  ) |>
  add_ribbons(
    name = "No interaction CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "All",
    x = ~ts,
    y = ~fitted_int$.fitted,
    line = list(color = "#8da0cb",
                dash = "dot"),
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      format(fitted_int$.fitted, big.mark = " ", scientific = FALSE),
      " holders", "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "All CI",
    x = ~ts,
    ymin = fitted_int$.lower_ci,
    ymax = fitted_int$.upper_ci,
    line = list(color = "#8da0cb"),
    fillcolor = "#8da0cb",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~holders,
    line = list(color = "#66c2a5",
                dash = "full"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$holders, big.mark = " ", scientific = FALSE), " holders"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "Holders"
    )
  )

Figure 21: Holder count predictions based on model gamm_holders_3.

AIC & BIC

Code

AIC(gamm_holders_1$lme,
    gamm_holders_2$lme,
    gamm_holders_3$lme)

                   df       AIC
gamm_holders_1$lme 13 -472.4934
gamm_holders_2$lme 13 -481.8192
gamm_holders_3$lme 14 -478.3854

Code

BIC(gamm_holders_1$lme,
    gamm_holders_2$lme,
    gamm_holders_3$lme)

                   df       BIC
gamm_holders_1$lme 13 -447.8997
gamm_holders_2$lme 13 -457.2256
gamm_holders_3$lme 14 -451.9000
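To make the two tables easier to read, the reported values can be turned into differences from the best model and, for AIC, into Akaike weights. This is only a sketch that takes the printed values at face value; the numbers are copied from the output above and the object names aic, bic, and akaike_w are mine.

```r
# AIC and BIC values copied from the tables above.
aic <- c(gamm_holders_1 = -472.4934,
         gamm_holders_2 = -481.8192,
         gamm_holders_3 = -478.3854)
bic <- c(gamm_holders_1 = -447.8997,
         gamm_holders_2 = -457.2256,
         gamm_holders_3 = -451.9000)

# Difference of each model from the best (lowest) value.
delta_aic <- aic - min(aic)
delta_bic <- bic - min(bic)

# Akaike weights: relative likelihoods normalised to sum to one.
akaike_w <- exp(-0.5 * delta_aic) / sum(exp(-0.5 * delta_aic))

round(cbind(delta_aic, delta_bic, akaike_w), 3)
```

Both criteria are lowest for gamm_holders_2, the model with the interaction between the monthly trend and new COVID-19 cases.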

3.2.3 Discussion: Holders and 2020s crises

Regarding the holder counts, the models estimated associations that support the proposed hypotheses but also paint a more complex picture than envisioned: as in the domain models, they suggest associations going in opposite directions. Unlike in the domain models, however, the main-effect smooths for new COVID-19 cases and the inflation rate were not significant.

When inspecting model gamm_holders_2 (Section 3.2.1), the interaction term between the monthly trend and new COVID-19 cases supports hypothesis H2 (Transmission of COVID-19 positively associates with the count of Czech holders.) around March 2022, one month after the peak of the second COVID-19 wave. The very same interaction term, however, also predicts a decrease in Czech holders at the beginning of the first wave. In sum, the association between the holder count and the intensity of COVID-19 transmission seems to depend on time: the results suggest a dip in holder counts at the beginning of the first wave but an increase right after the peak of the second wave. As with the domain counts, austerity, precaution, and confusion during the first wave and necessity or preparedness during the second wave might be the story behind these changes, although this remains speculation.

Model gamm_holders_3 (Section 3.2.2) yields somewhat perplexing results concerning hypothesis H5 (Inflation negatively associates with the count of Czech domain holders.). The interaction term between the monthly trend and the inflation rate suggests associations that not only go in both directions but also reverse direction at low and high inflation rates. Following the historically observed inflation values, the results suggest that when the inflation rate was low at the beginning of the investigated period, we should expect more domain holders (implicitly supporting H5). Then, once the inflation rate rises above some 12.3%, the association becomes negative, suggesting that high inflation associates negatively with holder counts (supporting H5). However, the association suddenly becomes positive in the middle of the inflation peak, contrary to H5. Furthermore, as inflation gradually decreases, the positive association becomes uncertain and turns negative once the rate falls below 8.3%. Both reversals are hard to interpret. In sum, while the period of rising inflation seems to support hypothesis H5, the period of falling inflation suggests the exact opposite. Why the associations were estimated inversely while inflation was falling remains unclear.

No support was found for the hypothesis H8 (The war in Ukraine positively associates with the count of Czech domain holders.).
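To give a sense of the size of the estimated war term, its coefficients can be moved from the log scale of the Poisson models to percentage changes in holder counts. The sketch below copies the estimates and standard errors from the three summary() outputs above; the normal-approximation 95% interval is my simplifying assumption and not the t-based test reported in the summaries.

```r
# Estimates and standard errors of the ukraineWar term, copied from the
# three summary() outputs of the holder models (log scale).
ukraine <- data.frame(
  model    = c("gamm_holders_1", "gamm_holders_2", "gamm_holders_3"),
  estimate = c(0.003206, -0.002079, 0.001312),
  se       = c(0.002465,  0.002149, 0.001018)
)

# Assumption: normal-approximation 95% interval (estimate +/- 1.96 * SE).
z <- qnorm(0.975)
ukraine$lower <- ukraine$estimate - z * ukraine$se
ukraine$upper <- ukraine$estimate + z * ukraine$se

# exp() turns a log-scale difference into a multiplicative change;
# subtracting 1 and multiplying by 100 gives a percentage change.
pct <- function(b) 100 * (exp(b) - 1)
ukraine$pct_change <- pct(ukraine$estimate)
ukraine$pct_lower  <- pct(ukraine$lower)
ukraine$pct_upper  <- pct(ukraine$upper)

ukraine[, c("model", "pct_change", "pct_lower", "pct_upper")]
```

In all three models the point estimate amounts to less than half a percent of the holder count, and the approximate interval includes zero.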

Lastly, the count of holders was estimated to vary seasonally, following the typical double-peaked shape. The count of holders was also estimated to gradually increase over time, with a period of slowdown around the year 2022.

3.3 Traffic

Finally, I close the modelling with three models estimating traffic in the .cz domain. I fully comment only on the second model, as the results of the first one are not too interesting and the third model’s results seem confusing.

Traffic - GAMM 1

Code

gamm_qps_1 <- gamm(
  qps ~
    s(month, k = 12, bs = "cc") + 
    s(time) +
    s(new_cases) +
    s(inflation) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 1,
            q    = 0),
  data = df_crises,
  family = gaussian,
  method = "REML"
  )
saveRDS(gamm_qps_1, file = "gamm_qps_1.rds")

Code

gamm_qps_1 <- readRDS(file = "gamm_qps_1.rds")

Results

Code

summary(gamm_qps_1$gam)


Family: gaussian 
Link function: identity 

Formula:
qps ~ s(month, k = 12, bs = "cc") + s(time) + s(new_cases) + 
    s(inflation) + ukraine

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  17080.0      561.6  30.413   <2e-16 ***
ukraineWar     762.0      854.3   0.892    0.377    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                   edf Ref.df      F  p-value    
s(month)     3.665e-06 10.000  0.000   0.3866    
s(time)      1.143e+00  1.143 26.372 4.08e-06 ***
s(new_cases) 1.000e+00  1.000  0.206   0.6520    
s(inflation) 1.578e+00  1.578  3.231   0.0377 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.903   
  Scale est. = 5.287e+05  n = 49

Code

appraise(gamm_qps_1$gam,
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_qps_1$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.99	0.55	0.51
s(time)	0	0.99	1.00	0.94	0.98
s(new_cases)	0	0.55	0.94	1.00	0.63
s(inflation)	0	0.51	0.98	0.63	1.00

	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.04	0.27	0.00
s(time)	0	0.99	1.00	0.76	0.97
s(new_cases)	0	0.37	0.55	1.00	0.30
s(inflation)	0	0.21	0.28	0.31	1.00

	para	s(month)	s(time)	s(new_cases)	s(inflation)
para	1	0.00	0.00	0.00	0.00
s(month)	0	1.00	0.05	0.23	0.01
s(time)	0	0.21	1.00	0.60	0.95
s(new_cases)	0	0.21	0.53	1.00	0.28
s(inflation)	0	0.16	0.32	0.39	1.00

Code

acf(residuals(gamm_qps_1$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_qps_1$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Smooths

Code

fig1 <- draw(
  gamm_qps_1$gam, residuals = TRUE, select = 1)
fig1 <- ggplotly(fig1)
fig2 <- draw(
  gamm_qps_1$gam, residuals = TRUE, select = 2)
fig2 <- ggplotly(fig2) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig3 <- draw(
  gamm_qps_1$gam, residuals = TRUE, select = 3)
fig3 <- ggplotly(fig3)
fig4 <- draw(
  gamm_qps_1$gam, residuals = TRUE, select = 4)
fig4 <- ggplotly(fig4)
subplot(fig1, fig2, fig3, fig4,
        nrows = 2,
        titleX = TRUE,
        titleY = TRUE,
        margin = 0.1) |>
  layout(title = "")

Figure 22: gamm_qps_1 partial effects.

Predictions

Code

fitted_noint <- fitted_values(
  gamm_qps_1,
  data = df_crises
  )

plot_ly(
  data = fitted_noint,
  name = "Predictions",
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " QPS", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  )  |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~qps,
    line = list(color = "#66c2a5",
                dash = "full"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$qps, big.mark = " ", scientific = FALSE), " QPS"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "QPS"
    )
  )

Figure 23: Traffic predictions based on model gamm_qps_1.

3.3.1 Traffic - GAMM 2

Model gamm_qps_2 estimates traffic by a seasonal pattern (months within a year), an overall monthly trend, (thousands of) new COVID-19 cases, whether the Russo-Ukrainian war was ongoing, and by an interaction term between the monthly temporal trend and new COVID-19 cases. As before, I also specified a varying intercept for the respective years and a correlation structure accounting for the autocorrelation of observations.

Code

gamm_qps_2 <- gamm(
  qps ~
    s(month, k = 12, bs = "cc") +
    ti(time) +
    ti(new_cases) +
    ti(time, new_cases, k = c(10, 5)) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 2,
            q    = 2),
  data = df_crises,
  family = gaussian,
  method = "REML",
  control =
    nlmeControl(maxIter     = 1e8,
                msMaxIter   = 1e8,
                msMaxEval   = 1e8,
                msVerbose   = FALSE,
                optimMethod = "L-BFGS-B")
  )
saveRDS(gamm_qps_2, file = "gamm_qps_2.rds")

Code

gamm_qps_2 <- readRDS(file = "gamm_qps_2.rds")

Results

Code

summary(gamm_qps_2$gam)


Family: gaussian 
Link function: identity 

Formula:
qps ~ s(month, k = 12, bs = "cc") + ti(time) + ti(new_cases) + 
    ti(time, new_cases, k = c(10, 5)) + ukraine

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 16945.49     172.94  97.983   <2e-16 ***
ukraineWar     72.13     236.03   0.306    0.761    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                      edf  Ref.df      F  p-value    
s(month)           1.6250 10.0000  0.436   0.0655 .  
ti(time)           2.5848  2.5848 64.948  < 2e-16 ***
ti(new_cases)      0.9998  0.9998 31.046 2.37e-06 ***
ti(time,new_cases) 0.9991  0.9991 25.937 8.54e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.897   
  Scale est. = 5.3527e+05  n = 49

Code

appraise(gamm_qps_2$gam,
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_qps_2$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

Worst case

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1.00	0.00	0.00	0.00	0.99
s(month)	0.00	1.00	0.10	0.45	1.00
ti(time)	0.00	0.10	1.00	0.63	1.00
ti(new_cases)	0.00	0.45	0.63	1.00	1.00
ti(time,new_cases)	0.99	1.00	1.00	1.00	1.00

Observed

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1.00	0.00	0.00	0.00	0.34
s(month)	0.00	1.00	0.04	0.27	0.16
ti(time)	0.00	0.06	1.00	0.29	0.09
ti(new_cases)	0.00	0.38	0.59	1.00	0.46
ti(time,new_cases)	0.99	0.95	0.99	1.00	1.00

Estimate

	para	s(month)	ti(time)	ti(new_cases)	ti(time,new_cases)
para	1.00	0.00	0.00	0.00	0.07
s(month)	0.00	1.00	0.02	0.27	0.15
ti(time)	0.00	0.02	1.00	0.25	0.19
ti(new_cases)	0.00	0.08	0.30	1.00	0.20
ti(time,new_cases)	0.99	0.70	0.99	1.00	1.00

Code

acf(residuals(gamm_qps_2$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_qps_2$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Code

fig <- draw(
  gamm_qps_2$gam, residuals = TRUE, select = 1)
ggplotly(fig)
fig <- draw(
  gamm_qps_2$gam, residuals = TRUE, select = 2)
ggplotly(fig) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig <- draw(
  gamm_qps_2$gam, residuals = TRUE, select = 3)
ggplotly(fig)

(a) Seasonal pattern.

(b) Monthly trend.

Figure 24: gamm_qps_2 partial effects.

Monthly QPS averages do not seem to follow any seasonal pattern: although the smooth wiggles somewhat in the typical shape, its estimate remains uncertain throughout the entire year. QPS was nevertheless estimated to increase with the monthly trend, with the positive association gradually strengthening. For the main effect of new COVID-19 cases, the model estimated a diminishing positive association below some 97 thousand cases and an increasingly negative association above that level (although fewer observations lie there, making the estimates increasingly less certain). The term for the Russo-Ukrainian war was not statistically significant (p = 0.761).

Code

gam_smooth <-
  smooth_estimates(gamm_qps_2$gam) |>
  add_confint() |>
  filter(.smooth == "ti(time,new_cases)")
gam_smooth <- get_ci_areas(gam_smooth, "ti", "time", "new_cases")

Code

plot_ly(
  data = gam_smooth,
  type = "contour",
  x    = ~time,
  y    = ~new_cases,
  z    = ~.estimate,
  colors   = "RdBu",
  reversescale = TRUE,
  contours = list(
    coloring = "heatmap",
    showlabels = TRUE
  ),
  lines     = list(color = "black"),
  hoverinfo = "text",
  text      = paste0(
    "<b>Time</b>: ",
    round(gam_smooth$time, 0),
    "<br>",
    "<b>New cases</b>: ",
    round(gam_smooth$new_cases, 1),
    "<br>",
    "<b>Estimate</b>: ",
    round(gam_smooth$.estimate, 4),
    "<br>",
    "<b>95% CI</b> (",
    round(gam_smooth$.lower_ci, 4),
    ", ",
    round(gam_smooth$.upper_ci, 4),
    ")"
    )
  ) |>
  add_trace(
    name   = "Historical",
    x      = ~ df_crises$time,
    y      = ~ df_crises$new_cases,
    type   = "scatter",
    mode   = "lines",
    line   = list(color = "black"),
    hoverinfo = "text",
    text = paste0(
      "<b>Month</b>: ",
      round(df_crises$month, 0),
      " (", df_crises$year, ")",
      "<br>",
      "<b>New cases</b>: ",
      round(df_crises$new_cases, 1)
      )
    ) |>
  add_trace(
    name   = "Positive CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_pos,
    y      = ~ gam_smooth$var2_sig_pos,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-up-open")#,
    #visible = "legendonly"
    ) |>
  add_trace(
    name   = "Negative CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_neg,
    y      = ~ gam_smooth$var2_sig_neg,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-down-open")#,
    #visible = "legendonly"
    ) |>
  layout(
    title = "Associations with QPS",
    xaxis = list(title    = "Time",
                 tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext),
    yaxis = list(title = "New cases")
    ) |>
  colorbar(title = "Estimate")

Figure 25: Traffic’s association with the interaction between monthly temporal trend and new COVID-19 cases.

In Figure 25 above, the estimated associations between QPS and the interaction of the monthly trend with new COVID-19 cases are significant across the entire surface of the plot. Notably, the smooth estimates the largest positive associations precisely at the peaks of the COVID-19 pandemic (January 2021 and January and February 2022). The interaction term also estimates a negative relationship for the period between September 2020 and September 2022 whenever new COVID-19 cases stay below 90 thousand, which is where the historical observations from the summer seasonal dips lie.

In Figure 26 below, we can see that the model predicts some 2000 more QPS during both the first and the second wave when the interaction term is included.

Code

fitted_noint <- fitted_values(
  gamm_qps_2,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(new_cases)",
    "ti(time)",
    "s(month)",
    "ukraine"
    )
  )
fitted_int <- fitted_values(
  gamm_qps_2,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(new_cases)",
    "ti(time)",
    "ti(time,new_cases)",
    "s(month)",
    "ukraine"
    )
  )

plot_ly(
  name = "No interaction",
  data = fitted_noint,
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " QPS", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
  )
  ) |>
  add_ribbons(
    name = "No interaction CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "All",
    x = ~ts,
    y = ~fitted_int$.fitted,
    line = list(color = "#8da0cb",
                dash = "dot"),
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      format(fitted_int$.fitted, big.mark = " ", scientific = FALSE),
      " QPS", "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "All CI",
    x = ~ts,
    ymin = fitted_int$.lower_ci,
    ymax = fitted_int$.upper_ci,
    line = list(color = "#8da0cb"),
    fillcolor = "#8da0cb",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~qps,
    line = list(color = "#66c2a5",
                dash = "solid"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$qps, big.mark = " ", scientific = FALSE), " QPS"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "QPS"
    )
  )

Figure 26: Traffic (QPS) predictions based on model gamm_qps_2.
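
The gap between the "All" and "No interaction" lines in Figure 26 is the contribution of the interaction smooth alone. A minimal sketch of that decomposition on simulated data follows; the data frame `d`, its variables, and the model `m` are hypothetical and unrelated to df_crises, and the sketch uses `predict()` with the `exclude` argument from mgcv instead of the `fitted_values()` calls above.

```r
library(mgcv)

# Simulated data; hypothetical, not the report's df_crises
set.seed(1)
n <- 200
d <- data.frame(time = runif(n), cases = runif(n))
d$y <- sin(2 * pi * d$time) + d$cases + 2 * d$time * d$cases +
  rnorm(n, sd = 0.2)

# Main effects plus a pure interaction, mirroring the ti() decomposition
m <- gam(y ~ ti(time) + ti(cases) + ti(time, cases),
         data = d, method = "REML")

# Predictions with all terms, and with the interaction smooth set to zero
pred_all   <- as.numeric(predict(m, newdata = d))
pred_noint <- as.numeric(predict(m, newdata = d,
                                 exclude = "ti(time,cases)"))

# Partial effect of the interaction term on its own
int_effect <- as.numeric(
  predict(m, newdata = d, type = "terms")[, "ti(time,cases)"]
)

# Gap between the two prediction lines
gap <- pred_all - pred_noint
```

In this sketch `gap` and `int_effect` agree up to numerical tolerance, which is why the difference between the two lines can be read as the estimated interaction contribution at each observation.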

Traffic - GAMM 3

Model gamm_qps_3 mirrors gamm_qps_2 but replaces new COVID-19 cases with the inflation rate: it estimates traffic by a seasonal pattern, an overall monthly trend, the inflation rate, whether the Russo-Ukrainian war was ongoing, and an interaction term between the monthly temporal trend and the inflation rate. It again includes a varying intercept for the respective years, here combined with an AR(4) correlation structure.

Code

gamm_qps_3 <- gamm(
  qps ~
    s(month, k = 12, bs = "cc") + 
    ti(time) +
    ti(inflation) +
    ti(time, inflation) +
    ukraine,
  random = list(year_f = ~ 1),
  correlation =
    corARMA(form = ~ time,
            p    = 4,
            q    = 0),
  data = df_crises,
  family = gaussian,
  method = "REML"
  )
saveRDS(gamm_qps_3, file = "gamm_qps_3.rds")

Code

gamm_qps_3 <- readRDS(file = "gamm_qps_3.rds")

Results

Code

summary(gamm_qps_3$gam)


Family: gaussian 
Link function: identity 

Formula:
qps ~ s(month, k = 12, bs = "cc") + ti(time) + ti(inflation) + 
    ti(time, inflation) + ukraine

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 17456.28     326.05  53.539   <2e-16 ***
ukraineWar     45.37     513.60   0.088     0.93    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                      edf Ref.df      F p-value    
s(month)           0.2188 10.000  0.024 0.33627    
ti(time)           1.0000  1.000 70.697 < 2e-16 ***
ti(inflation)      3.6928  3.693 15.567 < 2e-16 ***
ti(time,inflation) 1.0000  1.000  9.349 0.00392 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.927   
  Scale est. = 3.3981e+05  n = 49

Code

k.check(gamm_qps_3$gam)

                   k'       edf   k-index p-value
s(month)           10 0.2187762 1.0558366  0.5550
ti(time)            4 1.0000000 0.7800795  0.0325
ti(inflation)       4 3.6928450 0.8874125  0.1575
ti(time,inflation) 16 1.0000001 0.9837431  0.3750

Code

appraise(gamm_qps_3$gam,
         point_col   = "steelblue",
         line_col    = "black",
         point_alpha = 0.4) &
  theme_minimal()

Code

concurvity(gamm_qps_3$gam, full = FALSE) |>
  kable(
    format = "html",
    digits = 2)

Worst case

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1.00	0.00	0.00	0.00	0.98
s(month)	0.00	1.00	0.10	0.39	0.94
ti(time)	0.00	0.10	1.00	0.97	1.00
ti(inflation)	0.00	0.39	0.97	1.00	1.00
ti(time,inflation)	0.98	0.94	1.00	1.00	1.00

Observed

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1.00	0.00	0.00	0.00	0.01
s(month)	0.00	1.00	0.04	0.10	0.07
ti(time)	0.00	0.07	1.00	0.69	0.90
ti(inflation)	0.00	0.15	0.23	1.00	0.14
ti(time,inflation)	0.98	0.69	0.97	0.97	1.00

Estimate

	para	s(month)	ti(time)	ti(inflation)	ti(time,inflation)
para	1.00	0.00	0.00	0.00	0.14
s(month)	0.00	1.00	0.02	0.14	0.13
ti(time)	0.00	0.02	1.00	0.47	0.31
ti(inflation)	0.00	0.07	0.51	1.00	0.31
ti(time,inflation)	0.98	0.27	0.89	0.96	1.00

Code

acf(residuals(gamm_qps_3$lme, type = "normalized"),
    ylim = c(-1, 1),
    main = "")
pacf(residuals(gamm_qps_3$lme, type = "normalized"),
     ylim = c(-1, 1),
     main = "")

Smooths

Code

fig1 <- draw(
  gamm_qps_3$gam, residuals = TRUE, select = 1)
fig1 <- ggplotly(fig1)
fig2 <- draw(
  gamm_qps_3$gam, residuals = TRUE, select = 2)
fig2 <- ggplotly(fig2) |>
  layout(
    xaxis = list(tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext)
  )
fig3 <- draw(
  gamm_qps_3$gam, residuals = TRUE, select = 3)
fig3 <- ggplotly(fig3)

subplot(fig1, fig2, fig3,
        titleX = TRUE,
        titleY = TRUE,
        margin = 0.05) |>
  layout(title = "")

Figure 27: gamm_qps_3 partial effects.

Code

gam_smooth <-
  smooth_estimates(gamm_qps_3$gam) |>
  add_confint() |>
  filter(.smooth == "ti(time,inflation)")
gam_smooth <- get_ci_areas(gam_smooth, "ti", "time", "inflation")

Code

plot_ly(
  data = gam_smooth,
  type = "contour",
  x    = ~time,
  y    = ~inflation,
  z    = ~.estimate,
  colors   = "RdBu",
  reversescale = TRUE,
  contours = list(
    coloring = "heatmap",
    showlabels = TRUE
  ),
  lines     = list(color = "black"),
  hoverinfo = "text",
  text      = paste0(
    "<b>Inflation</b>: ",
    round(gam_smooth$inflation, 1),
    "<br>",
    "<b>Estimate</b>: ",
    round(gam_smooth$.estimate, 4),
    "<br>",
    "<b>95% CI</b> (",
    round(gam_smooth$.lower_ci, 4),
    ", ",
    round(gam_smooth$.upper_ci, 4),
    ")"
    )
  ) |>
  add_trace(
    name   = "Historical",
    x      = ~ df_crises$time,
    y      = ~ df_crises$inflation,
    type   = "scatter",
    mode   = "lines",
    line   = list(color = "black"),
    hoverinfo = "text",
    text = paste0(
      "<b>Month</b>: ",
      round(df_crises$month, 0),
      " (", df_crises$year, ")",
      "<br>",
      "<b>Inflation</b>: ",
      round(df_crises$inflation, 1)
      )
    ) |>
  add_trace(
    name   = "Positive CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_pos,
    y      = ~ gam_smooth$var2_sig_pos,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-up-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  add_trace(
    name   = "Negative CI",
    type   = "scatter",
    mode   = "markers",
    x      = ~ gam_smooth$var1_sig_neg,
    y      = ~ gam_smooth$var2_sig_neg,
    marker = list(color   = "black",
                  opacity = 0.1,
                  symbol  = "triangle-down-open"),
    showlegend = TRUE#,
    #visible = "legendonly"
    ) |>
  layout(
    title = "Associations with QPS",
    xaxis = list(title    = "Time",
                 tickvals = ~ ticks$time,
                 ticktext = ~ ticks$ticktext),
    yaxis = list(title = "Inflation")
    ) |>
  colorbar(title = "Estimate",
           limits = c(-max(abs(gam_smooth$.estimate)),
                      max(abs(gam_smooth$.estimate))))

Figure 28: Association of QPS with inflation rate in interaction with the monthly trend.

Predictions

Code

fitted_noint <- fitted_values(
  gamm_qps_3,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(inflation)",
    "ti(time)",
    "s(month)",
    "ukraine"
    )
  )
fitted_int <- fitted_values(
  gamm_qps_3,
  data = df_crises,
  terms = c(
    "(Intercept)",
    "ti(inflation)",
    "ti(time)",
    "ti(time,inflation)",
    "s(month)",
    "ukraine"
    )
  )

plot_ly(
  name = "No interaction",
  data = fitted_noint,
  x = ~ts,
  y = ~.fitted,
  type = "scatter",
  mode = "lines",
  line = list(color = "#fc8d62",
              dash = "dot"),
  hovertemplate = paste0(
    fitted_noint$year, "-", fitted_noint$month, "<br>",
    format(fitted_noint$.fitted, big.mark = " ", scientific = FALSE),
    " QPS", "<br>",
    "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
  )
  ) |>
  add_ribbons(
    name = "No interaction CI",
    x = ~ts,
    ymin = fitted_noint$.lower_ci,
    ymax = fitted_noint$.upper_ci,
    line = list(color = "#fc8d62"),
    fillcolor = "#fc8d62",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      "95% CI [",
      format(fitted_noint$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_noint$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "All",
    x = ~ts,
    y = ~fitted_int$.fitted,
    line = list(color = "#8da0cb",
                dash = "dot"),
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      format(fitted_int$.fitted, big.mark = " ", scientific = FALSE),
      " QPS", "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
    )
  ) |>
  add_ribbons(
    name = "All CI",
    x = ~ts,
    ymin = fitted_int$.lower_ci,
    ymax = fitted_int$.upper_ci,
    line = list(color = "#8da0cb"),
    fillcolor = "#8da0cb",
    opacity = 0.05,
    hovertemplate = paste0(
      fitted_int$year, "-", fitted_int$month, "<br>",
      "95% CI [",
      format(fitted_int$.lower_ci, big.mark = " ", scientific = FALSE),
      ", ",
      format(fitted_int$.upper_ci, big.mark = " ", scientific = FALSE),
      "]"
      )
  ) |>
  add_trace(
    name = "Historical",
    x = ~ts,
    y = ~qps,
    line = list(color = "#66c2a5",
                dash = "solid"),
    hovertemplate = paste0(
      fitted_noint$year, "-", fitted_noint$month, "<br>",
      format(fitted_noint$qps, big.mark = " ", scientific = FALSE), " QPS"
    )
  ) |>
  layout(
    xaxis = list(
      title = "Date"
    ),
    yaxis = list(
      title = "QPS"
    )
  )

Figure 29: Traffic (QPS) predictions based on model gamm_qps_3.

AIC & BIC

Code

AIC(gamm_qps_1$lme,
    gamm_qps_2$lme,
    gamm_qps_3$lme)

               df      AIC
gamm_qps_1$lme 12 732.2652
gamm_qps_2$lme 16 720.7267
gamm_qps_3$lme 16 722.3500

Code

BIC(gamm_qps_1$lme,
    gamm_qps_2$lme,
    gamm_qps_3$lme)

               df      BIC
gamm_qps_1$lme 12 753.6755
gamm_qps_2$lme 16 749.2737
gamm_qps_3$lme 16 750.8970
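
To make the two tables easier to compare, each criterion can be expressed as a difference from its lowest value. The sketch below only re-uses the numbers printed above; the data frame `ic` and the columns `dAIC` and `dBIC` are hypothetical names introduced for illustration.

```r
# Values copied from the AIC() and BIC() output above
ic <- data.frame(
  model = c("gamm_qps_1", "gamm_qps_2", "gamm_qps_3"),
  AIC   = c(732.2652, 720.7267, 722.3500),
  BIC   = c(753.6755, 749.2737, 750.8970)
)

# Differences from the best (lowest) value of each criterion
ic$dAIC <- ic$AIC - min(ic$AIC)
ic$dBIC <- ic$BIC - min(ic$BIC)

# Model with the lowest value on each criterion
best_aic <- ic$model[which.min(ic$AIC)]
best_bic <- ic$model[which.min(ic$BIC)]

print(ic)
```

On both criteria gamm_qps_2 has the lowest value, with gamm_qps_3 about 1.6 points behind, while gamm_qps_1 trails by about 11.5 (AIC) and 4.4 (BIC). A gap as small as the one between gamm_qps_2 and gamm_qps_3 is commonly read as weak evidence for preferring one model over the other.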

3.3.2 Discussion: Traffic and 2020s crises

Regarding traffic in the .cz domain, the most interesting results come from model gamm_qps_2 (Section 3.3.1). Similarly to the results presented in previous sections, the model estimated positive associations for the interaction term between the monthly trend and new COVID-19 cases around the peaks of COVID-19 transmission, starting as soon as there were some 100 thousand new COVID-19 cases. Notably, this positive association appears at lower values of new cases and is far more widespread than the associations estimated for domains and holders (which were observed only at or around the second-wave peaks). Furthermore, when new COVID-19 cases fell below some 100 thousand between the waves, the association was estimated to be negative. Hypothesis H3 (Increase in the spread of COVID-19 positively associates with QPS.) thus finds support when seen through the lens of the interaction. Taken together, when looking at how new cases fluctuate over the monthly time-flow, we can once again observe that traffic increases in the periods when new COVID-19 cases increase too. In traffic’s case, these increases were observed for both waves of COVID-19, with a clear drop in and around the summer in between and again after the second wave faded away. In sum, traffic seems to have increased noticeably during both COVID-19 waves, when the transmission of COVID-19 increased too.

As for the inflation rate, the model gamm_qps_3 found associations supporting hypothesis H6 (Increase of inflation associates with QPS.). When proposing the hypothesis in the introduction, I was unsure which direction such associations would take. Unfortunately, inspecting the model results has not shed more light on the processes that may lie behind them.

No support was found for the hypothesis H9 (The war in Ukraine positively associates with QPS).

Lastly, traffic does not seem to be associated with any seasonal pattern (although the smooth wiggles slightly, resembling the typical double-peaked shape). However, the monthly trend clearly shows a gradually increasing traffic volume.

4 General discussion

The goal of this report was to better understand how the .cz domain fared amid the challenges that emerged during the first half of the 2020s. To do so, I investigated whether the spread of COVID-19, inflation rate, and the war in Ukraine associated with the count of second-level domains under .cz, the count of Czech holders of .cz domains, and traffic under .cz. I formulated three sets of hypotheses proposing that (a) higher transmission of COVID-19 would increase the domain and holder counts and QPS, (b) higher inflation rates would decrease domain and holder counts and associate with QPS, and that (c) the war in Ukraine would increase the count of domains and holders and also QPS. While the analysis found support for many of the hypotheses, the modelling results also estimated associations going contrary to the proposed directions. At first, this may sound like a paradox, but on closer inspection the results paint a more complex picture than originally envisioned.

One surprising finding is that the domain and holder counts behaved differently during the first and second COVID-19 waves. During the first wave, domain counts and in part also holder counts were unexpectedly estimated to decrease when the count of new COVID-19 cases in the Czech Republic increased. Here, the reason might be that individuals and companies were confused and ill-prepared to quickly adjust their activities and interests when facing the global pandemic for the first time. Perhaps, instead of rushing to register domains to realize interests in the online space, the shock and confusion initially led to austerity, inactivity, or cautious hesitation, consequently decreasing the willingness to register domains (similarly to slowdowns in other industries, see Czech National Bank (2020)). However, during the second wave, both the domain and holder counts increased when the count of new COVID-19 cases increased too, supporting hypotheses H1 and H2. Here, we may speculate that individuals and companies were better prepared to face the challenges of another COVID-19 wave, were forced to move their interests and activities online out of necessity, or both.

The results suggest a slightly different story for traffic, as QPS increased during both COVID-19 waves and the positive associations emerged at lower levels of COVID-19 transmission than for domain and holder counts. Furthermore, the association even became negative once the COVID-19 transmission decreased between the waves and afterwards. Together, these results support hypothesis H3. The reason why QPS increased even during the first COVID-19 wave quite possibly also lies in the sudden confusion brought by the onset of the pandemic. However, unlike for the domain and holder counts, the uncertainty and social isolation may have prompted heightened demand for answers, entertainment, or escapism in the online space, which could be satisfied instantly, resulting in increased traffic. The first-wave difference from the domain and holder count associations might then be that the crisis projected into traffic far more flexibly and immediately, while registering domains and becoming a holder may require more deliberation and additional costs.

Furthermore, I found that high inflation rates associated negatively with domain counts in the period between March 2022 and June 2023, supporting hypothesis H4. When inflation rates surged to their highest values of around 18% in July 2022, domain counts were estimated to drop by 22 thousand. Such results suggest that the domain count reflects the socio-economic situation in the Czech Republic: once the economic hardship set in sufficiently, the domain count seemed to follow these broader developments.

Inflation was found to associate with holder counts too (supporting hypothesis H5) but the size of the association was not as large as for the domain counts. Furthermore, inflation also associated with QPS; however, it is not clear what such associations might mean, if anything at all.

Lastly, the Russo-Ukrainian war did not seem to be associated with any changes in any of the modeled variables (hypotheses H7–9). One reason for such null results might be that the online dimension of the conflict mostly focuses on disseminating (dis-)information through already established domains (e.g., news websites) and social networking services suitable for reaching large audiences.

One limitation of this analysis is that the interaction terms for inflation and new COVID-19 cases with the monthly trend were not estimated within the same models but separately, because high concurvity between the predictor variables would otherwise plague the models’ estimates. Also note that the associations described in this report are correlational, not causal. While it may seem intuitive to claim that COVID-19 and inflation influenced the count of domains and holders and the volume of QPS, one should not interpret the modeling results as such. Therefore, I usually describe the results as associations without claiming causality. If we were to properly test each predictor’s causal effect, we would need a parallel universe serving as an experimental control group where COVID-19, high inflation, and the war in Ukraine did not happen. Unfortunately, we are aware only of the universe where all these crises happened.

To conclude, the analysis presents insights into how COVID-19, inflation, and the war in Ukraine associated with trends in the .cz domain. It helps to evaluate and interpret the developments in the count of domains and holders and traffic under .cz by showing that all these variables were diversely related to the changes in the transmission of COVID-19 and the rise and fall of the inflation rate in the Czech Republic.

References

Andziński, Maciej, Jiří Helebrant, Ladislav Lhotka, and Maria Quiros Segovia. 2020. “COVID-19 in the .cz Domain.” https://stats.nic.cz/adam-reports/adam/covid19-en/.

Czech National Bank. 2020. Zpráva o Inflaci II/2020. Czech National Bank. https://www.cnb.cz/export/sites/cnb/cs/menova-politika/.galleries/zpravy_o_inflaci/2020/2020_II/download/ZOI_2020_II.pdf.

———. 2022. Monetary Policy Report - Summer 2022. Monetary Policy Report. Czech National Bank. https://www.cnb.cz/export/sites/cnb/en/monetary-policy/.galleries/monetary_policy_reports/2022/summer_2022/download/mpr_2022_summer.pdf.

Feldmann, Anja, Oliver Gasser, Franziska Lichtblau, Enric Pujol, Ingmar Poese, Christoph Dietzel, Daniel Wagner, et al. 2021. “A Year in Lockdown.” Communications of the ACM 64 (7): 101–8. https://doi.org/10.1145/3465212.

Komárek, Luboš, Petr Polák, Pavla Růžičková, Michaela Ryšavá, Alexis Derviz, Martin Kábrt, and Jan Hošek. 2024. Global Economic Outlook - January 2024. Global Economic Outlook. Czech National Bank. https://www.cnb.cz/export/sites/cnb/en/monetary-policy/.galleries/geo/geo_2024/gev_2024_01_en.pdf.

“Mgcv: Mixed GAM Computation Vehicle with Automatic Smoothness Estimation.” 2000. The R Foundation. https://doi.org/10.32614/cran.package.mgcv.

Pedersen, Eric J., David L. Miller, Gavin L. Simpson, and Noam Ross. 2019. “Hierarchical Generalized Additive Models in Ecology: An Introduction with Mgcv.” PeerJ 7 (May): e6876. https://doi.org/10.7717/peerj.6876.

Priyadarshini, Ishaani, Jyotir Moy Chatterjee, R. Sujatha, Nz Jhanjhi, Ali Karime, and Mehedi Masud. 2022. “Exploring Internet Meme Activity During COVID-19 Lockdown Using Artificial Intelligence Techniques.” Applied Artificial Intelligence 36 (1): 2014218. https://doi.org/10.1080/08839514.2021.2014218.

Quiros Segovia, Maria, Maciej Andziński, and Jiří Helebrant. 2022. “Ukraine War Effects on the .cz Domain.” https://stats.nic.cz/adam-reports/other/ukrainewar-en/.

Quiros Segovia, Maria, and Dan Řezníček. 2025. “Modeling the Seasonal Pattern of Domain Counts: Associations with Air Travel and Economic Sentiment.” https://stats.nic.cz/adam-reports/adam/model_cz_diffdomains/.

R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Simpson, Gavin L. 2018a. “Modelling Palaeoecological Time Series Using Generalised Additive Models.” Frontiers in Ecology and Evolution 6 (October): 149. https://doi.org/10.3389/fevo.2018.00149.

———. 2018b. “Modelling Palaeoecological Time Series Using Generalised Additive Models.” Frontiers in Ecology and Evolution 6 (October): 149. https://doi.org/10.3389/fevo.2018.00149.

Wood, Simon N. 2017. Generalized Additive Models: An Introduction with R. Second edition. New York: Chapman and Hall/CRC.

Zanella, André Felipe, Stefania Rubrichi, Zbigniew Smoreda, and Marco Fiore. 2024. “Modeling and Understanding the Impact of COVID-19 Containment Policies on Mobile Service Consumption in French Cities.” EPJ Data Science 13 (1): 68. https://doi.org/10.1140/epjds/s13688-024-00507-9.

Footnotes

Before 2017, the .cz TLD was still observing an increasing trend in domain and holder counts.
One could argue that the Israeli conflicts with Hamas and Hezbollah could also be of focus. However, these conflicts have not created any immediate migration crises affecting the Czech Republic and are geographically distant. Thus, for simplicity, these conflicts were not modeled in this report.
The interaction term also estimated negative associations at the beginning and the end of the investigated period; however, it is dubious what these could mean.