Influences of seasonal and demographic factors on the COVID-19 pandemic dynamics

INTRODUCTION: The numbers of COVID-19 cases per capita, accumulated (CC) and daily (DCC), are important characteristics of the pandemic dynamics and indicate the effectiveness of quarantine, testing, and vaccination. They also indicate the appearance of new waves (e.g., caused by new coronavirus strains) and may be the result of various demographic and seasonal factors. OBJECTIVES: We investigate the influence of the volume and the density of population and the urbanization level on the CC values accumulated in European countries and regions of Ukraine at the end of June 2021, and the impact of seasonal factors on the DCC values by comparing their dynamics in the spring and summer of 2020 and 2021 for northern and southern regions. METHODS: The influence of demographic factors on CC values was investigated with the use of linear regression. Since DCC values are very random and demonstrate some weekly periodicity, we have used the 7-day smoothing proposed before. The second year of the pandemic allows us to compare its dynamics in the spring and the summer of 2020 with the same period in 2021 and investigate the influence of seasonal factors. We have chosen some northern countries and regions (Ukraine, EU, the UK, USA) and some countries located in the tropical zone and Southern Hemisphere (India, Brazil, South Africa and Argentina). The dynamics in these regions were compared with the global one. RESULTS: The accumulated number of cases per capita CC does not depend on the demographic factors used for analysis, although it may differ by about 4 times for different regions of Ukraine and more than 9 times for different European countries. The number of COVID-19 cases per capita registered in Ukraine is comparable with the same characteristic in other European countries but much higher than in China, South Korea and Japan. Some seasonal similarities are visible for global dynamics, EU and South Africa. Before July 2020, the southern countries demonstrated exponential growth, but northern regions showed some stabilization trends. CONCLUSION: The CC values in Europe do not show any visible dependence on the volume of population, its density and the urbanization level. More or less similar seasonal behavior of DCC values is visible for global dynamics in July and August. Unfortunately, we cannot conclude where the quarantine restrictions were the most effective since the dynamics of the pandemic are influenced by many other factors not considered in this study, in particular, the emergence of new strains and the large number of unreported cases.


Introduction
The accumulated number of COVID-19 cases per capita (CC) and the daily number of new cases per capita (DCC) may indicate the effectiveness of quarantine, testing, and vaccination, and characterize the virulence of coronavirus strains circulating in a particular region. The CC values increase monotonically over time, so it is important to fix an appropriate moment of time and compare these values for different countries and regions at that moment. In this study we take the end of June 2021, when the CC growth rate in Ukraine and most European countries was small. The CC and DCC numbers are regularly reported by the World Health Organization (WHO) [1] and the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) [2].
The relative final size of the first epidemic wave $V_\infty$ was approximated by equations (1) and (2), obtained with the use of results of SIR simulations for n=13 countries and regions and for n=11 (without mainland China and South Korea), respectively. In this paper we will study the influence of the volume of population $N_{pop}$, its density, and the level of urbanization $N_{urb}/N_{pop}$ ($N_{urb}$ is the number of people living in cities) on the accumulated number of laboratory-confirmed cases per capita in European countries and regions of Ukraine.
Since we are already in the second year of the pandemic, it is reasonable to compare the DCC numbers for the same periods in 2020 and 2021 to find some seasonal dependence. In this paper the comparisons for Argentina, Brazil, India, South Africa, Ukraine, EU, the UK, USA, and the whole world will be presented and discussed.

Data and averaged CC and DCC values
We will use the data set on the numbers of laboratory-confirmed COVID-19 cases in the regions of Ukraine accumulated as of June 27, 2021, and registered by national statistics [20]. The corresponding CC numbers (per 100 persons) and demographic data sets for Ukrainian regions [21] are shown in Table 1. As the information from the regions of Ukraine fully or partially occupied by the Russian Federation is inaccurate, we excluded the Donetsk and Luhansk regions, Crimea, and Sevastopol from consideration. CC values vary from 2.25 (Kirovohrad oblast) to 8.84 (Chernivtsi oblast). Rather high CC values were registered in the Zhytomyr, Khmelnytskyi, Kyiv (oblast and city), and Sumy regions, which are not located in the west of Ukraine, the part of the country with traditionally close ties with the EU countries.
The CC figures (per 1,000,000 persons) accumulated as of June 28, 2021, and registered by JHU [2] are shown in Table 2, which also contains the demographic data sets for European countries taken from [22][23][24]. The highest CC levels were registered in Andorra (18%), Montenegro (16%), the Czech Republic (15.5%), San Marino (15%), and Slovenia (12.4%). The populations of these countries are not very large, nor are the populations of Iceland and Finland, where the lowest CC values were registered (1.9% and 1.7%, respectively). The more than tenfold difference between the highest and the lowest values raises a reasonable question about its causes. They may be related to the population density or the level of urbanization. A possible relationship with these factors will be explored further.
The CC figures (per 1,000,000 persons) registered by JHU [2] are shown in Table A1 (March 1 - August 31, 2020) and Table A2 (March 1 - August 31, 2021). We denote these values as $V_j$; they correspond to the time moments $t_j$ measured in days. It must be noted that JHU regularly updates its previous data. The data sets presented in Tables A1 and A2 correspond to the state of the JHU data on September 3, 2021 (in particular, some differences can be seen between Tables 2 and A2). Since the $V_j$ values are random and demonstrate some weekly periodicity, we need proper smoothing. Following the approach proposed in [18,25,26], we will use the averaged CC values (calculated with the use of the nearest 7-day figures), eq. (3), and the averaged DCC values, eq. (4).
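The smoothing step can be illustrated with a minimal Python sketch. It assumes that the averaged CC value is a centered 7-day moving average of the $V_j$ values and that the averaged DCC value is a central difference of the smoothed series; both forms are assumptions made for illustration, not necessarily the exact eqs. (3) and (4). The function names and the sample data are hypothetical.

```python
def smooth_7day(values):
    """Centered 7-day moving average (assumed form of the CC smoothing).

    Entries closer than 3 days to either end of the series have no full
    7-day window and are left as None.
    """
    n = len(values)
    smoothed = [None] * n
    for i in range(3, n - 3):
        smoothed[i] = sum(values[i - 3:i + 4]) / 7.0
    return smoothed


def daily_rate(smoothed):
    """Central difference of the smoothed series (assumed form of the DCC estimate)."""
    n = len(smoothed)
    rate = [None] * n
    for i in range(1, n - 1):
        if smoothed[i - 1] is not None and smoothed[i + 1] is not None:
            rate[i] = (smoothed[i + 1] - smoothed[i - 1]) / 2.0
    return rate


# Hypothetical accumulated counts growing by 10 cases per day.
cc = [100 + 10 * j for j in range(12)]
cc_avg = smooth_7day(cc)
dcc_avg = daily_rate(cc_avg)
```

For a series that grows linearly, as in this hypothetical example, the centered average reproduces the original values and the estimated daily rate equals the constant daily increment, which gives a quick sanity check of the procedure.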

Linear and nonlinear regressions, Fisher tests
The linear regression will be used to calculate the regression coefficient r and the coefficients a and b of corresponding straight lines [27]:

$$CC = a + bx \qquad (5)$$

where x is the volume of population $N_{pop}$, its density per square km, or the urbanization level $N_{urb}/N_{pop}$. We will also use the F-test for the null hypothesis that the proposed linear relationship (5) fits the data sets. The experimental values of the Fisher function can be calculated with the use of formula (6), where m=2 is the number of parameters in the regression equation [27]. The corresponding experimental values F are shown in Table 3. They must be compared with the critical values $F_C(k_1, k_2)$. Table 3 shows that the critical values are much higher than the experimental F values. It means that the data sets presented in Tables 1 and 2 do not support the linear relationship (5). We will also check the non-linear dependence (7) instead of (5), as was done in [19,31,32].
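The regression and the Fisher test can be sketched in Python as follows. The least-squares formulas are standard; the expression for F is the usual relation between the F statistic and the correlation coefficient for a regression with m parameters and n data points, which is assumed here and may differ in form from the formula used in [27]. The function names and the sample data are hypothetical.

```python
import math


def linear_fit(x, y):
    """Least-squares line y = a + b*x and the correlation coefficient r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


def f_statistic(r, n, m=2):
    """Assumed standard form: F = r^2 (n - m) / ((1 - r^2) (m - 1))."""
    return r * r * (n - m) / ((1.0 - r * r) * (m - 1))


# Hypothetical data: five regions, x is a demographic factor, y is CC.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
a, b, r = linear_fit(x, y)
F = f_statistic(r, len(x))
```

The experimental F is then compared with the critical value for k1 = m - 1 and k2 = n - m degrees of freedom at the chosen significance level, taken from statistical tables or, for example, from `scipy.stats.f.ppf`. The linear relationship is supported only when the experimental F exceeds the critical value.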

Results
The CC values (per 100 persons) versus the volume of population (red "triangles"), its density (blue "crosses") and the urbanization level (black "circles") are shown in Figs. 1 and 2. We have used datasets from Tables 1 and 2. The best fitting lines were calculated by the least squares method [27] and are shown in the corresponding colors. The linear regression was used to calculate the regression coefficient r and the coefficients a and b of the corresponding straight lines (5) [27].

Table 3. Optimal values of parameters in eq. (5), correlation coefficients and the results of Fisher test applications.
Corresponding values of r, a, b and the number n of regions or countries taken for calculations are shown in Table 3.

Fig. 1. CC values for the regions of Ukraine, data from Table 1. Best fitting lines (5) correspond to values shown in Table 3.

Fig. 2. CC values for European countries, data from Table 2. Best fitting lines (5) correspond to values shown in Table 3.

Figs. 1, 2 and Table 3 illustrate that CC values do not correlate with the density of population and the urbanization level both in the case of Ukrainian regions and European countries. In Fig. 1 we can see only a slight increase of CC values with the increase of the density of population in Ukrainian regions. The opposite trend is visible in Fig. 2.

Eq. (8) describes the exponential growth of the accumulated number of cases with two constant parameters, see, e.g., [18]. Differentiation of (8) yields exponential growth also for the DCC values. Then the corresponding averaged values, calculated with the use of (3) and (4) and plotted on a logarithmic scale, must follow straight lines.
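The straight-line behavior on a logarithmic scale can be checked with a short Python sketch. It assumes exponential growth of the form alpha*exp(beta*t); the names `alpha` and `beta` are placeholders chosen here, since the original symbols of the two constant parameters are not legible in the text, and the sample data are synthetic. The slope of the line fitted to the logarithms gives the growth rate, and ln(2)/beta is the corresponding doubling time.

```python
import math


def fit_exponential(t, v):
    """Fit v ~ alpha * exp(beta * t) by least squares on (t, ln v)."""
    logs = [math.log(vi) for vi in v]
    n = len(t)
    mean_t = sum(t) / n
    mean_log = sum(logs) / n
    stt = sum((ti - mean_t) ** 2 for ti in t)
    stl = sum((ti - mean_t) * (li - mean_log) for ti, li in zip(t, logs))
    beta = stl / stt                      # slope of the straight line in log scale
    alpha = math.exp(mean_log - beta * mean_t)
    return alpha, beta


# Synthetic daily values with a known growth rate (hypothetical numbers).
t = list(range(10))
dcc = [5.0 * math.exp(0.3 * ti) for ti in t]
alpha, beta = fit_exponential(t, dcc)
doubling_time = math.log(2.0) / beta      # days needed for the values to double
```

Because the derivative of an exponential is an exponential with the same rate, the same slope is expected for the accumulated and the daily values, so lines with different slopes for different countries correspond to different doubling times.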
The slopes of DCC lines are different for different countries. The highest ones can be seen for the USA and Ukraine (see Fig. 3) and for Argentina, Brazil, and South Africa (see Fig. 4). The number of cases in the whole world increased less intensively. The doubling time of the accumulated number of cases was estimated to be 2.31 days for the USA and 3.65 days for the whole world [18]. For the northern region, the DCC values started to decrease in June-July 2021, indicating the end of the first pandemic waves. Unfortunately, releasing the quarantine limitations in the vacation season of 2020 caused new pandemic waves in northern countries and regions (see "crosses" in Fig. 3). Especially intensive growth of DCC values can be seen in the USA and is probably connected with mass protests and ignoring social distancing in this country.
After some stabilization in April-May 2020, the DCC numbers for the whole world started to increase almost exponentially (see corresponding black "crosses" in Fig. 3 or 4). Some seasonal similarities in the world dynamics are visible in June and July (black "crosses" and "triangles" in Fig. 3 or 4 follow almost parallel lines). In the southern region in April-June 2020, the DCC numbers continued to increase almost exponentially with almost equal slopes, smaller than in March 2020 but much more intensively than in the whole world. Probably these differences in the "northern" and "southern" dynamics relate to the cold or rainy seasons and to the neglect of quarantine restrictions in the South. On the other hand, the pandemic dynamics in Brazil, Argentina and India differ significantly in 2020 and 2021 (compare "crosses" and "triangles" in Fig. 4). The maximum of DCC values in India is probably connected with the new "delta" strain of coronavirus that emerged in this country. Probably, a new strain is also responsible for the almost exponential increase of DCC values in May-July 2021 in the UK (see the magenta "triangles" in Fig. 3). Similar seasonal behavior of DCC values can be seen in the case of the EU and South Africa (corresponding yellow markers in Fig. 3 and magenta markers in Fig. 4 follow parallel lines in May-June 2020 and 2021).
We can see in Fig. 4 the results of national lockdowns established in Argentina and South Africa on March 19, 2020, and March 23, 2020, respectively [28,29]. The daily number of new cases stabilized and even decreased. Further pandemic dynamics in these countries showed smaller DCC values in comparison with Brazil, where no lockdowns or strict quarantines were used. Nevertheless, in July 2021 the DCC numbers in Argentina became higher than in Brazil and much higher than in South Africa (see Table A2). Still, we cannot conclude that the restriction policy in South Africa was the most effective, since these countries have different climate conditions and population structures.

Discussion
The results of Fisher tests are shown in Table 3. Comparisons of the values in the last two columns show that the critical values are much higher than the experimental F values. It means that the data sets presented in Tables 1 and 2 do not support the linear relationship (5). Application of the non-linear dependence (7) showed that the corresponding values of r are much smaller than 1 and that the critical values $F_C(k_1, k_2)$ exceed the experimental values F. It means that hypothesis (7) was also not supported by the datasets presented in Tables 1 and 2. Very different CC values registered in the regions of Ukraine and European countries could be a result of different coronavirus strains, quarantine measures, testing, tracing and isolating patients. One more reason may be the large number of unregistered cases observed in many countries [33][34][35][36][37]. Estimates for Ukraine and Qatar made in [33,37] showed that the real number of cases is about 4-5 times higher than the number registered and reflected in the official statistics. Similar estimates can be made for the regions of Ukraine and other European countries.
If we apply the visibility coefficients (the ratios of real cases to the laboratory-confirmed ones) $\beta_{10} = 3.7$ and $\beta_{3} = 5.308$ calculated for Ukraine and Qatar (see [33,37]) and take the accumulated numbers of laboratory-confirmed cases, the CC values could be estimated as 20% in Ukraine and 42% in Qatar (as of the end of June 2021). Such a high percentage of people who have caught the coronavirus infection can significantly affect the evaluation of the vaccination efficiency. Probably this is why we did not see the effect of vaccination on the pandemic dynamics in Qatar in the summer of 2021 [38]. Moreover, Israel experienced a fairly strong wave of the COVID-19 pandemic in the summer and autumn of 2021 despite a rather high level of vaccination (the proportion of fully vaccinated people was approximately 60%) [39]. A recent statistical analysis [40] showed that vaccination does not decrease the DCC values. But the daily number of new deaths per capita (DDC) and the deaths per case ratio (DDC/DCC) can be sufficiently reduced at high levels of vaccination.
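The arithmetic behind these estimates can be illustrated with a short Python sketch. It assumes only the definition given above (real cases = visibility coefficient times laboratory-confirmed cases) and uses the quoted coefficients and estimated percentages to back out the registered CC levels that they imply. The function names are hypothetical, and the implied registered values are derived here for illustration rather than taken from the tables.

```python
def estimated_real_cc(registered_cc_percent, visibility_coefficient):
    """Real CC estimate: registered CC multiplied by the visibility coefficient."""
    return registered_cc_percent * visibility_coefficient


def implied_registered_cc(real_cc_percent, visibility_coefficient):
    """Registered CC implied by a real CC estimate and a visibility coefficient."""
    return real_cc_percent / visibility_coefficient


# Values quoted in the text: coefficients 3.7 and 5.308, estimates 20% and 42%.
ukraine_registered = implied_registered_cc(20.0, 3.7)
qatar_registered = implied_registered_cc(42.0, 5.308)

# Round trip: multiplying back recovers the estimated real CC values.
ukraine_real = estimated_real_cc(ukraine_registered, 3.7)
qatar_real = estimated_real_cc(qatar_registered, 5.308)
```

Under these assumptions the quoted estimates correspond to registered CC levels of roughly 5.4% for Ukraine and 7.9% for Qatar.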
The highest CC values registered in Europe are close to the ones in other regions, e.g., Seychelles (15.8%) and Bahrain (15.6%). The lowest CC values in Europe are much higher than in Vanuatu, Micronesia, Tanzania (around 0.001%), and China (0.006%). Very small CC values in the WHO Western Pacific region (e.g., Vietnam, 0.017%; Laos, 0.029%; South Korea and Cambodia, 0.3%; Japan, 0.63%) need special investigations, but let us express some hypotheses.
First, the COVID-19 pandemic probably started in this region in August 2019 [18]. It means that the first cases were not identified and registered for at least 4 months. Probably these first cases were not very severe, and the symptoms were not so pronounced. Presumably, mutations of the coronavirus made it more pathogenic, and sick people became more noticeable in December 2019. But previous cases were not taken into account in the statistics. Recent DNA investigations of the East Asian population reported the presence of coronavirus around 20,000 years ago [41]. Probably, the population of this region had a collective immunity to pathogens similar to COVID-19 before the pandemic.
Very small CC values in mainland China and Hong Kong can be explained by very high values of the tests per case ratio. E.g., as of August 31, 2021, this value was 680.4 in Hong Kong and exceeded the critical value 520, which was estimated to be enough to control the COVID-19 pandemic completely (DCC and DDC figures tend to zero) [40].

Conclusions
The CC values in Europe do not show any visible dependence on the volume of population, its density and the urbanization level. More or less similar seasonal behavior of DCC values can be seen only in the case of the EU, South Africa and the whole world. The highest CC values registered in Europe are close to the ones in other regions; the lowest ones are much higher than in Vanuatu, Micronesia, Tanzania, and China. Very small CC values in the WHO Western Pacific region need special investigations. Unfortunately, we cannot conclude where the quarantine restrictions were the most effective since the dynamics of the pandemic are influenced by many other factors not considered in this study: tests per capita and tests per case ratios, the emergence of new strains and