Covid-19 data visualization in R – Somalia example
In this post, we will learn how to use ggplot in R to visualize data. We will also use the dplyr package to do data manipulation such as filtering.
Download the latest global Covid-19 data from this link
The first step is to install and load the necessary packages in R. The ones we need in this exercise are tidyverse and lubridate. Tidyverse contains ggplot for data visualization and dplyr for data manipulation. Lubridate is used to format date data.
> library(tidyverse)
> library(lubridate)
Now let us import the dataset we downloaded into R.
> WHO.COVID.19.global.data <- read.csv("C:/Users/HP/Downloads/WHO-COVID-19-global-data.csv")
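A note on the import step: the date column appears later as ï..Date_reported, most likely because read.csv() kept the file's UTF-8 byte order mark (BOM) as part of the first column name. A minimal sketch of one way to avoid this, assuming the file does start with a BOM, is to pass fileEncoding = "UTF-8-BOM". The temporary file and its two rows are made up for illustration and are not real WHO figures.

```r
# Write a tiny CSV that starts with a UTF-8 byte order mark (BOM).
# The rows are invented for illustration, not real WHO figures.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("Date_reported,Country,New_cases\n2021-01-01,Somalia,5\n2021-01-02,Somalia,7\n"), con)
close(con)

# fileEncoding = "UTF-8-BOM" asks R to drop the BOM while reading
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(demo)
```

If the real file is read this way, the column would be named Date_reported, and the later steps would need to refer to Date_reported instead of ï..Date_reported.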
Since this data is global and contains all countries, we need to filter for the specific country we want to analyze. In our case, we select Somalia. To do this, we use the dplyr function filter().
> covid19 <- filter(WHO.COVID.19.global.data, Country == "Somalia")
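To see what filter() does, here is a small sketch on a made-up data frame (the countries and numbers are illustrative only). It keeps the rows where the condition is TRUE and drops the rest.

```r
library(dplyr)

# Toy data, invented for illustration
toy <- data.frame(
  Country   = c("Somalia", "Kenya", "Somalia", "Ethiopia"),
  New_cases = c(10, 25, 12, 30)
)

# Keep only the rows where Country equals "Somalia"
toy_somalia <- filter(toy, Country == "Somalia")
toy_somalia
```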
To select the variables we are interested in (date, country, and new cases), we can use the dplyr function select().
> covid19 <- select(covid19, ï..Date_reported, Country, New_cases)
Convert the date variable to the Date class using the lubridate package, then keep the columns we need and rename the new column to Date.
> covid19 <- covid19 %>% mutate(date(ymd(ï..Date_reported)))
> covid19 <- covid19 %>% select(Country, New_cases, "date(ymd(ï..Date_reported))")
> covid19 <- covid19 %>% rename(Date = 'date(ymd(ï..Date_reported))')
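The date conversion can also be written more compactly by naming the new column inside mutate(). This is only a sketch on a made-up data frame. It assumes the date column is called Date_reported (without the ï.. prefix) and holds text in year-month-day form.

```r
library(dplyr)
library(lubridate)

# Toy data, invented for illustration; dates are stored as text
toy <- data.frame(
  Date_reported = c("2020-12-31", "2021-01-01", "2021-01-02"),
  Country       = "Somalia",
  New_cases     = c(3, 5, 7)
)

# Parse the text into Date values, then keep the columns of interest
toy <- toy %>%
  mutate(Date = ymd(Date_reported)) %>%
  select(Country, New_cases, Date)

class(toy$Date)
```

Naming the column directly avoids the separate rename() step, since the new column never gets the long automatic name.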
The data contains both 2020 and 2021. Suppose we want to visualize only this year, 2021. We again use the dplyr function filter(), this time to keep only the rows with 2021 dates.
> covid19 <- filter(covid19, Date >= as.Date("2021-01-01"))
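Another way to pick out one calendar year is to compare the year part of each date, using year() from lubridate. A minimal sketch on made-up dates:

```r
library(dplyr)
library(lubridate)

# Toy data, invented for illustration
toy <- data.frame(
  Date      = as.Date(c("2020-12-30", "2020-12-31", "2021-01-01", "2021-08-03")),
  New_cases = c(1, 2, 3, 4)
)

# year() extracts the calendar year from each date
toy_2021 <- filter(toy, year(Date) == 2021)
toy_2021
```

This form states the intent (only 2021) directly, so there is no need to work out the cut-off date by hand.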
We are now done with data manipulation, so we can plot the prepared data using ggplot.
> ggplot(covid19, aes(x = Date, y = New_cases)) +
    geom_point(color = "red") +
    geom_smooth(se = FALSE) +
    theme_bw() +
    scale_x_date(date_breaks = "1 month") +
    ggtitle("Number of confirmed Covid19 cases in Somalia as of August 3 2021") +
    theme(axis.title = element_text(size = 16))
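With monthly breaks, the default axis labels show full dates, which can look crowded. One possible tweak is to add date_labels to scale_x_date() so that each tick shows only the abbreviated month name. This is a sketch on made-up weekly numbers, not the Somalia data.

```r
library(ggplot2)

# Toy data, invented for illustration: one value per week
dates <- seq(as.Date("2021-01-01"), as.Date("2021-08-01"), by = "week")
toy <- data.frame(
  Date      = dates,
  New_cases = rep(c(5, 20, 40, 15, 8), length.out = length(dates))
)

# date_labels = "%b" prints abbreviated month names (Jan, Feb, ...)
p <- ggplot(toy, aes(x = Date, y = New_cases)) +
  geom_point(color = "red") +
  theme_bw() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b")

p
```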
The resulting plot of new Covid-19 cases in Somalia from January 2021 to August 2021 is shown below.
The plot shows that new cases declined in July but are increasing in August.