R and RStudio

Date: 2020-04-14

This is a short introductory training session on the use of R in data science.

R is a statistical programming language that can be used for data manipulation, data visualisation, and statistical analysis. The language consists of a set of tokens and keywords, together with a grammar, that you can use to explore and understand data from many different sources.

We focus on a common task in data science: import a data set, manipulate its structure, and then visualise the data. We shall use R and RStudio to accomplish this task.
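As a rough preview of that task, the three steps might look like the sketch below once the packages listed in the prerequisites are installed. No data set has been named at this point, so the example makes an assumption: it writes R's built-in mtcars data to a temporary CSV file purely to have something to import. The variable names (csv_path, cars, cars_small, p) are illustrative only.

```r
library(readr)    # importing data
library(dplyr)    # manipulating data
library(ggplot2)  # visualising data

# Assumption: stand-in data. Write the built-in mtcars data set to a
# temporary CSV file so that there is a file to import.
csv_path <- tempfile(fileext = ".csv")
write_csv(mtcars, csv_path)

# 1. Import the data set
cars <- read_csv(csv_path)

# 2. Manipulate its structure: keep three columns and
#    treat the number of cylinders as a category
cars_small <- cars %>%
  select(mpg, wt, cyl) %>%
  mutate(cyl = factor(cyl))

# 3. Visualise the data: weight against fuel economy,
#    coloured by number of cylinders
p <- ggplot(cars_small, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point()
print(p)
```

In RStudio, the plot appears in the Plots pane when the script is run.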

RStudio is an integrated development environment (IDE) that can be used to carry out data science tasks using R. It contains an editor for R scripts, a console to interact directly with the R interpreter, and a file manager similar to that available in your operating system.

This is an interactive training session, so you should try to follow along with the tutorial.

Before we start, you’ll need the following software installed on your system:

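R and RStudio. Install both programs with your distribution’s package manager; RStudio needs R itself to run. If your distribution does not package RStudio, download an installer from the RStudio website instead.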

The R packages readr, dplyr, and ggplot2. These packages are compiled and installed within RStudio. In the RStudio application, click on “Tools > Install Packages…”. In the Packages box type: “readr dplyr ggplot2”, as shown in the screenshot below.
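The same packages can also be installed by typing commands into the R console rather than using the menu. The following is a minimal sketch; the variable names required and to_install are illustrative, and the repos argument assumes you are happy to use the CRAN cloud mirror (RStudio normally has a mirror configured already, in which case the argument can be dropped).

```r
# Packages named in the prerequisites above
required <- c("readr", "dplyr", "ggplot2")

# Work out which of them are not yet installed
to_install <- setdiff(required, rownames(installed.packages()))

# Install only the missing ones (needs an internet connection).
# Assumption: the CRAN cloud mirror is an acceptable download source.
if (length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```

Running the block a second time should install nothing, because to_install is then empty.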