Introduction to R and RStudio for Data Science

Last Updated on July 11, 2020

R and RStudio

This is a short introductory training session on the use of R in data science.

R is a statistical programming language that can be used for data manipulation, data visualisation and statistical analysis. The R language consists of a set of tokens and keywords and a grammar that you can use to explore and understand data from many different sources.

We focus on a common task in data science: import a data set, manipulate its structure, and then visualise the data. We shall use R and RStudio to accomplish this task.
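To give a feel for what this task looks like in code, here is a minimal sketch of the import, manipulate and visualise steps. It is not taken from the training material: the rainfall data, the column names and the variable names are all made up for illustration, and the CSV is written to a temporary file only so that the example is self-contained.

```r
library(readr)    # importing data
library(dplyr)    # manipulating data
library(ggplot2)  # visualising data

# Hypothetical data: write a tiny CSV to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "city,year,rainfall_mm",
  "Leeds,2018,720",
  "Leeds,2019,810",
  "York,2018,610",
  "York,2019,650"
), csv_path)

# 1. Import the data set
rain <- read_csv(csv_path)

# 2. Manipulate its structure: mean rainfall per city
rain_summary <- rain %>%
  group_by(city) %>%
  summarise(mean_rainfall_mm = mean(rainfall_mm))

# 3. Visualise the result as a bar chart
p <- ggplot(rain_summary, aes(x = city, y = mean_rainfall_mm)) +
  geom_col() +
  labs(x = "City", y = "Mean rainfall (mm)")
print(p)
```

When run in RStudio, the chart should appear in the “Plots” tab. The same three steps apply to a real CSV file: replace csv_path with the path to your own data.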

RStudio is an integrated development environment (IDE) that can be used to carry out data science tasks using R. It contains an editor for R scripts, a console to interact directly with the R interpreter, and a file manager similar to that available in your operating system.
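As a small illustration of interacting with the R interpreter, the lines below can be typed one at a time at the console prompt. The vector x is a made-up example, not part of the training material.

```r
# Create a numeric vector and store it in x
x <- c(2, 4, 6)

# Typing an expression prints its value in the console
mean(x)
summary(x)
```

The same lines can be saved in an R script in the editor and run from there.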

This is an interactive training session, so you should try to follow along with the tutorial.

Before we start, you’ll need the following software installed on your system:

  • RStudio. Install the program with your distribution’s package manager.
  • The R packages readr, dplyr, and ggplot2. These packages are compiled and installed within RStudio. In the RStudio application, click on “Tools > Install Packages…”. In the Packages box type “readr dplyr ggplot2”, as shown in the screenshot below.

RStudio - Install Packages
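The packages can also be installed from the RStudio console instead of the menu. The sketch below is one possible way to do this: the helper missing_packages is a hypothetical name introduced here for illustration, and the interactive() check is an assumption added so that nothing is downloaded when the code is run as a script.

```r
# The three packages used in this session
wanted <- c("readr", "dplyr", "ggplot2")

# Hypothetical helper: which of the requested packages are not yet installed?
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(wanted)

# Install only what is missing, and only when typed at the console
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Running this a second time should install nothing, because to_install will be empty once all three packages are present.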

Let’s start RStudio. From your menu, click RStudio, or launch the program from a terminal. The RStudio application should open.

An RStudio project

An RStudio project allows us to organise files and data in RStudio using a directory on the file system. It is best to create a project for each piece of data analysis that you carry out, so we shall create a project for this training session.

In the RStudio application, click on “File > New Project… > New Directory > New Project”. Type the project name “RIntro” into the directory name box and click on “Create Project”. This will start up a new RStudio session.

Download the training material:

Next, copy the training materials into the project. Click on the “More” button in the “Files” tab and select “Show Folder in New Window”. This should open the file manager at the project directory. Copy the training materials into this folder.
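To check that the files landed in the right place, you can ask R from the console. This is a minimal sketch that assumes the project was created as “RIntro” and that the training materials include “Notes.html”; opening a project sets the working directory to the project directory.

```r
# The working directory; inside the project its last component should be "RIntro"
getwd()

# List the files R can see in the project directory
list.files()

# Should become TRUE once the training materials have been copied
notes_present <- file.exists("Notes.html")
notes_present
```

If notes_present is FALSE, check that the files were copied into the project directory itself rather than into a subfolder.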

The RStudio application should now look like the screenshot below.

RStudio - Data Science Tutorial

From RStudio, open the notes by left-clicking on “Notes.html” in the “Files” tab and then selecting “View in Web Browser”.

You are now ready to start the training session by going through Notes.html in RStudio.

1 Comment
Sorren69
3 years ago

I completed this tutorial yesterday. A very good short introduction to get a taste of R, how it can be used in data analysis, and how graphical output can be produced and refined from that analysis. It teaches, by example, just enough to get you started using R for yourself for straight-forward data analysis of clean CSV data.

Highly recommended if teaching by example works for you.