2 Getting started
This chapter gives you the minimum you need to start working through this book, and it points you to other places in the book where you can find more information.
2.1 Installing R and RStudio
R is freely available from CRAN (the Comprehensive R Archive Network). How you install R will depend on your operating system. See the sections on installing R on a Mac (A.2), Windows (A.4), or Ubuntu (A.3) for details.
You should also install RStudio Desktop, an integrated development environment for R. While RStudio is not strictly necessary, you will find it enormously helpful for writing R code and understanding what is happening in your session. It contains tools for editing R code and documents, running code in an R console, viewing plots, keeping your code under version control, and inspecting your R session.
Once you have R and RStudio installed, there are some basic configuration options that are helpful to set (see section A.8).
2.2 Packages you will need
Most of this book uses the tidyverse of R packages. These packages are designed to work together using a concept called tidy data. For an explanation of this concept, see Hadley Wickham’s “The Tidy Tools Manifesto,” which is part of the tidyverse package.17 We can install most of the packages that we need by installing the tidyverse package. For now, you can run the following code in your R session. See the section on installing packages from CRAN for a fuller explanation (section A.6).
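A minimal version of that installation step is sketched here. It assumes a working internet connection and that R can reach a CRAN mirror; the details may differ on your system.

```r
# Install the tidyverse meta-package from CRAN.
# This also installs the core packages it depends on (ggplot2, dplyr, tidyr, and others).
install.packages("tidyverse")
```

You only need to run this once per R installation, not in every session.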
Most of the examples in this book are available in the historydata package.18 While this book is being written, you will need the development version of the package, which is available on GitHub. You can install it directly from GitHub using the devtools package by running the code below. See the section on installing packages from GitHub for a fuller explanation (section A.7).
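The GitHub installation might look like the sketch below. The repository path "ropensci/historydata" is an assumption that this chapter does not state, so confirm it against the package's GitHub page before running the code.

```r
# Install devtools from CRAN; skip this line if you already have it.
install.packages("devtools")

# Install the development version of historydata from GitHub.
# NOTE: the repository path "ropensci/historydata" is an assumption; verify it first.
devtools::install_github("ropensci/historydata")
```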
You can test that you have these packages installed by loading them in your R session. Note that tidyverse prints a message about the other packages that it is loading, but this is not an error message. See the section on packages and libraries for more information about loading packages (section A.5).
library(historydata)
library(tidyverse)
#> ── Attaching packages ────────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
#> ✔ tibble  1.4.2     ✔ dplyr   0.7.4
#> ✔ tidyr   0.7.2     ✔ stringr 1.2.0
#> ✔ readr   1.1.1     ✔ forcats 0.2.0
#> ── Conflicts ───────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
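If library() stops with an error instead, one of the packages is probably not installed. The sketch below is one way to check several packages at once. The helper name check_installed is hypothetical and not part of any package; it relies only on base R's requireNamespace(), which reports whether a package can be loaded without attaching it.

```r
# Hypothetical helper: report which of the named packages can be loaded.
# Returns a named logical vector with TRUE for installed and FALSE for missing.
check_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

# Check the two packages this chapter asks you to install.
check_installed(c("tidyverse", "historydata"))
```

Any package reported as FALSE needs to be installed before you continue.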
You will need other packages for different chapters in this book. You can install them from CRAN as you go. See the package guide for a description of other packages that may be helpful to you (appendix B).
2.3 Learning R
This book is not intended to teach you how to program in R. If you want a general purpose introduction to the language, I recommend the following sequence of books.
- Hadley Wickham and Garrett Grolemund’s R for Data Science teaches some of the basics of R programming in the process of teaching data analysis. If you have already programmed before, this is the place to start.19
- Hadley Wickham’s Advanced R is the definitive guide to the language. Once you have a journeyman’s proficiency in R, use this book to gain a deeper knowledge.20
This book does offer an R primer (appendix C), which explains the very basics of R in a condensed form. It should be enough to get you started, especially if you are already familiar with programming concepts from another language.
17. Hadley Wickham, Tidyverse: Easily Install and Load the ’Tidyverse’, 2017, https://CRAN.R-project.org/package=tidyverse. For a more rigorous explanation, see Hadley Wickham, “Tidy Data,” The Journal of Statistical Software 59, no. 10 (2014), http://www.jstatsoft.org/v59/i10/.