ENVSOCTY 4GA3

Lab 1: Introduction and setting up R

Zehui Yin

Winter 2026

Zehui Yin
PhD Student in Geography at SEES

  • Email: yinz39@mcmaster.ca
    • Please use your McMaster email address.
    • Please put the course code ENVSOCTY 4GA3 in the subject line.
    • Please include your name and student number in the body of the email.
    • I will try to reply within 24 hours (please expect longer delays during weekends/holidays).
  • Personal Website: zehuiyin.github.io
  • Research Interests: Spatial Analysis, Econometrics, Machine Learning, Transportation Planning, Public Transit, and Micromobility

Agenda for today

  • Introduction to basic concepts for coding in R
  • Getting a flavour of R syntax and style
  • Setting up R and a reproducible environment on your personal computer
    • Instructions for both Windows and Mac systems
    • A demo on a Windows machine

      If you are using Linux, I expect you already have the technical skills to install R on your own computer :)

Lab slides

  • You can access all the lab slides by scanning the QR code or by visiting the URL below directly.
  • Slides from previous years are also available. You can take a peek, though I’ll make some updates this year—likely not major ones.

R and RStudio

   

  • R is a free and open-source programming language for statistical computing and graphics.
  • RStudio is an integrated development environment (IDE) for coding in R.
    • An IDE is a set of tools that helps you code.
  • We use RStudio to write our R code.

R packages

  • R packages are the fundamental units of reproducible R code.
  • They can include functions, data, or both, along with documentation.
  • Think of them as plug-ins that enhance the functionality of existing software.
  • For example, web browser extensions like ad blockers add additional features that the original browser doesn’t have.
  • In this course, we use additional R packages to work with geospatial data in R.
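
As a minimal sketch of how a package is used in practice, the snippet below checks whether a package is installed and then calls one of its functions. MASS is chosen here only for illustration, because it appears later in these slides and is normally bundled with R as a recommended package; it is not one of the geospatial packages for this course.

```r
# Installing is done once per computer (commented out: needs internet access).
# install.packages("MASS")

# requireNamespace() returns TRUE when the package is installed.
has_mass <- requireNamespace("MASS", quietly = TRUE)

if (has_mass) {
  # pkg::function() calls a function without attaching the whole package.
  print(MASS::fractions(0.75))  # 0.75 shown as the fraction 3/4
} else {
  message("MASS is not installed on this machine.")
}
```

Calling library(MASS) instead would attach the package, so that its functions can be used without the MASS:: prefix.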

Reproducible environment

  • An environment is the system in which a program runs, including the hardware and the software: the operating system, system dependencies, the programming language, packages, and their configurations and versions.
  • Just as running 1000 meters affects individuals differently, running code on different computers or with different package versions can produce varied results.
  • A reproducible environment ensures that everyone gets the same result by keeping the environment consistent.
    • Ideally, we would run the same program on the same system with the same software versions for maximum reproducibility.
    • In practice, we typically focus on ensuring that key software—such as R and R package versions—is the same.
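
To see which versions make up your own environment, base R offers a few helpers. The sketch below only inspects the environment and changes nothing; the stats package is used as an example because it ships with every R installation.

```r
# Version of R that is currently running, e.g. "R version 4.4.2 (...)".
r_version <- R.version.string
print(r_version)

# Version of an installed package.
stats_version <- packageVersion("stats")
print(stats_version)

# Full summary: R version, operating system, and attached packages.
print(sessionInfo())
```

Comparing this output between two computers is a quick way to spot version differences when the same code gives different results.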

renv package

  • renv is an R package that helps create reproducible environments for R projects.
  • It records the R version and all R packages along with their versions in a lockfile.
    • A lockfile is a text file that stores all the environment information.
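
Because the lockfile is plain text, you can peek at it with base R. This is a minimal sketch that assumes the working directory is a project folder containing a file named renv.lock; if the file is not there, it only prints a message.

```r
# Assumption: the working directory is a project folder that contains renv.lock.
lockfile <- "renv.lock"

if (file.exists(lockfile)) {
  # Print the first lines of the lockfile as they are stored on disk.
  writeLines(readLines(lockfile, n = 15))
} else {
  message("No renv.lock found in ", getwd())
}
```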

Code hosting and GitHub

  • Code hosting involves storing code online to facilitate sharing, management, and collaboration with others.
  • One of the most popular code hosting platforms is GitHub (owned by Microsoft).
  • Both the textbook and the companion R package used in this course are hosted on GitHub.
  • GitHub is like a cloud drive (similar to OneDrive or Dropbox) but specialized for storing code (more specifically plain text files), including R scripts.

R Markdown vs. R

  • R Markdown is a file format that combines R code, its results (after knitting), and accompanying text.
  • It uses the file extension .Rmd and is essentially a plain text file that combines Markdown and R.
    • When you knit the file, the code is executed, the results are inserted, and the document is rendered into formats such as PDF, HTML, or other report file types.
  • You can start by creating an Rmd file on the lab computer.
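
Knitting can also be started from the R console. The sketch below writes a tiny Rmd file to a temporary location and renders it with rmarkdown::render(). It assumes the rmarkdown package and pandoc are available (RStudio normally provides pandoc) and skips the rendering step otherwise. The file content is made up for illustration.

```r
# Three backticks mark the start and end of a code chunk in an Rmd file.
fence <- strrep("`", 3)

# Write a tiny R Markdown file to a temporary location.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(
  c("---",
    'title: "Demo"',
    "output: html_document",
    "---",
    "",
    "Some text.",
    "",
    paste0(fence, "{r}"),
    "1 + 1",
    fence),
  rmd_file
)

# Knit only if rmarkdown and pandoc are available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(out_file)  # path of the rendered HTML file
}
```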

R Markdown syntax

---
title: "Untitled"
author: "Zehui Yin"
output: html_document
---

# Heading level 1
## Heading level 2
text...text...text
**bold**   __bold__
*italic*   _italic_

```{r}
print("Hello world!")
```

The file above has three parts, from top to bottom:

  1. YAML header (between the --- lines): stores settings or meta information.
  2. Markdown text: contains plain text in Markdown format.
  3. R code chunk: contains R code to be executed.

Markdown syntax

R basics: arithmetic operations

You can start by trying out R on the lab computer. Later, we’ll set it up on your personal computer.

R can be used as a calculator, using intuitive symbols for these operations:

1 + 5
[1] 6
8 - 3
[1] 5
3 * 4
[1] 12
9 / 3
[1] 3
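
A few more operators follow the same pattern. This short sketch shows exponentiation, remainder, integer division, and how parentheses change the order of evaluation.

```r
2^3          # exponentiation: 8
7 %% 3       # remainder after division: 1
7 %/% 3      # integer division: 2
1 + 5 * 2    # multiplication is evaluated before addition: 11
(1 + 5) * 2  # parentheses are evaluated first: 12
```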

R basics: assigning values

One of the cornerstones of programming languages is assignment. You can assign a value/object to a name using <- (suggested R style) or = (“Python” style).

a <- 1
b <- 3
a + b
[1] 4
c = 7
d = 5
c * d
[1] 35
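
Two details of assignment are worth trying out: assigning to an existing name overwrites its value, and names are case-sensitive. The object names below are made up for illustration.

```r
total <- 10
total <- total + 5     # reassignment overwrites the old value
print(total)           # 15

Total <- 1             # Total and total are two different objects
print(total + Total)   # 16
```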

R basics: built-in functions

R comes with many built-in functions. The calling syntax is function(parameter1, parameter2, ...). Additionally, with extra R packages, there are even more functions you can use.

values <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
sum(values)
[1] 55
mean(values)
[1] 5.5
library(MASS)
# integrate the sin function from 0 to pi.
area(sin, 0, pi)
[1] 2
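
Arguments can be passed by position or by name, and many functions have optional arguments with default values. The sketch below uses two base R functions; the scores vector is hypothetical data.

```r
# Arguments can be matched by position or by name.
round(3.14159, 2)           # 3.14
round(3.14159, digits = 2)  # same call, with the argument named

# Optional arguments change the default behaviour.
scores <- c(80, NA, 90)     # hypothetical data with one missing value
mean(scores)                # NA, because of the missing value
mean(scores, na.rm = TRUE)  # 85, missing value removed first

# Open the help page of a function (commented out: it opens a viewer).
# ?mean
```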

R basics: indexing

Indexing is the process of selecting specific values from an object based on their index location. Whenever you see [] or $ in R, some form of indexing is happening.

v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
v
 [1]  1  2  3  4  5  6  7  8  9 10
v[2]
[1] 2
v[2:4]
[1] 2 3 4
v[c(TRUE, T, T, T, T, TRUE,
    FALSE, F, FALSE, F)]
[1] 1 2 3 4 5 6
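
Indexing also works with negative numbers and with conditions. This sketch reuses the same vector to show a few common patterns.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

v[-1]          # negative index: everything except the first element
v[v > 5]       # condition: keep only the values greater than 5
v[length(v)]   # last element: 10

w <- v         # work on a copy so that v stays unchanged
w[2] <- 20     # indexing on the left-hand side replaces a value
w
```

The condition v > 5 produces a vector of TRUE and FALSE values, so v[v > 5] is the same logical indexing as above, just written more compactly.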

R basics: indexing

df <- data.frame(col1 = c(1, 2, 3),
                 col2 = c(4, 5, 6))
df
  col1 col2
1    1    4
2    2    5
3    3    6
df[1, 2]
[1] 4
df[, "col2"]
[1] 4 5 6
df$col1
[1] 1 2 3
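
The same ideas extend to selecting rows and adding columns. A short sketch using the same data frame (col3 is a new, made-up column name):

```r
df <- data.frame(col1 = c(1, 2, 3),
                 col2 = c(4, 5, 6))

df[2, ]                # second row, all columns
df[df$col1 > 1, ]      # rows where col1 is greater than 1
df[["col2"]]           # same result as df$col2

# Assigning to a new name with $ adds a column.
df$col3 <- df$col1 + df$col2
df
```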

R basics: flow control

Flow control is an important component of any programming language. In R, the if-else statement and loops work as follows:

x <- 6
if (x > 5) { 
  print("Greater than 5") 
} else {
  print("Less than or equal to 5")
}
[1] "Greater than 5"
for (i in 1:3) {
  print(i)
}
[1] 1
[1] 2
[1] 3
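
Two further patterns build on the statements above: else if for more than two cases, and loops that update a value on each pass. The variable names are chosen for illustration only.

```r
# else if handles more than two cases.
x <- 5
if (x > 5) {
  label <- "Greater than 5"
} else if (x == 5) {
  label <- "Equal to 5"
} else {
  label <- "Less than 5"
}
print(label)   # "Equal to 5"

# A for loop that accumulates a running total.
total <- 0
for (i in 1:10) {
  total <- total + i
}
print(total)   # 55, the same as sum(1:10)

# A while loop repeats as long as its condition is TRUE.
n <- 1
while (n < 100) {
  n <- n * 2
}
print(n)       # 128, the first power of 2 that is not below 100
```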

R basics: custom functions

To define your own function in R, you can use the following syntax. Note that R automatically returns the value of the last expression evaluated in the function body, though an explicit “Python” style return() call is also valid.

add <- function(a, b) { 
  a + b
}

add(1, 4)
[1] 5
add <- function(a, b) { 
  return(a + b)
}

add(1, 4)
[1] 5
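
Functions can also give their arguments default values. The sketch below is a variation on add() in which b defaults to 1; the default is an arbitrary choice for illustration.

```r
# b has a default value, so it can be left out when calling the function.
add <- function(a, b = 1) {
  a + b
}

add(4)                # 5, uses the default b = 1
add(4, 10)            # 14
add(b = 2, a = 3)     # 5, named arguments can come in any order
add(c(1, 2, 3), 10)   # 11 12 13, because + works element by element
```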

Download R version 4.4.2

We will be using a previous release of R for this course rather than the latest version. All course materials and code examples were tested exclusively on this specific R release.

mirror.csclub.uwaterloo.ca/CRAN

Getting previous releases of R

Windows systems: mirror.csclub.uwaterloo.ca/CRAN/bin/windows/base/old

Mac systems:

Download RStudio

posit.co/download/rstudio-desktop

Restoring the environment

Download the Applied-Spatial-Statistics zip file from Avenue (or directly from GitHub: github.com/paezha/Applied-Spatial-Statistics) and unzip it. You should then have a folder with the following structure:

Applied-Spatial-Statistics/
├── renv/
├── .gitignore
├── .Rprofile
├── Applied-Spatial-Statistics.Rproj
├── README.md
├── README.Rmd
└── renv.lock
  • Double-click the Applied-Spatial-Statistics.Rproj file to open the R project.

RTools 4.4 for Windows users

mirror.csclub.uwaterloo.ca/CRAN/bin/windows/Rtools

  • Ensure you download the RTools version that matches your installed R version.
  • RTools is a set of programs required on Windows to build R packages from source.
  • Note: If you are using Mac or Linux, RTools is not required.
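
One rough way to check from R whether build tools are visible is to look for the make program. This is only a heuristic sketch: it assumes that a working RTools setup exposes make to the R session, and a missing make does not pinpoint what is wrong.

```r
# Sys.which() returns the full path of a program, or "" if it is not found.
make_path <- Sys.which("make")

if (nzchar(make_path)) {
  message("Found make at: ", make_path)
} else {
  message("make was not found; on Windows this may mean RTools is missing.")
}
```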

Xcode and GNU Fortran for Mac users

mac.r-project.org/tools

  • To compile R, or R packages that include C, C++, or Fortran code, from source on macOS, you need both Xcode and the GNU Fortran compiler.

Homebrew and GDAL for Mac users

Next, use the macOS Terminal to install Homebrew, and then use Homebrew to install GDAL.

brew.sh

formulae.brew.sh/formula/gdal
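
After installing, you can check from R whether GDAL's command-line tools are visible. This sketch assumes the gdalinfo program is on the PATH seen by R. RStudio may use a different PATH than the Terminal, so "not found" here does not prove that GDAL is missing.

```r
# In the Terminal (not in R), the Homebrew install command is: brew install gdal

# Look for the gdalinfo program that comes with GDAL.
gdalinfo_path <- Sys.which("gdalinfo")

if (nzchar(gdalinfo_path)) {
  # Ask GDAL to report its own version.
  print(system2(gdalinfo_path, "--version", stdout = TRUE))
} else {
  message("gdalinfo was not found on the PATH seen by R.")
}
```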

Restoring the environment

  1. In RStudio, navigate to the bottom-right panel.
  2. Select the Packages tab, then click the Restore option.
  3. Click the Restore button in the pop-up panel (as shown in the figure on the right).
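
The same restore can be done from the R console with the renv package. The sketch below assumes the project is open, so that the working directory contains renv.lock. The restore call itself is commented out because it downloads and installs packages.

```r
# Assumption: the working directory is the project folder.
lockfile_present <- file.exists("renv.lock")

if (lockfile_present && requireNamespace("renv", quietly = TRUE)) {
  # Report whether the installed packages match the lockfile.
  renv::status()

  # Install the package versions recorded in renv.lock.
  # renv::restore()
} else {
  message("renv or renv.lock is not available in ", getwd())
}
```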

Install \(\LaTeX\)

\(\LaTeX\) is a high-quality typesetting system. While it may look like a programming language, you do not need to understand it for our purposes. We will use it to export Rmd files, together with their results, to PDF files.

If you already use \(\LaTeX\) and have it installed through MiKTeX or TeX Live, you can skip this step.

If you are unfamiliar with \(\LaTeX\) and don’t have it installed yet, simply run the following R code in the console to install it:

tinytex::install_tinytex()
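
To check afterwards whether a \(\LaTeX\) installation is visible to R, something like the following sketch can be used. It assumes the tinytex package is installed for the first check and falls back to looking for the pdflatex program.

```r
if (requireNamespace("tinytex", quietly = TRUE)) {
  # TRUE if the LaTeX distribution in use is TinyTeX.
  print(tinytex::is_tinytex())
}

# Path to pdflatex, or "" if R cannot find a LaTeX distribution.
latex_path <- Sys.which("pdflatex")
print(latex_path)
```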

Finish setup of the R environment

After completing all the steps above, you should now have a fully functional R environment.

You should keep the folder Applied-Spatial-Statistics on your computer.

  • You may rename it to anything you like, but make sure you know where it is located (its directory path).
  • Starting next week, you will be working inside this folder.
  • All lab activity code files (Rmd files) will be stored here.
    • This is mandatory because the renv package ties the environment (R packages) to this project folder.
      • Similar to Python virtual environments, it is not a system‑wide setup (for R packages).
    • If you remove the folder or try to run the code outside of it, you will need to restore the environment again.
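
To see that packages are tied to the project rather than installed system-wide, you can look at the folders R searches for packages. This is a rough sketch: it assumes that, inside an renv project, the first library path contains the text "renv", and it treats that as a hint rather than a guarantee.

```r
# Folders where R looks for installed packages, in search order.
lib_paths <- .libPaths()
print(lib_paths)

# Heuristic: inside an renv project the first entry usually mentions renv.
in_renv_library <- grepl("renv", lib_paths[1], fixed = TRUE)
print(in_renv_library)
```

Running the same lines with and without the project open shows how the library location changes.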
