Course structure

  • What will be covered before the mid-semester break:
    • Becoming familiar with R (Today)
    • Describing data
    • Statistical theory: Probability distributions
    • Statistical theory: Hypothesis tests
    • Analysis of discrete categories - Chi-Square tests
    • Research Report Instructions
  • What will be covered after the break:
    • Single-sample and two-sample tests
    • Comparing between multiple means
    • Multiple predictors: Balanced designs
    • Multiple predictors: Unbalanced designs
    • OLS regression with diagnostics
    • Bayesian vs Frequentist: What’s the difference?
    • Research Report Due

Course Evaluation

Familiarizing yourself with R

R is a free, open-source platform for data organization and analysis that is actively maintained. You will need access to R for this course. Although R should already be available on the lab PCs, I recommend installing it on a personal machine if you want to experiment with the language in your own time.

You can find further instructions on how to install R and the accompanying RStudio by clicking on the link here.

Please make sure you have access to R and RStudio as these will be required to complete this course.

Some basic operations

3 + 3 # We can add
## [1] 6
5 - 3 # Or subtract
## [1] 2
5*3  # We can multiply
## [1] 15
6/2  # Or divide
## [1] 3
2^2  # Exponents
## [1] 4
1 + 2 * 4 # We can link operations in sequence
## [1] 9

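The order of mathematical operations matters. R multiplies and divides before it adds and subtracts, which is why the result above is 9 rather than 12.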

(1+2) * 4 # By placing round brackets, we tell R to focus on the enclosed operation first
## [1] 12

That looks about right! Practice with the arithmetic operators!
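
As a starting point for practice, the short sketch below combines several operators in one line. The numbers are arbitrary and chosen only for illustration. It shows that exponents are evaluated first, then multiplication and division, then addition and subtraction, and that a leading minus sign is applied after the exponent.

```r
# Exponent first (2^2 = 4), then multiplication (3 * 4 = 12), then addition
2 + 3 * 2^2
## [1] 14
# Round brackets override the default order
(2 + 3) * 2^2
## [1] 20
# The exponent is applied before the minus sign...
-2^2
## [1] -4
# ...unless we bracket the negative number
(-2)^2
## [1] 4
```

When in doubt, adding round brackets makes the intended order explicit and costs nothing.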

Assigning variables

We can assign values to variables. Values can include numbers or characters/words.

his.age <- 45  # We can store single values inside variables...
her.age <- 35
# Then operate on the new variables directly
his.age - her.age 
## [1] 10

Be careful when choosing variable names! R is case-sensitive, and your names may clash with built-in functions or arguments. Some rules of thumb to avoid this are to use lower-case letters, include a ‘.’ between terms, and make sure the variable names are clear.
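
A minimal sketch of what case-sensitivity means in practice is given below. The variable name participant.age is hypothetical and used only for illustration; exists() is a base R function that reports whether an object with the given name has been defined.

```r
participant.age <- 30   # hypothetical variable, for illustration only

# The name exactly as we typed it exists...
exists("participant.age")
## [1] TRUE
# ...but the same name with different capitalisation does not
exists("Participant.Age")
## [1] FALSE
```

Typing Participant.Age at the console would therefore produce an "object not found" error, which is a common source of confusion when starting out.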

Storing values inside a vector

We can store multiple values inside a vector (for now, we will only deal with numeric values).

# The 'c' stands for 'concatenate'
their.ages <- c(22,35,42)  
# Let's look at the output...
their.ages                 
## [1] 22 35 42

We can operate on vectors that have the same length. R applies the operation element by element.

their.ages2 <- c(27,31,44)  

# We can subtract from the vector defined earlier
their.ages2-their.ages                 
## [1]  5 -4  2

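If vectors do not match in length, R reuses (‘recycles’) the values of the shorter vector to fill the gap. When the longer length is not a multiple of the shorter length, R still returns a result but issues a warning (not an error), as shown here.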

their.ages3 <- c(15,16,25,17)  
their.ages4 <- c(22,21,30)

their.ages3 - their.ages4 # What happens when we subtract?
## Warning in their.ages3 - their.ages4: longer object length is not a multiple of
## shorter object length
## [1] -7 -5 -5 -5
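
The warning above appears only when the longer length is not an exact multiple of the shorter one. The sketch below uses two hypothetical vectors (long.vec and short.vec, not part of the lab data) to show that R otherwise reuses the shorter vector silently, so it can be worth checking lengths with length() before combining vectors.

```r
long.vec  <- c(1, 2, 3, 4)   # hypothetical values, length 4
short.vec <- c(10, 20)       # hypothetical values, length 2

# Check the lengths before operating
length(long.vec)
## [1] 4
length(short.vec)
## [1] 2

# short.vec is reused as 10, 20, 10, 20 and no warning is given
long.vec + short.vec
## [1] 11 22 13 24
```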

We can extract particular values within vectors by indexing their position(s). Specifically, we can tell R where our value of interest is located, and R will extract out the necessary item.

Suppose we want to extract the FIRST value in the vector created earlier

their.ages[1]  # Select first value in series
## [1] 22

Similarly, we can extract the SECOND value by updating the index

their.ages[2]
## [1] 35

Or the THIRD value

their.ages[3]
## [1] 42

We can select multiple values by describing their index positions

# Suppose we want to extract the second and third values
their.ages[2:3] # or `their.ages[c(2,3)]`
## [1] 35 42
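
Index positions do not need to be next to each other. As a small sketch using the same values as their.ages, we can pass any set of positions with c(), or use a negative index to drop a position instead of selecting it.

```r
their.ages <- c(22, 35, 42)   # same values as defined earlier

# Select the first and third values
their.ages[c(1, 3)]
## [1] 22 42
# A negative index removes that position and keeps the rest
their.ages[-1]
## [1] 35 42
```

Note that indexing does not change their.ages itself; it only returns the selected values.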

We can also test every value in a vector against a condition. R returns TRUE or FALSE for each value in turn.

# Suppose we want to know which persons within a group are 18 years or above
their.ages2 >= 18
## [1] TRUE TRUE TRUE

Similarly, we can ask R which individuals in their.ages2 are exactly 27 years old. Note the double equals sign: == tests for equality, whereas <- assigns a value.

# Is each value in the vector exactly 27?
their.ages2 == 27
## [1]  TRUE FALSE FALSE
# We can ask for all numbers that are NOT equal to a given value
their.ages2 != 27
## [1] FALSE  TRUE  TRUE
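
The TRUE/FALSE results can be put to further use. The sketch below, which reuses the values of their.ages2, relies on two standard base R behaviours: sum() counts TRUE as 1 and FALSE as 0, and a logical vector placed inside square brackets keeps only the positions that are TRUE.

```r
their.ages2 <- c(27, 31, 44)   # same values as defined earlier

# How many persons are 18 years or above?
sum(their.ages2 >= 18)
## [1] 3
# Which ages are above 30? Keep only the values where the test is TRUE
their.ages2[their.ages2 > 30]
## [1] 31 44
```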

LAB ACTIVITY 1

  1. Create a variable called ‘m.ages’ and store the ages of three of your male friends
  2. Create another variable called ‘f.ages’ and store the ages of three of your female friends
  3. Add both variables together, and store the results in a new variable called ‘total.ages’
  4. Are there any values less than 36 in ‘total.ages’? Describe your code
  5. Using a single relational operator, find out whether ‘m.ages’ is smaller/greater than ‘f.ages’.

Provide your responses in a text document that includes your student ID, and submit it on Dropbox.