Functions, continued

Let’s improve our proportion function

prop <- function(x, multiplier = 1) {
  n <- length(x)
  mean_val <- multiplier * sum(x) / n
  return(mean_val)
}

The multiplier = argument is a little silly. Let’s just give one option to make it a percentage.

TRUE/FALSE

TRUE and FALSE are R's built-in logical values (written in capitals, without quotes)

  • They are also treated as 1 and 0 in arithmetic, for example inside sum() and mean()
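
A small sketch of that 1/0 behavior may help. The vector `wears_glasses` is made-up example data, not part of the NLSY dataset.

```r
# Logical values are coerced to 1 (TRUE) and 0 (FALSE) in arithmetic
wears_glasses <- c(TRUE, FALSE, TRUE, FALSE, TRUE)  # made-up example data

n_true <- sum(wears_glasses)      # counts the TRUEs: 3
prop_true <- mean(wears_glasses)  # proportion of TRUEs: 0.6
true_plus_true <- TRUE + TRUE     # 1 + 1 = 2
```

This is why a proportion function like prop() gives the same answer for a logical vector as for a vector of 1s and 0s.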

Let’s say we want to replace the multiplier = argument with percentage =, where the user can set TRUE or FALSE.

prop(c(1, 0, 1, 0, 1))
[1] 0.6
prop(c(1, 0, 1, 0, 1), percentage = TRUE)
[1] 60

TRUE/FALSE lead naturally to if statements

if (3 > 8) {
  print("This is true")
} else {
  print("This is false")
}
[1] "This is false"
percentage <- TRUE
if (percentage) {
  print("print percentage")
} else {
  print("print proportion")
}
[1] "print percentage"

Incorporate it into the function

prop <- function(x, percentage = FALSE) {
  n <- length(x)
  mean_val <- sum(x) / n
  if (percentage) {
    mean_val <- mean_val * 100
  } else {
    # don't actually need this else statement!
    mean_val <- mean_val
  }
  return(mean_val)
}
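
As the comment inside the function notes, the else branch does nothing. A minimal sketch of the same function without it might look like this. The name `prop_no_else` is hypothetical, chosen only to keep it separate from prop().

```r
# Hypothetical variant of prop() that drops the redundant else branch
prop_no_else <- function(x, percentage = FALSE) {
  mean_val <- sum(x) / length(x)
  # Only rescale when the user asks for a percentage;
  # otherwise mean_val is returned unchanged
  if (percentage) {
    mean_val <- mean_val * 100
  }
  return(mean_val)
}

p_default <- prop_no_else(c(1, 0, 1, 0, 1))                     # 0.6
p_percent <- prop_no_else(c(1, 0, 1, 0, 1), percentage = TRUE)  # 60
```

When the condition is FALSE, R skips the if block and carries on, so the proportion is returned as computed.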

Let’s try it out

prop(c(1, 0, 1, 0, 1))
[1] 0.6
prop(c(1, 0, 1, 0, 1), percentage = TRUE)
[1] 60
prop(c(1, 0, 1, 0, 1), percentage = FALSE)
[1] 0.6
nlsy_cc <- read_rds(here::here("data", "clean", "nlsy.rds"))
prop(nlsy_cc$glasses, percentage = TRUE)
[1] 51.78423

Exercises

  1. Create a function that takes a vector of numbers and calculates the standard deviation manually (like we did for the mean). Use an if statement to check whether the vector has fewer than two elements, and return NA if so. (Hint: the length() function will be helpful!) You don’t need any extra arguments besides the vector of numbers.

\[ \text{sd}(x) = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} } \]
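
To check a function by hand, it can help to work the formula through once on a tiny input. Taking the made-up vector \(x = (2, 4, 6)\), so that \(n = 3\):

\[ \bar{x} = \frac{2 + 4 + 6}{3} = 4, \qquad \text{sd}(x) = \sqrt{\frac{(2 - 4)^2 + (4 - 4)^2 + (6 - 4)^2}{3 - 1}} = \sqrt{\frac{8}{2}} = 2 \]

A correct function should therefore return 2 for this vector.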

Exercises, continued

  1. Modify your function to remove the NA values before calculating the standard deviation. (Hint: the na.omit() function will be helpful!) Add an argument na.rm = that defaults to TRUE (the opposite of the na.rm argument in the built-in R function sd()). If na.rm = FALSE, then the function should return NA if there are any NA values in the vector.

  2. What is the standard deviation of income in (all of) NLSY? Compare with the built-in R function sd().
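
For the na.omit() hint, here is a short sketch of how missing values behave on a made-up vector. It shows only the building blocks, not a solution to the exercise.

```r
x <- c(2, NA, 6)               # made-up vector with one missing value

has_missing <- anyNA(x)        # TRUE: at least one NA is present
total_with_na <- sum(x)        # NA: arithmetic on NA gives NA
x_complete <- na.omit(x)       # drops the NA, keeping 2 and 6
n_complete <- length(x_complete)   # 2
total_complete <- sum(x_complete)  # 8
```

Note that length(x) still counts the NA, so the length has to be taken after the missing values are removed.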