Assignments 6 and 7

1. Create a new markdown file and a new chunk of R code. Copy and paste the following R code into the chunk, and run it.

# Takes a subsetted list of functions from Hadley Advanced R
# Assigns each randomly to a student in the class
# 19 February 2017
# NJG

# Ensure that the same random number sequence is used by everyone.
set.seed(100) 

# The Hadley R functions:
firstToLearn <- c("str", "?")

operators <- c("%in%", "match", "=", "<-", "<<-", "$", "[",
               "[[", "head", "tail", "subset", "with", "assign", "get")
comparisons <- c("all.equal", "identical", "!=", "==", ">", ">=", "<", "<=",  "is.na", "complete.cases",  "is.finite")

basicMath <- c("*", "+", "-", "/", "^", "%%", "%/%", "abs", "sign", "acos", "asin", "atan", "atan2", "sin", "cos", "tan", "ceiling", "floor", "round", "trunc", "signif", "exp", "log", "log10", "log2", "sqrt", "max", "min", "prod", "sum", "cummax", "cummin", "cumprod", "cumsum", "diff", "pmax", "pmin", "range", "mean", "median", "cor", "sd", "var", "rle")

logicalSets <- c("&", "|", "!", "xor", "all", "any", "intersect", "union", "setdiff", "setequal", "which")

vectorsMatrices <- c("c", "matrix", "length", "dim", "ncol", "nrow", "cbind", "rbind", "names", "colnames", "rownames", "t", "diag", "sweep", "as.matrix", "data.matrix")

makingVectors <- c("c", "rep", "rep_len", "seq", "seq_len", "seq_along", "rev", "sample", "choose", "factorial", "combn", "is.character", "is.numeric", "is.logical", "as.character", "as.numeric", "as.logical")

listsDataFrames <- c("list", "unlist",  "data.frame", "as.data.frame", "split", "expand.grid")

output <- c("print", "cat", "message", "warning", "dput", "format", "sink", "capture.output", "sprintf")

readingWritingData <- c("data", "count.fields", "read.csv", "write.csv", "read.delim", "write.delim", "read.fwf", "readLines", "writeLines", "readRDS", "saveRDS", "load", "save")

# Combine all of the function lists and randomize the order:
RFunctions <- c(firstToLearn, operators, comparisons, basicMath, logicalSets, vectorsMatrices, makingVectors, listsDataFrames, output, readingWritingData)

RFunctions <- sample(RFunctions)

# Create class list
classNames <- c("Alger", "Ashlock", "Burnham", "Clark", "Kazenal", "Keller", "Looi", "Makhukov", "Mickuki", "Nevins", "Southgate") 

# Assign functions
functionAssignments <- rep_len(classNames, length.out=length(RFunctions))

# Bind the two columns into a data frame
functionsFinal <- data.frame(functionAssignments,RFunctions)

2. Illustrate your knowledge of basic subsetting methods by creating and printing a data frame that shows only the 13 functions that you are responsible for.

melanie <- split(functionsFinal, functionsFinal$functionAssignments)
melanie$Kazenal

##     functionAssignments RFunctions
## 5               Kazenal       mean
## 16              Kazenal        dim
## 27              Kazenal      names
## 38              Kazenal  readLines
## 49              Kazenal     choose
## 60              Kazenal  identical
## 71              Kazenal   rownames
## 82              Kazenal       asin
## 93              Kazenal        xor
## 104             Kazenal        var
## 115             Kazenal       pmax
## 126             Kazenal       list
## 137             Kazenal     subset
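
The same kind of row selection can also be done with logical indexing or with `subset`. The sketch below uses a small stand-in data frame (`toy`, with made-up assignments) rather than the full `functionsFinal`, so its names and values are illustrative assumptions only.

```r
# Hypothetical stand-in for functionsFinal: 3 students, 6 functions
toy <- data.frame(functionAssignments = rep_len(c("Alger", "Kazenal", "Looi"), 6),
                  RFunctions = c("str", "mean", "dim", "sum", "names", "xor"),
                  stringsAsFactors = FALSE)

# Logical indexing: keep the rows where the condition is TRUE, and all columns
byIndex <- toy[toy$functionAssignments == "Kazenal", ]

# subset() expresses the same selection without repeating the data frame name
bySubset <- subset(toy, functionAssignments == "Kazenal")

print(byIndex)

# Both approaches select the same functions
identical(byIndex$RFunctions, bySubset$RFunctions)
```

Unlike `split`, which builds one data frame per student, both of these approaches extract only the rows for the student of interest.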

3-6. Entries for the assigned functions

`mean`

Melanie R. Kazenel

The mean function calculates the arithmetic mean of a set of values. It takes a numeric or logical vector (or a date, date-time, or time interval object) as input and returns the arithmetic mean of the values in the object as a vector of length one. A complex vector is also accepted, but only when “trim” is left at its default of 0.

Under default settings, the mean function uses all of the observations in an object, and NA values are not removed before the mean is calculated, so a single NA makes the result NA. To drop a fraction of the observations from each end of the sorted values before the mean is computed, set the “trim” argument to a value between 0 and 0.5. To remove NAs from the object before the mean is computed, set “na.rm = TRUE”.

### Example use of the `mean` function for a numeric vector

data <- c(4,8,10,25)

# Calculate the mean using default settings
mean(data)

## [1] 11.75

# Trim 25% of the observations from each end of the vector, and then calculate the mean of the remaining observations.
mean(data, trim = 0.25)

## [1] 9

### Example use of the `mean` function for a vector containing an NA value

data2 <- c(4,8,10,25,NA)

# Calculating the mean using default settings yields "NA"
mean(data2)

## [1] NA

# Adding "na.rm = TRUE" removes the "NA" and then calculates the mean of the remaining observations.
mean(data2, na.rm = TRUE)

## [1] 11.75
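
The description above also mentions logical, date, and complex input. A brief sketch of those cases follows; the variable names and values are made up for illustration.

```r
# Logical vector: TRUE counts as 1 and FALSE as 0,
# so the mean is the proportion of TRUE values
flags <- c(TRUE, FALSE, TRUE, TRUE)
propTrue <- mean(flags)
print(propTrue)

# Date vector: the result is itself a Date, midway between the two dates here
dates <- as.Date(c("2017-02-01", "2017-02-03"))
midDate <- mean(dates)
print(midDate)

# Complex vector: accepted with the default trim = 0
cplx <- c(1+2i, 3+4i)
cplxMean <- mean(cplx)
print(cplxMean)
```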

`dim`

Melanie R. Kazenel

The dim function can be used to obtain or specify the dimensions of an R object.

For use of dim to obtain the dimensions of an object, the function’s input is an R object of more than one dimension, such as a matrix, array, or data frame. The output is a set of numbers indicating the dimensions of the object. For instance, when the input is a 2-dimensional object such as a matrix or data frame, the first number in the output is the number of rows in the object, and the second number is the number of columns.

For use of dim to specify the dimensions of an object, the input can be an R object of one or more dimensions, such as a vector, matrix, or array. The dim function can be used to assign a new set of dimensions to the object, so long as the product of the new dimensions equals the number of elements in the object; otherwise R signals an error. The output in this case is an object with the newly specified dimensions.

# Example use of the `dim` function to obtain the dimensions of a matrix
m <- matrix(data = 1:8, nrow = 4, ncol = 2) # creates a matrix
print(m)

##      [,1] [,2]
## [1,]    1    5
## [2,]    2    6
## [3,]    3    7
## [4,]    4    8

dim(m) # prints out the dimensions of the matrix in the form of "rows, columns"

## [1] 4 2

# Example use of the `dim` function to convert a vector into a matrix of specified dimensions
m2 <- c(1:12) # creates a vector
print(m2)

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12

dim(m2) <- c(3,4) # converts the vector into a matrix with 3 rows and 4 columns
print(m2)

##      [,1] [,2] [,3] [,4]
## [1,]    1    4    7   10
## [2,]    2    5    8   11
## [3,]    3    6    9   12
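
The compatibility rule can be checked directly. The following is a minimal sketch with made-up object names: it shows that a plain vector has no dimensions, that any set of dimensions whose product matches the length is accepted, and that a mismatch produces an error (caught here with `tryCatch` so the script keeps running).

```r
v <- 1:12
is.null(dim(v))        # a plain vector has no dim attribute, so dim() returns NULL

# 3 * 4 = 12 elements, so this assignment is allowed
dim(v) <- c(3, 4)
dim(v)

# Three dimensions also work, because 2 * 3 * 2 = 12
a <- 1:12
dim(a) <- c(2, 3, 2)
dim(a)

# 5 * 2 = 10 does not match 12 elements, so R signals an error
w <- 1:12
result <- tryCatch({
  dim(w) <- c(5, 2)
  "ok"
}, error = function(e) "error")
print(result)
```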

`names`

Melanie R. Kazenel

The names function can be used to obtain the names associated with the elements of an object. It can also be used to assign names to the object. The function’s input is an R object, and the names associated with each element are the output.

# The following vector does not have names assigned to it
z <- c(1:4)
print(z)

## [1] 1 2 3 4

names(z)

## NULL

# Assign names to the vector using the "names" function
names(z) <- c("kale", "broccoli", "cabbage", "brussels sprouts") # assign names to the vector
print(z)

##             kale         broccoli          cabbage brussels sprouts 
##                1                2                3                4

names(z) # obtain the names associated with the vector

## [1] "kale"             "broccoli"         "cabbage"         
## [4] "brussels sprouts"

# Change the name of a specific element by indexing into names() at the position of that element
names(z)[2] <- "collards"
print(z)

##             kale         collards          cabbage brussels sprouts 
##                1                2                3                4

names(z)

## [1] "kale"             "collards"         "cabbage"         
## [4] "brussels sprouts"

# Remove all names associated with the vector
names(z) <- NULL
print(z)

## [1] 1 2 3 4
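
The names function is not limited to atomic vectors. As a short sketch (the objects `veg`, `df`, and `counts` are invented for illustration), it also works on lists and data frames, and names can be used to look up elements by label.

```r
# For a list, names() returns the names of its components
veg <- list(kale = 1, broccoli = "green")
names(veg)

# For a data frame, names() returns the column names
df <- data.frame(plot = 1:3, height = c(2.1, 3.4, 1.8))
names(df)

# Names can be used to pull out elements by label instead of by position
counts <- c(kale = 4, cabbage = 7)
counts["cabbage"]

# Rename a single column by position
names(df)[2] <- "heightCm"
names(df)
```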

`readLines`

Melanie R. Kazenel

The readLines function can be used to read text or data into R that is not formatted in a way conducive to being read in using a function such as read.csv or read.table. For instance, readLines can be used to read in unformatted text. The input for the function is a connection object or a character string giving a file path or URL. The output is a character vector in which each element corresponds to a line in the input.

The “n” argument specifies the maximum number of lines to read; the default value of -1 means that all lines are read. The “ok” argument specifies whether it is acceptable to reach the end of the input before “n” lines have been read; the default is TRUE, and with ok = FALSE an error is raised if too few lines are available. The “warn” argument specifies whether to warn when a text file lacks a final end-of-line marker or contains embedded nuls; the default is TRUE. The “encoding” argument declares the encoding to assume for the input strings (it marks them as Latin-1 or UTF-8 but does not re-encode them); the default is “unknown.” The “skipNul” argument specifies whether embedded nuls should be skipped rather than read in; the default is FALSE.

# Read all of the text from a webpage into R
z <- readLines("https://gotellilab.github.io/Bio381/CourseMaterials/CourseSyllabus.html")
summary(z)

##    Length     Class      Mode 
##       284 character character

# Read the first 5 lines from a webpage into R
z <- readLines("https://gotellilab.github.io/Bio381/CourseMaterials/CourseSyllabus.html", n = 5L)
summary(z)

##    Length     Class      Mode 
##         5 character character

print(z)

## [1] "<!DOCTYPE html>"                              
## [2] ""                                             
## [3] "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
## [4] ""                                             
## [5] "<head>"
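
Because the examples above depend on a web page being reachable, here is an offline sketch that exercises the same arguments on a temporary file. The file contents and variable names are made up for illustration.

```r
# Write three lines to a temporary file (hypothetical content)
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), tmp)

allLines <- readLines(tmp)           # default n = -1: read every line
firstTwo <- readLines(tmp, n = 2)    # read at most two lines

# Ask for more lines than the file holds.
# With ok = TRUE (the default), readLines returns the lines that exist.
tooMany <- readLines(tmp, n = 10)
length(tooMany)

# With ok = FALSE, running out of lines is an error (caught here)
strict <- tryCatch(readLines(tmp, n = 10, ok = FALSE),
                   error = function(e) "error")
print(strict)

# readLines also accepts a connection, here one that reads from a character vector
con <- textConnection(c("a", "b"))
fromCon <- readLines(con)
close(con)

unlink(tmp)  # remove the temporary file
```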

`choose`

Melanie R. Kazenel

The choose function can be used to calculate binomial coefficients. In other words, it calculates “n choose k”: the number of ways to choose k elements from a set of n elements, where n can be any real number and k is an integer. As arguments, the function takes a numeric vector for n and an integer vector for k. Non-integer values of k are rounded to the nearest integer, with a warning. When the vectors have a length of one, the output is a single “n choose k” value. When the vectors contain multiple elements, the function calculates “n choose k” for the pairs of elements in corresponding positions, recycling the shorter vector if the lengths differ.

# Calculate the number of ways to choose 2 elements from a set of 8 elements
choose(n=8,k=2)

## [1] 28

# Calculate the number of ways to choose 3 elements from a set of 4 elements
choose(n=4,k=3)

## [1] 4

# If vectors of equal length are used as input for n and k, the function calculates "n choose k" element by element, pairing the values in corresponding positions.
choose(n=c(8,4),k=c(2,3))

## [1] 28  4

# In the example below, the vectors for n and k have unequal lengths. The single value of k is recycled, so "n choose k" is calculated with k = 2 for both values of n.
choose(n=c(8,4),k=2)

## [1] 28  6
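
For whole-number inputs, the result can be cross-checked against the familiar factorial formula n! / (k! (n - k)!). The sketch below does that and then probes a few edge cases; the variable names are illustrative.

```r
n <- 8
k <- 2

# For integers with n >= k >= 0, choose() agrees with n! / (k! * (n - k)!)
viaFactorial <- factorial(n) / (factorial(k) * factorial(n - k))
all.equal(choose(n, k), viaFactorial)

# Edge cases
choose(5, 0)     # 1: there is exactly one way to choose nothing
choose(5, -1)    # 0: defined as 0 for negative k
choose(3, 5)     # 0: 5 elements cannot be chosen from 3

# n need not be an integer: 0.5 * (0.5 - 1) / 2! = -0.125
choose(0.5, 2)
```

The factorial formula overflows for large n, so `choose` is the safer option in practice.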

`identical`

Melanie R. Kazenel

The identical function can be used to test whether two objects are exactly equal to one another. The function takes two R objects of any type as input. The output is a single TRUE if the objects are exactly equal (same type, same values, and same attributes), and FALSE otherwise. The “num.eq” argument specifies whether double and complex non-NA numbers should be compared using “==” (equal), which is the default (num.eq = TRUE), or bitwise (num.eq = FALSE). The “single.NA” argument specifies whether to differentiate different types of NAs and NaNs; the default is TRUE, signifying that they should not be differentiated.

# Check whether two vectors are equal to one another
z <- c(2,3,5,7,"ten", "seven")
x <- c(2,3,5,7,"ten", "seven")
y <- c(2,3,5,7,"ten", "eight")
identical(z,x) # z and x are equal to one another

## [1] TRUE

identical(z,y) # z and y are not equal to one another

## [1] FALSE
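
It can help to see how strict “exactly equal” is. The following sketch contrasts `identical` with `==` and `all.equal` on a few small, made-up values, and shows the effect of the `num.eq` argument.

```r
# Same value but different storage type: integer vs. double
identical(1L, 1)                      # FALSE
1L == 1                               # TRUE

# Floating-point rounding: 0.1 + 0.2 is not stored as exactly 0.3
identical(0.1 + 0.2, 0.3)             # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))     # TRUE: equal within a small tolerance

# Attributes such as names are compared as well
identical(c(a = 1), 1)                # FALSE

# num.eq = FALSE compares bit patterns, which distinguishes 0 from -0
identical(0, -0)                      # TRUE
identical(0, -0, num.eq = FALSE)      # FALSE
```

Because `identical` always returns a single TRUE or FALSE, it is convenient inside `if` statements, whereas `==` returns one value per element.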

`rownames`

Melanie R. Kazenel

The rownames function can be used to specify or obtain the names associated with rows in a matrix or matrix-like object. The function’s input is a matrix-like R object, and the names associated with each row are the output. The arguments “do.NULL” and “prefix” can be added; see the example below for an explanation of how to use these arguments.

# The following matrix does not have row names assigned to it
z <- matrix(data = c(1:8), nrow = 4, ncol = 2)
print(z)

##      [,1] [,2]
## [1,]    1    5
## [2,]    2    6
## [3,]    3    7
## [4,]    4    8

rownames(z) # under default settings, the output of 'rownames' is NULL when no row names have been assigned

## NULL

rownames(z, do.NULL = FALSE) # when do.NULL = FALSE, row names are generated from the prefix "row" plus the row number and returned; z itself is not changed

## [1] "row1" "row2" "row3" "row4"

rownames(z, do.NULL = FALSE, prefix = "specialname") # the prefix argument can be added to specify that a name other than "row" should be added before each number when do.NULL = FALSE

## [1] "specialname1" "specialname2" "specialname3" "specialname4"

# Assign row names to the matrix 
rownames(z) <- c("kale", "broccoli", "cabbage", "brussels sprouts") # assign row names to the matrix
print(z)

##                  [,1] [,2]
## kale                1    5
## broccoli            2    6
## cabbage             3    7
## brussels sprouts    4    8

rownames(z) # obtain the row names associated with the matrix

## [1] "kale"             "broccoli"         "cabbage"         
## [4] "brussels sprouts"
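
Once row names are set, they can be used for indexing. The sketch below rebuilds a matrix like the one above under the name `m` (chosen for illustration) and also shows the default row names of a data frame.

```r
m <- matrix(1:8, nrow = 4, ncol = 2)
rownames(m) <- c("kale", "broccoli", "cabbage", "brussels sprouts")

# Select a whole row, or a single cell, by row name
m["cabbage", ]
m["kale", 2]

# Change one row name by position
rownames(m)[2] <- "collards"
rownames(m)

# Data frames always have row names; by default they are "1", "2", ...
df <- data.frame(x = c(10, 20, 30))
rownames(df)
```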

`asin`

Melanie R. Kazenel

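The asin function computes the arc-sine (inverse sine) of a numeric or complex vector, which is the input. For real input, each value should be a sine value between -1 and 1; values outside this range return NaN with a warning. The output is the arc-sine of each number in the vector, expressed as an angle in radians, not degrees.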

# Compute the arc-sine of a single number
asin(1)

## [1] 1.570796

# Compute the arc-sine of each number in a vector
z <- c(0.5,0.75,1) # create a vector
asin(z) # compute the arcsine of each element

## [1] 0.5235988 0.8480621 1.5707963
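
A short sketch of how `asin` relates to `sin` and to angles in degrees follows. The variable names are illustrative, and `suppressWarnings` is used only to keep the out-of-range case quiet.

```r
# asin() undoes sin() for angles between -pi/2 and pi/2
angle <- pi / 6                        # 30 degrees expressed in radians
all.equal(asin(sin(angle)), angle)

# Convert the result from radians to degrees
degrees <- asin(0.5) * 180 / pi
print(degrees)                         # approximately 30

# Real inputs outside [-1, 1] have no real arc-sine: R returns NaN with a warning
outOfRange <- suppressWarnings(asin(2))
is.nan(outOfRange)

# A complex input returns a complex result instead of NaN
asin(2 + 0i)
```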