Loading data

# install.packages("readr")
library(readr)
library(mosaic)
collegedata <- read_csv("~/collegedata.csv")

Structure and names

str(collegedata)

There are 10 variables in the data, and 81 observations. The variables are:

Other than tuitionAndFees, the variable types make sense to me. The variable names are all pretty self-explanatory, although I might want to rename the first column or remove it altogether.
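If tuitionAndFees came in as character (an assumption on my part, for example because of dollar signs or commas in the values), a minimal sketch of the fix might look like the following. The toy data frame, its values, and the column name `X1` for the first column are all made up for illustration; I used base R here instead of readr so it runs on its own.

```r
# Toy stand-in for collegedata; the values and the X1 column name are hypothetical
toy <- data.frame(
  X1 = 1:3,
  name = c("School A", "School B", "School C"),
  tuitionAndFees = c("$45,320", "$47,140", "$49,480"),
  stringsAsFactors = FALSE
)

# Strip everything except digits and the decimal point, then convert to numeric
toy$tuitionAndFees <- as.numeric(gsub("[^0-9.]", "", toy$tuitionAndFees))

# Drop the index-like first column
toy$X1 <- NULL

# Check that the types now look right
str(toy)
```

The same two steps should carry over to collegedata once I confirm what the first column is actually called and how the tuition values are formatted.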

Variable analysis

favstats(~rank, data=collegedata)

Minimum rank is 1, max is 75, no missing data. This all makes sense.

favstats(~score, data=collegedata)

Minimum is 49 (need to look into what this means); maximum is 100 (makes sense!). No missing data.
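To follow up on the low score, one option is to pull out the rows sitting at the minimum and see which schools they are. This is only a sketch on a toy data frame: the school names, ranks, and scores are hypothetical, and it uses base R rather than mosaic.

```r
# Toy stand-in for collegedata; schools, ranks, and scores are hypothetical
toy <- data.frame(
  name = c("School A", "School B", "School C", "School D"),
  rank = c(1, 2, 3, 3),
  score = c(100, 74, 49, 49),
  stringsAsFactors = FALSE
)

# Rows at the minimum score; which() skips any NA comparisons
lowest <- toy[which(toy$score == min(toy$score, na.rm = TRUE)), ]
lowest

# Count missing scores directly, as a cross-check on the favstats output
n_missing <- sum(is.na(toy$score))
n_missing
```

Seeing which schools share the lowest score might hint at whether 49 is a real score or some kind of floor.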

tally(~name, data=collegedata)

Tried this and it is just a bunch of 1s and school names, so a tally isn't useful for this variable.
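For a variable like name, a quicker check than a full tally is to ask whether any value repeats. A minimal sketch, assuming a toy data frame with made-up school names and using base R only:

```r
# Toy stand-in for collegedata; the school names are hypothetical
toy <- data.frame(
  name = c("School A", "School B", "School C"),
  stringsAsFactors = FALSE
)

# Number of distinct names versus number of rows
n_unique <- length(unique(toy$name))
n_rows <- nrow(toy)

# anyDuplicated() returns 0 when no value repeats
has_dupes <- anyDuplicated(toy$name) > 0

n_unique
n_rows
has_dupes
```

If the number of distinct names equals the number of rows, every school appears once, which matches a tally of all 1s.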

Etcetera

Most pressing data cleaning issues