# install.packages("readr")
library(readr)
library(mosaic)
collegedata <- read_csv("~/collegedata.csv")
str(collegedata)
There are 10 variables in the data and 81 observations. The variables are:
rank contains the rank of schools (integer)
score contains the score of each school (integer)
name is a chr string of each school’s name
location is a chr string of the city and state of each school
tuitionAndFees is a chr, including dollar signs and commas. It would be more useful as a numeric variable with dollar amounts
totalEnrollment is an int
fall2013AcceptanceRate is a numeric variable
averageFreshmanRetentionRate is a numeric variable
sixYearGraduation rate is a numeric variable
Other than tuitionAndFees, the variable types make sense to me. All of these variable names are pretty self-explanatory, although I might want to rename the first column or remove it altogether.
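A minimal sketch of the two cleanup ideas above, dropping the first column and turning the tuition strings into numbers. The data frame here is a toy with made-up values, and row_id is a hypothetical stand-in for whatever the first column is actually called; parse_number() comes from readr, which is already loaded.

```r
library(readr)

# Toy data frame for illustration only; these are not rows from collegedata.csv.
# "row_id" is a hypothetical stand-in for the first column of the real data.
toy <- data.frame(
  row_id = 1:3,
  tuitionAndFees = c("$45,278", "$9,139", "$12,000"),
  stringsAsFactors = FALSE
)

# Drop the first column by position, so its name does not matter
toy <- toy[, -1, drop = FALSE]

# parse_number() strips the dollar sign and the grouping commas
toy$tuitionAndFees <- parse_number(toy$tuitionAndFees)

str(toy)
```

If the real column follows the same "$12,345" pattern, the same parse_number() call on collegedata$tuitionAndFees should work, but it is worth checking for entries that do not fit the pattern first.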
favstats(~rank, data=collegedata)
Minimum rank is 1, max is 75, no missing data. This all makes sense.
favstats(~score, data=collegedata)
Minimum is 49 (need to look into what this means); maximum is 100 (makes sense!). No missing data.
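One way to follow up on the minimum score is to pull out the row or rows that have it. This is only a sketch on a toy data frame with made-up school names and scores; on the real data the same indexing would use collegedata in place of toy.

```r
# Toy data frame for illustration only; these are not rows from collegedata.csv.
toy <- data.frame(
  name  = c("School A", "School B", "School C"),
  score = c(100L, 72L, 49L),
  stringsAsFactors = FALSE
)

# Keep only the rows whose score equals the minimum score
lowest <- toy[toy$score == min(toy$score), ]
lowest
```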
tally(~name, data=collegedata)
Tried this and it is just a bunch of 1s and school names.
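Since a tally of a name column mostly just confirms that every name appears once, a more compact check is to ask directly whether any name is repeated. A small sketch with made-up names, using only base R:

```r
# Toy values for illustration only; these are not names from collegedata.csv.
school_names <- c("School A", "School B", "School C")

# anyDuplicated() returns 0 when no value appears more than once,
# otherwise the index of the first repeated value
dup_index <- anyDuplicated(school_names)
dup_index

# Equivalent check: the number of distinct names equals the number of rows
all_unique <- length(unique(school_names)) == length(school_names)
all_unique
```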
Etcetera
Still to do: convert tuitionAndFees to a numeric variable, and split location so we can use state as a variable.
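A possible way to get state out of location, assuming the strings look like "City, ST" (the exact format in the file is an assumption here and should be checked). The values below are made up for illustration, and only base R is used.

```r
# Toy values for illustration only; these are not rows from collegedata.csv.
# Assumes the "City, ST" format, which has not been confirmed for the real data.
location <- c("Princeton, NJ", "Cambridge, MA", "New Haven, CT")

# Remove everything up to and including the last comma, plus any spaces after it
state <- sub("^.*,[[:space:]]*", "", location)
state

# Count schools per state
table(state)
```

Strings without a comma are returned unchanged by sub(), so any odd entries would show up as unexpected "states" in the table.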