Data Munging of the Titanic

---
title: "Data Munging of the Titanic"
output:
  html_document:
    theme: cerulean
---

_________________________________________________________________________________________________

### Assign the dataset

```{r echo=TRUE, message=FALSE, warning=FALSE}
train <- read.csv("../input/train.csv", stringsAsFactors = FALSE)

# Check for the NA data
library(Amelia)
missmap(train, col = c("yellow", "blue"), legend = FALSE, main = "The Train Data")
```

### Pclass and Sex, the Most Important Factors

```{r echo=TRUE, message=FALSE, warning=FALSE}
total <- train
total$Pclass <- factor(total$Pclass)
levels(total$Pclass) <- c("FirstClass", "SecondClass", "ThirdClass")
total$Survived <- factor(total$Survived)

library(ggplot2)
ggplot(total, aes(Pclass)) +
  geom_bar(aes(fill = Survived)) +
  facet_grid(~Sex) +
  ggtitle("Pclass and Sex as the Survival Factors")
```

### Age: the Different Fates of Juniors and Seniors

```{r echo=TRUE, message=FALSE, warning=FALSE}
# Age is continuous, so it is binned with geom_histogram();
# geom_bar() has had no binwidth parameter since ggplot2 2.0.0.
ggplot(na.omit(total), aes(Age)) +
  geom_histogram(aes(fill = Survived), binwidth = 2) +
  facet_wrap(~Sex + Pclass, nrow = 2, scales = "free_y") +
  ggtitle("Age as the Survival Factor")

# Add the variable of age…
```
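
The missingness map and the stacked bars above are visual checks; the same questions can also be answered with numbers. The chunk below is a minimal sketch in base R, not part of the original notebook. It assumes a small made-up data frame, `toy`, that only borrows the column names `Survived`, `Pclass`, `Sex`, and `Age` from `train.csv`; its values are illustrative and say nothing about the real passengers.

```r
# Hypothetical toy data with the same column names as train.csv.
# The values are invented for illustration only.
toy <- data.frame(
  Survived = c(1, 0, 1, 0, 0, 1, 0, 1),
  Pclass   = c(1, 3, 1, 3, 2, 2, 3, 1),
  Sex      = c("female", "male", "female", "male",
               "male", "female", "female", "male"),
  Age      = c(29, NA, 35, 22, NA, 4, 18, 54),
  stringsAsFactors = FALSE
)

# Number of missing values per column, a numeric counterpart to missmap()
na_counts <- colSums(is.na(toy))

# Share of survivors in each Pclass/Sex group, a numeric counterpart
# to the faceted bar chart (Survived must still be numeric 0/1 here)
surv_rate <- aggregate(Survived ~ Pclass + Sex, data = toy, FUN = mean)

print(na_counts)
print(surv_rate)
```

Swapping `toy` for `train` should give the corresponding figures for the real data, provided `Survived` has not yet been converted to a factor as it is in `total`.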

Link to Full Article: Data Munging of the Titanic
