Cannot reproduce tutorial results


Cannot reproduce tutorial results

R help mailing list-2
I have come back to trying to learn R after a long time away, and have begun with the YouTube tutorial videos by David Langer, as seen here Introduction to Data Science with R - Data Analysis Part 1. I am using R Studio with R version 3.4.4 (2018-03-15) -- "Someone to Lean On"

Around 1:16:00 in the video Langer creates a new variable using an else if loop:

    extractTitle <- function(name) {
      name <- as.character(name)
      if(length(grep("Miss.", name)) > 0) {
        return("Miss.")
      } else if(length(grep("Master.", name)) > 0) {
        return("Master.")
      } else if(length(grep("Mrs.", name)) > 0) {
        return("Mrs.")
      } else if(length(grep("Mr.", name)) > 0) {
        return("Mr.")
      } else {
        return("Other.")
      }
    }

    titles <- NULL
    for(i in 1:nrow(data.combined)) {
      titles <- c(title, extractTitle(data.combined[i, "name"]))
    }
    data.combined$title <- as.factor(titles)

There are two problems I see in my attempt to replicate this. First, the data.combined set contains 1309 names, but when I try to create the variable "titles" using this code, it creates a list of 2. When I use View(titles), what comes up is the first item on the list is

    function (main = NULL, sub = NULL, xlab = NULL, ylab = NULL, line = NA, outer = FALSE, ...)

and the second item in the list is just the title "Master."

The second problem is that after I enter data.combined$title <- as.factor(titles) I get the error message

    Error in sort.list(y) : 'x' must be atomic for 'sort.list'
    Have you called 'sort' on a list?

If I try changing the as.factor to as.vector, I get

    Error in `$<-.data.frame`(`*tmp*`, title, value = list(function (main = NULL, : replacement has 2 rows, data has 1309

I have checked and rechecked my code, and it is identical to Langer's. What is wrong here?


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Cannot reproduce tutorial results

Bert Gunter-2
This is a plain text list and your html post below is pretty mangled and
difficult to read. If you re-post in plain text, you are more likely to get
a response.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 22, 2019 at 1:11 PM Jason Hernandez via R-help <
[hidden email]> wrote:

> [...]


Re: Cannot reproduce tutorial results

PIKAL Petr
In reply to this post by R help mailing list-2
Hi.

Without going too deep into your messy code:
"title" is a function that adds titles to plots.

Therefore

titles <- c(title, extractTitle(data.combined[i, "name"]))

puts this "title" function into the first position of your titles object.
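
A minimal sketch of that point, using a few made-up names in place of data.combined$name (the vector passenger_names is an assumption for illustration, not the tutorial's data). It suggests why c(title, ...) ends up as a two-element list headed by the plotting function, and how accumulating into titles instead gives a plain character vector that as.factor accepts.

```r
# Hypothetical stand-in for data.combined$name (not the tutorial's data).
passenger_names <- c("Braund, Mr. Owen Harris",
                     "Cumings, Mrs. John Bradley",
                     "Heikkinen, Miss. Laina",
                     "Palsson, Master. Gosta Leonard")

# Same logic as the function in the original post.
extractTitle <- function(name) {
  name <- as.character(name)
  if (length(grep("Miss.", name)) > 0) {
    return("Miss.")
  } else if (length(grep("Master.", name)) > 0) {
    return("Master.")
  } else if (length(grep("Mrs.", name)) > 0) {
    return("Mrs.")
  } else if (length(grep("Mr.", name)) > 0) {
    return("Mr.")
  } else {
    return("Other.")
  }
}

# As posted: 'title' (no s) resolves to the plotting function from the
# graphics package, so each pass rebuilds a list of (function, one title).
buggy <- NULL
for (i in seq_along(passenger_names)) {
  buggy <- c(title, extractTitle(passenger_names[i]))
}

# Likely intended: append to the vector being built, 'titles'.
titles <- NULL
for (i in seq_along(passenger_names)) {
  titles <- c(titles, extractTitle(passenger_names[i]))
}
title_factor <- as.factor(titles)
```

Here buggy is a list (so as.factor fails on it), while titles has one entry per name.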

And I believe that, instead of multiple if/else branches, there is a better vectorised option using gsub for extracting titles from something like "TTT. Name Name": just find the dot and get rid of everything to the right of it.
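
One way such a vectorised version might look, as a sketch only. It assumes the names follow a "Surname, Title. Given names" layout (an assumption about the data, slightly different from the "TTT. Name Name" shape above), uses sub rather than gsub since only one match per name is needed, and works on a made-up vector passenger_names. The regular expression and the final grouping into "Other." are illustrative choices, not the tutorial's code, and may classify edge cases differently from the if/else chain.

```r
# Hypothetical names in "Surname, Title. Given names" form.
passenger_names <- c("Braund, Mr. Owen Harris",
                     "Cumings, Mrs. John Bradley",
                     "Heikkinen, Miss. Laina",
                     "Palsson, Master. Gosta Leonard",
                     "Smith, Dr. John")

# Drop everything up to the comma and everything after the first dot,
# keeping only the text in between (the title, including its dot).
raw_titles <- sub("^[^,]*, *([^.]*\\.).*$", "\\1", passenger_names)

# Collapse anything outside the four titles of interest into "Other.".
known <- c("Miss.", "Master.", "Mrs.", "Mr.")
titles <- ifelse(raw_titles %in% known, raw_titles, "Other.")

title_factor <- as.factor(titles)
```

Because sub and ifelse work on whole vectors, no loop is needed and the result always has one element per name.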

Cheers
Petr

> -----Original Message-----
> From: R-help <[hidden email]> On Behalf Of Jason Hernandez
> via R-help
> Sent: Friday, February 22, 2019 10:10 PM
> To: [hidden email]
> Subject: [R] Cannot reproduce tutorial results
>
> [...]