Hi Phillip,

Jim, David and Petr all wrote you good code, but you have major
problems in your data formatting. Your data uses spaces both as a
column separator and to denote "blank fields". Because of these
problems with your input data structure, it's doubtful whether the
good code you've received will give the correct baseball answer.

The Arizona Diamondbacks data you posted shows runner positions for
about seven outs of a game (about 1-and-1/6 innings). I say "about"
because there may be subsequent rows with the same number of outs as
listed in row 14. However, rows 10 and 11 have two blank spaces
between the number-of-outs and the first runner_ID (suggesting one
"blank field" to the left of the first runner_ID), while row 12 has
three blank spaces between the number-of-outs and the first runner_ID
(suggesting two "blank fields" to the left of the first runner_ID).

Since bases are loaded in row 9 and no outs are recorded between rows
9 and 10, the game situation suggests that two runners score between
rows 9 and 10 (polla001 and perad001), with the remaining baserunners
ending up on second and third base, not first and second base (best
guess: batter lambj001 hits a double, winds up on second base, and
gets two RBIs). Similarly, between rows 11 and 12 goldp001 is removed
as a baserunner and an out is recorded, but no new baserunners
appear. This game situation suggests the runners advancing (e.g. on a
sacrifice fly), with goldp001 scoring and the remaining baserunner
(lambj001) ending up on third base, not second base or first base.

Now if you run the code posted earlier using read.table(), in all
cases you will find the blank fields between the "outs" column and
the first baserunner listed have been removed, so every row of your
data with runners on base will have a runner on first base.
Intuitively, you know this must be wrong (think doubles and triples).
The mechanics of read.table() are such that the field separator
character ("sep" parameter) defaults to 'white space', that is to
say, "ONE OR MORE spaces, tabs, newlines or carriage returns"
(capitalization mine). So multiple white space characters in your
file are read as a single "field separator" separating two adjacent
columns.

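A minimal sketch of that collapsing behaviour, using two rows modelled
on rows 9 and 10 of your data (the column names and the object names
"txt" and "df" are my own choices for illustration, not taken from
your data frame):

```r
# Row 2 has TWO spaces after the outs value, meant to encode an
# empty first-base field.
txt <- c("0 goldp001 polla001 perad001",
         "0  lambj001 goldp001")

# Default sep = "" treats any run of white space as ONE separator,
# so the intended empty field disappears and the row shifts left.
df <- read.table(text = txt, header = FALSE, fill = TRUE,
                 stringsAsFactors = FALSE,
                 col.names = c("Outs", "First", "Second", "Third"))

df$First[2]   # "lambj001", although he should be on second base
```

The runner who should be on second base lands in the first-base
column, and fill=TRUE pads the missing field on the right instead.
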
What you really need to do is export your data in a format that R can
easily understand. There's a possibility that posting in HTML to the
R-Help mailing list corrupted your data (e.g. by removing tabs and
inserting spaces instead), but no matter. You need to set up a
workflow so this **cannot** happen, i.e. start exporting from your
spreadsheet program in ".csv" format and start importing into R using
R's read.csv() function instead. Colleagues have recommended the book
"Beyond Spreadsheets with R" by Dr. Jonathan Carroll to me as a good
introductory text for tackling these issues.
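
As a rough sketch of what that workflow buys you (the ".csv" lines
below are my own assumption about how rows 9, 10 and 12 would look
after export, and "csv_lines", "df" and "on_base" are illustrative
names only):

```r
# Commas delimit every field, so an empty base is unambiguous.
csv_lines <- c("Outs,RunnerFirst,RunnerSecond,RunnerThird",
               "0,goldp001,polla001,perad001",
               "0,,lambj001,goldp001",
               "2,,,lambj001")

df <- read.csv(text = csv_lines, stringsAsFactors = FALSE)

# 1 where a runner ID is present, 0 where the field is empty or NA.
runners <- df[, -1]
on_base <- (!is.na(runners) & runners != "") + 0
on_base
```

With a real file you would pass the file name to read.csv() instead
of the text= argument.
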

Finally (if you've read this far), the truth is that if you work at it
a little bit, you can get the data you posted into R in a reasonable
format using lists (although starting from a ".csv" file may be
conceptually easier for you). Lists are very useful when you have
multiple vectors of different lengths. See the code below (note: I
dropped your first "Row#" column):

> zz <- textConnection("ari18.test3.raw", "w")
> writeLines(con=zz, c("0 
+ 1 
+ 1 
+ 1 arenn001 
+ 2 arenn001 
+ 0 
+ 0 perad001 
+ 0 polla001 perad001 
+ 0 goldp001 polla001 perad001 
+ 0  lambj001 goldp001 
+ 1  lambj001 goldp001 
+ 2   lambj001 
+ 0 
+ 1 "))
> close(zz)
> ari18.test3.raw
 [1] "0 "                            "1 "
 [3] "1 "                            "1 arenn001 "
 [5] "2 arenn001 "                   "0 "
 [7] "0 perad001 "                   "0 polla001 perad001 "
 [9] "0 goldp001 polla001 perad001 " "0  lambj001 goldp001 "
[11] "1  lambj001 goldp001 "         "2   lambj001 "
[13] "0 "                            "1 "
> aa <- strsplit(trimws(ari18.test3.raw), split=" ")
> bb <- t(sapply(aa, FUN=function(x) {c(x, rep(NA, length.out=4-length(x)))} ))
> cc <- t(apply(bb[,-1], 1, FUN=function(x) {ifelse(test=nchar(x), yes=1, no=0)} ))
> bb
      [,1] [,2]       [,3]       [,4]
 [1,] "0"  NA         NA         NA
 [2,] "1"  NA         NA         NA
 [3,] "1"  NA         NA         NA
 [4,] "1"  "arenn001" NA         NA
 [5,] "2"  "arenn001" NA         NA
 [6,] "0"  NA         NA         NA
 [7,] "0"  "perad001" NA         NA
 [8,] "0"  "polla001" "perad001" NA
 [9,] "0"  "goldp001" "polla001" "perad001"
[10,] "0"  ""         "lambj001" "goldp001"
[11,] "1"  ""         "lambj001" "goldp001"
[12,] "2"  ""         ""         "lambj001"
[13,] "0"  NA         NA         NA
[14,] "1"  NA         NA         NA
> cc
      [,1] [,2] [,3]
 [1,]   NA   NA   NA
 [2,]   NA   NA   NA
 [3,]   NA   NA   NA
 [4,]    1   NA   NA
 [5,]    1   NA   NA
 [6,]   NA   NA   NA
 [7,]    1   NA   NA
 [8,]    1    1   NA
 [9,]    1    1    1
[10,]    0    1    1
[11,]    0    1    1
[12,]    0    0    1
[13,]   NA   NA   NA
[14,]   NA   NA   NA

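If you would rather have 0 than NA for the empty bases, here is one
possible variation on the same idea. It is only a sketch: it pads
short rows with empty strings instead of NA, uses a handful of the
rows above, and the names "raw", "fields", "padded" and "runners" are
mine:

```r
# Raw lines as in the transcript; runs of spaces encode empty bases.
raw <- c("0 ", "1 arenn001 ", "0 goldp001 polla001 perad001 ",
         "0  lambj001 goldp001 ", "2   lambj001 ")

# Split on SINGLE spaces so consecutive spaces yield empty strings.
fields <- strsplit(trimws(raw), split = " ", fixed = TRUE)

# Pad each row to 4 fields (outs + three bases) with empty strings.
padded <- t(sapply(fields, function(x) c(x, rep("", 4 - length(x)))))

# 1 where a runner ID is present, 0 otherwise.
runners <- (padded[, -1, drop = FALSE] != "") + 0
colnames(runners) <- c("R1", "R2", "R3")
runners
```

This still depends on the number of spaces in the raw text being
trustworthy, which is exactly why the ".csv" route is safer.
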
HTH, Bill.

W. Michels, Ph.D.

On Wed, Oct 23, 2019 at 12:40 AM PIKAL Petr <[hidden email]> wrote:

>

> Hi

>

> ***do not think in if or if loops in R***.

>

> to elaborate Jim's solution further

>

> With simple function based on logical expression

> fff <- function(x) (x!="")+0

>

> you could use apply

>

> t(apply(phdf[,3:5], 1, fff))

>

> and add results to your data frame columns

> phdf[, 6:8] <- t(apply(phdf[,3:5], 1, fff))

>

> Regarding some tutorial

>

> Basic stuff is in R-intro, there is excellent documentation to each function.

>

> And as R users pool is huge, you could simply ask Google

> e.g.

> r change values based on condition

>

> Cheers

> Petr

>

> > -----Original Message-----
> > From: R-help <[hidden email]> On Behalf Of Jim Lemon
> > Sent: Wednesday, October 23, 2019 12:26 AM
> > To: Phillip Heinrich <[hidden email]>
> > Cc: r-help <[hidden email]>
> > Subject: Re: [R] If Loop I Think

> >

> > Hi Philip,

> > Try this:

> >

> > phdf<-read.table(

> > text="Row Outs RunnerFirst RunnerSecond RunnerThird R1 R2 R3

> > 1 0

> > 2 1

> > 3 1

> > 4 1 arenn001

> > 5 2 arenn001

> > 6 0

> > 7 0 perad001

> > 8 0 polla001 perad001

> > 9 0 goldp001 polla001 perad001

> > 10 0 lambj001 goldp001

> > 11 1 lambj001 goldp001

> > 12 2 lambj001

> > 13 0

> > 14 1 ",

> > header=TRUE,stringsAsFactors=FALSE,fill=TRUE)

> > phdf$R1<-ifelse(nchar(phdf$RunnerFirst) > 0,1,0)

> > phdf$R2<-ifelse(nchar(phdf$RunnerSecond) > 0,1,0)

> > phdf$R3<-ifelse(nchar(phdf$RunnerThird) > 0,1,0)

> >

> > Jim

> >

> > On Wed, Oct 23, 2019 at 7:54 AM Phillip Heinrich <[hidden email]> wrote:

> > >

> > > Row Outs RunnerFirst RunnerSecond RunnerThird R1 R2 R3

> > > 1 0

> > > 2 1

> > > 3 1

> > > 4 1 arenn001

> > > 5 2 arenn001

> > > 6 0

> > > 7 0 perad001

> > > 8 0 polla001 perad001

> > > 9 0 goldp001 polla001 perad001

> > > 10 0 lambj001 goldp001

> > > 11 1 lambj001 goldp001

> > > 12 2 lambj001

> > > 13 0

> > > 14 1

> > >

> > >

> > >

> > > With the above data, Arizona Diamondbacks baseball, I’m trying to put

> > zeros into the R1 column is the RunnerFirst column is blank and a one if the

> > column has a coded entry such as rows 4,5,7,8,& 9. Similarly I want zeros in

> > R2 and R3 if RunnerSecond and RunnerThird respectively are blank and ones

> > if there is an entry.

> > >

> > > I’ve tried everything I know how to do such as “If Loops”, “If-Then loops”,

> > “apply”, “sapply”, etc. I wrote function below and it ran without errors but I

> > have no idea what to do with it to accomplish my goal:

> > >

> > > R1 <- function(x) {

> > > if (ari18.test3$RunnerFirst == " "){

> > > ari18.test3$R1 <- 0

> > > return(R1)

> > > }else{

> > > R1 <- ari18.test3$R1 <- 1

> > > return(R1)

> > > }

> > > }

> > >

> > > The name of the data frame is ari18.test3

> > >

> > > On a more philosophical note, data handling in R seems to be made up of

> > thousands of details with no over-riding principles. I’ve read two books on R

> > and a number of tutorial and watched several videos but I don’t seem to be

> > making any progress. Can anyone suggest videos, or tutorials, or books that

> > might help? Database stuff has never been my strong point but I’m

> > determined to learn.

> > >

> > > Thanks,

> > > Philip Heinrich

> > > [[alternative HTML version deleted]]

> > >

> > > ______________________________________________
> > > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.

> >