Help understanding loop behaviour

Help understanding loop behaviour

R help mailing list-2
I am trying to understand how loops in R operate. I have a simple data frame xx, which is as follows:

COMPANY_NUMBER   NUMBER_OF_YEARS
#0070837         3
#0070837         3
#0070837         3
1000403          4
1000403          4
1000403          4
1000403          4
10029943         3
10029943         3
10029943         3
10037980         4
10037980         4
10037980         4
10037980         4
10057418         3
10057418         3
10057418         3
1009550          4
1009550          4
1009550          4
1009550          4

The code I have written is

while (i <= nrow(xx1)) {
  for (j in 1:xx1$NUMBER_OF_YEARS[i]) {
    xx1$I[i] <- i
    xx1$J[j] <- j
    xx1$NUMBER_OF_YEARS_j[j] <- xx1$NUMBER_OF_YEARS[j]
  }
  i = i + (xx1$NUMBER_OF_YEARS[i])
}

After running the code I want my dataframe to look like

COMPANY_NUMBER   NUMBER_OF_YEARS   I   J
#0070837         3                 1   1
#0070837         3                 1   2
#0070837         3                 3   3
1000403          4                 1   1
1000403          4                 1   2
1000403          4                 1   3
1000403          4                 4   4
10029943         3                 1   1
10029943         3                 1   2
10029943         3                 3   3
10037980         4                 1   1
10037980         4                 1   2
10037980         4                 1   3
10037980         4                 4   4
10057418         3                 1   1
10057418         3                 1   1
10057418         3                 1   1
1009550          4                 1   1
1009550          4                 1   2
1009550          4                 1   3
1009550          4                 4   4


I get the correct value of I, but in the wrong row. The value of J is correct in the first iteration and then it goes to 1.

Any help will be greatly appreciated
        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Help understanding loop behaviour

PIKAL Petr
Hi

Your code is hardly readable because you used HTML formatting (not recommended), so
I used another (split) approach.

The third column seems to be simple:

#make list
lll <- split(as.factor(xx$COMPANY_NUMBER), xx$COMPANY_NUMBER)

#calculate sequences
as.numeric(unlist(lapply(lll, function(x) 1:length(x))))
#this should give you the third column

The second column seems to be calculated this way:
lapply(lll, function(x) c(rep(1, length(x)-1), max(length(x))))

I believe others could come up with simpler solutions.

BTW, why should the result for 10057418 be different?

Cheers
Petr

> -----Original Message-----
> From: R-help <[hidden email]> On Behalf Of e-mail
> ma015k3113 via R-help
> Sent: Thursday, April 29, 2021 5:41 PM
> To: [hidden email]
> Subject: [R] Help understanding loop behaviour

Re: Help understanding loop behaviour

Jim Lemon-4
In reply to this post by R help mailing list-2
Hi email,
If you want what you described, try this:

xx<-read.table(text="COMPANY_NUMBER NUMBER_OF_YEARS
0070837  3
0070837  3
0070837  3
1000403  4
1000403  4
1000403  4
1000403  4
10029943  3
10029943  3
10029943  3
10037980  4
10037980  4
10037980  4
10037980  4
10057418  3
10057418  3
10057418  3
1009550  4
1009550  4
1009550  4
1009550  4",
header=TRUE,stringsAsFactors=FALSE)
xx$I<-NA
xx$J<-NA
row_count<-1
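for(row in 1:nrow(xx)) {
 # not the last row of a company: a next row exists and has the same company
 if(row < nrow(xx) && xx$COMPANY_NUMBER[row]==xx$COMPANY_NUMBER[row+1]) {
  xx$I[row]<-1
  xx$J[row]<-row_count
  row_count<-row_count+1
 } else {
  # last row of a company, including the final row of the data frame
  xx$I[row]<-xx$J[row]<-xx$NUMBER_OF_YEARS[row]
  row_count<-1
 }
}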
xx
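
An aside on why the original while/for loop behaves as reported (a sketch, not a definitive diagnosis). Inside the for loop, j always runs from 1 to NUMBER_OF_YEARS[i], so xx1$J[j] and xx1$NUMBER_OF_YEARS_j[j] always write to the first three or four rows of the data frame, never to the rows of the current company, and xx1$I[i] only touches the first row of each company. If the columns I and J do not exist beforehand, the first single-value assignment is, as far as I can tell, recycled down the whole column, which would explain why J reads 1 after the first few rows. The version below assumes that each company occupies exactly NUMBER_OF_YEARS consecutive rows. The company ids are toy values, and the helper names n and r are mine, not from the original post.

```r
# toy data: three companies with 3, 4 and 3 rows (ids are made up)
xx1 <- data.frame(
  COMPANY_NUMBER  = rep(c("A", "B", "C"), times = c(3, 4, 3)),
  NUMBER_OF_YEARS = rep(c(3, 4, 3), times = c(3, 4, 3))
)
xx1$I <- NA                      # create the columns before filling them
xx1$J <- NA

i <- 1                           # i must be initialised; it marks the first row of a company
while (i <= nrow(xx1)) {
  n <- xx1$NUMBER_OF_YEARS[i]    # number of rows in the current company
  for (j in seq_len(n)) {
    r <- i + j - 1               # the row that j refers to inside this company
    xx1$J[r] <- j
    xx1$I[r] <- if (j < n) 1 else n
  }
  i <- i + n                     # jump to the first row of the next company
}
xx1
```

The only essential change is the row index r <- i + j - 1: i locates the company, j locates the row within it.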

Like Petr, I am assuming that you want company 10057418 treated the
same as the others. If not, let us know why. I am also assuming that
the first three rows should _not_ have a "#" at the beginning; with
the "#", read.table would treat them as comments and discard them.
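
To illustrate the point about "#": read.table uses "#" as its comment character by default (comment.char = "#"), so lines starting with it are dropped. A small sketch with two rows taken from the data above. The colClasses choice is my assumption, used here only to keep the ids as text.

```r
txt <- "COMPANY_NUMBER NUMBER_OF_YEARS
#0070837 3
1000403 4"

# default settings: the line starting with "#" is read as a comment and skipped
a <- read.table(text = txt, header = TRUE)
nrow(a)

# comment.char = "" switches comment handling off; reading the ids as
# character keeps both the "#" and the leading zeros
b <- read.table(text = txt, header = TRUE, comment.char = "",
                colClasses = c("character", "integer"))
b$COMPANY_NUMBER
```

If the "#" really is part of the company number, the second form keeps those rows.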

Jim

On Fri, Apr 30, 2021 at 1:41 AM e-mail ma015k3113 via R-help
<[hidden email]> wrote:


Re: Help understanding loop behaviour

PIKAL Petr
Hallo,

Sorry, my suggestion did not work correctly in your case, as split used
the natural factor ordering.

So, using Jim's data, this results in the desired output.

#prepare factor in original ordering
ff <- factor(xx[,1], levels=unique(xx[,1]))
lll <- split(xx$COMPANY_NUMBER, ff)
xx$I <- unlist(lapply(lll, function(x) c(rep(1, length(x)-1),
max(length(x)))),use.names=FALSE)
xx$J <- unlist(lapply(lll, function(x) 1:length(x)), use.names=FALSE)
> xx
   COMPANY_NUMBER NUMBER_OF_YEARS I J
1           70837               3 1 1
2           70837               3 1 2
3           70837               3 3 3
4         1000403               4 1 1
5         1000403               4 1 2
6         1000403               4 1 3
7         1000403               4 4 4
8        10029943               3 1 1
9        10029943               3 1 2
10       10029943               3 3 3
11       10037980               4 1 1
12       10037980               4 1 2
13       10037980               4 1 3
14       10037980               4 4 4
15       10057418               3 1 1
16       10057418               3 1 2
17       10057418               3 3 3
18        1009550               4 1 1
19        1009550               4 1 2
20        1009550               4 1 3
21        1009550               4 4 4
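
The ordering issue mentioned at the top of this message can be seen on a toy vector (a sketch with made-up ids): split() returns the groups in the order of the factor levels, which are sorted by default, so unlist() only lines up with the rows when the levels follow the order of appearance.

```r
x <- c("b", "b", "a", "a", "a")          # toy ids; "b" appears first in the data

# split() on the raw values: groups come out in sorted level order
names(split(x, x))

# fixing the levels to the order of appearance keeps the data order
ff <- factor(x, levels = unique(x))
names(split(x, ff))

# the unlisted sequences match the rows only in the second case
bad  <- unlist(lapply(split(x, x),  seq_along), use.names = FALSE)   # 1 2 3 1 2
good <- unlist(lapply(split(x, ff), seq_along), use.names = FALSE)   # 1 2 1 2 3
```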

Cheers.
Petr

> -----Original Message-----
> From: R-help <[hidden email]> On Behalf Of Jim Lemon
> Sent: Friday, April 30, 2021 11:45 AM
> To: e-mail ma015k3113 <[hidden email]>; r-help mailing list
> <[hidden email]>
> Subject: Re: [R] Help understanding loop behaviour

Re: Help understanding loop behaviour

Rui Barradas
Hello,

For column J, ave/seq_along seems to be the simplest. For column I, ave
is also a good option; it avoids split/lapply.


xx$I <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = function(x){
   c(rep(1, length(x) - 1), max(length(x)))
})

xx$J <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = seq_along)
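
A small sketch of why ave is convenient here, using made-up group ids: the result comes back in the original row order even when the groups are not sorted, so no factor releveling is needed.

```r
g <- c(20, 20, 20, 10, 10)     # toy group ids, deliberately not sorted
v <- c(3, 3, 3, 2, 2)          # toy values; only their grouping matters here

# within-group running index, returned in the original row order
res <- ave(v, g, FUN = seq_along)
res                            # 1 2 3 1 2
```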


Hope this helps,

At 11:49 on 30/04/21, PIKAL Petr wrote:


Re: Help understanding loop behaviour

Bert Gunter-2
There is something wrong here I believe -- see inline below:

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Apr 30, 2021 at 10:37 AM Rui Barradas <[hidden email]> wrote:

> Hello,
>
> For column J, ave/seq_along seems to be the simplest. For column I, ave
> is also a good option, it avoids split/lapply.
>
>
> xx$I <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = function(x){
>    c(rep(1, length(x) - 1), max(length(x)))  ### ???
> })
>
> **********
length() returns a single integer, so max(length(x)) makes no sense
************************************
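
A quick check of this point on a toy group of three rows (a sketch): max() of a single number is that number, so the call is redundant and length(x) alone gives the same result.

```r
x <- c(3, 3, 3)                               # one toy group of three rows
len <- length(x)                              # a single integer: 3
same <- identical(max(length(x)), len)        # TRUE
out <- c(rep(1, length(x) - 1), length(x))    # 1 1 3
```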


> xx$J <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = seq_along)

Re: Help understanding loop behaviour

Rui Barradas
Hello,

Right, thanks. It should be


xx$I <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = function(x){
         c(rep(1, length(x) - 1), length(x))
     })
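
For completeness, a sketch that applies the corrected function together with the seq_along line from the earlier message to the thread's data. The company numbers are entered as plain numbers, as they appear in Petr's printed output. The data frame is rebuilt with rep() purely to keep the example short.

```r
xx <- data.frame(
  COMPANY_NUMBER  = rep(c(70837, 1000403, 10029943, 10037980, 10057418, 1009550),
                        times = c(3, 4, 3, 4, 3, 4)),
  NUMBER_OF_YEARS = rep(c(3, 4, 3, 4, 3, 4), times = c(3, 4, 3, 4, 3, 4))
)

# I: 1 on every row of a company except the last, which gets the group size
xx$I <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER,
            FUN = function(x) c(rep(1, length(x) - 1), length(x)))

# J: running index within each company
xx$J <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = seq_along)
xx
```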


Hope this helps,

Rui Barradas

At 19:46 on 30/04/21, Bert Gunter wrote:

> There is something wrong here I believe -- see inline below:
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Fri, Apr 30, 2021 at 10:37 AM Rui Barradas <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     Hello,
>
>     For column J, ave/seq_along seems to be the simplest. For column I, ave
>     is also a good option, it avoids split/lapply.
>
>
>     xx$I <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = function(x){
>         c(rep(1, length(x) - 1), max(length(x)))  ### ???
>     })
>
> **********
> length() returns a single integer, so max(length(x)) makes no sense
> ************************************
>
>     xx$J <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = seq_along)
>
>
>     Hope this helps,
>
>     Às 11:49 de 30/04/21, PIKAL Petr escreveu:
>      > Hallo,
>      >
>      > Sorry, my suggestion did not worked in your case correctly as
>     split used
>      > natural factor ordering.
>      >
>      > So using Jim's data, this results in desired output.
>      >
>      > #prepare factor in original ordering
>      > ff <- factor(xx[,1], levels=unique(xx[,1]))
>      > lll <- split(xx$COMPANY_NUMBER, ff)
>      > xx$I <- unlist(lapply(lll, function(x) c(rep(1, length(x)-1),
>      > max(length(x)))),use.names=FALSE)
>      > xx$J <- unlist(lapply(lll, function(x) 1:length(x)), use.names=FALSE)
>      >> xx
>      >     COMPANY_NUMBER NUMBER_OF_YEARS I J
>      > 1           70837               3 1 1
>      > 2           70837               3 1 2
>      > 3           70837               3 3 3
>      > 4         1000403               4 1 1
>      > 5         1000403               4 1 2
>      > 6         1000403               4 1 3
>      > 7         1000403               4 4 4
>      > 8        10029943               3 1 1
>      > 9        10029943               3 1 2
>      > 10       10029943               3 3 3
>      > 11       10037980               4 1 1
>      > 12       10037980               4 1 2
>      > 13       10037980               4 1 3
>      > 14       10037980               4 4 4
>      > 15       10057418               3 1 1
>      > 16       10057418               3 1 2
>      > 17       10057418               3 3 3
>      > 18        1009550               4 1 1
>      > 19        1009550               4 1 2
>      > 20        1009550               4 1 3
>      > 21        1009550               4 4 4
>      >
>      > Cheers.
>      > Petr
>      >
>      >> -----Original Message-----
>      >> From: R-help <[hidden email]> On Behalf Of Jim Lemon
>      >> Sent: Friday, April 30, 2021 11:45 AM
>      >> To: e-mail ma015k3113 <[hidden email]>; r-help mailing list
>      >> <[hidden email]>
>      >> Subject: Re: [R] Help understanding loop behaviour
>      >>
>      >> Hi email,
>      >> If you want what you described, try this:
>      >>
>      >> xx<-read.table(text="COMPANY_NUMBER NUMBER_OF_YEARS
>      >> 0070837  3
>      >> 0070837  3
>      >> 0070837  3
>      >> 1000403  4
>      >> 1000403  4
>      >> 1000403  4
>      >> 1000403  4
>      >> 10029943  3
>      >> 10029943  3
>      >> 10029943  3
>      >> 10037980  4
>      >> 10037980  4
>      >> 10037980  4
>      >> 10037980  4
>      >> 10057418  3
>      >> 10057418  3
>      >> 10057418  3
>      >> 1009550  4
>      >> 1009550  4
>      >> 1009550  4
>      >> 1009550  4",
>      >> header=TRUE,stringsAsFactors=FALSE)
>      >> xx$I<-NA
>      >> xx$J<-NA
>      >> row_count<-1
>      >> for(row in 1:nrow(xx)) {
>      >>   if(row < nrow(xx) &&
>      >> xx$COMPANY_NUMBER[row]==xx$COMPANY_NUMBER[row+1]) {
>      >>    xx$I[row]<-1
>      >>    xx$J[row]<-row_count
>      >>    row_count<-row_count+1
>      >>   } else {
>      >>    xx$I[row]<-xx$J[row]<-xx$NUMBER_OF_YEARS[row]
>      >>    row_count<-1
>      >>   }
>      >> }
>      >> xx
>      >>
>      >> Like Petr, I am assuming that you want company 10057418 treated the
>      >> same as the others. If not, let us know why. I am also assuming that
>      >> the first three rows should _not_ have a "#" at the beginning, which
>      >> means that they will be discarded.
>      >>
>      >> Jim
>      >>
>      >> On Fri, Apr 30, 2021 at 1:41 AM e-mail ma015k3113 via R-help
>      >> <r-help@r-project.org> wrote:
>      >>>
>      >>> I am trying to understand how loops in R operate. I have a simple
>      >>> dataframe xx which is as follows
>      >>>
>      >>> COMPANY_NUMBER   NUMBER_OF_YEARS
>      >>>
>      >>> #0070837                             3
>      >>> #0070837                             3
>      >>> #0070837                             3
>      >>> 1000403                               4
>      >>> 1000403                               4
>      >>> 1000403                               4
>      >>> 1000403                               4
>      >>> 10029943                             3
>      >>> 10029943                             3
>      >>> 10029943                             3
>      >>> 10037980                             4
>      >>> 10037980                             4
>      >>> 10037980                             4
>      >>> 10037980                             4
>      >>> 10057418                             3
>      >>> 10057418                             3
>      >>>
>      >>> 10057418                             3
>      >>> 1009550                               4
>      >>> 1009550                               4
>      >>> 1009550                               4
>      >>> 1009550                               4
>      >>> The code I have written is
>      >>>
>      >>> while (i <= nrow(xx1) )
>      >>>
>      >>> {
>      >>>
>      >>> for (j in 1:xx1$NUMBER_OF_YEARS[i])
>      >>> {
>      >>> xx1$I[i] <- i
>      >>> xx1$J[j] <- j
>      >>> xx1$NUMBER_OF_YEARS_j[j] <- xx1$NUMBER_OF_YEARS[j]
>      >>> }
>      >>> i=i + (xx1$NUMBER_OF_YEARS[i] )
>      >>> }
>      >>> After running the code I want my dataframe to look like
>      >>>
>      >>> |COMPANY_NUMBER |NUMBER_OF_YEARS| | I| |J|
>      >>>
>      >>> |#0070837 |3| |1| |1|
>      >>> |#0070837 |3| |1| |2|
>      >>> |#0070837 |3| |3| |3|
>      >>> |1000403 |4| |1| |1|
>      >>> |1000403 |4| |1| |2|
>      >>> |1000403 |4| |1| |3|
>      >>> |1000403 |4| |4| |4|
>      >>> |10029943 |3| |1| |1|
>      >>> |10029943 |3| |1| |2|
>      >>> |10029943 |3| |3| |3|
>      >>> |10037980 |4| |1| |1|
>      >>> |10037980 |4| |1| |2|
>      >>> |10037980 |4| |1| |3|
>      >>> |10037980 |4| |4| |4|
>      >>> |10057418 |3| |1| |1|
>      >>> |10057418 |3| |1| |1|
>      >>> |10057418 |3| |1| |1|
>      >>> |1009550 |4| |1| |1|
>      >>> |1009550 |4| |1| |2|
>      >>> |1009550 |4| |1| |3|
>      >>> |1009550 |4| |4| |4|
>      >>>
>      >>>
>      >>> I get the correct value of I but in the wrong row, and the value of J
>      >>> is correct in the first iteration and then it goes to 1
>      >>>
>      >>> Any help will be greatly appreciated
