Creating data using multiple for loops

Gregory Gilbert
I would like to create pseudo identification numbers in the format of last
four of a social security number (0000 to 9999), month of birth (01 to 12),
and day of birth (01-28). The IDs can be character.

I have gotten this far:

for (ssn in 0:9){
     for (month in 1:3){
          for (day in 1:5){
                      }
                      id <-paste(ssn, month, day, sep="")
            }
}

limiting each value above for demonstration purposes. I cannot figure out
how to store the created IDs. I know I have to create a container, but I
don't know, among other things, how to index the container.  Any help is
appreciated. TIA

-Greg

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Creating data using multiple for loops

Bert Gunter-2
id <- do.call(paste0,expand.grid(0:9, 1:3, 1:5))

Comment: If you use R much, you'll do much better using R language
constructs than trying to apply those from other languages (Java perhaps?).
I realize this can be difficult, especially if you are experienced in
another language (or languages), but it's worth the effort.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Aug 18, 2019 at 11:58 AM <[hidden email]> wrote:

> I would like to create pseudo identification numbers in the format of last
> four of a social security number (0000 to 9999), month of birth (01 to 12),
> and day of birth (01-28). The IDs can be character.

Re: Creating data using multiple for loops

Jim Lemon-4
In reply to this post by Gregory Gilbert
Hi Greg,
One problem is that you have misplaced the closing brace in the third
loop. It should follow the assignment statement. Because you used
loops rather than Bert's suggestion, perhaps you are trying to order
the values assigned. In your example, the ordering will be ssn, then
month of birth, then day of birth. Occasionally people resort to an
explicit calculation for the index:

id<-vector("character",10*3*5)
for (ssn in 0:9) {
     for (month in 1:3) {
          for (day in 1:5) {
               id[day + (month-1)*5 + ssn*15] <- paste0(ssn, month, day)
          }
     }
}
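
If the index arithmetic feels error-prone, a running counter is another way to fill the container. This is only a minimal sketch using the same reduced ranges as above; the counter name k is illustrative and not part of the code posted in this thread.

```r
# Preallocate the container, then fill it one slot at a time.
id <- character(10 * 3 * 5)
k <- 0                      # running position in id (illustrative name)
for (ssn in 0:9) {
  for (month in 1:3) {
    for (day in 1:5) {
      k <- k + 1
      id[k] <- paste0(ssn, month, day)
    }
  }
}
length(id)   # 150
head(id, 3)  # "011" "012" "013"
```

The counter gives the same ordering as the explicit index calculation, because both follow the loop nesting.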

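This orders the values with day varying fastest and ssn slowest, the
opposite of expand.grid, in which the first argument varies fastest.
Also, you may not want to create well over 3 million values
(10000 * 12 * 28 = 3,360,000) as in your initial specification, in
which case a different strategy using "sample" would be appropriate.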
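
One possible reading of the "sample" strategy, sketched under the assumption that only a modest number of random IDs is needed rather than every combination. The value of n and the use of sprintf() for zero-padding are assumptions made for illustration.

```r
set.seed(1)   # only so the draw is reproducible
n <- 1000     # assumed number of pseudo IDs wanted

# Draw each component independently from its full range.
ssn   <- sample(0:9999, n, replace = TRUE)
month <- sample(1:12,   n, replace = TRUE)
day   <- sample(1:28,   n, replace = TRUE)

# Zero-pad each part so every ID has exactly 8 characters.
id <- sprintf("%04d%02d%02d", ssn, month, day)

# Independent draws can repeat; drop repeats if the IDs must be unique.
id <- unique(id)
```

With 3.36 million possible IDs, repeats among 1000 draws are rare, but unique() may still return slightly fewer than n values.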

Jim


Re: Creating data using multiple for loops

R help mailing list-2
In reply to this post by Bert Gunter-2
do.call(paste0, expand.grid(0:1000, 1:12, 1:30)) takes care of storing all
the values, but note that paste0() doesn't put leading zeroes in front of
small numbers, so this maps lots of ssn/month/day combos to the same
id. For example, ssn=1, month=11, day=1 and ssn=11, month=1, day=1 both
give "1111". sprintf() can take care of that:
id <- with(expand.grid(ssn=0:1000, month=1:12, day=1:30),
           sprintf("%04d%02d%02d", ssn, month, day))

You probably should define a function to map vectors of ssn, month,  and
day to a vector of ids (it can also check for inappropriate inputs), check
that it works, and use it instead of repeating the sprintf() or paste0()
code.
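
A minimal sketch of such a function, assuming the ranges from the original question (ssn 0 to 9999, month 1 to 12, day 1 to 28). The name make_id is hypothetical and does not come from any package.

```r
# make_id is a hypothetical helper name, not from any package.
# It maps vectors of ssn, month and day to zero-padded 8-character IDs.
make_id <- function(ssn, month, day) {
  # Reject values outside the ranges given in the original question.
  stopifnot(
    all(ssn   %in% 0:9999),
    all(month %in% 1:12),
    all(day   %in% 1:28)
  )
  sprintf("%04d%02d%02d",
          as.integer(ssn), as.integer(month), as.integer(day))
}

make_id(7, 3, 9)                 # "00070309"

# Check it on the reduced demonstration ranges.
grid <- expand.grid(ssn = 0:9, month = 1:3, day = 1:5)
id <- with(grid, make_id(ssn, month, day))
anyDuplicated(id)                # 0, i.e. no two combos share an id
```

Keeping the formatting in one place means a later change, such as a different width, only has to be made once.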

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Sun, Aug 18, 2019 at 12:18 PM Bert Gunter <[hidden email]> wrote:

> id <- do.call(paste0,expand.grid(0:9, 1:3, 1:5))

Re: Creating data using multiple for loops

Jim Lemon-4
In reply to this post by Jim Lemon-4
Hi Greg,
I replied because I thought the name of the "expand.grid" function can
be puzzling. While "expand.grid" is a very elegant and useful
function, it is much easier to see what is happening with explicit
loops rather than loops buried deep inside "expand.grid". Also note
Bill's comment about producing repeats by converting numeric values to
character without the leading zeros. You can also use "formatC" to
deal with that problem.
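
A short sketch of the "formatC" route, assuming integer inputs; the example values are made up for illustration.

```r
# Example values (made up): two IDs' worth of components.
ssn   <- c(7L, 1234L)
month <- c(3L, 11L)
day   <- c(9L, 28L)

# width sets the field width, flag = "0" pads with zeros on the left.
id <- paste0(
  formatC(ssn,   width = 4, flag = "0"),
  formatC(month, width = 2, flag = "0"),
  formatC(day,   width = 2, flag = "0")
)
id  # "00070309" "12341128"
```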

Jim

On Tue, Aug 20, 2019 at 12:05 AM <[hidden email]> wrote:
>
> Jim,
>
> Thank you very much for your help. I have "unpacked" the code and have a rudimentary understanding of what you did. Thanks again. However, I have no idea to what Bert is referring. Could you help me understand his suggestion? Thanks.
>
> -Greg


Re: Creating data using multiple for loops

Bert Gunter-2
From section 9.2.2 (on looping) in "An Introduction to R":

"*Warning*: for() loops are used in R code much less often than in compiled
languages. Code that takes a ‘whole object’ view is likely to be both
clearer and faster in R."

Web searching on "for loops in R" and similar will give you further
comments and perspectives.
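
As one illustration of the "whole object" view, the sketch below builds the same IDs as the nested loops by constructing three full-length vectors with rep() and pasting them once. It uses the reduced ranges from the thread; the variable names are illustrative.

```r
# Loop version, for comparison.
id_loop <- character(150)
k <- 0
for (ssn in 0:9) {
  for (month in 1:3) {
    for (day in 1:5) {
      k <- k + 1
      id_loop[k] <- paste0(ssn, month, day)
    }
  }
}

# Whole-object version: build three full-length vectors, paste once.
ssn   <- rep(0:9, each = 3 * 5)               # slowest-varying part
month <- rep(rep(1:3, each = 5), times = 10)
day   <- rep(1:5, times = 10 * 3)             # fastest-varying part
id_vec <- paste0(ssn, month, day)

identical(id_loop, id_vec)  # TRUE
```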

Cheers,
Bert


On Mon, Aug 19, 2019 at 2:12 PM Jim Lemon <[hidden email]> wrote:

> Hi Greg,
> I replied because I thought the name of the "expand.grid" function can
> be puzzling. While "expand.grid" is a very elegant and useful
> function, it is much easier to see what is happening with explicit
> loops rather than loops buried deep inside "expand.grid". Also note
> Bill's comment about producing repeats by converting numeric values to
> character without the leading zeros. You can also use "formatC" to
> deal with that problem.

Re: Creating data using multiple for loops

Jeff Newmiller
In reply to this post by Jim Lemon-4
Perhaps different people find different concepts the most challenging, but I find looking at the output of expand.grid quite informative... do try it out.

The do.call function seems to be the more obscure function here, but Bert's code

id <- do.call( paste0, expand.grid(0:9,1:3,1:5) )

is equivalent to

all_comb <- expand.grid( 0:9, 1:3, 1:5 )
all_comb # look at it for learning, remove once you understand
paste0( all_comb[[1]], all_comb[[2]], all_comb[[3]] )

because all_comb is a data frame, which is a list of column vectors all the same length. The do.call function expects the first argument to be a function symbol, while the second argument to do.call should be a single object that is a list of arguments you want that function to be given as separate arguments. The paste0 function puts the three vectors together into one character vector, element by element.
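
To check the equivalence, and to see how argument order controls which part varies fastest, a small sketch may help. The column names in the second grid are chosen for illustration only.

```r
grid <- expand.grid(0:9, 1:3, 1:5)
head(grid, 3)   # columns Var1, Var2, Var3; Var1 varies fastest

# do.call() hands each column to paste0() as a separate argument.
id_a <- do.call(paste0, grid)
id_b <- paste0(grid$Var1, grid$Var2, grid$Var3)
identical(id_a, id_b)  # TRUE
id_a[1:3]              # "011" "111" "211"

# To get the nested-loop ordering (day fastest, ssn slowest), list the
# fastest-varying part first and paste the columns in the wanted order.
grid2 <- expand.grid(day = 1:5, month = 1:3, ssn = 0:9)
id_c <- with(grid2, paste0(ssn, month, day))
id_c[1:3]              # "011" "012" "013"
```

Both grids contain the same 150 combinations; only the row order differs.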

Read the help pages for each function:
?expand.grid
?paste0
?do.call

On the other hand, nested for loops seem to become spaghetti quickly in my mind... essentially just write-only code because I never want to look at it again.
