How to pass a character string with a hyphen

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

How to pass a character string with a hyphen

reichmaj
R-Help

 

How does one pass a character string containing a hyphen? I have a function
that accesses an api if I hard code the object, for example

 

key_key <- "xxxx-yyyy"

 

it works but when I pass the key  code to the function (say something like
key_code <- code_input)  it returns only xxxx. So R is seeing a string with
a negative operator I'm assuming

 

Jeff


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: How to pass a character string with a hyphen

Boris Steipe
tmp <- function(s) {
  return(str(s))
}
key <- "xxxx-yyyy"
tmp(key)

#  chr "xxxx-yyyy"

... works for me.

Reprex?

Cheers,
Boris

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

counting duplicate items that occur in multiple groups

Al Koholic
Hi everyone.  I have a dataframe that is a collection of Vendor IDs  
plus a bank account number for each vendor. I'm trying to find a way  
to count the number of duplicate bank accounts that occur in more than  
one unique Vendor_ID, and then assign the count value for each row in  
the dataframe in a new variable.

I can do a count of bank accounts that occur within the same vendor  
using dplyr and group_by and count, but I can't figure out a way to  
count duplicates among multiple Vendor_IDs.


Dataframe example code:


#Create a sample data frame:

set.seed(1)

Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =  
sample(1:10000))




Thanks in advance for any help.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: How to pass a character string with a hyphen

reichmaj
In reply to this post by Boris Steipe
Boris

Yes in that case it works as you have assigned a character string to key. My
problem is I'm taking a string input from a shiny app and when I run the
function it only sees xxxx.  How to assign xxxx-yyyy to a character string.

Jeff
-----Original Message-----
From: Boris Steipe <[hidden email]>
Sent: Tuesday, November 17, 2020 3:00 PM
To: [hidden email]
Cc: [hidden email]
Subject: Re: [R] How to pass a character string with a hyphen

tmp <- function(s) {
  return(str(s))
}
key <- "xxxx-yyyy"
tmp(key)

#  chr "xxxx-yyyy"

.. works for me.

Reprex?

Cheers,
Boris

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: How to pass a character string with a hyphen

R help mailing list-2
In reply to this post by reichmaj
Hi,

You might want to look at ?shQuote, which wraps text in single quotes, if the source text does not include them, or double quotes otherwise, as might be used in a shell setting, where you are passing arguments that may have spaces or other characters that may be evaluated.

My guess is that the API that you are passing the character vector to may be parsing/evaluating the '-' and only seeing the first part of the passed value.

So, for example:

> shQuote("xxxx-yyyy")
[1] "'xxxx-yyyy'"


See if that works.

Regards,

Marc Schwartz


> On Nov 17, 2020, at 3:43 PM, Jeff Reichman <[hidden email]> wrote:
>
> R-Help
>
> How does one pass a character string containing a hyphen? I have a function
> that accesses an api if I hard code the object, for example
>
> key_key <- "xxxx-yyyy"
>
> it works but when I pass the key  code to the function (say something like
> key_code <- code_input)  it returns only xxxx. So R is seeing a string with
> a negative operator I'm assuming
>
> Jeff
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: How to pass a character string with a hyphen

Robert Knight
In reply to this post by reichmaj
Strip the left characters and strip the right characters into their own
variables using one of the methods that can do that.  Then pass it using
something like paste(left, "-", right).

On Tue, Nov 17, 2020, 2:43 PM Jeff Reichman <[hidden email]> wrote:

> R-Help
>
>
>
> How does one pass a character string containing a hyphen? I have a function
> that accesses an api if I hard code the object, for example
>
>
>
> key_key <- "xxxx-yyyy"
>
>
>
> it works but when I pass the key  code to the function (say something like
> key_code <- code_input)  it returns only xxxx. So R is seeing a string with
> a negative operator I'm assuming
>
>
>
> Jeff
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Bert Gunter-2
In reply to this post by Al Koholic
Inline.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]>
wrote:

> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
> plus a bank account number for each vendor.


I interpret this as: "all vendors are unique and each vendor has a single
bank account." Is that correct?


> I'm trying to find a way
> to count the number of duplicate bank accounts that occur in more than
> one unique Vendor_ID,


The following makes no sense to me, as each row is a unique vendor and has
only one bank account.

> and then assign the count value for each row in
> the dataframe in a new variable.
>
> I can do a count of bank accounts that occur within the same vendor
>
using dplyr and group_by and count, but I can't figure out a way to
> count duplicates among multiple Vendor_IDs.
>
I interpret this to mean that you want to count vendor ID's by account .
With only one account per vendor
this is trivial; e.g.

set.seed(22)
d1 <- data.frame(id = sample(1:30),
      account = sample(1:20,30, replace = TRUE))

table(d1$account)

## gives
 1  2  3  6  7  8  9 10 11 13 15 16 17 18 19 20
 3  1  2  1  1  1  1  1  4  3  1  2  1  3  2  3

Note that AFAICS your example is useless, as it gives the same number of
different account numbers as ID's, so no duplication can occur.

As my interpretations are likely incorrect and this is not what you mean
nor want, either clarify your meaning and provide a useful **minimal**
example; or wait for a reply from someone with a better understanding than
I.

Cheers,
Bert





>
> Dataframe example code:
>
>
> #Create a sample data frame:
>
> set.seed(1)
>
> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
> sample(1:10000))
>
>
>
>
> Thanks in advance for any help.
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Bill Dunlap-2
In reply to this post by Al Koholic
What should the result be for
  Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
Account=c("A1","A2","A2","A2"))
?

Must each vendor have only one account?  If not, what should the result be
for
   Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
Account=c("A1","A2","A2","A2","A3","A4"))
?

-Bill

On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]>
wrote:

> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
> plus a bank account number for each vendor. I'm trying to find a way
> to count the number of duplicate bank accounts that occur in more than
> one unique Vendor_ID, and then assign the count value for each row in
> the dataframe in a new variable.
>
> I can do a count of bank accounts that occur within the same vendor
> using dplyr and group_by and count, but I can't figure out a way to
> count duplicates among multiple Vendor_IDs.
>
>
> Dataframe example code:
>
>
> #Create a sample data frame:
>
> set.seed(1)
>
> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
> sample(1:10000))
>
>
>
>
> Thanks in advance for any help.
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Al Koholic
Hi Bill. Sorry to be so obtuse with the example data, I was trying  
(too hard) not to share any actual values so I just created randomized  
values for my example; of course I should have specified that the  
random values would not provide the expected problem pattern. I should  
have just used simple dummy codes as Bill Dunlap did.

So per Bill's example data for Data1, the expected (hoped for) output  
should be:

  Vendor Account Num_Vendors_Sharing_Bank_Acct
1     V1      A1      0
2     V2      A2      3
3     V3      A2      3
4     V4      A2      3


Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct.  
The value is 3 for V2, V3 and V4 because they each share bank account  
A2.


Likewise, in the Data2 frame, the same logic applies:

  Vendor Account Num_Vendors_Sharing_Bank_Acct
1     V1      A1     0
2     V2      A2     3
3     V3      A2     3
4     V1      A2     3
5     V4      A3     0
6     V2      A4     0






Thanks!


Quoting Bill Dunlap <[hidden email]>:

> What should the result be for
>   Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
> Account=c("A1","A2","A2","A2"))
> ?
>
> Must each vendor have only one account?  If not, what should the result be
> for
>    Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
> Account=c("A1","A2","A2","A2","A3","A4"))
> ?
>
> -Bill
>
> On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]>
> wrote:
>
>> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
>> plus a bank account number for each vendor. I'm trying to find a way
>> to count the number of duplicate bank accounts that occur in more than
>> one unique Vendor_ID, and then assign the count value for each row in
>> the dataframe in a new variable.
>>
>> I can do a count of bank accounts that occur within the same vendor
>> using dplyr and group_by and count, but I can't figure out a way to
>> count duplicates among multiple Vendor_IDs.
>>
>>
>> Dataframe example code:
>>
>>
>> #Create a sample data frame:
>>
>> set.seed(1)
>>
>> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
>> sample(1:10000))
>>
>>
>>
>>
>> Thanks in advance for any help.
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Bert Gunter-2
Why 0's in the data frame? Shouldn't that be 1 (vendor with that account)?

Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 17, 2020 at 3:29 PM Tom Woolman <[hidden email]>
wrote:

> Hi Bill. Sorry to be so obtuse with the example data, I was trying
> (too hard) not to share any actual values so I just created randomized
> values for my example; of course I should have specified that the
> random values would not provide the expected problem pattern. I should
> have just used simple dummy codes as Bill Dunlap did.
>
> So per Bill's example data for Data1, the expected (hoped for) output
> should be:
>
>   Vendor Account Num_Vendors_Sharing_Bank_Acct
> 1     V1      A1      0
> 2     V2      A2      3
> 3     V3      A2      3
> 4     V4      A2      3
>
>
> Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct.
> The value is 3 for V2, V3 and V4 because they each share bank account
> A2.
>
>
> Likewise, in the Data2 frame, the same logic applies:
>
>   Vendor Account Num_Vendors_Sharing_Bank_Acct
> 1     V1      A1     0
> 2     V2      A2     3
> 3     V3      A2     3
> 4     V1      A2     3
> 5     V4      A3     0
> 6     V2      A4     0
>
>
>
>
>
>
> Thanks!
>
>
> Quoting Bill Dunlap <[hidden email]>:
>
> > What should the result be for
> >   Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
> > Account=c("A1","A2","A2","A2"))
> > ?
> >
> > Must each vendor have only one account?  If not, what should the result
> be
> > for
> >    Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
> > Account=c("A1","A2","A2","A2","A3","A4"))
> > ?
> >
> > -Bill
> >
> > On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]>
> > wrote:
> >
> >> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
> >> plus a bank account number for each vendor. I'm trying to find a way
> >> to count the number of duplicate bank accounts that occur in more than
> >> one unique Vendor_ID, and then assign the count value for each row in
> >> the dataframe in a new variable.
> >>
> >> I can do a count of bank accounts that occur within the same vendor
> >> using dplyr and group_by and count, but I can't figure out a way to
> >> count duplicates among multiple Vendor_IDs.
> >>
> >>
> >> Dataframe example code:
> >>
> >>
> >> #Create a sample data frame:
> >>
> >> set.seed(1)
> >>
> >> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
> >> sample(1:10000))
> >>
> >>
> >>
> >>
> >> Thanks in advance for any help.
> >>
> >> ______________________________________________
> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Al Koholic
Yes, good catch. Thanks


Quoting Bert Gunter <[hidden email]>:

> Why 0's in the data frame? Shouldn't that be 1 (vendor with that account)?
>
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Nov 17, 2020 at 3:29 PM Tom Woolman <[hidden email]>
> wrote:
>
>> Hi Bill. Sorry to be so obtuse with the example data, I was trying
>> (too hard) not to share any actual values so I just created randomized
>> values for my example; of course I should have specified that the
>> random values would not provide the expected problem pattern. I should
>> have just used simple dummy codes as Bill Dunlap did.
>>
>> So per Bill's example data for Data1, the expected (hoped for) output
>> should be:
>>
>>   Vendor Account Num_Vendors_Sharing_Bank_Acct
>> 1     V1      A1      0
>> 2     V2      A2      3
>> 3     V3      A2      3
>> 4     V4      A2      3
>>
>>
>> Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct.
>> The value is 3 for V2, V3 and V4 because they each share bank account
>> A2.
>>
>>
>> Likewise, in the Data2 frame, the same logic applies:
>>
>>   Vendor Account Num_Vendors_Sharing_Bank_Acct
>> 1     V1      A1     0
>> 2     V2      A2     3
>> 3     V3      A2     3
>> 4     V1      A2     3
>> 5     V4      A3     0
>> 6     V2      A4     0
>>
>>
>>
>>
>>
>>
>> Thanks!
>>
>>
>> Quoting Bill Dunlap <[hidden email]>:
>>
>> > What should the result be for
>> >   Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
>> > Account=c("A1","A2","A2","A2"))
>> > ?
>> >
>> > Must each vendor have only one account?  If not, what should the result
>> be
>> > for
>> >    Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
>> > Account=c("A1","A2","A2","A2","A3","A4"))
>> > ?
>> >
>> > -Bill
>> >
>> > On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]>
>> > wrote:
>> >
>> >> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
>> >> plus a bank account number for each vendor. I'm trying to find a way
>> >> to count the number of duplicate bank accounts that occur in more than
>> >> one unique Vendor_ID, and then assign the count value for each row in
>> >> the dataframe in a new variable.
>> >>
>> >> I can do a count of bank accounts that occur within the same vendor
>> >> using dplyr and group_by and count, but I can't figure out a way to
>> >> count duplicates among multiple Vendor_IDs.
>> >>
>> >>
>> >> Dataframe example code:
>> >>
>> >>
>> >> #Create a sample data frame:
>> >>
>> >> set.seed(1)
>> >>
>> >> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
>> >> sample(1:10000))
>> >>
>> >>
>> >>
>> >>
>> >> Thanks in advance for any help.
>> >>
>> >> ______________________________________________
>> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >>
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

R help mailing list-2
In reply to this post by Al Koholic
Many problems can often be solved with some thought by using the right tools, such as the ones from the tidyverse.

Without giving a specific answer, you might want to think about using the group_by() functionality in a pipeline that would lump together all rows matching say having the same value in several columns. Then in something like a mutate() or summarize() you can use special functions like n() that return how many rows exist within each grouping. There are many more such verbs and features that let you build up something, often by removing the grouping along the way and perhaps adding some other form of grouping including the new rowwise() that then lets you do things across columns on a row at a time and so on.

I think the point is to think of steps that lead to a result that can be used in the next step and so on.

And, for some problems, you can  think outside the pipelines and create multiple intermediate data.frames with parts of what you will need and then combine them with joins or whatever it takes to efficiently get a result, or by brute force. Sometimes (as when making graphs) you might want to convert data between forms that are often called long versus wide.

Yes, plenty can be done in base R or using other packages. But a good set of tools might be part of what you need to investigate.

Of course, others can chime in suggesting that there are negatives to dplyr and other aspects of the tidyverse and they would be right too.


-----Original Message-----
From: R-help <[hidden email]> On Behalf Of Tom Woolman
Sent: Tuesday, November 17, 2020 6:30 PM
To: Bill Dunlap <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] counting duplicate items that occur in multiple groups

Hi Bill. Sorry to be so obtuse with the example data, I was trying (too hard) not to share any actual values so I just created randomized values for my example; of course I should have specified that the random values would not provide the expected problem pattern. I should have just used simple dummy codes as Bill Dunlap did.

So per Bill's example data for Data1, the expected (hoped for) output should be:

  Vendor Account Num_Vendors_Sharing_Bank_Acct
1     V1      A1      0
2     V2      A2      3
3     V3      A2      3
4     V4      A2      3


Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct.  
The value is 3 for V2, V3 and V4 because they each share bank account A2.


Likewise, in the Data2 frame, the same logic applies:

  Vendor Account Num_Vendors_Sharing_Bank_Acct
1     V1      A1     0
2     V2      A2     3
3     V3      A2     3
4     V1      A2     3
5     V4      A3     0
6     V2      A4     0






Thanks!


Quoting Bill Dunlap <[hidden email]>:

> What should the result be for
>   Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
> Account=c("A1","A2","A2","A2"))
> ?
>
> Must each vendor have only one account?  If not, what should the
> result be for
>    Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
> Account=c("A1","A2","A2","A2","A3","A4"))
> ?
>
> -Bill
>
> On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]>
> wrote:
>
>> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
>> plus a bank account number for each vendor. I'm trying to find a way
>> to count the number of duplicate bank accounts that occur in more
>> than one unique Vendor_ID, and then assign the count value for each
>> row in the dataframe in a new variable.
>>
>> I can do a count of bank accounts that occur within the same vendor
>> using dplyr and group_by and count, but I can't figure out a way to
>> count duplicates among multiple Vendor_IDs.
>>
>>
>> Dataframe example code:
>>
>>
>> #Create a sample data frame:
>>
>> set.seed(1)
>>
>> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
>> sample(1:10000))
>>
>>
>>
>>
>> Thanks in advance for any help.
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Bert Gunter-2
In reply to this post by Al Koholic
z <- with(Data2, tapply(Vendor,Account, I))
n <- vapply(z,length,1)
data.frame (Vendor = unlist(z),
   Account = rep(names(z),n),
   NumVen = rep(n,n)
)

## which gives:

   Vendor Account NumVen
A1      V1      A1      1
A21     V2      A2      3
A22     V3      A2      3
A23     V1      A2      3
A3      V4      A3      1
A4      V2      A4      1

Of course this also works for Data1

Bill may be able to come up with a slicker version, however.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 17, 2020 at 3:34 PM Tom Woolman <[hidden email]>
wrote:

> Yes, good catch. Thanks
>
>
> Quoting Bert Gunter <[hidden email]>:
>
> > Why 0's in the data frame? Shouldn't that be 1 (vendor with that
> account)?
> >
> > Bert
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Tue, Nov 17, 2020 at 3:29 PM Tom Woolman <[hidden email]>
> > wrote:
> >
> >> Hi Bill. Sorry to be so obtuse with the example data, I was trying
> >> (too hard) not to share any actual values so I just created randomized
> >> values for my example; of course I should have specified that the
> >> random values would not provide the expected problem pattern. I should
> >> have just used simple dummy codes as Bill Dunlap did.
> >>
> >> So per Bill's example data for Data1, the expected (hoped for) output
> >> should be:
> >>
> >>   Vendor Account Num_Vendors_Sharing_Bank_Acct
> >> 1     V1      A1      0
> >> 2     V2      A2      3
> >> 3     V3      A2      3
> >> 4     V4      A2      3
> >>
> >>
> >> Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct.
> >> The value is 3 for V2, V3 and V4 because they each share bank account
> >> A2.
> >>
> >>
> >> Likewise, in the Data2 frame, the same logic applies:
> >>
> >>   Vendor Account Num_Vendors_Sharing_Bank_Acct
> >> 1     V1      A1     0
> >> 2     V2      A2     3
> >> 3     V3      A2     3
> >> 4     V1      A2     3
> >> 5     V4      A3     0
> >> 6     V2      A4     0
> >>
> >>
> >>
> >>
> >>
> >>
> >> Thanks!
> >>
> >>
> >> Quoting Bill Dunlap <[hidden email]>:
> >>
> >> > What should the result be for
> >> >   Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
> >> > Account=c("A1","A2","A2","A2"))
> >> > ?
> >> >
> >> > Must each vendor have only one account?  If not, what should the
> result
> >> be
> >> > for
> >> >    Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
> >> > Account=c("A1","A2","A2","A2","A3","A4"))
> >> > ?
> >> >
> >> > -Bill
> >> >
> >> > On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]
> >
> >> > wrote:
> >> >
> >> >> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
> >> >> plus a bank account number for each vendor. I'm trying to find a way
> >> >> to count the number of duplicate bank accounts that occur in more
> than
> >> >> one unique Vendor_ID, and then assign the count value for each row in
> >> >> the dataframe in a new variable.
> >> >>
> >> >> I can do a count of bank accounts that occur within the same vendor
> >> >> using dplyr and group_by and count, but I can't figure out a way to
> >> >> count duplicates among multiple Vendor_IDs.
> >> >>
> >> >>
> >> >> Dataframe example code:
> >> >>
> >> >>
> >> >> #Create a sample data frame:
> >> >>
> >> >> set.seed(1)
> >> >>
> >> >> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
> >> >> sample(1:10000))
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Thanks in advance for any help.
> >> >>
> >> >> ______________________________________________
> >> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> >> PLEASE do read the posting guide
> >> >> http://www.R-project.org/posting-guide.html
> >> >> and provide commented, minimal, self-contained, reproducible code.
> >> >>
> >>
> >> ______________________________________________
> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
>
>
>
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: How to pass a character string with a hyphen

reichmaj
In reply to this post by Robert Knight
Robert

 

I ended up using the paste command, its not pretty but it works thank you

 

Jeff

 

From: Robert Knight <[hidden email]>
Sent: Tuesday, November 17, 2020 3:56 PM
To: [hidden email]
Cc: [hidden email]
Subject: Re: [R] How to pass a character string with a hyphen

 

Strip the left characters and strip the right characters into their own variables using one of the methods that can do that.  Then pass it using something like paste(left, "-", right).

 

On Tue, Nov 17, 2020, 2:43 PM Jeff Reichman <[hidden email] <mailto:[hidden email]> > wrote:

R-Help



How does one pass a character string containing a hyphen? I have a function
that accesses an api if I hard code the object, for example



key_key <- "xxxx-yyyy"



it works but when I pass the key  code to the function (say something like
key_code <- code_input)  it returns only xxxx. So R is seeing a string with
a negative operator I'm assuming



Jeff


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] <mailto:[hidden email]>  mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Deepayan Sarkar
In reply to this post by Bert Gunter-2
On Wed, Nov 18, 2020 at 5:40 AM Bert Gunter <[hidden email]> wrote:

>
> z <- with(Data2, tapply(Vendor,Account, I))
> n <- vapply(z,length,1)
> data.frame (Vendor = unlist(z),
>    Account = rep(names(z),n),
>    NumVen = rep(n,n)
> )
>
> ## which gives:
>
>    Vendor Account NumVen
> A1      V1      A1      1
> A21     V2      A2      3
> A22     V3      A2      3
> A23     V1      A2      3
> A3      V4      A3      1
> A4      V2      A4      1
>
> Of course this also works for Data1
>
> Bill may be able to come up with a slicker version, however.

Perhaps

transform(Data2, nshare = as.vector(table(Account)[Account]))

(or dplyr::mutate() instead of transform(), if you prefer.)

-Deepayan

>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Nov 17, 2020 at 3:34 PM Tom Woolman <[hidden email]>
> wrote:
>
> > Yes, good catch. Thanks
> >
> >
> > Quoting Bert Gunter <[hidden email]>:
> >
> > > Why 0's in the data frame? Shouldn't that be 1 (vendor with that
> > account)?
> > >
> > > Bert
> > > Bert Gunter
> > >
> > > "The trouble with having an open mind is that people keep coming along
> > and
> > > sticking things into it."
> > > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> > >
> > >
> > > On Tue, Nov 17, 2020 at 3:29 PM Tom Woolman <[hidden email]>
> > > wrote:
> > >
> > >> Hi Bill. Sorry to be so obtuse with the example data, I was trying
> > >> (too hard) not to share any actual values so I just created randomized
> > >> values for my example; of course I should have specified that the
> > >> random values would not provide the expected problem pattern. I should
> > >> have just used simple dummy codes as Bill Dunlap did.
> > >>
> > >> So per Bill's example data for Data1, the expected (hoped for) output
> > >> should be:
> > >>
> > >>   Vendor Account Num_Vendors_Sharing_Bank_Acct
> > >> 1     V1      A1      0
> > >> 2     V2      A2      3
> > >> 3     V3      A2      3
> > >> 4     V4      A2      3
> > >>
> > >>
> > >> Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct.
> > >> The value is 3 for V2, V3 and V4 because they each share bank account
> > >> A2.
> > >>
> > >>
> > >> Likewise, in the Data2 frame, the same logic applies:
> > >>
> > >>   Vendor Account Num_Vendors_Sharing_Bank_Acct
> > >> 1     V1      A1     0
> > >> 2     V2      A2     3
> > >> 3     V3      A2     3
> > >> 4     V1      A2     3
> > >> 5     V4      A3     0
> > >> 6     V2      A4     0
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> Thanks!
> > >>
> > >>
> > >> Quoting Bill Dunlap <[hidden email]>:
> > >>
> > >> > What should the result be for
> > >> >   Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
> > >> > Account=c("A1","A2","A2","A2"))
> > >> > ?
> > >> >
> > >> > Must each vendor have only one account?  If not, what should the
> > result
> > >> be
> > >> > for
> > >> >    Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
> > >> > Account=c("A1","A2","A2","A2","A3","A4"))
> > >> > ?
> > >> >
> > >> > -Bill
> > >> >
> > >> > On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <[hidden email]
> > >
> > >> > wrote:
> > >> >
> > >> >> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
> > >> >> plus a bank account number for each vendor. I'm trying to find a way
> > >> >> to count the number of duplicate bank accounts that occur in more
> > than
> > >> >> one unique Vendor_ID, and then assign the count value for each row in
> > >> >> the dataframe in a new variable.
> > >> >>
> > >> >> I can do a count of bank accounts that occur within the same vendor
> > >> >> using dplyr and group_by and count, but I can't figure out a way to
> > >> >> count duplicates among multiple Vendor_IDs.
> > >> >>
> > >> >>
> > >> >> Dataframe example code:
> > >> >>
> > >> >>
> > >> >> #Create a sample data frame:
> > >> >>
> > >> >> set.seed(1)
> > >> >>
> > >> >> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
> > >> >> sample(1:10000))
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Thanks in advance for any help.
> > >> >>
> > >> >> ______________________________________________
> > >> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > >> >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> >> PLEASE do read the posting guide
> > >> >> http://www.R-project.org/posting-guide.html
> > >> >> and provide commented, minimal, self-contained, reproducible code.
> > >> >>
> > >>
> > >> ______________________________________________
> > >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> >
> >
> >
> >
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Jim Lemon-4
In reply to this post by Al Koholic
Oops, I sent this to Tom earlier today and forgot to copy to the list:

VendorID=rep(paste0("V",1:10),each=5)
AcctID=paste0("A",sample(1:5,50,TRUE))
Data<-data.frame(VendorID,AcctID)
table(Data)
# get multiple vendors for each account
dupAcctID<-colSums(table(Data)>0)
Data$dupAcct<-NA
# fill in the new column
for(i in 1:length(dupAcctID))
 Data$dupAcct[Data$AcctID == names(dupAcctID[i])]<-dupAcctID[i]

Jim

On Wed, Nov 18, 2020 at 8:20 AM Tom Woolman <[hidden email]>
wrote:

> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
> plus a bank account number for each vendor. I'm trying to find a way
> to count the number of duplicate bank accounts that occur in more than
> one unique Vendor_ID, and then assign the count value for each row in
> the dataframe in a new variable.
>
> I can do a count of bank accounts that occur within the same vendor
> using dplyr and group_by and count, but I can't figure out a way to
> count duplicates among multiple Vendor_IDs.
>
>
> Dataframe example code:
>
>
> #Create a sample data frame:
>
> set.seed(1)
>
> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
> sample(1:10000))
>
>
>
>
> Thanks in advance for any help.
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: counting duplicate items that occur in multiple groups

Al Koholic
Thanks, everyone!



Quoting Jim Lemon <[hidden email]>:

> Oops, I sent this to Tom earlier today and forgot to copy to the list:
>
> VendorID=rep(paste0("V",1:10),each=5)
> AcctID=paste0("A",sample(1:5,50,TRUE))
> Data<-data.frame(VendorID,AcctID)
> table(Data)
> # get multiple vendors for each account
> dupAcctID<-colSums(table(Data)>0)
> Data$dupAcct<-NA
> # fill in the new column
> for(i in 1:length(dupAcctID))
>  Data$dupAcct[Data$AcctID == names(dupAcctID[i])]<-dupAcctID[i]
>
> Jim
>
> On Wed, Nov 18, 2020 at 8:20 AM Tom Woolman <[hidden email]>
> wrote:
>
>> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
>> plus a bank account number for each vendor. I'm trying to find a way
>> to count the number of duplicate bank accounts that occur in more than
>> one unique Vendor_ID, and then assign the count value for each row in
>> the dataframe in a new variable.
>>
>> I can do a count of bank accounts that occur within the same vendor
>> using dplyr and group_by and count, but I can't figure out a way to
>> count duplicates among multiple Vendor_IDs.
>>
>>
>> Dataframe example code:
>>
>>
>> #Create a sample data frame:
>>
>> set.seed(1)
>>
>> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
>> sample(1:10000))
>>
>>
>>
>>
>> Thanks in advance for any help.
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.