count occurrence and distance of characters in string

Immanuel-2
Hello all,

I want to know how often one character occurs in a given string
and the distance between each pair of consecutive occurrences (distance =
number of other characters between them).

thanks

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: count occurrence and distance of characters in string

Nordlund, Dan (DSHS/RDA)
> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces@r-
> project.org] On Behalf Of Immanuel
> Sent: Thursday, November 04, 2010 3:42 PM
> To: [hidden email]
> Subject: [R] count occurrence and distance of characters in string
>
> Hello all,
>
> I want to know how often one character occurs in a given string
> and the distance between each pair of consecutive occurrences (distance =
> number of other characters between them).
>
> thanks
>

Without a reproducible example, I can only guess.  But this should get you started.

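s <- 'abcdeabcxdeabcdeaxabcdeabcdeabcdxeabc'
# split into single characters and find the positions of 'x'
chr.pos <- which(unlist(strsplit(s,NULL)) == 'x')
# number of occurrences
chr.count <- length(chr.pos)
# number of other characters between consecutive occurrences
chr.dist <- diff(chr.pos)-1
chr.pos    # 9 18 33
chr.count  # 3
chr.dist   # 8 14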

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


Re: count occurrence and distance of characters in string

cberry
In reply to this post by Immanuel-2
On Thu, 4 Nov 2010, Immanuel wrote:

> Hello all,
>
> I want to know how often one character occurs in a given string
> and the distance between each pair of consecutive occurrences (distance =
> number of other characters between them).

You should provide "commented, minimal, self-contained, reproducible code"
as asked.

This matters especially for a question like this one, which has many simple
answers that RespondeRs will shower you with if only you give them a starting point.

Use tapply, strsplit, seq, nchar, unlist, diff, "-", and table for one
way.
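A rough sketch of how those functions might fit together (not the poster's own code). It assumes the string `s` and the target character 'x' from the earlier reply in this thread; the names `chars`, `counts`, and `gaps` are made up for illustration.

```r
# Example string taken from the earlier reply; any string would do.
s <- 'abcdeabcxdeabcdeaxabcdeabcdeabcdxeabc'

# One element per character.
chars <- unlist(strsplit(s, ""))

# table() counts the occurrences of every distinct character.
counts <- table(chars)

# tapply() groups the positions 1..nchar(s) by character; diff() - 1 then
# gives the number of other characters between consecutive occurrences.
gaps <- tapply(seq(nchar(s)), chars, function(pos) diff(pos) - 1)

counts[["x"]]   # 3
gaps[["x"]]     # 8 14
```

Because each group yields a vector rather than a single value, tapply returns a list-like array here, so `[[` is used to pull out one character's result.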

Chuck

Charles C. Berry                            Dept of Family/Preventive Medicine
[hidden email]    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

Re: count occurrence and distance of characters in string

Immanuel-2
Hey,

Thanks for the answer. I had actually already typed an example
but deleted it because I thought it was superfluous.
Regards

---------
string <- "kjokllokkoadddo"

# f1(string, "o") should return that "o" was found 4 times
# f2(string, "o") should return that the distances between the "o"s
# found are 3, 2, 4
---------



Re: count occurrence and distance of characters in string

cberry
On Fri, 5 Nov 2010, Immanuel wrote:

> Hey,
>
> Thanks for the answer. I had actually already typed an example
> but deleted it because I thought it was superfluous.
> regards
>
> ---------
> string <- "kjokllokkoadddo"
>
> # f1(string, "o") should return that "o" was found 4 times
> # f2(string, "o") should return that the distances between the "o"s
> # found are 3, 2, 4
> ---------

In that case, I'd use split:

> res <- split(seq(nchar(string)),unlist(strsplit(string,'')))
> length(res[['o']])
[1] 4
> ## or
> sapply(res,length)
a d j k l o
1 3 1 4 2 4
> diff(res[['o']])-1
[1] 3 2 4
> # or
> sapply(sapply(res,diff),"-",1)
$a
numeric(0)

$d
[1] 0 0

$j
numeric(0)

$k
[1] 2 3 0

$l
[1] 0

$o
[1] 3 2 4

>
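A short sketch of what the split() call builds, for readers who want to see the intermediate object. It reuses the string from the post; `lengths()` is assumed to be available (R 3.2.0 or later), and the names `counts` and `gaps` are made up here.

```r
string <- "kjokllokkoadddo"

# split() groups the positions 1..nchar(string) by the character found there,
# giving a named list with one integer vector of positions per character.
res <- split(seq_len(nchar(string)), strsplit(string, "")[[1]])
res[["o"]]                 # 3 7 10 15

# Counts per character; equivalent to sapply(res, length).
counts <- lengths(res)

# lapply() always returns a list, whereas sapply() may simplify the result
# to a vector or matrix when every character happens to give the same
# number of gaps.
gaps <- lapply(res, function(pos) diff(pos) - 1)
gaps[["o"]]                # 3 2 4
```

Using lapply for the gaps keeps the return shape predictable across inputs, which can matter if the result is processed further.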
Chuck



Charles C. Berry                            Dept of Family/Preventive Medicine
[hidden email]    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

Re: count occurrence and distance of characters in string

David Winsemius

On Nov 4, 2010, at 8:06 PM, Charles C. Berry wrote:

> On Fri, 5 Nov 2010, Immanuel wrote:
>
>> Hey,
>>
>> Thanks for the answer. I had actually already typed an example
>> but deleted it because I thought it was superfluous.
>> regards
>>
>> ---------
>> string <- "kjokllokkoadddo"
>>
>> # f1(string, "o") should return that "o" was found 4 times

Other ways:

sum(unlist(strsplit(string, "")) == "o")
[1] 4

>> # f2(string, "o") should return that the distances between the "o"s
>> # found are 3, 2, 4
>> ---------

> diff(grep("o", strsplit(string, "")[[1]])) - 1
[1] 3 2 4
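The two one-liners above could be wrapped into the f1 and f2 the post asks for. This is only a sketch: the function names come from the post, but the bodies are one possible choice, and they compare with `==` rather than grep() so that characters with a special meaning in regular expressions (such as ".") are matched literally.

```r
# Number of times the single character `ch` occurs in `string`.
f1 <- function(string, ch) {
  sum(strsplit(string, "")[[1]] == ch)
}

# Number of other characters between consecutive occurrences of `ch`.
f2 <- function(string, ch) {
  diff(which(strsplit(string, "")[[1]] == ch)) - 1
}

string <- "kjokllokkoadddo"
f1(string, "o")    # 4
f2(string, "o")    # 3 2 4
f1("a.b.c", ".")   # 2; grep(".", ...) would match every character instead
```

With fewer than two occurrences, f2 returns a zero-length vector, since there is no pair to measure.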



David Winsemius, MD
West Hartford, CT

Re: count occurrence and distance of characters in string

William Dunlap
In reply to this post by Immanuel-2
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Immanuel
> Sent: Thursday, November 04, 2010 4:54 PM
> To: Charles C. Berry
> Cc: [hidden email]
> Subject: Re: [R] count occurrence and distance of characters in string
>
> Hey,
>
> Thanks for the answer. I had actually already typed an example
> but deleted it because I thought it was superfluous.
> regards
>
> ---------
> string <- "kjokllokkoadddo"
>
> # f1(string, "o") should return that "o" was found 4 times
> # f2(string, "o") should return that the distances between the "o"s
> # found are 3, 2, 4
> ---------

Try gregexpr():
  > string <- "kjokllokkoadddo"
  > gregexpr("o", string)
  [[1]]
  [1]  3  7 10 15
  attr(,"match.length")
  [1] 1 1 1 1

  > gregexpr("o", c("kjokllokkoadddo", "ooofoo", "abcde"))
  [[1]]
  [1]  3  7 10 15
  attr(,"match.length")
  [1] 1 1 1 1

  [[2]]
  [1] 1 2 3 5 6
  attr(,"match.length")
  [1] 1 1 1 1 1

  [[3]]
  [1] -1
  attr(,"match.length")
  [1] -1

Postprocess its output with length and diff to get what
you want.  E.g.,

  > g <- gregexpr("o", c("kjokllokkoadddo", "ooofoo", "abcde"))
  > sapply(g,length)
  [1] 4 5 1
  > lapply(g,function(x)diff(x)-1)
  [[1]]
  [1] 3 2 4

  [[2]]
  [1] 0 0 1 0

  [[3]]
  numeric(0)
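
One detail in the output above deserves care: for a string with no match, gregexpr() returns the single value -1, so length() reports 1 rather than 0 for "abcde". A possible way to handle that is sketched below; the names `strings`, `pos`, `counts`, and `gaps` are made up for illustration, and `fixed = TRUE` is added on the assumption that the character should be matched literally.

```r
strings <- c("kjokllokkoadddo", "ooofoo", "abcde")

# fixed = TRUE matches the pattern literally instead of as a regular expression.
g <- gregexpr("o", strings, fixed = TRUE)

# Keep only real match positions; the -1 "no match" marker is dropped, so a
# string without any match ends up with zero positions.
pos <- lapply(g, function(x) as.integer(x[x > 0]))

counts <- sapply(pos, length)
gaps <- lapply(pos, function(p) diff(p) - 1)

counts      # 4 5 0
gaps[[2]]   # 0 0 1 0
```

The gaps need no special handling, because diff() of a single value (or of nothing) is already empty.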

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

