vectorised recovery of strsplit value ??

maddox
Dear Gurus,

My first steps with R have ground to a halt! I have a vector of sample identifiers

> sampleIDs
 [1] "D1_1"   "D1_2"   "D1_3"   "D1_4"   "D1_5"   "D1_6"   "D1_7"   "D1_8"  
 [9] "D1_9"   "D1_10"  "D1_11"  "D1_12"  "F1_13"  "F1_14"  "F1_15"  "F1_16"
[17] "F1_17"  "F1_18"  "F1_19"  "F1_20"  "F1_21"  "F1_22"  "F1_23"  "F1_24"
[25] "DDC_25" "DDC_26" "DDC_27" "DDC_28" "DDC_29" "DDC_30" "DDC_31" "DDC_32"
[33] "DDC_33" "DDC_34" "DDC_35" "DDC_36"

from which I've split the prefix identifier using strsplit

> splitIDs <- strsplit( as.character(sampleIDs), "_")
> splitIDs
[[1]]
[1] "D1" "1"

[[2]]
[1] "D1" "2"

[[3]]
[1] "D1" "3"

[[4]]
[1] "D1" "4"  etc

I am now struggling to work with the prefix identifiers (D1, F1, DDC). The only way I have found to access them is splitIDs[[i]][1], so it seems I have to use a loop to get the identifiers into a factor and count them.

Is there a vectorised solution someone can suggest?
Or an alternative strategy .. these are early days using R for me!
Thanks


regards

M



Re: vectorised recovery of strsplit value ?? Resolved

maddox

Re: vectorised recovery of strsplit value ??

plangfelder
In reply to this post by maddox
There are several ways to get a matrix, for example

mat = as.matrix(as.data.frame(splitIDs))

or

mat = sapply(splitIDs, I)

True experts may suggest even more ways.

Peter
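
A minimal sketch of how the matrix form might then be used, assuming a small made-up subset of the identifiers from the original post. Each list element becomes one column, so the prefixes end up in the first row. This only yields a matrix when every identifier splits into the same number of pieces; otherwise sapply returns a list.

```r
# Illustrative subset of the sample identifiers (not the full vector)
sampleIDs <- c("D1_1", "D1_2", "F1_13", "DDC_25")
splitIDs <- strsplit(sampleIDs, "_")

# Each list element becomes one column: a 2 x 4 character matrix
mat <- sapply(splitIDs, I)

prefixes <- mat[1, ]  # first row: the part before "_"
suffixes <- mat[2, ]  # second row: the part after "_"
```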

On Wed, Dec 22, 2010 at 1:02 PM, maddox <[hidden email]> wrote:

> Is there a vectorised solution someone can suggest?
> Or an alternative strategy .. these are early days using R for me!

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: vectorised recovery of strsplit value ??

Jorge I Velez
In reply to this post by maddox
Try

sapply(strsplit(sampleIDs, "_"), "[", 1)

HTH,
Jorge
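
A short sketch of why this works and how it might feed the factor and counts the original post asks for. "[" is R's subsetting function, so sapply ends up calling x[1] on each list element. The shortened sampleIDs vector and the names prefixes2, prefix.factor and counts are assumptions made for illustration.

```r
# Illustrative subset of the sample identifiers (not the full vector)
sampleIDs <- c("D1_1", "D1_2", "F1_13", "DDC_25", "DDC_26", "DDC_27")

# "[" is the subsetting function, so this evaluates x[1] for every element
prefixes <- sapply(strsplit(sampleIDs, "_"), "[", 1)

# vapply is a stricter variant: it errors unless each result is one string
prefixes2 <- vapply(strsplit(sampleIDs, "_"), `[`, character(1), 1)

# Turn the prefixes into a factor and count the samples in each group
prefix.factor <- factor(prefixes)
counts <- table(prefix.factor)
```

Both calls loop internally over the list, so no explicit for loop is needed.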


On Wed, Dec 22, 2010 at 4:02 PM, maddox <[hidden email]> wrote:

> Is there a vectorised solution someone can suggest?
> Or an alternative strategy .. these are early days using R for me!

Re: vectorised recovery of strsplit value ??

maddox
Thanks, Jorge, for your reply. In the end I changed my approach and used a sub() strategy I found on this forum to recover the prefixes, as below.

IDs.prefix <- sub("([^*])(_.*)", "\\1" , sampleIDs )
IDs.split <- cbind(sampleIDs , IDs.prefix)

Regards

M
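
A brief sketch of what the pattern above does, with a possibly simpler alternative. In "([^*])(_.*)", the class [^*] matches any single character other than a literal "*", so the match is the one character before the first "_" plus everything after it, and only that one character is kept. The shortened sampleIDs vector and the name IDs.prefix2 are assumptions for illustration, and the sketch assumes every identifier contains an underscore.

```r
# Illustrative subset of the sample identifiers (not the full vector)
sampleIDs <- c("D1_1", "F1_13", "DDC_25")

# Pattern from the post: keep the character before "_", drop the rest
IDs.prefix <- sub("([^*])(_.*)", "\\1", sampleIDs)

# Assumed simpler alternative: delete the first "_" and all that follows
IDs.prefix2 <- sub("_.*", "", sampleIDs)

# Two-column character matrix pairing each ID with its prefix
IDs.split <- cbind(sampleIDs, IDs.prefix)
```

Because sub() is vectorised over its input, both versions handle the whole vector in one call.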