extracting submatrix from a bigger one

matteo-2
Hi guys,
I'm a newbie, so sorry for the easy question.

I have a matrix (459x28) in which a large number of observations are
repeated (same places sampled at different times).
One of the columns refers to the ID of the sampling place.
What I would like is to extract a sub-matrix for every sampling point.

I can do it manually, e.g. x1 <- data.frame(dataset[dataset$ID=="x1",])
but is it possible to write a script and let R do it,
so that I get n sub-matrices for the n IDs found in the original column?

Cheers

Matteo

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: extracting submatrix from a bigger one

Rui Barradas
Hello,

Try the following.


result <- lapply(unique(dataset$ID), function(uid) dataset[dataset$ID == uid, ])
names(result) <- unique(dataset$ID)
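
For anyone who wants to try this, here is a minimal sketch on made-up data. The data frame `dataset` and its values are invented for illustration (they mirror the small example given later in the thread) and are not the poster's real 459x28 data; the helper variable `ids` is also an addition.

```r
# Hypothetical sample data; the real data set is not shown in the thread.
dataset <- data.frame(
  ID     = c("x1", "x1", "x1", "x2", "x3", "x3"),
  value1 = c(10, 12, 11, 15, 11, 13),
  value2 = c(12, 22, 9, 10, 11, 8),
  stringsAsFactors = FALSE
)

ids <- unique(dataset$ID)

# One data frame per distinct ID, collected in a list.
result <- lapply(ids, function(uid) dataset[dataset$ID == uid, ])

# Label each list element with its ID so it can be looked up by name.
names(result) <- ids

result[["x2"]]   # the rows whose ID is "x2"
```

Computing `unique()` once and reusing it keeps the list order and the names in step.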


Hope this helps,

Rui Barradas


Re: extracting submatrix from a bigger one

matteo-2
Hi,

> result <- lapply(unique(dataset$ID), function(uid) dataset[dataset$ID == uid, ])
Ok, I have the element result as a list

> names(result) <- unique(dataset$ID)
Nothing happens. I don't have any submatrix...

Matteo


Re: extracting submatrix from a bigger one

Bert Gunter
In reply to this post by Rui Barradas
First of all, is your data structure a matrix or a data frame? They
are different!

Assuming the latter, a shorter version of Rui's answer that avoids
unique() and automatically takes care of names is:

result <- by(dataset, dataset$ID, I)

See ?by, ?tapply, and ?split
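
The matrix/data frame distinction matters in practice because `$` does not work on a matrix. Below is a rough sketch of how one might check the type and split a true matrix by one of its columns. The object `m` and its columns are hypothetical, and the row-index approach is just one possible way to do it.

```r
# Hypothetical numeric matrix with an ID column (a matrix holds one type only).
m <- cbind(ID = c(1, 1, 2, 3, 3), value1 = c(10, 12, 15, 11, 13))

is.matrix(m)       # TRUE
is.data.frame(m)   # FALSE

# m$ID would fail on a matrix; index the column by name instead.
# Split the row numbers by ID, then pull out the matching rows.
rows_by_id <- split(seq_len(nrow(m)), m[, "ID"])
sub_matrices <- lapply(rows_by_id, function(i) m[i, , drop = FALSE])

sub_matrices[["2"]]   # a one-row matrix, kept two-dimensional by drop = FALSE
```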

-- Bert


--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: extracting submatrix from a bigger one

Rui Barradas
In reply to this post by matteo-2
Hello,

You don't have a sub-data.frame; what you have is a list, with each
element of that list a data frame. Try looking at, for instance,
result[[1]]. This should be the data.frame corresponding to the first ID.
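
A small sketch of the difference between the list and its elements, using an invented three-row `dataset` as a stand-in for the real one:

```r
# Hypothetical stand-in for the list built earlier in the thread.
dataset <- data.frame(ID = c("x1", "x1", "x2"), value1 = c(10, 12, 15))
result <- lapply(unique(dataset$ID), function(uid) dataset[dataset$ID == uid, ])

class(result)        # "list"
class(result[[1]])   # "data.frame": double brackets extract the element itself
class(result[1])     # "list": single brackets return a sub-list of length 1
nrow(result[[1]])    # number of rows for the first ID
```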

Rui Barradas


Re: extracting submatrix from a bigger one

David Winsemius
In reply to this post by matteo-2
Sorry for the blank message. The default behavior of the Mac Mail.app spell checker has me confused.

On Jun 24, 2013, at 10:03 AM, matteo wrote:

> Hi,
>
>> result <- lapply(unique(dataset$ID), function(uid) dataset[dataset$ID == uid, ])
> Ok, I have the element result as a list
>
>> names(result) <- unique(dataset$ID)
> Nothing happens. I don't have any submatrix...

What were you expecting to happen? You just assigned names to the result. You should be able to execute:

names(result)

Or:

str(result)
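
To illustrate why "nothing happens": assigning names is silent, and the effect only shows up when the list is inspected afterwards. A sketch with a hypothetical two-element list:

```r
# Hypothetical list of two small data frames.
result <- list(
  data.frame(ID = "x1", value1 = c(10, 12)),
  data.frame(ID = "x2", value1 = 15)
)

names(result)                    # NULL: no names yet
names(result) <- c("x1", "x2")   # an assignment; prints nothing at the console
names(result)                    # "x1" "x2"
str(result, max.level = 1)       # compact overview of the list's structure
```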

--

David Winsemius
Alameda, CA, USA


Re: extracting submatrix from a bigger one

Rui Barradas
In reply to this post by Bert Gunter
Hello,

I had forgotten the much simpler solution. The following should do it.

split(dataset, dataset$ID)
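
A quick sketch of what that returns, assuming a data frame shaped like the small example given elsewhere in the thread (the values and the name `pieces` are illustrative only):

```r
# Hypothetical sample data mirroring the example in the thread.
dataset <- data.frame(
  ID     = c("x1", "x1", "x1", "x2", "x3", "x3"),
  value1 = c(10, 12, 11, 15, 11, 13),
  value2 = c(12, 22, 9, 10, 11, 8)
)

# split() groups the rows by ID and names the pieces automatically.
pieces <- split(dataset, dataset$ID)

names(pieces)          # "x1" "x2" "x3"
sapply(pieces, nrow)   # rows per ID
pieces$x3              # all columns, only the x3 rows
```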


Rui Barradas


Re: extracting submatrix from a bigger one

Bert Gunter
Oh yes, that's even better!

-- Bert

Re: extracting submatrix from a bigger one

matteo-2
First of all, thanks for all the replies!!
What you have written helps, but is not entirely the answer to my problem.

What I'd like is to create new data.frames, each one named with
the ID from the original data frame and containing all the columns.

For example, in the original data frame one column (ID) has 3 different
elements:

ID    value1    value2
x1        10            12
x1        12            22
x1        11            9
x2        15            10
x3        11            11
x3        13            8

I need a command able to split the data frame into smaller,
separate data frames, so that they look like

x1 is
ID    value1    value2
x1        10            12
x1        12            22
x1        11            9

x2 is
ID    value1    value2
x2        15            10

and x3 is
ID    value1    value2
x3        11            11
x3        13            8


Sorry if I can't explain it better; as I said, I'm very new to R.

Thanks

Matteo


Re: extracting submatrix from a bigger one

Bert Gunter
Inline below...

On Mon, Jun 24, 2013 at 11:31 AM, matteo <[hidden email]> wrote:
> First of all, thanks for all the replies!!
> What you have written helps, but is not entirely the answer to my problem.
>
> What I'd have is the creation of new data.frames each of one named with the
> ID of the original dataframe and with all the columns.

No you don't! You want what you were provided, a list of data frames.
Anything you want to do can be done, probably more conveniently, with
that.

Read "An Introduction to R" and learn to work with lists. Being a
newbie is no excuse for not making an effort to learn.

-- Bert




Re: extracting submatrix from a bigger one

William Dunlap
In reply to this post by matteo-2
> First of all, thanks for all the replies!!
> What you have written helps, but is not entirely the answer to my problem.
>
> What I'd have is the creation of new data.frames each of one named with
> the ID of the original dataframe and with all the columns.

What was suggested gave you a list of data.frames, each named with the ID.
You can use the syntax list$name or list[["name"]] to refer to a data.frame.
  R> splitData <- split(allData, allData$ID)
  R> splitData$x1
    ID value1 value2
  1 x1     10     12
  2 x1     12     22
  3 x1     11      9
  R> splitData$x2
    ID value1 value2
  4 x2     15     10

You seem to want a function that creates a bunch of data.frames in the
current environment instead of one that creates them in a list created to
hold them.  This is not necessary and actually gets in the way most of the
time.

If you want to refer to 'x1' instead of 'splitData$x1' you can use 'with', as in
  R> with(splitData, mean(x1$value2) - mean(x2$value2))
  [1] 4.333333
instead of the slightly wordier
  R> mean(splitData$x1$value2) - mean(splitData$x2$value2)
  [1] 4.333333

If you want to process each sub-data.frame (these are data.frames, not matrices)
you can use lapply() or sapply() or vapply() on the list:
  R> dm <- sapply(splitData, function(x)mean(x$value2) - mean(x$value1))
  R> dm
         x1        x2        x3
   3.333333 -5.000000 -2.500000
  R> dm["x2"]
  x2
  -5
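
Of the three, vapply() is the strictest, since it checks that each result has the declared type and length. A small sketch, with `allData` rebuilt from the example values posted in this thread; the name `dm2` is made up here.

```r
allData <- data.frame(
  ID     = c("x1", "x1", "x1", "x2", "x3", "x3"),
  value1 = c(10, 12, 11, 15, 11, 13),
  value2 = c(12, 22, 9, 10, 11, 8)
)
splitData <- split(allData, allData$ID)

# FUN.VALUE = numeric(1) declares that each piece must yield one number.
dm2 <- vapply(splitData,
              function(x) mean(x$value2) - mean(x$value1),
              FUN.VALUE = numeric(1))
dm2[["x2"]]   # -5
```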

If you put all those names into the current environment you stand the chance
of clobbering some other dataset whose name matched one of the entries in
allData$ID.  Also you would have to use some rather ugly code involving get()
and assign() to manipulate the objects.  Learn to love lists.
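
If separate objects really are wanted, one possible compromise is to copy the list elements into a dedicated environment rather than the workspace, which limits the clobbering risk described above. This sketch uses base R's list2env(); the environment name `pieces_env` and the data are invented for illustration, and the list remains the simpler tool.

```r
allData <- data.frame(
  ID     = c("x1", "x1", "x2"),
  value1 = c(10, 12, 15)
)
splitData <- split(allData, allData$ID)

# Copy each list element into its own variable inside a fresh environment,
# leaving the global workspace untouched.
pieces_env <- list2env(splitData, envir = new.env())

sort(ls(pieces_env))   # "x1" "x2"
nrow(pieces_env$x1)    # 2
```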

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

