concatenating range of columns in dataframe

concatenating range of columns in dataframe

Evan Cooch
Suppose I have the following data frame (call it df):

Trt   y1   y2   y3   y4
A1A   1    0    0    1
A1B   1    1    0    0
A1C   0    1    0    1
A1D   1    1    1    1
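
To reproduce the example, the table above could be rebuilt roughly as follows. This is only a sketch: the post does not state the column types, so numeric 0/1 values and a character Trt column are assumptions.

```r
# Rebuild the example data frame from the table above.
# Assumption: y1..y4 are numeric 0/1 and Trt is a character column.
df <- data.frame(
  Trt = c("A1A", "A1B", "A1C", "A1D"),
  y1  = c(1, 1, 0, 1),
  y2  = c(0, 1, 1, 1),
  y3  = c(0, 0, 0, 1),
  y4  = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Inspect the structure: 4 rows, 5 columns (Trt plus y1..y4).
str(df)
```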

What I want to do is concatenate columns y1 through y4 into a contiguous
string (which I'll call df$Conc), so that the final df looks like

Trt   Conc
A1A   1001
A1B   1100
A1C   0101
A1D   1111


Now, if my initial dataframe was simply

1   0   0   1
1   1   0   0
0   1   0   1
1   1   1   1

then apply(df,1,paste,collapse="") does the trick, more or less.

But once I have a Trt column, this approach yields

A1A1001
A1B1100
A1C0101
A1D1111

I need to keep Trt separate from the other columns. So I'm
trying to concatenate a subset of columns in the data frame, but I don't
want to have to create a character vector of the
column names to do it (e.g., c("y1","y2","y3","y4")). Doing a few by hand
that way is easy, but not if you have dozens to hundreds of columns to
work with.

Ideally, I'd like to be able to say

"concatenate df[,2:5], get rid of the spaces, pipe the concatenated
columns to a new named column, and drop the original columns from the
final df."

Heuristically,

df$Conc <- concatenate df[,2:5] # making a new, 6th column in df
df[,2:5] <- NULL   # to drop original columns 2 -> 5

Suggestions/pointers to the obvious appreciated.

Thanks in advance!

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: concatenating range of columns in dataframe

Jim Lemon-4
Hi Evan,
How about this:

df2<-data.frame(Trt=df[,1],Conc=apply(df[,2:5],1,paste,sep="",collapse=""))

Jim
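
A minimal sketch of this suggestion on the example data, assuming numeric 0/1 columns and a character Trt column (the thread does not state the types). Note that sep="" has no effect here: paste receives one vector per row, so only collapse="" does the joining.

```r
# Example data (assumed types: character Trt, numeric y1..y4).
df <- data.frame(
  Trt = c("A1A", "A1B", "A1C", "A1D"),
  y1  = c(1, 1, 0, 1),
  y2  = c(0, 1, 1, 1),
  y3  = c(0, 0, 0, 1),
  y4  = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Build a new data frame: keep Trt, and paste columns 2..5 together row by row.
df2 <- data.frame(
  Trt  = df[, 1],
  Conc = apply(df[, 2:5], 1, paste, collapse = ""),
  stringsAsFactors = FALSE
)

print(df2)
```

Building a separate df2 leaves the original df untouched, which is convenient if the individual y columns are still needed later.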

Re: concatenating range of columns in dataframe

Bert Gunter-2
In reply to this post by Evan Cooch
I think you need to spend some time with an R tutorial or two,
especially with regard to indexing.

Unless I have misunderstood (apologies if I have),

df$Conc <- apply(df[,-1],1,paste,collapse="")

does it.

-- Bert
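
A hedged sketch of the indexing idea for the case of many columns, assuming the columns to join share a name pattern such as "y" followed by digits (an assumption made for illustration only). Negative indexing, a position range, or grep on names(df) all avoid typing the column names by hand; do.call(paste0, ...) is a vectorised alternative to apply.

```r
# Example data (assumed types: character Trt, numeric y1..y4).
df <- data.frame(
  Trt = c("A1A", "A1B", "A1C", "A1D"),
  y1  = c(1, 1, 0, 1),
  y2  = c(0, 1, 1, 1),
  y3  = c(0, 0, 0, 1),
  y4  = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Row-wise paste over every column except the first (negative indexing).
conc_apply <- apply(df[, -1], 1, paste, collapse = "")

# Alternative: find the columns by name pattern, then paste them column-wise.
ycols <- grep("^y[0-9]+$", names(df))   # positions of y1..y4
conc_vec <- do.call(paste0, df[ycols])

# Both routes should give the same strings.
stopifnot(identical(unname(conc_apply), conc_vec))

# Add the new column, then drop the originals.
df$Conc <- conc_vec
df[ycols] <- NULL

print(df)
```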


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: concatenating range of columns in dataframe

Ulrik Stervbo-2
In reply to this post by Jim Lemon-4
Hi Evan,

The unite() function in the tidyr package achieves the same result as Jim's
suggestion, but in perhaps a slightly more readable manner.

Ulrik
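
A minimal sketch of what a unite() call might look like for the example data, assuming tidyr is installed and that the columns are numeric 0/1 with a character Trt column (neither is stated in the thread). By default unite() removes the input columns, which matches the request to drop the originals.

```r
library(tidyr)

# Example data (assumed types: character Trt, numeric y1..y4).
df <- data.frame(
  Trt = c("A1A", "A1B", "A1C", "A1D"),
  y1  = c(1, 1, 0, 1),
  y2  = c(0, 1, 1, 1),
  y3  = c(0, 0, 0, 1),
  y4  = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Join y1 through y4 into a single character column named Conc.
# sep = "" overrides the default "_" separator; y1:y4 selects the column range.
df2 <- unite(df, "Conc", y1:y4, sep = "")

print(df2)
```

The y1:y4 range selection means no column names beyond the first and last have to be typed, however many columns lie in between.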

Re: concatenating range of columns in dataframe

Evan Cooch
In reply to this post by Jim Lemon-4


On 3/10/2017 1:48 AM, Jim Lemon wrote:
> Hi Evan,
> How about this:
>
> df2<-data.frame(Trt=df[,1],Conc=apply(df[,2:5],1,paste,sep="",collapse=""))
>
> Jim
>
>


Thanks!

Re: concatenating range of columns in dataframe

Evan Cooch
In reply to this post by Bert Gunter-2


On 3/10/2017 2:23 AM, Bert Gunter wrote:

> I think you need to spend some time with an R tutorial or two,
> especially with regard to indexing.
>
> Unless I have misunderstood (apologies if I have),
>
> df$Conc <- apply(df[,-1],1,paste,collapse="")
>
> does it.
>
> -- Bert
>
>

Thanks -- sage advice.

Re: concatenating range of columns in dataframe

Evan Cooch
In reply to this post by Ulrik Stervbo-2


On 3/10/2017 2:24 AM, Ulrik Stervbo wrote:
> Hi Evan,
>
> the unite function of the tidyr package achieves the same as Jim
> suggested, but in perhaps a slightly more readable manner.
>
> Ulrik
>

I use Perl for scripting, so readability isn't a big factor. ;-)

Thanks!
