Separating columns, and sorting by rows

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Separating columns, and sorting by rows

RagingJim
Dear anyone who knows more about R than me (so everyone). I have been bashing my head on the keyboard all day trying to do something with my table.

I have some data, like so:
     yyyy-mm     Rainfall(mm)
1   1977-02        17.4
2   1977-03        34.0
3   1977-04        26.2
4   1977-05        42.6
5   1977-06        58.6
6   1977-07        23.2
7   1977-08        26.8
8   1977-09        48.4
9   1977-10        47.0
10  1977-11        37.2
11  1977-12        15.0
12  1978-01         2.6
13  1978-02         6.8
14  1978-03         9.0
15  1978-04        46.6

The data continues on for x number of hundreds of data points. Simply put, I need to make that data.frame into this data.frame/table/matrix/grid/... you get the picture.

         Jan  Feb  Mar ... etc
Year   Rain Rain Rain
Year   Rain Rain Rain
Year   Rain Rain Rain
Year   etc...
Year
Year

How on earth do I do it? I have made little to no progress on it all day. Also, just like all the other problems, the year and month will change every time depending upon which csv file or sql query I load into the program. If anyone has any pointers, that would be awesome.

Cheers lads.
Reply | Threaded
Open this post in threaded view
|

Re: Separating columns, and sorting by rows

milton ruser
Hi Raging Jim

may be this is a starting point.

myDF<-read.table(stdin(),head=T,sep=",")
yyyymm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6

myDF$yyyy<-substr(myDF$yyyymm,1,4)
myDF$mm<-substr(myDF$yyyymm,6,7)
myDF<-subset(myDF, select=c(yyyy,mm,Rainfall))
myDF.reshape<-reshape(myDF,v.names="Rainfall",idvar="yyyy",
 timevar="mm",direction="wide")
myDF.reshape
best regards

milton

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Separating columns, and sorting by rows

RagingJim
In reply to this post by RagingJim
the other alternative would be to edit my sql query so that that data is brought in from the database and put in to the correct format initially.

"sqlQuery(conn, "select lsd,ttl_mo_prcp from mo_rains where stn_num=23090")"

That is my very basic query. I have also been given this for use in orcale (I believe):

        my $query = "select to_char(stn_num,'00009') as stn, to_char(lsd,'yyyy') as yr,
                                                nvl(max(decode(to_char(lsd,'MON'),'JAN',ttl_mo_prcp)),-9999) jan,
                                                nvl(max(decode(to_char(lsd,'MON'),'FEB',ttl_mo_prcp)),-9999) feb,
                                                nvl(max(decode(to_char(lsd,'MON'),'MAR',ttl_mo_prcp)),-9999) mar,
                                                nvl(max(decode(to_char(lsd,'MON'),'APR',ttl_mo_prcp)),-9999) apr,
                                                nvl(max(decode(to_char(lsd,'MON'),'MAY',ttl_mo_prcp)),-9999) may,
                                                nvl(max(decode(to_char(lsd,'MON'),'JUN',ttl_mo_prcp)),-9999) jun,
                                                nvl(max(decode(to_char(lsd,'MON'),'JUL',ttl_mo_prcp)),-9999) jul,
                                                nvl(max(decode(to_char(lsd,'MON'),'AUG',ttl_mo_prcp)),-9999) aug,
                                                nvl(max(decode(to_char(lsd,'MON'),'SEP',ttl_mo_prcp)),-9999) sep,
                                                nvl(max(decode(to_char(lsd,'MON'),'OCT',ttl_mo_prcp)),-9999) oct,
                                                nvl(max(decode(to_char(lsd,'MON'),'NOV',ttl_mo_prcp)),-9999) nov,
                                                nvl(max(decode(to_char(lsd,'MON'),'DEC',ttl_mo_prcp)),-9999) dec
                 from mo_rains
                 where (stn_num in ($stns))
                                 group by stn_num, to_char(lsd,'yyyy')
                                 order by to_char(lsd,'yyyy') desc;";

But I think that sorts by station number, as it is designed for multiple stations at a time, whereas mine is for one station only. Yet if I plug it into R just to see what happens, I get a plethora of extraordinarily long errors which I can post if needed.

Reply | Threaded
Open this post in threaded view
|

Re: Separating columns, and sorting by rows

David Winsemius
In reply to this post by milton ruser

On Feb 15, 2010, at 1:22 AM, milton ruser wrote:

> Hi Raging Jim
>
> may be this is a starting point.
>
> myDF<-read.table(stdin(),head=T,sep=",")

Those "yyyymm" entries will become factors, which can lead to  
confusion for newbies. Might be more straightforward to always use  
stringsAsFactors=FALSE in the read.table arguments.  I see that the  
yyymm column later gets thrown away so it may not matter here.

> yyyymm,Rainfall
> 1977-02,17.4
> 1977-03,34.0
> 1977-04,26.2
> 1977-05,42.6
> 1977-06,58.6
> 1977-07,23.2
> 1977-08,26.8
> 1977-09,48.4
> 1977-10,47.0
> 1977-11,37.2
> 1977-12,15.0
> 1978-01,2.6
> 1978-02,6.8
> 1978-03,9.0
> 1978-04,46.6
>

When I did a very similar maneuver, I added an extra NA entry at the  
beginning:

myDF <- rbind(list(yyyymm="1977-01", Rainfall=NA), myDF)

... so the columns would start with January. (The warning is harmless.)

> myDF$yyyy<-substr(myDF$yyyymm,1,4)
> myDF$mm<-substr(myDF$yyyymm,6,7)
> myDF<-subset(myDF, select=c(yyyy,mm,Rainfall))
> myDF.reshape<-reshape(myDF,v.names="Rainfall",idvar="yyyy",
> timevar="mm",direction="wide")
> myDF.reshape
> best regards

When the time comes to rename those columns, knowing that there is a  
system constant called month.names may come in handy. Perhaps  
(untested):

names(myDF.reshape) <- c("Year", month.names[1:12])

>
> milton
--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Separating columns, and sorting by rows

Gabor Grothendieck
In reply to this post by RagingJim
Try this:

Lines <- "    yyyy-mm     Rainfall(mm)
1   1977-02        17.4
2   1977-03        34.0
3   1977-04        26.2
4   1977-05        42.6
5   1977-06        58.6
6   1977-07        23.2
7   1977-08        26.8
8   1977-09        48.4
9   1977-10        47.0
10  1977-11        37.2
11  1977-12        15.0
12  1978-01         2.6
13  1978-02         6.8
14  1978-03         9.0
15  1978-04        46.6"

library(zoo)
z <- read.zoo(textConnection(Lines), FUN = as.yearmon)

sq <- seq(floor(start(z)), ceiling(end(z)) - 1/12, 1/12)
DF <- matrix(merge(z, zoo(, sq)), byrow = TRUE, nc = 12, dimnames =
list(unique(floor(sq)), month.abb))

The first statement reads the data into a zoo object z using yearmon
(year/month) time class.

yearmon objects represent Jan 1977 as 1977, Feb 1977 as 1977+1/12, Mar
1977 as 1977+2/12, ..., Dec 1977 as 1977+11/12, etc. so the floor and
ceiling operations on the start and end of the series can be used to
form a sequence from the beginning of the first year to the end of the
last.

Finally we merge z together with a zero width series (i.e. no data,
only times) and then form the result into a matrix with the
appropriate dimension names.

> DF
     Jan  Feb Mar  Apr  May  Jun  Jul  Aug  Sep Oct  Nov Dec
1977  NA 17.4  34 26.2 42.6 58.6 23.2 26.8 48.4  47 37.2  15
1978 2.6  6.8   9 46.6   NA   NA   NA   NA   NA  NA   NA  NA


On Mon, Feb 15, 2010 at 1:07 AM, RagingJim <[hidden email]> wrote:

>
> Dear anyone who knows more about R than me (so everyone). I have been bashing
> my head on the keyboard all day trying to do something with my table.
>
> I have some data, like so:
>     yyyy-mm     Rainfall(mm)
> 1   1977-02        17.4
> 2   1977-03        34.0
> 3   1977-04        26.2
> 4   1977-05        42.6
> 5   1977-06        58.6
> 6   1977-07        23.2
> 7   1977-08        26.8
> 8   1977-09        48.4
> 9   1977-10        47.0
> 10  1977-11        37.2
> 11  1977-12        15.0
> 12  1978-01         2.6
> 13  1978-02         6.8
> 14  1978-03         9.0
> 15  1978-04        46.6
>
> The data continues on for x number of hundreds of data points. Simply put, I
> need to make that data.frame into this data.frame/table/matrix/grid/... you
> get the picture.
>
>         Jan  Feb  Mar ... etc
> Year   Rain Rain Rain
> Year   Rain Rain Rain
> Year   Rain Rain Rain
> Year   etc...
> Year
> Year
>
> How on earth do I do it? I have made little to no progress on it all day.
> Also, just like all the other problems, the year and month will change every
> time depending upon which csv file or sql query I load into the program. If
> anyone has any pointers, that would be awesome.
>
> Cheers lads.
> --
> View this message in context: http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555806.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Separating columns, and sorting by rows

RagingJim
Thanks for the help guys. This worked:

x<-sqlQuery(conn, "select to_char(lsd,'yyyy-mm') as yr,ttl_mo_prcp from mo_rains where stn_num=23090")


myDF=x
myDF[,1]=as.yearmon(myDF[,1])
myDF$R=myDF$TTL_MO_PRCP
myDF[,-2]
myDF$yyyy<-substr(myDF$YR,5,8)
myDF$mm<-substr(myDF$YR,1,4)
myDF<-subset(myDF, select=c(yyyy,mm,R))
myDF.reshape<-reshape(myDF,v.names="R",idvar="yyyy",timevar="mm",direction="wide")
myDF.reshape

kent=myDF.reshape


library(zoo)

rows=nrow(myDF.reshape)
kent[,1]=as.numeric(as.character(kent[,1]))
ann=c(kent1[,2]+kent1[,3]+kent1[,4]+kent1[,5]+kent1[,6]+kent1[,7]+kent1[,8]+kent1[,9]+kent1[,10]+kent1[,11]+kent1[,12]+kent1[,13])

kent1=cbind(kent$yyyy,kent$R.Jan,kent$R.Feb,kent$R.Mar,kent$R.Apr,kent$R.May,kent$R.Jun,kent$R.Jul,kent$R.Aug,kent$R.Sep,kent$R.Oct,kent$R.Nov,kent$R.Dec,ann)


Still some cleaning up to do, but it now works as promised :)

All I ahve to figure out now is how to fix have a pop up for people to write in a station number, and that station number then gets input into the query.

x<-sqlQuery(conn, "select to_char(lsd,'yyyy-mm') as yr,ttl_mo_prcp from mo_rains where stn_num=23090")

Any ideas at all?

Cheers again!