Subset a data frame with specific date

Subset a data frame with specific date

ani jaya
Good morning R-Help,

I have a data frame with 7 columns and 10000+ rows. I want to subset/extract
the rows of that data frame that match specific dates (not in order). Here is
the head of my data frame:

> head(mjo30)
  year month date      rmm1     rmm2 phase     amp
1 1986     1    1 -0.326480 -1.55895     2 1.59277
2 1986     1    2 -0.417700 -1.82689     2 1.87403
3 1986     1    3  0.032915 -2.40150     3 2.40172
4 1986     1    4  0.492743 -2.49216     3 2.54041
5 1986     1    5  0.585106 -2.76866     3 2.82981
6 1986     1    6  0.665013 -3.13883     3 3.20851

and here are my specific dates:
> date
 [1] "1986-04-25" "1987-06-10" "1988-09-03" "1989-10-05" "1990-10-26" "1991-05-07" "1992-11-19" "1993-01-23" "1994-12-04"
[10] "1995-05-11" "1996-10-04" "1997-04-29" "1998-04-08" "1999-01-16" "2000-08-01" "2001-10-02" "2002-05-08" "2003-04-01"
[19] "2004-05-07" "2005-09-02" "2006-12-30" "2007-09-03" "2008-10-24" "2009-11-14" "2010-07-05" "2011-04-30" "2012-05-21"
[28] "2013-04-07" "2014-05-07" "2015-07-26"

I was also confused when I dput my dates; it shows this:
> dput(date)
structure(c(5958, 6369, 6820, 7217, 7603, 7796, 8358, 8423, 9103,
9261, 9773, 9980, 10324, 10607, 11170, 11597, 11815, 12143, 12545,
13028, 13512, 13759, 14176, 14562, 14795, 15094, 15481, 15802,
16197, 16642), class = "Date")

What does that mean? Why does it not show the dates, but some
values (5958, 6369, 7217, ...) instead?

Any comments and recommendations are appreciated.  Thank you.

Best,

Ani

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Subset a data frame with specific date

Jeff Newmiller
The dput function is for re-creating an R object in another R workspace, so it uses fundamental base types to define objects. A Date is really the number of days since a specific date (1970-01-01 in R) that gets converted to look like a date whenever you display or print it, so what you are seeing are those numbers. If we enter the R code returned by dput into our R session we will be able to see the dates.
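
A small sketch of that round trip, using the first two numbers from the dput() output in the question. The object names d and d2 are made up for illustration and do not come from the thread.

```r
# Re-create the first two dates from the numbers that dput() printed.
d <- structure(c(5958, 6369), class = "Date")

print(d)      # shown formatted as dates because of the "Date" class
unclass(d)    # the class stripped off: the bare day counts again

# The same day counts, converted explicitly as days since 1970-01-01.
d2 <- as.Date(c(5958, 6369), origin = "1970-01-01")
format(d2)    # character strings such as "1986-04-25"
```

The point is only that the numbers and the printed dates are two views of the same object; nothing is lost by dput().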

Your mjo30 table seems to call the day of the month the "date"... which is confusing. I would combine those three columns into one like

mjo30$Dt <- as.Date( ISOdate( mjo30$year, mjo30$month, mjo30$date ) )

You could then use indexing

mjo30[ date[1] == mjo30$Dt, ]

or

mjo30[ mjo30$Dt %in% date, ]

but the subset function would not work in this case because you have two different objects (a column in mjo30 and a vector in your global environment) both referred to as 'date'.
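
A hedged sketch of that name clash, on a three-row stand-in for mjo30 with made-up values. The name target_dates is an assumption introduced here to avoid reusing 'date'; it is not from the original post.

```r
# Toy stand-in for mjo30; the values are invented for illustration.
mjo30 <- data.frame(year  = c(1986, 1986, 1987),
                    month = c(4, 4, 6),
                    date  = c(25, 26, 10))   # day of month, as in the question
mjo30$Dt <- as.Date(ISOdate(mjo30$year, mjo30$month, mjo30$date))

# Global vector of wanted dates, deliberately NOT called `date`.
target_dates <- as.Date(c("1986-04-25", "1987-06-10"))

# Inside with() or subset(), a bare `date` is looked up in the data frame
# first, so it means the day-of-month column rather than a global object.
col_seen <- with(mjo30, date)

# With distinct names, indexing and subset() pick the same rows.
by_index  <- mjo30[mjo30$Dt %in% target_dates, ]
by_subset <- subset(mjo30, Dt %in% target_dates)
```

Renaming either the column or the global vector removes the ambiguity, after which subset() is usable again.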

Re: Subset a data frame with specific date

Bert Gunter-2
In reply to this post by ani jaya
Inline.


On Mon, Jan 13, 2020 at 8:54 PM ani jaya <[hidden email]> wrote:

> Good morning R-Help,
>
> I have a data frame with 7 columns and 10000+ rows. I want to subset/extract
> the rows of that data frame that match specific dates (not in order). Here is
> the head of my data frame:
>
> > head(mjo30)
>   year month date      rmm1     rmm2 phase     amp
> 1 1986     1    1 -0.326480 -1.55895     2 1.59277
> 2 1986     1    2 -0.417700 -1.82689     2 1.87403
> 3 1986     1    3  0.032915 -2.40150     3 2.40172
> 4 1986     1    4  0.492743 -2.49216     3 2.54041
> 5 1986     1    5  0.585106 -2.76866     3 2.82981
> 6 1986     1    6  0.665013 -3.13883     3 3.20851
>

These are columns of numeric values. That you label them as year, month,
date is irrelevant.

> and here are my specific dates:
> > date
>  [1] "1986-04-25" "1987-06-10" "1988-09-03" "1989-10-05" "1990-10-26"
> "1991-05-07" "1992-11-19" "1993-01-23" "1994-12-04"
> [10] "1995-05-11" "1996-10-04" "1997-04-29" "1998-04-08" "1999-01-16"
> "2000-08-01" "2001-10-02" "2002-05-08" "2003-04-01"
> [19] "2004-05-07" "2005-09-02" "2006-12-30" "2007-09-03" "2008-10-24"
> "2009-11-14" "2010-07-05" "2011-04-30" "2012-05-21"
> [28] "2013-04-07" "2014-05-07" "2015-07-26"

This is how the print method for Date objects prints the dates. See ?Dates

> I was also confused when I dput my dates; it shows this:
> > dput(date)
> structure(c(5958, 6369, 6820, 7217, 7603, 7796, 8358, 8423, 9103,
> 9261, 9773, 9980, 10324, 10607, 11170, 11597, 11815, 12143, 12545,
> 13028, 13512, 13759, 14176, 14562, 14795, 15094, 15481, 15802,
> 16197, 16642), class = "Date")
>

This is how objects of class Date are represented internally: as numbers
counting the days since 1970-01-01. See ?Dates.
Use str() (see ?str) to see the structure of an object, not dput().
I think you need to go through a tutorial or two on dates in R, and
probably also on S3 methods in R.


> What does that mean? Why does it not show the dates, but some
> values (5958, 6369, 7217, ...) instead?
>
> Any comments and recommendations are appreciated.  Thank you.

Extended tutorials on these topics are inappropriate here. There are many
places they can be found on the web.
But here's an example of one simple way to do it:

> d <- as.Date("2004-10-5") ## create object of class "Date"
## This is what you want to subset with
> d  ## how they are printed
[1] "2004-10-05"
> str(d)
 Date[1:1], format: "2004-10-05"
> class(d)
[1] "Date"
> dput(d) ## the internal representation of Date objects
structure(12696, class = "Date")
>
>
> ## Now create a data frame that you want to subset with d
> df <- data.frame (year = c(2004,2005),
+       month = c(10,2),
+       date = c(5,15))
> df
  year month date
1 2004    10    5
2 2005     2   15
> ## convert to a formatted character column of dates
> alldates <- with(df,paste(year,month,date, sep ="-"))
> alldates ## vector of formatted character strings.
[1] "2004-10-5" "2005-2-15"
> class(alldates)
[1] "character"
> ## convert it to "Date" class
> alldates <- as.Date(alldates)
> class(alldates)
[1] "Date"
> ## Now use this to subset the data frame
> df[alldates %in% d, ]
  year month date
1 2004    10    5


## And please post in **plain text** not HTML in future.

Cheers,
Bert




Re: Subset a data frame with specific date

ani jaya
In reply to this post by Jeff Newmiller
Dear Jeff and Bert,

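Thank you very much for your correction and explanation.
And yes, I need to study date formats more.
Sorry for the HTML mail; I didn't realize.

I was able to subset the data that I want.

# read 10957 rows after skipping the first 4234 lines of the file
mjo30 <- read.table("rmm.txt", header = FALSE, skip = 4234, nrows = 10957)
mjo30$V8 <- NULL   # drop the eighth column
names(mjo30) <- c("year", "month", "day", "rmm1", "rmm2", "phase", "amp")
# build one Date per row from the year, month and day columns
mjo3 <- as.Date(with(mjo30, paste(year, month, day, sep = "-")), "%Y-%m-%d")
# keep the rows whose date is among the wanted dates
mjo <- mjo30[which(mjo3 %in% date), ]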

head(mjo)
     year month day      rmm1      rmm2 phase      amp
115  1986     4  25 -0.319090 -0.363030     2 0.483332
526  1987     6  10  1.662870  0.291632     5 1.688250
977  1988     9   3 -0.604950 -0.299850     1 0.675181
1374 1989    10   5  0.972298 -0.461030     4 1.076060
1760 1990    10  26 -1.183110 -1.589810     2 1.981730
1953 1991     5   7 -0.317180  0.953061     7 1.004450


Best,
Ani


Re: Subset a data frame with specific date

Bert Gunter-2
In reply to this post by ani jaya
That's fine, but do note that the which() function is wholly unnecessary in
your last line as R allows logical indexing. Perhaps another topic you need
to study.
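
A minimal sketch of the point about logical indexing, on invented data (df, all_dates, and wanted are illustrative names, not objects from the thread). It relies on the fact that %in% returns TRUE or FALSE and never NA, even for missing inputs.

```r
# Toy data; names and values are illustrative only.
df        <- data.frame(id = 1:5)
all_dates <- as.Date(c("1986-04-24", "1986-04-25", NA,
                       "1987-06-10", "1987-06-11"))
wanted    <- as.Date(c("1986-04-25", "1987-06-10"))

# Logical vector with one entry per row; the NA date gives FALSE, not NA.
keep <- all_dates %in% wanted

# Indexing with the logical vector directly, and via which().
direct     <- df[keep, , drop = FALSE]
with_which <- df[which(keep), , drop = FALSE]
```

Because keep contains no NA, both forms select the same rows here, so which() adds nothing in this particular case.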

-- Bert



Re: Subset a data frame with specific date

ani jaya
Thank you Bert.
And yes another topic to study.

Re: Subset a data frame with specific date

PIKAL Petr
In reply to this post by Bert Gunter-2
Hi Bert

I sometimes use indexing with which() too; it depends on the desired result,
especially with data frames.

> x <- 1:10
> x[5:6] <- NA
> xd <- data.frame(x, y=rnorm(10))

> xd[xd$x>3,]
      x          y
4     4 -1.5086790
NA   NA         NA
NA.1 NA         NA
7     7 -0.2302614
8     8 -0.1660547
9     9  1.3197811
10   10 -0.3234029
> xd[which(xd$x>3),]
    x          y
4   4 -1.5086790
7   7 -0.2302614
8   8 -0.1660547
9   9  1.3197811
10 10 -0.3234029

The variant without which() returns an all-NA row wherever the condition
evaluates to NA, which is sometimes undesirable.
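
A deterministic variant of the example above, as a sketch: y is fixed instead of random so the result is reproducible, and two alternatives to which() are shown alongside it. The names cond, via_isna, and via_subset are made up here.

```r
# Same shape as the example above, but with fixed y values.
xd <- data.frame(x = c(1:4, NA, NA, 7:10), y = 1:10)

cond <- xd$x > 3                 # NA wherever x is NA

with_na    <- xd[cond, ]                  # includes an all-NA row per NA
via_which  <- xd[which(cond), ]           # which() drops the NA positions
via_isna   <- xd[!is.na(cond) & cond, ]   # explicit NA handling
via_subset <- subset(xd, x > 3)           # subset() treats NA as FALSE
```

Which form to prefer is a matter of taste; the three NA-dropping variants return the same rows for this data.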

Cheers
Petr
