Help with Binning Data

Shouro Dasgupta
Dear all,

I have 3-hourly temperature data from 1970-2010 for 122 cities in the US. I
would like to bin this data by city-year-week. My idea is if the
temperature for a particular city in a given week falls within a given
range (-17.78 & -12.22), (-12.22 & -6.67), ... (37.78 & 43.33), then the
corresponding bin would have a value of 1 and 0 otherwise.

The data looks like this. Basically, I need to generate a dummy variable
for each temperature range. Any help will be greatly appreciated.

tmp2<- dput(head(tmp1,10))

> structure(list(yearday = c(1970001L, 1970001L, 1970001L, 1970001L,
> 1970001L, 1970001L, 1970001L, 1970001L, 1970001L, 1970001L),
>     City = structure(1:10, .Label = c("AKRON", "ALBANY", "ALBUQUERQUE",
>     "ALLENTOWN", "ATLANTA", "AUSTIN", "BALTIMORE", "BATON ROUGE",
>     "BERKELEY", "BIRMINGHAM", "BOISE", "BOSTON", "BRIDGEPORT",
>     "BUFFALO", "CAMBRIDGE", "CAMDEN", "CANTON", "CHARLOTTE",
>     "CHATTANOOGA", "CHICAGO", "CINCINNATI", "CLEVELAND", "COLORADO SPRINGS",
>     "COLUMBUS", "CORPUS CHRISTI", "DALLAS", "DAYTON", "DENVER",
>     "DES MOINES", "DETROIT", "DULUTH", "EL PASO", "ELIZABETH",
>     "ERIE", "EVANSVILLE", "FALL RIVER", "FLINT", "FORT WAYNE",
>     "FRESNO", "FT WORTH", "GARY", "GLENDALE", "GRAND RAPIDS",
>     "HARTFORD", "HONOLULU", "HOUSTON", "INDIANAPOLIS", "JACKSONVILLE",
>     "JERSEY CITY", "KANSAS CITY", "KANSAS ITY", "KNOXVILLE",
>     "Lansing ", "LAS VEGAS", "LEXINGTON", "LINCOLN", "LITTLE ROCK",
>     "LONG BEACH", "LOS ANGELES", "LOUISVILLE", "LOWELL", "LYNN",
>     "MADISON", "MEMPHIS", "MIAMI", "MILWAUKEE", "MINNEAPOLIS",
>     "MOBILE", "MONTGOMERY", "NASHVILLE", "NEW BEDFORD", "NEW HAVEN",
>     "NEW ORLEANS", "NEW YORK CITY", "NEWARK", "NORFOLK", "OAKLAND",
>     "OGDEN", "OKLAHOMA CITY", "OMAHA", "PASADENA", "PATERSON",
>     "PEORIA", "PHILADELPHIA", "PHOENIX", "PITTSBURG", "PORTLAND",
>     "PROVIDENCE", "PUEBLO", "READING", "RICHMOND", "ROCHESTER",
>     "ROCKFORD", "SACRAMENTO", "SALT LAKE CITY", "SAN ANTONIO",
>     "SAN CRUZ", "SAN DIEGO", "SAN FRANCISCO", "SAN JOSE", "SAVANNAH",
>     "SCHENECTADY", "SCRANTON", "SEATTLE", "SHREVEPORT", "SOMERVILLE",
>     "SOUTH BEND", "SPOKANE", "SPRINGFIELD", "ST LOUIS", "ST PAUL",
>     "ST PETERSBURG", "SYRACUSE", "TACOMA", "TAMPA", "TOLEDO",
>     "TRENTON", "TUCSON", "TULSA", "UTICA", "WASHINGTON", "WATERBURY",
>     "WICHITA", "WILMINGTON", "WORCESTER", "YONKERS", "YOUNGSTOWN"
>     ), class = "factor"), cell_number = c(17379L, 17027L, 19514L,
>     17745L, 20256L, 21323L, 18104L, 21329L, 18779L, 20254L),
>     longitude = c(-81.519005, -73.756232, -106.609991, -75.490183,
>     -84.387982, -97.743061, -76.612189, -91.14032, -121.635963,
>     -86.80249), latitude = c(41.081445, 42.652579, 35.110703,
>     40.608431, 33.748995, 30.267153, 39.290385, 30.458283, 37.871744,
>     33.520661), State = structure(c(29L, 28L, 27L, 32L, 10L,
>     35L, 19L, 17L, 4L, 1L), .Label = c(" ALA", " ARIZ", " ARK",
>     " CAL", " COLO", " CONN", " DC", " DEL", " FLA", " GA", " HAWAII",
>     " ILL", " IND", " IOWA", " KANS", " KY", " LA", " MASS",
>     " MD", " MICH", " MINN", " MO", " NC", " NEBR", " NEV", " NJ",
>     " NM", " NY", " OHIO", " OKLA", " ORE", " PA", " RI", " TENN",
>     " TEX", " UTAH", " VA", " WASH", " WIS", "CAL", "CONN", "IDAH",
>     "KY", "MASS"), class = "factor"), avsft = c(-7.81, -16.06,
>     -7.71999999999997, -1.88999999999999, 2.90000000000003, 5.12,
>     -5.02999999999997, 9.33000000000004, 15.08, 2.89000000000004
>     ), year = c(1970L, 1970L, 1970L, 1970L, 1970L, 1970L, 1970L,
>     1970L, 1970L, 1970L), day = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
>     1L, 1L, 1L), hour = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>     0L), yearweek = c(197001L, 197001L, 197001L, 197001L, 197001L,
>     197001L, 197001L, 197001L, 197001L, 197001L), week = c(1L,
>     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("yearday",
> "City", "cell_number", "longitude", "latitude", "State", "avsft",
> "year", "day", "hour", "yearweek", "week"), row.names = c(NA,
> 10L), class = "data.frame")


Sincerely,

Shouro

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Help with Binning Data

Bert Gunter-2
1. Posting in HTML largely negated your ability to provide data
through dput(). Follow the posting guide and post in PLAIN TEXT only,
please.

2. See ?cut  . I think this will at least get you started.
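
As a minimal sketch of what ?cut does (not from the original reply): cut() turns a numeric vector into a factor whose levels are the intervals between consecutive break points. The break points below are made up for illustration, and the ten temperatures are the avsft values from the posted sample, rounded to two decimals.

```r
# The ten sample temperatures (column avsft) from the original post, rounded
avsft <- c(-7.81, -16.06, -7.72, -1.89, 2.90, 5.12, -5.03, 9.33, 15.08, 2.89)

# Hypothetical break points, chosen only for this illustration
breaks <- c(-20, -10, 0, 10, 20)

# right = FALSE makes each interval closed on the left: [a, b)
bin <- cut(avsft, breaks = breaks, right = FALSE)

# Number of observations per interval
table(bin)
```

Values outside the outermost break points become NA, so the breaks should cover the full range of the data.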

Cheers,
Bert
Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll


Re: Help with Binning Data

David Winsemius
In reply to this post by Shouro Dasgupta

On Sep 10, 2015, at 3:28 PM, Shouro Dasgupta wrote:

> Dear all,
>
> I have 3-hourly temperature data from 1970-2010 for 122 cities in the US. I
> would like to bin this data by city-year-week. My idea is if the
> temperature for a particular city in a given week falls within a given
> range (-17.78 & -12.22), (-12.22 & -6.67), ... (37.78 & 43.33), then the
> corresponding bin would have a value of 1 and 0 otherwise.
>
> The data looks like this. Basically, I need to generate a dummy variable
> for each temperature range. Any help will be greatly appreciated.

The urge to imitate other statistical package that rely on profusion of dummies should be resisted. R repression functions can handle factor variables and the `cut` function can deliver them along with appropriate use of `seq`:

  tmp2$Tcat <- cut( tmp2$avsft, breaks=seq (-17.78,  43.33, by= 5.55 ) )

> tmp2$Tcat
 [1] (-12.2,-6.68] (-17.8,-12.2] (-12.2,-6.68] (-6.68,-1.13]
 [5] (-1.13,4.42]  (4.42,9.97]   (-6.68,-1.13] (4.42,9.97]  
 [9] (9.97,15.5]   (-1.13,4.42]
11 Levels: (-17.8,-12.2] (-12.2,-6.68] ... (37.7,43.3]


> tmp2[ , c("City", "Tcat")]
          City          Tcat
1        AKRON (-12.2,-6.68]
2       ALBANY (-17.8,-12.2]
3  ALBUQUERQUE (-12.2,-6.68]
4    ALLENTOWN (-6.68,-1.13]
5      ATLANTA  (-1.13,4.42]
6       AUSTIN   (4.42,9.97]
7    BALTIMORE (-6.68,-1.13]
8  BATON ROUGE   (4.42,9.97]
9     BERKELEY   (9.97,15.5]
10  BIRMINGHAM  (-1.13,4.42]

Must have been a cold snap in the southeast that New Year's Day.


There.... isn't that much neater than having a messy bunch of dummies? If you really need to build them then look at `?model.frame`.
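
A hedged sketch of the same idea with bin edges derived rather than stepped by 5.55: the ranges in the question (-17.78, -12.22, ..., 43.33) match 0, 10, ..., 110 degrees Fahrenheit converted to Celsius, so the edges can be computed exactly, whereas stepping by 5.55 drifts and ends at 43.27 rather than 43.33. This reading of the bin edges is an assumption, as is the choice of left-closed intervals. If indicator columns really are needed, model.matrix() is the base R function that expands a factor into them.

```r
# The ten sample temperatures (column avsft) from the original post, rounded
avsft <- c(-7.81, -16.06, -7.72, -1.89, 2.90, 5.12, -5.03, 9.33, 15.08, 2.89)

# Assumption: bin edges are 0, 10, ..., 110 degrees Fahrenheit, in Celsius
breaks <- (seq(0, 110, by = 10) - 32) * 5 / 9

# Factor with 11 levels, intervals closed on the left
Tcat <- cut(avsft, breaks = breaks, right = FALSE)

# One 0/1 column per level; "- 1" drops the intercept so every level is kept
dummies <- model.matrix(~ Tcat - 1)
dim(dummies)
```

In a regression formula the factor can usually be used directly, e.g. lm(y ~ Tcat) with y a hypothetical response; R then builds the indicator columns itself.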

--
David.

David Winsemius
Alameda, CA, USA

Re: [FORGED] Re: Help with Binning Data

Rolf Turner
On 11/09/15 11:57, David Winsemius wrote:

<SNIP>

> The urge to imitate other statistical package that rely on profusion
> of dummies should be resisted. R repression functions can handle
> factor variables ....

<SNIP>

Fortune? :-)

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Re: [FORGED] Re: Help with Binning Data

Achim Zeileis-4
On Fri, 11 Sep 2015, Rolf Turner wrote:

> On 11/09/15 11:57, David Winsemius wrote:
>
> <SNIP>
>
>> The urge to imitate other statistical package that rely on profusion
>> of dummies should be resisted. R repression functions can handle
>> factor variables ....
>
> <SNIP>
>
> Fortune? :-)

Nice! Should I include the "repression" typo? :-)

Best,
Z

Re: [FORGED] Re: Help with Binning Data

Rolf Turner
On 11/09/15 12:25, Achim Zeileis wrote:

> On Fri, 11 Sep 2015, Rolf Turner wrote:
>
>> On 11/09/15 11:57, David Winsemius wrote:
>>
>> <SNIP>
>>
>>> The urge to imitate other statistical package that rely on profusion
>>> of dummies should be resisted. R repression functions can handle
>>> factor variables ....
>>
>> <SNIP>
>>
>> Fortune? :-)
>
> Nice! Should I include the "repression" typo? :-)

Yes!  That's the point, from my point of view! :-)

cheers,

Rolf

Re: [FORGED] Help with Binning Data

David Winsemius
In reply to this post by Achim Zeileis-4

On Sep 10, 2015, at 5:25 PM, Achim Zeileis wrote:

> On Fri, 11 Sep 2015, Rolf Turner wrote:
>
>> On 11/09/15 11:57, David Winsemius wrote:
>>
>> <SNIP>
>>
>>> The urge to imitate other statistical package that rely on profusion
>>> of dummies should be resisted. R repression functions can handle
>>> factor variables ....
>>
>> <SNIP>
>>
>> Fortune? :-)
>
> Nice! Should I include the "repression" typo? :-)
>

 Er, maybe not. Or the package[s] error.

Whatever;
David.

Re: Help with Binning Data

Shouro Dasgupta
In reply to this post by Bert Gunter-2
Apologies for the HTML. It shouldn't have happened. I would like to use the
dummies as independent variables in a regression. I did manage to get a count
of observations in a given range using the following code:

library(data.table)

for (i in filelist) {
  # i <- filelist[1]
  tmp1 <- as.data.table(read.csv(i, sep = ","))
  year <- tmp1$year[1]
  mykey <- c("City", "year", "week")
  # One row per City-year-week; keep only the key columns
  output <- as.data.frame(tmp1[, sum(avsft < 0), by = mykey])[, 1:length(mykey)]
  # Count of observations in [-17.78, -12.22) for each City-year-week
  output$avsft_1 <- as.data.frame(tmp1[, sum(avsft >= -17.78 & avsft < -12.22,
    na.rm = TRUE), by = mykey])[, length(mykey) + 1]
  # ... (the post shows only the first range)
}

Here "i" is a file name (each file has data for 1 year). But instead of a
count I would like to generate dummy variables for the ranges [(-17.78 &
-12.22), (-12.22 & -6.67), ... (37.78 & 43.33)], so that if a temperature
observation falls within a given range, the dummy variable for that range
has a value of 1 for that week. Thanks again!
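
One possible way to get from observation-level bins to the weekly 0/1 variables described above, sketched in base R on made-up data shaped like tmp1. The toy values, the avsft_1 ... avsft_11 column names, and the Fahrenheit-derived bin edges are assumptions for illustration. Each observation is expanded into one indicator per bin, and the maximum within each City-year-week is 1 when at least one observation of that week fell in the bin.

```r
# Hypothetical toy data with the same key columns as the poster's tmp1
tmp1 <- data.frame(
  City  = c("AKRON", "AKRON", "AKRON", "ALBANY", "ALBANY", "ALBANY"),
  year  = 1970L,
  week  = c(1L, 1L, 2L, 1L, 1L, 1L),
  avsft = c(-7.81, -13.50, 2.90, -16.06, -15.00, -14.20)
)

# Assumption: bin edges are 0, 10, ..., 110 degrees Fahrenheit, in Celsius
breaks <- (seq(0, 110, by = 10) - 32) * 5 / 9
tmp1$bin <- cut(tmp1$avsft, breaks = breaks, right = FALSE,
                labels = paste0("avsft_", 1:11))

# Observation-level indicators: one column per bin, named after the bin
ind <- sapply(levels(tmp1$bin), function(l) as.integer(tmp1$bin == l))

# Weekly dummy: 1 if any observation in that City-year-week hit the bin
weekly <- aggregate(ind, by = tmp1[c("City", "year", "week")], FUN = max)
weekly
```

Replacing FUN = max with FUN = sum would give the counts that the loop above computes, one column per range.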

Sincerely,

Shouro




Re: [FORGED] Re: Help with Binning Data

John Kane
In reply to this post by Rolf Turner
"that rely on profusion of dummies" :)

+1

John Kane
Kingston ON Canada


Re: [FORGED] Re: Help with Binning Data

Jim Lemon-4
Hi Shouro,
While I have enjoyed the continuing discussion on this particular message
(repression may have been a Galtonian slip), there is a lingering doubt in
my mind. You say that you want to categorize the weekly temperatures for
cities in bins of about 5.6 degrees (centigrade?). In almost all of the
cities you include in your sample data (of quite a few of which I have
personal experience) the variation in temperature over a day, not to
mention a week, is more than this. Unless you derive some particular
temperature value, many cities will span more than one bin over a week.
Have you already calculated a weekly average from your 3-hourly observations?

Jim
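
To make the point above concrete, here is a minimal sketch of collapsing 3-hourly readings to one value per city-year-week before any binning. The data are simulated, the column name temp is hypothetical, and the week definition (week 1 = days 1-7 of the year) is an assumption rather than anything stated in the thread.

```r
# Simulated 3-hourly readings: one city, 14 days, 8 readings per day.
# 'temp' is a hypothetical column name; 'yearday' follows the YYYYDDD
# layout shown in the original post.
set.seed(1)
obs <- data.frame(
  City    = "AKRON",
  yearday = rep(1970001:1970014, each = 8),
  temp    = rnorm(14 * 8, mean = -5, sd = 4)
)

# Split YYYYDDD into year and day of year, then derive a week index.
# Assumption: week 1 covers days 1-7, week 2 covers days 8-14, and so on.
obs$year <- obs$yearday %/% 1000
doy      <- obs$yearday %% 1000
obs$week <- (doy - 1) %/% 7 + 1

# One mean temperature per city-year-week
weekly <- aggregate(temp ~ City + year + week, data = obs, FUN = mean)
names(weekly)[names(weekly) == "temp"] <- "tmean"

# Within-week spread (max minus min), to compare against the bin width
spread <- aggregate(temp ~ City + year + week, data = obs,
                    FUN = function(x) diff(range(x)))
weekly$spread <- spread$temp

weekly
```

If the spread column is larger than the roughly 5.6 degree bin width, the raw readings for that week fall in several bins, which is why a single weekly summary is needed first.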




Re: [FORGED] Re: Help with Binning Data

Shouro Dasgupta
Dear Jim,

Thank you for your reply and for pointing this out. I had thought about it
and then forgot. I have computed the weekly average (and max and min). The
data are below. As before, I computed the max/min/mean separately for each
year, so each file contains data for one year. Can I modify the code I used
for the counts? Thanks again!

Sincerely,

Shouro

structure(list(City = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("AKRON", "ALBANY", "ALBUQUERQUE", "ALLENTOWN",
"ATLANTA", "AUSTIN", "BALTIMORE", "BATON ROUGE", "BERKELEY",
"BIRMINGHAM", "BOISE", "BOSTON", "BRIDGEPORT", "BUFFALO", "CAMBRIDGE",
"CAMDEN", "CANTON", "CHARLOTTE", "CHATTANOOGA", "CHICAGO", "CINCINNATI",
"CLEVELAND", "COLORADO SPRINGS", "COLUMBUS", "CORPUS CHRISTI",
"DALLAS", "DAYTON", "DENVER", "DES MOINES", "DETROIT", "DULUTH",
"EL PASO", "ELIZABETH", "ERIE", "EVANSVILLE", "FALL RIVER", "FLINT",
"FORT WAYNE", "FRESNO", "FT WORTH", "GARY", "GLENDALE", "GRAND RAPIDS",
"HARTFORD", "HONOLULU", "HOUSTON", "INDIANAPOLIS", "JACKSONVILLE",
"JERSEY CITY", "KANSAS CITY", "KANSAS ITY", "KNOXVILLE", "Lansing ",
"LAS VEGAS", "LEXINGTON", "LINCOLN", "LITTLE ROCK", "LONG BEACH",
"LOS ANGELES", "LOUISVILLE", "LOWELL", "LYNN", "MADISON", "MEMPHIS",
"MIAMI", "MILWAUKEE", "MINNEAPOLIS", "MOBILE", "MONTGOMERY",
"NASHVILLE", "NEW BEDFORD", "NEW HAVEN", "NEW ORLEANS", "NEW YORK CITY",
"NEWARK", "NORFOLK", "OAKLAND", "OGDEN", "OKLAHOMA CITY", "OMAHA",
"PASADENA", "PATERSON", "PEORIA", "PHILADELPHIA", "PHOENIX",
"PITTSBURG", "PORTLAND", "PROVIDENCE", "PUEBLO", "READING", "RICHMOND",
"ROCHESTER", "ROCKFORD", "SACRAMENTO", "SALT LAKE CITY", "SAN ANTONIO",
"SAN CRUZ", "SAN DIEGO", "SAN FRANCISCO", "SAN JOSE", "SAVANNAH",
"SCHENECTADY", "SCRANTON", "SEATTLE", "SHREVEPORT", "SOMERVILLE",
"SOUTH BEND", "SPOKANE", "SPRINGFIELD", "ST LOUIS", "ST PAUL",
"ST PETERSBURG", "SYRACUSE", "TACOMA", "TAMPA", "TOLEDO", "TRENTON",
"TUCSON", "TULSA", "UTICA", "WASHINGTON", "WATERBURY", "WICHITA",
"WILMINGTON", "WORCESTER", "YONKERS", "YOUNGSTOWN"), class = "factor"),
    year = c(1970L, 1970L, 1970L, 1970L, 1970L, 1970L, 1970L,
    1970L, 1970L, 1970L), week = 1:10, tmax = c(-3.94999999999997,
    -6.28714285714283, -3.38285714285712, -4.24571428571427,
    0.188571428571453, -1.3485714285714, -1.40285714285712,
    3.66285714285717, 4.55000000000002, 5.7157142857143),
    tmin = c(-10.7316666666667,
    -12.2057142857143, -10.7885714285714, -13.3157142857143,
    -6.73999999999998, -8.60999999999998, -8.47999999999997,
    -6.02428571428569, -4.36428571428569, -2.43999999999998),
    tmean = c(-7.36583333333332, -9.77446428571427, -7.11892857142855,
    -9.07499999999999, -3.45946428571426, -4.99214285714284,
    -5.27874999999998, -1.31928571428569, -0.556249999999979,
    1.24714285714287)), .Names = c("City", "year", "week", "tmax",
"tmin", "tmean"), row.names = c(NA, 10L), class = "data.frame")
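
One possible way to bin weekly means like these is sketched below, using cut() and then, only if 0/1 columns are really needed, model.matrix(). The tmean values are rounded from the data above. The bin edges are an assumption: the ranges quoted in the original post match 0, 10, ..., 110 degrees Fahrenheit converted to Celsius, so they are rebuilt that way here. Treating each bin as closed on the left (right = FALSE) is also an assumption.

```r
# Weekly means for AKRON, 1970, weeks 1-10 (rounded from the dput above)
wk <- data.frame(
  City  = "AKRON",
  year  = 1970L,
  week  = 1:10,
  tmean = c(-7.37, -9.77, -7.12, -9.07, -3.46,
            -4.99, -5.28, -1.32, -0.56, 1.25)
)

# Assumed bin edges: 0, 10, ..., 110 deg F expressed in deg C,
# i.e. -17.78, -12.22, -6.67, ..., 37.78, 43.33 (12 edges, 11 bins)
breaks <- (seq(0, 110, by = 10) - 32) * 5 / 9

# Assign each weekly mean to a bin; the result is a factor.
# right = FALSE makes each bin include its lower edge (an assumption).
wk$bin <- cut(wk$tmean, breaks = breaks, right = FALSE)

# Optional: one 0/1 indicator column per bin, including empty bins
dummies <- model.matrix(~ bin - 1, data = wk)

table(wk$bin)
```

As noted earlier in the thread, R's modelling functions accept a factor directly, so wk$bin can usually go straight into a formula and the dummies matrix is only needed when explicit indicator columns are required. Values outside the outermost edges come back as NA from cut(), so those rows would need handling before building indicators.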



--

*Shouro Dasgupta*
PhD Candidate
Science and Management of Climate Change
Department of Economics | Ca' Foscari University of Venice
-------------------------------------------------------------------------------------------------------
Junior Researcher
Fondazione Eni Enrico Mattei (FEEM) | Centro Euro-Mediterraneo per i
Cambiamenti Climatici (CMCC)
Isola di San Giorgio Maggiore, 8
30124 Venezia
Phone: +39 041 2700 436
