getting strata/cluster level values with survey package?

Jeff Hamann
First, I apologise for the rookie question, but...

I'm trying to obtain standard errors, confidence intervals, etc. from a
sample design and have been having trouble getting results for anything
other than the basic total or mean for the overall survey from the survey
package.

For example, using the following dataset,

strata,cluster,vol
A,1,18.58556192
A,1,12.55175443
A,1,21.65882438
A,1,17.11172946
A,1,15.41713348
A,2,13.9344623
A,2,17.13104821
A,2,14.6806479
A,2,14.68357291
A,2,18.86017714
A,2,20.67642515
A,2,15.15295351
A,2,13.82121102
A,2,12.9110477
A,2,14.83153677
A,2,21.90772687
A,3,18.69795427
A,3,18.45636428
A,3,15.77175793
A,3,15.54715217
A,3,20.31948393
A,3,19.26391445
A,3,15.54750775
A,3,19.18724018
A,4,12.89572151
A,4,12.92047701
A,4,12.64958757
A,4,19.85888418
A,4,19.64057669
A,4,19.19188964
A,4,18.81619298
A,4,21.73670878
A,5,15.99430802
A,5,18.66666517
A,5,21.80441654
A,5,14.22081904
A,5,16.01576433
A,5,14.92497202
A,5,17.95123218
A,5,19.82027165
A,5,19.35698273
A,5,19.10826519
B,6,13.40892677
B,6,14.3956207
B,6,13.82113391
B,6,16.37338569
B,6,19.70159575
B,7,14.74334178
B,7,16.55125245
B,7,12.38329798
B,7,18.16472408
B,7,16.32938475
B,7,16.06465494
B,7,12.63086062
B,7,14.46114813
B,7,21.90134013
B,7,13.81025827
B,7,15.85805494
B,7,20.18195326
B,8,19.05120792
B,8,12.83856639
B,8,12.61360139
B,8,21.30434314
B,8,14.19960469
B,8,17.38397826
B,8,15.66477339
B,8,22.07182834
B,8,12.07487394
B,8,20.36357359
B,8,20.2543677
B,9,14.44499362
B,9,17.77235228
B,9,13.01620902
B,9,18.10976359
B,10,18.22350661
B,10,18.41504728
B,10,17.94735486
B,10,18.39173938
B,10,14.21729704
B,10,16.95753684
B,10,21.11643087
B,10,16.09688752
B,10,19.54707452
B,10,22.00450065
B,10,15.15308873
B,10,14.72488972
B,10,17.65280737
B,10,14.61615255
B,10,12.89525607
B,11,22.35831089
B,11,18.0853187
B,11,22.12815791
B,11,17.74562214
B,11,21.45724242
B,11,20.57933779
B,11,19.97397415
B,11,16.34967424
B,12,22.14385376
B,12,17.82816113
B,12,18.37056381
B,12,16.13152759
B,12,22.06764318
B,12,12.80924472
B,12,18.95522175
B,13,20.40554286
B,13,19.72951878
C,14,15.51581
C,14,15.4836358
C,14,13.35882363
C,14,13.16072916
C,14,21.69168971
C,14,19.09686303
C,14,14.47450457
C,14,12.04870424
C,14,13.33096141
C,14,17.38388981
C,14,16.29015289
C,14,16.32707754
C,14,16.2784054
C,15,15.0170597
C,15,14.95767365
C,15,15.20739614
C,15,22.10458509
C,15,12.3362457
C,15,19.87895753
C,15,18.8363682
C,15,16.43738666
C,15,12.84570744
C,15,15.99869357
C,15,14.42551321
C,15,13.63489872
C,15,15.67179885
C,16,14.61700901
C,16,14.64864676
C,16,14.13014582
C,16,21.7637441
C,16,20.66825543
C,16,17.05977818
C,16,17.80118916
C,16,15.16641698

where this is read into stand.data. When I use the following survey designs,

srv1 <- svydesign(ids=~1, strata=~strata, data=stand.data )

or,

srv1 <- svydesign(ids=~cluster, strata=~strata, data=stand.data )

with,

print( svytotal( ~vol, srv1 ) )

I only obtain the total,

> print( svytotal( ~vol, srv1 ) )
    total     SE
vol  2377 34.464

or worse,

print( svytotal( ~vol + strata, srv1 ) )
         total     SE
vol     2377.0 34.464
strataA   42.0  0.000
strataB   64.0  0.000
strataC   34.0  0.000

which reports the number of observations in each of the strata. I'm sure
this is an RTFM question, but I just need a start. The size of each "plot"
is 0.04 units (hectares) and I want to be able to quickly work up each
sample with and without clusters (this is going to be part of a
larger simulation study).

I'm trying to not use SAS for this and hate to admit defeat.

Thanks,
Jeff.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: getting strata/cluster level values with survey package?

Thomas Lumley
On Tue, 7 Feb 2006, Jeff D. Hamann wrote:

> First, I apologise for the rookie question, but...
>
> I'm trying to obtain standard errors, confidence intervals, etc. from a
> sample design and have been having trouble getting results for anything
> other than the basic total or mean for the overall survey from the survey
> package.

You want svyby() and then perhaps ftable() for formatting. (?svyby,
?ftable.svyby).
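
A minimal sketch of the svyby() route, assuming the survey package is
installed. The data frame below holds only a few rows copied from the posted
dataset so that the example is self-contained; the object names by_stratum and
by_cluster are illustrative, and the full stand.data would be used in practice.

```r
library(survey)

# A few rows copied from the posted data (two clusters each from strata A and B).
# This is a stand-in for the full stand.data, used only to keep the sketch short.
stand.data <- data.frame(
  strata  = c("A", "A", "A", "A", "B", "B", "B", "B"),
  cluster = c(1, 1, 2, 2, 6, 6, 9, 9),
  vol     = c(18.58556192, 12.55175443, 13.9344623, 17.13104821,
              13.40892677, 14.3956207, 14.44499362, 17.77235228)
)

# Stratified design with each observation as its own sampling unit.
# No weights are supplied, so svydesign() assumes equal probabilities
# and issues a warning to that effect.
srv1 <- svydesign(ids = ~1, strata = ~strata, data = stand.data)

# Totals of vol within each stratum, with standard errors and
# confidence limits as extra columns.
by_stratum <- svyby(~vol, ~strata, srv1, svytotal, vartype = c("se", "ci"))
print(by_stratum)

# Means of vol within each cluster, from the same design.
by_cluster <- svyby(~vol, ~cluster, srv1, svymean)
print(by_cluster)

# Confidence intervals can also be extracted afterwards.
print(confint(by_cluster))
```

Swapping in the clustered design (ids=~cluster) only changes the svydesign()
call; the svyby() calls stay the same, which should make the with/without
cluster comparison straightforward.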

(You also want to send only one copy of the email message, not three).

  -thomas




Thomas Lumley                   Assoc. Professor, Biostatistics
[hidden email]                  University of Washington, Seattle
