Discussion:
standard error=0 in stratified sampling?
p***@gmail.com
2017-08-03 13:12:37 UTC
Suppose I have a population with 100 events and 900 non-events. Thus, the population’s event rate is 0.1.
When I select 10% from these 100 events and also 10% from the 900 non-events, the sample’s event rate is 0.1.
If I repeat the process 20 times to create 20 samples, each sample’s rate is 0.1. Then the standard error (the square root of the variance of these 20 means) is 0, because all 20 event rates are 0.1.
Am I missing something, or is this a legitimate stratified sampling? Please help. Thanks.
Rich Ulrich
2017-08-03 17:26:59 UTC
I'm cross-posting this in the 3 groups where I see the identical
message.
Post by p***@gmail.com
Suppose I have a population with 100 events and 900 non-events. Thus, the population’s event rate is 0.1.
When I select 10% from these 100 events and also 10% from the 900 non-events, the sample’s event rate is 0.1.
If I repeat the process 20 times to create 20 samples, each sample’s rate is 0.1. Then the standard error (the square root of the variance of these 20 means) is 0, because all 20 event rates are 0.1.
Am I missing something, or is this a legitimate stratified sampling? Please help. Thanks.
This does not look familiar to me, but you can do a lot of stuff
if you can justify it. What is very clear is that you cannot use
a variance (or SE) for any inference or testing after you have
set it to zero by the design. What are you trying to estimate?
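
A minimal sketch of that point, assuming Python's standard library and
the counts from the question (the function name and the seed are
hypothetical, chosen only for illustration). Because the strata are the
outcome itself, every sample holds exactly 10 events and 90 non-events,
so the spread across repeated samples is zero by construction:

```python
import random
import statistics

def stratified_event_rate(events, non_events, fraction, rng):
    """Draw the same fraction from each outcome stratum and pool the draws."""
    n_events = round(len(events) * fraction)
    n_non_events = round(len(non_events) * fraction)
    sample = rng.sample(events, n_events) + rng.sample(non_events, n_non_events)
    return sum(sample) / len(sample)

rng = random.Random(0)       # arbitrary seed; the result does not depend on it
events = [1] * 100           # 100 events, as in the question
non_events = [0] * 900       # 900 non-events

# Repeat the design 20 times and look at the spread of the sample rates.
rates = [stratified_event_rate(events, non_events, 0.10, rng) for _ in range(20)]
se = statistics.stdev(rates)

print(rates[:3], se)
```

Which units are drawn changes from run to run, but the event rate does
not, so this SE says nothing about sampling uncertainty in the rate.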

A binomial rate has its own variance, determined by the mean:
p(1-p)/n. So those variances are ordinarily robust.
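
For comparison, a hedged sketch of what that variance gives for the
numbers in the question, assuming a simple random sample of 100 drawn
from the 1000 units without regard to outcome. The finite population
correction is a textbook addition not mentioned in the thread, and the
seed and the 2000 replications are arbitrary choices:

```python
import math
import random
import statistics

N, n, p = 1000, 100, 0.1   # population size, sample size, event rate

# Binomial standard error of a sample proportion: sqrt(p(1-p)/n).
binomial_se = math.sqrt(p * (1 - p) / n)

# Finite population correction for sampling without replacement
# (an addition here, not part of the original discussion).
fpc_se = binomial_se * math.sqrt((N - n) / (N - 1))

# Empirical check: repeated simple random samples that ignore the outcome.
rng = random.Random(0)
population = [1] * 100 + [0] * 900
rates = [sum(rng.sample(population, n)) / n for _ in range(2000)]
simulated_se = statistics.stdev(rates)

print(f"binomial: {binomial_se:.4f}  with FPC: {fpc_se:.4f}  simulated: {simulated_se:.4f}")
```

The binomial value is about 0.030 and the corrected value about 0.028;
the simulated SE should land near the corrected one, and in any case
far from zero.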

So: If you are interested in the variance of the rate of events,
that should not be very problematic unless you are stratifying by
some /other/ variable that has a strong effect on the event rate.

Stratification, jack-knife, bootstrap -- I've never done much with
any of them, but it looks to me like you are confusing the ideas.
Bootstrapping goes after difficult variances, but I don't picture
that problem with a dichotomous outcome. (Jackknife, ditto.)

You seem to have the whole sample in hand, so I don't see why your
stratification is desirable. I see that as a sampling scheme which is
used when resources are inadequate, or to avoid huge Ns (which
is less often seen as a problem now than when computers were
1000 times slower).

Hope this helps,
--
Rich Ulrich