Discussion:
Fisher's & McNemar's tests
t***@gmail.com
2014-07-16 14:16:39 UTC
Dear all,

I'm conducting a retrospective study to assess the efficacy of a new test for predicting sudden cardiac death (SCD). I also want to compare the accuracy of my new predictor with that of a conventional predictor. I'm not a statistician, so I would very much appreciate it if any of you experts could confirm that I'm on the right track with respect to the statistical tests I need to use.

1. I'm planning to use Fisher's exact test to see if my new predictor is associated with the occurrence of SCD. That is, I will divide my sample into two groups, one of patients who died of SCD and one of patients who did not, and compare between the two groups the proportion of patients in whom my predictor was positive. Am I using Fisher's exact test appropriately here?
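For what it's worth, that comparison can be sketched in a few lines of Python with scipy; the counts below are invented purely for illustration (rows = predictor positive/negative, columns = SCD / no SCD):

```python
from scipy.stats import fisher_exact

#              SCD   no SCD
table = [[12,   5],   # predictor positive
         [ 3,  80]]   # predictor negative
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(odds_ratio, p_value)
```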

2. I'm planning to use McNemar's test to compare the accuracy of my new predictor with that of the conventional predictor. That is, I will have a 2x2 contingency table with the following entries:

Top left cell: # patients in which both predictors accurately determined whether
or not they died of SCD
Top right cell: # patients in which my predictor was accurate but the
conventional predictor was not
Bottom left cell: # patients in which my predictor was not accurate but the
conventional predictor was
Bottom right cell: # patients in which both predictors were inaccurate

Is my use of McNemar's test correct?
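For reference, the table described above could be run through McNemar's test like this in Python (statsmodels); the counts are invented, and only the two discordant cells (top right and bottom left) drive the test:

```python
from statsmodels.stats.contingency_tables import mcnemar

#              conventional accurate | not
table = [[50, 10],   # my predictor accurate
         [ 4,  6]]   # my predictor not accurate
result = mcnemar(table, exact=True)  # exact binomial test on the 10-vs-4 split
print(result.pvalue)
```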

Thanks much for your help,
Fijoy
Rich Ulrich
2014-07-16 17:06:10 UTC
Post by t***@gmail.com
Dear all,
I'm conducting a retrospective study to assess the efficacy of a new test for predicting sudden cardiac death (SCD). I also want to compare the accuracy of my new predictor with that of a conventional predictor. I'm not a statistician, so I would very much appreciate it if any of you experts can confirm that I'm in the right direction w.r.t the statistical tests I need to use.
1. I'm planning to use Fisher's exact test to see if my new predictor is associated with occurrence of SCD. That is, I will divide my sample of patients into two groups, one group in which patients died of SCD, and the other in which no one died of SCD, and I will compare between the two groups the proportion of patients in which my predictor was positive. Am I using the Fisher's test appropriately here?
The classical, ideal case for requiring the FET is when
both sets of marginal totals are fixed: two equal groups
with a forced choice that equalizes outcome counts, or an
outcome decided by a median split.

The 2x2 contingency chi-squared with Yates's correction gives
p-values that very closely match the FET.

Some people like the FET for everything. The time that others
retreat to the FET is when there are low cell expectations, and
the X^2 might come out far too large.
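As a quick numerical check (counts invented for illustration), the two p-values can be computed side by side; with a table this lopsided, both come out highly significant:

```python
from scipy.stats import chi2_contingency, fisher_exact

table = [[12, 5], [3, 80]]
# Pearson chi-squared with Yates's continuity correction
chi2_yates, p_yates, dof, expected = chi2_contingency(table, correction=True)
# Fisher's exact test on the same table
_, p_fet = fisher_exact(table)
print(p_yates, p_fet)
```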
Post by t***@gmail.com
Top left cell: # patients in which both predictors accurately determined whether
or not they died of SCD
Top right cell: # patients in which my predictor was accurate but the
conventional predictor was not
Bottom left cell: # patients in which my predictor was not accurate but the
conventional predictor was
Bottom right cell: # patients in which both predictors were inaccurate
Is my use of McNemar's test correct?
Mainly, NO. The counts of True Positives and True Negatives
are not at all assured to be commensurable; that is, you
probably should not add them, but should instead look carefully
at False Positives and False Negatives. - If your outcomes
are split about 50/50, it will work okay, since
the two sorts of errors will be equally likely.

Technically, if you frame the question as "accurate versus
inaccurate", the McNemar's test will answer it. McNemar's is
a computational convenience for computing a sign-test on
two quantities that might be expected to be equal; it generates
a chi-squared statistic, so you do not need a binomial table.
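The equivalence is easy to see numerically; here b and c are the two discordant counts (values invented), compared once via the continuity-corrected McNemar chi-squared and once via the exact sign test:

```python
from scipy.stats import binomtest, chi2

b, c = 10, 4                               # discordant counts
stat = (abs(b - c) - 1) ** 2 / (b + c)     # McNemar chi-squared, continuity-corrected
p_mcnemar = chi2.sf(stat, df=1)
p_sign = binomtest(b, b + c, 0.5).pvalue   # exact sign test on the same counts
print(p_mcnemar, p_sign)
```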

However, epidemiologists are very careful in separating
out the different errors, sensitivity versus specificity.
Your proposed use is wrong because it ignores that.

ILLUSTRATION.
The problem for rare events is that any time the number of false
positives exceeds the number of true positives, "best accuracy"
- in terms of the absolute sum of all errors - is achieved by predicting
that no one at all will have the event. So if a screening test says
that 20 people out of 1000 are at risk of SCD, and only 7 experience
it, that is 13 errors (20 flagged minus 7 true cases, for the useful
screen) versus 7 errors (no screen). (Both methods accrue the
errors for the people both missed.) Wrong
conclusion. We do like some screening tests that might be
overly-inclusive. Decision theory considers the value or cost
of the hits and misses in either direction.
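A few lines of arithmetic reproduce the illustration, assuming the screen catches all 7 true cases (if it misses some, both methods share those errors and the comparison is unchanged):

```python
flagged, true_cases = 20, 7                # screen flags 20 of 1000; 7 have SCD
false_positives = flagged - true_cases     # 13 healthy people flagged
errors_screen = false_positives            # screen: 13 FP, 0 FN
errors_no_screen = true_cases              # "predict nobody dies": 7 FN
print(errors_screen, errors_no_screen)     # 13 7
```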
--
Rich Ulrich
Fijoy Vadakkumpadan
2014-07-16 17:41:44 UTC
Hi Rich,

Thanks for your response. To confirm that I have understood your response to my first question correctly, are you suggesting that I use Chi-square test with Yates correction (instead of FET) to test if my new predictor is associated with clinical outcome?

Thanks
Fijoy
Post by Rich Ulrich
[snip previous]
Rich Ulrich
2014-07-16 23:01:37 UTC
On Wed, 16 Jul 2014 10:41:44 -0700 (PDT), Fijoy Vadakkumpadan wrote:
Post by Fijoy Vadakkumpadan
Hi Rich,
Thanks for your response. To confirm that I have understood your response to my first question correctly, are you suggesting that I use Chi-square test with Yates correction (instead of FET) to test if my new predictor is associated with clinical outcome?
[snip previous]

I would only report the FET if I had to make a point of the
small number of cases predicted and observed.

I might use the n-1 chi-squared mentioned by Bruce if
anyone else has started using it, or if it has made an
appearance in a good textbook.
--
Rich Ulrich
Bruce Weaver
2014-07-16 19:49:12 UTC
Post by t***@gmail.com
Dear all,
I'm conducting a retrospective study to assess the efficacy of a new test for predicting sudden cardiac death (SCD). I also want to compare the accuracy of my new predictor with that of a conventional predictor. I'm not a statistician, so I would very much appreciate it if any of you experts can confirm that I'm in the right direction w.r.t the statistical tests I need to use.
1. I'm planning to use Fisher's exact test to see if my new predictor is associated with occurrence of SCD. That is, I will divide my sample of patients into two groups, one group in which patients died of SCD, and the other in which no one died of SCD, and I will compare between the two groups the proportion of patients in which my predictor was positive. Am I using the Fisher's test appropriately here?
As Rich noted in his reply, the Fisher-Irwin test is meant for the
situation where the marginal totals are fixed in advance. I would
suggest the N-1 chi-square instead. You can read about it (and find an
online calculator) here:

http://www.iancampbell.co.uk/twobytwo/twobytwo.htm
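Per that page, the N-1 chi-square is simply the uncorrected Pearson statistic scaled by (N-1)/N, referred to the same chi-square distribution; a sketch with made-up counts:

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

table = np.array([[12, 5], [3, 80]])
n = table.sum()
# Uncorrected Pearson chi-squared
chi2_pearson, _, dof, _ = chi2_contingency(table, correction=False)
# Campbell's N-1 adjustment
chi2_n1 = chi2_pearson * (n - 1) / n
p_n1 = chi2.sf(chi2_n1, dof)
print(chi2_n1, p_n1)
```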
Post by t***@gmail.com
Top left cell: # patients in which both predictors accurately determined whether
or not they died of SCD
Top right cell: # patients in which my predictor was accurate but the
conventional predictor was not
Bottom left cell: # patients in which my predictor was not accurate but the
conventional predictor was
Bottom right cell: # patients in which both predictors were inaccurate
Is my use of McNemar's test correct?
Thanks much for your help,
Fijoy
If you want to use McNemar's test, you need to use it twice, once to
compare the sensitivities of the two tests (using only SCD-positive
cases), and once to compare the specificities (using only the
SCD-negative cases). The tables will look like this (use fixed font to
view):

T2+ T2-
T1+ a b
T1- c d

T1 = test 1
T2 = test 2
+ and - indicate positive & negative tests respectively
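A sketch of the sensitivity comparison in Python (statsmodels), with invented counts; running the same call on the SCD-negative patients compares the specificities:

```python
from statsmodels.stats.contingency_tables import mcnemar

# SCD-positive patients only, cross-classified by the two tests
#               T2+  T2-
scd_pos = [[30,  9],   # T1+
           [ 2,  5]]   # T1-
sens = mcnemar(scd_pos, exact=True)   # exact test on b = 9 vs c = 2
print(sens.pvalue)
```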

But you could also take a look at Robert Newcombe's book on confidence
intervals for proportions and related measures of effect size. It
includes discussion of his earlier article, "Simultaneous comparison of
sensitivity and specificity of two tests in the paired design: a
straightforward graphical approach". Here are some links:

http://onlinelibrary.wiley.com/doi/10.1002/sim.906/abstract
http://www.crcpress.com/product/isbn/9781439812785

See the Downloads tab on the second site. You can download a zip file
of Excel worksheets, one of which has the "simultaneous comparison of
sensitivity and specificity".

HTH.
--
Bruce Weaver
***@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."