Discussion:
Comparing two models
Fijoy Vadakkumpadan
2014-09-06 14:49:09 UTC
Permalink
Dear all,

I am interested in comparing two logistic regression models. The two models are nested. I understand that I can use the likelihood ratio test (LRT) for the comparison.

I'm wondering if I can use the Akaike Information Criterion (AIC) to compare the two models. Or is the LRT more accurate than the AIC when the models are nested?

Thank you,
Fijoy
Rich Ulrich
2014-09-06 19:56:22 UTC
Permalink
On Sat, 6 Sep 2014 07:49:09 -0700 (PDT), Fijoy Vadakkumpadan
Post by Fijoy Vadakkumpadan
Dear all,
I am interested in comparing two logistic regression models. The two models are nested. I understand that I can use the likelihood ratio test (LRT) for the comparison.
I'm wondering if I can use the Akaike Information Criterion (AIC) to compare the two models. Or, is the LRT test more accurate than AIC when the models are nested?
The LRT is an exact test that is available for nested models.

The AICc is an approximation; computational formulas show it
to be a modification of the LRT. It is favored over BIC (or AIC)
because it is pretty robust when you do not have nested models.
--
Rich Ulrich
David Jones
2014-09-07 01:06:19 UTC
Permalink
"Rich Ulrich" wrote in message news:***@4ax.com...

The LRT is an exact test that is available for nested models.

The AICc is an approximation; computational formulas show it
to be a modification of the LRT. It is favored over BIC (or AIC)
because it is pretty robust when you do not have nested models.


Rich Ulrich

--------------------------------------------------------------------------------------------

LRT and variants of AIC have different purposes, and you should choose the
one that most closely matches your own purpose.

The LRT, like significance tests generally, is essentially asymmetric:
it tends to retain the simpler model unless there is strong evidence
that something more complicated is needed.

AIC and variants tend to counter-balance the two opposing effects of (a)
increased inaccuracy through estimating an unnecessary parameter, and (b)
increased accuracy through better modelling by adding an extra parameter.
There is an implicit idea that you will go on to use the selected model with
estimated parameters.

The LRT is not exact, since in this setting it relies on the chi-squared
approximation for its application -- though perhaps "exact" means
well-determined by theory, and anyway the approximation is usually
counted as good. It essentially compares a pair of (nested) models,
whereas AIC (etc.) is often used to choose among a whole range of
models. For the LRT you have to choose a significance level, and this
has a dramatic effect on the outcome of model selection. Each variant
of AIC (usually) has a fixed way of balancing the two effects mentioned
above, so there is no control over the selection corresponding to the
significance level.

It may be possible to find, or set up, a simulation study that shows
how the power of the LRT varies as the significance level and the size
of the model effects change, and how the outcome of the AIC procedures
changes: in each case the simulation study would measure how often a
given model structure is selected.

David Jones
Fijoy Vadakkumpadan
2014-09-08 22:33:18 UTC
Permalink
Thank you, David. Yeah, I agree that AIC offers advantages when comparing more than 2 models (nested or not). Currently, my experiments involve only 2 nested models, and AIC still seems to be an attractive option. I was wondering if it's a generally accepted practice to use AIC even when comparing two nested models.

Thanks,
Fijoy
David Jones
2014-09-09 00:05:01 UTC
Permalink
"Fijoy Vadakkumpadan" wrote in message news:fddf9873-df48-457b-abea-***@googlegroups.com...

Thank you, David. Yeah, I agree that AIC offers advantages when comparing
more than 2 models (nested or not). Currently, my experiments involve only 2
nested models, and AIC still seems to be an attractive option. I was
wondering if it's a generally accepted practice to use AIC even when
comparing two nested models.

Thanks,
Fijoy

================================

That's not quite what I said, or meant to say. I don't see many, if
any, reports of practical statistical analyses, so I can't say whether
AIC is seriously used for the simple case of two nested models, but
there is no reason why not. It must, though, be judged in the context
of how any conclusions are to be used. For the LRT or any significance
test, you are deciding whether a set of variables has a detectable
effect on model outputs, while for AIC it doesn't really matter
externally what variables are included: you are just looking for a
model to use to represent the distributions of future observations.

For a whole collection of (nested) models you could do a sequence of
significance tests, but the overall properties of such a sequence are
difficult to determine theoretically because of the dependencies
between the tests. For ordinary least squares regression this broadly
corresponds to a once-widespread procedure called forward selection,
which is now deprecated. Various books instead recommend
cross-validation, and there seems no strong reason why this can't be
applied to non-nested and non-normal models, using the likelihood
function in place of the sum of squares.

For significance tests there are well-established ideas, based on the
power of the test, for deciding what makes a good test. For model
selection, where it is the usefulness of the model that is of interest
rather than its "truth", it is less clear how to quantify the
effectiveness of a model-selection procedure or of a selected model,
and attempts to do so lead to the different variants of AIC.

David Jones
Fijoy Vadakkumpadan
2014-09-08 22:30:01 UTC
Permalink
Hi Rich,

I came across 2 studies that seem to indicate that AIC is better than an F-test:

1. http://scitation.aip.org/content/aapm/journal/medphys/34/11/10.1118/1.2794176

2. http://www.ncbi.nlm.nih.gov/pubmed/19761098

They are not specifically comparing AIC to the LRT, but since the LRT uses the F-distribution, they are relevant to the question. Have you seen these studies? If so, do you think they provide evidence for the superiority of AIC?

Thanks,
Fijoy
Bruce Weaver
2014-09-09 19:15:34 UTC
Permalink
Post by Fijoy Vadakkumpadan
They are not specifically comparing AIC to LRT, but since LRT uses F-distribution, they are relevant to the question. Have you seen these studies? If yes, do you think that they provide evidence for the superiority of AIC?
The likelihood ratio test I am familiar with uses the chi-square
distribution.
--
Bruce Weaver
***@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."
Rich Ulrich
2014-09-10 06:32:05 UTC
Permalink
On Mon, 8 Sep 2014 15:30:01 -0700 (PDT), Fijoy Vadakkumpadan
Post by Fijoy Vadakkumpadan
Hi Rich,
1. http://scitation.aip.org/content/aapm/journal/medphys/34/11/10.1118/1.2794176
2. http://www.ncbi.nlm.nih.gov/pubmed/19761098
They are not specifically comparing AIC to LRT, but since LRT uses F-distribution, they are relevant to the question. Have you seen these studies? If yes, do you think that they provide evidence for the superiority of AIC?
First, I want to say that my recent knowledge of AIC has been
doubled, at least, by reading the Wikipedia article and David's post,
so I am no expert. If you want to use AIC, I suggest that you
find a good example of its use in your own area. I don't remember
ever seeing an article with AIC in my own part of biostatistics, so
I can't say how it is used, or when.

And I think I would distrust those two articles, based as they are
on the same simulations -- though I judge them only by their
abstracts, so I could be wrong.

From David and from Wikipedia, I gather that the proper use of AIC
is to compare choices that are of equal a-priori relevance and
desirability; it is not a test.

The authors of your articles also throw a bone in the direction of
"choice of models"; but it seems to me that they say "AIC is better"
because AICc favored models with fewer parameters. That ...
(a) would seem to be a bias, rather than an advantage; and
(b) would seem to be a matter of "tuning" relative to their approach
based on an F-test.

Without seeing the articles, I don't *know* what they are comparing.
As Bruce mentions, the LRT is ordinarily referred to the chi-squared
distribution. However, I do get this clue from the Wikipedia article,
followed by my own inspired guessing.

- Near the end of the Wikipedia article on AIC -
"Further comparison of AIC and BIC, in the context of regression, is
given by Yang (2005). In particular, AIC is asymptotically optimal
in selecting the model with the least mean squared error, ..."

Now, in order to select the model with the least mean squared error,
you use the stepwise regression criterion that is equivalent to:
"Enter the new variable if the F-to-enter is greater than 1.0."
Keep in mind that 1.0 is roughly the average value of F for a
random predictor.

Further: the equations, if you can work them out, show that the
AIC criterion (not AICc, the corrected AIC) is closely analogous
to using an F-to-enter of 1.0. Thus, I figure that when the authors
say they are using F, they must be using that criterion of 1.0.

However, if you *prefer* fewer variables, apparently you should
not use a symmetric procedure like AIC. In the (explicit) interest of
parsimony, a default for "stepwise regression" was often taken as
a test of F at 5%, not at 50%.

Getting back to the article:
The authors, however, are testing AICc, which builds in a clear
bias (compared to AIC) against models with more parameters,
especially with small sample Ns (which is what they test).
So: I would not be surprised if their modeling had produced
*exactly* the same choices from AIC and from their F-test method;
in that case, what they demonstrate is the tuning difference between
AIC and AICc.

I suppose that what AICc gives you is a justification, hidden
inside the "black box" of obscurity, for using a criterion that is a
little stiffer than F = 1.0 when deciding to accept a larger model.

... I end with apologies to the authors, for however much I
have mis-called what they were doing.


(David - Does this make sense?)
--
Rich Ulrich
Rich Ulrich
2014-09-10 07:00:24 UTC
Permalink
Near the end of the Wikipedia article on the Akaike Information
Criterion, there is a link to "Model selection (University of Iowa)",
which gets you to what looks, on my brief inspection, like a fine set
of lecture notes.

http://myweb.uiowa.edu/cavaaugh/ms_seminar.html

I scanned the notes on the AIC lecture and the lecture on
regression.
--
Rich Ulrich
David Jones
2014-09-10 09:51:35 UTC
Permalink
"Rich Ulrich" wrote in message news:***@4ax.com...

(David - Does this make sense?)
Rich Ulrich

==========================

It looks OK to me. In fact, something similar is covered in Lecture 15
of the set of lecture notes you linked to in your subsequent post:
http://myweb.uiowa.edu/cavaaugh/ms_seminar.html
Overall these seem very good at covering the background of the main
variants.

David Jones
