AIC Of Regression Is Lower Than For Model With Mediation
Introduction
When evaluating statistical models, one of the most commonly used metrics is the Akaike Information Criterion (AIC). AIC measures the relative quality of a model by balancing goodness of fit against the number of estimated parameters, with lower values preferred. In mediation analysis, however, the results can look counterintuitive: the AIC of a simple regression model is often lower than that of a model with mediation. In this article, we explore the role of AIC in mediation analysis and provide guidance on how to choose between different models.
What is AIC?
AIC estimates the relative quality of a statistical model: among models fitted to the same data, the one with the lowest AIC is preferred. It was introduced by Hirotugu Akaike (1973) and has since become a widely used metric in statistics. AIC is calculated using the following formula:
AIC = -2 * (log-likelihood of the model) + 2 * (number of parameters in the model)
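The log-likelihood is evaluated at the maximum likelihood estimates and measures how probable the observed data are under the fitted model. The number of parameters measures the complexity of the model. Each additional parameter adds 2 to the AIC, so an extra parameter pays off only if it raises the log-likelihood by more than 1.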
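The formula can be checked by hand. The following is a minimal sketch in base R on simulated data; the variable names, coefficient, and sample size are assumptions made for illustration only.

```r
# Simulated data; x and y are illustrative names, not from a real study.
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

fit <- lm(y ~ x)

ll <- logLik(fit)        # maximized log-likelihood of the fitted model
k  <- attr(ll, "df")     # parameters counted: intercept, slope, residual variance

# AIC = -2 * log-likelihood + 2 * number of parameters
aic_manual <- -2 * as.numeric(ll) + 2 * k

aic_manual
AIC(fit)                 # built-in value; should agree with aic_manual
```

Note that for a linear model R counts the residual variance as a parameter, so a regression with one predictor has three parameters rather than two.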
What is Mediation Analysis?
Mediation analysis examines whether the relationship between a predictor variable and an outcome variable is transmitted through one or more mediator variables. A mediator is a variable that is thought to explain the relationship between the predictor and the outcome: the predictor affects the mediator, which in turn affects the outcome. For example, in a study examining the relationship between exercise and weight loss, the mediator might be the number of calories burned during exercise.
AIC in Mediation Analysis
In mediation analysis, AIC can look counterintuitive. AIC penalizes every estimated parameter, and mediation models have more parameters than simple regression models. In addition, a mediation model usually describes more than one dependent variable (the mediator as well as the outcome), so its log-likelihood covers more data than the log-likelihood of a regression for the outcome alone. As a result, the AIC of a simple regression model may be lower than that of a model with mediation, even when the mediator genuinely helps explain the outcome.
Why is the AIC of a Simple Regression Model Lower than that of a Model with Mediation?
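There are two main reasons why the AIC of a simple regression model may be lower than that of a model with mediation. First, the mediation model has more parameters than the simple regression model, and each parameter adds 2 to the AIC. Second, the two AIC values are often not computed on the same footing: the simple regression model has a likelihood for the outcome only, whereas the mediation model has a likelihood for the mediator and the outcome jointly. AIC values are comparable only between models fitted to the same observations of the same dependent variables, so a lower AIC for the regression model does not by itself show that it is the better model.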
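The following is a minimal sketch in base R of how this can arise, using simulated data (variable names, coefficients, and sample size are assumptions for illustration). It treats the mediation model as two linear regressions with independent normal errors, in which case the joint log-likelihood is the sum of the log-likelihoods of the two equations. SEM software may define the likelihood somewhat differently, so the numbers are not meant to reproduce any particular package.

```r
# Simulated mediation structure: x -> m -> y, plus a small direct effect.
set.seed(42)
n <- 200
x <- rnorm(n)
m <- 0.6 * x + rnorm(n)             # mediator depends on x
y <- 0.5 * m + 0.1 * x + rnorm(n)   # outcome depends on m and x

fit_total <- lm(y ~ x)       # simple regression: describes y only
fit_med_m <- lm(m ~ x)       # mediation model, equation for the mediator
fit_med_y <- lm(y ~ x + m)   # mediation model, equation for the outcome

# The two-equation model describes both m and y, so its AIC adds up
# the contributions of both equations (assumption: independent errors).
aic_regression <- AIC(fit_total)
aic_mediation  <- AIC(fit_med_m) + AIC(fit_med_y)

# Like-for-like comparison: same outcome y, with and without the mediator.
aic_y_without_m <- AIC(fit_total)
aic_y_with_m    <- AIC(fit_med_y)

c(regression = aic_regression, mediation = aic_mediation)
c(y_without_m = aic_y_without_m, y_with_m = aic_y_with_m)
```

In this simulation the joint mediation AIC exceeds the regression AIC simply because it also accounts for the mediator, while the like-for-like comparison for y favors the model that includes the mediator.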
How to Choose Between Different Models
So, how do you choose between different models when the AIC is not providing clear guidance? Here are a few tips:
- Use the AIC as a guide, but not as the sole criterion: While the AIC is a useful metric for comparing models, it should not be the only basis for choosing between them. Other criteria, such as the Bayesian Information Criterion (BIC), and the research question itself should also inform the choice.
- Consider the complexity of the models: A mediation model estimates more parameters than a simple regression model. Ask whether the extra parameters are justified by theory and by a clear improvement in fit.
- Use cross-validation: Cross-validation involves splitting the data into training and testing sets, fitting the model on the training set, and evaluating its predictions on the testing set. This gives a direct estimate of how well each model predicts new data.
- Apply model selection criteria consistently: Criteria such as the AIC or BIC should be compared only between models fitted to the same observations and the same dependent variables. The log-likelihood alone is not sufficient, because for nested models the larger model always has a log-likelihood at least as high as the smaller one.
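The cross-validation tip above can be sketched in base R as follows. The data are simulated and the helper function cv_mse is a hypothetical name introduced here for illustration; it assumes the outcome column is called y.

```r
# Simulated data with a mediator m between x and y (illustrative values).
set.seed(123)
n <- 500
dat <- data.frame(x = rnorm(n))
dat$m <- 0.6 * dat$x + rnorm(n)
dat$y <- 0.5 * dat$m + 0.1 * dat$x + rnorm(n)

# k-fold cross-validated mean squared prediction error for the outcome y.
# Assumption: the response column in `data` is named y.
cv_mse <- function(formula, data, k = 5) {
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))  # random fold labels
  errs <- numeric(k)
  for (i in seq_len(k)) {
    train <- data[folds != i, ]
    test  <- data[folds == i, ]
    fit   <- lm(formula, data = train)        # fit on the training folds
    pred  <- predict(fit, newdata = test)     # predict the held-out fold
    errs[i] <- mean((test$y - pred)^2)
  }
  mean(errs)
}

cv_without_m <- cv_mse(y ~ x, dat)
cv_with_m    <- cv_mse(y ~ x + m, dat)

c(without_m = cv_without_m, with_m = cv_with_m)
```

Both models are evaluated on the same outcome, so the two error estimates are directly comparable, which is not guaranteed for AIC values taken from models with different dependent variables.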
Example of AIC in Mediation Analysis
Let's consider a hypothetical example of AIC in mediation analysis. Suppose we are interested in the relationship between exercise and weight loss, and in whether a mediator variable explains it. We fit a simple regression model and a mediation model with one mediator variable. The results are as follows:
| Model | AIC |
|---|---|
| Simple Regression | 100 |
| Mediation Model | 120 |
In this example, the AIC of the simple regression model is 20 points lower than that of the mediation model. Taken at face value, that would favor the simple regression model. However, if the likelihood of the mediation model also covers the mediator, the two values are not directly comparable, and the mediation model may still describe the outcome better.
Conclusion
In conclusion, the AIC can be counterintuitive in mediation analysis, with the AIC of a simple regression model being lower than that of a model with mediation. This happens because the mediation model estimates more parameters and, in most cases, because its likelihood covers the mediator as well as the outcome. By comparing models on the same dependent variables, considering their complexity, and using cross-validation alongside criteria such as AIC and BIC, it is possible to choose between models when the raw AIC values do not provide clear guidance. Ultimately, the choice of model should also depend on the research question.
References
- Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Proceedings of the Second International Symposium on Information Theory (pp. 267-281).
- Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach. Springer.
- Rucker, D. D., Preacher, K. J., & Hayes, A. F. (2011). Addressing centered variable technique in structural equation modeling: A review and evaluation of alternatives. Psychological Methods, 16(4), 406-425.
Code
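Here is an example of how to fit both models and obtain their AIC values in R using the lavaan package. The data frame dat, with columns x, m, and y, is a placeholder for your own data:
library(lavaan)

# dat is a placeholder data frame with columns x, m, and y
model1 <- 'y ~ x'        # simple regression of the outcome on the predictor
model2 <- '
  m ~ a*x                # path a: predictor to mediator
  y ~ b*m + cp*x         # path b and the direct effect of x
'
fit1 <- sem(model1, data = dat)   # models must be fitted before calling AIC()
fit2 <- sem(model2, data = dat)
AIC(fit1)
AIC(fit2)
Note: This is example code and may need to be modified to fit your specific use case. Because model1 contains only x and y while model2 also contains m, the two AIC values are based on different sets of variables and should not be compared at face value.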
Introduction
In our previous article, we explored the role of AIC in mediation analysis, and provided guidance on how to choose between different models. However, we also received many questions from readers who were struggling to understand the concept of AIC in mediation analysis. In this article, we will answer some of the most frequently asked questions about AIC in mediation analysis.
Q: What is the difference between AIC and BIC?
A: AIC and BIC both evaluate the relative quality of a statistical model, and for both, lower values are preferred. Both penalize models with more parameters, but the penalties differ: AIC adds 2 per parameter, whereas BIC adds the natural logarithm of the sample size per parameter. BIC therefore penalizes additional parameters more heavily than AIC once the sample contains eight or more observations, and it tends to select simpler models.
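Both criteria can be computed from the same log-likelihood; only the penalty term differs. A minimal sketch in base R on simulated data (names and values are assumptions for illustration):

```r
# Simulated data; x, m, and y are illustrative names.
set.seed(7)
n <- 150
x <- rnorm(n)
m <- 0.6 * x + rnorm(n)
y <- 0.5 * m + 0.1 * x + rnorm(n)

fit <- lm(y ~ x + m)
ll  <- as.numeric(logLik(fit))
k   <- attr(logLik(fit), "df")   # intercept, two slopes, residual variance

aic_manual <- -2 * ll + 2 * k        # AIC penalty: 2 per parameter
bic_manual <- -2 * ll + log(n) * k   # BIC penalty: log(n) per parameter

c(AIC = aic_manual, BIC = bic_manual)
c(AIC = AIC(fit), BIC = BIC(fit))    # built-in values for comparison
```

With n = 150, log(n) is about 5, so each parameter costs roughly two and a half times as much under BIC as under AIC.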
Q: Why is the AIC of a simple regression model lower than that of a model with mediation?
A: There are two main reasons. First, the mediation model has more parameters than the simple regression model, and each parameter adds 2 to the AIC. Second, the mediation model usually has a likelihood for the mediator and the outcome jointly, whereas the simple regression model has a likelihood for the outcome only, so the two AIC values are not measured on the same footing.
Q: How do I choose between different models when the AIC is not providing clear guidance?
A: When the AIC is not providing clear guidance, first check that the models are fitted to the same observations and the same dependent variables. It is then often helpful to consider other criteria, such as the BIC, and to use cross-validation to evaluate how well each model predicts new data. The research question and the theory behind the mediator should also inform the choice.
Q: Can I use AIC to compare models with different numbers of parameters?
A: Yes. AIC is designed for this: the term 2 * (number of parameters) puts models of different sizes on a common scale. A model with more parameters will have a higher AIC unless the extra parameters improve the log-likelihood enough to offset the penalty, so a larger model can fit the data more closely and still have the higher AIC.
Q: How do I interpret the AIC value?
A: The AIC has no meaningful absolute scale; only differences between the AIC values of models fitted to the same data are interpretable. The model with the lowest AIC is preferred, and the size of the difference indicates how strongly. A common rule of thumb (Burnham & Anderson, 2002) treats differences below about 2 as weak evidence and differences above about 10 as strong evidence against the model with the higher AIC.
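One way to read AIC values side by side is to look at each model's difference from the lowest AIC in the set. The sketch below uses base R on simulated data; the candidate models and variable names are assumptions chosen for illustration, and all models share the same outcome and the same observations.

```r
# Simulated data; x, m, and y are illustrative names.
set.seed(11)
n <- 200
x <- rnorm(n)
m <- 0.6 * x + rnorm(n)
y <- 0.5 * m + 0.1 * x + rnorm(n)

# Candidate models for the same outcome y on the same observations.
fits <- list(
  x_only  = lm(y ~ x),
  m_only  = lm(y ~ m),
  x_and_m = lm(y ~ x + m)
)

aic   <- sapply(fits, AIC)   # named vector of AIC values
delta <- aic - min(aic)      # difference from the best model; 0 for the lowest AIC

round(delta, 2)
```

The absolute AIC values would change with the sample size or the scale of y, whereas the differences are what carry the comparison.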
Q: Can I use AIC to compare models with different types of variables?
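A: It depends on which variables differ. Models with different predictors (continuous, categorical, or a mix) can be compared with AIC, provided they are fitted to the same observations. Models with different dependent variables cannot: for example, a model for y and a model for log(y), or a model for the outcome alone and a model for the outcome and mediator jointly, have likelihoods on different scales, so their AIC values are not directly comparable.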
Q: How do I calculate the AIC in R?
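A: You can calculate the AIC in R using the AIC() function in the stats package, which is loaded by default. For example, with a placeholder data frame dat containing columns x and y:
# dat is a placeholder data frame with columns x and y
model1 <- lm(y ~ x, data = dat)
AIC(model1)
Note: This is example code and may need to be modified to fit your specific use case.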
Q: Can I use AIC to compare models with different sample sizes?
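A: No, not directly. The log-likelihood is a sum over observations, so the AIC typically grows with the sample size, and a model fitted to fewer observations will usually have a lower AIC regardless of how well it fits. AIC values should be compared only between models fitted to exactly the same observations. This matters in practice when the mediator has missing values, because rows dropped from the mediation model but kept in the simple regression make the two AIC values incomparable.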
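To see how strongly the number of observations drives the AIC value, the same model can be fitted to all rows and to a subset. This is a minimal sketch in base R on simulated data (names, coefficient, and sample sizes are assumptions for illustration):

```r
# Simulated data; dat, x, and y are illustrative names.
set.seed(3)
n <- 400
dat <- data.frame(x = rnorm(n))
dat$y <- 0.5 * dat$x + rnorm(n)

fit_full <- lm(y ~ x, data = dat)            # all 400 rows
fit_half <- lm(y ~ x, data = dat[1:200, ])   # first 200 rows only

c(n_full = nobs(fit_full), n_half = nobs(fit_half))
c(aic_full = AIC(fit_full), aic_half = AIC(fit_half))
```

The model and the data-generating process are identical in both fits; the lower AIC for the subset reflects only that fewer observations enter the likelihood.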
Conclusion
In conclusion, AIC is a useful metric for comparing statistical models, but it is meaningful only for models fitted to the same observations and the same dependent variables, and it should be used alongside other criteria such as the BIC. By comparing like with like, considering the complexity of the models, and using cross-validation, it is possible to choose between different models when the AIC is not providing clear guidance.