Regression - Diagnostic - Plot - Influence Index extension
Create index plots of studentized residuals, hat values, and Cook's distance
Plots the studentized residuals, hat values, and Cook's distances against observation number for a regression model. On each plot, the five observations with the most extreme values are labelled with their observation numbers.
Example
The example below shows the output from running this diagnostic on a Poisson regression model that predicts days absent from school for a sample of school children in New South Wales, Australia.
Details
As rough guides, in a model with n observations and p estimated coefficients (including the intercept), studentized residuals larger than 2 in absolute value indicate possible outliers, hat values larger than 2*p/n (twice the average hat value) indicate possible high-leverage observations, and Cook's distances larger than 4/(n-p) indicate possibly influential observations.
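The rules of thumb above can be sketched as a small helper. This is only an illustration: the function names influenceCutoffs and flagObservation are hypothetical and are not part of QScript or the car package, the values n = 100 and p = 4 are made up, and comparing the studentized residual in absolute value is an assumption about how the "larger than 2" rule is meant to be applied.

```javascript
// Hypothetical helper (not part of QScript or car): rough cutoffs for the three measures.
// n = number of observations, p = number of coefficients including the intercept.
function influenceCutoffs(n, p) {
    if (!(p > 0 && n > p))
        throw new Error("Need n > p > 0");
    return {
        studentized: 2,     // flag |studentized residual| > 2
        hat: 2 * p / n,     // twice the average hat value, p / n
        cook: 4 / (n - p)   // flag Cook's distance > 4 / (n - p)
    };
}

// Hypothetical helper: compare one observation's diagnostics with the cutoffs.
// obs is an object of the form { studentized: ..., hat: ..., cook: ... }.
function flagObservation(obs, cutoffs) {
    return {
        outlier: Math.abs(obs.studentized) > cutoffs.studentized,
        highLeverage: obs.hat > cutoffs.hat,
        influential: obs.cook > cutoffs.cook
    };
}

// Illustrative values only; they do not come from the example model.
var cutoffs = influenceCutoffs(100, 4);
var flags = flagObservation({ studentized: -2.5, hat: 0.05, cook: 0.1 }, cutoffs);
```

These cutoffs are screening guides rather than formal tests, so flagged observations are candidates for closer inspection, not automatic removal.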
Acknowledgements
Uses the influenceIndexPlot function from the car package.
References
Fox, J., & Weisberg, S. (2011). An R Companion to Applied Regression, Second Edition. Thousand Oaks CA: Sage.
Weisberg, S. (2014). Applied Linear Regression, Fourth Edition. Wiley.
Code
includeWeb("QScript R Output Functions");

var is_displayr = (!!Q.isOnTheWeb && Q.isOnTheWeb());
if (!is_displayr)
{
    if (Q.fileFormatVersion() >= 17.13)
        main();
    else
        alert("Please update Q to use this feature from the extension button, or run it from the menu via Create > Regression > Diagnostic > Plot > Influence Index.");
}
else
{
    main();
}

function main() {
    // The following 2 variables contain information specific to this diagnostic.
    var required_class = "Regression";
    var output_name_suffix = "influence.index";

    var item = checkSelectedItemClass(required_class);
    if (item == null)
        return false;

    var r_name = stringToRName(item.referenceName);

    // The following lines contain the R code to run
    var expression = "car::influenceIndexPlot(" + r_name + ", id = list(method = 'y', n = 5, cex = 1, location = 'lr'), vars = c('Studentized', 'hat', 'Cook'))";
    return createROutput(item, expression, output_name_suffix);
}