Regression - Diagnostic - Plot - Influence Index extension



Create index plots of studentized residuals, hat values, and Cook's distance

Charts the studentized residuals, hat values, and Cook's distances for the observations in a regression model. The observation numbers of the five highest values on each of the measures are charted.
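To make "the five highest values" concrete, the selection can be sketched in plain JavaScript. This is only an illustration: topObservations is a hypothetical helper, the Cook's distances are made-up numbers, and values are ranked by raw size, which may differ from the rule the car package applies internally.

```javascript
// Hypothetical helper (not part of QScript or car): returns the 1-based
// observation numbers of the `count` largest entries in `values`.
function topObservations(values, count) {
    return values
        .map(function (value, i) { return { observation: i + 1, value: value }; })
        .sort(function (a, b) { return b.value - a.value; }) // largest first
        .slice(0, count)
        .map(function (entry) { return entry.observation; });
}

// Made-up Cook's distances for seven observations.
var cooks = [0.01, 0.30, 0.02, 0.12, 0.005, 0.08, 0.25];
var labelled = topObservations(cooks, 5);
```

With these illustrative numbers, observations 2 and 7 rank first and second, so they would be the first to have their numbers shown on the Cook's distance panel.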

Example

The example below shows the output from running this diagnostic on a Poisson regression model used to predict days absent from school for a sample of school children in New South Wales, Australia.

Details

As a rough guide, in a model with n observations and p predictors (including the intercept), studentized residuals larger than 2 in absolute value indicate possible outliers, hat values larger than 2*p/n indicate possible high-leverage observations, and Cook's distances larger than 4/(n-p) indicate possible highly influential observations.
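The rules of thumb above can be turned into numbers with a few lines of JavaScript. This is a minimal sketch: influenceThresholds and flagObservation are hypothetical helpers rather than part of QScript or the car package, and n = 100, p = 4 and the observation values are illustrative only.

```javascript
// Hypothetical helper: rule-of-thumb cut-offs for a model with n observations
// and p predictors (counting the intercept).
function influenceThresholds(n, p) {
    if (!Number.isInteger(n) || !Number.isInteger(p) || p < 1 || n <= p)
        throw new RangeError("Expected integers with n > p >= 1");
    return {
        studentized: 2,    // compared against |studentized residual|
        hat: 2 * p / n,    // leverage cut-off
        cook: 4 / (n - p)  // Cook's distance cut-off
    };
}

// Hypothetical helper: applies the cut-offs to one observation's diagnostics.
function flagObservation(studentized, hat, cook, cutoffs) {
    return {
        outlier: Math.abs(studentized) > cutoffs.studentized,
        highLeverage: hat > cutoffs.hat,
        influential: cook > cutoffs.cook
    };
}

// Illustrative values: 100 observations, 4 coefficients.
var cutoffs = influenceThresholds(100, 4);
var flags = flagObservation(-2.4, 0.05, 0.06, cutoffs);
```

Here the hat cut-off is 2*4/100 = 0.08 and the Cook's distance cut-off is 4/96, roughly 0.042, so the illustrative observation is flagged as a possible outlier and as possibly influential, but not as high leverage. These cut-offs are screening aids, not formal tests.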

Acknowledgements

Uses the influenceIndexPlot function from the car package.

References

Fox, J., & Weisberg, S. (2011). An R Companion to Applied Regression, Second Edition. Thousand Oaks CA: Sage.

Weisberg, S. (2014). Applied Linear Regression, Fourth Edition. Wiley.

Code

includeWeb("QScript R Output Functions");

var is_displayr = (!!Q.isOnTheWeb && Q.isOnTheWeb());
if (!is_displayr)
{
    if (Q.fileFormatVersion() >= 17.13)
        main();
    else
        alert("Please update Q to use this feature from the extension button, or run it from the menu via Create > Regression > Diagnostic > Plot > Influence Index.");
}
else
{
    main();
}

function main() {

    // The following 2 variables contain information specific to this diagnostic.
    var required_class = "Regression";
    var output_name_suffix = "influence.index";
    
    var item = checkSelectedItemClass(required_class);
    if (item == null)
        return false;
    var r_name = stringToRName(item.referenceName);

    // The R expression to run: an index plot of the three influence measures,
    // labelling the five most extreme observations on each panel.
    var expression = "car::influenceIndexPlot(" + r_name + ", id = list(method = 'y', n = 5, cex = 1, location = 'lr'), vars = c('Studentized', 'hat', 'Cook'))";

    return createROutput(item, expression, output_name_suffix);
}