Regression Troubleshooting


One or more of the independent variables contains no variation

This error message indicates that a variable contains no variation, which makes it impossible to estimate the regression model. Most commonly, this is caused by dummy coding, with a variable being created for a category containing no respondents. Collapse the offending category in the Outputs Tab to solve the problem. Note that Q automatically filters out respondents who have missing data on any of the variables in the regression, so a category that contains respondents in the full sample can still be empty in the data used for estimation.
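As a rough illustration of the check described above, the following Python sketch drops incomplete cases and then reports any variable left with a single distinct value. The function name `constant_columns` and the data are hypothetical, and this is not how Q performs the check internally.

```python
def constant_columns(rows, columns):
    """Return names of columns with at most one distinct value after listwise deletion."""
    # Keep only complete cases, mimicking the automatic filtering of missing data.
    complete = [r for r in rows if all(r.get(c) is not None for c in columns)]
    return [c for c in columns if len({r[c] for r in complete}) <= 1]


# Hypothetical dummy-coded data: the only "region_west" respondent has a missing age.
rows = [
    {"age": 34, "region_east": 1, "region_west": 0},
    {"age": 51, "region_east": 0, "region_west": 0},
    {"age": None, "region_east": 0, "region_west": 1},  # dropped: missing age
    {"age": 28, "region_east": 1, "region_west": 0},
]
columns = ["age", "region_east", "region_west"]
print(constant_columns(rows, columns))  # ['region_west']
```

Here the category looks populated in the raw data but becomes empty once the incomplete case is removed, which is the situation that calls for collapsing the category.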

All coefficients are NaN or 0

Most likely, there is some kind of problem with your data. For example:

  • The range of one of your independent variables is too large (e.g., in the thousands or more), which causes numerical precision problems. Dividing each such variable by, say, 2 times its standard deviation typically solves this problem and, as an added benefit, gives coefficients that are directly comparable with the coefficients of categorical variables.[1]
  • Perfect multicollinearity. Inspect the VIF columns (see Coefficients and related statistics).
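The rescaling suggested in the first point can be sketched in a few lines of Python. The helper name `scale_by_two_sd` and the income values are made up for illustration, and the sample standard deviation is assumed.

```python
from statistics import stdev


def scale_by_two_sd(values):
    """Divide a numeric variable by twice its sample standard deviation."""
    s = stdev(values)
    if s == 0:
        raise ValueError("variable has no variation, so it cannot be rescaled")
    return [v / (2 * s) for v in values]


income = [32000, 45000, 51000, 78000, 120000]  # hypothetical values in dollars
scaled = scale_by_two_sd(income)

# After rescaling, the variable has a standard deviation of one half.
print(round(stdev(scaled), 6))  # 0.5
```

The rescaled variable would then replace the original one as an independent variable in the regression.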

A coefficient is zero but is shown as being statistically significant

The range of the independent variables is too large (e.g., in the thousands or more), which causes numerical precision problems. Dividing each variable by, say, 2 times its standard deviation typically solves this problem and, as an added benefit, gives coefficients that are directly comparable with the coefficients of categorical variables.[2]
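One way this symptom can arise is that a predictor measured in large units has a very small coefficient, which is displayed as zero at ordinary precision. The Python sketch below uses a hand-rolled simple regression on made-up data (the names `ols_slope`, `income` and `satisfaction` are illustrative only) to show how dividing by 2 standard deviations brings the coefficient back to a readable size.

```python
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


income = [32000, 45000, 51000, 78000, 120000]  # hypothetical predictor
satisfaction = [5.1, 5.9, 6.0, 7.2, 8.8]       # hypothetical outcome

raw = ols_slope(income, satisfaction)
two_sd = 2 * stdev(income)
rescaled = ols_slope([v / two_sd for v in income], satisfaction)

print(f"{raw:.2f}")       # 0.00 at two decimal places
print(f"{rescaled:.2f}")  # same relationship, now on a readable scale
```

Dividing the predictor by a constant multiplies its coefficient by that constant and leaves the fit itself unchanged, so the significance of the relationship is the same in both versions.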

Lots of NaNs in the outputs

Most likely, you have included independent variables that induce perfect multicollinearity. Inspect the VIF columns (see Coefficients and related statistics).
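Perfect multicollinearity means that the columns of the design matrix are linearly dependent, so its rank is lower than its number of columns. The following Python sketch checks this with a small Gaussian elimination routine. The function `matrix_rank`, the tolerance and the data are assumptions for illustration, and the check is separate from the VIF columns Q reports.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small matrix via Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in r] for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Use the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:  # crude absolute tolerance
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical design matrix: an intercept plus a dummy for every one of three
# categories, so the three dummies always sum to the intercept column.
X = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]
print(matrix_rank(X), len(X[0]))  # 3 4: rank deficient

# Dropping one dummy column removes the linear dependence.
X_reduced = [row[:3] for row in X]
print(matrix_rank(X_reduced), len(X_reduced[0]))  # 3 3: full column rank
```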

Implausibly big numbers

If using an Ordered Logit Model, the problem is most likely caused by categories of the dependent variable that contain too few observations. Collapse some of the categories and the problem should be resolved.
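The idea of finding and collapsing sparse categories can be sketched in Python as follows. The helper names, the ratings data and the threshold of 5 observations are all hypothetical, and the threshold is an arbitrary choice rather than a rule from Q.

```python
from collections import Counter


def sparse_categories(values, min_count=5):
    """Return the categories observed fewer than min_count times."""
    counts = Counter(values)
    return sorted(c for c, n in counts.items() if n < min_count)


def collapse(values, mapping):
    """Recode values using mapping, leaving unmapped categories unchanged."""
    return [mapping.get(v, v) for v in values]


# Hypothetical 5-point rating with very few responses at either extreme.
ratings = [1] * 2 + [2] * 14 + [3] * 30 + [4] * 25 + [5] * 1
print(sparse_categories(ratings))  # [1, 5]

# Merge each extreme into its neighbouring category, preserving the order.
merged = collapse(ratings, {1: 2, 5: 4})
print(sparse_categories(merged))  # []
```

Because the dependent variable is ordered, sparse categories are merged with an adjacent category rather than with an arbitrary one.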

Error in incomplete Beta function

Possibly the included independent variables have induced perfect multicollinearity (i.e., they are linearly dependent).

An independent variable has no variation in a category of the dependent question

There are usually three solutions to this problem: collapse the smaller categories of categorical independent questions (see the table provided in the output), collapse categories in the dependent question, or change the type of the dependent question (e.g., from categorical to numeric).
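To see which categories trigger this message, the independent variable can be tabulated within each category of the dependent question. The Python sketch below uses a hypothetical helper `no_variation_within` and made-up data.

```python
from collections import defaultdict


def no_variation_within(dependent, independent):
    """Return dependent categories in which the independent variable takes one value."""
    seen = defaultdict(set)
    for d, x in zip(dependent, independent):
        seen[d].add(x)
    return sorted(d for d, xs in seen.items() if len(xs) <= 1)


# Hypothetical data: preferred brand (dependent) and car ownership (independent).
brand = ["A", "A", "A", "B", "B", "C", "C", "C"]
owns_car = [1, 0, 1, 1, 1, 0, 1, 0]
print(no_variation_within(brand, owns_car))  # ['B']

# Collapsing brand B into brand C restores variation in every category.
brand_collapsed = ["C" if b == "B" else b for b in brand]
print(no_variation_within(brand_collapsed, owns_car))  # []
```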

References

  1. Andrew Gelman (2008), "Scaling regression inputs by dividing by two standard deviations", Statistics in Medicine, 27:2865–2873.
  2. Andrew Gelman (2008), "Scaling regression inputs by dividing by two standard deviations", Statistics in Medicine, 27:2865–2873.