Test - Chi-Square Test of Independence
Test for independence between a pair of categorical variables
Tests for independence between a pair of categorical variables. Any non-categorical variables that are supplied will be treated as categorical, that is to say, cases with the same value are treated as being in the same category, and date variables are categorised by period.
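As a rough illustration of what "treated as categorical" means, the base R sketch below converts a numeric variable and a text variable to factors and runs an unweighted chi-square test on their cross-tabulation. The data and the names rating and region are made up for the example, and this is not the exact code run by the feature.

```r
# Hypothetical example data: a numeric rating and a text variable.
rating <- c(1, 2, 2, 3, 1, 3, 2, 1, 3, 2, 1, 2)
region <- c("N", "S", "N", "S", "N", "S", "S", "N", "S", "N", "N", "S")

# Cases with the same value fall into the same category.
dat <- data.frame(var1 = factor(rating), var2 = factor(region))

# Cross-tabulate the two variables and test for independence.
tab <- xtabs(~ var1 + var2, data = dat)
result <- summary(tab)

result$statistic  # chi-square statistic
result$parameter  # degrees of freedom: (3 - 1) * (2 - 1) = 2
result$p.value    # p-value of the test
```

With so few cases the chi-square approximation is unreliable (result$approx.ok reports whether all expected counts are at least 5); the data is kept small only so the table is easy to inspect.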
How to run this test
- In Displayr, go to Insert > More > Test > Chi-Square Test of Independence. In Q, go to Create > Test > Chi-Square Test of Independence
- Specify the variables to use under Inputs > Input Variables
- Adjust the options (noted below)
Both inputs are treated as categorical. If you use numeric, text or date variables, they will be converted to categories based on their values for the purposes of running the test.
Example
An example output is shown below:
Options
INPUTS
- Variable 1 The first variable to cross-tabulate.
- Variable 2 The second variable, which is tested for independence from Variable 1.
- Variable names Display Variable Names in the output, instead of Variable Labels.
- More decimal places Display numeric values with 8 decimal places.
Additional Properties
When using this feature you can obtain additional information that is stored by the R code which produces the output.
- To do so, select Create > R Output.
- In the R CODE, paste: item = YourReferenceName
- Replace YourReferenceName with the name of your item (e.g. chi_square.test). You can find this by selecting the item and then going to Properties > General > Name in the object inspector on the right.
- Below the first line of code, you can paste in snippets from below or type in str(item) to see a list of available information.
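The steps above can be sketched as follows. So that the snippet runs on its own, it builds a hypothetical stand-in list rather than referring to a real output item. The field names statistic, df and p.value are taken from the list assembled in this feature's R code, but the structure of the actual item is an assumption and should be checked with str(item).

```r
# Stand-in for the real output item. In an R Output you would instead write:
#   item = chi_square.test
# The values below are invented for illustration only.
item <- list(statistic = 4.2,
             df = 2,
             p.value = pchisq(4.2, df = 2, lower.tail = FALSE))

str(item)      # list the information stored in the item
names(item)    # names of the available fields
item$p.value   # extract a single value
```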
Acknowledgements
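When a weight is applied, uses the svychisq function from the survey package to conduct the test and reports an F statistic. Without a weight, the chi-square statistic is computed in base R by calling summary on the cross-tabulation produced by xtabs.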
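The unweighted and weighted calculations can be sketched in plain R as follows. This is an illustration under assumptions rather than the feature's own code: the data and weights are simulated, survey::svydesign stands in for the WeightedSurveyDesign helper that the feature uses, and the weighted part only runs if the survey package is installed.

```r
set.seed(1)
n <- 300

# Simulated categorical variables and weights (hypothetical data).
dat <- data.frame(
    var1 = factor(sample(c("A", "B"), n, replace = TRUE)),
    var2 = factor(sample(c("X", "Y", "Z"), n, replace = TRUE)),
    wgt  = runif(n, 0.5, 2)
)

# Unweighted: Pearson chi-square from the cross-tabulation.
unweighted <- summary(xtabs(~ var1 + var2, data = dat))

# Cross-check against chisq.test, which gives the same statistic
# for a table larger than 2 x 2.
check <- chisq.test(table(dat$var1, dat$var2), correct = FALSE)

# Weighted: Rao-Scott test with an F reference distribution.
weighted <- NULL
if (requireNamespace("survey", quietly = TRUE)) {
    design <- survey::svydesign(ids = ~1, weights = ~wgt, data = dat)
    weighted <- survey::svychisq(~ var1 + var2, design, statistic = "F")
}
```

Here unweighted$statistic, unweighted$parameter and unweighted$p.value correspond to the chi-square statistic, degrees of freedom and p-value; for the weighted test the same fields of weighted hold the F statistic, its numerator and denominator degrees of freedom, and the p-value.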
Code
form.setHeading("Chi-Square Test of Independence");
form.dropBox({label: "Variable 1",
              types: ["Variable: Numeric, Categorical, OrderedCategorical, Text, Date, Money"],
              name: "formVariable1", prompt: "Select the Variable containing the first sample"});
form.dropBox({label: "Variable 2",
              types: ["Variable: Numeric, Categorical, OrderedCategorical, Text, Date, Money"],
              name: "formVariable2", prompt: "Select the Variable containing the second sample"});
form.checkBox({label: "Variable names", name: "formNames", default_value: false,
               prompt: "Display names instead of labels"});
form.checkBox({label: "More decimal places", name: "formDecimals", default_value: false,
               prompt: "Display numeric values with eight decimal places"});
library(flipData)
library(flipFormat)
library(flipTransformations)
library(survey)
if (length(formVariable1) != length(formVariable2))
    stop("Variables 1 and 2 have different lengths. Please ensure that the variables are from the same data set or have the same length.")
dat.raw <- ProcessQVariables(data.frame(var1 = formVariable1, var2 = formVariable2, stringsAsFactors=FALSE))
dat <- dat.raw[QFilter, ]
dat$var1 <- factor(dat$var1)
dat$var2 <- factor(dat$var2)
if (is.null(QCalibratedWeight)) {
    # Unweighted: Pearson chi-square from the cross-tabulation
    s <- summary(xtabs(~ var1 + var2, dat))
    statistic.value <- s$statistic
    df <- s$parameter
    p.value <- s$p.value
    statistic.name <- "Chi-square"
} else {
    # Weighted: Rao-Scott test with an F statistic from the survey package
    wgt <- QCalibratedWeight[QFilter]
    design <- WeightedSurveyDesign(dat, wgt)
    test <- tryCatch(svychisq(~ var1 + var2, design, statistic = "F"),
        error = function(e) {
            # Match on the error message text rather than the condition object
            if (grepl("system is computationally singular", conditionMessage(e), fixed = TRUE))
                stop(paste("A weighted chi-square test could not be run using the selected variables.",
                           "Consider merging categories or removing the weight variable."))
            else
                stop(e)
        })
    statistic.value <- test$statistic
    df <- test$parameter[1]
    p.value <- test$p.value
    statistic.name <- "F"
}
decimal.places <- if (formDecimals) 8 else NULL
chi.sq.test <- list(statistic = statistic.value, df = df, p.value = p.value)
SignificanceTest(chi.sq.test, "Chi-Square Test of Independence", dat.raw, filter = QFilter,
                 show.labels = !formNames, decimal.places = decimal.places)