Creating Binary Variables

Binary variables are variables that take only two values, such as Male or Female, True or False, or Yes or No. While many variables and questions are naturally binary, it is often useful to construct binary variables from other types of data, for example by splitting age into two groups: less than 35, and 35 or more. Constructing binary variables is also known as quantizing or dichotomizing.
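
The age example can be sketched outside Q in a few lines of Python. This is only an illustration of dichotomizing, not how Q does it internally; the function name `dichotomize` and the sample ages are hypothetical.

```python
def dichotomize(values, threshold):
    """Return 1 for values at or above the threshold, 0 for values below it.

    Missing values (None) are passed through unchanged.
    """
    return [None if v is None else int(v >= threshold) for v in values]


# Hypothetical ages for five respondents; None marks a missing answer.
ages = [22, 35, 41, None, 34]

# 0 = less than 35, 1 = 35 or more
age_35_or_more = dichotomize(ages, 35)
print(age_35_or_more)  # [0, 1, 1, None, 0]
```

Note that the boundary value (35) falls in the "35 or more" group, matching the grouping described above.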

The uses of binary variables

Filters

When a binary variable is used as a filter, a value of 1 indicates that a respondent is included in the filter and a value of 0 indicates that the respondent is excluded.
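
As a rough sketch of the idea in Python (the variable names and data are hypothetical, and this is not Q's own filtering code), applying a filter amounts to keeping only the respondents whose filter value is 1:

```python
# Hypothetical data: one satisfaction rating per respondent.
ratings = [7, 3, 9, 5, 8]

# Binary filter variable: 1 = include the respondent, 0 = exclude.
under_35 = [1, 0, 1, 1, 0]

# Keep only the ratings of respondents included by the filter.
filtered_ratings = [r for r, keep in zip(ratings, under_35) if keep == 1]
filtered_mean = sum(filtered_ratings) / len(filtered_ratings)

print(filtered_ratings)  # [7, 9, 5]
print(filtered_mean)     # 7.0
```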

As intermediate variables

Often it is useful to construct binary variables for use in creating other variables. For example, if creating a new segmentation variable, the first step may be to create multiple binary variables, each representing a single segment, and then convert these into a Pick One question using Insert Ready-Made Formula(s) Menu > Pick Any -> Pick One.
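
The logic of collapsing several binary segment variables into one categorical variable can be sketched in Python as follows. This is an assumption-laden illustration rather than the behavior of the ready-made formula: the function `combine_segments`, the segment names, and the rule that the first matching segment wins are all hypothetical.

```python
def combine_segments(flags_by_segment):
    """Collapse several 0/1 segment variables into one categorical variable.

    Each respondent gets the name of the first segment whose flag is 1,
    or None if no segment applies (assumed tie-breaking rule).
    """
    names = list(flags_by_segment)
    n_respondents = len(flags_by_segment[names[0]])
    combined = []
    for i in range(n_respondents):
        label = None
        for name in names:
            if flags_by_segment[name][i] == 1:
                label = name
                break
        combined.append(label)
    return combined


# Hypothetical binary variables, one per segment, for four respondents.
segment_flags = {
    "Young urban": [1, 0, 0, 0],
    "Families":    [0, 1, 0, 0],
    "Retirees":    [0, 0, 1, 0],
}

segment = combine_segments(segment_flags)
print(segment)  # ['Young urban', 'Families', 'Retirees', None]
```

If the segments are meant to be mutually exclusive, it is worth checking that no respondent has a 1 in more than one binary variable before combining them.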

For use in statistical analysis

To most people, averages and percentages are quite different concepts: averages apply to numeric data (e.g., number of pizzas eaten in a week) and percentages apply to categories (e.g., favorite brand of pizza). From a computational perspective, however, averages and proportions are very closely related, and this relationship can be exploited in Q to save time. If you have used SPSS, you probably already understand the basic principles demonstrated in this section; if not, they may seem a bit strange at first.

If you construct binary variables by recoding or constructing numeric values so that they take only the values 0, 1 and NaN, any computed averages will also be proportions. If, for example, you have a sample with 56% males, and you recode the gender variable so that males have a value of 1 and females a value of 0, and change its Question Type to either Number or Number – Multi, the average will be 0.56. The main benefit of binary variable “maths” is that, whereas a variable that Q knows is binary will always have a NET, a numeric variable instead has a SUM. If, for example, the question measured the brands that consumers would consider buying, the SUM would measure the size of the consideration set (whereas with a traditional binary variable, the NET would indicate the proportion of the sample that would consider one or more brands).
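
The arithmetic can be checked with a small Python sketch. The data below are hypothetical, and the `net` calculation is only an approximation of what a NET represents (a respondent counts if any of the binary variables is 1), not Q's implementation.

```python
# Hypothetical sample of 25 respondents: 14 males (1) and 11 females (0).
gender = [1] * 14 + [0] * 11
average = sum(gender) / len(gender)
print(average)  # 0.56, i.e. the proportion of males

# Hypothetical brand consideration data: one row per respondent,
# one 0/1 column per brand.
consider = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]

# SUM per respondent: the size of the consideration set.
set_sizes = [sum(row) for row in consider]
average_set_size = sum(set_sizes) / len(set_sizes)

# NET-style calculation: 1 if the respondent considers one or more brands.
net = [int(any(row)) for row in consider]
net_proportion = sum(net) / len(net)

print(set_sizes)         # [2, 0, 3, 1]
print(average_set_size)  # 1.5
print(net_proportion)    # 0.75
```

The same 0/1 data therefore answer two different questions: the SUM gives how many brands are considered on average, while the NET gives what share of the sample considers any brand at all.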

Why does this work? Because any binary variable is, by definition, implicitly also a Numeric Variable. That is, Numeric Variables are variables that can take any numeric value; binary variables take only the values 0 and 1, and are thus Numeric Variables.

Ways of creating binary variables

Related Online Training modules: Complicated Filters and Simple Filters.
Generally it is best to access online training from within Q by selecting Help > Online Training.

Any of the ways for creating numeric variables can be used to create binary variables. However, the main approaches are:

Creating a Binary - Complicated filter

Right-click on the Variables and Questions tab and select Insert Variable(s) and then Binary - Complicated Filter.

Creating filters

Any Filters that are created (from a cell in a table, for example) are binary variables.

Creating a binary question by changing Question Type

When you use Set Question to create a Pick Any or Pick Any – Grid question, each of the variables in the question becomes binary.

Recoding

Binary variables can be constructed by editing the Values in the Value Attributes so that they all take only two values (generally, 0 and 1 are most appropriate).

Creating banner questions

Banner questions are intrinsically binary (i.e., they have the Question Type of Pick Any).

Missing values in binary variables

In theory, binary variables should only have two values. In practice, it is often useful if they can also have missing values, in which case the Binary - Complicated Filter and the methods based around creating Filters tend not to be useful.
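
A short Python sketch shows why keeping missing values can matter. It assumes, purely for illustration, that a strictly two-valued variable would place respondents with missing data in the 0 group; the variable names and data are hypothetical.

```python
import math


def mean_of_valid(values):
    """Average of the non-missing (non-NaN) values."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid)


# Hypothetical binary variable with two missing responses (NaN).
would_recommend = [1, 0, math.nan, 1, math.nan]

# Keeping missing values: the proportion is based on the 3 valid responses.
proportion_with_missing = mean_of_valid(would_recommend)

# Assumed two-valued version: missing respondents are counted as 0.
two_valued = [0 if math.isnan(v) else v for v in would_recommend]
proportion_two_valued = sum(two_valued) / len(two_valued)

print(round(proportion_with_missing, 3))  # 0.667
print(proportion_two_valued)              # 0.4
```

Under that assumption, the two-valued version understates the proportion, because respondents who were never asked are treated the same as those who answered "no".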