How To Troubleshoot Errors in Weights

When creating a weight, a mismatch between the targets that have been specified and the actual sample sizes in the data set can prevent the weight from being calculated. This page describes the kinds of mismatch that can occur and the steps that can be taken to allow the weight to be calculated.

Problems with weights often occur when the samples in the categories used by the weight are small or empty. Errors encountered when constructing a weight can indicate that there are problems with the sample, or that the weighting scheme is too complicated. The general approach to solving problems with weighting is to simplify the weighting scheme, either by reducing the number of questions being used or by consolidating categories within the input questions.

When creating a weight, you can obtain diagnostic information about it by clicking the Diagnostics Report button at the bottom left of the weight window.

Empty Categories

If a category or combination of categories is empty in the data file, then it is logically impossible for that category to represent more than 0% of the data when weighted. As a result, if you assign a non-zero target to that category, the weight cannot be calculated.

In this situation, you will see a message like:

This message tells us that there are two categories in the data which are empty, Males, Less than 18 and Females, Less than 18, but that these categories have targets assigned to them. Approaches to solving this problem include:

  1. Enter target values of 0 for the empty categories. This will require the other targets to be adjusted appropriately according to the design of the research.
  2. Merge the empty categories with other categories. Merging is done by dragging-and-dropping categories in the table where weight targets are entered.
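
The check behind this error can be sketched in Python. This is a minimal illustration rather than anything Q itself exposes: the function name `empty_categories_with_targets`, the respondent list, the "18 or more" categories, and all target values are hypothetical.

```python
from collections import Counter


def empty_categories_with_targets(categories, targets):
    """Return categories that have a non-zero target but no respondents.

    categories: one category label per respondent (hypothetical input).
    targets: dict mapping category label -> target percentage.
    """
    counts = Counter(categories)
    return [
        cat for cat, target in targets.items()
        if target > 0 and counts.get(cat, 0) == 0
    ]


# Hypothetical sample: nobody in the data is under 18.
respondents = [
    "Males, 18 or more",
    "Females, 18 or more",
    "Females, 18 or more",
    "Males, 18 or more",
]

# Hypothetical targets (percentages) that still allocate 20% to under-18s.
targets = {
    "Males, Less than 18": 10,
    "Females, Less than 18": 10,
    "Males, 18 or more": 40,
    "Females, 18 or more": 40,
}

problems = empty_categories_with_targets(respondents, targets)
print(problems)
```

Any category returned by this check needs either a target of 0 or to be merged with a non-empty category.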

Example

The warning above was generated for the set of targets for Age and Gender shown on the left. An alternative set of targets (on the right) has been obtained by setting the targets for the two empty categories to 0, and adjusting the remaining targets so that they sum to 100%:
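
As a rough sketch of this adjustment, the Python below zeroes the empty categories and rescales the rest proportionally so that they sum to 100%. Proportional rescaling is an assumption made for illustration only; the appropriate adjustment depends on the design of the research. The function name `zero_and_rescale` and the target values are hypothetical and are not the figures from the example above.

```python
def zero_and_rescale(targets, empty):
    """Set targets for empty categories to 0 and rescale the others to sum to 100.

    targets: dict mapping category label -> target percentage.
    empty: collection of category labels that have no respondents.
    Assumption: the remaining targets keep their relative proportions.
    """
    remaining = sum(t for cat, t in targets.items() if cat not in empty)
    if remaining <= 0:
        raise ValueError("No non-empty category has a positive target.")
    scale = 100.0 / remaining
    return {
        cat: 0.0 if cat in empty else t * scale
        for cat, t in targets.items()
    }


# Hypothetical targets (percentages).
original = {
    "Males, Less than 18": 10,
    "Females, Less than 18": 10,
    "Males, 18 or more": 40,
    "Females, 18 or more": 40,
}
adjusted = zero_and_rescale(
    original, empty={"Males, Less than 18", "Females, Less than 18"}
)
print(adjusted)
```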

Rim Weighting does not Converge

When the weighting involves multiple sets of targets (called Weight Sets), Q uses an algorithm called Rim Weighting to estimate the weight for each respondent so that each of the different sets of targets is achieved. Because each set of targets is independent of the others, one set of weight targets can contradict another by predicting two different sample sizes for the same group of respondents. When such a contradiction occurs, it is logically impossible to calculate an appropriate weight.
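
To make the idea concrete, here is a minimal sketch of iterative proportional fitting (raking), the technique commonly meant by rim weighting. Q's actual implementation is not described on this page, so the function `rake`, its parameters, the tolerance, and the data are all assumptions chosen for illustration. Each pass rescales the weights to match one set of targets at a time; if the sets contradict one another, the passes never settle.

```python
def rake(respondents, target_sets, max_iter=100, tol=1e-6):
    """Rake weights so that each variable's weighted shares match its targets.

    respondents: list of dicts, e.g. {"Age": "Young", "Gender": "Male"}.
    target_sets: dict variable -> dict category -> target proportion.
    Assumes every category that occurs in the data has a target.
    Returns (weights, converged).
    """
    weights = [1.0] * len(respondents)
    for _ in range(max_iter):
        max_gap = 0.0
        for var, targets in target_sets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            shares = {cat: 0.0 for cat in targets}
            for w, r in zip(weights, respondents):
                shares[r[var]] += w / total
            # Largest distance from target seen before adjusting this variable.
            gap = max(abs(shares[cat] - t) for cat, t in targets.items())
            max_gap = max(max_gap, gap)
            # Rescale each weight by target share / current share.
            weights = [
                w * targets[r[var]] / shares[r[var]] if shares[r[var]] > 0 else w
                for w, r in zip(weights, respondents)
            ]
        if max_gap < tol:
            return weights, True
    return weights, False


# Hypothetical consistent case: every Age x Gender cell has respondents.
people = (
    [{"Age": "Young", "Gender": "Male"}] * 3
    + [{"Age": "Young", "Gender": "Female"}]
    + [{"Age": "Old", "Gender": "Male"}]
    + [{"Age": "Old", "Gender": "Female"}] * 3
)
weights, ok = rake(people, {
    "Age": {"Young": 0.5, "Old": 0.5},
    "Gender": {"Male": 0.4, "Female": 0.6},
})

# Hypothetical contradictory case: one respondent is the only "Old" person
# and the only "Female", yet the two targets for that person differ.
conflicted = [
    {"Age": "Old", "Gender": "Female"},
    {"Age": "Young", "Gender": "Male"},
    {"Age": "Young", "Gender": "Male"},
]
_, bad_ok = rake(conflicted, {
    "Age": {"Old": 0.2, "Young": 0.8},
    "Gender": {"Female": 0.3, "Male": 0.7},
})
print(ok, bad_ok)
```

In the second case the single respondent's share is pushed to 20% by the Age step and to 30% by the Gender step on every pass, so the `converged` flag stays false.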

In cases of this kind you will see the following message:

Diagnosing and Solving the Problem

If you have used more than two weight sets then the first step to solving the problem is to identify which pair of questions is in conflict. It is possible that three or more questions can be in conflict with one another. The following process will allow you to identify which weight sets are contributing to the conflict:

  1. Use Insert Variables > Weight to create your weight, including as many weight sets in the weight as possible without getting an error in Diagnostics Report.
  2. Save the variable by clicking OK in the Weight Construction dialogue box.
  3. For each weight set:
    1. Right-click the weight variable and select Edit Weight.
    2. Complete the weight, ensuring that all targets are entered for all weight sets.
    3. Remove the current weight set.
    4. Click Diagnostics Report. If an error is still generated then you know that the weight set that you removed is not the source of the conflict in the weight. If an error is not generated then the weight set that has been removed is contributing to the conflict.
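
The leave-one-out logic of this process can be sketched as follows. In Q the steps are carried out by hand in the weight window; the Python below only illustrates the reasoning. The function `conflicting_sets` and the stand-in check `toy_has_error` are hypothetical, with the latter playing the role of clicking Diagnostics Report.

```python
def conflicting_sets(weight_sets, has_error):
    """Return the weight sets whose removal makes the error disappear.

    weight_sets: list of weight set names.
    has_error: callable taking a list of weight set names and returning True
        if a weight built from them fails (stand-in for Diagnostics Report).
    """
    culprits = []
    for name in weight_sets:
        remaining = [s for s in weight_sets if s != name]
        if not has_error(remaining):
            # The error vanished, so this set contributes to the conflict.
            culprits.append(name)
    return culprits


def toy_has_error(sets):
    """Hypothetical check: fails whenever Age and Income are used together."""
    return "Age" in sets and "Income" in sets


print(conflicting_sets(["Age", "Gender", "Income"], toy_has_error))
# ['Age', 'Income']
```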

The final step is to try to identify why the questions identified above are in conflict with the weight targets. There is no general solution, and the problem can be tricky to identify, so some trial and error is required. One approach is to:

  1. Create a cross-tab between each pair of questions identified above, and show the n in the Statistics - Cells.
  2. Examine the sample sizes in the tables and try to identify those which are particularly small (e.g. n = 1 or n = 2) as these are most likely to present problems.
  3. Create your weight again.
  4. In the Weight Construction window, merge the category with the small sample size with another category, enter a target for the combined category, and then click Diagnostics Report. If the same error still occurs then a different combination of categories is required, and the process should be repeated. The choice of category to merge depends both on what makes the most sense from the point of view of the research, and on which combination of categories solves the issue.
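
Outside Q, the search for small cells in a cross-tab can be sketched in Python. The function `small_cells`, the threshold of 2, and the data are hypothetical; the sketch simply counts respondents in each combination of two questions and reports the combinations with very few respondents.

```python
from collections import Counter


def small_cells(rows, var_a, var_b, threshold=2):
    """Return cross-tab cells of var_a by var_b whose n is at most threshold.

    rows: list of dicts, one per respondent.
    Note: combinations with no respondents at all never appear in the counts,
    so this reports only small non-empty cells.
    """
    counts = Counter((r[var_a], r[var_b]) for r in rows)
    return {cell: n for cell, n in counts.items() if n <= threshold}


# Hypothetical data set of 10 respondents.
rows = (
    [{"Age": "18 to 34", "Income": "Less than $40,000"}] * 5
    + [{"Age": "35 to 49", "Income": "Less than $40,000"}] * 4
    + [{"Age": "50 or more", "Income": "$45,001 to $60,000"}]
)

print(small_cells(rows, "Age", "Income"))
```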

Example

The following example illustrates a particularly extreme mismatch between weight sets. Here, a weight is constructed from three weight sets, one for each of Age, Gender, and Income. The weight generates the error message shown above.

The first step is to try to determine which questions are at fault by creating the weight with each of the weight sets left out in turn. Doing this, we find that leaving Gender out of the weight still results in the same error, but removing either Age or Income does not. From this we can conclude that Age and Income are in conflict with one another.

The next step is to inspect Age and Income to determine where the fault lies. Creating a cross-tab and showing Statistics - Cells > n shows us that there is a single respondent who is in the 50 or more age bracket and $45,001 to $60,000 income bracket. This is the source of the contradiction in the targets.

The targets for age predict that this person should account for 20% of the weighted sample, whereas the targets for income predict that this person should account for 30% of the weighted sample. Since these two conditions cannot be true at the same time, there is no way to calculate a weight for these targets. The solution is to merge one of the problem categories with another, so that this respondent is no longer the sole occupant of two categories with different targets. In this case it might be appropriate to merge $45,001 to $60,000 with Less than $40,000, or merge 50 or more with 35 to 49. Both of these options will allow the weight to be computed.
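
The contradiction in this example can also be found mechanically. The sketch below looks for pairs of categories, one from each question, that contain exactly the same respondents but carry different targets. The function `same_members_conflicts` is hypothetical, as are the respondents other than the first, the "More than $60,000" category, and all targets other than the 20% and 30% quoted above.

```python
def same_members_conflicts(rows, var_a, targets_a, var_b, targets_b):
    """Find category pairs with identical respondents but different targets."""
    members_a, members_b = {}, {}
    for i, r in enumerate(rows):
        members_a.setdefault(r[var_a], set()).add(i)
        members_b.setdefault(r[var_b], set()).add(i)
    conflicts = []
    for cat_a, ids_a in members_a.items():
        for cat_b, ids_b in members_b.items():
            # Same group of respondents, two different predicted shares.
            if ids_a == ids_b and targets_a[cat_a] != targets_b[cat_b]:
                conflicts.append((cat_a, cat_b))
    return conflicts


# The first respondent is the sole occupant of both problem categories.
rows = [
    {"Age": "50 or more", "Income": "$45,001 to $60,000"},
    {"Age": "18 to 34", "Income": "Less than $40,000"},
    {"Age": "18 to 34", "Income": "More than $60,000"},
    {"Age": "35 to 49", "Income": "Less than $40,000"},
    {"Age": "35 to 49", "Income": "More than $60,000"},
]
age_targets = {"18 to 34": 40, "35 to 49": 40, "50 or more": 20}
income_targets = {
    "Less than $40,000": 35,
    "$45,001 to $60,000": 30,
    "More than $60,000": 35,
}

conflicts = same_members_conflicts(rows, "Age", age_targets, "Income", income_targets)
print(conflicts)
```

This check catches only the exact situation described here. Conflicts spread across several small cells would not be flagged and still call for the trial-and-error approach.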