Advanced Data Tidying

The steps described in Basic Workflow For Checking and Cleaning a Project and Constructing Variables to Make Analysis Easy are sufficient preparation for most projects. However, where the data is to be used by people who are relatively inexperienced in data analysis, or who are under extreme time pressure, it can be useful to tidy your data further.

Which questions go where

Overall structure

Where the users are very familiar with the questionnaire and its structure, it is usually best to have the data file reflect the order of the questionnaire. In other situations, the following structures can be better:

Ordering data according to how often it will be used, with the most regularly used data at the top (e.g., demographics and segments).
General-to-specific. For instance, category questions could be listed first, then brand questions. And, in the case of trackers, you may choose to have questions that are current and consistently asked at the top and historic or ad-hoc questions at the bottom.
WHO, WHAT, WHERE, WHEN and WHY.
SITUATION (when, who with and other aspects of context), BEHAVIOR (i.e., action), and PERSON (personality, values, demographics, etc.).
INFORMATION SEARCH, AWARENESS, CONSIDERATION, TRIAL, USAGE FREQUENCY and SATISFACTION.
DEMOGRAPHICS, MEDIA, ATTITUDES, CATEGORY BEHAVIOUR, BRAND BEHAVIOUR.

Data can be re-ordered on the Variables and Questions tab by dragging-and-dropping and via the Variables and Questions Tab toolbar buttons.

Standard analysis variables

Any standard analysis variables should be included in the data file or created. These will typically vary by client and industry. For example:

In packaged goods and financial services studies "family lifestage" is usually relevant.
In media studies Age-by-Gender is often valuable.
Medical studies typically create variables on patient attributes.
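
As an illustration of what a standard analysis variable such as Age-by-Gender involves, the following is a minimal Python sketch. It is not how Q constructs variables; the function name, the category labels and the rule that a missing input gives a missing result are all assumptions made for the example.

```python
def age_by_gender(age_band, gender):
    """Combine an age band and a gender into one category label.

    Hypothetical helper: returns None (missing) if either input is missing.
    """
    if age_band is None or gender is None:
        return None
    return f"{gender} {age_band}"


# Made-up respondents, purely for illustration.
respondents = [
    {"age_band": "18-34", "gender": "Female"},
    {"age_band": "35-54", "gender": "Male"},
    {"age_band": None, "gender": "Male"},
]

combined = [age_by_gender(r["age_band"], r["gender"]) for r in respondents]
```

The combined variable can then be used like any other single-response question, for example as the columns of a crosstab.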

Hiding uninteresting data

See Hiding irrelevant data

Creating section headings in the data

Although Q does not allow you to create folders of variables, you can get a similar effect by inserting variables as section headings. Sections can be created in Q by inserting a new Binary Variable and giving it a distinctive design to separate blocks of questions in your data. For instance, you may make it indented and in capitals, such as NEEDS AND WANTS, and have it show the proportion of people who completed the section, with the label showing a description of the sample.

Tidying questions

Names and labels

Often the names shown in the Question and Label fields of the Variables and Questions tab are messy, containing strange programming characters and truncated question wordings. It is generally a good idea to:

Tidy them.
Abbreviate them, so that when they appear in menus and exports they are easy-to-read.
If the questionnaire has been ordered by question number, include the question number in the name of the question (e.g., Q1. Age). Note that you can include the full wording of the question in the footer (see below).

The following can be useful ways of quickly tidying up names and labels:

Ensuring that they are created in a neat and organized way in the original data file (e.g., see SPSS Data File Specifications).
Modifying labels using Find/Replace, which supports wildcards (see Find Replace).
Copying and pasting the Label column into Excel, modifying in Excel, and pasting back into Q again (by right-clicking the first variable and selecting Paste Labels).
Preliminary Project Setup - Search for Improved Question Names in Data Labels
Preliminary Project Setup - Remove Truncated Text from Variable Labels
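
For readers who prefer to tidy labels outside Q (for example, before pasting them back in), the kind of clean-up described above can be sketched in Python. This is only an illustration under stated assumptions: the patterns treated as "programming characters" (HTML-style tags and `${...}` piping placeholders) and the function name `tidy_label` are hypothetical, and real data files may need different rules.

```python
import re


def tidy_label(label):
    """Remove common programming debris from a label and normalize spacing."""
    # Assumption: HTML-style tags such as <b> or <br/> are debris.
    text = re.sub(r"<[^>]+>", " ", label)
    # Assumption: ${...} piping placeholders are debris.
    text = re.sub(r"\$\{[^}]*\}", " ", text)
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


def with_question_number(number, label):
    """Prefix a tidied label with its question number, e.g. 'Q1. Age'."""
    return f"{number}. {tidy_label(label)}"
```

For instance, `with_question_number("Q1", "<b>Age</b>  ")` gives a label in the Q1. Age style suggested above.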

Sorting categories within a question

Sorting can be done manually, by dragging-and-dropping, or automatically. The automatic options can be found by typing the word sorting into the Search features and data box in the top right of the Q window. The options are:

Select Sorting and Reordering - Sort from Highest to Lowest (Does Not Update When Data Changes) to sort all questions in the project once according to the data in their SUMMARY tables.
Select Sorting and Reordering - Sort Rows (Automatically Updates when Data Changes) to apply a table Rule to any selected tables which sorts them according to the results currently shown in the table.
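
The practical difference between the two options shows up when the data changes. The Python sketch below is a conceptual illustration only, not Q's implementation: the function names and the counts are made up. A one-off sort fixes the category order from the data available at the time, whereas a rule re-sorts whenever it is applied.

```python
def order_once(counts):
    """Return category labels sorted from highest to lowest count (one-off)."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked]


def apply_fixed_order(order, counts):
    """Show new counts using an order that was fixed earlier."""
    return [(label, counts[label]) for label in order]


def apply_sort_rule(counts):
    """Re-sort from highest to lowest every time the table is computed."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counts for two waves of a tracker.
wave1 = {"Brand A": 50, "Brand B": 30, "Brand C": 20}
wave2 = {"Brand A": 25, "Brand B": 45, "Brand C": 30}

fixed_order = order_once(wave1)                    # decided using wave 1
one_off = apply_fixed_order(fixed_order, wave2)    # order no longer matches the data
with_rule = apply_sort_rule(wave2)                 # order follows the new data
```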

Merging together small categories

Merge together categories with small counts (e.g., collapsing age categories and brands with less than 2% market share). This is often best done by:

Selecting Automate > Browse Online Library > Preliminary Project Setup > Create Tables for Data Checking. This creates tables containing data with small counts.
Merging together categories by dragging-and-dropping.
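
The logic of collapsing small categories can be sketched in Python as follows. This is an illustrative sketch rather than anything Q does internally: the 2% threshold comes from the example above, while the function name, the "Other" label and the counts are assumptions.

```python
def merge_small_categories(counts, threshold=0.02, merged_label="Other"):
    """Merge categories whose share of the total is below the threshold."""
    total = sum(counts.values())
    merged = {}
    small_total = 0
    for label, count in counts.items():
        if total > 0 and count / total < threshold:
            small_total += count      # too small to report on its own
        else:
            merged[label] = count     # large enough to keep as-is
    if small_total > 0:
        merged[merged_label] = merged.get(merged_label, 0) + small_total
    return merged


# Hypothetical brand counts: Brand D (1%) and Brand E (0.5%) fall below 2%.
brands = {"Brand A": 500, "Brand B": 300, "Brand C": 185, "Brand D": 10, "Brand E": 5}
collapsed = merge_small_categories(brands)
```

In practice the choice of what to merge is a judgment call (for example, adjacent age bands are usually merged with each other rather than into a catch-all), which is why dragging-and-dropping is often the better tool.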

Removing irrelevant SUM and NET categories

Sometimes the SUM and NET categories are unhelpful, for example when a Number - Multi question is used to represent rating scales. They can be removed by right-clicking on them and selecting Hide.

Creating a report "shell"

It is sometimes useful when setting up a project to create the "shell" of a report, which users can then modify as required.

Creating tables

The Report tree in Q is a useful way of setting out the most important findings in the data, or for providing an overview against key groups such as segments, countries or targets. Depending on the user, this will either be:

A set of summary tables or charts that can provide a starting point for the user in exploring the project.
A set of crosstabs with all the tables crossed by a few standard questions.

Although there are lots of tools within Q for quickly creating a number of overview tables, the most straightforward approach may be to use one of the following scripts:

Select Automate > Browse Online Library.
Select and run whichever seems most appropriate of:

It is often useful to create Groups of tables in the Report tree to make it easier for users to navigate.

Folders

Tables and charts can be grouped into folders. The folders are created by right-clicking on the Report and selecting Add Folder.

Filters

Quickly create lots of filters using Automate > Browse Online Library > Filtering > Filters from Selected Data. See Filters for a description of other ways to create filters.

Where a question has only been asked of a subset of the sample, it can be useful to create a relevant footer and apply it to the appropriate tables.

Footers

Footers can be customized. In most cases, this is best done using Table Options. However, if adding footnotes containing question wordings, this is best done using Automate > Browse Online Library > Modify Footers > Description of Selected Data (e.g., Question Name, Skips, Filtering).

Changing the Appearance of Charts and Tables

See Creating And Modifying Tables.

Sample size warnings and automated data hiding

There are a variety of QScripts and Rules that can be used to give appropriate warnings and to hide data. To apply these:

Type the words sample size into the Search features and data box in the top right of the Q window
Select the desired option from the QScripts and Rules section of the results. For example:

Transposing tables

Sometimes tables of Pick One - Multi and grid questions are easier to read if transposed (to do this, right-click on a column or row and select Swap Rows and Columns).

Statistics

Statistics can be placed on multiple tables at one time by either:

Multi-selecting lots of tables and using whichever is appropriate of Statistics - Cells, Statistics - Right and Statistics - Below.
Rules (e.g., Modifying The Whole Table or Plot - Always Show Sample Size).

With tables involving Pick One - Multi and Pick One questions, it is often a good idea to use Statistics - Right and Statistics - Below when setting up the project, as many users will not discover these on their own.

Customizing the names of statistics

Statistics can be renamed (e.g., changing Average to Mean or Net Promoter Score) by any of the following:

Edit > Project Options > Customize > Output Text which will rename the statistics for the entire project.
Edit > Table Options > Output Text which will rename the statistics for the selected tables.
Using Rules (e.g., Modify Headers - Automatically Rename Row Labels).

However, it is often a bad idea to rename statistics. Doing so can make it hard for users to understand how Q works, because the version they are using will appear different from the version shown in all of Q's documentation.

Sharing data and results

See Sharing Data And Results