Dedoose User Guide


Dedoose is an easy-to-use, collaborative, web-based application that facilitates all types of research data management and analysis. Here's what you need to know about how to use it.

Analysis Workspace Features

Access the Analyze Workspace by clicking the ‘Analyze’ button on the Dedoose main menu bar:

The Analyze Workspace offers a number of chart ‘sets’ based on the various aspects of a project database. The sets are organized in a folder system so it is easy to find the charts, tables, or plots you are looking for depending on the immediate needs of your work. Here’s an example of the charts in the ‘Mixed Methods’ set.

Dedoose Analyze Workspace Mixed Methods Chart Menu Analyze img 2

It is also worth noting that a number of the charts, tables, and plots will appear in multiple sets within the ‘Chart Selector’ depending on the aspects of the database being represented in the visual. For example, the ‘Excerpts Per Document’ chart will appear in both the ‘Document Charts’ and ‘Excerpts Charts’ folders.

Analysis Charts, Tables, and Plots

The Dedoose Analyze Workspace offers a wide variety of data visualizations. These visualizations can be used to examine the general nature of your data, understand how the code system has been applied to the qualitative content, and expose patterns of variation in the qualitative data and coding activity across sub-groups. The visuals can be exported to facilitate the presentation of research findings, and they can serve as filters or windows for drilling deeper into findings. These charts, tables, and plots are designed to be as informative, intuitive, and transparent as possible. They can be used in numerous combinations and flexibly adapted to address particular research questions. Finally, most visuals have a variety of options in the upper right corner of the display, so look for icons to swap display format, expand the view, export, include sub-code counts, and so on.


Fundamental to all aspects of the Dedoose design is transparency. Our goal is to ensure that all users can quickly learn to navigate smoothly throughout Dedoose and easily access its features for data input, manipulation, and analysis. Notably, in the Analysis Center and throughout Dedoose, every bar, slice of pie, bubble, and cell in a table is ‘HOT,’ i.e., dynamically linked to the underlying qualitative data. One click on any element of a visual will pull up the qualitative content represented by that bar/slice/bubble/cell. From there you can explore further to understand the nature of the qualitative content more deeply, or export it for use in preparing a manuscript or presentation.

Chart Expansion and Export

Throughout Dedoose, there are two common controls in the panel header for exporting and viewing the chart in full screen:

  • Clicking the 'Full Screen' button expands the chart for easier viewing
  • Clicking the 'Export' button prepares an MS Excel file with the bar charts, tables, or plots and activates prompts to download and save the file on your local computer. Just follow the prompts and you’ve got an Excel version of the chart and all underlying data for further analysis or presentation.

Introduction to Dedoose Charts, Tables, and Plots

Again, while a number of the data visualizations will appear in multiple sets within the ‘Chart Selector,’ we introduce the key types here before moving on to a more detailed description of what can be found within each of the charts/tables/plots.

Frequency Charts

The charts in this screen shot represent the relative number of excerpts that have been created within each media file. As a reminder, each bar in this display is ‘hot’ and a simple click will immediately pull up and present the underlying excerpts to facilitate interpretation of the graphical image.

Example: Excerpt Count x Media Chart Analyze img 3

Frequency Tables

This next screen shot is an example of one of the many Dedoose frequency tables. This table presents the frequency with which each code (columns) has been applied to an excerpt in each media file (rows).

Example: Code Application Chart Analyze img 4

Tables like these are useful in visualizing how a code system has been applied across media files/data. While it may seem easy to recall the pattern in coded content by case, remembering interview #1 after working with interviews #2 to #10 is very difficult. Moreover, team members may not be familiar with the work done by others. Emerging patterns in how a code system has been applied can play an important role in data analysis and interpretation. It is under these circumstances that presentations of data such as this Code by Document table can be illuminating. Remember, clicking a cell in the table will pull up the underlying content for further consideration, and the totals row and column allow for pulling up batches of excerpts by media or code. Also, note that frequency is mapped to the color spectrum (reds being more frequent and blues less), so your eye will quickly move to where the action is taking place. Patterns here can help you see how the conceptual framework you have represented in your code/tag system has been applied to (or mapped onto) your source data.
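For readers who work with exported data, the counting behind a code-by-media frequency table can be sketched in a few lines of Python. This is only an illustration: the excerpt records, media names, and the function name `code_application_table` are hypothetical and are not part of Dedoose.

```python
from collections import Counter

# Hypothetical excerpts: (media file, set of codes applied to the excerpt)
excerpts = [
    ("Interview 1", {"Reading by Mother", "Letter Recognition"}),
    ("Interview 1", {"Reading by Mother"}),
    ("Interview 2", {"Reading by Others", "Letter Recognition"}),
    ("Interview 2", {"Letter Recognition"}),
]

def code_application_table(excerpts):
    """Count how often each code was applied within each media file."""
    counts = Counter()
    for media, codes in excerpts:
        for code in codes:
            counts[(media, code)] += 1
    return counts

table = code_application_table(excerpts)
media_files = sorted({media for media, _ in excerpts})
codes = sorted({code for _, applied in excerpts for code in applied})

# Totals per media file (rows) and per code (columns), like the totals row and column
row_totals = {m: sum(table[(m, c)] for c in codes) for m in media_files}
col_totals = {c: sum(table[(m, c)] for m in media_files) for c in codes}

for m in media_files:
    print(m, [table[(m, c)] for c in codes], "total:", row_totals[m])
```

Each cell is the number of excerpts in one media file (row) tagged with one code (column). A combination that never occurs simply counts as zero.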

Descriptor Ratios

Descriptor ratio charts present the relative numbers of each sub-group for each of a project’s list-type (categorical) descriptor fields. These visualizations facilitate an understanding of variation within a project sample and can serve as filters or windows on the data provided by a particular sub-group for segmentation or sub-group specific analysis.

Example: Descriptor Ratios Chart Analyze img 5

Code Application by Descriptor Charts

The Code Application by Descriptor field charts (including the dynamic descriptors, which are designed to support analysis of change over time in your qualitative data) have a number of options and can be some of the most useful visualizations for analysis, interpretation, and communication/presentation of research findings. Essentially, these charts represent the number of excerpts that have been associated with a particular code separately for each sub-group within a descriptor field. In the example below we see charts for each code showing the relative number of excerpts tagged with the specific code separately for sub-groups based on the ‘Mother Primary Language’ descriptor field groups: ‘Bilingual,’ ‘English,’ and ‘Spanish.’

The drop-down menu allows for an immediate reconfiguration of the charts by selecting any of the list-type or grouped number or date/time descriptor fields in the project.

Example: Codes x Descriptor Chart Analyze img 6

Other important controls for these charts can be found in the panel sub-header, which shows radio buttons next to the drop-down menu to switch the chart from relative excerpt count to average weights applied (where used), together with check boxes for ‘Hit/Miss,’ ‘Sub-code Count,’ ‘Normalize,’ and ‘%.’ By default, the ‘Normalize’ and ‘%’ boxes are checked.

  • Hit/Miss option (which defaults to ‘off’) toggles the chart to a display representing the number of cases in each sub-group with one or more excerpts tagged with the particular code
  • Sub-code Count option (which defaults to ‘off’) essentially serves to ‘collapse up the code tree,’ thus including excerpts tagged with a root code AND any of its subordinate codes (child, grandchild, …) in the visual. For example, if the Parent-Child Talking code had child codes associated with it in the tree, all excerpts coded with Parent-Child Talking OR any of the child codes would be included in the chart for the Parent-Child Talking code
  • Normalization option (defaults to ‘on’) adjusts each bar based on the relative number of cases in each sub-group (see below for the normalization procedure). Simply, a graphical representation for code application frequency by sub-group is relatively meaningless if there are unequal numbers of individual cases across each sub-group. For example, in this study, the ‘Spanish’ group for ‘Mother Primary Language’ is disproportionally large (representing 64% of the total sample). Turning off the normalization adjustment results in a possibly misleading visualization. Below is the same chart as above with normalization turned off:

Example: Codes x Descriptor Chart with Normalization Off Analyze img 7

This ‘non-normalized’ chart, as compared to the original, appears to suggest a markedly high frequency of ‘Letter Recognition’ coded excerpts for the Spanish group. Hence, normalized charts provide a less biased perspective on the underlying data.

  • The ‘%’ check box converts the chart from a raw count presentation to a percentage basis presentation, as shown in the following snapshot of the same chart with the percentage view deactivated:

Example: Codes x Descriptor Chart Raw Counts Analyze img 10
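To make the difference between the default excerpt count, the ‘Hit/Miss’ view, and the ‘Sub-code Count’ option concrete, here is a minimal Python sketch. The excerpt records, the child codes under ‘Parent-Child Talking,’ and all function names are hypothetical, and the logic is our reading of the option descriptions above rather than Dedoose's internal implementation.

```python
# Hypothetical excerpts: (case id, sub-group, code applied)
excerpts = [
    ("family_01", "Spanish", "Parent-Child Talking"),
    ("family_01", "Spanish", "Asking Questions"),
    ("family_02", "Spanish", "Parent-Child Talking"),
    ("family_03", "English", "Storytelling"),
]

# Hypothetical code tree: parent code -> list of child codes
code_tree = {"Parent-Child Talking": ["Asking Questions", "Storytelling"]}

def with_descendants(code, tree):
    """Return the code plus all of its child, grandchild, ... codes."""
    codes = {code}
    for child in tree.get(code, []):
        codes |= with_descendants(child, tree)
    return codes

def excerpt_count(excerpts, code, group, tree=None):
    """Number of excerpts in the sub-group tagged with the code.
    Passing a tree mimics the 'Sub-code Count' option."""
    wanted = with_descendants(code, tree) if tree else {code}
    return sum(1 for _, g, c in excerpts if g == group and c in wanted)

def hit_count(excerpts, code, group, tree=None):
    """'Hit/Miss' view: number of cases with one or more matching excerpts."""
    wanted = with_descendants(code, tree) if tree else {code}
    return len({case for case, g, c in excerpts if g == group and c in wanted})

print(excerpt_count(excerpts, "Parent-Child Talking", "Spanish"))             # default count
print(excerpt_count(excerpts, "Parent-Child Talking", "Spanish", code_tree))  # with sub-codes
print(hit_count(excerpts, "Parent-Child Talking", "Spanish", code_tree))      # cases, not excerpts
```

The excerpt count grows when sub-codes are rolled up, while the hit count treats a case with many matching excerpts the same as a case with one.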

Descriptor by Descriptor by Code Application/Weight Charts

The Descriptor by Descriptor by Code Application/Weight chart has many options for drilling even deeper into variations across population subgroups. These ‘nested’ or ‘cross-tab’ display charts allow for examination of the qualitative data and coding/weighting activity based on two descriptor fields. The example below shows the average weight assigned to excerpts coded with ‘Reading by Mother’ across ‘Mother Primary Language’ within ‘Mother Work Status’ subgroups. These charts can expose variation in value, sentiment, importance, quality, etc. (anything you have used the weighting system to represent) across various levels in the overall population. In the example, we see an interesting interaction pattern with variations in the levels of primary caretaker reading quality as a function of mother language and work status (i.e., among bilingual or English-speaking moms, excerpts were generally rated higher for those not working, whereas among Spanish-speaking moms they were rated higher for those working full or part time).

Example: Descriptor x Descriptor x Code Weight Chart Analyze img 11
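The quantity plotted in a nested chart like this one is an average weight per pair of sub-groups. A minimal Python sketch of that calculation follows. The records and weights are invented for illustration and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical weighted excerpts for one code:
# (outer descriptor value, inner descriptor value, weight)
weighted_excerpts = [
    ("Not working", "English", 4),
    ("Not working", "English", 5),
    ("Not working", "Spanish", 2),
    ("Full-time", "Spanish", 4),
    ("Full-time", "English", 3),
]

def average_weight_by_two_fields(rows):
    """Average weight for every (outer, inner) sub-group combination."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for outer, inner, weight in rows:
        sums[(outer, inner)] += weight
        counts[(outer, inner)] += 1
    return {key: sums[key] / counts[key] for key in sums}

averages = average_weight_by_two_fields(weighted_excerpts)
for (outer, inner), avg in sorted(averages.items()):
    print(f"{outer} / {inner}: {avg:.2f}")
```

Each key corresponds to one bar in the nested display. Combinations with no weighted excerpts are simply absent from the result.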

Code Weight Frequency by Descriptor Field

The Code Weight Frequency by Descriptor Field Chart allows for a focus on the code weight distribution across descriptor field categories. The following example is based on the ‘Reading by Mother’ code and the ‘General Home Language’ descriptor field. The resulting chart shows clusters of excerpts tagged with ‘Reading by Mother’ across the weight scale and broken out by the home language descriptor field sub-groups. These displays can help communicate complex patterns and provide easy access to subsets of excerpts meeting specific sub-group and weight level criteria.

Example: Code Weight Frequency x Descriptor Field Chart Analyze img 12

The Code Weight Frequency by Descriptor Field chart is a prime example of how mixed methods can expose important patterns in research data. Well-designed code weight systems are ‘grounded’ in the underlying qualitative data; that is, they are based on investigator-defined variation within the sample population. When such systems are clearly defined and reliably applied, they:

  • Help generate clear illustrations of how coded qualitative content is distributed across the weighting system
  • Represent quantitative distributions of cases within the sample population across code applications, i.e., they define and expose patterns of specific code application across the sample, providing both numerical information on cases based on the code’s content and a ‘grounded’ appreciation for each level of the weight system
  • Expose descriptor sub-group variation in the weighting assigned to each code application
  • Provide numerical representations of qualitative content that can be exported for use in other quantitative analysis.

Code Weight by Descriptor Bubble Plots

The Code Weight by Descriptor Bubble Plots allow a four-dimensional exploration and presentation of project data. These plots are based on the average weights associated with the selected codes across the selected descriptor field sub-groups.

Example: Code Average Weight x Descriptor Field Bubble Plot Analyze img 13

In the above example, from a study on the hotel characteristics reported as desirable across age and income level, the bubbles represent different income groups. The size of the bubbles represents the average code weight associated with application of the ‘Intimacy’ code, which was applied whenever intimacy was mentioned as a desirable characteristic, together with a weight indicating the reported level of importance. The X and Y axes show the average weights associated with the application of ‘Sophistication’ and ‘Cost’ codes respectively. The highlighted bubble indicates that respondents reporting more than $250,000 in annual income report relatively low importance for ‘Cost,’ moderate importance for ‘Sophistication,’ and high importance of ‘Intimacy’ as compared to the other income level groups when making hotel decisions.

Wherever code weight/rating systems are used alongside code application activity (e.g. to index variation in importance, strength, sentiment, value, quality…), these plots can quickly expose complex multi-dimensional relations between variables across sub-groups. Finally, these plots are easily modified through the drop-down menus, communicate tremendous amounts of information and can be used to access the excerpts represented by a particular bubble or as filters for further analysis.

Code Frequency by Descriptor Bubble Plots

The Code Frequency by Descriptor Bubble Plots allow for a four-dimensional exploration and presentation of project data based on the frequency with which particular codes were applied to excerpts across the selected descriptor field sub-groups.

Example: Code Frequency by Descriptor Bubble Plot Analyze img 14

In the above example, from a study on the hotel characteristics reported as desirable across age and income level, the bubbles represent different annual income groups. The size of the bubbles represents the frequency with which the ‘Cost’ code was applied to excerpts within each sub-group. The X and Y axes represent the frequency with which the ‘Luxury’ and ‘Warmth’ codes were applied respectively. The highlighted bubble indicates that in comparison to other income groups, respondents reporting annual income of greater than $250K discuss issues of Luxury and Cost in hotel evaluations relatively more frequently and issues of Warmth relatively less frequently.

As with the code weight by descriptor bubble plots, these visuals can quickly expose complex multi-dimensional relations between variables across sub-groups and provide quick access to the underlying qualitative content.

You can also view our blog post on this type of chart.

Code Co-Occurrence Matrix

The Code Co-Occurrence Table provides information about how the code/tag system was used across all project excerpts. This symmetric, code by code, matrix presents the frequencies with which all code pairings were applied to the same excerpt and, by default, overlapping excerpts. Such a display can expose both expected and unexpected patterns in which two codes were (or were not) used together. These patterns illuminate how concepts related to the research questions and represented by the code system are combined in the natural schema (i.e., cognitive frameworks that help organize and interpret information) activated by study participants as they report on the topic represented by project codes.

What does it tell us about our data, research questions, and how respondents naturally connect the concepts we are examining when we discover that codes ‘A’ and ‘B’ co-occur at relatively low rates compared to codes ‘A’ and ‘C’? Dedoose facilitates the process of addressing questions like these quickly and with a variety of options to suit different researcher preferences. It is also important to note that when including counts for overlapping excerpts, the cell values represent ‘hits’ for excerpts coded with both of the associated codes AND for excerpts coded with one of the codes that overlap with an excerpt coded with the other code. This feature can be deactivated by clearing the ‘Include Overlapping Excerpts’ box in the upper right corner of the panel.

Example: Code Co-Occurrence Matrix Analyze img 15

For example, the highlighted cell in the table above indicates that 16 excerpts or overlapping excerpts were coded with both the ‘Reading by Others’ and ‘Letter Recognition’ codes. This pairing’s relatively high frequency indicates that as participants are thinking and reporting on one of the concepts, they often discuss thoughts about the other. Such a combination suggests that an overarching schema which includes both concepts is being activated as participants formulate their responses. Drilling down to the underlying qualitative data (by clicking the cell and reviewing the excerpts) provides a deeper understanding of participant reports and the naturally occurring patterns in their thought processes.
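As a rough illustration of how such a matrix is tallied, the Python sketch below counts code pairs applied to the same excerpt. The excerpt data and the function name are hypothetical, and the sketch covers only the same-excerpt case, i.e., the behavior with ‘Include Overlapping Excerpts’ cleared. Handling overlapping excerpts would additionally require excerpt positions, which are not modeled here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the set of codes applied to each excerpt
excerpt_codes = [
    {"Reading by Others", "Letter Recognition"},
    {"Reading by Others", "Letter Recognition", "Reading by Mother"},
    {"Reading by Mother"},
]

def co_occurrence(excerpt_codes):
    """Count, for every pair of codes, the excerpts tagged with both."""
    matrix = Counter()
    for codes in excerpt_codes:
        for a, b in combinations(sorted(codes), 2):
            matrix[(a, b)] += 1
            matrix[(b, a)] += 1  # keep the matrix symmetric
    return matrix

matrix = co_occurrence(excerpt_codes)
print(matrix[("Reading by Others", "Letter Recognition")])
print(matrix[("Reading by Mother", "Letter Recognition")])
```

Because every pair is stored in both orders, looking up (A, B) or (B, A) gives the same value, which matches the symmetric layout of the table.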

Observation of patterns in how the code system was applied can illuminate a wide variety of connections within (a) the nature of the conceptual framework represented by your code/tag system and how it was applied and (b) the nature of the data themselves. Patterns like these are often unlikely to be noticed or understood in the midst of coding activity. Yet in the analysis stage, these patterns can be extremely valuable in discovering and understanding how respondents naturally discuss these concepts in combination and how researchers will discuss these organizing principles as they understand and present their findings.

Code Weight Statistics Tables

The Code Weight Statistics Tables offer another means of examining and understanding coding activity in the project. These tables display basic counts and, where appropriate, statistics of how the weights for each code were distributed across code applications. The application count cells are also great shortcuts for activating a code-specific filter or pulling up all excerpts associated with the code for further exploration.

Example: Code Weight Statistics Chart Analyze img 16

Descriptor Field by Descriptor Field

The Descriptor Field by Descriptor Field chart is essentially a cross-tab analysis of the relative frequency of members in each sub-group plotted for two descriptor fields—one nested within the other. For example, in the screen shot below, you see ‘Father Work Status’ graphed against ‘PC Reading Change Group.’ The result of the analysis is Chi-square = 3.79 with 4 degrees of freedom—not a statistically significant relationship. This non-parametric statistical analysis is commonly reported in the description of a research project participant population and in discovering how the population may have interacting characteristics that should be identified and considered in the interpretation of study results.

Example: Descriptor Field by Descriptor Field Chart Analyze img 17
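For those who want to check a reported Chi-square value by hand, the standard Pearson Chi-square statistic for a cross-tab can be computed as in the Python sketch below. The counts are invented for illustration (they are not the project data shown above), and we assume, without confirmation from Dedoose, that the reported value is the usual Pearson statistic. For reference, with 4 degrees of freedom the critical value at the 0.05 level is about 9.49, so a value of 3.79 falls well short of significance.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two descriptor fields were independent
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical 3 x 3 cross-tab (rows: one descriptor field, columns: the other)
observed = [
    [5, 3, 2],
    [4, 4, 2],
    [3, 3, 4],
]
statistic, dof = chi_square(observed)
print(f"Chi-square = {statistic:.2f} with {dof} degrees of freedom")
```

A 3 x 3 table yields (3 - 1) x (3 - 1) = 4 degrees of freedom, the same as in the example above. Note that the sketch assumes every row and column has at least one member, since an empty row or column would give an expected count of zero.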


The normalization function, an option in a variety of Dedoose frequency charts, operates by assigning a weight of '1' to the class with the largest number of members (the 'basis' class) and then assigning weights to the other classes based on the ratio between the number of members in each class and the number of members in the 'basis' class. These weights are then used to adjust the raw counts to achieve ratio equivalence across classes for visualization, and the weighted percentage is calculated from these adjusted counts.

For example, if we are interested in the relative percentage of times people of different color eyes blink in one minute, we calculate as follows:

Analyze img 18
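The procedure can also be sketched in Python. The numbers below are invented for illustration and are not the values from the original figure. We also assume that each class's weight is the size of the 'basis' class divided by the size of that class, which is our reading of the description above rather than a confirmed formula.

```python
def normalize_counts(raw_counts, group_sizes):
    """Adjust raw counts so sub-groups of unequal size are comparable.

    Assumption: weight = (members in basis class) / (members in this class),
    where the basis class is the largest one and so receives a weight of 1.
    """
    basis = max(group_sizes.values())
    weights = {g: basis / n for g, n in group_sizes.items()}
    adjusted = {g: raw_counts[g] * weights[g] for g in raw_counts}
    total = sum(adjusted.values())
    percentages = {g: 100 * a / total for g, a in adjusted.items()}
    return weights, adjusted, percentages

# Hypothetical sample: number of people per eye color and total blinks observed
group_sizes = {"Brown": 10, "Blue": 5, "Green": 2}
raw_counts = {"Brown": 100, "Blue": 60, "Green": 30}

weights, adjusted, percentages = normalize_counts(raw_counts, group_sizes)
for g in group_sizes:
    print(f"{g}: weight {weights[g]:.1f}, adjusted {adjusted[g]:.0f}, {percentages[g]:.1f}%")
```

In this made-up sample, the brown-eyed group accounts for more than half of the raw blinks only because it has the most members. After adjustment, the green-eyed group, which blinks most per person, has the largest share.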

Data Filtering

Dedoose offers a variety of approaches to filtering your data. Most users' analytical needs, much or all of the time, will be fully served by simply using the auto-generated and interactive data visualizations available in the Dedoose Analyze Workspace. The visuals make it easy to discover and explore patterns in your data and then drill in to better understand the rich qualitative stories that live beneath the surface. However, for those wishing to focus their analysis on particular subsets of data without distraction from other aspects of the database, Dedoose offers powerful database-wide filtering capabilities.

Filtering via Chart Selection Reviewer

There are two basic approaches to activating filters in Dedoose. The simplest approach is to filter via the chart selection reviewer activated following a click on some aspect of an analytic visualization. For example, the following series of snapshots illustrates the process.

  1. Pulling up all excerpts for families reporting having male children by clicking the ‘Male’ portion of the descriptor ratio pie chart in the Analyze Workspace: Analyze img 19
  2. Clicking ‘Make Active Set’ in the Chart Selection Reviewer activates a filter so all subsequent analysis will only be focused on excerpts from families reporting having male children. The following snapshot shows the impact of this filter:
     a. Only the filtered data are active in the codes by descriptor bar chart
     b. The ‘Data Set’ icon in the main menu bar is shaded red as an indicator that a filter is active
     c. A 'funnel' icon appears in the chart header, which serves as a toggle to view all data or just the filtered data when a filter is active

Analyze img 20

The same visual after clicking the funnel icon to toggle the filter to the ‘off’ position:

Analyze img 21

When any filter is active, all subsequent analysis is focused only on the data meeting the filter criteria. Thus, with this approach the three buttons referencing ‘Active Set’ at the bottom of the Chart Selection Reviewer pop-up allow for filtering the entire database based on the sub-set of data represented in the list. Beyond simply activating a filter as in the example, one can also add or remove other subsets to/from the active filter by using the ‘Add to/Remove from Active Set’ buttons.
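Conceptually, the three ‘Active Set’ buttons behave like set operations on excerpts. The Python sketch below illustrates that idea only. The excerpt IDs and function names are hypothetical, and treating ‘Add to’ as a union and ‘Remove from’ as a difference is our assumption about the behavior, not a description of Dedoose internals.

```python
def make_active_set(selection):
    """'Make Active Set': the selection becomes the whole filter."""
    return set(selection)

def add_to_active_set(active, selection):
    """'Add to Active Set': union of the current filter and the selection."""
    return active | set(selection)

def remove_from_active_set(active, selection):
    """'Remove from Active Set': drop the selected excerpts from the filter."""
    return active - set(selection)

# Hypothetical excerpt IDs pulled up by clicking different chart elements
male_children = [1, 2, 3, 4]
spanish_speaking = [3, 4, 5]
low_quality = [2]

active = make_active_set(male_children)
active = add_to_active_set(active, spanish_speaking)
active = remove_from_active_set(active, low_quality)
print(sorted(active))
```

Because the active set is built up step by step, the order of ‘add’ and ‘remove’ actions matters when the selections share excerpts.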

Filtering via the Data Set Workspace

Though filtering via the chart, table, and plot shortcuts will serve many needs, the Dedoose Data Set Workspace serves those requiring a more complex database query and those wishing to save filters for later use. The snapshot below shows the ‘Data Selector’ pop-up activated by clicking the Data Set Workspace icon.

Analyze img 22

In the Data Set Workspace, filters can be defined based on any object in the database including descriptors, media, users, and codes. Two important things to keep in mind when creating filters are:

  1. Excerpts are the focus of all filtering, so remembering that the ultimate goal is to narrow to a subset of excerpts meeting particular criteria will help in creating clear and useful filters
  2. When a filter has been activated, it will remain in place until it is changed or deactivated by clicking the ‘Clear Current Set’ button in the Data Set Workspace or by exiting the Dedoose application.

Defining a Filter

So, what constitutes a complex query? Again, filtering in Dedoose can include criteria from descriptors, media, users, and/or codes and the sole focus is on the activation or deactivation of excerpts. Many users are familiar with the use of Boolean operators in defining a database query. Here we will work through an example in our sample data to illustrate how to define and activate a more complex query in Dedoose. Let’s imagine we wish to narrow our active set to only those excerpts meeting the following criteria:

  1. Coded with ‘Reading by Mother’ AND/OR ‘Reading by Others’, AND
  2. From families who report General home language = Bilingual OR Spanish, AND
  3. From families who report Father’s work status = Full-time OR Part-time, AND
  4. From families who report Ethnicity = Hispanic.
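For readers used to thinking in Boolean terms, the four criteria above can be written as a single predicate applied to each excerpt. The Python sketch below is purely illustrative: the record layout, field names, and sample excerpts are hypothetical, and in Dedoose the same result is achieved through the workspace controls rather than code.

```python
# Hypothetical excerpt records with the codes applied and the
# descriptor values of the family the excerpt came from
excerpts = [
    {"id": 1, "codes": {"Reading by Mother"}, "home_language": "Spanish",
     "father_work_status": "Full-time", "ethnicity": "Hispanic"},
    {"id": 2, "codes": {"Letter Recognition"}, "home_language": "Spanish",
     "father_work_status": "Full-time", "ethnicity": "Hispanic"},
    {"id": 3, "codes": {"Reading by Others"}, "home_language": "English",
     "father_work_status": "Part-time", "ethnicity": "Hispanic"},
    {"id": 4, "codes": {"Reading by Mother", "Reading by Others"},
     "home_language": "Bilingual", "father_work_status": "Part-time",
     "ethnicity": "Hispanic"},
]

def matches_filter(excerpt):
    """True when the excerpt meets all four criteria (joined by AND)."""
    return (
        bool(excerpt["codes"] & {"Reading by Mother", "Reading by Others"})
        and excerpt["home_language"] in {"Bilingual", "Spanish"}
        and excerpt["father_work_status"] in {"Full-time", "Part-time"}
        and excerpt["ethnicity"] == "Hispanic"
    )

active_ids = [e["id"] for e in excerpts if matches_filter(e)]
print(active_ids)
```

Within each criterion the listed values are alternatives (OR), while the criteria themselves must all hold (AND). Excerpt 2 fails on the code criterion and excerpt 3 on home language.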

Setting up this somewhat complex query is relatively transparent when working in the Dedoose Select Data Workspace and, while there are different approaches to setting up the same query, the most straightforward is to make basic use of the SuperMegaGrid features to isolate and activate the data of interest.

Note: For clarity, when working in the SuperMegaGrid we recommend clearing all columns except those of interest prior to filtering (folders for each remaining column will then appear in the ‘Filtering’ panel) AND clicking ‘Deactivate All’ for a more straightforward approach to setting criteria for those excerpts you intend to activate.

Here are the steps:

  1. Enter the Dedoose ‘Data Set’ workspace by clicking the binocular icon in the Dedoose Main Menu Bar
  2. Click the ‘Codes’ tab, click ‘Filter None’ to clear all, and then select only ‘Reading by Mother’ and ‘Reading by Others’
  3. Click the ‘Media’ tab; here’s where we take advantage of the SuperMegaGrid
  4. Click ‘Uncheck All’ in the Columns panel and then select Father work status, Ethnicity, and General Home Language, since these are the fields relevant to the filter we wish to create
  5. Click ‘Deactivate All’ to ‘turn off’ all data from all media
  6. In the ‘Filters’ panel, open each folder and select those values you wish to include (in our example: ‘Part-time’ or ‘Full-time’ working fathers, ‘Hispanic’ families, and ‘Spanish’ or ‘Bilingual’ Home Language families) to filter down to the media that meet these criteria
  7. Click ‘Activate All’ to ‘turn on’ all excerpts that meet these criteria and the filter has been created (see snapshot below)
  8. Once activated, you can click the ‘Dataset’ tab to see the results of the filter, where you can:
     a. View the excerpts
     b. Open the excerpts for full view
     c. Modify coding on excerpts when open
     d. Save the current filter criteria in your Saved Data Sets library for later use, by highlighting and loading from the library
  9. Close the pop-up and all subsequent analysis will be only on this subset of data, INCLUDING WHICH EXCERPTS WILL BE IN VIEW WITHIN DOCUMENTS
  10. Closing Dedoose or returning to the Data Selector’s Dataset tab and clicking the ‘Clear Current Set’ button will serve to clear the filter.

Analyze img 23

The following screen shots show a basic code by descriptor chart with the filtering active:

Analyze img 24

…and toggled to the full data set:

Analyze img 25

Saving a Database Filter

Regardless of the complexity of activated filters, users and research teams may want to save and re-load database filters (queries) that have been defined for particular purposes.

Saving an active filter that may be useful in the future, whether for review or as new data are added to the database, also takes place in the Data Selector workspace. To save a filter (WHEN THE FILTER IS ACTIVE):

  1. Click the ‘Dataset’ tab
  2. Click ‘Save Current Set’ in the left side panel
  3. Provide a title and description of the filter and click ‘Submit’ to save.

Re-Loading a Saved Filter

Re-loading a saved filter (or database query) is a simple procedure in Dedoose and is achieved via the Data Selector workspace. To re-load a filter:

  1. Click the ‘Dataset’ tab
  2. Locate and highlight the desired filtered data set in the bottom panel
  3. Click the ‘Load’ button at the bottom of the panel to activate.

Deactivating/Clearing Filters

Filters are deactivated by either exiting the Dedoose App or:

  1. Returning to the Data Selector
  2. Clicking the ‘Dataset’ tab
  3. Clicking the ‘Clear Current Set’ button in the left side panel.