CATA Word Cloud

Purpose

A word cloud (or tag cloud) is a visual display of text data in which each word (or phrase) is drawn in a font size proportional to its importance.  The relative importance of the words can then be seen at a glance from the resulting chart.

Data Format

Any data set that contains variables of binary type (zeros and ones) is suitable for creating a CATA Word Cloud.  If your data set contains variables of other types, make sure that these are excluded at the “Visualization and Selection” screen; otherwise an error will occur.
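As a quick check before running the analysis, the binary requirement can be tested outside EyeOpenR.  The sketch below is only an illustration: the function name `binary_columns`, the column names and the values are hypothetical and are not part of EyeOpenR.

```python
def binary_columns(rows):
    """Return the names of columns whose values are all 0 or 1."""
    if not rows:
        return []
    return [
        name
        for name in rows[0]
        if all(row[name] in (0, 1) for row in rows)
    ]


# Hypothetical rows: "Product" and "Liking" are not binary, so they would
# need to be excluded before creating a CATA Word Cloud.
rows = [
    {"Product": "C", "Interested": 1, "Bored": 0, "Liking": 7},
    {"Product": "J", "Interested": 0, "Bored": 1, "Liking": 3},
]

cata_columns = binary_columns(rows)
print(cata_columns)
```

Any column not returned by the check is a candidate for exclusion at the “Visualization and Selection” screen.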

Background

In sensory and consumer science a word cloud is most often used to visualize the associations between products and binary variables indicating the absence or presence of a range of different attributes.  Usually this process is called check-all-that-apply, or CATA for short.  The attributes themselves can be any characteristic of the products, e.g. functional properties, or even feelings or emotions that might be evoked by the products.  In practice the subject is given a list of the attributes and, while evaluating the product, places a tick next to each one that they feel is appropriate.

The CATA data is first summarised by counting the number of subjects who associate each attribute/word with each product; a word cloud for one product is then constructed by writing the words in a font size proportional to the subject count.  The algorithm then places the different sized words into a standard area, such that they are closely packed together and the words with the largest font are as close to the centre of the plot as possible.  In order to place the words closely together, words may be written either horizontally or vertically, so no particular significance is intended by the direction in which a word is written.

The word cloud is a good way of visualising the most important associations between a product and a word, since the viewer’s eye is most readily drawn towards the words in the largest font size toward the centre of the chart.  Using the example data set with emotional attributes, CATA emotions.xlsx, it can easily be seen that Product C (the first plot below) is most strongly associated with positive emotions such as “Interested”, “Excitement” and “Approval”, while Product J (the second plot below) is most strongly associated with negative emotions such as “Bored”, “Not Excited” and “Not Interested”.
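The counting and font-size steps described in this section can be sketched in a few lines of Python.  This is a minimal illustration under stated assumptions, not the EyeOpenR implementation: the record layout, the function names `attribute_counts` and `font_sizes`, the 48-point maximum and the tick values are all hypothetical (the values are not taken from CATA emotions.xlsx).  The placement of the words on the plot is not covered.

```python
from collections import Counter

# Hypothetical CATA records: one per subject-by-product evaluation.
# A 1 means the subject ticked the attribute, a 0 means they did not.
records = [
    {"subject": 1, "product": "C", "ticks": {"Interested": 1, "Approval": 1, "Bored": 0}},
    {"subject": 2, "product": "C", "ticks": {"Interested": 1, "Approval": 0, "Bored": 0}},
    {"subject": 3, "product": "C", "ticks": {"Interested": 1, "Approval": 1, "Bored": 1}},
    {"subject": 4, "product": "C", "ticks": {"Interested": 1, "Approval": 0, "Bored": 0}},
    {"subject": 1, "product": "J", "ticks": {"Interested": 0, "Approval": 0, "Bored": 1}},
]


def attribute_counts(records, product):
    """Count how many subjects ticked each attribute for one product."""
    counts = Counter()
    for record in records:
        if record["product"] == product:
            for attribute, ticked in record["ticks"].items():
                counts[attribute] += ticked
    return counts


def font_sizes(counts, max_size=48.0):
    """Scale font sizes so they are proportional to the subject counts.

    The most frequently ticked word gets max_size (an assumed value).
    """
    largest = max(counts.values(), default=0)
    if largest == 0:
        return {word: 0.0 for word in counts}
    return {word: max_size * n / largest for word, n in counts.items()}


counts_c = attribute_counts(records, "C")
sizes_c = font_sizes(counts_c)
print(dict(counts_c))
print(sizes_c)
```

With these made-up ticks, a word chosen by all four subjects is drawn four times larger than a word chosen by only one, which is what makes the dominant associations stand out.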

 

Options

  1. Treat Sessions/Replicated Separately – select ‘no’ to obtain one word cloud for each product in the data set.  Use the other two options if you would like to create separate word clouds for:
    1. Each product within each session (Product by Session)
    2. Each product within each replicate (Product by Replica)
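The effect of this option can be pictured as a change in the grouping used when the counts are built.  The sketch below is an assumption about that idea, not EyeOpenR's own code: the function `grouped_counts`, the column names `Product` and `Session`, and the values are hypothetical.

```python
from collections import defaultdict


def grouped_counts(rows, attributes, keys=("Product",)):
    """Sum the 0/1 ticks per attribute within each group defined by keys."""
    tables = defaultdict(lambda: dict.fromkeys(attributes, 0))
    for row in rows:
        group = tuple(row[key] for key in keys)
        for attribute in attributes:
            tables[group][attribute] += row[attribute]
    return dict(tables)


# Hypothetical rows, one per subject-by-product-by-session evaluation.
rows = [
    {"Product": "C", "Session": "S1", "Interested": 1, "Bored": 0},
    {"Product": "C", "Session": "S1", "Interested": 1, "Bored": 0},
    {"Product": "C", "Session": "S2", "Interested": 0, "Bored": 1},
    {"Product": "J", "Session": "S1", "Interested": 0, "Bored": 1},
]
attributes = ["Interested", "Bored"]

# One set of counts (one word cloud) per product.
by_product = grouped_counts(rows, attributes)

# One set of counts (one word cloud) per product within each session.
by_session = grouped_counts(rows, attributes, keys=("Product", "Session"))

print(by_product)
print(by_session)
```

Grouping by product and replicate would work the same way, with a replicate column in place of the session column.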

Results and Interpretation

  1. The ‘Word Cloud’ tab shows all the word clouds stacked on top of each other.  The grey title bar above each cloud contains the product name and, where applicable, the session or replicate name.  Below the last word cloud is a product-by-attribute table of counts; each row of the table holds the counts behind one of the word clouds above.

Technical Information

  1. R is used to create the table of counts, and the EyeOpenR interface creates the word cloud itself.
