Purpose
A word cloud (or tag cloud) is a visual
display of text data that represents each word (or phrase) in a font size that is proportional to its
importance. The relative importance of
the words can then be appreciated simply by viewing the resulting chart.
Any data set that contains variables of
binary type (zeros and ones) is suitable for creating a CATA Word Cloud. If your data set contains variables of other
types, make sure that these are excluded at the “Visualization and
Selection” screen; otherwise an error will occur.
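As a rough illustration of the binary-type requirement, the following Python sketch lists the variables that contain anything other than zeros and ones, i.e. the ones that would need to be excluded. The data layout, the function name non_binary_columns and the "Liking" variable are assumptions made for this example only; this is not how EyeOpenR itself checks the data.

```python
def non_binary_columns(rows, attribute_names):
    """Return the attribute names whose values are not all 0 or 1."""
    bad = []
    for name in attribute_names:
        if any(row[name] not in (0, 1) for row in rows):
            bad.append(name)
    return bad


# Hypothetical data: two CATA attributes plus a 9-point liking score.
rows = [
    {"Interested": 1, "Bored": 0, "Liking": 7},
    {"Interested": 0, "Bored": 1, "Liking": 4},
]

to_exclude = non_binary_columns(rows, ["Interested", "Bored", "Liking"])
print(to_exclude)  # ['Liking']
```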
Background
In sensory and
consumer science a word cloud is most often used to visualize the associations
between products and binary variables indicating the absence or presence of a
range of different attributes. This process is usually
called check-all-that-apply, or CATA for short. The attributes themselves can be any characteristic
of the products, e.g. functional properties, or even feelings or emotions that
might be evoked by the products. In
practice the subject is given a list of the attributes and, while evaluating
the product, places a tick next to each that they feel is appropriate. The CATA data is first summarised by counting
the number of subjects who associate each attribute/word with each product,
then a word cloud for one product is constructed by writing the words in a font
size proportional to the subject count. The
algorithm then places the differently sized words into a standard area, such that
they are closely packed together and the words with the largest
font are as close to the centre of the plot as possible. In order to place the words closely together,
words may be written either horizontally or vertically, so no particular
significance is intended by the direction in which a word is written. The word cloud is a good way of visualising
the most important associations between a product and a word since the viewer’s
eye is most readily drawn towards the words in the largest font size toward the
centre of the chart. Using the example
data set with emotional attributes, CATA emotions.xlsx, it can easily be seen that
Product C (the first plot below) is most strongly associated with positive
emotions such as “Interested”, “Excitement” and “Approval”, while Product J
(the second plot below) is most strongly associated with negative emotions
such as “Bored”, “Not Excited” and “Not Interested”.
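The counting and font-sizing steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the EyeOpenR implementation: the record layout, the function names attribute_counts and font_sizes, and the maximum font size of 48 are all hypothetical, and the placement of the words within the plot area is not shown.

```python
from collections import Counter


def attribute_counts(records, product):
    """Count how many subjects ticked each attribute for one product."""
    counts = Counter()
    for rec in records:
        if rec["product"] == product:
            for attribute, ticked in rec["ticks"].items():
                counts[attribute] += ticked  # ticked is 0 or 1
    return counts


def font_sizes(counts, max_size=48.0):
    """Scale font sizes so that each size is proportional to its count."""
    top = max(counts.values(), default=0)
    if top == 0:
        return {word: 0.0 for word in counts}
    return {word: max_size * n / top for word, n in counts.items()}


# Hypothetical CATA data: one record per subject-by-product evaluation.
records = [
    {"subject": 1, "product": "C", "ticks": {"Interested": 1, "Bored": 0}},
    {"subject": 2, "product": "C", "ticks": {"Interested": 1, "Bored": 1}},
    {"subject": 1, "product": "J", "ticks": {"Interested": 0, "Bored": 1}},
    {"subject": 2, "product": "J", "ticks": {"Interested": 0, "Bored": 1}},
]

counts_c = attribute_counts(records, "C")
sizes_c = font_sizes(counts_c)
print(sizes_c)  # {'Interested': 48.0, 'Bored': 24.0}
```

In this toy example "Interested" was ticked by both subjects for Product C and "Bored" by one, so "Interested" is drawn at twice the font size of "Bored".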
Options
- Treat Sessions/Replicates Separately – select ‘no’ to obtain one word cloud for each product in the data set. Use the other two options if you would like to create separate word clouds for:
  - Each product within each session (Product by Session)
  - Each product within each replicate (Product by Replica)
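The effect of this option can be pictured as a choice of grouping key when the counts are tabulated. The Python sketch below is an assumption-laden illustration only: the function cloud_counts, its split_by parameter and the record fields are hypothetical names, and they do not correspond to anything in the EyeOpenR interface.

```python
from collections import defaultdict


def cloud_counts(records, split_by=None):
    """Build one table of counts per word cloud.

    split_by=None pools all evaluations of a product (the 'no' setting);
    split_by="session" or "replicate" gives one cloud per product within
    each session or replicate.
    """
    tables = defaultdict(lambda: defaultdict(int))
    for rec in records:
        if split_by is None:
            key = (rec["product"],)
        else:
            key = (rec["product"], rec[split_by])
        for attribute, ticked in rec["ticks"].items():
            tables[key][attribute] += ticked
    return {key: dict(counts) for key, counts in tables.items()}


# Hypothetical records carrying session and replicate labels.
records = [
    {"product": "C", "session": 1, "replicate": 1,
     "ticks": {"Interested": 1, "Bored": 0}},
    {"product": "C", "session": 2, "replicate": 1,
     "ticks": {"Interested": 1, "Bored": 1}},
    {"product": "J", "session": 1, "replicate": 1,
     "ticks": {"Interested": 0, "Bored": 1}},
]

pooled = cloud_counts(records)                # one cloud per product
by_session = cloud_counts(records, "session")  # one cloud per product and session
print(sorted(pooled))
print(sorted(by_session))
```

With the pooled setting the three records yield two word clouds (Products C and J); splitting by session yields three, because Product C was evaluated in two sessions.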
Results and Interpretation
- The ‘Word Cloud’ tab shows all the word clouds stacked one above
the other, where the grey title bar above each cloud contains the product name
and, where applicable, the session or replicate name.
Below the last word cloud, a product-by-attribute table of counts can be
found; each row of the table holds the counts behind one of the word
clouds above.
- R is used to create the table
of counts, and the EyeOpenR interface creates the word cloud itself.
References