To provide an analysis of data collected using the napping methodology.
For EyeOpenR to read your data, the first five columns must contain the
following, in the specified order: Assessor, Product, Session,
Replica and Sequence.
Assessor and Product columns can be of character
(e.g., “Assessor 01”, “Product A”), or numeric (“1”) format. Columns Session,
Replica and Sequence should contain only numeric values. If no
session, replica or sequence information is available, the user should input
a value of “1” in every cell of the column for which no information was collected.
Data collected from a Napping study should comprise two or three columns,
depending on whether the assessors provided descriptive words, and be appended
to the five columns stated above. Napping data should therefore start from
column six (column F).
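As a rough illustration of this layout, the sketch below checks a small, made-up Napping file using only Python's standard library. The header names "X" and "Y", the sample rows and the helper check_layout are assumptions for illustration; EyeOpenR's own import checks may well differ, including whether header names must match exactly.

```python
import csv
import io

# Made-up Napping data. The "X" and "Y" header names are assumptions.
RAW = """Assessor,Product,Session,Replica,Sequence,X,Y
Assessor 01,Product A,1,1,1,12.5,30.0
Assessor 01,Product B,1,1,2,14.0,28.5
Assessor 02,Product A,1,1,1,40.0,10.0
Assessor 02,Product B,1,1,2,8.0,35.0
"""

REQUIRED = ["Assessor", "Product", "Session", "Replica", "Sequence"]


def is_number(text):
    """True if the cell can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_layout(text):
    """Return a list of layout problems (empty if none are found)."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header, body = rows[0], rows[1:]
    if header[:5] != REQUIRED:
        return [f"first five columns are {header[:5]}, expected {REQUIRED}"]
    problems = []
    if len(header) not in (7, 8):
        problems.append("expected X and Y (plus optional descriptors) from column F")
    for line_no, row in enumerate(body, start=2):
        if len(row) < 7:
            problems.append(f"line {line_no}: missing X or Y coordinate")
            continue
        # Columns C-E (Session, Replica, Sequence) and F-G (X, Y) must be numeric.
        for col in range(2, 7):
            if not is_number(row[col]):
                problems.append(f"line {line_no}: column {col + 1} is not numeric")
    return problems


problems = check_layout(RAW)
print(problems or "layout looks consistent")
```

Assessor and Product are deliberately left unchecked here, since either character or numeric labels are allowed in those columns.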
In Napping, each assessor provides two pieces of information:
the X and Y coordinates of each product, which are given in two separate columns
(columns F and G). For Sorted Napping this is extended with an additional
column containing descriptors (column H). All descriptors should be comma
separated (,) and placed within the same cell. For an example, please see the
Napping is a rapid methodology that has been used by sensory
and consumer scientists for approximately two decades. It is a simple method in
terms of instructions to assessors, with minimal training of assessors required:
on either a sheet of paper or a digital screen, each assessor is asked to place
samples that are perceived as similar close to each other and to place samples
that are perceived as more different further away from each other (Dehlholm
et al., 2013). There are no right or wrong answers and no need for any panel calibration
prior to data collection. Each assessor is free to place the samples according
to their perception. As a result, the data from each assessor comprises X and Y
coordinates of each product’s position. Analysis of the data from several
assessors allows us to investigate whether there is consensus in the partitioning of
products between the assessors. An appropriate method to deal with data from
assessors in this format is Multiple Factor Analysis (MFA).
MFA is used to obtain an overview across assessors of
distances between samples. In other words, it helps us understand whether there
are any consistent relationships between samples across the assessors. Besides
providing information on how samples relate to one another, MFA also provides information
on how well each assessor relates to this common overview.
One extension of the original Napping methodology is Sorted
Napping, described in the following section.
Sorted Napping extends the Napping method by asking
assessors to categorize samples that are similar to each other by providing descriptors
that characterise the subset of samples. Most often, an assessor writes a word
or two that describe samples located close together. This extension to Napping arose from
the observation that assessors found it easier and more natural to describe a
cluster of closely related samples as opposed to having to describe each sample.
So, two samples (A and B) positioned very close could be encircled with the
descriptors “Nutty” and “Earthy”, while two samples positioned further away yet
close to one another (C and D) may be termed “Fruity”. Sorted Napping thereby provides
categorical information (i.e., words) in addition to the numerical
information (X and Y coordinates) given by traditional Napping. This
categorical information is represented by an additional column in the Napping data
uploaded to EyeOpenR, with descriptors for a given product placed in a single
cell. The categorical information is treated as a supplementary variable in
EyeOpenR analysis. Therefore, the descriptors do not impact the construction of
components by the MFA.
Analysis Considerations of Napping Data
Napping data consists of a series of data tables, each table
representing an assessor, with two columns of data per assessor — the X and Y
coordinates. The analysis should preserve the dimensions of the paper or screen
used to gather the data (traditionally a 60 x 40 cm sheet of paper, or a 1.5:1
ratio on a digital screen). The data is therefore not standardized,
as standardizing would give the two axes equal weight.
Napping data is analysed via MFA, which comprises two steps.
In the first step, a covariance-PCA is performed on each assessor’s data (X and
Y coordinates). In the second step, MFA gives each assessor equal weight in the
construction of an overall global representation. As a result, we obtain
several global plots that show:
(i) the degree of consensus amongst the assessors
(ii) the degree to which each assessor is in consensus with the global view
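The two steps can be sketched in plain Python. This is a minimal illustration of the standard MFA weighting, in which each assessor's centred, unscaled coordinates are divided by the square root of the first eigenvalue of that assessor's covariance-PCA. It is not EyeOpenR's or FactoMineR's actual code. The coordinates and function names are invented, and the final global PCA on the combined, weighted tables is omitted.

```python
import math


def centre(values):
    """Subtract the mean; the values are NOT divided by their standard deviation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def first_eigenvalue(xs, ys):
    """Largest eigenvalue of the 2x2 covariance matrix of centred xs and ys."""
    n = len(xs)
    var_x = sum(x * x for x in xs) / n
    var_y = sum(y * y for y in ys) / n
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n
    half_trace = (var_x + var_y) / 2
    return half_trace + math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)


def weight_assessor(xs, ys):
    """Step 1: covariance-PCA per assessor. Step 2: rescale by sqrt(first eigenvalue)."""
    cx, cy = centre(xs), centre(ys)
    scale = math.sqrt(first_eigenvalue(cx, cy))
    return [x / scale for x in cx], [y / scale for y in cy]


# Invented coordinates for four products (X values, then Y values).
napping = {
    "Assessor 01": ([5.0, 10.0, 50.0, 55.0], [5.0, 8.0, 30.0, 35.0]),   # uses most of the sheet
    "Assessor 02": ([20.0, 25.0, 30.0, 35.0], [10.0, 30.0, 12.0, 28.0]),  # uses a small area
}

raw_eigs = {
    name: first_eigenvalue(centre(xs), centre(ys)) for name, (xs, ys) in napping.items()
}
weighted = {name: weight_assessor(xs, ys) for name, (xs, ys) in napping.items()}
first_eigs = {name: first_eigenvalue(wx, wy) for name, (wx, wy) in weighted.items()}

for name in napping:
    print(f"{name}: first eigenvalue before={raw_eigs[name]:.1f}, after={first_eigs[name]:.3f}")
```

After weighting, every assessor's first eigenvalue equals 1, so an assessor who spreads the samples widely cannot dominate the global representation, while the relative spread of X and Y within each assessor is preserved.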
MFA is not restricted to Napping data: it is a convenient technique
for handling multiple data tables. MFA can handle data tables
of different types (e.g., a blend of numeric and categorical) as long as all
information within any one table is of the same type. An example is Sorted Napping, where
the descriptors form a supplementary data table. MFA is also offered in
other areas of EyeOpenR that require analysis of multiple data tables that are
potentially of different types.
Analysis method for the words: Choice of Correspondence Analysis (CA) or
Principal Components Analysis (PCA). This option sets how the supplementary
table of products by words is analysed. Each cell of this table holds the number
of times a specific word was given for a specific product, counted across all
assessors. CA is the most frequently used method for count data where one is
interested in examining relationships between two variables (here products and
descriptors). PCA is an alternative method for analysing this table; it finds
the main sources of variation in a data set. CA and PCA tend to give quite
similar results if sparsely used words are removed from the CA. See the
parameter below for setting the minimum frequency of a word.
Minimum frequency of word: The minimum threshold of counts that should be
met in order for that word to be kept in the analysis. This is mainly
applicable if CA has been selected, as CA has the tendency to give words used infrequently
a large distance from the origin and can thereby visualize relationships that
may not be truly supported.
Number of decimals for values: The number of decimal places
to which values are reported.
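The products-by-words count table and the minimum-frequency option described above can be sketched as follows. The rows, the threshold value and the variable names are invented. The sketch assumes that the threshold applies to a word's total count across all products and assessors, and that words are trimmed of surrounding spaces but otherwise compared as typed. The software's exact rules may differ.

```python
from collections import Counter, defaultdict

# Invented Sorted Napping rows: (assessor, product, comma-separated descriptors).
rows = [
    ("Assessor 01", "Product A", "Nutty,Earthy"),
    ("Assessor 01", "Product B", "Nutty,Earthy"),
    ("Assessor 01", "Product C", "Fruity"),
    ("Assessor 02", "Product A", "Nutty"),
    ("Assessor 02", "Product C", "Fruity, Sweet"),
]
min_frequency = 2  # hypothetical threshold

# counts[product][word] = number of times the word was given for the product.
counts = defaultdict(Counter)
for _assessor, product, cell in rows:
    for word in cell.split(","):
        word = word.strip()
        if word:
            counts[product][word] += 1

# Total use of each word across all products and assessors.
totals = Counter()
for product_counts in counts.values():
    totals.update(product_counts)

# Keep only words that meet the threshold (assumption: threshold on the total).
kept_words = sorted(w for w, n in totals.items() if n >= min_frequency)
table = {p: [counts[p][w] for w in kept_words] for p in sorted(counts)}

print("words:", kept_words)
for product, row in table.items():
    print(product, row)
```

Here “Sweet” is used only once and is dropped, which is the kind of sparsely used word that would otherwise sit far from the origin in a CA plot.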
Results and Interpretation
a table of eigenvalues, percentage of variance explained and the cumulative
variation explained. A high proportion of variation explained in the first two
dimensions suggests there is consensus in how assessors partition the products.
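The percentage and cumulative columns follow directly from the eigenvalues. A small sketch with made-up eigenvalues (the numbers are not from any real study):

```python
from itertools import accumulate

eigenvalues = [3.2, 1.6, 0.8, 0.4]  # made-up values, one per MFA dimension

total = sum(eigenvalues)
percent = [100 * ev / total for ev in eigenvalues]  # variance explained per dimension
cumulative = list(accumulate(percent))              # running total of the percentages

for dim, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"Dim {dim}: eigenvalue={ev:.2f}  variance={pct:.1f}%  cumulative={cum:.1f}%")
```

In this invented case the first two dimensions together account for about 80% of the variation, which would point towards a reasonable level of consensus.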
Products tab: Provides information
on the products tested via napping across assessors.
Coord: Provides the coordinates of
the products from the global MFA (the global MFA is where each assessor has
equal weight; see the Background section for more). These are visualized in the
respective Graph tab.
Contrib (contribution): Provides
information on how much each product contributes to the construction of each MFA
dimension: higher values imply the product is important in contributing to that
dimension.
Cos2 (squared cosine):
Provides information on the quality of how well each product is represented on
each dimension. Each product row sums to 1: higher values of a product on a
dimension imply that that product is well represented; conversely lower values
(nearer 0) imply that the quality of representation is poor and therefore one
should be cautious in interpreting such products on the respective dimension.
Graph: Provides a map of
the product space and can be interpreted in similar fashion to a PCA Scores
plot. The coordinates can be found in the respective Coord tab.
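The Contrib and Cos2 tables can both be derived from the product coordinates. The sketch below uses the usual PCA-style definitions and assumes that all products carry equal weight and that every dimension is kept (otherwise the Cos2 rows would not sum to 1). The coordinates and names are invented.

```python
# Invented product coordinates on three MFA dimensions (all dimensions kept).
coord = {
    "Product A": [2.0, 1.0, 0.0],
    "Product B": [-2.0, 1.0, 0.5],
    "Product C": [0.0, -2.0, -0.5],
}
n_dims = 3

# Cos2: squared coordinate divided by the product's squared distance from the origin.
cos2 = {}
for product, row in coord.items():
    squared_distance = sum(c * c for c in row)
    cos2[product] = [c * c / squared_distance for c in row]

# Contrib: each product's share (in %) of the sum of squared coordinates on a dimension.
column_ss = [sum(row[k] ** 2 for row in coord.values()) for k in range(n_dims)]
contrib = {
    product: [100 * row[k] ** 2 / column_ss[k] for k in range(n_dims)]
    for product, row in coord.items()
}

for product in coord:
    print(product,
          "cos2:", [round(v, 3) for v in cos2[product]],
          "contrib:", [round(v, 1) for v in contrib[product]])
```

Each row of cos2 sums to 1 and each column of contrib sums to 100, which matches how the two tables are read: Cos2 across dimensions for one product, Contrib across products for one dimension.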
Group tab: In Napping, each
assessor is a group, so this tab provides information on how the groups are
related; in other words, on the consensus between assessors.
Lg: Lg is a coefficient
that measures the relationship between a single variable (one assessor) and a
group of variables (a panel of assessors). In other words, it is a metric of
how well each assessor is in consensus with the group of assessors. Thus, the
Lg coefficient provides information on the amount of common structure between
each pair of assessors, and between each assessor and the overall consensus
(see the final column, ‘MFA’, for such information). Higher Lg values indicate
more common structure.
RV: The RV coefficient is one of
the most important statistics from MFA, showing the degree of similarity
between assessors. It can be thought of as a meta-correlation coefficient.
Values near 1 indicate a strong similarity between two groups (in the
context of Napping, between two assessors). Conversely, values nearer 0 indicate
that there is little overlap between assessors. The final column, ‘MFA’, provides
the RV coefficient between each assessor and the overall consensus.
Graph: Provides a representation
of assessors on Dimensions 1 and 2. If assessors show a high value on Dimension
1 it means that Dimension 1 of the common structure (all assessors) is an
important source of variation for that respective assessor (see Lê and Worch,
2015). In other words, that assessor is in alignment with the consensus view on
Dimension 1. If an assessor has a low value on Dimension 1 but a high value on
Dimension 2, it means that for that assessor Dimension 2 of the common
structure is an important source of variation and is more important than
Dimension 1 of the common structure.
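To make the RV coefficient more concrete, the sketch below computes it for pairs of invented assessor configurations using the usual definition RV = tr(Wa Wb) / sqrt(tr(Wa Wa) tr(Wb Wb)), where W is the products-by-products cross-product matrix of an assessor's centred coordinates. The function names and data are illustrative assumptions, not output from EyeOpenR.

```python
import math


def centre(config):
    """Centre each column (X and Y) of a list of (x, y) positions."""
    n = len(config)
    means = [sum(col) / n for col in zip(*config)]
    return [[v - m for v, m in zip(row, means)] for row in config]


def cross_product(config):
    """Products-by-products matrix W = X X^T for a centred configuration X."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in config] for r1 in config]


def inner(w1, w2):
    """Sum of element-wise products, equal to tr(W1 W2) for symmetric matrices."""
    return sum(a * b for row1, row2 in zip(w1, w2) for a, b in zip(row1, row2))


def rv_coefficient(config_a, config_b):
    wa = cross_product(centre(config_a))
    wb = cross_product(centre(config_b))
    return inner(wa, wb) / math.sqrt(inner(wa, wa) * inner(wb, wb))


# Invented positions of four products for three assessors.
assessor_1 = [(5.0, 5.0), (10.0, 8.0), (50.0, 30.0), (55.0, 35.0)]
assessor_2 = [(-2 * y, 2 * x) for x, y in assessor_1]  # same map, rotated and enlarged
assessor_3 = [(5.0, 30.0), (50.0, 5.0), (10.0, 35.0), (55.0, 8.0)]  # different grouping

rv_12 = rv_coefficient(assessor_1, assessor_2)
rv_13 = rv_coefficient(assessor_1, assessor_3)
print(f"RV(1, 2) = {rv_12:.3f}")
print(f"RV(1, 3) = {rv_13:.3f}")
```

Assessor 2's map is the same as Assessor 1's apart from rotation and size, so the RV between them is 1. This is why the RV is suited to Napping: only the relative positions of the products matter, not the orientation or the area of the sheet an assessor happened to use.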
Partial axes tab: The partial
axes representation superimposes each assessor’s own dimensions
(from a PCA of each assessor’s data) on to the common structure given by the
global MFA with all assessors. As a result, the analyst can see how each
assessor’s dimensions relate to those of the whole panel of assessors. Information
is also provided for the dimensions of supplementary information (as in the
case of Sorted Napping):
Coord and Cor:
Here, the coordinates and the correlation tables present the same
information, namely the correlation of each assessor’s dimensions with those of
the global MFA.
Contrib: Provides information
on which individual assessor dimensions contribute most to the formation
of each global dimension. Each global dimension (each column) sums to 100%: a
higher value indicates that the respective assessor dimension contributes strongly to
that global dimension.
Graph: Visualisation of the correlation table. Each
assessor’s dimensions are superimposed onto the global consensus. If Sorted
Napping has been used, then dimensions from the word analysis are also
shown.
Variables tab: Information
relating to each variable of the Napping data, most likely each
assessor’s X and Y coordinates:
Coord: These are the factor
loadings related to the MFA. It is possible to see which X and Y coordinates
from each assessor load strongly on the MFA dimensions.
Contrib: Contribution of each
assessor’s dimensions to the construction of each MFA dimension, expressed as a
percentage. Each column of this table therefore totals 100.
Cos2: The quality of representation
of each assessor’s dimensions on the global MFA. High values approach 1; low
values approach 0, indicating poor quality.
Graph: Provides a plot of each assessor’s dimensions on
to the first two dimensions given by the MFA.
- Data tab: The data used for the napping analysis.
- Information tab: Provides information about the analysis requested.
For Sorted Napping, two additional tabs are shown:
Words tab: Behind the scenes,
EyeOpenR transforms the uploaded data to a frequency count of products by
words. This frequency table is then treated as a supplementary table in the
analysis and does not impact the construction of the global MFA dimensions:
Coord: The coordinates of
each word across global MFA dimensions.
Cos2: The quality of
representation of each word across the dimensions from the global MFA.
Graph: A two-dimensional plot of the word coordinates.
Sensory Overlay Plot: Overlays the words from
the Sorted Napping task on to the global products graph (i.e., the consensus
product graph across all assessors). The user should be aware that descriptor
(word) information has not been used in the construction of the MFA dimensions. (Available from version 5.5.4)
R packages: FactoMineR
In the sensory overlay plot the scaling between
the words and products is not uniquely determined. The heuristic we use to aid
interpretation is to rescale the mean variance of the words to match that of