Napping Analysis

Purpose

To provide an analysis of data collected using the napping methodology. 

Data Format

  1. Napping.xlsx

For EyeOpenR to read your data the first five columns must include the following in the specified order: Assessor, Product, Session, Replica and Sequence.

Assessor and Product columns can be of character (e.g., “Assessor 01”, “Product A”), or numeric (“1”) format. Columns Session, Replica and Sequence should contain only numeric values. If there is no session, replica or sequence information available, the user should input a value of “1” in each cell in the column that contains no collected information.

Data collected from a Napping study occupy two or three columns, depending on whether assessors provided descriptive words, and are appended to the five columns stated above. Napping data should therefore start from column six (column F).

In Napping, each assessor provides two pieces of information per product, the X and Y coordinates, which are given in two separate columns (columns F and G). For Sorted Napping this is extended with an additional column containing descriptors (column H). All descriptors for a product should be comma-separated and placed within the same cell. For an example, please see the dataset ‘Napping.xlsx’.
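The column layout described above can be sketched as a small row parser. This is only an illustration in Python, not part of EyeOpenR: the names `NappingRow` and `parse_row` are hypothetical, and the row is assumed to arrive as a plain list of cell values (columns A to G, or A to H for Sorted Napping).

```python
from typing import List, NamedTuple, Sequence


class NappingRow(NamedTuple):
    """One assessor-by-product record (hypothetical helper type)."""
    assessor: str
    product: str
    session: int
    replica: int
    sequence: int
    x: float
    y: float
    descriptors: List[str]


def parse_row(cells: Sequence[object]) -> NappingRow:
    """Convert one spreadsheet row (columns A-G or A-H) into a NappingRow."""
    if len(cells) not in (7, 8):
        raise ValueError("expected 7 columns (Napping) or 8 (Sorted Napping)")
    # Columns A-B: Assessor and Product may be text or numbers.
    assessor, product = str(cells[0]), str(cells[1])
    # Columns C-E: Session, Replica and Sequence must be numeric.
    session, replica, sequence = (int(c) for c in cells[2:5])
    # Columns F-G: X and Y coordinates.
    x, y = float(cells[5]), float(cells[6])
    # Column H (optional): comma-separated descriptors in a single cell.
    words: List[str] = []
    if len(cells) == 8 and cells[7] is not None:
        words = [w.strip() for w in str(cells[7]).split(",") if w.strip()]
    return NappingRow(assessor, product, session, replica, sequence, x, y, words)


row = parse_row(["Assessor 01", "Product A", 1, 1, 1, 12.5, 30.0, "Nutty, Earthy"])
print(row.descriptors)
```

A row without column H is parsed the same way and simply gets an empty descriptor list.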

Background

Napping

Napping is a rapid methodology that has been used by sensory and consumer scientists for approximately two decades. It is a simple method in terms of instructions to assessors, with minimal training of assessors required: on either a sheet of paper or digital screen, each assessor is asked to place samples that are perceived similar close to each other and to place samples that are perceived to be more different further away from each other (Dehlholm et al., 2013). There are no right or wrong answers and no need for any panel calibration prior to data collection. Each assessor is free to place the samples according to their perception. As a result, the data from each assessor comprises X and Y coordinates of each product’s position. Analysis of the data from several assessors allows us to investigate if there is consensus in the partitioning of products between the assessors. An appropriate method to deal with data from assessors in this format is Multiple Factor Analysis (MFA).

MFA is used to obtain an overview across assessors of distances between samples. In other words, it helps us understand whether there are any consistent relationships between samples across the assessors. Besides providing information on how samples relate to one another, MFA also provides information on how well each assessor relates to this common overview.

One extension of the original Napping methodology is Sorted Napping, described in the following section.

Sorted Napping

Sorted Napping extends the Napping method by asking assessors to categorize samples that are similar to each other by providing descriptors that characterise the subset of samples. Most often, an assessor writes a word or two that describe closely located samples. This extension to Napping arose from the observation that assessors found it easier and more natural to describe a cluster of closely related samples than to describe each sample individually. So, two samples (A and B) positioned very close together could be encircled with the descriptors “Nutty” and “Earthy”, while two samples positioned further away yet close to one another (C and D) may be termed “Fruity”. Sorted Napping thereby provides categorical information (i.e., words) in addition to the numerical information (X and Y coordinates) given by traditional Napping. This categorical information is represented by an additional column in the Napping data uploaded to EyeOpenR, with descriptors for a given product placed in a single cell. The categorical information is treated as a supplementary variable in the EyeOpenR analysis. Therefore, the descriptors do not impact the construction of components by the MFA.

Analysis Considerations of Napping Data

Napping data consists of a series of data tables, one per assessor, each with two columns: the X and Y coordinates. The analyst wishes to preserve in the analysis the dimensions of the paper or screen used to gather the data (traditionally a 60 x 40 cm sheet of paper, or a 1.5:1 ratio on a digital screen). The data is therefore not standardized, as standardizing would give the two axes equal weight.

Napping data is analysed via MFA, which comprises two steps. In the first step, a covariance PCA (that is, a PCA on the unstandardized coordinates) is performed on each assessor’s data (X and Y coordinates). In the second step, MFA gives each assessor equal weight in the construction of an overall global representation. As a result, we access several global plots that show:
(i) the degree of consensus amongst the assessors
(ii) the degree to which each assessor is in consensus with the global view
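The two steps above can be sketched as follows. This is a minimal sketch that assumes the usual MFA weighting, in which each assessor's centred table is divided by the square root of the first eigenvalue of its own covariance PCA; the source does not spell out the exact computation EyeOpenR performs, and the function names and example maps are hypothetical. Variances use the divisor n.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def centre(points: List[Point]) -> List[Point]:
    """Subtract the column means; the axes are NOT scaled to unit variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return [(x - mean_x, y - mean_y) for x, y in points]


def first_eigenvalue(points: List[Point]) -> float:
    """Largest eigenvalue of the 2x2 covariance matrix of one assessor's map."""
    centred = centre(points)
    n = len(centred)
    var_x = sum(x * x for x, _ in centred) / n
    var_y = sum(y * y for _, y in centred) / n
    cov_xy = sum(x * y for x, y in centred) / n
    # Closed form for the symmetric matrix [[var_x, cov_xy], [cov_xy, var_y]].
    half_trace = (var_x + var_y) / 2
    return half_trace + math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)


def mfa_weighted(points: List[Point]) -> List[Point]:
    """Centre one assessor's map and divide it by the square root of lambda_1."""
    lambda_1 = first_eigenvalue(points)
    if lambda_1 == 0:
        raise ValueError("all products were placed at the same position")
    scale = math.sqrt(lambda_1)
    return [(x / scale, y / scale) for x, y in centre(points)]


# Hypothetical maps: one assessor used the whole sheet, another only a corner.
wide = [(5.0, 5.0), (55.0, 10.0), (30.0, 35.0), (10.0, 30.0)]
narrow = [(1.0, 1.0), (6.0, 2.0), (3.0, 4.0), (2.0, 3.0)]
for name, sheet in (("wide", wide), ("narrow", narrow)):
    print(name, round(first_eigenvalue(mfa_weighted(sheet)), 6))
```

After weighting, the first eigenvalue of every assessor's table is 1 (up to rounding), so an assessor who spread the products across the whole sheet cannot dominate one who used only a corner. The global representation is then obtained from a PCA of the weighted tables placed side by side.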

MFA is not restricted to Napping data; it is a convenient technique for handling numerous data tables. MFA can handle data tables of different types (e.g., a blend of numeric and categorical) as long as the information within any one table is of the same type. An example is Sorted Napping, where the descriptors form a supplementary data table. MFA is also offered in other areas of EyeOpenR that require analysis of numerous data tables that are potentially of different types.

Options

  1. Analysis method for the words: Choice of Correspondence Analysis (CA) or Principal Components Analysis (PCA). This option sets how the supplementary table of products by words is analysed. Each cell of that table holds the number of times a specific word was given for a specific product, counted across all assessors. CA is the most frequently used method for count-type data where one is interested in examining relationships between two variables (here products and descriptors). PCA is another method for analysing this table of products by words; it finds the main sources of variation in a data set. CA and PCA tend to give quite similar results if sparsely used words are removed from the CA. See the parameter below for setting the minimum word frequency.

  2. Minimum frequency of word: The minimum threshold of counts that should be met in order for that word to be kept in the analysis. This is mainly applicable if CA has been selected, as CA has the tendency to give words used infrequently a large distance from the origin and can thereby visualize relationships that may not be truly supported.

  3. Number of decimals for values: The number of decimal places shown for values in the results.
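The products-by-words count table and the minimum-frequency filter described in the options above can be illustrated as below. This is a sketch under assumptions: the function names are hypothetical, and the threshold is assumed to apply to a word's total count across all products, which the text does not state explicitly.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def word_counts(rows: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Build the products-by-words table.

    rows: (product, descriptor cell) pairs, one per assessor-by-product record.
    """
    table: Dict[str, Counter] = {}
    for product, cell in rows:
        counts = table.setdefault(product, Counter())
        for word in cell.split(","):
            word = word.strip()
            if word:
                counts[word] += 1
    return table


def drop_rare_words(table: Dict[str, Counter], min_frequency: int) -> Dict[str, Counter]:
    """Keep only words whose total count (assumed: summed over products) meets the threshold."""
    totals: Counter = Counter()
    for counts in table.values():
        totals.update(counts)
    kept = {word for word, total in totals.items() if total >= min_frequency}
    return {
        product: Counter({w: c for w, c in counts.items() if w in kept})
        for product, counts in table.items()
    }


# Hypothetical records from two assessors.
rows = [("A", "Nutty, Earthy"), ("B", "Nutty"), ("C", "Fruity"), ("A", "Nutty")]
table = word_counts(rows)
print(dict(table["A"]))
print(dict(drop_rare_words(table, min_frequency=2)["A"]))
```

Removing words such as “Earthy” here, used only once, is what keeps rarely used words from being placed far from the origin in a CA.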

Results and Interpretation

  1. Eigenvalues tab: Provides a table of eigenvalues, percentage of variance explained and the cumulative variation explained. A high proportion of variation explained in the first two dimensions suggests there is consensus in how assessors partition the products.

  2. Products tab: Provides information on the products tested via napping across assessors. 
    1. Coord: Provides the coordinates of the products from the global MFA (the global MFA is where each assessor has equal weight; see the Background section for more). These are visualized in the respective Graph tab.
    2. Contrib (contribution): Provides information on how each product is contributing to the construction of the MFA dimension: higher values imply the product is important in contributing to that dimension.
    3. Cos2 (squared cosine): Provides information on the quality of how well each product is represented on each dimension. Each product row sums to 1: higher values of a product on a dimension imply that that product is well represented; conversely lower values (nearer 0) imply that the quality of representation is poor and therefore one should be cautious in interpreting such products on the respective dimension.
    4. Graph: Provides a map of the product space and can be interpreted in similar fashion to a PCA Scores plot. The coordinates can be found in the respective Coord tab.

  3. Group tab: In Napping each assessor is a group. This tab therefore provides information on how the groups are related, in other words, relationships in consensus between assessors.
    1. Lg: Lg is a coefficient that measures the relationship between a single variable (one assessor) and a group of variables (a panel of assessors). In other words, it is a metric of how well each assessor is in consensus with the group of assessors. Thus, the Lg coefficient provides information on the amount of common structure between each pair of assessors, and between each assessor and the overall consensus (see the final column, ‘MFA’, for such information). Higher Lg values indicate more common structure.
    2. RV: The RV coefficient is one of the most important statistics from MFA, showing the degree of similarity between assessors. It can be thought of as a meta-correlation coefficient that ranges from 0 to 1. Values near 1 indicate a strong relation between two groups (in the context of napping, between two assessors). Conversely, values nearer 0 indicate that there is little or no common structure between assessors. The final column, ‘MFA’, provides the user with information on the RV coefficient between the assessor and the group consensus.
    3. Graph: Provides a representation of assessors on Dimensions 1 and 2. If assessors show a high value on Dimension 1 it means that Dimension 1 of the common structure (all assessors) is an important source of variation for that respective assessor (see Lê and Worch, 2015). In other words, that assessor is in alignment with the consensus view on Dimension 1. If an assessor has a low value on Dimension 1 but a high value on Dimension 2, it means that for that assessor Dimension 2 of the common structure is an important source of variation and is more important than Dimension 1 of the common structure. 

  4. Partial axes tab: The partial axes representation superimposes each assessor’s own dimensions (from a PCA of each assessor’s data) on to the common structure given by the global MFA with all assessors. As a result, the analyst can see how each assessor’s dimensions relate to those of the whole panel of assessors. Information is also provided for the dimensions of supplementary information (as in the case of Sorted Napping):
    1. Coord and Cor: Here, both the coordinates and the correlation tables present the same information, namely the correlation of each assessor’s dimensions with those from the global MFA.
    2. Contrib: Provides information on which individual assessor dimensions contribute most to the formation of each global dimension. Each global dimension (each column) sums to 100%: a higher value indicates that the respective assessor contributes strongly to the dimension.
    3. Graph: Visualisation of the correlation table. Each assessor’s dimensions are superimposed onto the global consensus. If Sorted Napping has been used, then dimensions from the word analysis are also included.

  5. Variables tab: Information relating to each variable of the napping data, most likely each assessor’s X and Y coordinates:
    1. Coord: These are the factor loadings related to the MFA. It is possible to see what X and Y coordinates from each assessor load strongly on the MFA dimensions.
    2. Contrib: Contribution of each assessor’s dimensions to the construction of each MFA dimension, expressed as a percentage. Each column of this table therefore totals 100.
    3. Cos2: The quality of representation of each assessor’s dimensions on the global MFA. High values approach 1; low values approach 0, indicating poor quality.
    4. Graph: Provides a plot of each assessor’s dimensions on to the first two dimensions given by the MFA.

  6. Data tab: The data used for the napping analysis.

  7. Information tab: Provides information about the analysis requested.
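To make the RV coefficient reported in the Group tab more concrete, the sketch below computes it for two assessors’ maps using the standard definition, RV = tr(W_A W_B) / sqrt(tr(W_A²) tr(W_B²)), where W = XXᵀ for a centred configuration X. The function names and example maps are hypothetical, and this is an illustration rather than the code EyeOpenR runs.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def centre(points: List[Point]) -> List[Point]:
    """Subtract the mean X and mean Y from every product position."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return [(x - mean_x, y - mean_y) for x, y in points]


def cross_product(a: List[Point], b: List[Point]) -> List[List[float]]:
    """Return the 2x2 matrix A'B for two centred n x 2 configurations."""
    return [
        [sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(2)]
        for i in range(2)
    ]


def frobenius_sq(matrix: List[List[float]]) -> float:
    """Sum of squared entries of a matrix."""
    return sum(v * v for row in matrix for v in row)


def rv_coefficient(map_a: List[Point], map_b: List[Point]) -> float:
    """RV between two maps of the same products, listed in the same order."""
    a, b = centre(map_a), centre(map_b)
    # tr(W_A W_B) equals the squared Frobenius norm of A'B.
    numerator = frobenius_sq(cross_product(a, b))
    denominator = math.sqrt(
        frobenius_sq(cross_product(a, a)) * frobenius_sq(cross_product(b, b))
    )
    return numerator / denominator


# Hypothetical maps: the second is the first rotated by 90 degrees and doubled in size.
assessor_1 = [(5.0, 5.0), (55.0, 10.0), (30.0, 35.0), (10.0, 30.0)]
assessor_2 = [(-2 * y, 2 * x) for x, y in assessor_1]
print(round(rv_coefficient(assessor_1, assessor_2), 6))
```

Because RV is unchanged by rotating, shifting or uniformly resizing a map, two assessors who arrange the products in the same relative pattern obtain an RV of 1 even if one of them turned the sheet or used a smaller area.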

For Sorted Napping, two additional tabs are shown:

  1. Words tab: Behind the scenes, EyeOpenR transforms the uploaded data to a frequency count of products by words. This frequency table is then treated as a supplementary table in the analysis and does not impact the construction of the global MFA dimensions:
    1. Coord: The coordinates of each word across global MFA dimensions.
    2. Cos2: The quality of representation of each word across the dimensions from the global MFA.
    3. Graph: A two-dimensional plot of the coordinates.
  2. Sensory Overlay Plot: Overlays the words from the Sorted Napping task on to the global products graph (i.e., the consensus product graph across all assessors). The user should be aware that descriptor (word) information has not been used in the construction of the MFA dimensions. (Available from version 5.5.4)

Technical Information

  1. R packages: FactoMineR
  2. In the sensory overlay plot the scaling between the words and products is not uniquely determined. The heuristic we use to aid interpretation is to rescale the mean variance of the words to match that of the products.
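One possible reading of the rescaling heuristic above is sketched here. It assumes that “mean variance” means the variance of the coordinates on each plotted dimension, averaged over the dimensions, and that the word coordinates are multiplied by a single common factor; both points are assumptions, and the function names are hypothetical.

```python
import math
from statistics import pvariance
from typing import List, Sequence, Tuple


def mean_variance(coords: Sequence[Sequence[float]]) -> float:
    """Average, over the plotted dimensions, of the variance of the coordinates."""
    dimensions = list(zip(*coords))
    return sum(pvariance(values) for values in dimensions) / len(dimensions)


def rescale_words(
    word_coords: Sequence[Sequence[float]],
    product_coords: Sequence[Sequence[float]],
) -> List[Tuple[float, ...]]:
    """Multiply word coordinates by one factor so their mean variance matches the products'."""
    factor = math.sqrt(mean_variance(product_coords) / mean_variance(word_coords))
    return [tuple(value * factor for value in point) for point in word_coords]


# Hypothetical coordinates on Dimensions 1 and 2.
products = [(-2.0, -1.0), (2.0, 1.0), (0.5, -1.5), (-0.5, 1.5)]
words = [(0.2, 0.1), (-0.3, 0.05), (0.1, -0.15)]
scaled = rescale_words(words, products)
print(round(mean_variance(scaled), 6), round(mean_variance(products), 6))
```

A single common factor changes only the spread of the word cloud, not the direction of any word from the origin, so the relative positions of the words are preserved.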



