The first stage in understanding the data in a sensory or consumer study is to summarise it using descriptive statistics. For each product and attribute (or consumer liking) there will be many scores arising from multiple assessors and potentially sessions and replicates. This module enables an initial assessment of the data and can be useful for understanding whether missing values are a problem, or how the range of scores has been used.
For EyeOpenR to read your data, the first five columns of the ‘Data’ sheet must be in the following order: assessor (consumer), product, session, replicate and order (sequence). For sensory analysis the data for attributes should be in the sixth column (column F) onwards. There should be one column for each attribute. The attributes data should be numeric. For consumer analysis the data for consumer liking and other consumer assessments (ratings) should be in the sixth column (column F) onwards. There should be one column for each rating. These are described as attributes in the options.
If there is no session, replicate or order information then these columns should contain the value ‘1’ in each cell.
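The layout rule above can be sketched in a few lines of Python. This is only an illustration, not part of EyeOpenR: the column names, the attribute names (`sweet`, `bitter`) and the helper `fill_design_columns` are assumptions, since the text only fixes the order of the first five columns.

```python
import csv
import io

# Hypothetical header names; only the column ORDER comes from the text above.
KEY_COLUMNS = ["assessor", "product", "session", "replicate", "order"]

def fill_design_columns(rows):
    """Return copies of the rows with blank session/replicate/order cells set to '1'."""
    filled = []
    for row in rows:
        new_row = dict(row)
        for key in ("session", "replicate", "order"):
            if not (new_row.get(key) or "").strip():
                new_row[key] = "1"
        filled.append(new_row)
    return filled

# Made-up data with no session, replicate or order information.
raw = io.StringIO(
    "assessor,product,session,replicate,order,sweet,bitter\n"
    "A1,P1,,,,6.5,2.0\n"
    "A1,P2,,,,4.0,3.5\n"
)
reader = csv.DictReader(raw)
header_ok = reader.fieldnames[:5] == KEY_COLUMNS  # first five columns in the required order
rows = fill_design_columns(list(reader))
```

Attribute columns start at the sixth position, so anything after `order` is treated as an attribute or rating.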
Additional information about the data in the ‘Data’ sheet can be included in additional sheets. The ‘Attributes’ sheet can be used to specify the names of the attributes, data types and minimum and maximum values that are used to check data quality. The ‘Assessors’ sheet can be used to specify assessor names if codes are used in the ‘Data’ sheet. Similarly, the ‘Products’ sheet can be used to specify product names if codes are used in the ‘Data’ sheet. See the example spreadsheet for an illustration of the data format.
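As an illustration of how minimum and maximum values might be used to check data quality, the sketch below flags scores that fall outside the declared range. The dictionary layout, the attribute names and the function `out_of_range` are hypothetical. How EyeOpenR performs this check internally is not described here.

```python
# Hypothetical limits, as they might be declared in an 'Attributes' sheet.
limits = {"sweet": (0.0, 10.0), "bitter": (0.0, 10.0)}

def out_of_range(attribute, scores, limits):
    """Return the non-missing scores that lie outside the declared [min, max] range."""
    low, high = limits[attribute]
    return [s for s in scores if s is not None and not (low <= s <= high)]

# Made-up scores; None marks a missing value and is not treated as a range error.
flagged = out_of_range("sweet", [6.5, 11.0, None, -1.0, 4.0], limits)
```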
The descriptive statistics module creates the following summary statistics: number of observations, number of missing values, minimum, maximum, mean, standard deviation, median, variance and standard error of the mean for each product and attribute.
The median is an average that describes the central value when all assessments are ranked in order. It is less sensitive than the mean to unusually high or low values.
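A small made-up example shows the difference in sensitivity. The scores are invented for illustration. Replacing one score with an extreme value moves the mean substantially but leaves the median unchanged.

```python
import statistics

typical = [5, 6, 6, 7, 7]        # made-up scores
with_outlier = [5, 6, 6, 7, 70]  # same scores with one unusually high value

mean_typical = statistics.mean(typical)
mean_outlier = statistics.mean(with_outlier)
median_typical = statistics.median(typical)
median_outlier = statistics.median(with_outlier)
```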
The standard deviation is a measure of the variation between assessments. The variance is the standard deviation squared, another way of expressing the variation in the scores.
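The standard error of the mean describes the uncertainty in the mean assessment. It is conventionally the standard deviation divided by the square root of the number of observations, so it becomes smaller as more assessments are collected.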
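The statistics listed above can be sketched for a single product and attribute as follows. This is a minimal illustration and not the module's own code: the function name `describe` is hypothetical, missing values are represented as `None`, and the sample (n - 1) form of the variance is assumed because the text does not say which form is used.

```python
import math
import statistics

def describe(scores):
    """Summarise one product/attribute column; None marks a missing value.

    Assumes at least two non-missing scores and the sample (n - 1) variance.
    """
    observed = [s for s in scores if s is not None]
    n = len(observed)
    variance = statistics.variance(observed)  # sample variance, divisor n - 1
    sd = math.sqrt(variance)
    return {
        "n": n,
        "missing": len(scores) - n,
        "min": min(observed),
        "max": max(observed),
        "mean": statistics.mean(observed),
        "median": statistics.median(observed),
        "sd": sd,
        "variance": variance,
        "sem": sd / math.sqrt(n),  # standard error of the mean
    }

# Made-up scores with one missing value.
summary = describe([2, 4, 4, None, 4, 5, 5, 7, 9])
```

Missing values are excluded before any statistic is computed, so the number of observations and the number of missing values always add up to the number of rows supplied.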
There is one table for each product in the data, and one table for the assessments of all products combined (‘Overall’). These are selected by clicking on the box that describes the product (or ‘Overall’ for the combined results). Each table has a row for each attribute in the data and a column for each descriptive statistic.
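The table structure described above (one table per product plus an ‘Overall’ table, with a row per attribute) can be mimicked with nested dictionaries. The records, product codes and attribute names below are invented, and only the mean is computed to keep the sketch short.

```python
from collections import defaultdict
import statistics

# Hypothetical long-format records: (product, attribute, score).
records = [
    ("P1", "sweet", 6.0), ("P1", "sweet", 7.0),
    ("P2", "sweet", 3.0), ("P2", "sweet", 5.0),
    ("P1", "bitter", 2.0), ("P2", "bitter", 4.0),
]

# tables[table_name][attribute] -> list of scores in that table row.
tables = defaultdict(lambda: defaultdict(list))
for product, attribute, score in records:
    tables[product][attribute].append(score)    # per-product table
    tables["Overall"][attribute].append(score)  # all products combined

# One row per attribute, here with a single statistic (the mean).
means = {
    table: {attr: statistics.mean(vals) for attr, vals in attrs.items()}
    for table, attrs in tables.items()
}
```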
Some of the columns in the table are always reported and others are optional and selected through the options. The first column reports the number of non-missing assessments included in that row of the table. This is important for understanding how much data is being summarised. For the ‘Overall’ table in a balanced design with no missing values it will be the number of products multiplied by the number of assessors, multiplied by the number of sessions and the number of replicates. The other columns that are always reported are the number of missing values and the mean.
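The expected count for the ‘Overall’ table in a balanced design is a simple product, shown here as a worked example. The design sizes are made up, and the function name `expected_overall_n` is hypothetical.

```python
def expected_overall_n(products, assessors, sessions=1, replicates=1):
    """Expected non-missing count in the 'Overall' table for a balanced design."""
    return products * assessors * sessions * replicates

# Made-up design: 4 products, 10 assessors, 2 sessions, 3 replicates.
overall_n = expected_overall_n(products=4, assessors=10, sessions=2, replicates=3)

# In the same balanced design, each single-product table holds an equal share.
per_product_n = overall_n // 4
```

A reported count lower than this value in any row indicates missing assessments for that attribute.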
If the option to ‘Split results on’ has been selected there will be an additional set of boxes at the top of the display representing the different levels of the split variable. Selecting different values of the split variable will change the table to reflect the choice.
R packages used: