Frequency Tables (Categorical Data)

Purpose 

Produce summary tables and charts of data per attribute and per product if desired.

This option is for categorical data, which can be nominal or interval data.

Data Format

  1. Categorical coffee.xlsx
  2. The analysis will ignore data of type ‘text’. Categorical text categories should be specified as ‘nominal’.

Background

One of the basic tasks of data analysis is to tabulate data in such a way that the distribution of responses can be understood, and differences between products can be demonstrated.
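
As a rough illustration of this kind of tabulation, the following Python sketch counts responses per class for each product and converts the counts to percentages. It is not the software's own implementation; the data, the function names and the 1-5 scale are assumptions made for the example.

```python
from collections import Counter

# Hypothetical ratings on a 1-5 scale: (product, score) pairs.
ratings = [
    ("A", 5), ("A", 4), ("A", 4), ("A", 2),
    ("B", 3), ("B", 3), ("B", 5), ("B", 1),
]

def frequency_table(rows, classes):
    """Count responses per class for each product; empty classes stay at 0."""
    counts = {}
    for product, score in rows:
        counts.setdefault(product, Counter())[score] += 1
    return {p: {c: counts[p][c] for c in classes} for p in sorted(counts)}

def percentage_table(freq):
    """Convert each product's counts to percentages of that product's total."""
    out = {}
    for product, row in freq.items():
        total = sum(row.values())
        out[product] = {c: 100.0 * n / total for c, n in row.items()}
    return out

classes = [1, 2, 3, 4, 5]
freq = frequency_table(ratings, classes)
pct = percentage_table(freq)

print(freq["A"])    # {1: 0, 2: 1, 3: 0, 4: 2, 5: 1}
print(pct["A"][4])  # 50.0
```

Listing every class explicitly keeps empty classes visible in the table, which makes differences between products easier to compare row by row.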

Options

  1. Treat Sessions/Replicates separately: If the data has been gathered over different sessions, or there are different replicates, these can be analysed separately.
  2. Choose Scale Type Used: ‘Automatic’ creates a scale based on the definition in the attributes’ meta-data. If any values fall outside the classes defined in the meta-data, the defined classes are ignored and one class is created per distinct value. If there is no attribute meta-data, one class is created per distinct value in the data set. ‘1-5’ creates 5 classes, ‘1-7’ creates 7 classes and ‘1-9’ creates 9 classes. If the wrong scale is chosen (e.g. 1-7 is chosen for 1-5 interval data), extra empty classes are created. If Automatic is not chosen, data that fall outside the specified scale are ignored.
  3. Show Frequencies by Product: If ‘Yes’, the data summaries are created by Product. If ‘No’, the data summaries are for the full data set.
  4. Show Percentages: If ‘Yes’, a table of the frequencies as percentages is created. If ‘No’, the table is not created.
  5. Show Total: If ‘Yes’, a table of frequencies with a total is created. If ‘No’, this table is not created. The total is the total frequency per class, across all products.
  6. Sort Results from high to low: If ‘Yes’, the tables are sorted by class, with the class with the highest values on the scale as the first row of the table. If ‘No’, the tables are sorted by class, with the class with the lowest values on the scale as the first row.
  7. Type of Mean: Adjusted / Arithmetic. The type of mean that should be computed. ‘Adjusted’ takes into account missing data or imbalance in the design. The model attribute ~ product + assessor is used to calculate the adjusted mean. ‘Arithmetic’ calculates the mean of the data and is recommended for balanced data.
  8. TopBox: The number of values to be included in the TopBox is specified. E.g. If 2 is selected then a Top2Box category containing the top 2 classes is created. Which categories are ‘top’ depends on whether the results are sorted from high to low or low to high. The number of classes in the TopBox must be less than or equal to the total number of classes in the data set. 
  9. BottomBox: The number of values to be included in the BottomBox is specified. E.g. If 2 is selected then a Bottom2Box category containing the bottom 2 classes is created. Which categories are ‘bottom’ depends on whether the results are sorted from high to low or low to high. The number of classes in the BottomBox must be less than or equal to the total number of classes in the data set. TopBox and BottomBox categories can overlap. 
  10. MiddleBox: If both TopBox and BottomBox are selected and MiddleBox is ‘Yes’, any classes that are in neither the TopBox nor the BottomBox are put in the MiddleBox category.
  11. Significance Test: If TopBox or BottomBox is specified and Show Frequencies by Product is ‘Yes’, a significance test is performed. If the data is from a monadic test, the prop.test function from the R stats package is used to compare the proportion of data in the top box / not in the top box across pairs of products. If the data is from a sequential test, a Cochran’s Q test is carried out using the symmetry_test function from the coin package to compare the proportions of data in the top box / not in the top box, with product as a factor and assessor as a block factor in the formula passed to the function. The quadratic test statistic is used. If this symmetry test finds significant asymmetry in the top-box / not-in-top-box proportions across the products, a pairwise McNemar test on these proportions is carried out for each pair of products, using the binom.test function from the stats package. The equivalent significance tests are performed on the bottom box and the middle box if these are specified.
  12. Display of Multiple Comparison Test Results: The user can select ‘Pairwise’ or ‘Group’. The choice is reflected in the TopBox with Significance table, which displays significant differences between products, per attribute. ‘Pairwise’ summarises the significance level associated with each paired comparison, presented in a table. Use this option if you wish to read pairwise comparisons between products. ‘Group’ assigns each product, per attribute, to a group based on the significance testing: products that do not share a group are statistically different at the chosen level of significance.
  13. Levels of significance (group): Only applicable if the significance test results are displayed at the group level. The user can select 1%, 5%, 10% or 20%. The percentages refer to the alpha level (risk of Type I error).
  14. Levels of significance (pairwise): Only applicable if the significance test results are displayed at the pairwise level. The user can choose several levels of significance, which are presented in a summary table in the output.
  15. Number of decimals for Values. Required number of decimals for values given in the results.
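
The Scale Type and Top/Bottom/MiddleBox options above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, ‘Automatic’ is approximated by the no-meta-data case (one class per distinct value), and "top" is taken to mean the highest values on the scale.

```python
def build_classes(values, scale=None):
    """Return ordered classes: 1..scale for a fixed scale, else one per distinct value."""
    if scale is None:  # stands in for 'Automatic' when no meta-data is available
        return sorted(set(values))
    return list(range(1, scale + 1))

def box_counts(values, classes, top=0, bottom=0, middle=False):
    """Count responses in the Top/Bottom/Middle boxes; out-of-scale values are ignored."""
    if top > len(classes) or bottom > len(classes):
        raise ValueError("box size cannot exceed the number of classes")
    kept = [v for v in values if v in classes]
    # Assumption: the top box holds the highest classes, the bottom box the lowest.
    top_set = set(classes[len(classes) - top:]) if top else set()
    bottom_set = set(classes[:bottom])
    result = {}
    if top:
        result[f"Top{top}Box"] = sum(v in top_set for v in kept)
    if bottom:
        result[f"Bottom{bottom}Box"] = sum(v in bottom_set for v in kept)
    # The middle box only exists when both boxes are set and some classes remain.
    middle_set = set(classes) - top_set - bottom_set
    if middle and top and bottom and middle_set:
        result["MiddleBox"] = sum(v in middle_set for v in kept)
    return result

scores = [1, 2, 2, 3, 4, 4, 5, 5, 5, 7]  # the 7 falls outside a 1-5 scale
classes = build_classes(scores, scale=5)
boxes = box_counts(scores, classes, top=2, bottom=2, middle=True)

print(boxes)                  # {'Top2Box': 5, 'Bottom2Box': 3, 'MiddleBox': 1}
print(build_classes(scores))  # [1, 2, 3, 4, 5, 7]
```

With a fixed 1-5 scale the out-of-range value 7 is dropped, whereas the automatic fallback creates a sixth class for it, mirroring the behaviour described for the Scale Type option.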

Results and Interpretation

  1. Frequency. Tabulates the number of results in each of the classes as defined in the options. If Show Frequency by Product is chosen, the counts are per product. The total N in each class and the mean of each class are given (if the data are numerical). If Adjusted mean is selected, the model attribute ~ product + assessor is used to calculate the adjusted mean. If the TopNBox, BottomNBox and / or MiddleNBox options are selected, the total count of results in each box is given, again per product if Frequency by Product is chosen.
  2. Frequencies Plot. Shows a bar chart of the number of results in each class. If Show Frequency by Product is chosen, then the bar chart is per product. To hide / show a class in the chart, click on that class in the plot legend. The plot does not include Top, Middle or BottomBox totals.
  3. Frequency percentages. Shown if Show Percentages is set to Yes. Tabulates the number of results in each class as a percentage of the total. The percentages in each column sum to 100%. If Show Frequency by Product is chosen, the percentages are per product.
  4. Top/Middle/Bottom boxes. Shown if TopBox, MiddleBox or BottomBox is specified. Tabulates the number of results in each box and in any classes that are not contained in a box.
  5. TopBox with Significance. Shown if Significance Test is set to Yes. Displays the percentage of results in each class and in each box. The significance tests defined in the options are performed and the results are displayed in this table. If Display of Multiple Comparison Tests is Pairwise, the table indicates where a product's class is significantly different from the same class for other products. For example, if the TopBox for Product C is significantly different from the TopBox for Products A and B, the letters A, B are displayed on the TopBox for Product C. If a product's class is not significantly different from that of any other product, nothing is displayed on that class. If Display of Multiple Comparison Tests is Group, the table indicates which group each product's class is in. For example, if the TopBox for Products A and B is significantly lower than for Product C, then Product C is in group A and Products A and B are in group B. If there are no significant differences, all products are in group A.
  6. Frequency with Total. Shown if Show Total is set to Yes. Tabulates the number of results in each of the classes as defined in the options. If Show Frequency by Product is chosen, the counts are per product. An extra column, Total, gives the total number of results in each class across all products.
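
The significance tests behind the TopBox with Significance table can be approximated in plain Python, as sketched below. This is not the R code the software runs: the two-proportion test mirrors the Yates-corrected chi-square that prop.test applies by default, Cochran's Q uses the textbook formula with an asymptotic chi-square p-value (coin's symmetry_test may differ in detail), and the McNemar test is the exact binomial form. All function names and data are hypothetical.

```python
from math import comb, erfc, exp, pi, sqrt

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return exp(-x / 2) * total
    term, total = sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return erfc(sqrt(x / 2)) + sqrt(2 / pi) * exp(-x / 2) * total

def two_proportion_test(x1, n1, x2, n2):
    """Monadic case: Yates-corrected chi-square test that two proportions are equal."""
    p = (x1 + x2) / (n1 + n2)
    if p in (0.0, 1.0):
        return 1.0  # assumption: treat identical all-or-nothing results as no evidence
    yates = min(0.5, abs(x1 / n1 - x2 / n2) / (1 / n1 + 1 / n2))
    stat = 0.0
    for x, n in ((x1, n1), (x2, n2)):
        for observed, expected in ((x, n * p), (n - x, n * (1 - p))):
            stat += (abs(observed - expected) - yates) ** 2 / expected
    return chi2_sf(stat, 1)

def cochran_q(table):
    """Sequential case: Cochran's Q on rows of 0/1 values (one row per assessor)."""
    k = len(table[0])
    col = [sum(row[j] for row in table) for j in range(k)]
    row_tot = [sum(row) for row in table]
    total = sum(row_tot)
    denom = k * total - sum(r * r for r in row_tot)
    if denom == 0:
        return 0.0, 1.0
    q = (k - 1) * (k * sum(c * c for c in col) - total ** 2) / denom
    return q, chi2_sf(q, k - 1)

def mcnemar_exact(a, b):
    """Exact two-sided McNemar test from paired 0/1 top-box indicators."""
    only_a = sum(1 for u, v in zip(a, b) if u and not v)
    only_b = sum(1 for u, v in zip(a, b) if v and not u)
    n = only_a + only_b
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(only_a, only_b) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Monadic: 30 of 50 top-box responses for one product versus 18 of 50 for another.
p_monadic = two_proportion_test(30, 50, 18, 50)

# Sequential: each row is one assessor, columns are top-box flags for products A, B, C.
table = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 0],
         [1, 1, 1], [1, 0, 0], [0, 0, 0], [1, 1, 0]]
q, p_q = cochran_q(table)
col_a, col_b, col_c = ([row[j] for row in table] for j in range(3))
p_ab = mcnemar_exact(col_a, col_b)
p_ac = mcnemar_exact(col_a, col_c)

print(q)     # 9.0
print(p_ab)  # 0.25
print(p_ac)  # 0.03125
```

In this made-up data set the overall Q test is significant at the 5% level, so the pairwise tests are then consulted: A differs from C at the 5% level, while A and B do not differ.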

Technical Information

  1. SensoMineR
  2. coin symmetry_test, statistic function
  3. binom.test
  4. R function settings that are not otherwise visible to the user



