Multiple Factor Analysis (MFA)

Purpose

Performs a multiple factor analysis (MFA) on a data set.  This is not ‘Factor Analysis’ in the classic statistical sense, but a method for handling multiple groups of variables that are all measured on the same samples, and for deriving from them a common multivariate map of the sample space.  The type of variable can differ between groups: each group may be continuous, categorical or frequency based.  MFA is commonly applied in one of two scenarios.  In the first, the variable groups represent different measurement types, for example analytical, consumer and sensory measurements all made on the same samples.  In the second, each variable group represents a different subject or assessor.  This applies especially to free choice profiling, where assessors choose their own attributes and so the standard mapping approach of running a PCA on the panel means is not an option.  Napping also falls into this second category because the X-Y coordinates are unique to each assessor.

Data Format

The Excel file used with EyeOpenR should be in the standard format, with two extra columns added to the attributes worksheet, called ‘group’ and ‘type’.  The group column should contain a label in each row giving the group name of the corresponding variable.  Variables belonging to the same group do not need to be next to each other in the data set.  Each cell in the type column can either be empty or contain one of the following options: ‘standardized’, ‘centered’, ‘nominal’ or ‘frequency’.  These strings must be spelled exactly as shown, in lower case and without the quote marks.  Only one variable in each group needs a type string; that type is then applied to all variables in the group.  The type determines how the software handles the variables in each group:

  1. ‘centered’ refers to continuous variables that are mean centered prior to analysis. 
  2. ‘standardized’ refers to continuous variables that are centered and scaled prior to analysis.
  3. ‘nominal’ refers to variables that are categorical.
  4. ‘frequency’ refers to variables that are counts of events.
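The rule that one type string per group is enough can be illustrated with a small sketch.  This is not EyeOpenR's own code: the function name `resolve_group_types`, the tuple layout of the rows and the choice to raise an error on a misspelled or conflicting type are all assumptions made for illustration.

```python
# Hypothetical sketch: derive one type per group from the attributes worksheet.
VALID_TYPES = {"standardized", "centered", "nominal", "frequency"}


def resolve_group_types(attributes):
    """Map each group name to the single type declared for it.

    `attributes` is a list of (variable, group, type) tuples read from the
    attributes worksheet; `type` is an empty string when the cell is blank.
    A group with no type string at all maps to None (how the software treats
    that case is not covered here).
    """
    group_types = {}
    for variable, group, var_type in attributes:
        group_types.setdefault(group, None)
        if var_type == "":
            continue  # blank cell: the type comes from another row of the group
        if var_type not in VALID_TYPES:
            # Strings must match exactly, in lower case.
            raise ValueError(f"{variable}: unrecognised type {var_type!r}")
        declared = group_types[group]
        if declared is not None and declared != var_type:
            raise ValueError(f"group {group!r} has conflicting types")
        group_types[group] = var_type
    return group_types


# Variables of the same group do not need to be adjacent.
rows = [
    ("sweet", "sensory", "standardized"),
    ("pH", "analytical", ""),
    ("bitter", "sensory", ""),
    ("sugar_g", "analytical", "centered"),
]
types = resolve_group_types(rows)
```

Here `types` assigns ‘standardized’ to every sensory variable and ‘centered’ to every analytical variable, even though only one row per group carries a type string.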

Background

MFA is a hybrid method that proceeds in two steps.  First, a PCA is computed on each group of numeric variables, or an MCA on each group of categorical variables.  Second, each group is weighted by dividing its variables by the square root of its own first eigenvalue, and all the weighted variables are then combined in a final PCA.  After this weighting every group has a first eigenvalue of one, so no single variable group can dominate the first dimension of the final PCA.
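The weighting step can be sketched with NumPy for the simplest case, where every group contains continuous variables of type ‘standardized’.  This is a minimal illustration under that assumption and not the FactoMineR implementation; the function names `first_eigenvalue` and `mfa_sketch` and the random example data are made up for the example, and categorical or frequency groups are not handled.

```python
import numpy as np


def first_eigenvalue(block):
    """Largest eigenvalue of the PCA of one preprocessed block (row weights 1/n)."""
    n = block.shape[0]
    singular_values = np.linalg.svd(block, compute_uv=False)
    return singular_values[0] ** 2 / n


def mfa_sketch(groups):
    """groups: list of (n_samples, n_variables) arrays of continuous variables."""
    weighted = []
    for block in groups:
        centred = block - block.mean(axis=0)
        standardized = centred / centred.std(axis=0)
        # Divide the whole group by the square root of its first eigenvalue.
        weighted.append(standardized / np.sqrt(first_eigenvalue(standardized)))
    # Final PCA on all weighted variables side by side.
    combined = np.hstack(weighted)
    n = combined.shape[0]
    u, s, _ = np.linalg.svd(combined, full_matrices=False)
    eigenvalues = s ** 2 / n
    scores = u * s  # sample coordinates on the dimensions of the final PCA
    return weighted, eigenvalues, scores


rng = np.random.default_rng(0)
sensory = rng.normal(size=(8, 5))      # 8 samples, 5 sensory variables
analytical = rng.normal(size=(8, 2))   # same 8 samples, 2 analytical variables
weighted, eigenvalues, scores = mfa_sketch([sensory, analytical])
```

After weighting, each group has a first eigenvalue of one regardless of how many variables it contains, so the first eigenvalue of the final PCA lies between one and the number of groups.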

Options

  1. Data to Analyse – When the option ‘Run on imported Data’ is used, the software assumes that each row of the data worksheet is a unique product in the analysis.  If the option ‘Compute Table of Means’ is used instead, the software averages across all the panellists, sessions and replicates for each product before running the MFA.
  2. Type of Mean – only applies when ‘Compute Table of Means’ is selected above and allows the user to choose between simple arithmetic means, or adjusted means that account for the unequal replication of each product in the design.  If the design is balanced, then there will be no difference between the two options.  
  3. Number of decimals for values – controls the number of decimals printed in all numeric output.
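As a rough illustration of the ‘Compute Table of Means’ option with simple arithmetic means, the sketch below averages every row belonging to a product.  The function name `table_of_means`, the column names and the example values are assumptions for illustration only; adjusted means, which require fitting a model to the design, are not shown.

```python
from collections import defaultdict
from statistics import mean


def table_of_means(rows, attributes):
    """Arithmetic mean of each attribute per product.

    `rows` is a list of dicts, one per evaluation, each holding a 'Product'
    key plus one key per attribute (hypothetical column names).
    """
    by_product = defaultdict(list)
    for row in rows:
        by_product[row["Product"]].append(row)
    return {
        product: {a: mean(r[a] for r in evaluations) for a in attributes}
        for product, evaluations in by_product.items()
    }


# Unbalanced example: P2 has one more evaluation than P1.
rows = [
    {"Assessor": "A1", "Product": "P1", "Replica": 1, "sweet": 4.0, "bitter": 2.0},
    {"Assessor": "A2", "Product": "P1", "Replica": 1, "sweet": 6.0, "bitter": 3.0},
    {"Assessor": "A1", "Product": "P2", "Replica": 1, "sweet": 1.0, "bitter": 7.0},
    {"Assessor": "A2", "Product": "P2", "Replica": 1, "sweet": 2.0, "bitter": 8.0},
    {"Assessor": "A2", "Product": "P2", "Replica": 2, "sweet": 3.0, "bitter": 9.0},
]
means = table_of_means(rows, ["sweet", "bitter"])
```

The result has one row per product, which is the shape the MFA expects when it is run on a table of means.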

Results and Interpretation

  1. The ‘Eigenvalue’ tab summarises the amount of information that is explained by each dimension in the final PCA of the weighted group variables. 
  2. The ‘Products’ tab gives the following sub-tables:
    1. Coord – product scores on each dimension.
    2. Contribution – the relative contribution of each product to each dimension (the contributions in each column sum to 100).
    3. Cos2 – the quality of representation of each product on each dimension (the cos2 values in each row sum to one across all dimensions).
    4. Graph – plot of the scores on the first two dimensions.
  3. The ‘Group’ tab gives the following sub-tables:
    1. Lg – the Lg coefficient is a measure of association between groups of variables.  The higher the number, the stronger the association.
    2. RV – the RV coefficient is a measure of correlation between groups of variables and varies between 0 and 1.
    3. For both Lg and RV tables the final row/column labelled ‘MFA’ corresponds to the combined matrix containing all the groups of variables.
    4. Graph – a plot showing relationships between groups.  The group coordinates are Lg coefficients between each group of variables and the scores on the first two dimensions.
  4. The ‘Partial axes’ tab shows the relationships between the dimensions derived from individual groups and the dimensions derived from the final PCA. 
    1. Coord – the coordinates of the group dimensions on the dimensions of the final PCA 
    2. Cor – the correlation coefficients between the group dimensions and the dimensions of the final PCA.
    3. Contribution – the relative contribution of each group dimension to each dimension of the final PCA (the contributions in each column sum to 100).
    4. Graph – a plot of the coordinates on the first two dimensions.
  5. The ‘Variables’ tab shows the relationships between the variables from the individual groups and the dimensions derived from the final PCA.
    1. Coord – the coordinates of the individual group variables on the dimensions of the final PCA
    2. Contribution – the relative contribution of each group variable to each dimension of the final PCA (the contributions in each column sum to 100). 
    3. Cos2 – the quality of representation of each group variable on the dimensions of the final PCA (the cos2 values in each row sum to one across all dimensions).
    4. Graph – a plot of the coordinates on the first two dimensions.
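The contribution, cos2 and RV quantities listed above follow simple formulas, sketched here with NumPy.  This is an illustration and not the software's own code: the function names and random example data are made up, the contribution formula assumes equally weighted rows (as for products), and the cos2 formula assumes the coordinates on all dimensions are supplied.

```python
import numpy as np


def contributions(coord):
    """Percentage contribution of each row to each dimension.

    Assumes equally weighted rows; each column of the result sums to 100.
    """
    squared = coord ** 2
    return 100 * squared / squared.sum(axis=0)


def cos2(coord):
    """Squared cosine of each row on each dimension.

    Each row sums to one only when `coord` holds every dimension.
    """
    squared = coord ** 2
    return squared / squared.sum(axis=1, keepdims=True)


def rv_coefficient(x, y):
    """RV coefficient between two groups measured on the same samples."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    wx = xc @ xc.T  # sample-by-sample cross-product matrix of group x
    wy = yc @ yc.T
    return np.trace(wx @ wy) / np.sqrt(np.trace(wx @ wx) * np.trace(wy @ wy))


rng = np.random.default_rng(1)
coord = rng.normal(size=(6, 3))  # 6 products on 3 dimensions (made-up scores)
x = rng.normal(size=(6, 4))      # two made-up groups on the same 6 products
y = rng.normal(size=(6, 2))

contrib_table = contributions(coord)
cos2_table = cos2(coord)
rv_xy = rv_coefficient(x, y)
```

An RV of one means the two groups give the same configuration of samples up to rotation and scaling, which is why a group compared with itself always gives one.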

Technical Information

  1. The function MFA from the R package ‘FactoMineR’ is used to carry out the analysis.

References

  1. Escofier, B. and Pagès, J. (1994) “Multiple factor analysis” Computational Statistics & Data Analysis, vol 18, pp 121–140.
  2. Pagès, J. (2005) “Collection and analysis of perceived product inter-distances using multiple factor analysis: Application to the study of 10 white wines from the Loire valley” Food Quality and Preference, vol 16, pp 642–649.
  3. Pagès, J. (2014) “Multiple Factor Analysis by Example Using R” CRC Press (ISBN 9780429171086)


