Purpose
Performs a multiple factor analysis (MFA) on a data set. This is not ‘Factor
Analysis’ in the classic statistical sense, but rather a method for handling
multiple groups of variables that are all measured on the same samples, and for
deriving from this information a common multivariate map of the sample space.
There is flexibility regarding the type of variables in each group: a group may
be continuous, categorical or frequency based. MFA is commonly applied in one of
two scenarios. In the first, the variable groups represent different measurement
types, for example analytical, consumer and sensory measurements all made on the
same samples. In the second, each variable group represents a different subject
or assessor, especially in a scenario such as free choice profiling, where
assessors are free to choose their own attributes and so the standard mapping
approach of running a PCA on the panel means is not an option. Napping also
falls into this second category, because the X-Y coordinates are unique to each
assessor.
The Excel file used with EyeOpenR should be in the standard format, with two
extra columns, called ‘group’ and ‘type’, added to the attributes worksheet.
The group column should contain a label in each row giving the group name of
the corresponding variable; variables belonging to the same group do not need
to be next to each other in the data set. Each cell in the type column can
either be empty or contain one of the following options: ‘standardized’,
‘centered’, ‘nominal’ or ‘frequency’. These strings must be spelled exactly as
shown, in lower case and without the quote marks. Only one variable in each
group needs a type string; that type is then applied to all variables within
the group. The type setting determines how the software handles the variables
in each group, in particular:
- ‘centered’ refers to continuous variables that are mean centered prior to analysis.
- ‘standardized’ refers to continuous variables that are centered and scaled prior to analysis.
- ‘nominal’ refers to variables that are categorical.
- ‘frequency’ refers to variables that are counts of events.
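As an illustration of how the ‘group’ and ‘type’ columns could be interpreted, the following Python sketch propagates the single type string declared for a group to all of its variables and applies the two continuous pre-treatments. It is not EyeOpenR's own code: the function names, the tuple layout, the error raised for conflicting type strings, and the use of the population standard deviation for scaling are all assumptions made for the example.

```python
from statistics import mean, pstdev

ALLOWED_TYPES = {"standardized", "centered", "nominal", "frequency"}


def resolve_group_types(attributes):
    """Return {group: type} from (variable, group, type) rows.

    The type may be an empty string for every variable except one per group.
    Groups with no type string at all are left out of the result, because the
    text above does not say what the software does in that case.
    """
    group_types = {}
    for _variable, group, type_string in attributes:
        if type_string == "":
            continue
        if type_string not in ALLOWED_TYPES:
            raise ValueError(f"unrecognised type string: {type_string!r}")
        # Assumption: two different type strings in one group is an error.
        if group_types.setdefault(group, type_string) != type_string:
            raise ValueError(f"conflicting type strings in group {group!r}")
    return group_types


def pretreat(values, type_string):
    """Centre, or centre and scale, one continuous variable."""
    centre = mean(values)
    centred = [value - centre for value in values]
    if type_string == "centered":
        return centred
    if type_string == "standardized":
        spread = pstdev(values)  # population SD: an assumption of this sketch
        return [value / spread for value in centred]
    raise ValueError(f"no continuous pre-treatment for {type_string!r}")


# Hypothetical attributes worksheet: only one row per group carries a type.
attributes = [
    ("sweet", "sensory", "standardized"),
    ("pH", "analytical", "centered"),
    ("bitter", "sensory", ""),
]
group_types = resolve_group_types(attributes)
variable_types = {name: group_types[group] for name, group, _ in attributes}
```

Here ‘bitter’ has an empty type cell but inherits ‘standardized’ from ‘sweet’, because both belong to the ‘sensory’ group.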
Background
MFA is a hybrid method that proceeds in two steps. First, a PCA is computed on
each group of numeric variables, or an MCA on each group of categorical
variables. Second, each group of variables is weighted by dividing it by the
square root of its own first eigenvalue, and all the weighted variables are
then combined in a final PCA. The advantage of this weighting is that the
leading dimension of every group is put on the same scale, so no single
variable group can dominate the final PCA.
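The weighting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FactoMineR implementation: it handles continuous groups that have already been centred (or standardized), gives every sample equal weight, and estimates the first eigenvalue by power iteration, where production code would use a linear algebra library. The function names and example data are hypothetical.

```python
import math
import random


def covariance(columns):
    """Covariance matrix of centred columns, with equal sample weights 1/n."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in columns]
            for ci in columns]


def mat_vec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def first_eigenvalue(matrix, iterations=500, seed=0):
    """Largest eigenvalue of a symmetric PSD matrix by power iteration."""
    rng = random.Random(seed)
    # A random start avoids beginning exactly orthogonal to the leading axis.
    vector = [rng.uniform(-1.0, 1.0) for _ in matrix]
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0.0:
            return 0.0
        vector = [p / norm for p in product]
    # Rayleigh quotient of the converged unit vector.
    return sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))


def mfa_weight(groups):
    """Divide every column of each group by sqrt(first eigenvalue of the group)."""
    weighted = {}
    for name, columns in groups.items():
        eigenvalue = first_eigenvalue(covariance(columns))
        if eigenvalue <= 0.0:
            raise ValueError(f"group {name!r} has no variance")
        scale = math.sqrt(eigenvalue)
        weighted[name] = [[value / scale for value in column]
                          for column in columns]
    return weighted


# Hypothetical centred data: five samples, two groups on very different scales.
groups = {
    "analytical": [[-2.0, -1.0, 0.0, 1.0, 2.0], [-4.0, -1.0, 0.0, 1.0, 4.0]],
    "sensory": [[-10.0, 0.0, 5.0, 10.0, -5.0]],
}
weighted = mfa_weight(groups)
# After weighting, the first eigenvalue of every group is (approximately) one,
# so the columns can be concatenated and passed to a single final PCA.
```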
Options
- Data to Analyse – When the option ‘Run on imported Data’ is used, the software assumes that each row of the data worksheet is a unique product in the analysis. If, on the other hand, the option ‘Compute Table of Means’ is used, the software averages across all the panellists, sessions and replicates for each product prior to the MFA.
- Type of Mean – Only applies when ‘Compute Table of Means’ is selected above. It allows the user to choose between simple arithmetic means and adjusted means that account for unequal replication of the products in the design. If the design is balanced, there is no difference between the two options.
- Number of decimals for values – Controls the number of decimals printed in all numeric output.
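For the ‘Compute Table of Means’ option with simple arithmetic means, the averaging can be pictured with the short Python sketch below. The row layout, column names and function name are hypothetical, and adjusted means are not shown because they require fitting a model to the design.

```python
from collections import defaultdict
from statistics import mean


def table_of_means(rows, attributes):
    """Average each attribute over all rows that share a product."""
    rows_by_product = defaultdict(list)
    for row in rows:
        rows_by_product[row["product"]].append(row)
    return {
        product: {a: mean([r[a] for r in product_rows]) for a in attributes}
        for product, product_rows in rows_by_product.items()
    }


# Hypothetical raw data: product A was scored by two panellists, B by one.
rows = [
    {"product": "A", "panellist": 1, "sweet": 4.0, "bitter": 2.0},
    {"product": "A", "panellist": 2, "sweet": 6.0, "bitter": 3.0},
    {"product": "B", "panellist": 1, "sweet": 1.0, "bitter": 5.0},
]
means = table_of_means(rows, ["sweet", "bitter"])

# The number of decimals only affects how values are printed, not the means.
decimals = 2
for product, values in means.items():
    print(product, {a: f"{v:.{decimals}f}" for a, v in values.items()})
```

The resulting table has one row per product, which is the layout the analysis expects when it is run directly on imported data.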
Results and Interpretation
- The ‘Eigenvalue’ tab summarises the amount of information that is explained
by each dimension in the final PCA of the weighted group variables.
- The ‘Products’ tab gives the following sub-tables:
  - Coord – the product scores on each dimension.
  - Contribution – the relative contribution of each product to each dimension (the contributions in each column sum to 100).
  - Cos2 – the quality of representation of each product on each dimension (the cos2 values in each row sum to one).
  - Graph – a plot of the scores on the first two dimensions.
- The ‘Group’ tab gives the following sub-tables:
  - Lg – the Lg coefficient is a measure of association between groups of variables. The higher the number, the stronger the association.
  - RV – the RV coefficient is a measure of correlation between groups of variables and varies between 0 and 1.
  - For both the Lg and RV tables, the final row/column labelled ‘MFA’ corresponds to the combined matrix containing all the groups of variables.
  - Graph – a plot showing relationships between groups. The group coordinates are Lg coefficients between each group of variables and the scores on the first two dimensions.
- The ‘Partial axes’ tab shows the relationships between the dimensions derived from individual groups and the dimensions derived from the final PCA.
  - Coord – the coordinates of the group dimensions on the dimensions of the final PCA.
  - Cor – the correlation coefficients between the group dimensions and the dimensions of the final PCA.
  - Contribution – the relative contribution of each group dimension to each dimension of the final PCA (the contributions in each column sum to 100).
  - Graph – a plot of the coordinates on the first two dimensions.
- The ‘Variables’ tab shows the relationships between the variables from the individual groups and the dimensions derived from the final PCA.
  - Coord – the coordinates of the individual group variables on the dimensions of the final PCA.
  - Contribution – the relative contribution of each group variable to each dimension of the final PCA (the contributions in each column sum to 100).
  - Cos2 – the quality of representation of each group variable on the dimensions of the final PCA (the cos2 values in each row sum to one).
  - Graph – a plot of the coordinates on the first two dimensions.
- The function MFA from the R package
‘FactoMineR’ is used to carry out the analysis.
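The statements above that contributions sum to 100 per column, that cos2 values sum to one per row, and that the RV coefficient lies between 0 and 1 can be checked with a small Python sketch. It assumes equal weights for all products, coordinates on all dimensions (not only the first two), centred variable groups, and the usual definition of RV as a normalised inner product of the groups' cross-product matrices. The function names and data are hypothetical, and this is not the FactoMineR code.

```python
import math


def contributions(coords):
    """Percentage contribution of each row to each dimension (columns sum to 100)."""
    n_dims = len(coords[0])
    totals = [sum(row[k] ** 2 for row in coords) for k in range(n_dims)]
    return [[100.0 * row[k] ** 2 / totals[k] for k in range(n_dims)]
            for row in coords]


def cos2(coords):
    """Squared cosines: share of each row's squared distance on each dimension."""
    return [[c ** 2 / sum(v ** 2 for v in row) for c in row] for row in coords]


def cross_product(columns):
    """W = X X' for one group supplied as a list of centred columns."""
    n = len(columns[0])
    return [[sum(col[i] * col[j] for col in columns) for j in range(n)]
            for i in range(n)]


def inner(a, b):
    """Sum of element-wise products of two matrices, i.e. trace(A'B)."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def rv_coefficient(group_x, group_y):
    """RV = <Wx, Wy> / sqrt(<Wx, Wx> <Wy, Wy>)."""
    wx, wy = cross_product(group_x), cross_product(group_y)
    return inner(wx, wy) / math.sqrt(inner(wx, wx) * inner(wy, wy))


# Hypothetical coordinates of three products on two dimensions.
coords = [[2.0, 1.0], [-1.0, 1.0], [-1.0, -2.0]]
contrib = contributions(coords)
quality = cos2(coords)

# Hypothetical centred groups measured on the same three samples.
group_a = [[-1.0, 0.0, 1.0]]
group_b = [[-2.0, 0.0, 2.0]]   # same configuration as group_a, rescaled
group_c = [[1.0, -2.0, 1.0]]   # uncorrelated with group_a
rv_same = rv_coefficient(group_a, group_b)
rv_unrelated = rv_coefficient(group_a, group_c)
```

Because RV is normalised by each group's own cross-product matrix, rescaling a group leaves its RV with any other group unchanged, which is why it is easier to compare across groups than the unnormalised Lg coefficient.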
References
- Escofier, B. and Pagès, J. (1994) “Multiple
factor analysis” Computational Statistics & Data Analysis, vol 18, pp
121–140.
- Pagès, J. (2005) “Collection and analysis
of perceived product inter-distances using multiple factor analysis:
Application to the study of 10 white wines from the Loire valley” Food Quality
and Preference, vol 16, pp 642–649.
- Pagès, J. (2014) “Multiple Factor Analysis
by Example Using R” CRC Press (ISBN 9780429171086)