The methods implemented in this package share the same main goal: to summarize and simplify the data by reducing the dimensionality of the data set. Which method to use depends on the data available and on whether the variables are quantitative (numerical) or qualitative (categorical or nominal). Several methods are implemented, the most classical (Principal Component Analysis, Correspondence Analysis, Multiple Correspondence Analysis, Multiple Factor Analysis) as well as some advanced methods (Hierarchical Multiple Factor Analysis, Mixed Data Analysis, Dual Multiple Factor Analysis).

The main methods and the situations they address are:
* Principal component analysis (PCA), when individuals are described by quantitative variables.
* Correspondence analysis (CA), when individuals are described by two categorical variables, which leads to a contingency table.
* Multiple correspondence analysis (MCA), when individuals are described by categorical variables.
* MFA (Multiple Factor Analysis), for which the variables of a same group may be numerical or categorical.
* HMFA (Hierarchical Multiple Factor Analysis), an extension of MFA for which variables are structured according to a hierarchy.
* GPA (Generalized Procrustes Analysis), for which variables must be continuous.

In order to reduce the dimensionality, the data table X is transformed to a new coordinate system by an orthogonal linear transformation. F_s (resp. G_s) denotes the vector of the coordinates of the rows (resp. columns) on axis s. Those two vectors are related by the so-called "transition formulae". In the case of PCA, the formula giving the row coordinates can be written:

$$F_s(i) = \frac{1}{\sqrt{\lambda_s}} \sum_k x_{ik}\, m_k\, G_s(k)$$

where $\lambda_s$ is the eigenvalue associated with axis $s$ and $m_k$ is the weight of column $k$.

In the same manner, it is easy to calculate the coordinates of a supplementary variable when the variable is quantitative; in this case the supplementary variable lies in the scatter plot of the variables. When the variable is categorical, its modalities are each represented by a "mean individual". For each modality, the values associated with the "mean individual" are the means of each variable over the individuals endowed with this modality; in this case the supplementary variable lies in the scatter plot of the individuals.

Install FactoMineR from within your R session by writing the following line of code:
install.packages("FactoMineR")
The graphical interface of FactoMineR can also be downloaded from within your R session (you have to be connected to the internet).
Load FactoMineR for each new R session by typing the following line of code:
library(FactoMineR)
FactoMineR and its GUI can also be loaded together for each new R session.

Functions Reference
A complete reference of all fifty FactoMineR functions, with description, usage, arguments and values, is provided with the package documentation.

Visualization
With the function plot, you can draw graphs of the results.
Figure: Individuals graph (Decathlon data, available with the package documentation); individuals are colored according to the athletics meeting.

As an example, we use here a data set issued from a questionnaire about French women's work in 1974.

Presentation of the data
1724 women have answered several questions about women's work, among which:
* What do you think the perfect family is?
* Which activity is the best for a mother when children go to school?
* What do you think of the following sentence: women who do not work feel cut off from the world?

The data set is made of two contingency tables which cross the answers to the first question with the answers to the two others. For each crossing, the value given is the number of women who gave both answers.

To load the package and the data set, write the following lines of code:
library(FactoMineR)
women_work = read.table(" ", header=TRUE, row.names=1, sep="\t")

The objectives of CA are quite the same as PCA's: to get a typology of rows and columns and to study the link between these two typologies. However, the concept of similarity between rows or columns is different. Here, similarity between two rows or two columns is completely symmetric. Two rows (resp. columns) will be close to each other if they associate with the columns (resp. rows) in the same way. We are looking for the rows (resp. columns) whose distribution is the most different from the population's, and for the ones which look the most or the least alike. Each group of rows (resp. columns) is characterized by the columns (resp. rows) with which it is especially strongly or especially weakly associated.

We are going to use the first three columns (corresponding to the answers to the second question) as active variables and the last four (corresponding to the third question) as supplementary variables.

To see the scatterplots of rows and columns separately, type:
# invisible: elements we do not want to be plotted
res.ca.rows = CA(women_work, invisible="col")
res.ca.col = CA(women_work, invisible="row")

On the scatterplot of the columns, we can see that the first axis opposes "Stay at home" and "Full-time work", which means it opposes two women's profiles.
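As a complement to the transition formulae mentioned for PCA, here is a sketch of how the counterpart for the columns, and the analogous pair for CA, are usually written. The extra notation is an assumption and is not taken from the text: $p_i$ is the weight of row $i$, $f_{ik}$ is the relative frequency in cell $(i,k)$ of the contingency table, and $f_{i\cdot}$ and $f_{\cdot k}$ are its row and column margins.

```latex
% PCA: coordinates of column k on axis s, from the row coordinates
G_s(k) = \frac{1}{\sqrt{\lambda_s}} \sum_i x_{ik}\, p_i\, F_s(i)

% CA: rows and columns play symmetric roles (barycentric relations)
F_s(i) = \frac{1}{\sqrt{\lambda_s}} \sum_k \frac{f_{ik}}{f_{i\cdot}}\, G_s(k)
\qquad
G_s(k) = \frac{1}{\sqrt{\lambda_s}} \sum_i \frac{f_{ik}}{f_{\cdot k}}\, F_s(i)
```

Read this way, in CA a row lies, up to the factor $1/\sqrt{\lambda_s}$, at the barycentre of the columns weighted by its own profile. This is one way to see why a row ends up close to the columns with which it is most associated, and why the roles of rows and columns are symmetric.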