# afex - Analysis of Factorial Experiments

## Why afex

Factorial experiments are at the heart of many fields with a strong experimental methodology such as experimental Psychology. The traditional method to analyze these experiments are analysis of variance (ANOVA) and analysis of covariance (ANCOVA). Although these methods may seem "somewhat oldfashioned" (Dalgaard, 2007, p. 2) compared to mixed models, they are still very popular and seen as the default analysis strategy in a wide variety of fields. Somewhat surprisingly, R lacks convenient functions for performing these analysis on unbalanced data or designs with repeated-measures factors. This seems especially unsatisfactory as the commercial statistical packages such as SPSS and SAS offer simple and powerful functionalities for performing those analyses.

**afex **tries to fill this gap (similar to package **ez** by Mike Lawrence) by providing functions that allow for conveniently running different types of ANOVAs and ANCOVAs using a common interface and replicating the default behavior of SPSS and SAS, namely using Type 3 sums of squares. Therefore, **afex **should be specifically interesting for new users of R who need to perform ANOVAs and want to replicate their results from SPSS or SAS. When learning a new statistics software replicating established results is probably a common strategy that allows one to get accustomed to the new software while being safe side that errors will be spotted (i.e., if the results cannot be replicated).

Additionally, **afex **offers one feature that goes beyond replicating the default behavior of commercial statistical software. When there is more than one observation per cell of the design and individual (common when using repeated-measures/within-subjects factors), **afex **automatically aggregates the data even allowing to specify the aggregation function (the default is the mean, but any other function, e.g., median is possible).

**afex** offers the convenience function `mixed()`

to obtain p-values for all effects in a linear-mixed model (see documentation of mixed available via `?mixed`

for more information). For an introduction to mixed modeling for experimental data see Judd, Westfal & Kenny (2012, who also recommend using the Kenward-Rogers approximation for degrees of freedom), Barr, Levy, Scheepers, and Tily (2013; for recommendations on how to specify the random effects structure), or Baayen, Davidson and Bates (2006; unfortunately they do not use *p*-values).

Finally, **afex **offers function `compare.2.vectors(x, y)`

for conveniently comparing two vectors `x`

and `y`

using a *t*, Wilcoxon and permutation test.

## Installation

**afex** is available from CRAN. Just install from within R with:

`install.packages("afex")`

The pdf help file is also available from CRAN.

The development version of **afex** is hosted on R-forge (what is new in the development version) and can be installed via:

`install.packages("afex", repos="http://R-Forge.R-project.org")`

or

`install.packages("afex", repos="http://R-Forge.R-project.org", type = "source")`

After installation you need to load **afex** every time you start R by:

`library("afex")`

or `require("afex")`

**Important note: One needs to have at least pbkrtest version 0.3-6 ****if one wants to obtain p-values for GLMMs using parametric bootstrap (method = "PB") (time of writing, 11.10.2013).**

## Using afex

Please see the help files in the package for detailed instructions:

`?aov.car`

for help with ANOVA

`?mixed `

for help with mixed models.

Note that **afex** was heavily influenced by the **ez** package from Mike Lawrence.

## Reporting unexpected behavior or questions

When reporting unexpected behaviors, bugs etc. in the packages, PLEASE supply:

- A reproducible example in terms of a short code fragment.
- The data. The preferred way of sending the data "mydata" is to copy and paste the result from running
`dput(mydata)`. - The result of running the
`sessionInfo()`function.

(This text is stolen from here)

## FAQ

Most of these questions are about the more popular function `mixed()`

for (G)LMMs. If something is missing, please send me a mail.

**What does the warning "**`failure to converge in 10000 evaluations`

" mean?

This message indicates that the optimizer did not find a suitable solution or optimum (i.e., the parameters, goodness of fit, deviance, AIC, BIC, ... cannot be trusted). This is simply a sign that the default number of maximum iterations which the optimizer uses is not sufficient which can easily happen even with medium sized datasets (especially with complex random structures). To avoid this message one only needs to increase the number of iterations that are allowed for the optimizer via the control argument, e.g. for a LMM: `control = lmerControl(optCtrl = list(maxfun = 100000))`

or for a GLMM `control = glmerControl(optCtrl = list(maxfun = 100000))`

. One could also even use larger numbers here if necessary. See also the next question.

**How does**`mixed()`

deal with the situation when the model does not converge?

`mixed()`

doesn't do anything in these case, but pass the warnings through (unless a cluster is used via the `cl`

argument). You need to assure that you pass the appropriate control arguments (i.e., `lmerControl `

or ` glmerControl`

) to `mixed `

which are passed down to `(g)lmer`

. Hence, if there are no warnings when `mixed`

finishes, there have been no warnings from the fitting routine (i.e., all models converged).

Further, `mixed`

returns an object (a list of class `"mixed"`

) whose third element is a list with all the submodels fitted. (The second element of said list is the full model). Each of those submodels, at least in the new version of lme4, contains a slot which also contains fitting information such as nonconvergence or other warnings. The following returns the warnings for a model `m1`

: `lapply(m1[[3]], function(x) x@optinfo$warnings)`

**Does**`mixed()`

also work if you pass it an already fitted model as first argument and not a formula?

This behavior is not supported and is known to fail when fitting is distributed using the `cl`

argument. Furthermore, it does not save computational time as the full model is fitted again in all cases. Hence, I strongly recommend to always use a call similar to the original (g)lmer call that includes the data and formula with arguments for `type`

, `method `

(and potentially `cl `

or `args.test`

) added.

**Why does afex set the default contrasts to effect coding (**`"contr.sum"`

)?

afex is intended for factorial experiments, i.e., orthogonal factors. For those, there is, as far as I have read, little use for other codings. If so, I am really interested to hear the arguments. If you do not want this setting simply reset the default factors via the following command, all afex functions should still try to use effects coding unless told not to do so:

`options(contrasts=c('contr.treatment', 'contr.poly'))`

Finally, afex is open source, if you do not like it, branch afex and delete this. I think it is an extremely useful default, way more useful the the default treatment or dummy coding!

## References

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. *Journal of Memory and Language*, 59(4), 390–412. doi:10.1016/j.jml.2007.12.005

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. *Journal of Memory and Language*, 68(3), 255–278. doi:10.1016/j.jml.2012.11.001

Dalgaard, P. (2007). New Functions for Multivariate Analysis. *R News*, 7, 2–7.

Fox, J., & Weisberg, S. (2011). *An R Companion to Applied Regression*. Thousand Oaks, Calif.: SAGE Publications.

Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. *Journal of Personality and Social Psychology*, 103(1), 54–69. doi:10.1037/a0028347

O’Brien, R. G., & Kaiser, M. K. (1985). MANOVA method for analyzing repeated measures designs: An extensive primer. *Psychological Bulletin*, 97(2), 316–333. doi:10.1037/0033-2909.97.2.316

Wickham, H. (2007). Reshaping Data with the reshape Package. *Journal of Statistical Software*, 21(12), 1–20.