raw.means.plot is a function for the R programming language that visualizes the results of experimental designs with up to two factors (i.e., conditions).
Its main feature is that it plots the raw data in the background and superimposes the means, which gives a better and more accurate picture of the underlying distribution. To distinguish factor levels that would sit at the same x-axis position in classical plots, each factor level is offset from the others. Furthermore, if two raw data points would occupy exactly the same position, some uniform noise is added so that both are displayed at almost the same position. Finally, a legend can be added automatically outside the plot region.
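The offsetting and the noise for overlapping points can be illustrated in base R. The following is a minimal sketch of the idea, not the code used inside raw.means.plot: the data, the variable names, the offset width (`max.offset`) and the noise range are all assumptions chosen for illustration.

```r
set.seed(1)

# Hypothetical data: two offset groups, two x-axis levels, and integer
# responses so that several points share exactly the same position.
d <- data.frame(
  group = factor(rep(c("A", "B"), each = 20)),
  x     = factor(rep(c("pre", "post"), times = 20), levels = c("pre", "post")),
  value = sample(1:7, 40, replace = TRUE)
)

# Offset each group around the integer x-axis tick (assumed width: 0.2).
max.offset <- 0.2
offsets <- seq(-max.offset, max.offset, length.out = nlevels(d$group))
d$xpos <- as.integer(d$x) + offsets[as.integer(d$group)]

# Add uniform noise on the y-axis, but only to points that collide.
key <- paste(d$xpos, d$value)
dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
d$ypos <- d$value
d$ypos[dup] <- d$value[dup] + runif(sum(dup), -0.15, 0.15)

# Cell means, placed at the same offset positions as the raw data.
means <- aggregate(value ~ group + x, data = d, FUN = mean)
means$xpos <- as.integer(means$x) + offsets[as.integer(means$group)]

# Raw data in grey in the background, means superimposed in black.
plot(d$xpos, d$ypos, xaxt = "n", xlab = "", ylab = "value",
     xlim = c(0.5, 2.5), col = "darkgrey",
     pch = c(1, 2)[as.integer(d$group)])
axis(1, at = seq_len(nlevels(d$x)), labels = levels(d$x))
for (g in levels(d$group)) {
  m <- means[means$group == g, ]
  m <- m[order(m$xpos), ]
  lines(m$xpos, m$value, type = "o", lwd = 2,
        pch = c(16, 17)[match(g, levels(d$group))])
}
```

Adding noise only to colliding points leaves every unique observation at its true value, so the jitter never distorts more of the data than necessary.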
There is almost unanimous agreement that providing means (or other point estimates) without some information about the spread of the data is a bad idea (e.g., APA, 2010, p. 34). Which measure of spread to use is less clear. One can provide standard deviations (SD), standard errors (SE), or confidence intervals (CI), with the latter two (SE and CI) even allowing visual inference tests. However, each of these measures has drawbacks. First, all of them perform best, and are accurate, only if the underlying distribution is normal. Second, performing visual inference tests is not overly simple (e.g., Cumming & Finch, 2005). Third, providing appropriate SEs or CIs in a mixed within-between-subjects design is impossible (Loftus & Masson, 1994; Masson & Loftus, 2003). To overcome these issues (at the cost of not being able to perform visual inference tests), one can use raw.means.plot. It not only provides a visual representation of the spread that does not depend on the underlying distribution, it also displays that distribution and makes it possible to check for outliers.
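For reference, the three measures of spread named above can be computed in base R as follows. This is a minimal sketch on simulated, deliberately skewed data; the sample `x` and its parameters are made up for illustration. The CI is the usual t-based interval, which is the kind of summary that relies on normality.

```r
set.seed(42)

# Hypothetical skewed sample (exponential, mean 10).
x <- rexp(30, rate = 1 / 10)

n  <- length(x)
m  <- mean(x)
s  <- sd(x)                                      # standard deviation (SD)
se <- s / sqrt(n)                                # standard error of the mean (SE)
ci <- m + c(-1, 1) * qt(0.975, df = n - 1) * se  # 95% confidence interval (CI)

round(c(mean = m, SD = s, SE = se, CI.lower = ci[1], CI.upper = ci[2]), 2)
```

All three numbers are symmetric summaries around the mean, so none of them reveals that the sample itself is strongly asymmetric.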
I have also given a presentation in German on raw.means.plot at the University of Hamburg (thanks to Ingmar Böschen for the invitation), available here.
The functionality of rm.plot was stimulated by a discussion on stats.stackexchange and a similar discussion on stackoverflow. Furthermore, the final function contains code stimulated by two discussions on stackoverflow: one, two.
How to install and use
raw.means.plot is now part of the plotrix package (thanks to Jim Lemon). Simply install plotrix from CRAN with install.packages("plotrix") and then you have it.
To load the package, type library(plotrix).
plotrix currently contains the following three functions written by me (data needs to be in long format for all functions):
raw.means.plot was the initial version of this function. It takes a data.frame (a subject identifier is not necessary) and plots every row of the data frame as a point in the plot. This function can only handle datasets with up to two factors.
raw.means.plot2 is a convenience wrapper for raw.means.plot that allows for data.frames with an arbitrary number of factors, provided the data.frame contains a subject identifier. You only need to specify the subject identifier column in addition to the usual offset and x-axis columns, and raw.means.plot2 will aggregate the data if there is more than one observation per individual and cell of the crossed factors.
add.ps needs to be called with the same data.frame and arguments as raw.means.plot2 and will add p-values of t-tests comparing the different factor levels at each x-axis tick against a reference factor level. It is good for exploratory data analysis and data screening.
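The three functions might be used together roughly as follows. This is a sketch under stated assumptions: the data set is simulated and its column names are made up, and the argument names (`col.id`, `col.offset`, `col.x`, `col.value`, given as column indices) follow the plotrix documentation as I recall it, so please check them against ?raw.means.plot. The `aggregate` call only illustrates the kind of per-subject aggregation described above; it is not taken from the package source. The plotting calls are guarded so that the script still runs where plotrix is not installed.

```r
set.seed(123)

# Hypothetical long-format data: 20 subjects in 2 groups (between-subjects),
# 3 conditions (within-subjects), 4 trials per subject and condition.
d <- expand.grid(trial = 1:4, condition = c("c1", "c2", "c3"), id = 1:20)
d$group <- factor(ifelse(d$id <= 10, "A", "B"))
d$score <- rnorm(nrow(d), mean = 10 + 2 * (d$group == "B"), sd = 3)
d <- d[, c("id", "group", "condition", "score")]  # columns 1 to 4

# Conceptually, raw.means.plot2 reduces the data to one value per subject
# and cell of the crossed factors before plotting (illustration only).
agg <- aggregate(score ~ id + group + condition, data = d, FUN = mean)

if (requireNamespace("plotrix", quietly = TRUE)) {
  library(plotrix)

  # Every row (i.e., every trial) becomes a background point.
  raw.means.plot(d, col.offset = 2, col.x = 3, col.value = 4)

  # One background point per subject and cell; trials are averaged first.
  raw.means.plot2(d, col.id = 1, col.offset = 2, col.x = 3, col.value = 4)

  # Add p-values of t-tests: group B against the reference group A,
  # separately at each condition on the x-axis.
  add.ps(d, col.id = 1, col.offset = 2, col.x = 3, col.value = 4)
}
```

With repeated trials per subject, raw.means.plot2 is usually the better choice, because the background points then correspond to the subject-level values on which the add.ps t-tests are based.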
If there are any issues (bugs, missing features, odd behavior, ...) please report them to me at firstname.lastname@example.org
This graph shows a result from Experiment 1 of Singmann & Klauer (2011), produced with raw.means.plot. One clearly sees that the differences in the means are accompanied by severe differences in spread and distribution.
APA. (2010). Publication Manual of the American Psychological Association. Washington, DC: American Psychological Association.
Cumming, G., & Finch, S. (2005). Inference by eye: Confidence intervals and how to read pictures of data. American Psychologist, 60, 170-180.
Loftus, G. R., & Masson, M. E. J. (1994). Using confidence intervals in within-subject designs. Psychonomic Bulletin & Review, 1, 476-490.
Masson, M. E. J., & Loftus, G. R. (2003). Using confidence intervals for graphically based data interpretation. Canadian Journal of Experimental Psychology, 57, 203-220.
Singmann, H., & Klauer, K. C. (2011). Deductive and Inductive Conditional Inferences: Two Modes of Reasoning. Thinking & Reasoning, 17, 247-281.