# MPTinR

## Description

MPTinR is a package for the R programming language that provides a user-friendly way to analyze multinomial processing tree (MPT) models. MPT models are measurement models with which you can estimate latent (i.e., unobservable) cognitive processes (e.g., Riefer & Batchelder, 1988). MPT models can be used if observations fall into one and only one of a finite set of categories (i.e., if the data are categorical).

The advantages of MPTinR are:

- MPTinR is an R package and therefore integrates smoothly with an R-workflow.
- MPTinR allows an easy and intuitive way to specify the model, either within the R script or in an external model file that even allows comments (via #). Furthermore, MPTinR supports the 'classical' EQN syntax (see, e.g., Stahl & Klauer, 2007).
- Equality, inequality and order restrictions can be specified easily.
- MPTinR provides different outputs and analyses for single datasets (i.e., a vector) and multiple datasets (i.e., a matrix or data.frame). In the latter case, results for each individual, sums of the individual results, and results from aggregating the data across participants are automatically provided.
- For model selection, the Fisher information approximation (FIA), a minimum description length based measure of model complexity, can be obtained using the algorithm provided by Wu, Myung and Batchelder (2010).
- For multiple individuals or multiple fitting runs the package allows one to easily use multiple cores (or CPUs) via the snowfall package by simply specifying the number of available cores (requires snowfall >= 1.84).
- The functions `fit.model` and `fit.mptinr` can be used to fit many other types of models for categorical data, such as signal detection theory (SDT) based models. See the examples in the documentation of these functions.

MPTinR was programmed by Henrik Singmann and David Kellen with help from Karl Christoph Klauer and many others.

To get an overview, read our paper on MPTinR.

## Downloads & Installation

MPTinR is hosted on CRAN, the central R repository, so it can be installed from within R by calling:

`install.packages("MPTinR")`

To view the pdf help page go to its CRAN page.

The source code can be viewed here or here (also R-forge). Note that the source code at R-forge contains additional documentation and tests.

After installation you may load the package every time you start R by:

`library(MPTinR)`

or `require(MPTinR)`

To get started, type `?fit.mpt`, scroll to the examples, and execute them. The third example, using data from Bröder & Schütz (2009), is especially helpful, as it illustrates fitting and model selection using the FIA.

## Usage & Functions

After installation, you need to load MPTinR before you can use it, via either `library(MPTinR)` or `require(MPTinR)`.

MPTinR has two main functions:

- `fit.mpt` is the main function. It fits MPT models for single as well as multiple individuals and can obtain the FIA. For more information, type `?fit.mpt`.
- `select.mpt` takes a list of results returned by `fit.mpt` and displays a model selection table for these results (using AIC, BIC, and, if computed, FIA).

See the example from Bröder & Schütz (2009) in `?fit.mpt` for an illustration of how these two functions can be used together.

`fit.mpt` needs at least a data object (a vector for an individual fit, a matrix or data.frame for a multi-individual fit) and the name of the model file as arguments. The name of a restrictions file is one of the optional arguments.
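As a minimal sketch of how the two main functions might be combined, the following fits a simple two-parameter recognition model to one hypothetical data vector, fits a restricted variant, and compares both. The model, the counts, and the restriction are made up for illustration. Passing the model through `textConnection()` and the restriction as a `list()` of strings is an assumption based on the package documentation, so check `?fit.mpt` for the conventions of your installed version.

```r
library(MPTinR)

# Hypothetical model in the "easy" format: one line per response category,
# trees separated by an empty line, comments introduced by #
model <- "
# Tree for old items: 'old' response, then 'new' response
D + (1 - D) * g
(1 - D) * (1 - g)

# Tree for new items: 'old' response, then 'new' response
(1 - D) * g
D + (1 - D) * (1 - g)
"

# Hypothetical counts, in the same order as the categories in the model
observed <- c(75, 25, 20, 80)

# Unrestricted fit (assumption: a connection is accepted in place of a file name)
full <- fit.mpt(observed, textConnection(model))

# Restricted fit (assumption: restrictions may be given as a list of strings)
restricted <- fit.mpt(observed, textConnection(model), list("g = 0.5"))

# Model selection table for both fits
select.mpt(list(full, restricted))
```

A fresh `textConnection()` is created for each call because a connection can generally be read only once.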

In addition, MPTinR contains two further fitting functions that are useful when fitting other types of models for categorical data:

- `fit.model` is a copy of `fit.mpt` with the additional arguments `lower.bound` and `upper.bound` to specify the lower and upper bounds of the parameter space (for each parameter individually). With this function, any type of model for categorical data that can be specified in a model file can be fitted (such as signal detection theory based models).
- `fit.mptinr` is the workhorse of MPTinR and needs a model specified as an objective function. `fit.mpt` and `fit.model` simply create these objective functions (and functions for obtaining the gradient and Hessian of the objective function) and pass them to `fit.mptinr`. If a model cannot be specified in a model file (e.g., if the model contains integrals), use this function.
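To illustrate what "a model specified as an objective function" means, here is a minimal base R sketch that minimizes the negative multinomial log-likelihood of a hypothetical two-parameter recognition model. It uses `optim()` as a stand-in optimizer and does not follow the signature that `fit.mptinr` expects; the function name `neg_log_lik`, the model, and the counts are assumptions made for illustration. See `?fit.mptinr` for the actual interface.

```r
# Hypothetical counts: old items ("old", "new"), then new items ("old", "new")
observed <- c(75, 25, 20, 80)

# Objective function: negative log-likelihood (up to a constant) of the data
# given the parameters D (detection) and g (guessing)
neg_log_lik <- function(par, data) {
  D <- par[1]
  g <- par[2]
  p <- c(
    D + (1 - D) * g,        # old item, "old" response
    (1 - D) * (1 - g),      # old item, "new" response
    (1 - D) * g,            # new item, "old" response
    D + (1 - D) * (1 - g)   # new item, "new" response
  )
  -sum(data * log(p))
}

# Box-constrained minimization; bounds keep the probabilities away from 0 and 1
fit <- optim(
  par = c(0.5, 0.5), fn = neg_log_lik, data = observed,
  method = "L-BFGS-B", lower = 1e-6, upper = 1 - 1e-6
)
estimates <- setNames(fit$par, c("D", "g"))
estimates
```

For this saturated example the estimates can be checked by hand: D is the hit rate minus the false alarm rate (0.75 - 0.20 = 0.55), and g is the false alarm rate divided by 1 - D (0.20 / 0.45, roughly 0.44).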

### Model Representation

The model can either be in the **classical EQN syntax**, as described for example in Stahl & Klauer (2007), or in the **easy format as described here**. To fit models from EQN files, the files need to have the ending .eqn or .EQN, or you need to set the `model.type` argument to `"eqn"`. The default model format is called `"easy"`, as you do not need to adhere to the rules imposed by the EQN syntax.

See section Model Files for more information and some helper functions.
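As a rough sketch of what a model file in the easy format can look like, the following base R code writes a hypothetical two-tree recognition model to a temporary file: one equation per response category, an empty line between trees, and comments introduced by #. The file name, parameter names, and parameter values are assumptions for illustration. The sanity check at the end is plain R and not an MPTinR function; it only confirms that the category probabilities of each tree sum to 1.

```r
# Hypothetical model: one line per category, empty line between trees
model_lines <- c(
  "# Tree for old items: 'old' response, then 'new' response",
  "Do + (1 - Do) * g",
  "(1 - Do) * (1 - g)",
  "",
  "# Tree for new items: 'old' response, then 'new' response",
  "(1 - Dn) * g",
  "Dn + (1 - Dn) * (1 - g)"
)

# Write the model to a temporary file that could be passed as the model file
model_file <- tempfile(fileext = ".model")
writeLines(model_lines, model_file)

# Sanity check: evaluate each equation at arbitrary parameter values
params <- list(Do = 0.6, Dn = 0.4, g = 0.5)
equations <- model_lines[!grepl("^[[:space:]]*(#|$)", model_lines)]
probs <- vapply(
  equations,
  function(eq) eval(parse(text = eq), params),
  numeric(1),
  USE.NAMES = FALSE
)

# The first two equations belong to tree 1, the last two to tree 2
tree <- c(1, 1, 2, 2)
tree_sums <- tapply(probs, tree, sum)
tree_sums
```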

### Data Object

The data needs to be an R object, either a vector for fitting a single dataset or a matrix or data.frame for fitting multiple individuals or experiments. The position (column for a matrix/data.frame) of each response category in the data object must correspond to the position of this response category in the model file.
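The two shapes of a data object can be sketched as follows. The counts and category labels are hypothetical; the labels are added for readability only, since the text above states that it is the position of each category that has to match the model file.

```r
# Hypothetical category order in the model file:
# old items ("old", "new" response), then new items ("old", "new" response)
categories <- c("old_hit", "old_miss", "new_fa", "new_cr")

# Single dataset: a plain numeric vector of category counts
single <- c(75, 25, 20, 80)

# Multiple datasets: one row per individual, one column per category
multi <- rbind(
  c(75, 25, 20, 80),
  c(60, 40, 30, 70),
  c(90, 10, 15, 85)
)
colnames(multi) <- categories

# Aggregating across individuals is a column-wise sum
aggregated <- colSums(multi)
aggregated
```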

### Restrictions

The restrictions may contain simple equality and inequality restrictions, sequential equality restrictions, or order restrictions (i.e., sequential inequality restrictions) that need to follow simple rules:

- Inequality restrictions before equality restrictions (i.e., inequalities first).
- If a variable appears in an inequality restriction, it cannot be on the LHS (left-hand side) of any further restriction.
- If a variable appears on the RHS (right-hand side) of an equality restriction, it cannot appear on the LHS of an equality restriction.
- No variable can appear more than once on the LHS of any restriction.

Furthermore, comments are allowed using # (everything to the right of # will be ignored).

A valid restrictions file could look like the following:

    # D1 to D3 are ordered
    D1 < D2 < D3
    D4 = D3
    # Both, B1 and B3, are set to 0.3333
    B1 = B3 = 0.33333
    X4 = X5 = G6
    X3 = 0.9 # This is a comment, restriction will be used
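A restrictions file is a plain text file, so it can be created from within R. The sketch below writes a few hypothetical restrictions to a temporary file and then strips the comments to show which lines remain as restrictions. The comment handling is a base R illustration of the rule stated above, not MPTinR's own parser, and the file name is an assumption.

```r
# Hypothetical restrictions, inequalities first
restrictions <- c(
  "# D1 to D3 are ordered",
  "D1 < D2 < D3",
  "D4 = D3",
  "X3 = 0.9 # trailing comment, the restriction is still used"
)

# Write them to a temporary file that could be passed as the restrictions file
restr_file <- tempfile(fileext = ".restr")
writeLines(restrictions, restr_file)

# Illustration only: drop everything to the right of # and empty lines
stripped <- trimws(sub("#.*$", "", readLines(restr_file)))
active <- stripped[nzchar(stripped)]
active
```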

### FIA

The FIA is computed using the MCMC algorithm provided by Wu, Myung and Batchelder (2010). We ported their Matlab code BMPTFIA to R (it is essentially an almost exact copy); see `?bmpt.fia`. As their function needs the model represented in the context-free language for MPTs (Purdy & Batchelder, 2009), we wrote several functions to integrate their function into MPTinR:

- `make.mpt.cf` takes a model file and returns the representation of this model in the context-free language of MPT models (L-BMPT; Purdy & Batchelder, 2009). For this function to work, it is absolutely necessary that the representation of the model via equations in the model file maps exactly onto the structure of the binary MPT. In other words, equations in the model file can NOT be simplified in any way. See `?make.mpt.cf` for more details.
- `bmpt.fia` is an almost exact copy of the Matlab function BMPTFIA by Wu et al. (2010) for obtaining the FIA for MPT models.
- `get.mpt.fia` is a wrapper for both of the aforementioned functions, `make.mpt.cf` and `bmpt.fia`. It can be called with a data object, a model file, and (optionally) a restrictions file. The function transforms the model into its L-BMPT representation and calls `bmpt.fia` with this representation as an argument. This function tries to reduce computational time for multi-individual datasets (matrix/data.frame).
- `prepare.mpt.fia` is similar to `get.mpt.fia` (i.e., it takes the same arguments) but, instead of calling `bmpt.fia`, prepares the code necessary to call the original BMPTFIA function for Matlab. This code is then immediately executable in Matlab.
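For orientation, the FIA is commonly written as the sum of a goodness-of-fit term and two complexity terms. The notation below is a generic statement of the criterion and is not copied from Wu et al. (2010), whose symbols may differ:

```latex
\mathrm{FIA} = -\ln f\left(y \mid \hat{\theta}\right)
  + \frac{k}{2} \ln \frac{N}{2\pi}
  + \ln \int_{\Theta} \sqrt{\det I(\theta)} \, d\theta
```

Here $f(y \mid \hat{\theta})$ is the maximized likelihood of the data $y$, $k$ is the number of free parameters, $N$ is the sample size, and $I(\theta)$ is the Fisher information matrix for a single observation. The integral over the parameter space $\Theta$ is the term that is approximated by random sampling.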

### Data Generation and Bootstrapping

Since version 0.9.x, MPTinR contains three bootstrap functions:

- `gen.data` takes a set of parameter values and a model to generate *n* datasets. It can be used for a parametric bootstrap.
- `sample.data` takes a dataset and generates *n* bootstrap samples from the data. It can be used for a nonparametric bootstrap.
- `gen.predictions` takes a set of parameter values and a model to generate predicted proportions or values.

See `?gen.data` for examples.
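The difference between the two kinds of bootstrap can be sketched in base R with `rmultinom()`. This is a conceptual illustration under assumed values, not the implementation used by `gen.data` or `sample.data`: the parametric version draws from the probabilities predicted by a (hypothetical) fitted model, while the nonparametric version draws from the observed relative frequencies, separately for each item type.

```r
set.seed(1)
n_samples <- 5

# Hypothetical observed counts: old items ("old", "new"), new items ("old", "new")
observed <- c(75, 25, 20, 80)
n_old <- sum(observed[1:2])
n_new <- sum(observed[3:4])

# Draw one dataset given category probabilities for each item type
draw <- function(p_old, p_new) {
  c(rmultinom(1, n_old, p_old), rmultinom(1, n_new, p_new))
}

# Parametric bootstrap: probabilities predicted by a hypothetical fitted model
pred_old <- c(0.8, 0.2)
pred_new <- c(0.3, 0.7)
parametric <- t(replicate(n_samples, draw(pred_old, pred_new)))

# Nonparametric bootstrap: resample from the observed relative frequencies
nonparametric <- t(replicate(
  n_samples,
  draw(observed[1:2] / n_old, observed[3:4] / n_new)
))

parametric
nonparametric
```

Each row of the resulting matrices is one simulated dataset with the same number of observations per item type as the original data, so it can be fitted in the same way as the original data.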

## References

Purdy, B. P., & Batchelder, W. H. (2009). A context-free language for binary multinomial processing tree models. *Journal of Mathematical Psychology*, 53, 547-561.

Riefer, D. M., & Batchelder, W. H. (1988). Multinomial modeling and the measurement of cognitive processes. *Psychological Review*, 95, 318-339.

Stahl, C., & Klauer, K. C. (2007). HMMTree: A computer program for latent-class hierarchical multinomial processing tree models. *Behavior Research Methods*, 39, 267-273.

Wu, H., Myung, J. I., & Batchelder, W. H. (2010). Minimum description length model selection of multinomial processing tree models. *Psychonomic Bulletin & Review*, 17, 275-286.