Specify a diagnostic model

Get started by learning how to specify and estimate a model using measr.

Introduction

How do you specify and estimate a diagnostic classification model (DCM) using measr? In this article, we will walk you through the steps. We start with data for building the model, learn how to specify DCMs that make different assumptions about the data, and explore how to estimate the model with Stan.

To use code in this article, you will need to install the following packages: dcmdata, measr, and rstan.

library(measr)     # for model specification and estimation

# Helper packages
library(dcmdata)   # for example data set
library(rstan)     # for estimation backend

Rapid Online Assessment of Reading and Phonological Awareness (ROAR-PA) data

Let’s use data from the ROAR-PA (Gijbels et al., 2024) to learn how to specify and estimate a DCM with measr. The ROAR-PA data is available in the dcmdata package, and contains responses to 57 items from 272 respondents.

roarpa_data
#> # A tibble: 272 × 58
#>       id fsm_01 fsm_04 fsm_05 fsm_06 fsm_07 fsm_08 fsm_10 fsm_11 fsm_12 fsm_14
#>    <int>  <int>  <int>  <int>  <int>  <int>  <int>  <int>  <int>  <int>  <int>
#>  1   161      0      1      1      1      1      0      0      1      1      1
#>  2   226      0      1      0      1      0      0      1      0      1      0
#>  3   103      0      1      0      1      0      0      0      0      0      0
#>  4     7      1      1      0      0      1      0      0      0      1      1
#>  5   185      1      1      1      1      1      1      1      1      1      1
#>  6   129      1      1      1      0      1      1      0      0      1      1
#>  7   181      1      1      1      1      1      1      1      1      1      1
#>  8    36      1      1      1      1      1      1      1      1      1      1
#>  9   206      1      1      1      1      1      1      1      1      1      1
#> 10   257      1      1      1      1      1      1      1      1      1      1
#> # ℹ 262 more rows
#> # ℹ 47 more variables: fsm_15 <int>, fsm_16 <int>, fsm_17 <int>, fsm_18 <int>,
#> #   fsm_21 <int>, fsm_22 <int>, fsm_23 <int>, fsm_24 <int>, fsm_25 <int>,
#> #   lsm_01 <int>, lsm_02 <int>, lsm_04 <int>, lsm_05 <int>, lsm_06 <int>,
#> #   lsm_07 <int>, lsm_08 <int>, lsm_10 <int>, lsm_11 <int>, lsm_13 <int>,
#> #   lsm_15 <int>, lsm_16 <int>, lsm_17 <int>, lsm_18 <int>, lsm_19 <int>,
#> #   lsm_20 <int>, lsm_21 <int>, lsm_22 <int>, lsm_24 <int>, del_01 <int>, …

In addition to our response data, a DCM also requires a Q-matrix. A Q-matrix contains one row per item and one column per attribute (plus an optional column of item identifiers). A value of 1 indicates that the item measures the attribute, and a value of 0 indicates that it does not. In our Q-matrix, we can see that the item identifiers in the rows (item) correspond to the column names of the data. Additionally, we see that there are three attributes measured by this assessment: lsm, del, and fsm. These refer to the first sound made (fsm), last sound made (lsm), and deletion (del) elements of phonological awareness.

roarpa_qmatrix
#> # A tibble: 57 × 4
#>    item     lsm   del   fsm
#>    <chr>  <int> <int> <int>
#>  1 fsm_01     0     0     1
#>  2 fsm_04     0     0     1
#>  3 fsm_05     0     0     1
#>  4 fsm_06     0     0     1
#>  5 fsm_07     0     0     1
#>  6 fsm_08     0     0     1
#>  7 fsm_10     0     0     1
#>  8 fsm_11     0     0     1
#>  9 fsm_12     0     0     1
#> 10 fsm_14     0     0     1
#> # ℹ 47 more rows

Our task is to determine which attributes each respondent is proficient on, given their item responses. For more information on the data set, see ?roarpa and Gijbels et al. (2024).

Specify a DCM

A DCM model specification has three primary components: the Q-matrix, a measurement model, and a structural model. Given these three components, we can create a model specification with dcm_specify().

roarpa_spec <- dcm_specify(
  qmatrix = roarpa_qmatrix,
  identifier = "item",
  measurement_model = lcdm(),
  structural_model = unconstrained()
)

roarpa_spec
#> A loglinear cognitive diagnostic model (LCDM) measuring 3 attributes
#> with 57 items.
#> 
#> ℹ Attributes:
#> • "lsm" (19 items)
#> • "del" (19 items)
#> • "fsm" (19 items)
#> 
#> ℹ Attribute structure:
#>   Unconstrained
#> 
#> ℹ Prior distributions:
#>   intercept ~ normal(0, 2)
#>   maineffect ~ lognormal(0, 1)
#>   `Vc` ~ dirichlet(1, 1, 1)

The Q-matrix, as we described, defines which items measure each attribute. In addition to the Q-matrix itself, we must also tell dcm_specify() which column within the Q-matrix contains the item identifiers. If there is no item identifier column, then identifier can be left as NULL (the default). In our ROAR-PA specification, we can see that each of our three attributes is measured by 19 items. The ROAR-PA Q-matrix has a simple structure, meaning that each item measures only one attribute.

At a high level, the measurement model describes how attributes interact with each other on specific items. If an item measures two attributes, how do we expect a respondent to perform if they possess only one of the attributes? Are the attributes compensatory, meaning that proficiency on either is sufficient to answer the item correctly, or noncompensatory, and proficiency on both attributes is required in order to provide a correct response? The choice of measurement model dictates these relationships.

The structural model describes relationships between proficiencies on the attributes. Is proficiency on one attribute independent of proficiency on another, or are proficiencies correlated? It’s also possible that some attributes represent prerequisite knowledge, such that respondents must demonstrate proficiency on those attributes before they can demonstrate proficiency on others. The structural model defines these relationships.

We’ll explore both measurement and structural models in more detail in the next sections.

Measurement models

measr provides functionality for seven DCM measurement models: the six core models identified by Rupp et al. (2010) and a general model that subsumes the others. A full description of these models is beyond the scope of this article. However, we will provide a high-level overview of the types of models and offer references for further details on each.

The general DCM supported by measr is the loglinear cognitive diagnostic model (LCDM; Henson et al., 2009; Henson & Templin, 2019). This is the most flexible model: it allows each item to have unique interactions between attributes, estimating separate main effects and interaction effects for all possible attribute combinations. You can think of the LCDM as the “saturated model” of which all other DCMs are constrained versions. That is, by placing constraints on the LCDM parameters, you can obtain models equivalent to the other core models.

Under the umbrella of the LCDM are the six core DCMs, which generally fall into two categories: non-compensatory (also called conjunctive) and compensatory (disjunctive). When using a non-compensatory model, attributes function like prerequisites or requirements, and missing an attribute creates a specific deficit that other attributes cannot overcome. In other words, with non-compensatory models, performance is constrained by the weakest link. In this category, measr supports the deterministic input, noisy “and” gate model (DINA; de la Torre & Douglas, 2004); the noisy-input, deterministic “and” gate model (NIDA; Junker & Sijtsma, 2001); and the non-compensatory reparameterized unified model (NC-RUM; DiBello et al., 1995). On the other hand, when using compensatory models, attributes function like independent skills that accumulate, and having more attributes can partially or fully compensate for missing others. Thus, performance improves as you gain more attributes. In the compensatory category, measr supports the deterministic input, noisy “or” gate model (DINO; Templin & Henson, 2006); the noisy-input, deterministic “or” gate model (NIDO; Templin, 2006); and the compensatory reparameterized unified model (C-RUM; Hartz, 2002).

Each of these measurement models can be estimated with measr by supplying the respective measurement model function, as shown in Table 1, to the measurement_model argument of dcm_specify().

Table 1: Measurement models supported by measr

model  | description                                                     | measr
LCDM   | General and flexible, subsumes other models                     | lcdm()

Non-compensatory
DINA   | All attributes must be present                                  | dina()
NIDA   | Attributes have multiplicative penalties equal across items     | nida()
NC-RUM | Attributes have multiplicative penalties that vary across items | ncrum()

Compensatory
DINO   | Any one attribute must be present                               | dino()
NIDO   | Attributes are additive and equal across items                  | nido()
C-RUM  | Attributes are additive and vary across items                   | crum()
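
Swapping one of these functions into dcm_specify() is all that is needed to change the measurement model. As a sketch, we could specify a DINA model for the ROAR-PA data like so:

roarpa_dina_spec <- dcm_specify(
  qmatrix = roarpa_qmatrix,
  identifier = "item",
  measurement_model = dina(),
  structural_model = unconstrained()
)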

Structural models

measr provides functionality for five structural models. The structural model describes the joint distribution of attribute profiles in the population. Different structural models make different assumptions about how attributes relate to each other.

The most general option is the unconstrained model (Rupp et al., 2010). This model places no constraints on the relationships between attributes. Each of the 2^A possible attribute profiles (where A is the number of attributes) has its own freely estimated base rate parameter. Because all profiles are freely estimated, this is a saturated structural model.

The independent model (Lee, 2017) assumes that attributes are completely unrelated. Proficiency on one attribute tells you nothing about proficiency on another. Under this model, each attribute has its own proficiency base rate, and the probability of any profile is simply the product of the individual attribute base rates (or their complements for non-proficiency).

The loglinear model (Xu & von Davier, 2008) uses a log-linear parameterization with main effects and interactions. When specifying a loglinear model, we can use the max_interaction argument to control the highest level of interaction to include. When max_interaction is set to the number of attributes (the default), the loglinear model is equivalent to the unconstrained model. When max_interaction = 1, only main effects are included, which is equivalent to the independent model. Intermediate values allow you to model some degree of attribute dependence without fully saturating the structural model.
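
As a sketch, with our three ROAR-PA attributes we could allow pairwise dependence while omitting the three-way interaction by setting max_interaction = 2:

roarpa_loglin_spec <- dcm_specify(
  qmatrix = roarpa_qmatrix,
  identifier = "item",
  measurement_model = lcdm(),
  structural_model = loglinear(max_interaction = 2)
)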

The remaining two structural models incorporate attribute hierarchies. In these models, proficiency on some attributes may be a prerequisite for proficiency on others. The hierarchical DCM (HDCM; Templin & Bradshaw, 2014) enforces strict attribute prerequisites. Attribute profiles that violate the specified hierarchy are excluded entirely from the model, meaning their base rates are fixed to zero. In contrast, the Bayesian network model (Hu & Templin, 2020) implements a softer version of the hierarchy. All attribute profiles remain possible, but profiles that are inconsistent with the hierarchy are estimated to be less likely. Both models require a hierarchy argument that defines the attribute relationships using dagitty-style syntax, such as "att1 -> att2 -> att3". For more details on specifying attribute hierarchies, see the Define Attribute Relationships article.
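
For example, if we hypothesized that proficiency on fsm is a prerequisite for proficiency on lsm (purely for illustration; this hierarchy is not implied by the ROAR-PA design), a hierarchical specification might look like:

roarpa_hdcm_spec <- dcm_specify(
  qmatrix = roarpa_qmatrix,
  identifier = "item",
  measurement_model = lcdm(),
  structural_model = hdcm(hierarchy = "fsm -> lsm")
)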

Each of these structural models can be estimated with measr by supplying the respective structural model function, as shown in Table 2, to the structural_model argument of dcm_specify().

Table 2: Structural models supported by measr

model         | description                                                   | measr
Unconstrained | General and flexible, subsumes other models                   | unconstrained()
Independent   | Attributes are independent of each other                      | independent()
Loglinear     | Can be constrained to only include certain interaction levels | loglinear()
HDCM          | Hard constraints on profiles based on attribute dependencies  | hdcm()
BayesNet      | Soft constraints on profiles based on attribute dependencies  | bayesnet()

Prior distributions

A final aspect of the DCM specification that we have not yet discussed is the definition of the model priors. Take another look at our specification object. We can see that there are prior distributions defined for each type of parameter in our model.

roarpa_spec
#> A loglinear cognitive diagnostic model (LCDM) measuring 3 attributes
#> with 57 items.
#> 
#> ℹ Attributes:
#> • "lsm" (19 items)
#> • "del" (19 items)
#> • "fsm" (19 items)
#> 
#> ℹ Attribute structure:
#>   Unconstrained
#> 
#> ℹ Prior distributions:
#>   intercept ~ normal(0, 2)
#>   maineffect ~ lognormal(0, 1)
#>   `Vc` ~ dirichlet(1, 1, 1)

Every parameter in a DCM specification is assigned a prior distribution that encodes our beliefs about plausible parameter values before observing any data. measr provides sensible defaults, but you can also customize priors to reflect domain knowledge or to implement more informative constraints.

To view the default priors for a given measurement and structural model combination, use default_dcm_priors(). In our example, we have specified an LCDM with an unconstrained structural model. Plugging those two components in, we see the same priors that we saw in our specification object.

default_dcm_priors(
  measurement_model = lcdm(),
  structural_model = unconstrained()
)
#> # A tibble: 4 × 3
#>   type        coefficient prior                      
#>   <chr>       <chr>       <chr>                      
#> 1 intercept   <NA>        normal(0, 2)               
#> 2 maineffect  <NA>        lognormal(0, 1)            
#> 3 interaction <NA>        normal(0, 2)               
#> 4 structural  Vc          dirichlet(rep_vector(1, C))

However, different choices of measurement and structural models will result in different parameters being included, and therefore different prior distributions. For example, specifying a DINA measurement model with an independent structural model results in a completely different set of parameters.

default_dcm_priors(
  measurement_model = dina(),
  structural_model = independent()
)
#> # A tibble: 3 × 3
#>   type       coefficient prior      
#>   <chr>      <chr>       <chr>      
#> 1 slip       <NA>        beta(5, 25)
#> 2 guess      <NA>        beta(5, 25)
#> 3 structural <NA>        beta(1, 1)

You can see which parameter types and specific coefficients are available for your model using get_parameters().

get_parameters(lcdm(), qmatrix = roarpa_qmatrix, identifier = "item")
#> # A tibble: 114 × 4
#>    item   type       attributes coefficient
#>    <chr>  <chr>      <chr>      <chr>      
#>  1 fsm_01 intercept  <NA>       l1_0       
#>  2 fsm_01 maineffect fsm        l1_13      
#>  3 fsm_04 intercept  <NA>       l2_0       
#>  4 fsm_04 maineffect fsm        l2_13      
#>  5 fsm_05 intercept  <NA>       l3_0       
#>  6 fsm_05 maineffect fsm        l3_13      
#>  7 fsm_06 intercept  <NA>       l4_0       
#>  8 fsm_06 maineffect fsm        l4_13      
#>  9 fsm_07 intercept  <NA>       l5_0       
#> 10 fsm_07 maineffect fsm        l5_13      
#> # ℹ 104 more rows

To customize priors, use the prior() function. The type argument specifies which parameter type the prior applies to, and the optional coefficient argument can target a specific parameter within that type. Custom priors can be passed to dcm_specify() via the priors argument. Any parameter types not covered by a custom prior will retain their default values.

my_priors <- c(
  prior(normal(0, 1), type = "intercept"),
  prior(lognormal(0, 0.5), type = "maineffect", coefficient = "l1_13")
)

dcm_specify(
  qmatrix = roarpa_qmatrix,
  identifier = "item",
  measurement_model = lcdm(),
  structural_model = unconstrained(),
  priors = my_priors
)
#> A loglinear cognitive diagnostic model (LCDM) measuring 3 attributes
#> with 57 items.
#> 
#> ℹ Attributes:
#> • "lsm" (19 items)
#> • "del" (19 items)
#> • "fsm" (19 items)
#> 
#> ℹ Attribute structure:
#>   Unconstrained
#> 
#> ℹ Prior distributions:
#>   `l1_13` ~ lognormal(0, 0.5)
#>   intercept ~ normal(0, 1)
#>   maineffect ~ lognormal(0, 1)
#>   `Vc` ~ dirichlet(1, 1, 1)

Estimate a model specification

Once we have a model specification, we can estimate it using dcm_estimate(). This function takes a specification object (created by dcm_specify()), along with the response data and the name of the column in the data that contains respondent identifiers.

The method argument controls how the model is estimated. Options include "optim" for point estimation using Stan’s optimizer, "mcmc" for full Markov chain Monte Carlo sampling, "variational" for variational inference, and "pathfinder" (available only when using the cmdstanr backend). Full MCMC provides the most complete picture of the posterior distribution, but takes the longest to run. The optimizer is the fastest option and is useful for quick analyses, but does not provide a full posterior distribution.

The backend argument specifies which Stan interface to use for estimation: "rstan" or "cmdstanr" to use the rstan or cmdstanr package, respectively. The file argument allows you to save the estimated model to disk so that it does not need to be re-estimated if you re-run the script. Any additional arguments are passed directly to the backend’s estimation function (e.g., chains, iter, and warmup for MCMC estimation when using the "rstan" backend).

For this example, we use the optimizer with rstan, which provides fast point estimates of the model parameters.

roarpa_lcdm <- dcm_estimate(
  dcm_spec = roarpa_spec,
  data = roarpa_data,
  identifier = "id",
  method = "optim",
  backend = "rstan",
  file = here::here("start", "fits", "roarpa-lcdm-uncst-rstn")
)
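
If we instead wanted the full posterior distribution, we could switch to MCMC, passing sampler arguments such as chains, iter, and warmup through to rstan (a sketch; expect a much longer run time):

roarpa_lcdm_mcmc <- dcm_estimate(
  dcm_spec = roarpa_spec,
  data = roarpa_data,
  identifier = "id",
  method = "mcmc",
  backend = "rstan",
  chains = 4,
  iter = 2000,
  warmup = 1000,
  file = here::here("start", "fits", "roarpa-lcdm-uncst-mcmc")
)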

Respondent proficiency estimates

After estimating a model, we typically want to know which attributes each respondent has mastered. The score() function calculates respondent proficiency estimates from a fitted model. It returns a list with two elements: class_probabilities, which contains the probability that each respondent belongs to each possible attribute profile, and attribute_probabilities, which contains the marginal probability that each respondent is proficient on each individual attribute.

roarpa_scores <- score(roarpa_lcdm)
roarpa_scores
#> $class_probabilities
#> # A tibble: 2,176 × 3
#>    id    class   probability
#>    <chr> <chr>         <dbl>
#>  1 161   [0,0,0]   2.59 e- 1
#>  2 161   [1,0,0]   2.24 e-10
#>  3 161   [0,1,0]   7.41 e- 1
#>  4 161   [0,0,1]   1.10 e- 7
#>  5 161   [1,1,0]   7.97 e- 9
#>  6 161   [1,0,1]   1.29 e-15
#>  7 161   [0,1,1]   2.35 e- 6
#>  8 161   [1,1,1]   2.14 e-13
#>  9 226   [0,0,0]   1.000e+ 0
#> 10 226   [1,0,0]   6.65 e-11
#> # ℹ 2,166 more rows
#> 
#> $attribute_probabilities
#> # A tibble: 816 × 3
#>    id    attribute probability
#>    <chr> <chr>           <dbl>
#>  1 161   lsm          8.19e- 9
#>  2 161   del          7.41e- 1
#>  3 161   fsm          2.46e- 6
#>  4 226   lsm          6.65e-11
#>  5 226   del          5.07e-10
#>  6 226   fsm          1.19e- 7
#>  7 103   lsm          2.41e-13
#>  8 103   del          3.79e- 6
#>  9 103   fsm          2.62e-14
#> 10 7     lsm          2.14e-15
#> # ℹ 806 more rows

In practice, we often want to convert these probabilities into binary proficiency classifications. A common approach is to use a threshold of .5, classifying a respondent as proficient on an attribute if their estimated probability of proficiency exceeds .5. The choice of threshold matters and can be adjusted based on the intended use of the results.

library(tidyverse)

roarpa_scores$attribute_probabilities |> 
  mutate(probability = as.integer(probability > .5)) |> 
  pivot_wider(names_from = attribute, values_from = probability)
#> # A tibble: 272 × 4
#>    id      lsm   del   fsm
#>    <chr> <int> <int> <int>
#>  1 161       0     1     0
#>  2 226       0     0     0
#>  3 103       0     0     0
#>  4 7         0     0     0
#>  5 185       1     0     1
#>  6 129       0     0     0
#>  7 181       1     1     1
#>  8 36        1     1     1
#>  9 206       1     1     1
#> 10 257       1     1     1
#> # ℹ 262 more rows

Wrapping up

We now have an estimate of proficiency for each respondent on each of the attributes measured by the ROAR-PA. However, before we report these results, it’s important to evaluate the quality of the model. We need to ensure that the model fits well and provides accurate classifications. That is the focus of the Evaluate Model Performance article.

Session information

#> ─ Session info ─────────────────────────────────────────────────────
#>  version      R version 4.5.2 (2025-10-31)
#>  language     (EN)
#>  date         2026-03-11
#>  pandoc       3.9
#>  quarto       1.9.24
#>  Stan (rstan) 2.37.0
#> 
#> ─ Packages ─────────────────────────────────────────────────────────
#>  package          version     date (UTC) source
#>  bridgesampling   1.2-1       2025-11-19 CRAN (R 4.5.2)
#>  dcmdata          0.2.0       2026-03-10 CRAN (R 4.5.2)
#>  dcmstan          0.1.0       2025-11-24 CRAN (R 4.5.2)
#>  dplyr            1.2.0       2026-02-03 CRAN (R 4.5.2)
#>  forcats          1.0.1       2025-09-25 CRAN (R 4.5.0)
#>  ggplot2          4.0.2       2026-02-03 CRAN (R 4.5.2)
#>  loo              2.9.0.9000  2025-12-30 https://stan-dev.r-universe.dev (R 4.5.2)
#>  lubridate        1.9.5       2026-02-04 CRAN (R 4.5.2)
#>  measr            2.0.0.9000  2026-03-04 Github (r-dcm/measr@f033603)
#>  posterior        1.6.1.9000  2025-12-30 https://stan-dev.r-universe.dev (R 4.5.2)
#>  purrr            1.2.1       2026-01-09 CRAN (R 4.5.2)
#>  readr            2.2.0       2026-02-19 CRAN (R 4.5.2)
#>  rlang            1.1.7       2026-01-09 CRAN (R 4.5.2)
#>  rstan            2.36.0.9000 2025-09-26 https://stan-dev.r-universe.dev (R 4.5.1)
#>  stringr          1.6.0       2025-11-04 CRAN (R 4.5.0)
#>  tibble           3.3.1       2026-01-11 CRAN (R 4.5.2)
#>  tidyr            1.3.2       2025-12-19 CRAN (R 4.5.2)
#> 
#> ────────────────────────────────────────────────────────────────────

References

de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333–353. https://doi.org/10.1007/BF02295640
DiBello, L. V., Stout, W. F., & Roussos, L. (1995). Unified cognitive psychometric assessment likelihood-based classification techniques. In P. D. Nichols, S. F. Chipman, & R. L. Brennan (Eds.), Cognitively diagnostic assessment (pp. 361–390). Erlbaum.
Gijbels, L., Burkhardt, A., Ma, W. A., & Yeatman, J. D. (2024). Rapid online assessment of reading and phonological awareness (ROAR-PA). Scientific Reports, 14, Article 10249. https://doi.org/10.1038/s41598-024-60834-9
Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality (Publication No. 3044108). [Doctoral thesis, University of Illinois at Urbana-Champaign]. ProQuest Dissertations and Theses Global.
Henson, R. A., & Templin, J. L. (2019). Loglinear cognitive diagnostic model (LCDM). In M. von Davier & Y.-S. Lee (Eds.), Handbook of diagnostic classification models (pp. 171–185). Springer International Publishing. https://doi.org/10.1007/978-3-030-05584-4_8
Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210. https://doi.org/10.1007/s11336-008-9089-5
Hu, B., & Templin, J. (2020). Using diagnostic classification models to validate attribute hierarchies and evaluate model fit in Bayesian networks. Multivariate Behavioral Research, 55(2), 300–311. https://doi.org/10.1080/00273171.2019.1632165
Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258–272. https://doi.org/10.1177/01466210122032064
Lee, S. Y. (2017, June 27). Cognitive diagnosis model: DINA model with independent attributes. Stan. https://mc-stan.org/learn-stan/case-studies/dina_independent.html
Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. Guilford Press.
Templin, J. (2006). CDM user’s guide [Unpublished manuscript]. Department of Psychology, University of Kansas.
Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79(2), 317–339. https://doi.org/10.1007/s11336-013-9362-0
Templin, J., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287–305. https://doi.org/10.1037/1082-989X.11.3.287
Xu, X., & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data (Nos. RR-08-27). Educational Testing Service. https://files.eric.ed.gov/fulltext/EJ1111272.pdf