Stan: Structural Ordinal Item Response Theory with a Latent Interaction

The Non-Gaussian factor model introduces the idea of using other distributions for a factor analysis. But that code is not very generalized, and in reality we’ll tend to need something more like structural equation modeling with non-Gaussian variables.

The name for a factor model with logit-linked indicators, whether dichotomous or ordinal, is Item Response Theory. It has been used for decades to develop instruments and in particular, tests. Because of its history, factor loadings are called the “discrimination” parameters, intercepts are the item “difficulty”, and the factor scores represent each person’s “ability”.

I do not know if there’s a cool brand name for IRT where multiple latent variables go on to have structure amongst themselves. But it seems useful to establish the possibility. We are essentially talking about structural equation modeling with discrete distributions.

I wrote a model to explore a number of difficult computing challenges related to that pursuit and common to the kinds of analysis requests that I get:

  • Non-Gaussian factor indicators
  • of which there are a large number (»20) arising from questionnaire data,
  • each having 5 or more response categories, as a common property of Likert scales,
  • And latent variable interactions, because everyone wants to know if their variables involve moderation.

And a bonus idea: when there are a large number of indicators and we are assuming that they function similarly in representing the factor, we can model the distribution of the factor loadings and perhaps gain a little extra power to estimate them. The discrimination parameters in IRT tend to be the least precise.

So here is a model that addresses all of that, first with simulation of the data:

D<-4 #Number of factors
Q<-50 #Indicators per factor
K<-7 #Levels per indicator
N<-600 #Sample size


z<-matrix(NA, N, D)
z[,1]<-rnorm(N) 
z[,4]<-rnorm(N) 
z[,2]<-rnorm(N) + .3*z[,1]
z[,3]<-rnorm(N) + 2.5*z[,1] + .3*z[,2] + 2*z[,4]*z[,1]

pairs(z)

#I know, this is actually a probit model. 
dat<-array(NA, dim=c(N,D,Q))
for(d in 1:D){
  for(q in 1:Q){
    dat[,d,q]<- findInterval( rnorm(1,1,.05)*z[,d] + rnorm(N), seq(-1, 2, l=K-1) )+1
  }
}

And then the Stan code:

functions{
}
data {
  int<lower=0> N; 	 //sample size
  int<lower=1> D; 	 //num factors
  int<lower=1> Q;	 //num indicators per factor
  int<lower=2> K; 	 //num levels
  int<lower=1,upper=K> y[N, D, Q];
}
parameters {
  matrix[Q,D] lambda_z;
  ordered[K-1] c[Q, D];
  matrix[N,D] z;
  real beta[3];
  real alpha;
  
//optional multi-level distribution of discrimination parameters, helpful for smaller samples but many indicators
  real<lower=0> lambda_mu;
  real<lower=0> lambda_sd;
}
transformed parameters{
//centered parameterization for discrimination parameters
	matrix[Q,D] lambda = lambda_z*lambda_sd + lambda_mu;
}
model {
	z[,1] ~ std_normal();
	z[,4] ~ std_normal();
	z[,2] ~ normal(beta[1]*z[,1], 1.0);
	z[,3] ~ normal(beta[2]*z[,1] + beta[3]*z[,2] + alpha*(z[,1].*z[,4]), 1.0);
	
	for(d in 1:D){
		for(q in 1:Q){
			for(n in 1:N){
			y[n, d, q] ~ ordered_logistic( z[n,d] * lambda[q, d], c[q, d]);
			}
		 }
		 lambda_z[,d] ~ std_normal();
	}

//optional priors	
  lambda_sd ~ normal(0, 5.0);
  lambda_mu ~ normal(0, 5.0);
  beta ~ normal(0.0,5.0);
  alpha ~ normal(0.0,5.0);
}

Note certain features include grouping variables together into matrices instead of inputting them one by one, centered factor loadings (“discrimination” parameters) with a shared mean and standard deviation. Most important for me was achievement of latent interaction by element-wise multiplication of the z-score vectors alpha*(z[,1].*z[,4]). No higher order moments necessary!