Stan: Structural Ordinal Item Response Theory with a Latent Interaction

The Non-Gaussian factor model introduces the idea of using other distributions for a factor analysis. But that code is not very generalized, and in reality we’ll tend to need something more like structural equation modeling with non-Gaussian variables.

The name for a factor model with logit-linked indicators, whether dichotomous or ordinal, is Item Response Theory. It has been used for decades to develop instruments and in particular, tests. Because of its history, factor loadings are called the “discrimination” parameters, intercepts are the item “difficulty”, and the factor scores represent each person’s “ability”.

I do not know if there’s a cool brand name for IRT where multiple latent variables go on to have structure amongst themselves. But it seems useful to establish the possibility. We are essentially talking about structural equation modeling with discrete distributions.

I wrote a model to explore a number of difficult computing challenges related to that pursuit and common to the kinds of analysis requests that I get:

  • Non-Gaussian factor indicators
  • of which there are a large number (»20) arising from questionnaire data,
  • each having 5 or more response categories, as a common property of Likert scales,
  • And latent variable interactions, because everyone wants to know if their variables involve moderation.

And a bonus idea: when there are a large number of indicators and we are assuming that they function similarly in representing the factor, we can model the distribution of the factor loadings and perhaps gain a little extra power to estimate them. The discrimination parameters in IRT tend to be the least precise.

So here is a model that addresses all of that, first with simulation of the data:

D<-4 #Number of factors
Q<-50 #Indicators per factor
K<-7 #Levels per indicator
N<-600 #Sample size


z<-matrix(NA, N, D)
z[,1]<-rnorm(N) 
z[,4]<-rnorm(N) 
z[,2]<-rnorm(N) + .3*z[,1]
z[,3]<-rnorm(N) + 2.5*z[,1] + .3*z[,2] + 2*z[,4]*z[,1]

pairs(z)

#I know, this is actually a probit model. 
dat<-array(NA, dim=c(N,D,Q))
for(d in 1:D){
  for(q in 1:Q){
    dat[,d,q]<- findInterval( rnorm(1,1,.05)*z[,d] + rnorm(N), seq(-1, 2, l=K-1) )+1
  }
}

And then the Stan code:

functions{
}
data {
  int<lower=0> N; 	 //sample size
  int<lower=1> D; 	 //num factors
  int<lower=1> Q;	 //num indicators per factor
  int<lower=2> K; 	 //num levels
  int<lower=1,upper=K> y[N, D, Q];
}
parameters {
  matrix[Q,D] lambda_z;
  ordered[K-1] c[Q, D];
  matrix[N,D] z;
  real beta[3];
  real alpha;
  
//optional multi-level distribution of discrimination parameters, helpful for smaller samples but many indicators
  real<lower=0> lambda_mu;
  real<lower=0> lambda_sd;
}
transformed parameters{
//centered parameterization for discrimination parameters
	matrix[Q,D] lambda = lambda_z*lambda_sd + lambda_mu;
}
model {
	z[,1] ~ std_normal();
	z[,4] ~ std_normal();
	z[,2] ~ normal(beta[1]*z[,1], 1.0);
	z[,3] ~ normal(beta[2]*z[,1] + beta[3]*z[,2] + alpha*(z[,1].*z[,4]), 1.0);
	
	for(d in 1:D){
		for(q in 1:Q){
			for(n in 1:N){
			y[n, d, q] ~ ordered_logistic( z[n,d] * lambda[q, d], c[q, d]);
			}
		 }
		 lambda_z[,d] ~ std_normal();
	}

//optional priors	
  lambda_sd ~ normal(0, 5.0);
  lambda_mu ~ normal(0, 5.0);
  beta ~ normal(0.0,5.0);
  alpha ~ normal(0.0,5.0);
}

Note certain features include grouping variables together into matrices instead of inputting them one by one, centered factor loadings (“discrimination” parameters) with a shared mean and standard deviation. Most important for me was achievement of latent interaction by element-wise multiplication of the z-score vectors alpha*(z[,1].*z[,4]). No higher order moments necessary!

Stan: Non-Normal Factor Model

There are so many factor analyses in the world and so few truly normally distributed variables. I have not seen much careful tailoring of residual distributions in medical or psychological research. It is probably because most software don’t support it or make it convenient. It was a revelation to me that you can use Markov Chain Monte Carlo (MCMC) to sample latent variables and then do all kinds of things, like non-Gaussian distributions and latent variable interactions. For Stan users, this will be trivial, but I know that I would have benefited from a simple demo earlier on. So that’s what I’m posting here.

In R, a non-normal factor can be simulated this way:

f<-rnorm(n) #latent factor
x1<-rnorm(n, mean=.8*f, sd=.5) 
x2<-rbinom(n, size=1, p=1/(1+exp(-2*f-.5))) 
x3<-rpois(n, lambda=exp(-1 + 1.5*f)) 

Here is the Stan code for a simplistic model with 3 indicators distributed as normal, Bernoulli, and Poisson:

data { 
  int<lower=0> N;
  vector[N] x1;
  int x2[N];
  int x3[N];
}
parameters {
real x1_mu;
real<lower=0> x1_sd;
real x2_mu;
real x3_mu;

real<lower=0> f_x1; //lower bound prevents sign indeterminacy
real<lower=0> f_x2;
real<lower=0> f_x3;
vector[N] z; //Latent variable scores- i.e., person-intercepts
}
model {
  z  ~ std_normal(); // N(0,1) prior for latent variables.
  x1 ~ normal( x1_mu + f_x1 * z, x1_sd );
  x2 ~ bernoulli( 1.0 ./( 1+exp( - f_x2 * z + x2_mu )) );
  x3 ~ poisson( exp( x3_mu + f_x3 * z)  );
}

Note that z doesn’t really need to be normally distributed either. Putting the above code in a string object model3.src, the model is then run as follows with package rstan:

model3<-stan_model(model_code = model3.src)
stan.dat<-list(N=n,"x1"=x1,"x2"=x2,"x3"=x3)
model3.fit<-sampling(model3, stan.dat, 
                     iter = 1500, warmup = 500, #Reducing the amount of warmup to cut down on time.
                     chains=3, cores=3, 
                     control=list(max_treedepth=8)) #Reduced treedepth from 10 to 8 to cut down time.
model3.draws<-extract(model3.fit)

Sampling z takes a much longer time than marginalizing over it (i.e., by using matrix decomposition algebra), of course, but I do not know how to do that without assuming the indicators are normal. And really, most of us are not using research models to shoot down missiles or dodge pedestrians, so a little extra time to do things right is worth the payoff.