Stan: Non-Normal Factor Model

There are so many factor analyses in the world and so few truly normally distributed variables. I have not seen much careful tailoring of residual distributions in medical or psychological research. It is probably because most software don’t support it or make it convenient. It was a revelation to me that you can use Markov Chain Monte Carlo (MCMC) to sample latent variables and then do all kinds of things, like non-Gaussian distributions and latent variable interactions. For Stan users, this will be trivial, but I know that I would have benefited from a simple demo earlier on. So that’s what I’m posting here.

In R, a non-normal factor can be simulated this way:

f<-rnorm(n) #latent factor
x1<-rnorm(n, mean=.8*f, sd=.5) 
x2<-rbinom(n, size=1, p=1/(1+exp(-2*f-.5))) 
x3<-rpois(n, lambda=exp(-1 + 1.5*f)) 

Here is the Stan code for a simplistic model with 3 indicators distributed as normal, Bernoulli, and Poisson:

data { 
  int<lower=0> N;
  vector[N] x1;
  int x2[N];
  int x3[N];
}
parameters {
real x1_mu;
real<lower=0> x1_sd;
real x2_mu;
real x3_mu;

real<lower=0> f_x1; //lower bound prevents sign indeterminacy
real<lower=0> f_x2;
real<lower=0> f_x3;
vector[N] z; //Latent variable scores- i.e., person-intercepts
}
model {
  z  ~ std_normal(); // N(0,1) prior for latent variables.
  x1 ~ normal( x1_mu + f_x1 * z, x1_sd );
  x2 ~ bernoulli( 1.0 ./( 1+exp( - f_x2 * z + x2_mu )) );
  x3 ~ poisson( exp( x3_mu + f_x3 * z)  );
}

Note that z doesn’t really need to be normally distributed either. Putting the above code in a string object model3.src, the model is then run as follows with package rstan:

model3<-stan_model(model_code = model3.src)
stan.dat<-list(N=n,"x1"=x1,"x2"=x2,"x3"=x3)
model3.fit<-sampling(model3, stan.dat, 
                     iter = 1500, warmup = 500, #Reducing the amount of warmup to cut down on time.
                     chains=3, cores=3, 
                     control=list(max_treedepth=8)) #Reduced treedepth from 10 to 8 to cut down time.
model3.draws<-extract(model3.fit)

Sampling z takes a much longer time than marginalizing over it (i.e., by using matrix decomposition algebra), of course, but I do not know how to do that without assuming the indicators are normal. And really, most of us are not using research models to shoot down missiles or dodge pedestrians, so a little extra time to do things right is worth the payoff.