org.knowceans.dirichlet.sandbox
Class LdaGibbsSamplerHyper

java.lang.Object
  extended by org.knowceans.dirichlet.lda.LdaGibbsSampler
      extended by org.knowceans.dirichlet.sandbox.LdaGibbsSamplerHyper
All Implemented Interfaces:
java.io.Serializable

public class LdaGibbsSamplerHyper
extends LdaGibbsSampler

Gibbs sampler for estimating the best assignments of topics for words and documents in a corpus. The algorithm is introduced in Tom Griffiths' paper "Gibbs sampling in the generative model of Latent Dirichlet Allocation" (2002).

Author:
heinrich
See Also:
Serialized Form

Field Summary
(package private)  int[] alphaParams
           
(package private)  InvGammaArms alphaSampler
           
(package private)  int[] betaParams
           
(package private)  InvGammaArms betaSampler
           
private static long serialVersionUID
           
(package private)  LdaMarkovStateHyper state
           
 
Fields inherited from class org.knowceans.dirichlet.lda.LdaGibbsSampler
backupIteration, conf, dispcol, numstats, phisum, rand, thetasum
 
Constructor Summary
LdaGibbsSamplerHyper(ITermCorpus corpus, ExtLdaConfiguration conf, java.util.Random rand)
           
LdaGibbsSamplerHyper(LdaMarkovStateHyper state, ExtLdaConfiguration conf, java.util.Random rand)
           
 
Method Summary
private  void estimateHyperParameters(LdaMarkovStateHyper hyper)
          Estimate the hyperparameters from the observations (uses vector-valued hyperparameters).
private  void init()
           
private  void leaveOneOutAlpha(LdaMarkovStateHyper s)
          Estimate the parameters alpha based on Minka's leave-one-out likelihood.
private  void leaveOneOutBeta(LdaMarkovStateHyper s)
          Estimate the parameters beta based on Minka's leave-one-out likelihood.
protected  void sampleCorpus(LdaMarkovState s)
          Sample once through the corpus and update the corresponding state.
private  void sampleHyperParameters()
          Sample scalar hyperparameters from the observations using adaptive rejection Metropolis sampling (ARMS) with Rasmussen's vague inverse-gamma prior.
protected  int sampleLdaFullConditional(LdaMarkovState ms, int m, int n)
          Sample a topic z_i from the full conditional distribution: p(z_i = j | z_-i, w) ∝ (n_-i,j(w_i) + beta(w_i))/(n_-i,j(.) + sum beta) * (n_-i,j(d_i) + alpha(j))/(n_-i,.(d_i) + sum alpha)
 
Methods inherited from class org.knowceans.dirichlet.lda.LdaGibbsSampler
getPhi, getState, getTheta, gibbs, gibbs, gibbsHeap, gibbsHeap, initialState, load, main, output, run, save, saveState, updateParams, updatePhi, updateTheta, writeParameters
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

alphaSampler

InvGammaArms alphaSampler

betaSampler

InvGammaArms betaSampler

alphaParams

int[] alphaParams

betaParams

int[] betaParams

state

LdaMarkovStateHyper state

serialVersionUID

private static final long serialVersionUID
See Also:
Constant Field Values
Constructor Detail

LdaGibbsSamplerHyper

public LdaGibbsSamplerHyper(ITermCorpus corpus,
                            ExtLdaConfiguration conf,
                            java.util.Random rand)
Parameters:
corpus -
conf -
rand -

LdaGibbsSamplerHyper

public LdaGibbsSamplerHyper(LdaMarkovStateHyper state,
                            ExtLdaConfiguration conf,
                            java.util.Random rand)
Parameters:
state -
conf -
rand -
Method Detail

init

private void init()

sampleCorpus

protected void sampleCorpus(LdaMarkovState s)
Description copied from class: LdaGibbsSampler
Sample once through the corpus and update the corresponding state. The parameter is used to choose the state to be sampled from: query vs. corpus; choice of chain for multichain sampling.

Overrides:
sampleCorpus in class LdaGibbsSampler
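
A single pass of this kind is commonly written as "remove the token from the counts, draw a new topic, add the token back". The sketch below illustrates that pattern only; the class name, the count arrays (`nwk`, `nk`, `nmk`) and the `draw` callback are hypothetical and are not taken from LdaGibbsSampler or LdaMarkovState.

```java
import java.util.function.IntBinaryOperator;

/** Hypothetical sketch of one Gibbs sweep over a corpus; not the library's code. */
final class SweepSketch {

    /**
     * Visits every token once and reassigns its topic.
     * docs[m][n] is the term index of token n in document m, z[m][n] its topic.
     * nwk[w][k], nk[k] and nmk[m][k] are the term-topic, topic-total and
     * document-topic counts. draw.applyAsInt(m, n) returns the new topic and
     * is expected to read the already decremented counts.
     */
    static void sweep(int[][] docs, int[][] z, int[][] nwk, int[] nk,
                      int[][] nmk, IntBinaryOperator draw) {
        for (int m = 0; m < docs.length; m++) {
            for (int n = 0; n < docs[m].length; n++) {
                int w = docs[m][n];
                int old = z[m][n];

                // Take the current token out of all counts.
                nwk[w][old]--;
                nk[old]--;
                nmk[m][old]--;

                // Draw a new topic given the remaining assignments.
                int k = draw.applyAsInt(m, n);
                z[m][n] = k;

                // Put the token back under its new topic.
                nwk[w][k]++;
                nk[k]++;
                nmk[m][k]++;
            }
        }
    }
}
```

Passing the draw as a callback keeps the sweep independent of how the conditional distribution is computed, which is roughly the role an overridable sampling method plays in a subclass.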

sampleLdaFullConditional

protected int sampleLdaFullConditional(LdaMarkovState ms,
                                       int m,
                                       int n)
Sample a topic z_i from the full conditional distribution: p(z_i = j | z_-i, w) ∝ (n_-i,j(w_i) + beta(w_i))/(n_-i,j(.) + sum beta) * (n_-i,j(d_i) + alpha(j))/(n_-i,.(d_i) + sum alpha)

Overrides:
sampleLdaFullConditional in class LdaGibbsSampler
Parameters:
m - document
n - word
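
The full conditional above can be turned into a draw by computing one unnormalised weight per topic and inverting the cumulative sum. The following is a minimal sketch under assumed data structures: the class name, the method signature and the count arrays are hypothetical, and the counts are assumed to already exclude the token being resampled.

```java
import java.util.Random;

/** Hypothetical sketch of one collapsed Gibbs draw for LDA; not the library's code. */
final class FullConditionalSketch {

    /**
     * Draws a topic for one token of term w in document m.
     * nwk[w][k]: count of term w in topic k; nk[k]: total tokens in topic k;
     * nmk[m][k]: tokens of document m in topic k. All counts must exclude
     * the current token (the "-i" in the formula).
     */
    static int sampleTopic(int w, int m, int[][] nwk, int[] nk, int[][] nmk,
                           double[] alpha, double[] beta, Random rand) {
        int K = alpha.length;
        double betaSum = 0.0;
        for (double b : beta) betaSum += b;

        // Cumulative unnormalised weights. The document-side denominator
        // (n_m + sum alpha) is identical for all topics, so it is dropped.
        double[] cum = new double[K];
        double total = 0.0;
        for (int k = 0; k < K; k++) {
            double termPart = (nwk[w][k] + beta[w]) / (nk[k] + betaSum);
            double docPart = nmk[m][k] + alpha[k];
            total += termPart * docPart;
            cum[k] = total;
        }

        // Inverse-CDF draw: first topic whose cumulative weight exceeds u.
        double u = rand.nextDouble() * total;
        for (int k = 0; k < K; k++) {
            if (u < cum[k]) return k;
        }
        return K - 1; // guard against floating-point round-off
    }
}
```

Because only the ratios between topics matter, the weights never need to be normalised explicitly; scaling the uniform draw by `total` has the same effect.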

estimateHyperParameters

private void estimateHyperParameters(LdaMarkovStateHyper hyper)
Estimate the hyperparameters from the observations (uses vector-valued hyperparameters).

Parameters:
hyper -

leaveOneOutAlpha

private void leaveOneOutAlpha(LdaMarkovStateHyper s)
Estimate the parameters alpha based on Minka's leave-one-out likelihood.

Parameters:
s -

leaveOneOutBeta

private void leaveOneOutBeta(LdaMarkovStateHyper s)
Estimate the parameters beta based on Minka's leave-one-out likelihood.

Parameters:
s -
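
For reference, Minka's leave-one-out approximation for a Dirichlet-multinomial leads to the fixed-point update alpha_k <- alpha_k * [sum_i n_ik / (n_ik - 1 + alpha_k)] / [sum_i n_i / (n_i - 1 + sum_k alpha_k)]. The sketch below implements that update on a plain count matrix. It is an illustration only: the class and method names are hypothetical, and whether leaveOneOutAlpha and leaveOneOutBeta use exactly this iteration is an assumption, since their bodies are not shown here.

```java
/** Hypothetical sketch of Minka's leave-one-out fixed-point update; not the library's code. */
final class LeaveOneOutSketch {

    /**
     * One fixed-point step for a Dirichlet-multinomial parameter vector.
     * counts[i][k] is the number of observations of category k in group i
     * (for alpha: topic k in document i; for beta: term k in topic i).
     * At least one group must be non-empty.
     */
    static double[] step(int[][] counts, double[] alpha) {
        int K = alpha.length;
        double alphaSum = 0.0;
        for (double a : alpha) alphaSum += a;

        double[] num = new double[K];
        double den = 0.0;
        for (int[] row : counts) {
            int n = 0;
            for (int k = 0; k < K; k++) {
                n += row[k];
                // n_ik / (n_ik - 1 + alpha_k); zero counts contribute nothing
                if (row[k] > 0) num[k] += row[k] / (row[k] - 1 + alpha[k]);
            }
            // n_i / (n_i - 1 + sum_k alpha_k)
            if (n > 0) den += n / (n - 1 + alphaSum);
        }

        double[] next = new double[K];
        for (int k = 0; k < K; k++) next[k] = alpha[k] * num[k] / den;
        return next;
    }

    /** Repeats step() until the largest change falls below tol or maxIter is reached. */
    static double[] estimate(int[][] counts, double[] init, int maxIter, double tol) {
        double[] alpha = init.clone();
        for (int it = 0; it < maxIter; it++) {
            double[] next = step(counts, alpha);
            double change = 0.0;
            for (int k = 0; k < alpha.length; k++) {
                change = Math.max(change, Math.abs(next[k] - alpha[k]));
            }
            alpha = next;
            if (change < tol) break;
        }
        return alpha;
    }
}
```

The same routine covers both hyperparameters: for alpha the rows would be documents and the columns topics, for beta the rows would be topics and the columns terms.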

sampleHyperParameters

private void sampleHyperParameters()
Sample scalar hyperparameters from the observations using adaptive rejection Metropolis sampling (ARMS) with Rasmussen's vague inverse-gamma prior.

Throws:
java.lang.Exception
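
To make the target of such a sampler concrete, the sketch below evaluates an unnormalised log posterior for a scalar, symmetric alpha and updates it with a plain random-walk Metropolis step on log(alpha). This is deliberately not ARMS: the simpler sampler stands in for it to keep the example short. The prior is assumed to be Rasmussen's vague form, p(alpha) proportional to alpha^(-3/2) exp(-1/(2 alpha)), and all names are hypothetical; nothing here reflects the actual InvGammaArms implementation.

```java
import java.util.Random;

/** Hypothetical sketch: Metropolis update for a scalar symmetric alpha; not the library's code. */
final class HyperSamplingSketch {

    /**
     * Unnormalised log posterior of a symmetric Dirichlet parameter a.
     * nmk[m][k] is the number of tokens in document m assigned to topic k.
     */
    static double logPosterior(double a, int[][] nmk) {
        // Assumed prior: p(a) proportional to a^(-3/2) * exp(-1 / (2a)).
        double lp = -1.5 * Math.log(a) - 1.0 / (2.0 * a);
        for (int[] row : nmk) {
            int K = row.length;
            int n = 0;
            for (int c : row) {
                // log Gamma(c + a) - log Gamma(a), expanded as a sum for integer c
                for (int i = 0; i < c; i++) lp += Math.log(a + i);
                n += c;
            }
            // log Gamma(K a) - log Gamma(n + K a)
            for (int i = 0; i < n; i++) lp -= Math.log(K * a + i);
        }
        return lp;
    }

    /** Runs `steps` random-walk Metropolis updates on log(a) and returns the final value. */
    static double sample(double a, int[][] nmk, int steps, double stepSize, Random rand) {
        // Adding log(a) accounts for the Jacobian of the log transform.
        double cur = logPosterior(a, nmk) + Math.log(a);
        for (int s = 0; s < steps; s++) {
            double prop = a * Math.exp(stepSize * rand.nextGaussian());
            double next = logPosterior(prop, nmk) + Math.log(prop);
            if (Math.log(rand.nextDouble()) < next - cur) {
                a = prop;
                cur = next;
            }
        }
        return a;
    }
}
```

Proposing on the log scale keeps alpha positive without explicit bounds checks. An adaptive rejection scheme would replace the fixed `stepSize` proposal with an envelope fitted to the log density.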