org.knowceans.dirichlet.lda
Class LdaMarkovState

java.lang.Object
  extended by org.knowceans.dirichlet.lda.LdaMarkovState
All Implemented Interfaces:
java.io.Serializable
Direct Known Subclasses:
AtmMarkovState, LdaMarkovStateHyper

public class LdaMarkovState
extends java.lang.Object
implements java.io.Serializable

LdaMarkovState represents the state of the Gibbs sampler of the LDA process used for estimating unknown documents (word vectors).

Author:
gregor
See Also:
Serialized Form

Field Summary
 int[][] nd
          n_m,k: nd[m][k] number of words (not terms!) in document m assigned to topic k.
 int[] ndsum
          n_m: ndsum[m] total number of words in document m.
 int[][] nw
          n_k,t: nw[k][t] number of instances of term t assigned to topic k.
 int[] nwsum
          n_k: nwsum[k] total number of words (not terms!) assigned to topic k.
private static long serialVersionUID
           
 int V
          V: size of vocabulary.
 int[][] w
          w_m,n: word vectors of the corpus.
 int[][] z
          z_m,n: topic assignments z_m,n for each word (= term occurrence).
 
Constructor Summary
LdaMarkovState()
          Initialise Markov state for loading or setting by parameters (subclasses only).
LdaMarkovState(int[][] w, int V, int[][] z, int K)
          Initialise the object from the data given.
LdaMarkovState(LdaMarkovState state)
          Copy constructor that copies all internal fields from state to this.
LdaMarkovState(java.lang.String ldaBase)
          Set up the Markov chain from a file.
 
Method Summary
 void copyTo(LdaMarkovState cps)
          Copy the information contained in this to the fields of the argument object.
 void init(int K, java.util.Random rand)
          Random allocation and initialisation of the state count vectors.
protected  void initNd(int M, int K, java.util.Random rand)
           
protected  void initNw(int M, int K)
           
 void load(java.lang.String filename)
          Load state arrays from file.
 void recalculate(int[][] z)
          Recalculates the topic statistics from the word-topic associations.
 void save(java.lang.String filename)
          Save the state to an (optionally compressed) file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

serialVersionUID

private static final long serialVersionUID
See Also:
Constant Field Values

w

public int[][] w
w_m,n: word vectors of the corpus.


V

public int V
V: size of vocabulary.


z

public int[][] z
z_m,n: topic assignments z_m,n for each word (= term occurrence).


nw

public int[][] nw
n_k,t: nw[k][t] number of instances of term t assigned to topic k. In subclasses, this can be generalised as the number of associations between topics and latent-semantic "minor" items (words, links, recommendations).


nd

public int[][] nd
n_m,k: nd[m][k] number of words (not terms!) in document m assigned to topic k. Summed over all topics k, this should equal the document length ndsum[m]. In subclasses, this can be generalised as the number of associations between latent-semantic "major" items (documents, authors, searchers, recommenders) and topics.


nwsum

public int[] nwsum
n_k: nwsum[k] total number of words (not terms!) assigned to topic k.


ndsum

public int[] ndsum
n_m: ndsum[m] total number of words in document m.
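
The field descriptions above imply that each row of nw sums to the matching entry of nwsum, and each row of nd sums to the matching entry of ndsum. A minimal sketch of such a consistency check follows. The class CountInvariantSketch and its method are hypothetical helpers written for illustration; they are not part of this package, and subclasses that generalise the counts may not satisfy the same relations.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of org.knowceans.dirichlet.lda. */
class CountInvariantSketch {

    /** Returns true if the row sums of nw and nd match nwsum and ndsum. */
    static boolean consistent(int[][] nw, int[][] nd, int[] nwsum, int[] ndsum) {
        if (nw.length != nwsum.length || nd.length != ndsum.length) {
            return false;
        }
        // sum over terms t of nw[k][t] should equal nwsum[k]
        for (int k = 0; k < nw.length; k++) {
            if (Arrays.stream(nw[k]).sum() != nwsum[k]) {
                return false;
            }
        }
        // sum over topics k of nd[m][k] should equal ndsum[m]
        for (int m = 0; m < nd.length; m++) {
            if (Arrays.stream(nd[m]).sum() != ndsum[m]) {
                return false;
            }
        }
        // every word is counted once per topic and once per document,
        // so both totals equal the number of words in the corpus
        return Arrays.stream(nwsum).sum() == Arrays.stream(ndsum).sum();
    }
}
```

A check of this kind could be run on the public fields of a state object after loading or copying it.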

Constructor Detail

LdaMarkovState

public LdaMarkovState()
Initialise Markov state for loading or setting by parameters (subclasses only).


LdaMarkovState

public LdaMarkovState(java.lang.String ldaBase)
Set up the Markov chain from a file.

Parameters:
ldaBase -

LdaMarkovState

public LdaMarkovState(LdaMarkovState state)
Copy constructor that copies all internal fields from state to this.

Parameters:
state - the state whose fields are copied

LdaMarkovState

public LdaMarkovState(int[][] w,
                      int V,
                      int[][] z,
                      int K)
Initialise the object from the data given.

Parameters:
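w - word vectors of the corpus (w_m,n)
V - size of vocabulary
z - topic assignments z_m,n for each word
K - number of topics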
Method Detail

recalculate

public void recalculate(int[][] z)
Recalculates the topic statistics from the word-topic associations.

Parameters:
z - topic assignments z_m,n for each word
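
To illustrate what recalculating the topic statistics involves, the sketch below rebuilds the four count arrays from the word vectors w and the assignments z, following the field definitions on this page. The class CountSketch is a hypothetical stand-in, not the library's implementation; the actual method may differ in detail (for example, it takes only z and presumably reads w, V and the number of topics from the object).

```java
/** Hypothetical stand-in for illustration; not part of org.knowceans.dirichlet.lda. */
class CountSketch {
    final int[][] nw;   // nw[k][t]: instances of term t assigned to topic k
    final int[][] nd;   // nd[m][k]: words in document m assigned to topic k
    final int[] nwsum;  // nwsum[k]: total words assigned to topic k
    final int[] ndsum;  // ndsum[m]: total words in document m

    /** Assumes every w[m][n] is in [0, V) and every z[m][n] is in [0, K). */
    CountSketch(int[][] w, int[][] z, int V, int K) {
        int M = w.length;
        nw = new int[K][V];
        nd = new int[M][K];
        nwsum = new int[K];
        ndsum = new int[M];
        for (int m = 0; m < M; m++) {
            for (int n = 0; n < w[m].length; n++) {
                int t = w[m][n];  // term index of the n-th word
                int k = z[m][n];  // topic assigned to the n-th word
                nw[k][t]++;
                nd[m][k]++;
                nwsum[k]++;
            }
            ndsum[m] = w[m].length;
        }
    }
}
```

For a two-document corpus with w = {{0, 1, 1}, {2, 0}} and z = {{0, 1, 1}, {0, 0}}, term 0 is assigned to topic 0 twice, so nw[0][0] is 2.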

copyTo

public void copyTo(LdaMarkovState cps)
Copy the information contained in this to the fields of the argument object.

Parameters:
cps - the object whose fields receive the copy
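
This page does not state whether the copy is deep or shallow. Because the state arrays are public and mutable, a copy that should stay independent of later changes to the source needs its own rows. The helper below is a hypothetical sketch of such a deep copy for int[][] fields like w, z, nw and nd; it is not taken from the library.

```java
/** Hypothetical helper for illustration; not part of org.knowceans.dirichlet.lda. */
class DeepCopySketch {

    /** Copies a jagged int[][] row by row so the result shares no rows with src. */
    static int[][] copy2d(int[][] src) {
        int[][] dst = new int[src.length][];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i].clone();  // clone() on a primitive array copies its elements
        }
        return dst;
    }
}
```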

load

public void load(java.lang.String filename)
Load state arrays from file.

Parameters:
filename - name of the file to load from

save

public void save(java.lang.String filename)
Save the state to an (optionally compressed) file.

Parameters:
filename - name of the file to save to
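
The file format used by save and load is not documented on this page. As a sketch only, the example below shows one way an "optionally compressed" round trip could work: it serialises an int[][] array with Java object serialisation and applies gzip when the file name ends in ".gz". The class StateFileSketch, the ".gz" convention and the choice of object serialisation are all assumptions made for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration; not the library's file format. */
class StateFileSketch {

    /** Writes z to filename, gzip-compressed if the name ends in ".gz" (assumed convention). */
    static void save(String filename, int[][] z) {
        try (OutputStream file = new FileOutputStream(filename);
             OutputStream raw = filename.endsWith(".gz") ? new GZIPOutputStream(file) : file;
             ObjectOutputStream out = new ObjectOutputStream(raw)) {
            out.writeObject(z);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads an int[][] previously written by save. */
    static int[][] load(String filename) {
        try (InputStream file = new FileInputStream(filename);
             InputStream raw = filename.endsWith(".gz") ? new GZIPInputStream(file) : file;
             ObjectInputStream in = new ObjectInputStream(raw)) {
            return (int[][]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```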

init

public void init(int K,
                 java.util.Random rand)
Random allocation and initialisation of the state count vectors.

Parameters:
K - number of topics
rand - random number generator used for the allocation
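
The sampling scheme behind the random allocation is not described here. A common choice for initialising a Gibbs sampler for LDA is to draw each word's topic uniformly from the K topics, which the sketch below assumes. RandomInitSketch is a hypothetical helper, not the library's code; the count arrays would then be built from the resulting assignments.

```java
import java.util.Random;

/** Hypothetical helper for illustration; not part of org.knowceans.dirichlet.lda. */
class RandomInitSketch {

    /** Draws a topic in [0, K) uniformly at random for every word in w (assumed scheme). */
    static int[][] randomAssignments(int[][] w, int K, Random rand) {
        int[][] z = new int[w.length][];
        for (int m = 0; m < w.length; m++) {
            z[m] = new int[w[m].length];
            for (int n = 0; n < w[m].length; n++) {
                z[m][n] = rand.nextInt(K);
            }
        }
        return z;
    }
}
```

Passing a Random constructed with a fixed seed makes the initial state reproducible between runs.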

initNd

protected void initNd(int M,
                      int K,
                      java.util.Random rand)

initNw

protected void initNw(int M,
                      int K)