org.knowceans.dirichlet.atm
Class AtmGibbsQuerySampler

java.lang.Object
  extended by org.knowceans.dirichlet.lda.LdaGibbsSampler
      extended by org.knowceans.dirichlet.atm.AtmGibbsSampler
          extended by org.knowceans.dirichlet.atm.AtmGibbsQuerySampler
All Implemented Interfaces:
java.io.Serializable

public class AtmGibbsQuerySampler
extends AtmGibbsSampler

AtmGibbsQuerySampler samples from a known Markov state, i.e., the trained model of an author corpus, in order to predict the topics of query documents.

Author:
gregor
See Also:
Serialized Form

Field Summary
(package private)  AtmMarkovState atmstateq
          atmstateq contains the query documents.
private static long serialVersionUID
           
(package private)  AtmMarkovState stateSave
          stateSave contains the saved Markov state (after initially loading the state).
private  double[][] thetasumq
           
 
Fields inherited from class org.knowceans.dirichlet.atm.AtmGibbsSampler
atmstate
 
Fields inherited from class org.knowceans.dirichlet.lda.LdaGibbsSampler
backupIteration, conf, dispcol, numstats, phisum, rand, state, thetasum
 
Constructor Summary
AtmGibbsQuerySampler(AtmMarkovState state, ExtLdaConfiguration conf, java.util.Random rand, boolean restorable)
          Initialise the Gibbs sampler with a known Markov state (for querying).
 
Method Summary
 double[][] getPredictiveTheta()
          Get the document--topic associations of the query documents.
 double[][] getSavedPhi()
          Get the backed up phi (without influence of the queries).
protected  void gibbs()
          Main method: select an initial state, then repeatedly update each element conditional on the others.
private  void initialState(boolean restorable)
          Initialisation: initialise the sampler from a known state of the Markov chain for querying the model.
 double[] query(int[] document)
          Initialise the sampler with a one-document query.
 double[][] query(int[][] query)
          Initialise the Gibbs sampler with the query documents.
 void restore()
          For restorable state operation, restore (reinitialise) the state to that of the Markov chain at object creation time.
protected  void updateTheta()
          Add to the statistics the values of theta for the current state.
 
Methods inherited from class org.knowceans.dirichlet.atm.AtmGibbsSampler
getState, gibbsAtm, gibbsAtmHeap, gibbsAtmHeap, initialState, main, run, sampleAtmFullConditional, sampleCorpus, saveState
 
Methods inherited from class org.knowceans.dirichlet.lda.LdaGibbsSampler
getPhi, getTheta, gibbs, gibbsHeap, gibbsHeap, load, output, sampleCorpus, sampleLdaFullConditional, save, updateParams, updatePhi, writeParameters
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

serialVersionUID

private static final long serialVersionUID
See Also:
Constant Field Values

atmstateq

AtmMarkovState atmstateq
atmstateq contains the query documents. Its nw and nwsum fields are shared with the corpus state.


stateSave

AtmMarkovState stateSave
stateSave contains the saved Markov state (after initially loading the state). It is a complete copy of the


thetasumq

private double[][] thetasumq
Constructor Detail

AtmGibbsQuerySampler

public AtmGibbsQuerySampler(AtmMarkovState state,
                            ExtLdaConfiguration conf,
                            java.util.Random rand,
                            boolean restorable)
Initialise the Gibbs sampler with a known Markov state (for querying).

Parameters:
state -
conf -
rand -
restorable - whether the initial Markov state can be restored using restore() (see there).
Method Detail

initialState

private void initialState(boolean restorable)
Initialisation: initialise the sampler from a known state of the Markov chain for querying the model.

Parameters:
restorable - whether the initial Markov state can be restored using restore() (see there).

query

public double[] query(int[] document)
Initialise the sampler with a one-document query.

Parameters:
document -
Returns:
document--topic associations for the query

query

public double[][] query(int[][] query)
Initialise the Gibbs sampler with the query documents.

Parameters:
query - word vectors
Returns:
document--topic associations (same as getPredictiveTheta())

restore

public void restore()
For restorable state operation, restore (reinitialise) the state to that of the Markov chain at object creation time.

Because the association counts are influenced by the queries, the original state of the Markov chain becomes "dirty". To allow a later reset, this state can be backed up by enabling the restorable argument of the constructor.
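
The backup-and-restore idea described above can be sketched as follows. This is a minimal illustration only, not the class's actual implementation: RestorableCountsSketch, addQueryWord and count are hypothetical names, and a plain int[][] stands in for the count arrays of AtmMarkovState. Throwing an exception when the sampler was not created as restorable is likewise an assumption.

```java
import java.util.Arrays;

// Hypothetical stand-in for a sampler whose counts get "dirty" through queries.
class RestorableCountsSketch {
    // Working word-topic counts; queries modify these in place.
    private int[][] nw;
    // Deep copy taken at construction time; null when not restorable.
    private final int[][] saved;

    RestorableCountsSketch(int[][] counts, boolean restorable) {
        this.nw = counts;
        this.saved = restorable ? deepCopy(counts) : null;
    }

    // Copy every row so that later changes to nw cannot leak into the backup.
    static int[][] deepCopy(int[][] src) {
        int[][] dst = new int[src.length][];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Arrays.copyOf(src[i], src[i].length);
        }
        return dst;
    }

    // A query word assigned to a topic changes the shared counts.
    void addQueryWord(int word, int topic) {
        nw[word][topic]++;
    }

    // Reset the working counts to the snapshot taken at construction time.
    void restore() {
        if (saved == null) {
            throw new IllegalStateException("sampler was not created as restorable");
        }
        nw = deepCopy(saved);
    }

    int count(int word, int topic) {
        return nw[word][topic];
    }

    public static void main(String[] args) {
        RestorableCountsSketch s =
                new RestorableCountsSketch(new int[][] {{1, 0}, {0, 2}}, true);
        s.addQueryWord(0, 1);
        System.out.println("after query:   " + s.count(0, 1));
        s.restore();
        System.out.println("after restore: " + s.count(0, 1));
    }
}
```

Restoring from a fresh deep copy, rather than handing out the backup itself, keeps the snapshot intact so that restore() can be called again after further queries.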


updateTheta

protected void updateTheta()
Add to the statistics the values of theta for the current state.

Overrides:
updateTheta in class LdaGibbsSampler
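
As an illustration of what "adding theta to the statistics" may look like, the sketch below accumulates a running sum over sampled states and averages it on read-out. The field names thetasum and numstats are taken from the inherited fields listed on this page; the class ThetaAverageSketch, the count array nd, the symmetric hyperparameter alpha and the smoothing formula are assumptions based on the usual collapsed Gibbs estimate, not confirmed details of this class.

```java
// Hypothetical sketch: average theta over several sampled states.
class ThetaAverageSketch {
    private final double[][] thetasum; // running sum of theta over sampled states
    private int numstats = 0;          // number of states added so far
    private final double alpha;        // symmetric Dirichlet hyperparameter (assumed)

    ThetaAverageSketch(int docs, int topics, double alpha) {
        this.thetasum = new double[docs][topics];
        this.alpha = alpha;
    }

    // Add theta of the current state, computed from document-topic counts nd[m][k].
    void updateTheta(int[][] nd) {
        for (int m = 0; m < nd.length; m++) {
            int K = nd[m].length;
            int ndsum = 0;
            for (int k = 0; k < K; k++) {
                ndsum += nd[m][k];
            }
            for (int k = 0; k < K; k++) {
                thetasum[m][k] += (nd[m][k] + alpha) / (ndsum + K * alpha);
            }
        }
        numstats++;
    }

    // Return the average of all theta values added so far.
    double[][] getPredictiveTheta() {
        if (numstats == 0) {
            throw new IllegalStateException("no statistics collected yet");
        }
        double[][] theta = new double[thetasum.length][];
        for (int m = 0; m < thetasum.length; m++) {
            theta[m] = new double[thetasum[m].length];
            for (int k = 0; k < theta[m].length; k++) {
                theta[m][k] = thetasum[m][k] / numstats;
            }
        }
        return theta;
    }

    public static void main(String[] args) {
        ThetaAverageSketch s = new ThetaAverageSketch(1, 2, 0.5);
        s.updateTheta(new int[][] {{3, 1}});
        s.updateTheta(new int[][] {{1, 3}});
        System.out.println(java.util.Arrays.toString(s.getPredictiveTheta()[0]));
    }
}
```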

getPredictiveTheta

public double[][] getPredictiveTheta()
Get the document--topic associations of the query documents.

Returns:
document--topic associations of the query documents

getSavedPhi

public double[][] getSavedPhi()
Get the backed up phi (without influence of the queries).

Returns:
phi multinomial mixture of topic words (K x V)
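
For orientation, a K x V phi matrix is commonly estimated from word-topic counts as in the sketch below. The array name nw follows the field mentioned under atmstateq, indexed here as nw[w][k] by assumption; PhiFromCountsSketch, the symmetric hyperparameter beta and the formula itself are assumptions and may differ from what getSavedPhi() actually computes.

```java
// Hypothetical sketch: estimate phi (K x V) from word-topic counts nw (V x K).
class PhiFromCountsSketch {
    static double[][] phi(int[][] nw, double beta) {
        int V = nw.length;    // vocabulary size
        int K = nw[0].length; // number of topics
        // Total number of words assigned to each topic.
        int[] nwsum = new int[K];
        for (int w = 0; w < V; w++) {
            for (int k = 0; k < K; k++) {
                nwsum[k] += nw[w][k];
            }
        }
        // Smoothed relative frequency of word w within topic k.
        double[][] phi = new double[K][V];
        for (int k = 0; k < K; k++) {
            for (int w = 0; w < V; w++) {
                phi[k][w] = (nw[w][k] + beta) / (nwsum[k] + V * beta);
            }
        }
        return phi;
    }

    public static void main(String[] args) {
        double[][] phi = phi(new int[][] {{2, 0}, {0, 2}}, 1.0);
        System.out.println(java.util.Arrays.deepToString(phi));
    }
}
```

Computing phi from the saved counts rather than the working counts is what would keep it free of query influence.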

gibbs

protected void gibbs()
Description copied from class: AtmGibbsSampler
Main method: select an initial state, then repeat a large number of times: (1) select an element; (2) update it conditional on the other elements. If appropriate, output a summary for each run.

Overrides:
gibbs in class AtmGibbsSampler
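
The loop described for gibbs() can be sketched for a single query document as below. This is a simplified, LDA-style illustration with no author variable, so it is not the author-topic conditional this class samples from. GibbsSweepSketch, the parameter names and the symmetric hyperparameters alpha and beta are assumptions; nw and nwsum are updated in place, mirroring the note that the query state shares these fields with the corpus state.

```java
import java.util.Random;

// Hypothetical sketch of a collapsed Gibbs loop for one query document.
class GibbsSweepSketch {
    // Returns the topic counts of the document after the given number of sweeps.
    static int[] sample(int[] doc, int[][] nw, int[] nwsum,
                        double alpha, double beta, int iterations, Random rand) {
        int V = nw.length;
        int K = nwsum.length;
        int[] z = new int[doc.length]; // topic assignment of each word
        int[] nd = new int[K];         // topic counts of the document

        // Select initial state: assign a random topic to every word.
        for (int n = 0; n < doc.length; n++) {
            z[n] = rand.nextInt(K);
            nw[doc[n]][z[n]]++;
            nwsum[z[n]]++;
            nd[z[n]]++;
        }

        double[] p = new double[K];
        for (int it = 0; it < iterations; it++) {
            for (int n = 0; n < doc.length; n++) {
                // 1. Select an element and remove it from the counts.
                int w = doc[n];
                int old = z[n];
                nw[w][old]--;
                nwsum[old]--;
                nd[old]--;

                // 2. Update conditional on all other elements.
                double total = 0.0;
                for (int k = 0; k < K; k++) {
                    p[k] = (nw[w][k] + beta) / (nwsum[k] + V * beta) * (nd[k] + alpha);
                    total += p[k];
                }
                // Draw a topic from the unnormalised distribution p.
                double u = rand.nextDouble() * total;
                int k = 0;
                double acc = p[0];
                while (acc < u && k < K - 1) {
                    k++;
                    acc += p[k];
                }
                z[n] = k;
                nw[w][k]++;
                nwsum[k]++;
                nd[k]++;
            }
        }
        return nd;
    }

    public static void main(String[] args) {
        int[][] nw = {{5, 0}, {0, 5}};
        int[] nwsum = {5, 5};
        int[] nd = sample(new int[] {0, 0, 1}, nw, nwsum, 0.5, 0.1, 10, new Random(1));
        System.out.println(java.util.Arrays.toString(nd));
    }
}
```

Because the shared counts change during the loop, a caller that wants to reuse the trained model afterwards would need a backup of the kind restore() provides.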