org.knowceans.sandbox.hlda
Class ChineseRestaurantProcess

java.lang.Object
  extended by org.knowceans.sandbox.hlda.ChineseRestaurantProcess

public class ChineseRestaurantProcess
extends java.lang.Object

CrpNode models a nested Chinese restaurant process (CRP)

Author:
heinrich

Nested Class Summary
(package private)  class ChineseRestaurantProcess.CrpNode
           
 
Field Summary
private  int datasize
          total occupation number (items seated across all tables)
private  double gamma
          concentration parameter of the CRP
private  java.util.Vector<ChineseRestaurantProcess.CrpNode> nodes
          list of occupied tables in the CRP
 
Constructor Summary
ChineseRestaurantProcess(double gamma)
          initialise CRP.
 
Method Summary
static void main(java.lang.String[] args)
           
static
<T> void
print(java.util.Collection<T> a)
           
static
<T> void
print(T[] a)
           
private  int sampleCrp()
          sample a cluster according to the CRP scheme:

          CRP example:
          index :  [0]          [1]              [2]
          draw 1:  1            0                0                 e.g., -> [0]
          draw 2:  1/(1+gamma)  gamma/(1+gamma)  0                 -> [1]
          draw 3:  1/(2+gamma)  1/(2+gamma)      gamma/(2+gamma)   -> [0]
          draw 4:  2/(3+gamma)  1/(3+gamma)      gamma/(3+gamma)   -> [1]

          i.e., the probability of drawing [2] becomes lower with every draw, leading to an aggregated probability of cluster number that is p(clusters) ~ log(datasize).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

nodes

private java.util.Vector<ChineseRestaurantProcess.CrpNode> nodes
list of occupied tables in the CRP


gamma

private double gamma
concentration parameter of the CRP


datasize

private int datasize
total occupation number (items seated across all tables)

Constructor Detail

ChineseRestaurantProcess

public ChineseRestaurantProcess(double gamma)
initialise CRP.

Parameters:
gamma - concentration parameter of the underlying CRP. If gamma equals the mean occupation number per child node, a new node is opened with probability of about 1/nodes. New clusters become rarer as data accumulates, so the number of clusters grows roughly as log(data).
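
The effect of gamma on cluster growth can be illustrated with a short calculation. The sketch below is not part of this class: `CrpGrowthSketch` and `expectedTables` are hypothetical names. It assumes gamma > 0 and uses only the draw probabilities documented for sampleCrp, where a draw made after i items have been seated opens a new table with probability gamma/(i+gamma).

```java
// Hypothetical helper for illustration; not part of ChineseRestaurantProcess.
class CrpGrowthSketch {

    /**
     * Expected number of occupied tables after n draws, assuming gamma > 0.
     * Draw i (with i items already seated) opens a new table with
     * probability gamma / (i + gamma); the expectation is the sum of these.
     */
    static double expectedTables(int n, double gamma) {
        double expected = 0.0;
        for (int i = 0; i < n; i++) {
            expected += gamma / (i + gamma);
        }
        return expected;
    }
}
```

For gamma = 1 the sum is the harmonic number of n, which grows like log(n); larger values of gamma scale the growth up.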
Method Detail

main

public static void main(java.lang.String[] args)

sampleCrp

private int sampleCrp()
sample a cluster according to the CRP scheme:

CRP example:

index :  [0]          [1]              [2]
draw 1:  1            0                0                 e.g., -> [0]
draw 2:  1/(1+gamma)  gamma/(1+gamma)  0                 -> [1]
draw 3:  1/(2+gamma)  1/(2+gamma)      gamma/(2+gamma)   -> [0]
draw 4:  2/(3+gamma)  1/(3+gamma)      gamma/(3+gamma)   -> [1]

i.e., the probability of drawing [2] becomes lower with every draw, so that the expected number of clusters grows roughly as log(datasize).

Returns:
index of the sampled cluster
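
The sampling scheme in the example can be sketched as follows. This is a minimal illustration under stated assumptions, not the private implementation of sampleCrp: the class `CrpSketch`, its methods, and the use of a plain `int[]` of table counts in place of the `nodes` vector are all hypothetical, and gamma is assumed to be positive.

```java
import java.util.Random;

// Hypothetical helper for illustration; not the actual sampleCrp() code.
class CrpSketch {

    /**
     * Probabilities for the next draw: one entry per occupied table
     * (count / (total + gamma)), plus a final entry for opening a new
     * table (gamma / (total + gamma)). Assumes gamma > 0.
     */
    static double[] drawProbabilities(int[] counts, double gamma) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        double norm = total + gamma;
        double[] p = new double[counts.length + 1];
        for (int k = 0; k < counts.length; k++) {
            p[k] = counts[k] / norm;
        }
        p[counts.length] = gamma / norm;
        return p;
    }

    /** Draws a table index by walking the cumulative distribution. */
    static int sample(int[] counts, double gamma, Random rng) {
        double[] p = drawProbabilities(counts, gamma);
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int k = 0; k < p.length; k++) {
            cumulative += p[k];
            if (u < cumulative) {
                return k;
            }
        }
        return p.length - 1; // guard against floating-point rounding
    }
}
```

With counts {2, 1}, the situation before draw 4 in the example, `drawProbabilities` yields 2/(3+gamma), 1/(3+gamma) and gamma/(3+gamma). A returned index equal to `counts.length` means a new table is opened.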

print

public static <T> void print(T[] a)
Parameters:
a - the array whose elements are printed

print

public static <T> void print(java.util.Collection<T> a)
Parameters:
a - the collection whose elements are printed