java.lang.Object
org.knowceans.sandbox.gauss.IgmmGibbsSampler
public class IgmmGibbsSampler
IgmmGibbsSampler implements the infinite Gaussian mixture model as a Gibbs sampler. The algorithm runs fully non-parametrically, i.e., the entire set of parameters is estimated using empirical Bayes, and the only prior knowledge involved is the choice of distributions.
Rasmussen (NIPS-12) presented the underlying approach whose complete probability model is as follows:
Mixture model: p(x) = sum_{j=1}^k pi_j * N(mu_j, 1/s_j)
Data points: x | c_j ~ N(mu_j, 1/s_j)
Mean hyperparameters:
mu_j ~ N(lambda, 1/r)
lambda ~ N(mu_y, sigma_y^2)
r ~ Gamma(1, 1/sigma_y^2) = 1/Z * r^(-1/2) * exp(-r * sigma_y^2/2)
Precision hyperparameters:
s_j ~ Gamma(beta, 1/w)
w ~ Gamma(1, sigma_y^2)
Mixture weights:
pi_j ~ Dirichlet(alpha/k)
alpha ~ Gamma(a,b)

Note that Rasmussen defines Gamma(a,b) as having mean b, not a*b, which is reflected in the method sampleGammaDist().
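The mean-based Gamma convention can be illustrated with a minimal sketch. It assumes Rasmussen's density Gamma(a,b) ∝ x^(a/2-1) * exp(-x*a/(2b)), i.e. shape a/2 and scale 2b/a, which gives mean b. The class `RasmussenGamma` and its helper `sampleStandardGamma` are hypothetical names for illustration and are not taken from this class; the standard-Gamma draw uses the Marsaglia-Tsang method, which may differ from what sampleGammaDist() actually does.

```java
import java.util.Random;

// Hypothetical helper, not the actual implementation of sampleGammaDist().
final class RasmussenGamma {

    /** Draws from a standard Gamma(shape, scale = 1) using Marsaglia and Tsang's method. */
    static double sampleStandardGamma(Random rng, double shape) {
        if (shape < 1.0) {
            // Boosting step: Gamma(shape) = Gamma(shape + 1) * U^(1/shape)
            double u = rng.nextDouble();
            return sampleStandardGamma(rng, shape + 1.0) * Math.pow(u, 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rng.nextGaussian();
            double v = 1.0 + c * x;
            if (v <= 0.0) {
                continue; // proposal outside the support, draw again
            }
            v = v * v * v;
            double u = rng.nextDouble();
            if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) {
                return d * v;
            }
        }
    }

    /**
     * Assumed Rasmussen convention: shape a/2 and mean b,
     * hence scale = mean / shape = 2b/a.
     */
    static double sampleGammaDist(Random rng, double a, double b) {
        return sampleStandardGamma(rng, a / 2.0) * (2.0 * b / a);
    }
}
```

Under this convention the sample mean of many draws from `sampleGammaDist(rng, a, b)` approaches b regardless of a, whereas the usual shape/scale form would give a*b.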
Nested Class Summary | |
---|---|
(package private) class | IgmmGibbsSampler.AlphaArms - AlphaArms implements the ARMS update for the CRP hyperparameter |
(package private) class | IgmmGibbsSampler.BetaArms - BetaArms implements the ARMS update for the precision's precision hyperparameter |
Field Summary | |
---|---|
private double | alpha - crp prior |
private IgmmGibbsSampler.AlphaArms | alphaSampler - sampler for alpha |
private double | beta - mean of s_j |
private IgmmGibbsSampler.BetaArms | betaSampler - sampler for beta |
private int | burnIn - burn-in period |
private int[] | cc - state (component) for each data point |
int | debugLevel - debug level for output (5=info, 1=error) |
private int | growstep - array grow step |
private int | iterations - max iterations |
private int | k - number of components |
private double | lambda - mean of mu_j |
private double[] | mu - component means |
private double | muunrep - mean of unrepresented components |
private double | muy - mean of the data |
private int | n - data size |
private double[] | nn - occupation numbers for each component (double for usage of double-valued methods) |
private double | r - precision of mu_j |
private boolean | randomScan - random scan or systematic scan |
private double[] | s - inverse component variances (precisions) |
private double | sigmasqy - variance of the data |
private double | sunrep - precision of unrepresented components |
private int | thinInterval - sampling lag |
private double | w - precision of s_j |
private double[] | ysum - component data sum |
private double[] | yy - vector of univariate data points |
Constructor Summary | |
---|---|
IgmmGibbsSampler(double[] data) - Initialise the Gibbs sampler with data. |
Method Summary | |
---|---|
private void | addComponent() - handle size of componentwise structures. |
void | configure(int iterations, int burnIn, int thinInterval) - set sampling conditions |
private void | debug(int level, java.lang.String string) - print debug information |
(package private) double[] | getMean() - get the mean of the components |
(package private) double[] | getStdDev(double[] mean) - get the standard deviation of the components |
(package private) double[] | getWeights() - get the mixture weights of the components |
private void | gibbs() - Main method: Select initial state ? |
private static double[] | increaseSize(double[] array, int step) - Increase size of array. |
(package private) void | initialState() - Initialisation: starts with one class and assigns data-dependent priors (which Rasmussen justifies in his paper). |
static void | main(java.lang.String[] args) - Driver with example data. |
private void | removeComponent(int j) - removes one component from the model |
private static double[] | removeElement(double[] array, int element) - removes one element from the array |
(package private) double | sampleAlpha() - sample alpha using ARS. |
(package private) double | sampleBeta() - sample beta using ARS. |
(package private) int | sampleC(int i) - sample component association to data point i with likelihood. |
(package private) int | sampleCrpC(int i) - sample component association to data point i using Chinese restaurant process including likelihood term. |
(package private) int | sampleCrpPriorC(int i) - sample component association to data point i using Chinese restaurant process. |
double | sampleGammaDist(double a, double b) - Gamma distribution parameterised so that b is the mean (normally mean = a*b) |
(package private) double | sampleLambda() - sample the component means' mean. |
(package private) double | sampleMu(int j) - sample the means. |
(package private) double | sampleMuUnrep() - sample from prior on mu for unrepresented classes |
(package private) double | sampleNormalDist(double mu, double sigmaSquared) - Normal distribution with variance as parameter instead of standard deviation. |
(package private) double[] | samplePi() - sample the component weights ~ Dirichlet. |
(package private) int | samplePriorC(int i) - sample component association to data point i using Dirichlet distribution. |
(package private) double | sampleR() - sample means' precision. |
(package private) double | sampleS(int j) - sample precision for component j. |
(package private) double | sampleSUnrep() - sample from prior on s for unrepresented classes |
(package private) double | sampleW() - sample precisions' precision. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private double[] yy
private int thinInterval
private int burnIn
private int iterations
private int growstep
private double muy
private double sigmasqy
private double alpha
private IgmmGibbsSampler.AlphaArms alphaSampler
private double lambda
private double beta
private IgmmGibbsSampler.BetaArms betaSampler
private double r
private double w
private double[] mu
private double muunrep
private double[] s
private double sunrep
private int k
private double[] nn
private int[] cc
private int n
private double[] ysum
private boolean randomScan
public int debugLevel
Constructor Detail |
---|
public IgmmGibbsSampler(double[] data)
Parameters:
data -
Method Detail |
---|
public void configure(int iterations, int burnIn, int thinInterval)
Parameters:
iterations -
burnIn -
thinInterval -
void initialState()
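How the three arguments of configure() interact is not spelled out here. A common reading, sketched below as an assumption, is that the first burnIn iterations are discarded and every thinInterval-th iteration after that is retained as a sample. The class `SamplingSchedule` and method `retainedIterations` are hypothetical and only illustrate that convention; the sampler itself may count iterations differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a burn-in / thinning schedule.
final class SamplingSchedule {

    /** Returns the zero-based iteration indices at which a sample would be kept. */
    static List<Integer> retainedIterations(int iterations, int burnIn, int thinInterval) {
        List<Integer> kept = new ArrayList<>();
        for (int iter = 0; iter < iterations; iter++) {
            boolean pastBurnIn = iter >= burnIn;
            // Keep the first post-burn-in iteration and every thinInterval-th one after it.
            if (pastBurnIn && (iter - burnIn) % thinInterval == 0) {
                kept.add(iter);
            }
        }
        return kept;
    }
}
```

For example, `retainedIterations(20, 10, 5)` keeps iterations 10 and 15 under this assumed convention.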
private void addComponent()
Note: We use arrays for components because of their more readable syntax and to avoid a possible speed loss from the cast operations needed when accessing a Vector. Therefore all loops over components should explicitly use k, not, e.g., mu.length. The problem with this approach is that it is hard to remove unoccupied classes.
private static double[] increaseSize(double[] array, int step)
Parameters:
array -
step -
private void removeComponent(int j)
Parameters:
j -
private static double[] removeElement(double[] array, int element)
Parameters:
array -
element -
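The two array helpers can be sketched as follows. This is a guess at their behaviour from the signatures alone: `increaseSize` is assumed to return a copy that is `step` elements longer, and `removeElement` is assumed to return a copy that is one element shorter with later entries shifted down. The real removeElement() might instead keep the array length and rely on k, as the note on addComponent() suggests. The class name `ArrayHelpers` is hypothetical.

```java
// Hypothetical re-implementation based on the signatures only.
final class ArrayHelpers {

    /** Returns a copy of array with step additional zero-initialised slots at the end. */
    static double[] increaseSize(double[] array, int step) {
        double[] grown = new double[array.length + step];
        System.arraycopy(array, 0, grown, 0, array.length);
        return grown;
    }

    /** Returns a copy of array without the entry at index element. */
    static double[] removeElement(double[] array, int element) {
        double[] shrunk = new double[array.length - 1];
        // Entries before the removed index keep their position.
        System.arraycopy(array, 0, shrunk, 0, element);
        // Entries after it move down by one.
        System.arraycopy(array, element + 1, shrunk, element, array.length - element - 1);
        return shrunk;
    }
}
```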
double[] getMean()
double[] getStdDev(double[] mean)
Parameters:
mean -
double[] getWeights()
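How getMean(), getStdDev() and getWeights() derive their results is not documented. One plausible reading, sketched here as an assumption, is that they summarise the current assignment vector cc: weights as occupation fractions, means as per-component averages of yy, and standard deviations as per-component deviations around those means. The class `ComponentStats` is hypothetical, uses the population form (division by the count), and assumes every component has at least one data point.

```java
// Hypothetical summary statistics computed from data yy and assignments cc.
final class ComponentStats {

    /** Fraction of data points assigned to each of the k components. */
    static double[] weights(int[] cc, int k) {
        double[] w = new double[k];
        for (int c : cc) {
            w[c] += 1.0;
        }
        for (int j = 0; j < k; j++) {
            w[j] /= cc.length;
        }
        return w;
    }

    /** Average of the data points assigned to each component. */
    static double[] means(double[] yy, int[] cc, int k) {
        double[] sum = new double[k];
        double[] count = new double[k];
        for (int i = 0; i < yy.length; i++) {
            sum[cc[i]] += yy[i];
            count[cc[i]] += 1.0;
        }
        for (int j = 0; j < k; j++) {
            sum[j] /= count[j];
        }
        return sum;
    }

    /** Population standard deviation of each component around the given means. */
    static double[] stdDevs(double[] yy, int[] cc, double[] mean) {
        double[] sq = new double[mean.length];
        double[] count = new double[mean.length];
        for (int i = 0; i < yy.length; i++) {
            double d = yy[i] - mean[cc[i]];
            sq[cc[i]] += d * d;
            count[cc[i]] += 1.0;
        }
        for (int j = 0; j < mean.length; j++) {
            sq[j] = Math.sqrt(sq[j] / count[j]);
        }
        return sq;
    }
}
```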
double sampleAlpha()
double sampleBeta()
int samplePriorC(int i)
Parameters:
i -
int sampleC(int i)
Parameters:
i -
int sampleCrpPriorC(int i)
Parameters:
i -
int sampleCrpC(int i)
Parameters:
i -
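For the Chinese restaurant process variants, the prior over the component of data point i is usually written as proportional to the occupation number of each existing component (excluding point i itself) and proportional to alpha for a new component. The sketch below shows that prior only, without the likelihood term that sampleCrpC() adds. The class `CrpPrior` and its method names are hypothetical and the actual bookkeeping in this class may differ.

```java
import java.util.Random;

// Hypothetical illustration of the CRP prior over component assignments.
final class CrpPrior {

    /**
     * Returns probabilities for the k existing components, followed by the
     * probability of opening a new component at index k.
     *
     * @param nn               occupation numbers including the current point
     * @param currentComponent component the current point is assigned to
     * @param alpha            CRP concentration parameter
     */
    static double[] probabilities(double[] nn, int currentComponent, double alpha) {
        int k = nn.length;
        double[] p = new double[k + 1];
        double total = 0.0;
        for (int j = 0; j < k; j++) {
            // Remove the current point from its own component's count.
            p[j] = nn[j] - (j == currentComponent ? 1.0 : 0.0);
            total += p[j];
        }
        p[k] = alpha;
        total += alpha;
        for (int j = 0; j <= k; j++) {
            p[j] /= total; // total equals n - 1 + alpha
        }
        return p;
    }

    /** Draws an index from the discrete distribution p by inverting its cumulative sum. */
    static int sample(double[] p, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int j = 0; j < p.length; j++) {
            cumulative += p[j];
            if (u < cumulative) {
                return j;
            }
        }
        return p.length - 1; // guards against rounding in the cumulative sum
    }
}
```

A returned index equal to `nn.length` would signal that a new component should be created, which is where a method like addComponent() would come in.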
public double sampleGammaDist(double a, double b)
Parameters:
a -
b -
double sampleNormalDist(double mu, double sigmaSquared)
Parameters:
mu -
sigmaSquared -
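Because sampleNormalDist() takes a variance rather than a standard deviation, the conversion is a single square root. A minimal sketch, assuming a `java.util.Random` source (the class `NormalByVariance` is hypothetical and the generator used by this class is not documented):

```java
import java.util.Random;

// Hypothetical illustration of a variance-parameterised normal draw.
final class NormalByVariance {

    /** Draws from N(mu, sigmaSquared), where sigmaSquared is the variance. */
    static double sample(Random rng, double mu, double sigmaSquared) {
        // nextGaussian() is N(0, 1); scale by the standard deviation and shift by the mean.
        return mu + Math.sqrt(sigmaSquared) * rng.nextGaussian();
    }
}
```

With the precisions used throughout the model, a draw from N(mu_j, 1/s_j) would be requested as `sample(rng, mu_j, 1.0 / s_j)`.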
double sampleLambda()
double sampleMu(int j)
TODO: update the means whenever component associations change, so that mu[j] need not be calculated here
Parameters:
j -
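Given the model stated in the class description (x | c_j ~ N(mu_j, 1/s_j) and mu_j ~ N(lambda, 1/r)), the conditional posterior of mu_j is the standard conjugate normal with precision n_j*s_j + r and mean (ysum_j*s_j + lambda*r) / (n_j*s_j + r). The sketch below computes these two quantities from a running component sum such as ysum, so the component's data mean does not have to be recomputed. The class `MuPosterior` is hypothetical, and whether sampleMu() is organised this way is an assumption.

```java
// Hypothetical illustration of the conjugate update for a component mean.
final class MuPosterior {

    /**
     * Returns {mean, variance} of the conditional posterior of mu_j.
     *
     * @param ysumJ  sum of the data points assigned to component j
     * @param nJ     number of data points assigned to component j
     * @param sJ     precision of component j
     * @param lambda prior mean of mu_j
     * @param r      prior precision of mu_j
     */
    static double[] meanAndVariance(double ysumJ, double nJ, double sJ, double lambda, double r) {
        double precision = nJ * sJ + r;
        double mean = (ysumJ * sJ + lambda * r) / precision;
        return new double[] { mean, 1.0 / precision };
    }
}
```

The resulting pair could be passed straight to a variance-parameterised normal sampler such as sampleNormalDist(mean, variance).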
double sampleMuUnrep()
double[] samplePi()
double sampleR()
double sampleS(int j)
Parameters:
j -
double sampleSUnrep()
double sampleW()
private void gibbs()
private void debug(int level, java.lang.String string)
Parameters:
level - debug level (5=info to 1=error)
string -
public static void main(java.lang.String[] args)
Parameters:
args -