java.lang.Object org.knowceans.util.Samplers
public class Samplers
Diverse sampling methods, including beta, gamma, multinomial, and Dirichlet distributions as well as Dirichlet processes, using Sethuraman's stick-breaking construction and the Chinese restaurant process. The random generator used is a Mersenne Twister (Cokus), which is the only dependency.
FIXME: markov condition in random generator, see random string?
Nested Class Summary | |
---|---|
static class |
Samplers.CrpData
data structure for a Chinese restaurant process CrpData |
Field Summary | |
---|---|
static double |
lastRand
|
Constructor Summary | |
---|---|
Samplers()
|
Method Summary | |
---|---|
static int |
binarySearch(double[] a,
double p)
perform a binary search and return the first index i at which a[i] >= p. |
static double |
enumClass(double alpha,
int numdata)
enumclass(alpha,numdata) The expected number of tables in a CRP with concentration parameter alpha and numdata items. |
static void |
main(java.lang.String[] args)
|
static double |
meanLik(double lik)
meanlik(lik) Computes estimated likelihood from individual samples. |
static int |
randAntoniak(double alpha,
int n)
sample number of components m that a DP(alpha, G0) has after n samples. |
static int |
randBernoulli(double p)
draw a Bernoulli sample. |
static double[] |
randBeta(double[] aa,
double[] bb)
randbeta(aa, bb) Generates beta samples, one for each element in aa/bb, and scale 1. |
static double |
randBeta(double aa,
double bb)
beta as two-dimensional Dirichlet |
static int |
randBinom(double N,
double p)
draw a binomial sample (by counting Bernoulli samples). |
static double |
randConParam(double alpha,
int numgroup,
int[] numdata,
int[] numtable,
double alphaa,
double alphab,
int numiter)
randconparam(alpha,numdata,numclass,aa,bb,numiter) Generates a sample from a concentration parameter of a HDP with gamma(aa,bb) prior, and number of classes and data items given in numdata, numclass (has to be row vectors). |
static double |
randConParam(double alpha,
int numdata,
int numtopic,
double alphaa,
double alphab,
int numiter)
Sample the Dirichlet process concentration parameter given the topic and data counts and gamma hyperparameters alphaa and alphab. |
static Samplers.CrpData |
randCrp(double[] alpha,
int numdata)
[cc numclass] = randcrp(alpha,numdata) Generates a partition of numdata items with concentration parameter alpha, which can be an array, in which case the Chinese restaurant process has "two new tables to choose for each new customer". |
static Samplers.CrpData |
randCrp(double alpha,
int numdata)
|
static double[] |
randDir(double[] aa)
randdir(aa) generates one Dirichlet sample vector according to the parameters alpha. |
static double[][] |
randDir(double[][] aa,
int direction)
Generate as many Dirichlet column samples as there are columns (direction = 1; randdir(A, 1)) or row samples as there are rows (direction = 2, randdir(A, 2)) in aa (aa[][]), taking the respective parameters. |
static double[] |
randDir(double[] mean,
double precision)
randdir(aa) generates one Dirichlet sample vector according to the parameters alpha. |
static double[][] |
randDir(double[] aa,
int repetitions)
Generate n Dirichlet samples taking parameters aa. |
static double[] |
randDir(double a,
int dimension)
symmetric Dirichlet sample. |
static double[] |
randDmm(double[] probs,
double[][] mean,
double[] precision)
DMM sampling |
static double[] |
randDmm(double[] probs,
double[][] mean,
double[] precision,
int[] component)
DMM sampling |
static double[][] |
randDmm(int n,
double[] probs,
double[][] means,
double[] precisions)
DMM sampling |
static double[][] |
randDmm(int n,
double[] probs,
double[][] means,
double[] precisions,
int[] components)
DMM sampling |
static double |
randGamma(double rr)
self-contained gamma generator. |
static double[] |
randGamma(double[] aa)
randgamma(aa) Generates gamma samples, one for each element in aa. |
static double |
randGamma(double shape,
double scale)
sample from gamma distribution with defined shape a and scale b: |
static double |
randGmm(double[] probs,
double[] mean,
double[] sigma)
GMM sampling |
static double |
randGmm(double[] probs,
double[] mean,
double[] sigma,
int[] component)
GMM sampling |
static double[] |
randGmm(int n,
double[] probs,
double[] mean,
double[] sigma)
GMM sampling |
static double[] |
randGmm(int n,
double[] probs,
double[] mean,
double[] sigma,
int[] components)
GMM sampling |
static int |
randMult(double[] pp)
Creates one multinomial sample given the parameter vector pp. |
static int[] |
randMult(double[] pp,
int repetitions)
Repeatedly sample a multinomial distribution and return a vector with all samples. |
static int |
randMultDirect(double[] pp)
Creates one multinomial sample given the parameter vector pp. |
static int |
randMultDirect(double[] pp,
double rand)
Like randMultDirect, but the random number is given as argument. |
static int[] |
randMultFreqs(double[] pp,
int repetitions)
Repeatedly sample a multinomial distribution and return a vector with category frequencies. |
static int |
randMultSimple(double[] pp)
old version of the randMult method |
static double |
randNorm(double mu,
double sigma)
uses same approach as java.util.Random() |
static int |
randNumTable(double alpha,
int numdata)
randnumtable(weights,maxtable) For each entry in weights and maxtables, generates the number of tables given concentration parameter (weights) and number of data items (maxtable). |
static int[] |
randPerm(int size)
Random permutation of size elements (symbols '0'.. |
int[] |
randPerm(int[] set)
Random permutation of existing set of integers. |
static double[] |
randStick(double[] alpha,
int numclass)
randstick(alpha,numclass) Generates stick-breaking weights with concentration parameter for numclass "sticks". |
static java.lang.String |
randString(int length,
byte[] alphabet)
create a random string of length alphanumeric characters. |
static double |
randUniform(double numvalue)
|
static int |
randUniform(int numvalue)
|
static double[] |
stirling(int nn)
[ss lmss] = stirling(nn) Gives unsigned Stirling numbers of the first kind s(nn,*) in ss. |
static void |
testMult()
|
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static double lastRand
Constructor Detail |
---|
public Samplers()
Method Detail |
---|
public static void main(java.lang.String[] args)
public static double randNorm(double mu, double sigma)
mu
- sigma
-
public static double randGmm(double[] probs, double[] mean, double[] sigma)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
public static double randGmm(double[] probs, double[] mean, double[] sigma, int[] component)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
component - [out] component[0] is filled with the sampled component index
public static double[] randGmm(int n, double[] probs, double[] mean, double[] sigma)
n - number of samples to take (this saves the calculation of the cumulative probabilities for successive trials)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
public static double[] randGmm(int n, double[] probs, double[] mean, double[] sigma, int[] components)
n - number of samples to take (this saves the calculation of the cumulative probabilities for successive trials)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
components - [out] n-vector is filled with the sampled component indices (ignored if null)
public static double[] randDmm(double[] probs, double[][] mean, double[] precision)
probs - mixture responsibilities
mean - mean vector of vectors
precision - precision vector
public static double[] randDmm(double[] probs, double[][] mean, double[] precision, int[] component)
probs - mixture responsibilities
mean - mean vector of vectors
precision - precision vector
component - [out] sampled component of the mixture (or ignored if null)
public static double[][] randDmm(int n, double[] probs, double[][] means, double[] precisions)
probs - mixture responsibilities
means - mean vector of vectors
precisions - precision vector
public static double[][] randDmm(int n, double[] probs, double[][] means, double[] precisions, int[] components)
n - number of trials
probs - mixture responsibilities
means - mean vector of vectors
precisions - precision vector
components - n-vector is filled with the sampled component indices (ignored if null)
public static double randBeta(double aa, double bb)
aa
- bb
-
public static double[] randBeta(double[] aa, double[] bb)
aa
-
public static double randGamma(double rr)
rr - shape parameter
public static double[] randGamma(double[] aa)
aa
-
public static double randGamma(double shape, double scale)
x ~ x^(a-1) * exp(-x/b) / ( gamma(a) * b^a )
E(x) = ab, V(x) = a * b^2. Note that instead of the scale parameter b, often a rate parameter r = 1/b is used: E(x) = a/r, V(x) = a/r^2. For sampling, the following are equivalent: Gamma(a,1)*b <=> Gamma(a,b), with scale parametrisation; Gamma(a,1)/r <=> Gamma(a,r) with rate parametrisation.
shape
- scale
-
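The scaling equivalence above can be illustrated with a short sketch. It draws Gamma(a,1) with the Marsaglia-Tsang method, which is a stand-in chosen for illustration and not necessarily what randGamma uses internally, and then multiplies by the scale b or divides by the rate r. The class and method names (GammaScaleSketch, gammaUnitScale, gammaWithScale, gammaWithRate) are hypothetical, and java.util.Random replaces the Mersenne Twister.

```java
import java.util.Random;

final class GammaScaleSketch {
    // Draws from Gamma(shape a, scale 1) with the Marsaglia-Tsang method.
    // Assumption: illustrative stand-in, not the library's own generator.
    static double gammaUnitScale(double a, Random rng) {
        if (a < 1.0) {
            // Shape boost: Gamma(a) is distributed as Gamma(a + 1) * U^(1/a).
            double u = rng.nextDouble();
            return gammaUnitScale(a + 1.0, rng) * Math.pow(u, 1.0 / a);
        }
        double d = a - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x;
            double v;
            do {
                x = rng.nextGaussian();
                v = 1.0 + c * x;
            } while (v <= 0.0);
            v = v * v * v;
            double u = rng.nextDouble();
            // Cheap squeeze test first, then the exact acceptance test.
            if (u < 1.0 - 0.0331 * x * x * x * x) return d * v;
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) return d * v;
        }
    }

    // Scale parametrisation: Gamma(a, b) = Gamma(a, 1) * b.
    static double gammaWithScale(double a, double b, Random rng) {
        return gammaUnitScale(a, rng) * b;
    }

    // Rate parametrisation: Gamma(a, r) = Gamma(a, 1) / r.
    static double gammaWithRate(double a, double r, Random rng) {
        return gammaUnitScale(a, rng) / r;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        int n = 200000;
        double a = 2.0, b = 3.0, sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double x = gammaWithScale(a, b, rng);
            sum += x;
            sumSq += x * x;
        }
        double mean = sum / n;
        double var = sumSq / n - mean * mean;
        // Sample moments should be close to E(x) = a*b and V(x) = a*b^2.
        System.out.printf("mean=%.3f var=%.3f%n", mean, var);
    }
}
```

With a = 2 and b = 3 the sample mean and variance should land near 6 and 18, which is a quick way to check which parametrisation a generator follows.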
public static int[] randPerm(int size)
size
-
public final int[] randPerm(int[] set)
set
-
public static double[] randDir(double a, int dimension)
aa
-
public static double[] randDir(double[] aa)
aa
- normdim
-
public static double[] randDir(double[] mean, double precision)
mean - (mean_i = alpha_i / sum_j alpha_j)
precision - (precision = alpha_i / mean_i)
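The relation between the two Dirichlet parametrisations stated above (mean_i = alpha_i / sum_j alpha_j, precision = sum_j alpha_j) can be sketched as a pair of conversions. The class and method names (DirichletMeanPrecision, toAlpha, precisionOf, meanOf) are hypothetical and not part of Samplers.

```java
final class DirichletMeanPrecision {
    // alpha_i = mean_i * precision
    static double[] toAlpha(double[] mean, double precision) {
        double[] alpha = new double[mean.length];
        for (int i = 0; i < mean.length; i++) {
            alpha[i] = mean[i] * precision;
        }
        return alpha;
    }

    // precision = sum_j alpha_j
    static double precisionOf(double[] alpha) {
        double sum = 0.0;
        for (double a : alpha) {
            sum += a;
        }
        return sum;
    }

    // mean_i = alpha_i / sum_j alpha_j
    static double[] meanOf(double[] alpha) {
        double precision = precisionOf(alpha);
        double[] mean = new double[alpha.length];
        for (int i = 0; i < alpha.length; i++) {
            mean[i] = alpha[i] / precision;
        }
        return mean;
    }

    public static void main(String[] args) {
        double[] alpha = toAlpha(new double[] {0.2, 0.3, 0.5}, 10.0);
        // alpha is approximately {2, 3, 5}
        System.out.println(java.util.Arrays.toString(alpha));
    }
}
```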
public static double[][] randDir(double[][] aa, int direction)
aa
-
direction - 2 is more efficient (row-major Java matrix structure)
public static double[][] randDir(double[] aa, int repetitions)
aa
-
public static int[] randMultFreqs(double[] pp, int repetitions)
pp
- repetitions
-
public static int[] randMult(double[] pp, int repetitions)
pp
- repetitions
-
public static int randMultSimple(double[] pp)
pp
-
public static void testMult()
public static int randMult(double[] pp)
public static int randMultDirect(double[] pp)
public static int randMultDirect(double[] pp, double rand)
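As an illustration of drawing a category index when the uniform random number is supplied by the caller, the following sketch walks the cumulative sum of pp until it exceeds the given number. It assumes pp is already normalised to sum to 1; whether randMultDirect normalises its input or scans in this way is not stated here, so treat the names (MultinomialDirectSketch, sampleCategory) and the details as hypothetical.

```java
final class MultinomialDirectSketch {
    // Returns the first index i whose cumulative probability exceeds u.
    // Assumes pp sums to 1 and 0 <= u < 1.
    static int sampleCategory(double[] pp, double u) {
        double cumulative = 0.0;
        for (int i = 0; i < pp.length; i++) {
            cumulative += pp[i];
            if (u < cumulative) {
                return i;
            }
        }
        // Guard against rounding error in the cumulative sum.
        return pp.length - 1;
    }

    public static void main(String[] args) {
        double[] pp = {0.2, 0.3, 0.5};
        System.out.println(sampleCategory(pp, 0.10)); // prints 0
        System.out.println(sampleCategory(pp, 0.25)); // prints 1
        System.out.println(sampleCategory(pp, 0.95)); // prints 2
    }
}
```

Passing the random number in makes the draw reproducible, which is convenient for tests.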
public static int binarySearch(double[] a, double p)
a
- p
-
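The contract described for binarySearch (return the first index i at which a[i] >= p) is the classic lower-bound search. A minimal sketch follows, assuming a is sorted in non-decreasing order, for example a vector of cumulative probabilities. The name lowerBound and the choice to return a.length when no element qualifies are assumptions, not documented behaviour of Samplers.binarySearch.

```java
final class LowerBoundSketch {
    // First index i with a[i] >= p, or a.length if every element is smaller.
    static int lowerBound(double[] a, double p) {
        int lo = 0;
        int hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < p) {
                lo = mid + 1; // a[mid] is too small, search to the right
            } else {
                hi = mid;     // a[mid] qualifies, keep searching to the left
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        double[] cdf = {0.2, 0.5, 0.9, 1.0};
        System.out.println(lowerBound(cdf, 0.5));  // prints 1
        System.out.println(lowerBound(cdf, 0.55)); // prints 2
    }
}
```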
public static int randBinom(double N, double p)
N
- p
-
public static int randBernoulli(double p)
p - success probability
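Drawing a binomial sample by counting Bernoulli samples, as the summary for randBinom describes, can be sketched as below. This is an illustration under assumptions: it uses java.util.Random instead of the Mersenne Twister, takes an int trial count where randBinom takes a double N, and the names (BernoulliCountSketch, bernoulli, binomial) are hypothetical.

```java
import java.util.Random;

final class BernoulliCountSketch {
    // Returns 1 with probability p, otherwise 0.
    static int bernoulli(double p, Random rng) {
        return rng.nextDouble() < p ? 1 : 0;
    }

    // Sums n independent Bernoulli(p) draws; cost is linear in n.
    static int binomial(int n, double p, Random rng) {
        int successes = 0;
        for (int i = 0; i < n; i++) {
            successes += bernoulli(p, rng);
        }
        return successes;
    }

    public static void main(String[] args) {
        Random rng = new Random(1L);
        System.out.println(binomial(100, 0.3, rng));
    }
}
```

Counting is simple and exact, but for large trial counts a dedicated binomial algorithm would be faster.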
public static double randConParam(double alpha, int numgroup, int[] numdata, int[] numtable, double alphaa, double alphab, int numiter)
Modification of Escobar and West. Works for multiple groups of data. numdata, numclass are row vectors, one element per group. After Teh (npbayes).
alpha - alpha
numgroup - number of components ??
numdata - number of data items per class
numtable - number of tables per DP
alphaa - hyperparameter (gamma shape)
alphab - hyperparameter (gamma scale)
numiter - number of iterations
public static double randConParam(double alpha, int numdata, int numtopic, double alphaa, double alphab, int numiter)
alpha
- numdata
- numtopic
- alphaa
- alphab
- numiter
-
public static Samplers.CrpData randCrp(double alpha, int numdata)
public static Samplers.CrpData randCrp(double[] alpha, int numdata)
alpha
- numdata
-
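For the scalar-alpha case, a Chinese restaurant process partition can be sketched by seating customers one at a time: customer i joins an existing table k with probability n_k / (i + alpha) and opens a new table with probability alpha / (i + alpha). The sketch below returns plain arrays because the fields of Samplers.CrpData are not documented here. The names (CrpSketch, seatCustomers, countTables) are hypothetical, java.util.Random replaces the Mersenne Twister, and the array-valued alpha variant is not covered.

```java
import java.util.Random;

final class CrpSketch {
    // Returns the table index of each customer; tables are numbered in order of creation.
    // Assumes alpha > 0.
    static int[] seatCustomers(double alpha, int numdata, Random rng) {
        int[] assignment = new int[numdata];
        int[] counts = new int[numdata]; // there can be at most numdata tables
        int numTables = 0;
        for (int i = 0; i < numdata; i++) {
            // Total weight: i already seated customers plus alpha for a new table.
            double u = rng.nextDouble() * (i + alpha);
            int table = numTables; // default: open a new table
            double cumulative = 0.0;
            for (int k = 0; k < numTables; k++) {
                cumulative += counts[k];
                if (u < cumulative) {
                    table = k;
                    break;
                }
            }
            if (table == numTables) {
                numTables++;
            }
            counts[table]++;
            assignment[i] = table;
        }
        return assignment;
    }

    // Number of occupied tables implied by an assignment vector.
    static int countTables(int[] assignment) {
        int max = -1;
        for (int t : assignment) {
            max = Math.max(max, t);
        }
        return max + 1;
    }

    public static void main(String[] args) {
        int[] cc = seatCustomers(1.0, 20, new Random(5L));
        System.out.println(java.util.Arrays.toString(cc));
        System.out.println("tables: " + countTables(cc));
    }
}
```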
public static int randNumTable(double alpha, int numdata)
weights
- maxtable
-
public static double[] randStick(double[] alpha, int numclass)
alpha
- numclass
-
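Truncated stick-breaking weights can be sketched as follows: draw v_k from Beta(1, alpha), give stick k the fraction v_k of what remains, and assign the leftover mass to the last stick so that the weights sum to 1. This sketch assumes a scalar alpha, whereas randStick takes a double[] whose exact meaning is not documented here, and it samples Beta(1, alpha) by inverting its CDF F(v) = 1 - (1 - v)^alpha. The names (StickBreakingSketch, stickWeights) are hypothetical.

```java
import java.util.Random;

final class StickBreakingSketch {
    // Returns numclass weights that sum to 1; requires numclass >= 1 and alpha > 0.
    static double[] stickWeights(double alpha, int numclass, Random rng) {
        if (numclass < 1) {
            throw new IllegalArgumentException("numclass must be at least 1");
        }
        double[] weights = new double[numclass];
        double remaining = 1.0;
        for (int k = 0; k < numclass - 1; k++) {
            // Inverse-CDF draw from Beta(1, alpha).
            double v = 1.0 - Math.pow(rng.nextDouble(), 1.0 / alpha);
            weights[k] = v * remaining;
            remaining *= (1.0 - v);
        }
        // Truncation: the last stick takes whatever is left.
        weights[numclass - 1] = remaining;
        return weights;
    }

    public static void main(String[] args) {
        double[] w = stickWeights(2.0, 5, new Random(9L));
        System.out.println(java.util.Arrays.toString(w));
    }
}
```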
public static double enumClass(double alpha, int numdata)
alpha
- numdata
-
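The expected number of tables in a CRP with concentration alpha and numdata items has the closed form sum over i = 0 .. numdata-1 of alpha / (alpha + i), since customer i opens a new table with that probability. A minimal sketch follows; whether enumClass evaluates this exact sum or an approximation is not stated here, and the names (CrpExpectedTables, expectedTables) are hypothetical.

```java
final class CrpExpectedTables {
    // E[number of tables] = sum_{i=0}^{numdata-1} alpha / (alpha + i), for alpha > 0.
    static double expectedTables(double alpha, int numdata) {
        double expected = 0.0;
        for (int i = 0; i < numdata; i++) {
            expected += alpha / (alpha + i);
        }
        return expected;
    }

    public static void main(String[] args) {
        // 1 + 1/2 + 1/3, approximately 1.8333
        System.out.println(expectedTables(1.0, 3));
    }
}
```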
public static java.lang.String randString(int length, byte[] alphabet)
length - of output
alphabet - alphabet to be used or null
public static double meanLik(double lik)
lik
-
public static double[] stirling(int nn)
nn
-
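Unsigned Stirling numbers of the first kind satisfy the recurrence c(m+1, k) = m * c(m, k) + c(m, k-1) with c(0, 0) = 1. The sketch below fills one row in place. It returns unnormalised values indexed by k = 0 .. nn, which is an assumption: the "[ss lmss]" signature in the summary suggests the library may return scaled values together with a log factor, and that is not reproduced here. The names (StirlingSketch, unsignedStirlingFirst) are hypothetical.

```java
final class StirlingSketch {
    // Returns s with s[k] = c(nn, k) for k = 0 .. nn (unsigned, first kind).
    // Values grow factorially, so doubles lose exactness for large nn.
    static double[] unsignedStirlingFirst(int nn) {
        double[] s = new double[nn + 1];
        s[0] = 1.0; // c(0, 0) = 1
        for (int m = 0; m < nn; m++) {
            // Update from high k to low k so that s[k - 1] still holds row m.
            for (int k = m + 1; k >= 1; k--) {
                s[k] = m * s[k] + s[k - 1];
            }
            s[0] = m * s[0]; // c(m + 1, 0) = 0 for every m >= 0
        }
        return s;
    }

    public static void main(String[] args) {
        // prints [0.0, 6.0, 11.0, 6.0, 1.0]
        System.out.println(java.util.Arrays.toString(unsignedStirlingFirst(4)));
    }
}
```

The row sums to nn!, which gives a cheap sanity check (6 + 11 + 6 + 1 = 24 for nn = 4).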
public static int randAntoniak(double alpha, int n)
alpha
- n
-
public static double randUniform(double numvalue)
numclass
-
public static int randUniform(int numvalue)
numclass
-