java.lang.Object
  org.knowceans.util.Samplers
public class Samplers
Various sampling methods, including beta, gamma, multinomial, and Dirichlet distributions as well as Dirichlet processes, using Sethuraman's stick-breaking construction and the Chinese restaurant process. The random generator used is a Mersenne Twister (Cokus), which is the only dependency.
FIXME: markov condition in random generator, see random string?
| Nested Class Summary | |
|---|---|
static class |
Samplers.CrpData
data structure for a Chinese restaurant process CrpData |
| Field Summary | |
|---|---|
static double |
lastRand
|
| Constructor Summary | |
|---|---|
Samplers()
|
|
| Method Summary | |
|---|---|
static int |
binarySearch(double[] a,
double p)
perform a binary search and return the first index i at which a[i] >= p. |
static double |
enumClass(double alpha,
int numdata)
enumclass(alpha,numdata) The expected number of tables in a CRP with concentration parameter alpha and numdata items. |
static void |
main(java.lang.String[] args)
|
static double |
meanLik(double lik)
meanlik(lik) Computes estimated likelihood from individual samples. |
static int |
randAntoniak(double alpha,
int n)
sample number of components m that a DP(alpha, G0) has after n samples. |
static int |
randBernoulli(double p)
draw a Bernoulli sample. |
static double[] |
randBeta(double[] aa,
double[] bb)
randbeta(aa, bb) Generates beta samples, one for each element in aa/bb, and scale 1. |
static double |
randBeta(double aa,
double bb)
beta as two-dimensional Dirichlet |
static int |
randBinom(double N,
double p)
draw a binomial sample (by counting Bernoulli samples). |
static double |
randConParam(double alpha,
int numgroup,
int[] numdata,
int[] numtable,
double alphaa,
double alphab,
int numiter)
randconparam(alpha,numdata,numclass,aa,bb,numiter) Generates a sample from a concentration parameter of a HDP with gamma(aa,bb) prior, and number of classes and data items given in numdata, numclass (has to be row vectors). |
static double |
randConParam(double alpha,
int numdata,
int numtopic,
double alphaa,
double alphab,
int numiter)
Sample the Dirichlet process concentration parameter given the topic and data counts and gamma hyperparameters alphaa and alphab. |
static Samplers.CrpData |
randCrp(double[] alpha,
int numdata)
[cc numclass] = randcrp(alpha,numdata) Generates a partition of numdata items with concentration parameter alpha, which can be an array, in which case the Chinese restaurant process has "two new tables to choose for each new customer". |
static Samplers.CrpData |
randCrp(double alpha,
int numdata)
|
static double[] |
randDir(double[] aa)
randdir(aa) generates one Dirichlet sample vector according to the parameters alpha. |
static double[][] |
randDir(double[][] aa,
int direction)
Generate as many Dirichlet column samples as there are columns (direction = 1; randdir(A, 1)) or row samples as there are rows (direction = 2, randdir(A, 2)) in aa (aa[][]), taking the respective parameters. |
static double[] |
randDir(double[] mean,
double precision)
Generates one Dirichlet sample vector according to the given mean vector and precision. |
static double[][] |
randDir(double[] aa,
int repetitions)
Generate n Dirichlet samples taking parameters aa. |
static double[] |
randDir(double a,
int dimension)
symmetric Dirichlet sample. |
static double[] |
randDmm(double[] probs,
double[][] mean,
double[] precision)
DMM sampling |
static double[] |
randDmm(double[] probs,
double[][] mean,
double[] precision,
int[] component)
DMM sampling |
static double[][] |
randDmm(int n,
double[] probs,
double[][] means,
double[] precisions)
DMM sampling |
static double[][] |
randDmm(int n,
double[] probs,
double[][] means,
double[] precisions,
int[] components)
DMM sampling |
static double |
randGamma(double rr)
self-contained gamma generator. |
static double[] |
randGamma(double[] aa)
randgamma(aa) Generates gamma samples, one for each element in aa. |
static double |
randGamma(double shape,
double scale)
sample from gamma distribution with defined shape a and scale b: |
static double |
randGmm(double[] probs,
double[] mean,
double[] sigma)
GMM sampling |
static double |
randGmm(double[] probs,
double[] mean,
double[] sigma,
int[] component)
GMM sampling |
static double[] |
randGmm(int n,
double[] probs,
double[] mean,
double[] sigma)
GMM sampling |
static double[] |
randGmm(int n,
double[] probs,
double[] mean,
double[] sigma,
int[] components)
GMM sampling |
static int |
randMult(double[] pp)
Creates one multinomial sample given the parameter vector pp. |
static int[] |
randMult(double[] pp,
int repetitions)
Sample a multinomial distribution repeatedly and return a vector with all samples. |
static int |
randMultDirect(double[] pp)
Creates one multinomial sample given the parameter vector pp. |
static int |
randMultDirect(double[] pp,
double rand)
Like randMultDirect, but the random number is given as argument. |
static int[] |
randMultFreqs(double[] pp,
int repetitions)
Sample a multinomial distribution repeatedly and return a vector with category frequencies. |
static int |
randMultSimple(double[] pp)
old version of the randMult method |
static double |
randNorm(double mu,
double sigma)
uses same approach as java.util.Random() |
static int |
randNumTable(double alpha,
int numdata)
randnumtable(weights,maxtable) For each entry in weights and maxtables, generates the number of tables given concentration parameter (weights) and number of data items (maxtable). |
static int[] |
randPerm(int size)
Random permutation of size elements (symbols '0'.. |
int[] |
randPerm(int[] set)
Random permutation of existing set of integers. |
static double[] |
randStick(double[] alpha,
int numclass)
randstick(alpha,numclass) Generates stick-breaking weights with concentration parameter for numclass "sticks". |
static java.lang.String |
randString(int length,
byte[] alphabet)
create a random string of length alphanumeric characters. |
static double |
randUniform(double numvalue)
|
static int |
randUniform(int numvalue)
|
static double[] |
stirling(int nn)
[ss lmss] = stirling(nn) Gives unsigned Stirling numbers of the first kind s(nn,*) in ss. |
static void |
testMult()
|
| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static double lastRand
| Constructor Detail |
|---|
public Samplers()
| Method Detail |
|---|
public static void main(java.lang.String[] args)
public static double randNorm(double mu,
double sigma)
mu - sigma -
public static double randGmm(double[] probs,
double[] mean,
double[] sigma)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
public static double randGmm(double[] probs,
double[] mean,
double[] sigma,
int[] component)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
component - [out] component[0] is filled with the sampled component index
public static double[] randGmm(int n,
double[] probs,
double[] mean,
double[] sigma)
n - number of samples to take (this saves the calculation of the cumulative probabilities for successive trials)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
public static double[] randGmm(int n,
double[] probs,
double[] mean,
double[] sigma,
int[] components)
n - number of samples to take (this saves the calculation of the cumulative probabilities for successive trials)
probs - mixture responsibilities
mean - mean vector
sigma - stddev vector
components - [out] n-vector is filled with the sampled component indices (ignored if null)
public static double[] randDmm(double[] probs,
double[][] mean,
double[] precision)
probs - mixture responsibilities
mean - mean vector of vectors
precision - precision vector
public static double[] randDmm(double[] probs,
double[][] mean,
double[] precision,
int[] component)
probs - mixture responsibilities
mean - mean vector of vectors
precision - precision vector
component - [out] sampled component of the mixture (or ignored if null)
public static double[][] randDmm(int n,
double[] probs,
double[][] means,
double[] precisions)
probs - mixture responsibilities
means - mean vector of vectors
precisions - precision vector
public static double[][] randDmm(int n,
double[] probs,
double[][] means,
double[] precisions,
int[] components)
n - number of trials
probs - mixture responsibilities
means - mean vector of vectors
precisions - precision vector
components - n-vector is filled with the sampled component indices (ignored if null)
public static double randBeta(double aa,
double bb)
aa - bb -
public static double[] randBeta(double[] aa,
double[] bb)
aa -
public static double randGamma(double rr)
rr - shape parameter
public static double[] randGamma(double[] aa)
aa -
public static double randGamma(double shape,
double scale)
p(x) = x^(a-1) * exp(-x/b) / ( gamma(a) * b^a )
E(x) = ab, V(x) = ab^2. Note that instead of the scale parameter b, often a rate parameter r = 1/b is used: E(x) = a/r, V(x) = a/r^2. For sampling, the following are equivalent: Gamma(a,1)*b <=> Gamma(a,b) with scale parametrisation; Gamma(a,1)/r <=> Gamma(a,r) with rate parametrisation.
shape - scale -
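The scaling equivalence above can be checked empirically. The following is a minimal sketch, not the Samplers implementation: it assumes `java.util.Random` in place of the Mersenne Twister, uses the Marsaglia-Tsang method for the unit-scale draw, and the class and method names (`GammaSketch`, `gammaUnit`, `meanAndVariance`) are hypothetical.

```java
import java.util.Random;

/** Sketch: gamma sampling by scaling a unit-scale draw (names are hypothetical). */
class GammaSketch {
    /** Draws from Gamma(shape, 1) using the Marsaglia-Tsang method. */
    static double gammaUnit(double shape, Random rng) {
        if (shape < 1.0) {
            // Boost small shapes: Gamma(a, 1) = Gamma(a + 1, 1) * U^(1/a)
            return gammaUnit(shape + 1.0, rng) * Math.pow(rng.nextDouble(), 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rng.nextGaussian();
            double v = 1.0 + c * x;
            if (v <= 0.0) continue;               // reject, v must be positive
            v = v * v * v;
            double u = rng.nextDouble();
            if (u < 1.0 - 0.0331 * x * x * x * x) return d * v;                 // fast accept
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) return d * v;
        }
    }

    /** Empirical mean and variance of n draws of Gamma(shape, 1) * scale. */
    static double[] meanAndVariance(double shape, double scale, int n, long seed) {
        Random rng = new Random(seed);
        double sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double x = gammaUnit(shape, rng) * scale;  // scale parametrisation
            sum += x;
            sumSq += x * x;
        }
        double mean = sum / n;
        return new double[] {mean, sumSq / n - mean * mean};
    }

    public static void main(String[] args) {
        double[] mv = meanAndVariance(2.0, 3.0, 200000, 42L);
        // Theory for a = 2, b = 3: E(x) = a*b = 6, V(x) = a*b^2 = 18
        System.out.println("mean=" + mv[0] + " variance=" + mv[1]);
    }
}
```

Dividing the unit-scale draw by a rate r instead of multiplying by b gives the rate parametrisation.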
public static int[] randPerm(int size)
size -
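One common way to produce a uniform random permutation of the symbols 0..size-1 is the Fisher-Yates shuffle. The sketch below illustrates that technique only; the page does not say which algorithm randPerm uses, and the class name `PermSketch` and the use of `java.util.Random` are assumptions.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch: Fisher-Yates shuffle of 0..size-1 (hypothetical helper, not Samplers code). */
class PermSketch {
    static int[] randPerm(int size, Random rng) {
        int[] perm = new int[size];
        for (int i = 0; i < size; i++) perm[i] = i;
        // Walk from the end, swapping each position with a random earlier-or-equal one.
        for (int i = size - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = perm[i];
            perm[i] = perm[j];
            perm[j] = tmp;
        }
        return perm;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(randPerm(8, new Random())));
    }
}
```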
public final int[] randPerm(int[] set)
set -
public static double[] randDir(double a,
int dimension)
aa -
public static double[] randDir(double[] aa)
aa - normdim -
public static double[] randDir(double[] mean,
double precision)
mean - (mean_i = alpha_i / sum_j alpha_j)
precision - (precision = alpha_i / mean_i)
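The two parameter definitions imply alpha_i = mean_i * precision and precision = sum_j alpha_j. A small sketch of that conversion follows; the class and method names are hypothetical and it only illustrates the arithmetic, not how randDir is implemented.

```java
/** Sketch: converting between (mean, precision) and Dirichlet alpha (hypothetical names). */
class DirichletParamSketch {
    /** alpha_i = mean_i * precision */
    static double[] toAlpha(double[] mean, double precision) {
        double[] alpha = new double[mean.length];
        for (int i = 0; i < mean.length; i++) alpha[i] = mean[i] * precision;
        return alpha;
    }

    /** precision = sum_j alpha_j, which equals alpha_i / mean_i for every i */
    static double precisionOf(double[] alpha) {
        double sum = 0.0;
        for (double a : alpha) sum += a;
        return sum;
    }

    /** mean_i = alpha_i / sum_j alpha_j */
    static double[] meanOf(double[] alpha) {
        double sum = precisionOf(alpha);
        double[] mean = new double[alpha.length];
        for (int i = 0; i < alpha.length; i++) mean[i] = alpha[i] / sum;
        return mean;
    }

    public static void main(String[] args) {
        double[] alpha = toAlpha(new double[] {0.2, 0.3, 0.5}, 10.0);
        System.out.println(java.util.Arrays.toString(alpha));
    }
}
```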
public static double[][] randDir(double[][] aa,
int direction)
aa -
direction - 2 is more efficient (row-major Java matrix structure)
public static double[][] randDir(double[] aa,
int repetitions)
aa -
public static int[] randMultFreqs(double[] pp,
int repetitions)
pp - repetitions -
public static int[] randMult(double[] pp,
int repetitions)
pp - repetitions -
public static int randMultSimple(double[] pp)
pp -
public static void testMult()
public static int randMult(double[] pp)
public static int randMultDirect(double[] pp)
public static int randMultDirect(double[] pp,
double rand)
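Passing the random number as an argument makes the draw deterministic, which is convenient for testing. The sketch below shows one plausible reading: a linear scan over the cumulative sum of pp. It assumes pp may be unnormalised and that rand lies in [0, 1); neither is confirmed by this page, and the class name is hypothetical.

```java
/** Sketch: categorical draw by linear scan with a caller-supplied uniform number. */
class MultDirectSketch {
    static int sample(double[] pp, double rand) {
        double total = 0.0;
        for (double p : pp) total += p;
        double threshold = rand * total;  // scale so pp need not sum to 1 (assumption)
        double cum = 0.0;
        for (int i = 0; i < pp.length; i++) {
            cum += pp[i];
            if (threshold < cum) return i;
        }
        return pp.length - 1;             // guard against floating-point rounding
    }

    public static void main(String[] args) {
        double[] pp = {1.0, 2.0, 1.0};
        System.out.println(sample(pp, 0.5)); // prints 1
    }
}
```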
public static int binarySearch(double[] a,
double p)
a - p -
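The summary describes this method as returning the first index i at which a[i] >= p. A minimal lower-bound search with that contract might look as follows; it assumes a is sorted in ascending order (for example a cumulative probability vector) and returns a.length when no element qualifies, which is an assumption about the edge case rather than documented behaviour.

```java
/** Sketch: lower-bound binary search (hypothetical stand-in for binarySearch). */
class LowerBoundSketch {
    /** Returns the first index i with a[i] >= p, or a.length if there is none. */
    static int firstAtLeast(double[] a, double p) {
        int lo = 0;
        int hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;   // overflow-safe midpoint
            if (a[mid] < p) {
                lo = mid + 1;            // answer lies strictly to the right
            } else {
                hi = mid;                // a[mid] qualifies, keep searching left
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        double[] cumulative = {0.1, 0.4, 1.0};
        System.out.println(firstAtLeast(cumulative, 0.25)); // prints 1
    }
}
```

Applied to cumulative probabilities with a uniform random p, this selects a category in O(log n) per draw.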
public static int randBinom(double N,
double p)
N -
p -
public static int randBernoulli(double p)
p - success probability
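According to the summary, the binomial sample is drawn by counting Bernoulli samples. A sketch of that idea, assuming `java.util.Random` as the generator and an integer trial count (the documented signature takes a double N), with hypothetical class and method names:

```java
import java.util.Random;

/** Sketch: binomial draw as a count of Bernoulli successes (not the Samplers code). */
class BinomSketch {
    /** Returns 1 with probability p, otherwise 0. */
    static int bernoulli(double p, Random rng) {
        return rng.nextDouble() < p ? 1 : 0;
    }

    /** Sums n independent Bernoulli(p) draws; cost is O(n) per sample. */
    static int binomial(int n, double p, Random rng) {
        int successes = 0;
        for (int i = 0; i < n; i++) successes += bernoulli(p, rng);
        return successes;
    }

    public static void main(String[] args) {
        System.out.println(binomial(20, 0.3, new Random()));
    }
}
```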
public static double randConParam(double alpha,
int numgroup,
int[] numdata,
int[] numtable,
double alphaa,
double alphab,
int numiter)
Modification of Escobar and West. Works for multiple groups of data. numdata, numclass are row vectors, one element per group. After Teh (npbayes).
alpha - alpha
numgroup - number of components ??
numdata - number of data items per class
numtable - number of tables per DP
alphaa - hyperparameter (gamma shape)
alphab - hyperparameter (gamma scale)
numiter - number of iterations
public static double randConParam(double alpha,
int numdata,
int numtopic,
double alphaa,
double alphab,
int numiter)
alpha - numdata - numtopic - alphaa - alphab - numiter -
public static Samplers.CrpData randCrp(double alpha,
int numdata)
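For the scalar case, a Chinese restaurant process can be simulated by seating customers one at a time: customer i joins an existing table with probability proportional to its occupancy and opens a new table with probability proportional to alpha. The sketch below illustrates this with hypothetical names; it returns plain table assignments rather than a Samplers.CrpData, whose fields are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch: sequential CRP seating with scalar concentration alpha. */
class CrpSketch {
    /** Returns the table index of each customer; tables are numbered in order of creation. */
    static int[] seat(double alpha, int numdata, Random rng) {
        int[] table = new int[numdata];
        List<Integer> counts = new ArrayList<>();
        for (int i = 0; i < numdata; i++) {
            // i customers are already seated, so total weight is i + alpha.
            double u = rng.nextDouble() * (i + alpha);
            double cum = 0.0;
            int t = 0;
            while (t < counts.size()) {
                cum += counts.get(t);
                if (u < cum) break;      // join existing table t
                t++;
            }
            if (t == counts.size()) {
                counts.add(1);           // open a new table
            } else {
                counts.set(t, counts.get(t) + 1);
            }
            table[i] = t;
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(seat(1.0, 10, new Random())));
    }
}
```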
public static Samplers.CrpData randCrp(double[] alpha,
int numdata)
alpha - numdata -
public static int randNumTable(double alpha,
int numdata)
weights - maxtable -
public static double[] randStick(double[] alpha,
int numclass)
alpha - numclass -
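In the standard stick-breaking construction, each fraction v_k is drawn from Beta(1, alpha) and the weight is w_k = v_k * prod_{j<k} (1 - v_j). The sketch below follows that construction under several assumptions not confirmed here: a scalar alpha (the documented parameter is an array), truncation at numclass sticks with the remainder assigned to the last stick, and `java.util.Random` as the generator. All names are hypothetical.

```java
import java.util.Random;

/** Sketch: truncated stick-breaking weights with scalar concentration alpha. */
class StickSketch {
    static double[] sticks(double alpha, int numclass, Random rng) {
        double[] weights = new double[numclass];
        double remaining = 1.0;                       // length of stick not yet broken off
        for (int k = 0; k < numclass - 1; k++) {
            // Beta(1, alpha) by inversion: v = 1 - U^(1/alpha)
            double v = 1.0 - Math.pow(rng.nextDouble(), 1.0 / alpha);
            weights[k] = v * remaining;
            remaining *= (1.0 - v);
        }
        if (numclass > 0) weights[numclass - 1] = remaining;  // truncation: last stick takes the rest
        return weights;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sticks(2.0, 5, new Random())));
    }
}
```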
public static double enumClass(double alpha,
int numdata)
alpha - numdata -
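For a CRP with concentration alpha, customer i (counting from 0) opens a new table with probability alpha / (alpha + i), so the expected number of tables after numdata customers is the sum of those probabilities. The following sketch computes that sum directly; it is an illustration of the formula with hypothetical names, and enumClass itself may compute the value differently.

```java
/** Sketch: expected number of CRP tables, E = sum_{i=0}^{n-1} alpha / (alpha + i). */
class ExpectedTablesSketch {
    static double expectedTables(double alpha, int numdata) {
        double expected = 0.0;
        for (int i = 0; i < numdata; i++) {
            expected += alpha / (alpha + i);  // probability that customer i opens a table
        }
        return expected;
    }

    public static void main(String[] args) {
        System.out.println(expectedTables(1.0, 100));
    }
}
```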
public static java.lang.String randString(int length,
byte[] alphabet)
length - of output
alphabet - alphabet to be used or null
public static double meanLik(double lik)
lik -
public static double[] stirling(int nn)
nn -
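Unsigned Stirling numbers of the first kind satisfy the recurrence c(n+1, k) = n * c(n, k) + c(n, k-1) with c(0, 0) = 1. The sketch below builds row nn with that recurrence; the names are hypothetical, and it returns raw values only, so it does not reproduce the second output (lmss) mentioned in the summary. Plain doubles overflow for large nn, so a real implementation would likely need rescaling.

```java
import java.util.Arrays;

/** Sketch: one row of unsigned Stirling numbers of the first kind. */
class StirlingSketch {
    /** Returns c with c[k] = c(nn, k) for k = 0..nn. */
    static double[] unsignedFirstKind(int nn) {
        double[] c = new double[nn + 1];
        c[0] = 1.0;                                   // c(0, 0) = 1
        for (int n = 0; n < nn; n++) {
            // Build row n + 1 in place; descending k keeps c[k - 1] from row n.
            for (int k = n + 1; k >= 1; k--) {
                c[k] = n * c[k] + c[k - 1];
            }
            c[0] = 0.0;                               // c(n + 1, 0) = 0
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(unsignedFirstKind(4)));
        // prints [0.0, 6.0, 11.0, 6.0, 1.0]
    }
}
```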
public static int randAntoniak(double alpha,
int n)
alpha - n -
public static double randUniform(double numvalue)
numclass -
public static int randUniform(int numvalue)
numclass -