Package org.knowceans.corpus.analysis

Class Summary
AmqCorpusStatistics ActorMediaCorpusStatistics
AtmTopicsConverter ActorMediaTopicAnalyser analyses the results of an LS-AMQ run.
CorpusStatistics CorpusStatistics prints some statistics about a TermCorpus to stdout.
LdaAmqDistance LdaAmqCorrelationAnalyser analyses the distance between the extracted topics of an LDA model and an LS-AMQ, effectively measuring the influence of authorship and querying information on the topic distributions.
LdaPerplexity LdaPerplexity implements the perplexity metric for unlabeled test data sets. TODO: thorough check after change of field tc from class TermCorpusOld.
LdaSimilarityAnalyser LdaSimilarities analyses the distance between terms and documents.
LdaTopicVariationAnalyser LdaTopicVariationAnalyser analyses the variation of the topic parameters through the document by comparing the theta vectors of the document to those of the sentences. TODO: not completed.
TopicsConverter TopicAnalyser extracts topics from the Phi and Theta variables and shows their Bayesian inversions: for phi[z][w] = P(w|z), P(z|w) = P(w|z) P(z) / sum_z'(P(w|z') P(z')); equivalently, for theta[d][z] = P(z|d), P(d|z) = P(z|d) P(d) / sum_d'(P(z|d') P(d')).
VariationOfInformationAnalyser VariationOfInformationAnalyser (old: TopicCorrelationAnalyser) analyses the distance between the extracted topics and a priori categories.
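
The Bayesian inversion named in the TopicsConverter entry can be illustrated with a minimal sketch. This is not the package's implementation: the class BayesInversionSketch, the method invertPhi and the prior vector pz are hypothetical names introduced for illustration, and the sketch assumes phi[z][w] = P(w|z) with the topic prior P(z) supplied by the caller.

```java
public class BayesInversionSketch {

    /**
     * Hypothetical helper: turns phi[z][w] = P(w|z) into P(z|w) via Bayes' rule,
     * P(z|w) = P(w|z) P(z) / sum_z'(P(w|z') P(z')).
     * The result is indexed [z][w]; pz[z] is the assumed topic prior P(z).
     */
    public static double[][] invertPhi(double[][] phi, double[] pz) {
        int numTopics = phi.length;
        int numTerms = phi[0].length;
        double[][] pzw = new double[numTopics][numTerms];
        for (int w = 0; w < numTerms; w++) {
            // Normaliser: marginal probability of term w over all topics.
            double norm = 0.0;
            for (int z = 0; z < numTopics; z++) {
                norm += phi[z][w] * pz[z];
            }
            for (int z = 0; z < numTopics; z++) {
                // Terms with zero marginal probability get zero posterior mass.
                pzw[z][w] = norm > 0.0 ? phi[z][w] * pz[z] / norm : 0.0;
            }
        }
        return pzw;
    }

    public static void main(String[] args) {
        double[][] phi = {{0.5, 0.5}, {0.25, 0.75}};
        double[] pz = {0.5, 0.5};
        double[][] pzw = invertPhi(phi, pz);
        for (int z = 0; z < pzw.length; z++) {
            System.out.println("P(z=" + z + "|w=0) = " + pzw[z][0]);
        }
    }
}
```

The theta case follows the same pattern with the roles swapped: normalise theta[d][z] P(d) over documents d' for each topic z to obtain P(d|z).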
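
For the perplexity metric mentioned under LdaPerplexity, a minimal sketch follows, assuming the common definition exp(-sum_d log p(w_d) / sum_d N_d) with p(w|d) = sum_z theta[d][z] phi[z][w]. The class PerplexitySketch and its method are hypothetical and do not reflect the actual LdaPerplexity code; in particular, the sketch assumes that theta for the test documents has already been estimated.

```java
public class PerplexitySketch {

    /**
     * Hypothetical helper: perplexity of test documents given as arrays of term indices.
     * theta[d][z] = P(z|d) for each test document, phi[z][w] = P(w|z).
     * Returns NaN for an empty corpus (zero tokens).
     */
    public static double perplexity(int[][] docs, double[][] theta, double[][] phi) {
        double logLikelihood = 0.0;
        long tokens = 0;
        for (int d = 0; d < docs.length; d++) {
            for (int w : docs[d]) {
                // Token likelihood: mixture over topics.
                double pw = 0.0;
                for (int z = 0; z < phi.length; z++) {
                    pw += theta[d][z] * phi[z][w];
                }
                logLikelihood += Math.log(pw);
                tokens++;
            }
        }
        // Exponentiated negative mean log-likelihood per token.
        return Math.exp(-logLikelihood / tokens);
    }

    public static void main(String[] args) {
        int[][] docs = {{0, 1, 2}, {3}};
        double[][] theta = {{0.5, 0.5}, {0.2, 0.8}};
        double[][] phi = {{0.25, 0.25, 0.25, 0.25}, {0.25, 0.25, 0.25, 0.25}};
        System.out.println("perplexity = " + perplexity(docs, theta, phi));
    }
}
```

With uniform topic-term distributions over four terms, every token has probability 0.25, so the perplexity approaches the vocabulary size of 4; lower values indicate a model that predicts the test data better.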