knowceans.org
gregor :: arbylon . net
|
knowceans.org hosts resources for "knowledge networks".
A knowledge network is defined as a structure consisting of the content
and the people within a "virtual" community, along with their different interrelations.
The current focus is on probabilistic topic extraction from text, used to determine relationships among and between
items and actors: semantic similarity, expertise, interest and recommendation,
as well as interest/expertise matching and the extraction of community-specific ontologies.
This site currently offers some general-purpose tools; see also the
SourceForge project knowceans, which provides CVS resources.
Contact me with feedback, or for more specialised solutions and cooperation.
[ consider this under construction :) ]
|
experimental software
|
Actor-media embedded search is a method to explore knowledge structures.
|
- Freshmind, a knowledge network visualisation and editing tool. Navigation combines
searching and browsing. The force-directed layout and the view-layer data structure are based on TouchGraph. Retrieval is based on Lucene,
and graph structuring / similarity-based search uses a model similar to Latent Dirichlet Allocation (see below).
|
Latent Dirichlet Allocation is a powerful probabilistic method
for topic extraction from text data.
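As a rough illustration of how topic extraction with LDA can work, here is a minimal sketch of a collapsed Gibbs sampler, one common inference scheme for the model. It is not the code offered on this site; the class name `LdaGibbsSketch`, its fields and the hyperparameter values are assumptions made for the example. Documents are given as arrays of integer word ids.

```java
import java.util.Random;

/** Hypothetical minimal collapsed Gibbs sampler for LDA (illustration only). */
final class LdaGibbsSketch {
    final int[][] docs;      // docs[d][i] = word id of the i-th token in document d
    final int vocabSize;     // number of distinct word ids
    final int numTopics;
    final double alpha;      // symmetric Dirichlet prior on document-topic mixtures
    final double beta;       // symmetric Dirichlet prior on topic-word distributions
    final int[][] z;         // current topic assignment of every token
    final int[][] ndk;       // ndk[d][k] = tokens in document d assigned to topic k
    final int[][] nkw;       // nkw[k][w] = times word w is assigned to topic k
    final int[] nk;          // nk[k] = total tokens assigned to topic k
    final Random rng;

    LdaGibbsSketch(int[][] docs, int vocabSize, int numTopics,
                   double alpha, double beta, long seed) {
        this.docs = docs;
        this.vocabSize = vocabSize;
        this.numTopics = numTopics;
        this.alpha = alpha;
        this.beta = beta;
        this.rng = new Random(seed);
        z = new int[docs.length][];
        ndk = new int[docs.length][numTopics];
        nkw = new int[numTopics][vocabSize];
        nk = new int[numTopics];
        // Start from a random topic assignment for every token.
        for (int d = 0; d < docs.length; d++) {
            z[d] = new int[docs[d].length];
            for (int i = 0; i < docs[d].length; i++) {
                int k = rng.nextInt(numTopics);
                z[d][i] = k;
                ndk[d][k]++;
                nkw[k][docs[d][i]]++;
                nk[k]++;
            }
        }
    }

    /** One Gibbs sweep: resample the topic of every token given all other tokens. */
    void sweep() {
        double[] p = new double[numTopics];
        for (int d = 0; d < docs.length; d++) {
            for (int i = 0; i < docs[d].length; i++) {
                int w = docs[d][i];
                int old = z[d][i];
                // Remove the current token from the counts.
                ndk[d][old]--;
                nkw[old][w]--;
                nk[old]--;
                // Unnormalised full conditional p(z = k | all other assignments).
                double total = 0.0;
                for (int k = 0; k < numTopics; k++) {
                    p[k] = (ndk[d][k] + alpha) * (nkw[k][w] + beta)
                            / (nk[k] + vocabSize * beta);
                    total += p[k];
                }
                // Draw the new topic from the discrete distribution p / total.
                double u = rng.nextDouble() * total;
                int k = 0;
                double acc = p[0];
                while (acc < u && k < numTopics - 1) {
                    k++;
                    acc += p[k];
                }
                z[d][i] = k;
                ndk[d][k]++;
                nkw[k][w]++;
                nk[k]++;
            }
        }
    }

    /** Point estimate of the topic-word distributions from the current counts. */
    double[][] phi() {
        double[][] phi = new double[numTopics][vocabSize];
        for (int k = 0; k < numTopics; k++) {
            for (int w = 0; w < vocabSize; w++) {
                phi[k][w] = (nkw[k][w] + beta) / (nk[k] + vocabSize * beta);
            }
        }
        return phi;
    }
}
```

A typical use would construct the sampler, call `sweep()` for a number of iterations and then read `phi()`; each row of the result is one topic, expressed as a probability distribution over word ids.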
|
Markov Clustering implements the MCL algorithm in Java and Matlab.
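For readers unfamiliar with MCL, the following is a small dense-matrix sketch of the usual loop: add self-loops, make the matrix column-stochastic, then alternate expansion (matrix squaring) and inflation (element-wise power followed by renormalisation) until the matrix stops changing. It is an illustration only and not the implementation distributed here; the class `MclSketch`, the thresholds and the way clusters are read off the rows are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dense-matrix sketch of Markov clustering (illustration only). */
final class MclSketch {

    /** Rescales each column to sum to one, giving a column-stochastic matrix. */
    static void normalizeColumns(double[][] m) {
        int n = m.length;
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) sum += m[i][j];
            if (sum == 0.0) continue;
            for (int i = 0; i < n; i++) m[i][j] /= sum;
        }
    }

    /** Expansion step: the matrix product m * m (two steps of the random walk). */
    static double[][] expand(double[][] m) {
        int n = m.length;
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                for (int j = 0; j < n; j++)
                    out[i][j] += m[i][k] * m[k][j];
        return out;
    }

    /** Inflation step: element-wise power r, then column renormalisation. */
    static void inflate(double[][] m, double r) {
        for (double[] row : m)
            for (int j = 0; j < row.length; j++)
                row[j] = Math.pow(row[j], r);
        normalizeColumns(m);
    }

    /** Clusters the nodes of a graph given as a symmetric adjacency matrix. */
    static List<List<Integer>> cluster(double[][] adjacency, double r, int maxIter) {
        int n = adjacency.length;
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            System.arraycopy(adjacency[i], 0, m[i], 0, n);
            m[i][i] = 1.0; // self-loops
        }
        normalizeColumns(m);
        for (int iter = 0; iter < maxIter; iter++) {
            double[][] next = expand(m);
            inflate(next, r);
            double maxDiff = 0.0;
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    maxDiff = Math.max(maxDiff, Math.abs(next[i][j] - m[i][j]));
            m = next;
            if (maxDiff < 1e-9) break; // assumed convergence threshold
        }
        // Each row that still carries mass acts as an attractor; the columns
        // with mass in that row form its cluster. A node is assigned once.
        List<List<Integer>> clusters = new ArrayList<>();
        boolean[] assigned = new boolean[n];
        for (int i = 0; i < n; i++) {
            List<Integer> members = new ArrayList<>();
            for (int j = 0; j < n; j++) {
                if (m[i][j] > 1e-6 && !assigned[j]) {
                    members.add(j);
                    assigned[j] = true;
                }
            }
            if (!members.isEmpty()) clusters.add(members);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
        double[][] a = new double[6][6];
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}, {3, 4}, {3, 5}, {4, 5}, {2, 3}};
        for (int[] e : edges) {
            a[e[0]][e[1]] = 1.0;
            a[e[1]][e[0]] = 1.0;
        }
        System.out.println(cluster(a, 2.0, 100));
    }
}
```

The inflation parameter `r` controls granularity: larger values tend to produce more, smaller clusters. A dense `double[][]` is only practical for small graphs; larger ones would call for a sparse representation.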
|
Statistics base classes are helpers to implement sampling-based algorithms in Java.
|
- arms-java (version 20060516) provides a
Java port of the adaptive rejection Metropolis sampler (ARMS), which can sample from virtually any
univariate distribution.
- Samplers and densities / likelihood functions of various probability distributions, as well as a Java port
of the Mersenne Twister random number generator, can be found in the package knowceans-tools.jar (see below).
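To give a flavour of what sampling from an arbitrary univariate density looks like in Java, here is a minimal random-walk Metropolis sampler. This is deliberately a much simpler technique than ARMS and is not taken from arms-java or knowceans-tools; the class `MetropolisSketch` and its parameters are hypothetical. The target only needs to be known up to a constant, and is passed as a log-density.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical random-walk Metropolis sampler (a simpler stand-in, not ARMS). */
final class MetropolisSketch {

    /**
     * Draws n samples from the density whose logarithm (up to an additive
     * constant) is given by logDensity. x0 must lie inside the support.
     */
    static double[] sample(DoubleUnaryOperator logDensity, double x0,
                           double stepSize, int n, int burnIn, Random rng) {
        double[] out = new double[n];
        double x = x0;
        double logP = logDensity.applyAsDouble(x);
        for (int t = 0; t < burnIn + n; t++) {
            // Propose a Gaussian step around the current state.
            double candidate = x + stepSize * rng.nextGaussian();
            double logPc = logDensity.applyAsDouble(candidate);
            // Accept with probability min(1, p(candidate) / p(x)).
            if (Math.log(rng.nextDouble()) < logPc - logP) {
                x = candidate;
                logP = logPc;
            }
            if (t >= burnIn) out[t - burnIn] = x;
        }
        return out;
    }

    public static void main(String[] args) {
        // Standard normal target, log-density up to a constant: -x^2 / 2.
        double[] xs = sample(v -> -0.5 * v * v, 0.0, 1.0, 10000, 1000, new Random(1L));
        double mean = 0.0;
        for (double v : xs) mean += v;
        System.out.println("sample mean = " + mean / xs.length);
    }
}
```

Successive samples are correlated and the step size has to be tuned by hand, which is one of the reasons adaptive schemes such as ARMS exist.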
|
Java helpers came into existence when I tried to shortcut recurring tasks.
|
- NEW: knowceans-tools (version 20090727), a collection of Java helper
classes I frequently use: a command-line parser, a runtime stopwatch, Perl-like regular expression usage (which reduces Java coding),
special invertible, regex and many-to-many implementations of the Map interface, data output formatters specialised for command-line output
(such as histograms and dot-encoded numbers) and many more.
- knowceans-corpus, a text corpus extraction toolset.
- BibFileMod is a Java class that takes a LaTeX
document, collects all references cited in the document from a list of
BibTeX database files and writes the result into a new BibTeX database. (This seemed
more useful for organising my documents than bibtool from CTAN etc.)
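The idea behind a tool like BibFileMod can be sketched in a few lines: find the citation keys in the LaTeX source, look each key up in the BibTeX databases and emit only the matching entries. The code below is an illustration under simplifying assumptions, not BibFileMod itself: the class `CitedBibSketch` is hypothetical, it works on strings rather than files, it only recognises `\cite`-style commands and it ignores `@string` definitions and cross-references.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: keep only the BibTeX entries cited in a LaTeX document. */
final class CitedBibSketch {
    // \cite, \citep, \citet* ... with optional [..] arguments, then {key1,key2}
    private static final Pattern CITE = Pattern.compile(
            "\\\\cite[a-zA-Z]*\\*?(?:\\[[^\\]]*\\])*\\{([^}]*)\\}");
    // Start of an entry: @type{key,
    private static final Pattern ENTRY_START = Pattern.compile(
            "@([a-zA-Z]+)\\s*\\{\\s*([^,\\s]+)\\s*,");

    /** Returns the cited keys in order of first appearance. */
    static Set<String> citedKeys(String latex) {
        Set<String> keys = new LinkedHashSet<>();
        Matcher m = CITE.matcher(latex);
        while (m.find()) {
            for (String part : m.group(1).split(",")) {
                String key = part.trim();
                if (!key.isEmpty()) keys.add(key);
            }
        }
        return keys;
    }

    /** Maps each entry key to the full entry text, found by brace matching. */
    static Map<String, String> parseEntries(String bib) {
        Map<String, String> entries = new LinkedHashMap<>();
        Matcher m = ENTRY_START.matcher(bib);
        int pos = 0;
        while (m.find(pos)) {
            int depth = 0;
            int end = -1;
            for (int i = m.start(); i < bib.length(); i++) {
                char c = bib.charAt(i);
                if (c == '{') {
                    depth++;
                } else if (c == '}' && --depth == 0) {
                    end = i + 1; // closing brace of the whole entry
                    break;
                }
            }
            if (end < 0) break; // unbalanced braces: stop parsing
            entries.putIfAbsent(m.group(2), bib.substring(m.start(), end));
            pos = end;
        }
        return entries;
    }

    /** Builds the content of a new database holding only the cited entries. */
    static String extract(String latex, List<String> bibDatabases) {
        Map<String, String> all = new LinkedHashMap<>();
        for (String bib : bibDatabases) {
            parseEntries(bib).forEach(all::putIfAbsent); // first database wins
        }
        StringBuilder out = new StringBuilder();
        for (String key : citedKeys(latex)) {
            String entry = all.get(key);
            if (entry != null) out.append(entry).append("\n\n");
        }
        return out.toString();
    }
}
```

To work on real files, the strings could be read with `java.nio.file.Files.readString` and the result written with `Files.writeString`. Keys that are cited but missing from every database are silently skipped here; a real tool would probably report them.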
|
|
(c) 2004-6 arbylon.net
|