libexttextcat text categorization library

Libexttextcat is an N-Gram-Based Text Categorization library primarily intended for language guessing.

It is used in LibreOffice.


Getting the sources

libexttextcat sources are stored in git. To get them, you can use:

git clone git://

or you can browse the code online.

If you want to use a release version, you can fetch it from the LibreOffice mirror.

Building it


Once the source has been checked out, libexttextcat can be built in the usual manner:

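cd libexttextcat
./autogen.sh # assumption: usual autotools bootstrap; run ./configure afterwards if it does not do so itself
make
make check # optional
make install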


Once you have made a change that you are happy with, and libexttextcat still builds with it, please contribute it back; we'll be happy to integrate it! To prepare a patch:

# commit your changes to your local repository
git commit -a
# create the patch
git format-patch origin/master
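
If you would like to try these two steps before touching your real checkout, the following is a minimal sketch that replays them in a throwaway sandbox. Everything in it is made up for the demonstration (the temporary directory, the notes.txt file, the example identity and the commit messages); only the git commands themselves are the ones shown above.

```shell
#!/bin/sh
# Sketch: replay "git commit -a" + "git format-patch origin/master" in a sandbox.
# All paths, file names and identities here are hypothetical.
set -e

sandbox=$(mktemp -d)

# A stand-in for the upstream repository, with one commit on master.
git init -q "$sandbox/upstream"
cd "$sandbox/upstream"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo "original" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.invalid" \
    commit -q -m "Initial commit"

# Your working copy: a clone, so origin/master records the upstream state.
git clone -q "$sandbox/upstream" "$sandbox/work"
cd "$sandbox/work"

# Make a change and commit it to the local repository.
echo "my change" >> notes.txt
git -c user.name="Example" -c user.email="example@example.invalid" \
    commit -q -a -m "Describe the change"

# Create one patch file for each commit that origin/master does not have yet.
patches=$(git format-patch origin/master)
echo "$patches"
```

Each local commit that is not yet on origin/master becomes its own numbered .patch file in the current directory, named after the first line of the commit message. The sandbox can be removed with rm -rf "$sandbox" when you are done.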


You can get in touch with us in several ways:

  1. on IRC, by joining the #libreoffice-dev channel
  2. on the mailing list
  3. by filing a bug report in the Freedesktop Bugzilla