This is a list of software that doesn't support Unicode properly, or at all. Please note that we do not consider UTF-16 or UTF-32 support adequate (or CESU-8, for that matter). There is also another page for software that doesn't support characters outside the Basic Multilingual Plane. Please file bug reports whenever possible, and be sure to let us know about them.
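To illustrate why CESU-8 is not counted as UTF-8 support here, the Python sketch below encodes one character outside the Basic Multilingual Plane both ways. The helper name `cesu8_encode` is made up for this example, and U+1D11E (MUSICAL SYMBOL G CLEF) is just a convenient test character. Treat it as an illustration under those assumptions, not a reference implementation.

```python
import struct


def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper, for illustration only).

    CESU-8 first splits the text into UTF-16 code units and then encodes
    each code unit, surrogates included, as if it were a code point.
    """
    utf16 = text.encode("utf-16-be")
    units = struct.unpack(f">{len(utf16) // 2}H", utf16)
    # "surrogatepass" lets Python emit the 3-byte form of a surrogate,
    # which strict UTF-8 forbids.
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)


clef = "\U0001D11E"  # a character outside the Basic Multilingual Plane

print(clef.encode("utf-8").hex(" "))  # f0 9d 84 9e (4 bytes, valid UTF-8)
print(cesu8_encode(clef).hex(" "))    # ed a0 b4 ed b4 9e (6 bytes, not valid UTF-8)
```

For characters inside the Basic Multilingual Plane the two encodings produce identical bytes, which is why this kind of problem often goes unnoticed until a supplementary character shows up.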
<!-- Note to editors: Please keep this list in alphabetical order. Thank you. -->
Aterm: Appears to fail when attempting to read Markus Kuhn's UTF-8 demonstration document.
Emacs: The CVS version cannot uppercase lowercase characters whose uppercase form consists of multiple characters (for example, ß, which uppercases to SS).
Evolution: Pango is used improperly in GtkHtml, so right-to-left text is displayed incorrectly.
Flex: Unicode support is very basic. There is no support for dealing with wchar_t strings, and the regular expression matching is limited to US-ASCII.
GNU Arch: tla does not accept anything other than simple letters, numbers, and basic punctuation in my-id. It encounters problems with umlauts, underscores, and other “funky characters”.
GPG: Configuration files contain a setting that specifies the desired character set. A notice near this setting claims that UTF-8 will be the default in the next version. However, in CVS revision HEAD, the default remains ISO-8859-1.
grep: Markus Kuhn noticed that grep 2.5 is very slow in UTF-8 locales; Mika Fischer posted a patch that you may want to try.
Grip: Has problems with UTF-8 in ID3 tags. See #854558 and #852783.
ID3v2: This MP3 tagging tool does not set the encoding flag of text fields in ID3v2 tags to UTF-8, so the fields are not shown correctly in most players and music organizers. See this bug report for details.
joe: Lon Hohberger said, “I looked at it briefly, but I didn't get too far before more important things came up. On a side note, it's probably much easier to write a joe.elisp or joe.vim :)”
Linux kernel: The console can display UTF-8 characters once it has been configured using the kbd package, but it can show only 256 or 512 different characters at the same time. This is already supported in distributions such as Fedora and Mandrake. Unicode input is problematic for composing (using a dead key to add accents to characters), because diacritics must be 8-bit values, which allows ISO 8859 but not UTF-8. You can input UTF-8 characters by assigning one key to one Unicode character. The diacritics issue does not appear to have an easy solution.
man-pages: Manual pages for character sets like ISO-8859-1 are not encoded in UTF-8.
mc: The Red Hat/Fedora Linux packages contain patches that fix the main interface, but not the viewer or editor. Grab the source RPM for the patches. Patches from Suse are also available.
newt: Has problems with multibyte characters, for example when entering UTF-8 characters.
strings: This tool, part of binutils, does not support UTF-8.
tcsh: This shell has issues: “Unicode (UTF-8) doesn't seem to work”. This is on their wish list, though. Setting the variable dspmbyte to utf8 seems to solve the problem; however, you have to set it explicitly. There are bugs when editing a command line with UTF-8 characters.
TWiki: See http://twiki.org/cgi-bin/view/Codev/ProposedUTF8SupportForI18N for the latest information.
WordPress: There's no current plan to support UTF-8, and there were no responses to a request to use anything other than ISO-8859-1. *Update* (2004-03): This appears to be mostly resolved in their CVS.
Zsh: The Z Shell has partial support for UTF-8. For example, pasted UTF-8-encoded Latin characters display correctly, but other characters, such as typographic single quotation marks (3 bytes each in UTF-8) or many Asian characters, do not. Zsh also has trouble moving the cursor around multi-byte characters (for example, when using backspace). Tab completion will match files but probably not display them correctly. A \u escape for generating Unicode characters is supported. In the current unstable (4.3) branch, a lot of progress has been made in adding Unicode support to the line editor, and the developers would much appreciate help in testing it. It is a big job, in part because zsh's useful ability to handle null characters is being preserved.
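The cursor-movement problem described for Zsh comes down to treating one byte as one character. As a rough sketch (this is not Zsh's actual code), the Python below deletes the last character of a UTF-8 byte buffer by stepping back over continuation bytes, which have the bit pattern 10xxxxxx. The function name `backspace_utf8` is invented for this example, and the buffer is assumed to hold valid UTF-8; combining characters and double-width glyphs would need additional handling.

```python
def backspace_utf8(buffer: bytes) -> bytes:
    """Remove the last encoded character from a UTF-8 byte buffer.

    Hypothetical helper: assumes `buffer` contains valid UTF-8.
    """
    end = len(buffer)
    if end == 0:
        return buffer
    end -= 1
    # Continuation bytes look like 0b10xxxxxx; step back to the lead byte.
    while end > 0 and (buffer[end] & 0xC0) == 0x80:
        end -= 1
    return buffer[:end]


line = "zsh\u2019".encode("utf-8")  # U+2019 is a 3-byte single quotation mark
bytewise = line[:-1]                # byte-wise backspace leaves a broken sequence
charwise = backspace_utf8(line)     # character-wise backspace removes all 3 bytes

print(bytewise)  # b'zsh\xe2\x80'
print(charwise)  # b'zsh'
```

The byte-wise result ends in an incomplete sequence that no longer decodes as UTF-8, which is the kind of corruption a line editor produces when backspace removes a single byte.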
<!-- Note to editors: Please leave only three change log entries here. Thank you. -->
<br>-- Main.NickLamb - 26 Jul 2006 <br>-- Main.AlexanderWinston - 09 Jul 2004 <br>-- Main.AlexanderWinston - 14 Jun 2004


