For years I regularly stumbled over LaTeX errors of the form
Unicode char \u8:χ not set up for use with LaTeX
I always took the chicken's way out and replaced the Unicode characters in the file with their TeX escapes. That was easy, but it made my files needlessly unreadable. Today I decided to FIX the problem once and for all. And it worked. Easily.
First off: the problem I'm facing is that my keyboard layout [1] makes it effortless for me to input characters like ℂ, Σ and χ, but LaTeX cannot cope with them out of the box. Org-mode [2] already catches most of these problems, so I can write things like x² instead of x^2, but occasionally it stumbles.
The solution is actually pretty simple: I only need to declare the escape sequence LaTeX should use when it sees one of these characters (this has to come before \begin{document}!):
\DeclareUnicodeCharacter{03C7}{\chi}
Or in org-mode:
#+LaTeX_HEADER: \DeclareUnicodeCharacter{03C7}{\chi}
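To show where the declaration has to sit, here is a minimal sketch of a complete document. It assumes pdfLaTeX with UTF-8 input; the \usepackage[utf8]{inputenc} line and the example sentence are my additions for illustration, not part of the original setup.

```latex
\documentclass{article}
% UTF-8 input is the default since LaTeX 2018-04; on older
% installations this line is required, on newer ones it is harmless.
\usepackage[utf8]{inputenc}

% The declaration is preamble-only: it must come before \begin{document}.
\DeclareUnicodeCharacter{03C7}{\chi}

\begin{document}
% \chi is a math-mode command, so the character is used inside $...$ here.
The test statistic is $χ$.
\end{document}
```

Because \chi only exists in math mode, a bare χ in running text would still raise a "Missing $ inserted" error with this definition.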
To do this more easily, you can use uniinput.ins [3] and uniinput.dtx [4] from the neo-layout project [1]. Run latex uniinput.ins to generate uniinput.sty, which you can put next to your LaTeX files and use with \usepackage{uniinput} (instructions in German [5]).
Thanks go to Wikibooks:LaTeX [6] for this. Their solution then suggests reading through several Unicode definition documents to track down the codepoint of the character. But we can make that easier with Emacs [7] (almost everything is easier with Emacs ☺).
Instead of browsing huge documents manually, we simply rely on the Unicode definitions in Emacs: move the cursor over the character and execute M-x describe-char.
When used with χ, this shows the following output:
position: 672 of 35513 (2%), column: 0
character: χ (displayed as χ) (codepoint 967, #o1707, #x3c7)
preferred charset: unicode-bmp (Unicode Basic Multilingual Plane (U+0000..U+FFFF))
code point in charset: 0x03C7
… (and a bit more) …
What we need is the line "code point in charset": just leave out the 0x and you have the codepoint in the form \DeclareUnicodeCharacter expects (here 03C7).
For the document I am currently writing, I now use the following definitions:
#+LaTeX_HEADER: \DeclareUnicodeCharacter{03C7}{\chi}
#+LaTeX_HEADER: \DeclareUnicodeCharacter{B2}{^{2}}
And that makes χ² work.
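The two definitions above expand to math-mode material (\chi and a superscript), so they work where χ² ends up in math mode. As a hedged variant for documents where χ² can also appear in running text, the replacements can be wrapped in \ensuremath. This wrapping is my own assumption, not part of the original setup, and the sketch again assumes pdfLaTeX with UTF-8 input.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % default since LaTeX 2018-04, harmless to keep

% \ensuremath switches to math mode only when the character
% is used outside of it, so both contexts below compile.
\DeclareUnicodeCharacter{03C7}{\ensuremath{\chi}}
\DeclareUnicodeCharacter{00B2}{\ensuremath{^{2}}}

\begin{document}
In running text: χ². In math mode: $χ² > 0$.
\end{document}
```

In org-mode, the same two declarations can go on #+LaTeX_HEADER: lines, exactly like the originals. In running text the superscript is set in its own small math group, so the spacing can differ slightly from a real $\chi^{2}$.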
Happy Hacking - and have fun with Emacs Org-Mode!
Links:
[1] http://neo-layout.org
[2] http://orgmode.org
[3] http://wiki.neo-layout.org/export/2476/latex/Standard-LaTeX/uniinput.ins
[4] http://wiki.neo-layout.org/export/2476/latex/Standard-LaTeX/uniinput.dtx
[5] http://wiki.neo-layout.org/browser/latex/Standard-LaTeX/README.txt
[6] http://en.wikibooks.org/wiki/LaTeX/Special_Characters#Extending_the_support
[7] http://gnu.org/s/emacs