Writing Chinese homework with Neovim and Pandoc

Kevin Guillaumond

June 20, 2022

One of my problems with learning the Chinese language is that there isn’t a strong correlation between what it sounds like and what it looks like written.

Another of my problems is that writing Chinese characters by hand is difficult, and not that useful to me. While I do enjoy taking notes on paper, I don’t really see many scenarios where I have to write something on paper and where it will be read by somebody who only understands Chinese characters.

For these reasons, some people choose to skip learning the characters entirely, and instead rely on pinyin, a romanization of Mandarin pronunciation. As an example of what “romanization” means, the name of the system itself, “汉语拼音”, is romanized as “hànyǔ pīnyīn”.

While pinyin is very useful as a learning tool, no Chinese person actually uses it for written communication. I don’t want to rely entirely on it, because I’d like to be able to read and to text. Texting is possible with a keyboard where I type pinyin and pick the correct character from the suggestions, which is a lot easier than writing it by hand.

I’m now in an actual Chinese class that has actual homework. Homework can be written in pinyin, but I want to get used to writing Chinese the way I would in real life, using my phone or laptop.

It seemed like a good idea to use tools I already use daily (or want to learn how to use), so I decided I would write the homework in Markdown in Neovim, then make a pretty PDF using Pandoc and LaTeX.
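For illustration, a homework file might look like the sketch below. Only the opening “第一课：” is implied by the messages Pandoc prints later in this post; the sentences themselves are made up for the example.

```shell
# Write a small sample Markdown file. The contents are hypothetical.
# Note: this overwrites any existing Homework1.md in the current directory.
cat > Homework1.md <<'EOF'
# 第一课：作业

1. 你好！
2. 我学习中文。
3. 我喜欢喝茶。
EOF

# Confirm the heading made it into the file.
grep '第一课' Homework1.md
```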

All of this was done on PureOS, but I assume it works the same way on most Debian derivatives.

First, install LaTeX and its Chinese language extension:

$ sudo apt install texlive
$ sudo apt install texlive-lang-chinese

Then optimistically try the obvious:

$ pandoc Homework1.md -f markdown -o Homework1.pdf
Error producing PDF.
! Package inputenc Error: Unicode character 第 (U+7B2C)
(inputenc)                not set up for use with LaTeX.

See the inputenc package documentation for explanation.
Type  H <return>  for immediate help.
 ...                                              
                                                  
l.52 第

Try running pandoc with --pdf-engine=xelatex.

Okay. The PDF engine has to be installed, so let’s do that and try again:

$ sudo apt install texlive-xetex
[...]
$ pandoc Homework1.md -f markdown -o Homework1.pdf --pdf-engine=xelatex
[WARNING] Missing character: There is no 第 (U+7B2C) in font [lmroman10-regular]:mapping=tex-text;!
[WARNING] Missing character: There is no 一 (U+4E00) in font [lmroman10-regular]:mapping=tex-text;!
[WARNING] Missing character: There is no 课 (U+8BFE) in font [lmroman10-regular]:mapping=tex-text;!
[WARNING] Missing character: There is no ： (U+FF1A) in font [lmroman10-regular]:mapping=tex-text;!
[...]

So it skipped all the Chinese characters… I have to specify a font that includes them:

$ sudo apt install latex-cjk-all
[...]
$ pandoc Homework1.md -f markdown -o Homework1.pdf --pdf-engine=xelatex -V CJKmainfont='Noto Serif CJK SC'
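If the warnings persist, it may be worth checking that the font passed to CJKmainfont is actually installed. Here is a minimal sketch, assuming fontconfig’s fc-list is available; has_cjk_font is a helper name I made up for this example, not part of any package.

```shell
# has_cjk_font FAMILY: succeed if FAMILY appears among the installed
# fonts that declare Chinese coverage. Hypothetical helper; relies on
# fontconfig's fc-list being on the PATH.
has_cjk_font() {
    fc-list :lang=zh family | grep -qiF -- "$1"
}

if has_cjk_font 'Noto Serif CJK SC'; then
    echo "font found"
else
    echo "font missing (or fc-list unavailable)"
fi
```

If the font is missing, running `fc-list :lang=zh family` on its own lists the families that are installed, and any of them can be passed to -V CJKmainfont instead.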

And that’s how I can now write my Chinese homework on my phone or computer, run this command, print the resulting PDF and present it proudly at my next class!
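To avoid retyping the flags every week, the command can be wrapped in a small shell function. This is only a sketch: build_homework and the CJK_FONT variable are names invented here, and the Pandoc options are exactly the ones used above.

```shell
# build_homework FILE.md: render FILE.md to FILE.pdf with XeLaTeX.
# CJK_FONT is a hypothetical override; it defaults to the font used above.
# Usage: build_homework Homework1.md
build_homework() {
    src=$1
    out=${src%.md}.pdf   # Homework1.md -> Homework1.pdf
    pandoc "$src" -f markdown -o "$out" \
        --pdf-engine=xelatex \
        -V CJKmainfont="${CJK_FONT:-Noto Serif CJK SC}"
}
```

Putting this in a shell startup file makes `build_homework Homework2.md` available in every terminal, and `CJK_FONT='Some Other Font' build_homework Homework2.md` switches fonts for a single run.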