Because of my studies I (will) often use Japanese on my blog. While giving some thought to typography and readability, I found that the default appearance of Japanese text stood in stark contrast to the rest of my design.¹ To target Japanese text specifically, I wrote a small Python-Markdown extension for use with static blog generators such as Jekyll and Pelican (or pretty much anything that uses Python-Markdown to convert Markdown into HTML). It wraps such text in a span with the language attribute set to Japanese. An added and probably more important bonus, beyond styling and semantics, is that this method counters the negative effects of Han unification in the so-called CJK languages.

Download

I’ve put the extension in its own repository on my GitHub for anyone interested, but since it serves its purpose for me as-is, I have no plans to maintain it further at the moment.²

Installation

Copy the japanese.py script into your python-markdown extension directory.

If you’re using Pelican as your static site generator, open your project’s pelicanconf.py and add 'japanese' to the MD_EXTENSIONS list:

`MD_EXTENSIONS = ['japanese']`

Usage

Using a simple regular expression, (\{\{)(.+?)(\}\}), the extension treats text wrapped in double curly braces as the content of a span tag with a lang="ja" attribute.

`{{読書クラブ}}`

will output

`<span lang="ja">読書クラブ</span>`
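To make the substitution concrete, here is a minimal sketch that applies the same regular expression with nothing but Python's standard `re` module. It is not the extension's actual code, which hooks into Python-Markdown's inline patterns. The names `JAPANESE_RE` and `wrap_japanese` are made up for this illustration.

```python
import re

# The pattern quoted above: group 2 captures the text between the double braces.
JAPANESE_RE = re.compile(r"(\{\{)(.+?)(\}\})")


def wrap_japanese(text: str) -> str:
    """Replace every {{...}} with <span lang="ja">...</span> (illustrative helper)."""
    return JAPANESE_RE.sub(r'<span lang="ja">\2</span>', text)


print(wrap_japanese("{{読書クラブ}}"))
# prints: <span lang="ja">読書クラブ</span>
```

Because the middle group is non-greedy (`.+?`), two marked passages on the same line become two separate spans rather than one long one.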

Example 1 (fonts): just compare 読書(どくしょ)クラブ (custom) to 読書(どくしょ)クラブ (Meiryo) to 読書(どくしょ)クラブ (MS Gothic default).³

Example 2 (unihan): compare the Chinese and Japanese forms of the same characters: 隆 (隆), 誤 (誤), 直 (直).⁴
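Plain text alone cannot carry this distinction: under Han unification the Chinese and Japanese forms of a character share a single code point, so the browser needs a hint such as the lang attribute to choose the regional glyph. The following sketch, using only the standard library, prints the code point and Unicode name of each character from Example 2. The names `SAMPLES` and `describe` are chosen for illustration only.

```python
import unicodedata

# The three characters from Example 2, written as escapes so the code points are explicit.
SAMPLES = "\u9686\u8aa4\u76f4"  # 隆, 誤, 直


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in SAMPLES:
    print(ch, describe(ch))
```

Each character reports one "CJK UNIFIED IDEOGRAPH" entry, with no separate Chinese or Japanese code point, so the choice of glyph is left to the font the browser selects for the element's language.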

Styling

Although it’s a bit of a risk performance-wise, I’m quite a fan of Google’s free web fonts.⁵ Due to the complexity of the Japanese character set, development of these has been slow⁶, but Google’s Noto Font is getting quite efficient, and with the Japanese font set supporting nearly 7000 characters, it should pose no problem for most web projects. Since it works better, typography-wise, with the rest of my fonts, I use it over fonts such as Meiryo that are more widespread across platforms.

With the CSS below, I ensure maximum compatibility by falling back to Meiryo and others if the page can’t connect to Google’s font API.

    @import url(https://fonts.googleapis.com/earlyaccess/notosansjapanese.css);

    [lang="ja"] {
      font-family: "Noto Sans Japanese", "メイリオ", "Meiryo", "ヒラギノ角ゴ Pro W3",
        "Hiragino Kaku Gothic Pro", "ＭＳ Ｐゴシック", "MS PGothic", sans-serif;
      font-weight: 100;
      font-size: 95%;
    }

Stevie Poppe | Onoreto

Parsing Japanese Text in Markdown-Python for Stylizing and Semantic Purposes
