Why I made the Canto Font
The Big Idea
How this makes Jyutping approachable
Problem 1: Unfriendly Romanization
Cantonese is a tonal language in which pitch and inflection both carry meaning. For example, the phrases 你噉愛佢 and 你咁愛佢 sound the same except for their tones, but have completely different meanings. (The former means “that twisted way you love him”; the latter, “the depth of your love for him”.) Conveying the tone is essential.
Jyutping was devised by the Linguistic Society of Hong Kong (LSHK) to represent the sounds of modern Cantonese. Jyutping started as an academic linguists’ tool; the goal was to denote sounds fully and unambiguously. Many research tools are user-hostile, and by compressing pitch and inflection into a (seemingly arbitrary) number, Jyutping’s treatment of tones is certainly not user-friendly.
Learning a language is hard, especially for learners who are exposed to a tonal language for the first time. Needing to mentally map a number to a tone before applying it to the preceding vowel puts additional strain on a learner's very limited short-term memory.
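To make the mental mapping concrete, here is a minimal sketch of what a learner has to do in their head for every syllable: strip the trailing digit, then recall the pitch it stands for. The tone names and pitch values are the conventional textbook descriptions on the 1 (lowest) to 5 (highest) scale; sources differ slightly on some values, and the function name `describe_tone` is made up for this illustration.

```python
# Conventional descriptions of the six Jyutping tones. Pitch values are
# approximate and vary a little between references.
TONES = {
    "1": ("high level", "55"),
    "2": ("high rising", "25"),
    "3": ("mid level", "33"),
    "4": ("low falling", "21"),
    "5": ("low rising", "23"),
    "6": ("low level", "22"),
}


def describe_tone(syllable: str) -> str:
    """Split off the trailing tone digit and describe the pitch it encodes."""
    base, tone = syllable[:-1], syllable[-1:]
    if tone not in TONES:
        raise ValueError(f"no Jyutping tone digit at the end of {syllable!r}")
    name, contour = TONES[tone]
    return f"{base}: {name} ({contour})"


for syllable in ["si1", "si4"]:
    print(describe_tone(syllable))
```

Nothing in the digit itself hints at "high" or "falling"; the table has to be memorized, which is the strain described above.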
In full sentences, Jyutping’s mix of letters and numbers creates an intimidating, alien-script aesthetic. This is exacerbated when it is mixed with Chinese script, where the block, fixed-width expectation of ideographs clashes with variable-length alphanumeric text.
Raw Jyutping is ugly and unreadable on many levels.
Problem 2: Shaky Jyutping Generation
Written Cantonese uses Chinese ideographs, but there is no 1:1 correspondence between what is written and how it sounds. For example, the character 行 can be read in five common ways, and it must be read in one specific way within a given context.
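To make the one-to-many problem concrete, the sketch below uses 行, a character with several common readings, and a tiny hand-written lexicon. The three words and the helper `readings_of` are illustrative assumptions, not part of any real tool, and the lexicon covers only some of the character's readings.

```python
# Toy lexicon: the reading of 行 depends on the word it appears in.
# A real system needs a far larger dictionary; these entries are illustrative.
WORD_READINGS = {
    "銀行": ["ngan4", "hong4"],   # bank
    "行路": ["haang4", "lou6"],   # to walk
    "行為": ["hang4", "wai4"],    # behaviour
}


def readings_of(char, lexicon):
    """Collect every distinct reading a character takes across the lexicon."""
    found = set()
    for word, syllables in lexicon.items():
        for ch, syllable in zip(word, syllables):
            if ch == char:
                found.add(syllable)
    return sorted(found)


print(readings_of("行", WORD_READINGS))
```

Even this three-word lexicon yields three different readings for the same character, so the reading cannot be chosen without knowing the surrounding word.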
With a long history, influences from many languages, and the complexity of Simplified and Traditional scripts, there are no rules that describe why or how a character should be read. The “right” reading (and what is “right”?) is simply the collective agreement on tens of thousands of characters and hundreds of thousands of contexts.
Even getting the context is not easy: without spaces to mark word boundaries, Chinese sentences need to be segmented into words. Backed by the collective might of Taiwan and the PRC, there are tools that do this for standard written Chinese with good success, but doing so for Cantonese is non-trivial.
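As a rough illustration of what segmentation involves, here is a sketch of greedy forward maximum matching, a generic textbook technique and not necessarily what any Cantonese tool uses. The lexicon, the `max_len` parameter and the function name are assumptions made for the example.

```python
def segment(sentence, lexicon, max_len=4):
    """Greedy forward maximum matching: at each position take the longest
    lexicon word that fits, falling back to a single character."""
    words = []
    i = 0
    while i < len(sentence):
        longest = min(max_len, len(sentence) - i)
        for size in range(longest, 0, -1):
            piece = sentence[i:i + size]
            # A single character is always accepted so the loop advances.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


LEXICON = {"我", "去", "銀行", "行路"}
print(segment("我行路去銀行", LEXICON))
```

The output is only as good as the lexicon: a word missing from it is split into single characters, and greedy matching can pick the wrong boundary when two dictionary words overlap.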
Automated Jyutping generation is thus often faulty. The tools are poor enough that none of them could even provide accuracy benchmarks, because that would require long “model answers” from different genres of text for comparison. The tooling was wrong often enough, and long model answers are too tedious to prepare.
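The comparison itself is simple once a model answer exists; the hard part is producing the model answer. A minimal sketch of such a benchmark, assuming both the tool output and the model answer are space-separated Jyutping aligned one syllable per character (the function name and sample strings are invented for illustration):

```python
def syllable_accuracy(predicted, reference):
    """Fraction of syllables that match the model answer, position by position.

    Assumes both strings are already aligned one syllable per character.
    """
    pred, ref = predicted.split(), reference.split()
    if not ref or len(pred) != len(ref):
        raise ValueError("inputs must be non-empty and the same length")
    correct = sum(p == r for p, r in zip(pred, ref))
    return correct / len(ref)


# Model answer and a hypothetical tool output for 我行路去銀行.
reference = "ngo5 haang4 lou6 heoi3 ngan4 hong4"
predicted = "ngo5 hang4 lou6 heoi3 ngan4 hang4"
print(f"{syllable_accuracy(predicted, reference):.2f}")
```

In this made-up output both errors fall on 行, which is typical of the failure mode described above: the tool picks one reading regardless of context.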
Solution 1: Better Jyutping Design
The Cantonese Font updates how Jyutping looks at both the character and prose levels. It was first conceived in 2017, and every detail has been tuned over hundreds of iterations. The resulting Jyutping is easy and appealing to read, both standalone and in long prose.
Character level
There are several small modifications to Jyutping on a character level:
- adding tone marks to visualize pitch and inflection
- placing tone numbers to match pitch
- coloring the components, especially to de-emphasize the non-plosive coda
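Styling components separately presupposes that a syllable can be split into its parts. The sketch below shows one way to do that with regular expressions; it is an illustrative assumption, not how the font is built. The patterns cover the standard Jyutping inventory, and the names `decompose` and `coda_type` are invented here. Which coda classes get de-emphasized is left as a styling decision outside the sketch.

```python
import re

# Onset, nucleus, coda, tone. Multi-letter alternatives come first so that
# "gw", "ng", "aa" and similar are not split into single letters.
SYLLABLE = re.compile(
    r"^(?P<onset>ng|gw|kw|[bpmfdtnlgkhwzcsj])?"
    r"(?P<nucleus>aa|oe|eo|yu|[aeiou])"
    r"(?P<coda>ng|[ptkmniu])?"
    r"(?P<tone>[1-6])$"
)
# Syllabic nasals such as "m4" and "ng5" have no vowel at all.
SYLLABIC_NASAL = re.compile(r"^(?P<nucleus>m|ng)(?P<tone>[1-6])$")

CODA_TYPES = {"p": "stop", "t": "stop", "k": "stop",
              "m": "nasal", "n": "nasal", "ng": "nasal",
              "i": "glide", "u": "glide"}


def decompose(syllable):
    """Split a Jyutping syllable into onset, nucleus, coda and tone."""
    match = SYLLABLE.match(syllable) or SYLLABIC_NASAL.match(syllable)
    if match is None:
        raise ValueError(f"not a recognizable Jyutping syllable: {syllable!r}")
    parts = {"onset": "", "nucleus": "", "coda": "", "tone": ""}
    parts.update({k: v or "" for k, v in match.groupdict().items()})
    parts["coda_type"] = CODA_TYPES.get(parts["coda"], "")
    return parts


for syllable in ["jyut6", "ping3", "ng5"]:
    print(syllable, decompose(syllable))
```

With the parts separated, each one can be given its own color or weight, and the tone digit can be positioned independently of the letters.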
Prose level
Characters adopt a pragmatic fixed-width setting. This has two elements:
- characters are primarily uniform-width,
- but grow to fit wider Jyutping as needed.
The overall effect is a fixed-width layout that fulfils CJK conventions and gives an easy reading rhythm, while accommodating overrunning Jyutping only to the extent necessary.
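The rule can be stated as a one-line formula: the width of a cell is the larger of the standard width and the width its Jyutping needs. A minimal sketch, in which the unit values and the letter-counting estimate are made-up assumptions (a real font would measure the annotation's actual glyphs):

```python
def cell_width(jyutping, base_width=1000, letter_width=220):
    """Advance width of one annotated character, in font units.

    base_width and letter_width are illustrative values only; the annotation
    width is estimated by counting letters rather than measuring glyphs.
    """
    return max(base_width, len(jyutping) * letter_width)


for syllable in ["si1", "gwong2"]:
    print(syllable, cell_width(syllable))
```

Short syllables leave the cell at the standard width, so most of a line keeps the regular grid; only long syllables such as "gwong2" widen their own cell.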