Pretty <ruby> for CJK languages
Mátyás Mustoha
Posted on June 4, 2023
Recently, I've been experimenting with East Asian typography and with creating print-quality output using HTML and CSS. It didn't take long before I noticed something: rubies are ugly! I haven't really found articles about the topic in English, so here's my attempt at one.
Wait, what?
If you're not familiar with the term “ruby”: rubies are small characters placed above the text, usually to provide pronunciation hints. For example, they can show furigana for Japanese or bopomofo for Chinese, but they can also be Latin letters.
A ruby element consists of a ruby base and the ruby text, which most often sits on top of it:
In HTML, we can use the <ruby> tag to define a whole group, in which <rb>[1] defines the ruby base, and <rt> the ruby text.[2] (Spaces added below for readability.)
<ruby lang="ja"> <rb>東京</rb> <rt>とうきょう</rt> </ruby>
<ruby lang="zh"> <rb>北京</rb> <rt>Běijīng</rt> </ruby>
<ruby lang="vi"> <rb>河內</rb> <rt>Hà Nội</rt> </ruby>
The naive approach
Now what happens when the ruby text is wider than the ruby base? By default, <ruby> acts sort of like a single block of text:
In Japanese typography, however, it often looks more pleasant to spread the text over the neighboring characters, without any spacing[3]:
This could be solved with a little CSS:
- take the ruby text out of the regular text flow with position: absolute, then
- align it horizontally to the center of its parent, with something like left: 50%; transform: translateX(-50%), and
- move it to the top with bottom: 100%.
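ruby {
  /* Make the base the positioning context for its ruby text. */
  position: relative;
}
ruby rt {
  /* Take the ruby text out of the regular text flow... */
  position: absolute;
  /* ...center it horizontally over the base... */
  left: 50%;
  transform: translateX(-50%);
  /* ...and move it directly above the base. */
  bottom: 100%;
}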
And this works perfectly fine in Firefox, producing the earlier image.
Unfortunately, the implementations in Chrome and Safari lag behind at the moment, and the position property does not seem to work at all on <rt> there.
An alternative
If we cannot use the built-in <rt> element, we could try to replace it with the CSS pseudo-element ::before. If, instead of
<ruby>東京<rt>とうきょう</rt></ruby>
we write
<ruby data-rt="とうきょう">東京</ruby>
this stores the ruby text as a custom attribute, which we can access from CSS:
ruby[data-rt]::before {
  content: attr(data-rt);
}
Then, in addition to the positioning rules from the very first attempt (now applied to ::before instead of rt), we add the following to make it look like the original <rt> tag:
ruby[data-rt]::before {
  font-size: .5em;
  line-height: 1;
}
The result looks visually the same as our first attempt!
Sidenotes
The above approach should work for most cases, including vertical writing. Corner cases might appear, however, if you try to build on top of it. As usual, most of these can be solved with a hint of JavaScript code.
- Shorter text: You might also want to spread out the characters if the ruby base is wider than the ruby text. One approach is to split the text into individual characters with JavaScript, then spread them with flexbox styling (justify-content: space-around, for example, happens to match the Japanese styling specification). However, you cannot target CSS pseudo-elements with JS, so you might need to manually construct a child element for your <ruby> elements.
- Body overflow: If you want to be very precise, you might want to handle ruby texts flowing out of the body text area, i.e. align the text to one of the sides.
- Overlaps: The ruby texts might overlap or touch, though in practice the chance of that shouldn't be too high. If this becomes an issue, you can detect such cases using getBoundingClientRect(), and add some padding if necessary.
- Compound words: If you want to use multiple ruby texts in one single ruby element (e.g. per-character pronunciation), you might need to split the ruby elements. If the ruby base can break, e.g. at line ends, the ruby texts should probably follow that too.
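The "shorter text" and "overlaps" notes above could be sketched in JavaScript roughly like this. Treat it as a minimal sketch under a few assumptions, not a finished solution: the function names and the rt-spread class are made up for illustration, the ruby text is assumed to live in the data-rt attribute as in the previous section, and the collision check only compares ruby texts that follow each other in document order.

```javascript
// Split a reading into characters. Array.from iterates by code point,
// so characters outside the BMP (surrogate pairs) stay intact.
function splitReading(reading) {
  return Array.from(reading);
}

// Browser-only: build a real child element with one <span> per character,
// since pseudo-elements cannot be reached from JS.
// Assumption: "rt-spread" is a hypothetical class that you style yourself,
// e.g. with display: flex; justify-content: space-around; plus the same
// positioning rules that were used for ::before.
function buildSpreadRubyText(ruby) {
  const holder = document.createElement("span");
  holder.className = "rt-spread";
  for (const ch of splitReading(ruby.dataset.rt)) {
    const span = document.createElement("span");
    span.textContent = ch;
    holder.append(span);
  }
  // Drop the attribute so the ::before rule does not render the text twice.
  ruby.removeAttribute("data-rt");
  ruby.prepend(holder);
  return holder;
}

// True if two rectangles overlap, touch, or are closer than minGap pixels.
// Works on anything with left/right/top/bottom, such as the result of
// getBoundingClientRect().
function rectsCollide(a, b, minGap = 0) {
  return (
    a.left - minGap <= b.right &&
    b.left - minGap <= a.right &&
    a.top - minGap <= b.bottom &&
    b.top - minGap <= a.bottom
  );
}

// Return index pairs [i, i + 1] of consecutive rectangles that collide.
function findCollisions(rects, minGap = 0) {
  const pairs = [];
  for (let i = 0; i + 1 < rects.length; i++) {
    if (rectsCollide(rects[i], rects[i + 1], minGap)) pairs.push([i, i + 1]);
  }
  return pairs;
}
```

In a browser, the rectangles would come from the constructed child elements, for example Array.from(document.querySelectorAll(".rt-spread"), (el) => el.getBoundingClientRect()). For each reported pair you could then add some padding to one of the two ruby elements.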
You might also need to do some preprocessing, based on your source text:
- From HTML: If your text is in HTML and already uses <ruby> and <rt>, you can use JavaScript to query all ruby elements, and move the text content from the <rt> into the data-rt attribute of the <ruby>.
- From Markdown: If your text is in Markdown or similar, a common ruby pattern looks like this: {東京|とう|きょう}, that is, {base|text1|text2|...|textN}, where each text segment is the reading of one base character.
- From plain text: If you have plain text where the reading is next to the word (e.g. 東京《とうきょう》), you can always just replace them with a regular expression, as long as the writing is consistent.
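The three preprocessing routes above could look something like the following in JavaScript. This is only a sketch under stated assumptions: all function names are made up for illustration, the plain-text pattern assumes the base is the run of Han characters directly before 《, and the brace pattern pairs readings with base characters one to one, falling back to a single group when the counts differ.

```javascript
// Escape text so it is safe inside an HTML attribute or text node.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the data-rt markup used earlier in this post.
function rubyTag(base, reading) {
  return `<ruby data-rt="${escapeHtml(reading)}">${escapeHtml(base)}</ruby>`;
}

// From plain text: 東京《とうきょう》
// Assumption: the base is the run of Han characters right before 《.
function rubyFromPlainText(text) {
  return text.replace(
    /(\p{Script=Han}+)《([^《》]+)》/gu,
    (_, base, reading) => rubyTag(base, reading)
  );
}

// From the Markdown-like pattern: {base|text1|text2|...|textN}
function rubyFromBraces(text) {
  return text.replace(/\{([^{}|]+)((?:\|[^{}|]+)+)\}/g, (_, base, rest) => {
    const chars = Array.from(base); // split by code point
    const readings = rest.slice(1).split("|");
    if (readings.length !== chars.length) {
      // Counts differ: treat the whole thing as one group.
      return rubyTag(base, readings.join(""));
    }
    return chars.map((ch, i) => rubyTag(ch, readings[i])).join("");
  });
}

// From HTML (browser-only): move the <rt> text into data-rt.
// `root` is any element or document that contains the <ruby> elements.
function rubyFromRtElements(root) {
  for (const ruby of root.querySelectorAll("ruby")) {
    const rts = ruby.querySelectorAll("rt");
    if (rts.length === 0) continue;
    ruby.dataset.rt = Array.from(rts, (rt) => rt.textContent).join("");
    // Remove the original annotations and any <rp> fallback parentheses.
    for (const el of ruby.querySelectorAll("rt, rp")) el.remove();
  }
}

console.log(rubyFromPlainText("東京《とうきょう》に行く"));
// <ruby data-rt="とうきょう">東京</ruby>に行く
console.log(rubyFromBraces("{東京|とう|きょう}"));
// <ruby data-rt="とう">東</ruby><ruby data-rt="きょう">京</ruby>
```

The two string-based functions also run outside the browser, so they could be used in a build step. The DOM version would be called once after the page has loaded, for example as rubyFromRtElements(document).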
A nicely typeset page pleases the eye, and often requires just a tiny bit of additional care. If you happen to work with East Asian text a lot, I hope this will help to make your content look even better.
1. The <rb> tag is actually unnecessary now (you can write the base text directly inside <ruby>), but in this example it shows the element structure more clearly.
2. For a long time, <ruby> wasn't well supported, so people also used “creative” solutions, like tables for alignment. You might still run into those on some sites.
3. See https://www.w3.org/TR/jlreq/?lang=en for the whole specification.