emoji and grapheme clusters
Series: [blog]
Niki wrote "Emoji under the hood", where he explains how emoji work:
- an intro to Unicode and the UTF-8 / UTF-16 encodings
- how representation works for emoji fonts when glyphs are bitmaps and not vector shapes
- the unexpected results of font fallbacks
- the VARIATION SELECTOR-16 (U+FE0F), which requests emoji presentation and so points to the right font when a codepoint is implemented in multiple fonts
- that you should never ever try to split a string inside a grapheme cluster
- same holds for U+200D, the ZERO-WIDTH JOINER (ZWJ), which is used to compose emoji
- same holds for
- flags are two-letter ligatures
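
To make the list above a bit more concrete, here is a minimal Python sketch that looks inside three emoji. The sample strings and the helper names `code_points` and `utf16_units` are my own picks for illustration, not something taken from Niki's post. It only uses the standard library and shows how one visible emoji can be several code points, and what naive slicing does to it.

```python
import unicodedata

# Escapes are used instead of literal emoji so the code points stay visible.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # MAN, ZWJ, WOMAN, ZWJ, GIRL
flag = "\U0001F1E9\U0001F1EA"  # regional indicator letters D + E
heart = "\u2764\uFE0F"  # HEAVY BLACK HEART + VARIATION SELECTOR-16


def code_points(text):
    """List every code point in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def utf16_units(text):
    """Number of 16-bit code units needed to encode text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


for sample in (family, flag, heart):
    # Code points, UTF-8 bytes, UTF-16 code units: three different "lengths".
    print(len(sample), len(sample.encode("utf-8")), utf16_units(sample))
    for line in code_points(sample):
        print("   ", line)

# Naive slicing cuts through the cluster: only the MAN code point survives,
# and half a flag is a lone regional indicator.
broken_family = family[:1]
broken_flag = flag[:1]
```

The family emoji is one grapheme cluster on screen, but 5 code points, 8 UTF-16 code units and 18 UTF-8 bytes. None of these numbers is the "length" a user would expect, and cutting at any of those indexes can land in the middle of the cluster.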
Or: you should always use the ICU library for string operations. The key is to handle grapheme clusters correctly.
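
To give an idea of what "handling grapheme clusters" involves, here is a deliberately simplified sketch in Python. It is not ICU and it does not implement the full Unicode segmentation rules; the function `rough_clusters` and its helpers are hypothetical names for this post, and the rules are reduced to the cases mentioned above (ZWJ, VARIATION SELECTOR-16, flags) plus combining marks and skin tone modifiers. For real code, a proper library such as ICU is the way to go.

```python
import unicodedata

ZWJ = "\u200D"


def is_regional_indicator(ch):
    """True for the 26 regional indicator letters that flags are built from."""
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def is_extender(ch):
    """Combining marks (this includes U+FE0F) and emoji skin tone modifiers."""
    return (unicodedata.category(ch) in ("Mn", "Me", "Mc")
            or 0x1F3FB <= ord(ch) <= 0x1F3FF)


def rough_clusters(text):
    """Split text into approximate grapheme clusters (simplified rules only)."""
    clusters = []
    for ch in text:
        prev = clusters[-1] if clusters else ""
        # A flag is exactly two regional indicators, so only pair up with a lone one.
        completes_flag = (is_regional_indicator(ch)
                          and len(prev) == 1 and is_regional_indicator(prev))
        if prev and (ch == ZWJ or prev[-1] == ZWJ or is_extender(ch) or completes_flag):
            clusters[-1] += ch  # stays glued to the previous cluster
        else:
            clusters.append(ch)  # starts a new cluster
    return clusters


family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # MAN, ZWJ, WOMAN, ZWJ, GIRL
two_flags = "\U0001F1E9\U0001F1EA\U0001F1EB\U0001F1F7"  # D+E followed by F+R
heart = "\u2764\uFE0F"  # HEAVY BLACK HEART + VARIATION SELECTOR-16
thumbs = "\U0001F44D\U0001F3FD"  # THUMBS UP + medium skin tone modifier

parts = rough_clusters(family + two_flags + heart + thumbs + "e\u0301")
print(len(parts))  # count of clusters, not code points
```

Even this toy version needs to know about joiners, modifiers and flag pairing, and it still gets plenty of edge cases wrong (Hangul syllables, for example). That is the reason to leave the job to a library that tracks the Unicode rules.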