t-fischer.net


Follow-up on ‘ASCII Transliteration without ICU or iconv’

2020-10-13, 13:37, by Thomas Fischer

In a recent blog posting I presented some code to transliterate common Unicode characters into ASCII-only representations using an offline-generated lookup table, thereby avoiding a dependency on ICU, which would normally do this job.

An anonymous commenter pointed out that Unicode (in Qt) is slightly more complicated than I had considered when writing the code: I had failed to handle planes beyond the Basic Multilingual Plane (BMP) and the ‘surrogates’ between code points 0xD800 and 0xDFFF. Qt stores strings as UTF-16, so any character outside the BMP shows up as a pair of such surrogate code units rather than as a single 16-bit value. In a series of recently pushed Git commits I addressed the problem of surrogates and fixed some more issues. Some preparatory work has been done to support more planes in the future, but as of now, only the BMP is supported. For details, please have a look at the five commits posted on 2019-10-12.
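To illustrate the surrogate issue, here is a minimal sketch in plain C++ (standard library only, no Qt). It is not the code from the commits. The tiny table `kAsciiTable`, the helper names, and the choice of `?` as the placeholder for anything unsupported are all assumptions made for this example. It walks over UTF-16 code units, looks up BMP characters in the table, and consumes a valid surrogate pair as one unit instead of treating its two halves as separate characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the offline-generated lookup table;
// a real table would cover far more of the BMP.
static const std::unordered_map<char16_t, std::string> kAsciiTable = {
    {u'\u00E4', "ae"}, {u'\u00F6', "oe"}, {u'\u00FC', "ue"},
    {u'\u00DF', "ss"}, {u'\u00E9', "e"},
};

inline bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Combine a valid surrogate pair into a code point in U+10000..U+10FFFF.
inline char32_t combineSurrogates(char16_t high, char16_t low) {
    assert(isHighSurrogate(high) && isLowSurrogate(low));
    return 0x10000 + ((static_cast<char32_t>(high) - 0xD800) << 10)
                   + (static_cast<char32_t>(low) - 0xDC00);
}

// Placeholder for future support of planes beyond the BMP.
// Assumption: every such code point is rendered as '?' for now.
inline std::string transliterateSupplementary(char32_t /*codePoint*/) {
    return "?";
}

std::string toAscii(const std::u16string &input) {
    std::string out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char16_t u = input[i];
        if (u < 0x80) {
            // Plain ASCII passes through unchanged.
            out += static_cast<char>(u);
        } else if (isHighSurrogate(u) && i + 1 < input.size()
                   && isLowSurrogate(input[i + 1])) {
            // Valid pair: one code point outside the BMP, so skip both units.
            out += transliterateSupplementary(combineSurrogates(u, input[i + 1]));
            ++i;
        } else if (isHighSurrogate(u) || isLowSurrogate(u)) {
            // Unpaired surrogate: malformed input, emit a single placeholder.
            out += '?';
        } else {
            // Ordinary BMP character: consult the lookup table.
            const auto it = kAsciiTable.find(u);
            out += (it != kAsciiTable.end()) ? it->second : std::string("?");
        }
    }
    return out;
}
```

With this sketch, `toAscii(u"Gr\u00FC\u00DFe")` yields `Gruesse`, and a string containing U+1F600 (stored as the pair 0xD83D 0xDE00) yields a single `?` for that character instead of two.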

Tagged with: kde kbibtex

This posting is available via Gemini at gemini://gemini.t-fischer.net/post/follow-up-on-ascii-transliteration-without-icu-or-iconv.gmi.
