[GTALUG] interesting article and comments about UCS-16, UTF-16, UTF-8
D. Hugh Redelmeier
hugh at mimosa.com
Sat Aug 3 18:18:51 EDT 2019
https://news.ycombinator.com/item?id=20600195
There are so many hairy details!
UTF-8 gets a bit less coverage since it has fewer hairy details.
From this I learned that Java and JavaScript now have optimizations to
use Latin-1 when they can. Normally they use UTF-16 (originally
UCS-2). I take it that using Latin-1 is an opportunistic
optimization hidden from the program. I don't think Python 3 uses
this.
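To make the size trade-off concrete, here is a rough sketch that counts
the bytes a string needs in each encoding. It only measures encoded
lengths; it says nothing about how any runtime actually stores strings
internally, and the helper name encoded_sizes is made up for this
illustration.

```python
def encoded_sizes(text):
    """Return the byte length of text in a few encodings.

    A value of None means the text cannot be represented in that encoding.
    """
    sizes = {}
    for name in ("latin-1", "utf-8", "utf-16-le"):
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            sizes[name] = None
    return sizes


if __name__ == "__main__":
    # ASCII only, Latin-1 range, and a character outside the BMP.
    for sample in ("hello", "caf\u00e9", "\U0001F600"):
        print(repr(sample), encoded_sizes(sample))
```

For text that fits in Latin-1, the one-byte encoding is half the size of
UTF-16, which is presumably the appeal of the optimization. Once a single
character falls outside Latin-1, that encoding is no longer an option.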
I think that Linux does this right and needs no such hack: just use
UTF-8. Of course Java, JavaScript, Python 2, and Python 3 on Linux
don't get it right.