The string "𠀎a" is wrongly shown as a difference when comparing a UTF-8 version of the string to a UTF-16 version:
"�a" <-> "𠀎a" (UTF-8 on left and UTF-16 on right)
On my screen it shows one ? for the UTF-8 and two ?s for the UTF-16 [in beyond compare] (You might be able to see the character on one side in this post).
This code point (U+2000E) takes 4 bytes in UTF-16, i.e. a surrogate pair. Note that other code points are correctly matched between UTF-8 and UTF-16 when the UTF-16 representation is 2 bytes.
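A quick check of the encoded lengths (a Python sketch just to illustrate the encodings; Beyond Compare itself is not involved) shows why this code point is special: it sits outside the Basic Multilingual Plane, so both encodings need 4 bytes for it.

```python
# U+2000E (𠀎) is outside the BMP, so it takes a 4-byte sequence in
# UTF-8 and a surrogate pair (also 4 bytes) in UTF-16.
s = "\U0002000Ea"

utf8 = s.encode("utf-8")
utf16 = s.encode("utf-16-le")  # little-endian, no BOM

print(len(utf8))   # 5 bytes: 4 for U+2000E + 1 for 'a'
print(len(utf16))  # 6 bytes: surrogate pair (4) + 2 for 'a'
```

If the comparison tool decodes the UTF-16 surrogate pair as two separate 16-bit units instead of one code point, the two sides will not match, which would explain the one-?-versus-two-?s rendering.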
Jon