GUI inputs are broken when using Unicode chars

Here is a Playground
Steps to reproduce :

  • Click within the input
  • Move the cursor from right to left with the keyboard
  • Try typing some text at several cursor positions. Strangely, new characters typed from the keyboard almost never appear where expected

This is not a “bug” per se :slight_smile:
We are simply not supporting Unicode. It is a known limitation of that control.

That being said, I will gladly welcome any PR on that topic :slight_smile:

Ok ! :slight_smile:

That said, we are already quite close to having it working. The rendering is fine; I think the problem mainly comes from indexing into the characters of the text, which uses the raw string “as is”, while in my example (emojis) each character is encoded as a surrogate pair, i.e. two UTF-16 code units:
"😀😁😂🤣" → "\uD83D\uDE00\uD83D\uDE01\uD83D\uDE02\uD83E\uDD23"


Before my chat project, I had never noticed that JavaScript counts these characters as 8 while, for example, Python counts 4 (as far as I understand, because JavaScript's length counts UTF-16 code units while Python's len counts code points):

JS :

> const text = "😀😁😂🤣";
undefined
> text.length;
8

Python :

>>> text = "😀😁😂🤣"
>>> len(text)
4
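To make the difference concrete, here is a small sketch of the three ways JavaScript can look at the same string. It only uses standard string methods; the variable names are just for illustration.

```javascript
const text = "😀😁😂🤣";

// String.prototype.length counts UTF-16 code units; each emoji here is a surrogate pair.
const codeUnits = text.length;

// The string iterator walks code points, so spreading yields one entry per emoji.
const codePoints = [...text].length;

// Reading by code unit returns only half of the pair; codePointAt returns the full value.
const firstUnit = text.charCodeAt(0).toString(16);
const firstPoint = text.codePointAt(0).toString(16);

console.log(codeUnits, codePoints, firstUnit, firstPoint); // 8 4 d83d 1f600
```

Counting code points is enough for these four emojis, but not for sequences built from several code points (flags, skin tones, combining accents), which is why grapheme segmentation is the safer unit for a caret.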

Coming back to the fix, I think that when computing text length or caret position, we could replace the use of inputField.text with its grapheme decomposition:

function decomp(text){
    const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
    const segments = Array.from(segmenter.segment(text));
    return segments.map(segment => segment.segment);
}

Test :

> decomp("😀😁😂🤣").length;
4
> decomp("😀😁😂🤣");
['😀', '😁', '😂', '🤣']
> decomp("Hello World !");
['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', ' ', '!']
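For the caret part, one possible approach is to keep the caret position in graphemes and convert it to a string offset only when editing the text. The sketch below assumes that model; `graphemeToStringOffset` and `insertAtGrapheme` are hypothetical helpers, not functions of the GUI control.

```javascript
// Hypothetical helpers for illustration; not part of the input control's API.
const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });

// Convert a caret position counted in graphemes into an offset in UTF-16 code units.
function graphemeToStringOffset(text, graphemeIndex) {
    let count = 0;
    for (const { index } of segmenter.segment(text)) {
        if (count === graphemeIndex) {
            return index; // start offset of the grapheme the caret sits before
        }
        count++;
    }
    return text.length; // caret at (or past) the end of the text
}

// Insert a string at a caret position expressed in graphemes.
function insertAtGrapheme(text, graphemeIndex, inserted) {
    const offset = graphemeToStringOffset(text, graphemeIndex);
    return text.slice(0, offset) + inserted + text.slice(offset);
}

console.log(insertAtGrapheme("😀😁😂🤣", 2, "ab")); // 😀😁ab😂🤣
```

The point of the design is that the string is never sliced in the middle of a surrogate pair, since every offset comes from a segment boundary.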

Agreed! This is super close to being done :slight_smile:

This looks widely supported, but then I was curious - what is the browser support required for a feature?

Intl.Segmenter is showing at 93.96% global for me.

That should be more than enough!
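For the remaining browsers, a feature check with a fallback might be enough. This is only a sketch: `splitGraphemes` is a hypothetical name, and the fallback splits by code point, which keeps surrogate pairs intact but still breaks multi-code-point sequences apart.

```javascript
// Hypothetical helper: use Intl.Segmenter when available, otherwise fall back to code points.
function splitGraphemes(text) {
    if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
        const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
        return Array.from(segmenter.segment(text), (s) => s.segment);
    }
    // Fallback: Array.from iterates by code point, so emojis made of a single
    // surrogate pair stay whole, but combined sequences are split.
    return Array.from(text);
}

console.log(splitGraphemes("😀a").length); // 2
```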
