textWrapping not working on Asian languages (Japanese & Chinese)

GUI TextBlock textWrapping is not working on Asian languages. I’ve tried Japanese and Chinese.

The playground (PG) I saved shows only ? marks in place of the text:
https://www.babylonjs-playground.com/#XCPP9Y#2555

But you can simply copy and paste one of the texts below to replace line 33:
var text = "<copy+paste here>";

Japanese text:
バビロンは、紀元前18~6世紀の古代メソポタミア地域における主要な王国だった。その名をとどろかせる首都が、ユーフラテス川沿いに建設された。都市は、川をまたいでその両岸に、ほぼ同じ面積で建設され、時に襲う季節的な洪水を防ぐため、川沿いには険しい堤防が築かれた。元々は、バビロンはアッカドの小さな町で、その起源は紀元前2300年頃のアッカド帝国にまでさかのぼる。

Chinese text:
巴比伦城市遗址在今天伊拉克巴比伦省的希拉被发现,位于巴格达以南约八十五公里处。这个举世闻名城市的遗址地处底格里斯河和幼发拉底河之间肥沃的美索不达米亚平原上,现在仅留存着由破损的土砖建筑物构成的大型土墩和碎片。城市沿着幼发拉底河建造,被左、右河岸平分成两部分,配有陡峭的河堤来抵御季节性的洪水。

Thanks.

Pinging @thomlucc to track this issue

@Aaran - I created an issue on the Babylon.js repo. You can track it here: #8211

The problem is that the separator character is assumed to be the space character, and there is no space in those strings…
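
To make the point above concrete, here is a minimal sketch of what a space-based split produces. It only assumes that the wrapping code looks for break points by splitting on " ", as described in this post; the English sentence is an illustrative translation, not part of the original report.

```javascript
// An excerpt of the Japanese sample text from this thread (it contains no spaces).
const japanese = "その名をとどろかせる首都が、ユーフラテス川沿いに建設された。";
// Illustrative English sentence for comparison (not from the original report).
const english = "Babylon was a major kingdom in ancient Mesopotamia.";

// Split on the space character to find candidate break points.
const japaneseWords = japanese.split(" ");
const englishWords = english.split(" ");

console.log(englishWords.length);  // several words, so several places to wrap
console.log(japaneseWords.length); // 1: the whole line is a single "word"
```

With only one "word" on the line, the wrapping logic has nowhere to break, so the text overflows instead of wrapping.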

There are specific rules for where a line of Japanese text may be broken: Line breaking rules in East Asian languages - Wikipedia
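
As a rough illustration of what such rules involve, the sketch below breaks between every character except where a character must not start or end a line. The two character sets are a small, hand-picked subset chosen for this example, and `splitAtBreakOpportunities` is a hypothetical helper, not a Babylon.js function.

```javascript
// Characters that should not begin a line (small, non-exhaustive subset; an assumption).
const NO_LINE_START = new Set(["、", "。", ",", ".", "」", "』", ")", "!", "?"]);
// Characters that should not end a line (also a small subset; an assumption).
const NO_LINE_END = new Set(["「", "『", "("]);

// Hypothetical helper: returns chunks between which a line break is acceptable.
function splitAtBreakOpportunities(line) {
    const chunks = [];
    let current = "";
    let lastChar = "";
    // Array.from iterates by code point, so surrogate pairs stay intact.
    for (const ch of Array.from(line)) {
        if (current === "") {
            current = ch;
        } else if (NO_LINE_START.has(ch) || NO_LINE_END.has(lastChar)) {
            // Breaking here would violate a rule, so keep ch in the current chunk.
            current += ch;
        } else {
            chunks.push(current);
            current = ch;
        }
        lastChar = ch;
    }
    if (current !== "") {
        chunks.push(current);
    }
    return chunks;
}

console.log(splitAtBreakOpportunities("首都が、川"));
```

This treats every character as a break opportunity, so Latin words embedded in the text would also be split per letter; a real implementation would need to keep those together.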

Do we want to implement this in the Babylon GUI?

If yes, I guess we would need to support other Asian languages as well…

Also, how do we detect the language from a string? It must be possible but is not easy, so I guess we would need the user to pass that information.
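
To show why detection is only partly feasible, here is a small sketch that guesses the script (not the language) of a string using Unicode property escapes. `guessScript` is a hypothetical helper for illustration. Kana is a strong hint for Japanese, but a string made only of Han characters could be Chinese or kanji-only Japanese, which is why asking the user remains the safer option.

```javascript
// Hypothetical helper: classifies a string by the scripts it contains.
function guessScript(text) {
    // Hiragana or katakana strongly suggests Japanese.
    if (/[\p{Script=Hiragana}\p{Script=Katakana}]/u.test(text)) {
        return "kana";
    }
    // Han characters alone are ambiguous between Chinese and Japanese.
    if (/\p{Script=Han}/u.test(text)) {
        return "han";
    }
    return "other";
}

console.log(guessScript("バビロンは"));   // "kana"
console.log(guessScript("巴比伦城市"));   // "han"
console.log(guessScript("Babylon"));      // "other"
```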

Maybe this can be used to split words for Japanese: GitHub - trkbt10/mikan.js: 機械学習を用いていない日本語改行問題へのソリューション ("a solution to the Japanese line-breaking problem that does not use machine learning")

What about giving users an option to provide their own separation mechanism? We could then create a PG for the doc that does it for Chinese / Japanese.

Ok, will do that.

Thanks @thomlucc and all for your help!

Done, but I’m a bit stuck for an example PG because of:

yeah the server may not store unicode characters…

This is the response I'm getting:
{"id":"MEH9C5","version":0,"snippetIdentifier":"MEH9C5-0","jsonPayload":"{"code":"var joshi = \"???|\";\r\n"}","name":"Test japanese","description":"","tags":""}

Investigating the root cause

OK, I cannot fix it. The playground backend is configured to use text and not ntext, and I cannot change it easily because the field is used by the full-text search system.

A TextBlock.wordSplittingFunction property has been added. It lets you provide a function that takes a line of text as a string and returns an array of words.

Here's a fiddle that demonstrates the usage for Japanese (I could not find an easy-to-use JS package for Chinese like mikan.js is for Japanese):

https://jsfiddle.net/3ph9m0cx/
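
For readers who cannot open the fiddle, here is a separate minimal sketch of a splitting function with the shape described above (a line in, an array of words out). It does not reproduce the fiddle: instead of mikan.js it assumes the runtime provides the standard `Intl.Segmenter` API, and falls back to one chunk per character when it does not. `makeWordSplitter` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a function (line: string) => string[] for a locale.
function makeWordSplitter(locale) {
    if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
        const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
        // Each segment covers a slice of the input; together they cover the whole line.
        return (line) => Array.from(segmenter.segment(line), (s) => s.segment);
    }
    // Fallback (assumption): treat every character as its own word.
    return (line) => Array.from(line);
}

const splitJapanese = makeWordSplitter("ja");
console.log(splitJapanese("その名をとどろかせる首都が、ユーフラテス川沿いに建設された。"));

// Intended usage with Babylon.js GUI (not executed here):
// const textBlock = new BABYLON.GUI.TextBlock();
// textBlock.textWrapping = true;
// textBlock.wordSplittingFunction = splitJapanese;
```

The exact segments depend on the runtime's dictionary data, so results may differ between browsers. The same helper can be created with "zh" for Chinese text.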

Do you mind updating the doc as well? This is super useful.
Thanks a ton.

Done:

Thanks @Evgeni_Popov!! @Aaran
