textWrapping not working on Asian languages (Japanese & Chinese)

Aaran · May 17, 2020, 2:35am

GUI TextBlock textWrapping is not working on Asian languages. I’ve tried Japanese and Chinese.

The PG I saved shows only ? marks in place of the text:
https://www.babylonjs-playground.com/#XCPP9Y#2555

But you can reproduce it by copying one of the texts below and pasting it into line 33:
var text = "<copy+paste here>";

Japanese text:
バビロンは、紀元前18～6世紀の古代メソポタミア地域における主要な王国だった。その名をとどろかせる首都が、ユーフラテス川沿いに建設された。都市は、川をまたいでその両岸に、ほぼ同じ面積で建設され、時に襲う季節的な洪水を防ぐため、川沿いには険しい堤防が築かれた。元々は、バビロンはアッカドの小さな町で、その起源は紀元前2300年頃のアッカド帝国にまでさかのぼる。

Chinese text:
巴比伦城市遗址在今天伊拉克巴比伦省的希拉被发现，位于巴格达以南约八十五公里处。这个举世闻名城市的遗址地处底格里斯河和幼发拉底河之间肥沃的美索不达米亚平原上，现在仅留存着由破损的土砖建筑物构成的大型土墩和碎片。城市沿着幼发拉底河建造，被左、右河岸平分成两部分，配有陡峭的河堤来抵御季节性的洪水。

Thanks.

Deltakosh · May 18, 2020, 1:58pm

Pinging @thomlucc to track this issue

thomlucc · May 18, 2020, 5:07pm

@Aaran - I created an issue on the Babylon.js repo. You can track it: #8211

Evgeni_Popov · May 18, 2020, 9:35pm

The problem is that the separator character is assumed to be the space character, and there is no space in those strings…
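To illustrate, here is a minimal sketch (not the actual Babylon.js code; `splitOnSpace` is a hypothetical stand-in for splitting on the space character):

```javascript
// Splitting on the space character works for English,
// but a Japanese sentence contains no spaces and stays in one piece.
const english = "Babylon was a key kingdom in ancient Mesopotamia.";
const japanese = "バビロンは、紀元前18～6世紀の古代メソポタミア地域における主要な王国だった。";

// Hypothetical stand-in for the default word splitting.
const splitOnSpace = (line) => line.split(" ");

console.log(splitOnSpace(english).length);  // 8
console.log(splitOnSpace(japanese).length); // 1
```

With only one "word" per line, the wrapping logic has no place where it can break the line.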

There are specific rules for breaking lines in Japanese text: see "Line breaking rules in East Asian languages" on Wikipedia.

Do we want to implement this in the Babylon GUI?

If yes, I guess we would need to support other Asian languages as well…

Also, how would we detect the language from a string? It must be possible, but it is not easy, so I guess we would need the user to pass that information.
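As a rough idea of what detection could look like, here is a sketch using Unicode property escapes in a regular expression. The helper names `containsCjk` and `looksJapanese` are made up for this example, and this detects the script, not the language: Han characters are shared by Chinese and Japanese, so only the presence of kana gives a strong hint of Japanese.

```javascript
// Matches any Han, Hiragana or Katakana character (requires the "u" flag).
const CJK_PATTERN = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]/u;
const KANA_PATTERN = /[\p{Script=Hiragana}\p{Script=Katakana}]/u;

// Hypothetical helper: true if the text contains Han or kana characters.
function containsCjk(text) {
    return CJK_PATTERN.test(text);
}

// Hypothetical helper: true if the text contains kana, a strong hint of Japanese.
// Text written only in Han characters remains ambiguous.
function looksJapanese(text) {
    return KANA_PATTERN.test(text);
}

console.log(containsCjk("巴比伦城市遗址"));   // true
console.log(looksJapanese("巴比伦城市遗址")); // false
console.log(looksJapanese("バビロンは"));     // true
```

This kind of heuristic breaks down on mixed or short strings, which is why letting the user pass the information is the safer option.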

Evgeni_Popov · May 18, 2020, 9:37pm

Maybe this can be used to split words for Japanese: GitHub - trkbt10/mikan.js (a solution to the Japanese line-breaking problem that does not use machine learning)

Deltakosh · May 18, 2020, 11:13pm

What about giving users an option to provide their own separation mechanism? Then we can create a PG for the doc that does it for Chinese / Japanese.

Evgeni_Popov · May 18, 2020, 11:24pm

Ok, will do that.

Aaran · May 19, 2020, 1:29pm

Thanks @thomlucc and all for your help!

Evgeni_Popov · May 19, 2020, 6:19pm

Done, but I’m a bit stuck for an example PG because of:

Deltakosh · May 19, 2020, 6:28pm

yeah the server may not store unicode characters…

This is the response I’m getting:
{"id":"MEH9C5","version":0,"snippetIdentifier":"MEH9C5-0","jsonPayload":"{"code":"var joshi = \"???|\";\r\n"}","name":"Test japanese","description":"","tags":""}

Investigating the root cause

Deltakosh · May 19, 2020, 7:16pm

OK, I cannot fix it. The Playground backend is configured to use text and not ntext, and I cannot change it easily because the field is used by the full-text search system.
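A rough simulation of the effect, assuming the non-Unicode column behaves like a single-byte code page that replaces every character it cannot represent with "?". The function `toSingleByteLossy` and the sample input are made up for this sketch; the real conversion happens inside the database.

```javascript
// Hypothetical simulation of storing a string in a non-Unicode (single-byte) column:
// any code point above 0xFF cannot be represented and is replaced by "?".
function toSingleByteLossy(text) {
    return Array.from(text) // iterate by code point, not by UTF-16 unit
        .map((ch) => (ch.codePointAt(0) <= 0xff ? ch : "?"))
        .join("");
}

// Made-up input: three kana characters followed by an ASCII pipe.
console.log(toSingleByteLossy("がをに|")); // "???|"
```

ASCII characters such as "|" survive, which is consistent with the shape of the response above.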

Evgeni_Popov · May 20, 2020, 7:32pm

A TextBlock.wordSplittingFunction property has been added. It lets you provide a function that takes a line of text as a string and returns an array of words.

Here’s a fiddle that demonstrates the usage for Japanese (I could not find a JS package for Chinese that is as easy to use as mikan.js is for Japanese):

https://jsfiddle.net/3ph9m0cx/
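For Chinese, or when no segmentation library is at hand, a naive fallback is to treat every Han or kana character as its own word. The sketch below is not the fiddle's code and does not use mikan.js; `splitCjkLine` is a hypothetical name, and it ignores the East Asian line breaking rules mentioned earlier (for example, it allows a line to start with a closing punctuation mark).

```javascript
// One token per Han/kana character; runs of any other non-whitespace
// characters (Latin words, digits, punctuation) are kept together.
const TOKEN = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]|[^\s\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]+/gu;

// Hypothetical word-splitting function: takes a line, returns an array of words.
function splitCjkLine(line) {
    return line.match(TOKEN) || [];
}

console.log(splitCjkLine("紀元前18世紀 Babylon"));
// [ '紀', '元', '前', '18', '世', '紀', 'Babylon' ]
```

Assuming `textBlock` is an existing TextBlock with text wrapping enabled, it would be hooked up with `textBlock.wordSplittingFunction = splitCjkLine;`. How the returned words are joined back together on each line is up to the TextBlock, so check the rendered result for unwanted spacing.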

Deltakosh · May 20, 2020, 9:00pm

Do you mind updating the doc as well? This is super useful.
Thanks a ton

Evgeni_Popov · May 21, 2020, 11:43am

Done:

thomlucc · May 22, 2020, 4:45pm

Thanks @Evgeni_Popov!! @Aaran
