Incorrect fallback font (for Chinese characters)
Here is how JavaFX on Windows 11 renders a Label with mixed characters. I keep the default font “System” and set the font size to 48:
If you are not familiar with Chinese characters, pay attention to the thickness of the lines and the tiny hooks. The problem is that the first character (內, pronounced nèi) is set in a different font than the characters that follow! There is also a second, underlying issue that has to do with which fonts JavaFX uses to render certain characters.
I noticed these mixed fonts (for Chinese) from time to time while testing Merit Cards during my own studies. Usually all words and sentences looked alright (like the last 3 characters in the example), but some characters like 內 or 說 suddenly “broke” into a different font.
The following sections describe how I got closer to the problem and its solution. I’m writing this introduction some days after collecting and creating enough example images, so bear in mind that at the beginning of my research, I was not aware of some of the technical details yet.
If you also develop JavaFX applications and need a quick fix for this issue, there is a suggestion at the bottom of this post.
First investigation (2022-08-09)
First something quick about my setup: I’m running Windows 11 and write traditional Chinese characters using Microsoft’s input method (Taiwan Bopomofo/Zhuyin Fuhao). For Latin characters, the default font is called Segoe UI. A Chinese phrase on a Facet might look like this:
This is without setting anything in JavaFX (except the font size for this screenshot), so the system should use whatever the default is. It doesn’t look pretty, but in the beginning I simply regarded it as one of those quirks that you encounter when running anything Java-based.
Of course, I could use a prettier font and just set everything to use that one, but consider what happens when I manually set a font for a Label consisting of mixed characters:
East-Asian fonts often have weird shapes for Latin characters. Also, consider what happens if you mix Latin and Chinese characters in e.g. a code editor: The Latin characters and other symbols stay in your (monospace) font of choice. It must be possible to mix different fonts for different character sets.
There are many places in Windows (and on other systems) where you mix character sets seemingly without problem. For example, here I put the same phrase into Windows’ search:
The “內” looks correct! That is, it is rendered in the same font as the following characters. So why could it be that JavaFX somehow chokes on 內?
On font substitution
When a font is asked to render a character that is not part of its glyphs, Windows uses font substitution to insert another font’s glyphs into the same text area. This is outlined here by Microsoft: Font technology#Font substitution.
It seems this is correctly used in the Windows Search bar, but not in JavaFX. Let’s look at the fallback list for Segoe UI to see if there is a problem. It is stored under Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink and contains:
TAHOMA.TTF,Tahoma
MEIRYO.TTC,Meiryo UI,128,96
MEIRYO.TTC,Meiryo UI
MSGOTHIC.TTC,MS UI Gothic
MSJH.TTC,Microsoft JhengHei UI,128,96
MSJH.TTC,Microsoft JhengHei UI
MSYH.TTC,Microsoft YaHei UI,128,96
MSYH.TTC,Microsoft YaHei UI
MALGUN.TTF,Malgun Gothic,128,96
MALGUN.TTF,Malgun Gothic
MINGLIU.TTC,PMingLiU
SIMSUN.TTC,SimSun
GULIM.TTC,Gulim
YUGOTHM.TTC,Yu Gothic UI,128,96
YUGOTHM.TTC,Yu Gothic UI
SEGUISYM.TTF,Segoe UI Symbol
I recognize Microsoft JhengHei UI as a font for traditional characters here, and it also appears fairly early in the list. If you have used Chinese systems in the past, you might know PMingLiU as a typical font for traditional Chinese characters. It actually looks a lot closer to the mis-rendered 內 than Microsoft JhengHei UI (as we will see in a minute). Let’s take a look at the first image again:
If JavaFX works according to Microsoft’s documentation, I might be able to fix the mixed fonts by adjusting the values of Segoe UI. However, editing this list didn’t change anything at all! So I simply removed Registry entries one after another until I saw something break. Interestingly, I only had to close and restart Merit Cards to check for changes (in contrast to Microsoft’s documentation, which requires a restart of the machine; more on that later).
After a few steps, I noticed that the Registry entry for Tahoma seems to be responsible. It is the first entry for Segoe UI’s fallback*, and its fallback list is:
MSGOTHIC.TTC,MS UI Gothic
MINGLIU.TTC,PMingLiU
SIMSUN.TTC,SimSun
GULIM.TTC,Gulim
YUGOTHM.TTC,Yu Gothic UI
MSJH.TTC,Microsoft JhengHei UI
MSYH.TTC,Microsoft YaHei UI
MALGUN.TTF,Malgun Gothic
SEGUISYM.TTF,Segoe UI Symbol
*As it will later turn out, this is probably just a coincidence, but it just made me more suspicious of Tahoma.
In this list, PMingLiU appears pretty early. Could this be the reason for the mis-rendered 內? Let’s bring Microsoft JhengHei UI (MSJH.TTC,Microsoft JhengHei UI) to the top of the list. Saving the Registry and restarting Merit Cards yields:
All characters are rendered in the same font! But this font (Microsoft JhengHei UI) looks completely different from the Chinese font in the first image. As it turns out, this is what happened before the change: when the default font isn’t able to render Chinese characters, JavaFX falls back to the first font in the list, which is MS UI Gothic, a font for Japanese! So it seems that all the time I was typing and entering traditional characters, a Japanese font was used to display them, until 內 came along. MS UI Gothic has no glyph for it, so the fallback moves on to the next font in Tahoma’s list, PMingLiU.
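The behavior described above can be sketched as a simple "first font that has the glyph wins" loop. This is only a toy model and not JavaFX's actual code: the class name `FallbackSketch`, its methods, and the coverage table are all made up for illustration. The table only encodes what the screenshots suggest (MS UI Gothic lacks 內 but has 憂外患), and the fallback list is shortened to three fonts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class FallbackSketch {

    // Collects the code points of a string into a set.
    static Set<Integer> codePoints(String s) {
        return s.codePoints().boxed().collect(Collectors.toSet());
    }

    // Toy coverage table: an assumption for illustration, not read from real font files.
    static Map<String, Set<Integer>> toyCoverage() {
        Set<Integer> withoutNei = codePoints("\u6182\u5916\u60A3"); // you wai huan
        Set<Integer> all = codePoints("\u5167\u6182\u5916\u60A3");  // nei you wai huan
        return Map.of(
                "MS UI Gothic", withoutNei,
                "PMingLiU", all,
                "Microsoft JhengHei UI", all);
    }

    // Returns the first font in fallbackOrder whose coverage contains the code point.
    static String pickFont(int codePoint, List<String> fallbackOrder,
                           Map<String, Set<Integer>> coverage) {
        for (String font : fallbackOrder) {
            if (coverage.getOrDefault(font, Set.of()).contains(codePoint)) {
                return font;
            }
        }
        return "(no font found)";
    }

    // Picks a font for every code point of the text, in order.
    static List<String> fontsFor(String text, List<String> fallbackOrder,
                                 Map<String, Set<Integer>> coverage) {
        List<String> result = new ArrayList<>();
        text.codePoints().forEach(cp -> result.add(pickFont(cp, fallbackOrder, coverage)));
        return result;
    }

    public static void main(String[] args) {
        String phrase = "\u5167\u6182\u5916\u60A3"; // nei you wai huan
        // Shortened version of Tahoma's original order.
        List<String> original = List.of("MS UI Gothic", "PMingLiU", "Microsoft JhengHei UI");
        // Order after moving Microsoft JhengHei UI to the top.
        List<String> edited = List.of("Microsoft JhengHei UI", "MS UI Gothic", "PMingLiU");

        System.out.println(fontsFor(phrase, original, toyCoverage()));
        System.out.println(fontsFor(phrase, edited, toyCoverage()));
    }
}
```

With the original order, only the first character ends up in PMingLiU while the other three stay in MS UI Gothic. With the edited order, all four land in Microsoft JhengHei UI, which matches what I saw on screen.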
In hindsight, the shape and name of these fonts give away a big clue, but I just wasn’t familiar with which fonts Windows uses for Japanese.
The question now is: why does this only work when changing Tahoma’s fallback list and not Segoe UI’s? Since Segoe UI’s list also contains MS UI Gothic in an earlier position, the next fallback font (for the 內) should be Microsoft JhengHei UI and not PMingLiU. In Tahoma’s list, this is reversed: PMingLiU comes before Microsoft JhengHei UI.
Taking a look into relevant code sections of JavaFX, this implementation of FallbackResource seems to be responsible for a hard-coded Tahoma value: com.sun.javafx.font.FallbackResource
There is a comment for this method which even states “To start with we will use the exact same fall back list for everything,” and what the highlighted line does is look into the Registry for the fallback list of Tahoma.
One can confirm this behavior by removing every font from Segoe UI’s fallback list. It doesn’t seem to get used at all, and only Tahoma’s is considered.
So for now, if I want to get rid of the mixed Asian fonts (by the way, this is called a Zwiebelfisch in German), I could fix it on my local system by editing Tahoma’s fallback list. In my case, I would put Microsoft JhengHei UI at the top, ahead of MS UI Gothic.
There is still a question though: Taking a look at the Windows search bar image, the Chinese characters are rendered using Microsoft JhengHei UI even though MS UI Gothic comes first in Segoe UI’s and Tahoma’s fallback list. However, I’m pretty sure that I am seeing traditional Chinese characters elsewhere and not their Japanese form.
How can Windows (or any application) even know, when you copy and paste 內憂外患 somewhere, that it should use a Japanese font if you are currently writing a Japanese text and a Chinese font in other cases? What my “fix” in JavaFX now does is use a font for traditional Chinese even though a Merit Cards user might enter Japanese phrases.
The next day (2022-08-10)
Going back to the question from yesterday, which font should Windows use if I randomly paste 內憂外患 somewhere? Here is how it looks in Notepad:
Microsoft JhengHei UI. However, to get more data points, I tried the same on another machine that was still running Windows 10. Here is what happens in Windows 10’s Notepad (I put the Font selection dialog next to it):
Look at the 內: it’s the exact behavior that one would expect from Microsoft’s font substitution. Segoe UI isn’t able to render 內憂外患, so it falls back to MS UI Gothic. However, MS UI Gothic isn’t able to render 內, so it falls back to the next suitable font in Segoe UI’s fallback list: Microsoft JhengHei UI!
By the way, what is wrong with this 內 after all? Looking at Windows 10’s font settings, one can see:
It doesn’t exist in MS UI Gothic, and that’s why JavaFX/Win32 falls back to the next font. The font does contain another shape of this character (内), which also has a different code point. It seems Japanese input methods output 内 while Chinese ones output 內.
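To make the "different code point" part concrete, here is a small sketch using only the Java standard library. The class name `CodePointSketch` and its helper are made up for this post. I write the two characters as Unicode escapes so that the file compiles regardless of source encoding: `\u5167` is the traditional form 內 and `\u5185` is the form 内 that Japanese input methods produce.

```java
public class CodePointSketch {

    // Formats the first code point of s in the usual U+XXXX notation.
    static String describe(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    // Returns the Unicode script of the first code point of s.
    static Character.UnicodeScript scriptOf(String s) {
        return Character.UnicodeScript.of(s.codePointAt(0));
    }

    public static void main(String[] args) {
        String traditional = "\u5167"; // nei, as typed with a Chinese (Zhuyin) input method
        String japanese = "\u5185";    // the shape used in Japanese (and simplified Chinese)

        System.out.println(describe(traditional) + " " + scriptOf(traditional));
        System.out.println(describe(japanese) + " " + scriptOf(japanese));
        System.out.println("same string? " + traditional.equals(japanese));
    }
}
```

Both characters report the script HAN, so the script alone does not tell a renderer whether a Japanese or a Chinese font is the better choice. They are nevertheless two separate code points, which is why a font can contain one and not the other.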
With our current knowledge, we should be able to “fix” Notepad by editing Segoe UI’s fallbacks. Luckily, everything behaves like it should—putting Microsoft JhengHei UI first and restarting the machine, Notepad looks like this:
The new look of Windows 11’s Notepad suggests that Microsoft changed something in its backend so that it is no longer affected by the Registry’s font substitution. The difference seems to lie in running classic Win32 vs. running WPF applications. Researching WPF a little, I found the following statement:
The WPF font fallback mechanism replaces previous Win32 font substitution technologies. https://docs.microsoft.com/en-us/dotnet/api/system.windows.media.fontfamily?view=netframework-4.7.1#font-fallback
The “composite font”, which will cleverly fall back to appropriate fonts (and not to MS UI Gothic for Chinese), seems to be implemented here: https://github.com/dotnet/wpf/blob/main/src/Microsoft.DotNet.Wpf/src/PresentationCore/Fonts/GlobalUserInterface.CompositeFont
In its entries for zh-Hant, one can see that Microsoft JhengHei UI comes before Yu Gothic UI and Meiryo UI (other fonts for Japanese), which is consistent with Windows 11’s way of choosing fonts. However, I don’t understand yet how the WPF runtime is able to render 外 in Microsoft JhengHei UI for me when it could also choose a Japanese font. Perhaps the locale of the user (or the locale the program is run with) is taken into account.
The takeaway (2022-08-12)
Let’s summarize all findings briefly:
- The font substitutions set in Windows’ Registry are used for Win32 applications. In Windows 11, Notepad’s runtime changed (it did become a “modern” app after all), so it works differently from Windows 10’s Notepad.
- WPF applications (and probably all of Windows’ newer runtimes) use a different mechanism for substituting fonts based on the so-called Global User Interface (composite font). You can even look it up in C:/Windows/Fonts/GlobalUserInterface.CompositeFont, although you don’t see it when you use Explorer’s built-in font manager. It is not a font but an XML file after all.
- JavaFX, on the other hand, does font substitution in a home-grown way and at runtime by looking at the fallback list for Tahoma, and Tahoma only. That’s why nothing changes when editing anything but Tahoma’s list in the Registry, and it is also the reason why I could simply close and restart Merit Cards to observe new changes.
I don’t see anyone complaining about this (at least for now), but if this is true, then it would mean that most Chinese characters are rendered using a Japanese font in all JavaFX applications when only relying on the default font. Manually setting the font to a Chinese one is possible, but could mess up Latin characters. JavaFX should probably do something similar to WPF instead of relying only on Tahoma’s fallback list. Perhaps there should be some kind of Language or Locale flag for Label/Text which can be used to look up a suitable font.
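To illustrate what I mean by a locale-based lookup, here is a rough sketch in plain Java. Nothing like this exists in JavaFX as far as I know: the class `LocaleFontSketch`, its methods, and the locale-to-font table are my own assumptions, with font names taken from the Registry lists above. A real implementation would need a much more complete table.

```java
import java.util.Locale;
import java.util.Optional;

public class LocaleFontSketch {

    // Decides whether a Chinese locale should use traditional characters.
    // Uses the script subtag if present, otherwise guesses from the region.
    static boolean isTraditional(Locale locale) {
        if (!locale.getScript().isEmpty()) {
            return locale.getScript().equals("Hant");
        }
        String region = locale.getCountry();
        return region.equals("TW") || region.equals("HK") || region.equals("MO");
    }

    // Hypothetical lookup: which CJK font to try first for a given locale.
    static Optional<String> preferredCjkFont(Locale locale) {
        switch (locale.getLanguage()) {
            case "ja":
                return Optional.of("Yu Gothic UI");
            case "ko":
                return Optional.of("Malgun Gothic");
            case "zh":
                return Optional.of(isTraditional(locale)
                        ? "Microsoft JhengHei UI"
                        : "Microsoft YaHei UI");
            default:
                return Optional.empty(); // no preference, keep the normal fallback order
        }
    }

    public static void main(String[] args) {
        System.out.println(preferredCjkFont(Locale.forLanguageTag("zh-Hant-TW")));
        System.out.println(preferredCjkFont(Locale.forLanguageTag("zh-CN")));
        System.out.println(preferredCjkFont(Locale.JAPAN));
        System.out.println(preferredCjkFont(Locale.ENGLISH));
    }
}
```

The result would only decide which font is tried first. The remaining fallback list would still be needed for characters that the preferred font does not contain.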
At the moment, the only solution that I've found is manually editing Tahoma's fallback list.
Open the Registry and navigate to Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink, then edit Tahoma and add the font you would like to see at the top. I’m reading and writing traditional Chinese characters, so I use Microsoft JhengHei UI:
MSJH.TTC,Microsoft JhengHei UI
MSGOTHIC.TTC,MS UI Gothic
MINGLIU.TTC,PMingLiU
SIMSUN.TTC,SimSun
GULIM.TTC,Gulim
YUGOTHM.TTC,Yu Gothic UI
MSYH.TTC,Microsoft YaHei UI
MALGUN.TTF,Malgun Gothic
SEGUISYM.TTF,Segoe UI Symbol
For simplified Chinese, Microsoft’s newer font is Microsoft YaHei UI, and a newer font for Japanese (compared to MS UI Gothic) is Yu Gothic UI.
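The reordering itself is trivial, but for completeness here is a sketch of it in Java. It works on a plain list of strings and does not touch the Registry; the class name `FallbackListSketch` and the method `promote` are made up for this post.

```java
import java.util.ArrayList;
import java.util.List;

public class FallbackListSketch {

    // Returns a copy of entries with `preferred` moved (or added) to the front.
    static List<String> promote(List<String> entries, String preferred) {
        List<String> result = new ArrayList<>(entries);
        result.remove(preferred); // no-op if the entry is not in the list yet
        result.add(0, preferred);
        return result;
    }

    public static void main(String[] args) {
        // Tahoma's fallback list as I found it in the Registry.
        List<String> tahoma = List.of(
                "MSGOTHIC.TTC,MS UI Gothic",
                "MINGLIU.TTC,PMingLiU",
                "SIMSUN.TTC,SimSun",
                "GULIM.TTC,Gulim",
                "YUGOTHM.TTC,Yu Gothic UI",
                "MSJH.TTC,Microsoft JhengHei UI",
                "MSYH.TTC,Microsoft YaHei UI",
                "MALGUN.TTF,Malgun Gothic",
                "SEGUISYM.TTF,Segoe UI Symbol");

        // One entry per line, ready to paste into the Registry editor.
        System.out.println(String.join("\n", promote(tahoma, "MSJH.TTC,Microsoft JhengHei UI")));
    }
}
```

For simplified Chinese or Japanese, the same call with `MSYH.TTC,Microsoft YaHei UI` or `YUGOTHM.TTC,Yu Gothic UI` would produce the corresponding list.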
Some days later (2022-08-14)
It seems the Java runtime itself has some kind of system to specify logical fonts (composite fonts in WPF) which one can access via Serif, SansSerif, Monospaced, Dialog, and DialogInput. There is also this document on how to specify fallbacks and search paths.
While you can use these key strings for -fx-font-family in JavaFX and get different kinds of fonts (e.g. Arial), I was not able to specify overrides or replacements using the config file from the documentation above.