Summary: Missing locl romanian magic
https://bugzilla.redhat.com/show_bug.cgi?id=455981
------- Additional Comments From bl.bugs@gmail.com 2008-07-21 08:05 EST -------
Ugh, Unicode seems to have made an even bigger mess out of this than I originally thought...
So, apparently U+015E-U+015F, U+0162-U+0163, and U+0218-U+021B can all still be used for Romanian, with an extra note attached to U+0218-U+021B saying that they should be used when a distinct shape with comma below is needed. So apparently you're still allowed to use the U+015E-U+015F and U+0162-U+0163 glyphs to write Romanian.
And since Unicode only cares about code points, it gives no clue as to how fonts or renderers are supposed to know when distinct glyphs are needed. Yet Unicode expects fonts and renderers to clean up the mess it has made.
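For reference, the code points in question can be listed with their Unicode names. This is just a small Python sketch using the standard `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata

# Code point ranges mentioned in this comment.
RANGES = [(0x015E, 0x015F), (0x0162, 0x0163), (0x0218, 0x021B)]

def describe(ranges):
    """Return (label, character, Unicode name) for every code point in the ranges."""
    rows = []
    for start, end in ranges:
        for cp in range(start, end + 1):
            rows.append((f"U+{cp:04X}", chr(cp), unicodedata.name(chr(cp))))
    return rows

for label, char, name in describe(RANGES):
    print(label, char, name)
```

The first four entries are the "WITH CEDILLA" letters and the last four are the "WITH COMMA BELOW" letters.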
> It should be done because locl is an *optional* font feature.
I thought it was mandatory if a language was passed to the renderer (but I may be wrong on this).
> Adobe introduced ROM/locl because they (and 99% of commercial fonts) remap "t with cedilla" to "t with comma" regardless of locale.
That's just bad; t with cedilla _is_ used sometimes. I think it was even proposed a long time ago for French, for cases where a t sounds like /s/, as in "relaţion" (didn't catch on unfortunately :-) ). Unicode itself mentions Semitic transliteration (but I guess that needs a lot of other glyphs those fonts don't have).
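To make the difference concrete, here is a toy Python sketch of a locale-sensitive substitution as opposed to an unconditional remap. It works on characters rather than glyphs, and the names `CEDILLA_TO_COMMA` and `shape` as well as the "ro" language code are assumptions for illustration only; a real font would express this as a GSUB locl lookup, not as code like this.

```python
# Toy model only: maps cedilla code points to their comma-below counterparts.
CEDILLA_TO_COMMA = {
    0x015E: 0x0218,  # S with cedilla -> S with comma below
    0x015F: 0x0219,  # s with cedilla -> s with comma below
    0x0162: 0x021A,  # T with cedilla -> T with comma below
    0x0163: 0x021B,  # t with cedilla -> t with comma below
}

def shape(text, language=None):
    """Substitute comma-below forms only when the language is Romanian."""
    if language != "ro":
        # No language, or another language: leave the cedilla forms alone.
        return text
    return text.translate(CEDILLA_TO_COMMA)

print(shape("\u0163", "ro") == "\u021b")   # substituted for Romanian
print(shape("\u0163") == "\u0163")         # untouched without a language
```

A font that puts t with comma below directly at U+0162-U+0163 behaves as if the `language` check were not there, which is exactly what breaks t with cedilla for everyone else.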
So far I've only found three Adobe fonts with Romanian glyphs, and two of them didn't have the locl rule, so it looks like Adobe doesn't do it often either. All three do indeed have t with comma below in place of t with cedilla. If you have documents with mixed diacritics you can blame it on that practice, _not_ on the absence of locl rules in the font.
I've also checked the MS Vista fonts once (they usually set the de facto standard since their fonts are the most widespread). Segoe UI and the new versions of Arial, Times New Roman etc. don't have locl rules or anything else, and they have t with cedilla at U+0162-U+0163 (I think the old versions known as the corefonts were pre-Unicode 3.0). The C-fonts, which were made by another foundry, have t with comma below at U+0162-U+0163 like the Adobe fonts, and have a salt (stylistic alternate) _and_ a locl feature that map the s with cedilla glyphs to s with comma below for Romanian.
Also, one thing I'm asking myself is: why doesn't Gentium have locl rules (or ccmp rules)? It's a more recent font compared to Doulos and Charis, so the SIL people seem to have changed their minds about it, and I'd like to know their reasons before changing anything in DejaVu.
So, short conclusion: how this is dealt with seems to depend on the foundry that made the fonts, and also on who you ask. So far, I haven't seen enough to be sure that a locl rule is needed.
Also, don't always assume commercial fonts have it right. As said above, some of them have t with comma below in place of t with cedilla, together with an s with cedilla, which is the worst thing you can do here.