On Friday 24 March 2006 13:34, James Wilkinson wrote:
A backup from an FC3 machine listed
SUPPORTED="en_GB.UTF-8:en_GB:en:en_US.UTF-8:en_US:en"
although I doubt both en references are strictly necessary.
OK, I've altered the file.
> Here's a sample -
>
> ../Mp3/marisa_monte/rose_and_charcoal/06_dan�_da_solid�.mp3
>
> The title should read
>
> 06_dança_da_solidäo.mp3
That's actually a different symptom of the same problem. UTF-8 takes two
bytes to store most common non-ASCII characters, whereas the ISO-8859
family always uses one byte.
What you first described was seeing the two UTF-8 bytes in an ISO-8859
program, so each accented character shows as two ISO-8859 characters
(some of which will probably be "illegal", so you'll see spaces or
something similar there).
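To illustrate the first symptom, here is a minimal Python sketch. Python
is an assumption made purely for illustration (none of the programs in
this thread are involved), and the word is taken from the sample
filename.

```python
# Illustrative sketch: one accented word stored in both encodings.
word = "dança"

utf8_bytes = word.encode("utf-8")          # ç becomes two bytes: 0xC3 0xA7
latin1_bytes = word.encode("iso-8859-1")   # ç is a single byte: 0xE7

print(len(utf8_bytes), len(latin1_bytes))  # 6 5

# UTF-8 bytes read by a program that assumes ISO-8859-1:
# the one accented character turns into two characters.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # danÃ§a
```

The five-letter word grows to six characters on screen, because each of
the two UTF-8 bytes is displayed as its own ISO-8859-1 character.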
It's quite possible that the two different displays were because, when first
attempting to troubleshoot this, I experimented by setting different
character sets in kde.
What you've just illustrated is an ISO-8859 name viewed in a UTF-8
environment, where two ISO-8859 characters are interpreted as one
illegal UTF-8 character.
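The reverse case can be sketched the same way. Again, Python is only an
assumption for illustration, and how a given program displays the
invalid bytes will vary.

```python
# Illustrative sketch: ISO-8859-1 bytes handed to a UTF-8 decoder.
latin1_bytes = "dança_da".encode("iso-8859-1")   # ç is the single byte 0xE7

# 0xE7 announces a multi-byte UTF-8 sequence, but the byte that follows
# is a plain ASCII letter, so strict decoding fails.
try:
    latin1_bytes.decode("utf-8")
    valid = True
except UnicodeDecodeError:
    valid = False
print(valid)

# A lenient decoder substitutes U+FFFD (the replacement character).
shown = latin1_bytes.decode("utf-8", errors="replace")
print(shown)
```

Python replaces only the offending byte and keeps the letter after it.
Some programs also swallow the following byte, which would match the
sample above, where the "a" after the accented character is missing.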
My first reaction is to blame the generating program (what was it?)
Grip generated the mp3s. I first saw the problem in k3b, but then in
konqueror and kmail, all under FC4.
In my experience, many MP3 programs, following Winamp's example, have
gone flat-out for skins and custom text-handling. Too many of them
don't support UTF-8 in $LANG properly.
Alternatively, what did the server box use to run? How did you transfer
the files? Red Hat went to UTF-8 early, and many other distros took a
lot longer to upgrade. And transferring files might not get the
conversion right.
It was running Mandriva 10.0. In truth, though, I can't remember whether the
box that generated the files was running Mdv 10.1 or 10.2. I don't think
10.0 had utf-8 (could be wrong) but it's very likely that I never elected to
use utf-8 when it first became available.
(You used to use Mandriva, didn't you? I'm not sure when they adopted
UTF-8...)
I wrote:
> As for the single e-mail -- I'd blame the other end, personally.
Anne said:
> Maybe. Maybe he has the same problem as I do.
Um. Mail clients have no business not knowing which encoding they're
using. And if they know that, they've no business not putting it into
the headers of outgoing e-mail properly.
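As a rough sketch of what "putting it into the headers" means, here is
how Python's standard email package declares the encoding. This is an
assumption for illustration only and says nothing about how kmail or
any other client mentioned here is implemented.

```python
from email.message import EmailMessage

# Build a message whose body contains non-ASCII text and state the
# encoding explicitly, so the receiving client does not have to guess.
msg = EmailMessage()
msg.set_content("06_dança_da_solidão.mp3\n", charset="utf-8")

# The declared charset travels in the Content-Type header.
print(msg["Content-Type"])
print(msg.get_content_charset())
```

A receiving client that reads the charset parameter can decode the body
correctly whatever its own locale happens to be.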
We've proved that your e-mail client can receive UTF-8. I suppose
there's still the chance that your correspondent used a weird encoding
that your client didn't understand. But you're not going to get the
"right" message anyway in those situations, except by blind luck.
Well thanks for the insights I've got, anyway. And finding convmv was another
good thing to come out of it. It all helps.
Anne
--
E-mail address: james | In the Royal Air Force a landing's OK,
@westexe.demon.co.uk  | If the pilot gets out and can still walk away.
                      | But in the Fleet Air Arm the outlook is grim,
                      | If your landings are duff and you've not learnt to swim.