All those weird Dutch characters were messing me up, and my Perl code was generating all kinds of cryptic errors, like "Malformed UTF-8 character", "Wide character in print", etc.
You see, buried deep in the Perl innards somewhere, something was choking badly on double-dotted characters like ë, ö, etc.
I struggled with a number of my own home-grown solutions, but they didn't work well in all situations.
Finally, in desperation, while browsing one of my favorite haunts for inspiration, the good old CPAN, I happened to spy the Encode module.
A little bell rang in my head, and I knew I was in the right place. Getting warm, getting warmer, warmer, boiling, boiling hot ... bingo.
$text = Encode::decode('iso-8859-1', $text);
That's the simple and basic little statement that saved me and made my day. What could possibly be easier than that, and why was it so hard for me to find?
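For anyone who wants to see where that statement might sit in a script, here is a minimal sketch. It assumes the incoming bytes really are ISO-8859-1 and that the output should be UTF-8; the sample string and the variable names $bytes and $utf8 are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the raw ISO-8859-1 bytes for "Citroën" (0xEB is ë in Latin-1).
my $bytes = "Citro\xEBn";

# Decode the raw bytes into a Perl character string.
my $text = decode('iso-8859-1', $bytes);

# Encode explicitly on the way out, so print receives bytes rather than characters.
my $utf8 = encode('UTF-8', $text);

binmode(STDOUT);    # write the bytes unchanged
print $utf8, "\n";
```

The idea is to decode once at the point where the data comes in and encode once at the point where it goes out, so everything in between works on characters instead of bytes.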
Maybe in the future I should visit CPAN more often and earlier, and not think so much of myself. As if I can always solve these grueling problems on my own.
I'm a pretty smart guy, but a little assistance from my Perl friends is always a welcome helping hand.
Nice read :)
I also read http://ahinea.com/en/tech/perl-unicode-struggle.html on this subject (the method you found is in the middle of the article).
Though I haven't actually worked with utf8 except for reading the utf8 docs and these two articles :)