Re: [Bug-gnupod] Encoding of non-ascii characters in GNUtunesDB.xml


From: H. Langos
Subject: Re: [Bug-gnupod] Encoding of non-ascii characters in GNUtunesDB.xml
Date: Tue, 22 Apr 2008 21:44:24 +0200
User-agent: Mutt/1.5.13 (2006-08-11)

Patch to the patch of the patch ... or rather not...

I rewrote the UTF-8 to XML-entities conversion once again. This time the
conversion itself uses only Perl regular expressions and no Unicode::String
methods at all.

Instead of making another patch, I'll simply paste the xescaped sub.

################################################################
# Escape chars
sub xescaped {
        my ($ret) = @_;
        $ret =~ s/&/&amp;/g;
        $ret =~ s/"/&quot;/g;
        $ret =~ s/\'/&apos;/g;
        $ret =~ s/</&lt;/g;
        $ret =~ s/>/&gt;/g;
        #$ret =~ s/^\s*-+//g;
        my $xutf = Unicode::String::utf8($ret)->utf8;
        #Remove 0x00 - 0x1f chars (we don't need them)
        $xutf =~ tr/\000-\037//d;
        #convert to XML encoded unicode
        $xutf =~ s/([\xC2-\xDF])([\x80-\xBF])/
                "&#".( ((ord($1) % 32) <<  6) + (ord($2) % 64) ).";"/eg;
        $xutf =~ s/([\xE0-\xEF])([\x80-\xBF])([\x80-\xBF])/
                "&#".( ((ord($1) % 16) << 12) + ((ord($2) % 64) <<  6)
                     +  (ord($3) % 64) ).";"/eg;
        $xutf =~ s/([\xF0-\xF4])([\x80-\xBF])([\x80-\xBF])([\x80-\xBF])/
                "&#".( ((ord($1) %  8) << 18) + ((ord($2) % 64) << 12)
                     + ((ord($3) % 64) <<  6) +  (ord($4) % 64) ).";"/eg;

        return $xutf;
}
################################################################
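
To make the bit arithmetic in the three substitutions easier to check, here is
a small standalone sketch of the same idea. It is not part of GNUpod: the names
entity_from_bytes and utf8_bytes_to_entities are made up for illustration, and
it assumes the input is a plain byte string holding valid UTF-8. It uses bit
masks (& 0x1F, & 0x0F, & 0x07, & 0x3F), which give the same result as the
% 32, % 16, % 8 and % 64 above.

```perl
use strict;
use warnings;

# Hypothetical helper: combine the payload bits of a lead byte with the
# low six bits of each continuation byte, then format a decimal
# character reference.
sub entity_from_bytes {
    my ($lead_bits, @cont) = @_;
    my $cp = $lead_bits;
    $cp = ($cp << 6) | (ord($_) & 0x3F) for @cont;
    return "&#" . $cp . ";";
}

# Hypothetical wrapper: replace 2-, 3- and 4-byte UTF-8 sequences in a
# byte string with XML character references. ASCII bytes pass through.
sub utf8_bytes_to_entities {
    my ($bytes) = @_;
    # 110xxxxx 10xxxxxx
    $bytes =~ s/([\xC2-\xDF])([\x80-\xBF])/entity_from_bytes(ord($1) & 0x1F, $2)/eg;
    # 1110xxxx 10xxxxxx 10xxxxxx
    $bytes =~ s/([\xE0-\xEF])([\x80-\xBF])([\x80-\xBF])/entity_from_bytes(ord($1) & 0x0F, $2, $3)/eg;
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    $bytes =~ s/([\xF0-\xF4])([\x80-\xBF])([\x80-\xBF])([\x80-\xBF])/entity_from_bytes(ord($1) & 0x07, $2, $3, $4)/eg;
    return $bytes;
}

print utf8_bytes_to_entities("caf\xC3\xA9"), "\n";       # U+00E9, 2 bytes
print utf8_bytes_to_entities("\xE2\x82\xAC 5"), "\n";    # U+20AC, 3 bytes
print utf8_bytes_to_entities("\xF0\x9F\x98\x80"), "\n";  # U+1F600, 4 bytes
```

For the first input, the lead byte 0xC3 contributes 0x03 and the continuation
byte 0xA9 contributes 0x29, so the code point is (3 << 6) + 41 = 233.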


cheers
-henrik




