From: David Shaw
Subject: Re: [pgp-keyserver-folk] Re: UTF-8/non-ASCII chars in keys (was Re: [Sks-devel] 1.0.8 patches)
Date: Fri, 22 Oct 2004 08:30:28 -0400
User-agent: Mutt/1.5.6i

On Wed, Oct 20, 2004 at 03:44:49PM -0400, Jason Harris wrote:
> On Wed, Oct 20, 2004 at 01:21:37PM -0400, David Shaw wrote:
> 
> > Just a general FYI with UTF-8 searches and GnuPG.  Versions before
> > 1.2.6 did not always do this properly for HKP keyservers, and in fact
> > sometimes truncated the search string.  The current release, and all
> > future releases use UTF-8.
> 
> In the last keyanalyze keydump, there were 102579 userids with
> old-style extended chars vs. 16682 with UTF-8 chars.  The UTF-8 count
> is based on counting instances of \xc3 from mutt's pgpring; \xc3 is
> also the old-style 'Ã', which may skew the counts somewhat.
> 
> If necessary, keyservers can convert the old-style userid strings into
> UTF-8 before parsing and storing them in the userid word database.
> 
> Right now, there is one instance of Noèl and Köthe (same key) to
> test UTF-8 searches with on hkp://keyserver.kjsl.com:11371 .  Only keys
> with new UTF-8 userids will be searchable by their correct (non-ASCII)
> userid words, however, until I fully reload the database to fix the
> existing (UTF-8) userids.  SKS will require a fix as well since its
> is_alnum() currently recognizes extended chars only from decimal
> 192 to 255 when parsing userids.  Note that both pks and SKS lowercase
> all ASCII characters internally for search purposes, but lowercasing
> UTF-8 characters may also be necessary.
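
As a rough illustration of the conversion and word-splitting described above, here is a minimal Python sketch. It is not pks or SKS code; the function names decode_userid and userid_words are hypothetical, and the fallback rule (try UTF-8 first, otherwise treat the bytes as Latin-1) is an assumption about what "old-style" means here.

```python
import re
from typing import List


def decode_userid(raw: bytes) -> str:
    """Decode a userid, assuming UTF-8 first and Latin-1 as the fallback.

    Caveat: a Latin-1 string whose bytes happen to form valid UTF-8
    (for example an 'Ã' followed by a byte in the range 0x80-0xBF)
    would be misread as UTF-8. This heuristic cannot tell them apart.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never fails.
        return raw.decode("latin-1")


def userid_words(raw: bytes) -> List[str]:
    """Split a userid into lowercase search words.

    str.lower() lowercases non-ASCII letters too, and the pattern
    [^\\W_]+ matches runs of Unicode letters and digits, not just
    ASCII plus the 192-255 range.
    """
    text = decode_userid(raw).lower()
    return re.findall(r"[^\W_]+", text)


if __name__ == "__main__":
    name = "Noèl Köthe"
    print(userid_words(name.encode("utf-8")))
    print(userid_words(name.encode("latin-1")))
```

With this approach both encodings of the same name yield the same index words, so one search string would match old and new userids alike.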

This seems to work well for me.  I am able to find Noèl Köthe with a
UTF-8 search.  Amusingly enough, if I intentionally use the wrong
pre-UTF-8 encoding, I get an older key of his...
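
To see why the two encodings can match different keys, the sketch below percent-encodes the same name both ways. The helper name search_url is hypothetical, and the /pks/lookup?op=index&search=... form is the usual HKP index query, assumed here rather than taken from this thread. The code only builds the URL strings and makes no network request.

```python
from urllib.parse import quote

# Keyserver named earlier in the thread; hkp:// is HTTP on port 11371.
KEYSERVER = "http://keyserver.kjsl.com:11371"


def search_url(name: str, encoding: str) -> str:
    """Build an HKP index query with the name percent-encoded
    from the bytes of the given character encoding."""
    return KEYSERVER + "/pks/lookup?op=index&search=" + quote(name, encoding=encoding)


if __name__ == "__main__":
    name = "Noèl Köthe"
    # UTF-8 sends two bytes per accented letter (0xC3 0xA8, 0xC3 0xB6).
    print(search_url(name, "utf-8"))
    # Latin-1 sends one byte per accented letter (0xE8, 0xF6).
    print(search_url(name, "latin-1"))
```

The server sees two different byte strings, so a userid stored in the old encoding only matches the second query, which would explain the older key turning up.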

David
