sks-devel
Re: UTF-8/non-ASCII chars in keys (was Re: [Sks-devel] 1.0.8 patches)


From: Jason Harris
Subject: Re: UTF-8/non-ASCII chars in keys (was Re: [Sks-devel] 1.0.8 patches)
Date: Wed, 20 Oct 2004 15:44:49 -0400
User-agent: Mutt/1.4.2.1i

On Wed, Oct 20, 2004 at 01:21:37PM -0400, David Shaw wrote:

> Just a general FYI with UTF-8 searches and GnuPG.  Versions before
> 1.2.6 did not always do this properly for HKP keyservers, and in fact
> sometimes truncated the search string.  The current release, and all
> future releases use UTF-8.

In the last keyanalyze keydump, there were 102579 userids with old-style
extended chars vs. 16682 with UTF-8 chars.  These counts are based on
counting instances of \xc3 in mutt's pgpring output; \xc3 is also the
old-style 'Ã', which may skew the counts somewhat.
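
To illustrate why counting \xc3 is only approximate, here is a minimal
Python sketch that classifies raw userid bytes by attempting a strict UTF-8
decode.  The function name classify_userid and the sample strings are
hypothetical and are not taken from keyanalyze or pgpring.

```python
def classify_userid(raw: bytes) -> str:
    """Return 'ascii', 'utf8', or 'legacy' for a raw userid byte string."""
    if all(b < 0x80 for b in raw):
        return "ascii"
    try:
        # Strict decoding rejects byte sequences that are not well-formed UTF-8.
        raw.decode("utf-8")
        return "utf8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume an old-style 8-bit encoding.
        return "legacy"


samples = [
    b"Noel Koethe",                    # plain ASCII
    "Noèl Köthe".encode("utf-8"),      # UTF-8 userid
    "Noèl Köthe".encode("latin-1"),    # old-style 8-bit userid
]
print([classify_userid(s) for s in samples])
```

Note that the bytes \xc3\xa9 are classified as UTF-8 ('é') even if the
author meant the old-style pair 'Ã©', so some ambiguity remains with this
approach too.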

If necessary, keyservers can convert the old-style userid strings into
UTF-8 before parsing and storing them in the userid word database.
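
A rough sketch of such a conversion step, in Python.  It assumes the
old-style strings are ISO-8859-1 (Latin-1), which the message does not
actually state; to_utf8 and its legacy_encoding parameter are hypothetical
names, not part of pks or SKS.

```python
def to_utf8(raw: bytes, legacy_encoding: str = "latin-1") -> bytes:
    """Return the userid as UTF-8 bytes, converting only when needed."""
    try:
        # Already well-formed UTF-8 (this includes pure ASCII): keep as-is.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # ASSUMPTION: anything else is in the legacy 8-bit encoding.
        return raw.decode(legacy_encoding).encode("utf-8")


old_style = "Köthe".encode("latin-1")
converted = to_utf8(old_style)
print(converted == "Köthe".encode("utf-8"))
```

Running the function on its own output returns the bytes unchanged, so it
could be applied to a mix of old-style and UTF-8 userids.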

Right now, there is one instance of Noèl and Köthe (same key) to
test UTF-8 searches with on hkp://keyserver.kjsl.com:11371 .  Until I
fully reload the database to fix the existing UTF-8 userids, however,
only keys with new UTF-8 userids will be searchable by their correct
(non-ASCII) userid words.  SKS will require a fix as well, since its
is_alnum() currently recognizes extended chars only from decimal
192 to 255 when parsing userids.  Note that both pks and SKS lowercase
all ASCII characters internally for search purposes, but lowercasing
non-ASCII UTF-8 characters may also be necessary.
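
The effect of a byte-range test on UTF-8 userids can be sketched in Python.
The first function is only a rough model of the behaviour described above
(ASCII alphanumerics plus bytes 192 to 255, ASCII-only lowercasing); it is
not SKS's actual code, and all names here are hypothetical.  The second
function shows one possible Unicode-aware alternative.

```python
from typing import List


def is_alnum_byte(b: int) -> bool:
    """Model: ASCII digits and letters, plus extended bytes 192..255."""
    return (48 <= b <= 57) or (65 <= b <= 90) or (97 <= b <= 122) or (192 <= b <= 255)


def byte_range_words(raw: bytes) -> List[bytes]:
    """Split on non-word bytes, lowercasing ASCII letters only."""
    words, cur = [], bytearray()
    for b in raw:
        if is_alnum_byte(b):
            cur.append(b + 32 if 65 <= b <= 90 else b)
        elif cur:
            words.append(bytes(cur))
            cur = bytearray()
    if cur:
        words.append(bytes(cur))
    return words


def unicode_words(raw: bytes) -> List[str]:
    """Decode as UTF-8, lowercase all characters, split on non-alphanumerics."""
    words, cur = [], []
    for ch in raw.decode("utf-8").lower():
        if ch.isalnum():
            cur.append(ch)
        elif cur:
            words.append("".join(cur))
            cur = []
    if cur:
        words.append("".join(cur))
    return words


userid = "Noèl Köthe".encode("utf-8")
print(byte_range_words(userid))  # words are cut at the UTF-8 continuation bytes
print(unicode_words(userid))
```

In the byte-range model, the UTF-8 lead byte \xc3 (195) counts as a word
character but the continuation bytes \xa8 and \xb6 do not, so each name is
split in two and neither is indexed as a whole word.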

-- 
Jason Harris           |  NIC:  JH329, PGP:  This _is_ PGP-signed, isn't it?
address@hidden _|_ web:  http://keyserver.kjsl.com/~jharris/
          Got photons?   (TM), (C) 2004

Attachment: pgp_M9nKh8bE9.pgp
Description: PGP signature

