Re: AW: treatment of U+002E that is produced by NFKC
From: Erik van der Poel
Subject: Re: AW: treatment of U+002E that is produced by NFKC
Date: Tue, 15 Jan 2008 06:30:07 -0800
Yes, that's right.
By the way, there may be a different way to address this issue. If
libidn has a separate API for NFKC or Nameprep, the caller could pass
the entire domain name (including all of the dots and dot-like
characters) through NFKC (or Nameprep) first, and then call the normal
IDNA routine. This is quite likely to behave the same way as MSIE 7
and Firefox 2. If you choose this approach, you could simply document
it somewhere, and callers could then decide whether or not to go
this way.
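
A rough sketch of that approach, written in Python rather than against
libidn's own API. The helper names nfkc_then_split and to_ascii are
hypothetical, Python's built-in "idna" codec stands in for the normal
per-label IDNA routine, and unicodedata uses the Unicode version bundled
with Python rather than the Unicode 3.2 tables that Nameprep specifies:

```python
import unicodedata


def nfkc_then_split(domain: str) -> list[str]:
    """Normalize the whole name with NFKC, then split into labels on U+002E.

    Hypothetical helper, not a libidn function. Any dot that NFKC produces
    (for example from U+2024 or U+248C) becomes a label separator here.
    """
    normalized = unicodedata.normalize("NFKC", domain)
    return normalized.split(".")


def to_ascii(domain: str) -> str:
    # Hand each label to the ordinary per-label IDNA step. Python's
    # IDNA 2003 codec is used as a stand-in for that routine.
    labels = nfkc_then_split(domain)
    return ".".join(label.encode("idna").decode("ascii") for label in labels)


print(nfkc_then_split("hi\u248ccom"))       # ['hi5', 'com']
print(nfkc_then_split("example\u2024org"))  # ['example', 'org']
print(to_ascii("hi\u248ccom"))              # hi5.com
```

With the whole name normalized first, "hi U+248C com" is split into two
labels, which is the behaviour described for MSIE 7 and Firefox 2 in
this thread.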
Erik
> >> I'm not yet sure whether actually providing a mechanism (like the
> >> one I proposed in the patch) to work around the problem is a good thing.
> >> The mechanism could just as well cause other problems.
> >
> > Yes, it is possible that that approach would cause other
> > incompatibility problems that I cannot think of at the moment, since
> > it is different from MSIE 7 and Firefox 2.
>
> Indeed. I've thought a bit about this, and there are some problems with
> my patch:
>
> 1) It only treats U+2024 as a dot. There are other code points whose
> compatibility decomposition contains U+002E, but none are as simple
> as U+2024. The relevant UnicodeData entries include:
>
> 2024;ONE DOT LEADER;Po;0;ON;<compat> 002E;;;;N;;;;;
> 2025;TWO DOT LEADER;Po;0;ON;<compat> 002E 002E;;;;N;;;;;
> 2026;HORIZONTAL ELLIPSIS;Po;0;ON;<compat> 002E 002E 002E;;;;N;;;;;
> 2488;DIGIT ONE FULL STOP;No;0;EN;<compat> 0031 002E;;1;1;N;DIGIT ONE PERIOD;;;;
> 2489;DIGIT TWO FULL STOP;No;0;EN;<compat> 0032 002E;;2;2;N;DIGIT TWO PERIOD;;;;
> ...
> 2498;NUMBER SEVENTEEN FULL STOP;No;0;EN;<compat> 0031 0037 002E;;;17;N;NUMBER SEVENTEEN PERIOD;;;;
> ...
> 249B;NUMBER TWENTY FULL STOP;No;0;EN;<compat> 0032 0030 002E;;;20;N;NUMBER TWENTY PERIOD;;;;
> 33C2;SQUARE AM;So;0;L;<square> 0061 002E 006D 002E;;;;N;SQUARED AM;;;;
> 33C7;SQUARE CO;So;0;L;<square> 0043 006F 002E;;;;N;SQUARED CO;;;;
> 33D8;SQUARE PM;So;0;L;<square> 0070 002E 006D 002E;;;;N;SQUARED PM;;;;
> FE52;SMALL FULL STOP;Po;0;CS;<small> 002E;;;;N;SMALL PERIOD;;;;
>
> It would be incorrect to treat all of these as dots as well. For
> example:
>
> ToASCII(hi U+248C com) = hi5.com
>
> If we extended my patch to cover U+248C, libidn would generate 'hi.com'
> instead of 'hi5.com'.
>
> Right now, both Firefox and libidn translate the input into the ASCII
> string hi5.com. Arguably Firefox is incorrect (wrt the RFC) in that it
> treats the string as two labels rather than one.
>
> 2) As you say, the patch is different from what MSIE/Firefox really
> implement. The only advantage of a new flag in libidn (that I see)
> would be if it did exactly the same as MSIE/Firefox. But it doesn't.
>
> Thus, my patch seems to be the wrong thing, and I'm not going to install
> it now.
>
> If someone wants to work on a patch against libidn that makes it
> implement the MSIE/Firefox algorithm, when a new IDNA flag is given,
> that would be something we could seriously consider applying. I'm
> currently too busy to do this on a pro-bono basis though.
>
> Thanks,
> /Simon
>
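
The list quoted above can be approximately reproduced by scanning for
code points whose NFKC form contains U+002E. This is only a sketch: the
function name dot_producers is hypothetical, and Python's unicodedata
reflects the Unicode version shipped with the interpreter, so the output
may include entries beyond those quoted (U+FF0E, for instance, which
IDNA already treats as a dot). The last lines contrast "replace the
character with a dot" against plain NFKC for the hi5.com case:

```python
import unicodedata


def dot_producers(limit: int = 0x10000) -> list[int]:
    """Code points below `limit`, other than U+002E itself, whose NFKC
    form contains U+002E. Surrogates are skipped."""
    return [
        cp
        for cp in range(limit)
        if cp != 0x2E
        and not 0xD800 <= cp <= 0xDFFF
        and "." in unicodedata.normalize("NFKC", chr(cp))
    ]


for cp in dot_producers():
    print(f"U+{cp:04X} -> {unicodedata.normalize('NFKC', chr(cp))!r}")

# Replacing U+248C with a dot drops the digit; NFKC keeps it.
name = "hi\u248ccom"
print(name.replace("\u248c", "."))          # hi.com
print(unicodedata.normalize("NFKC", name))  # hi5.com
```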