[Bug-wget] IDN and IRI tests fail on MS-Windows with wget 1.16.1


From: Eli Zaretskii
Subject: [Bug-wget] IDN and IRI tests fail on MS-Windows with wget 1.16.1
Date: Sat, 20 Dec 2014 10:28:53 +0200

I've looked into the failing tests.  Here's the list of failed tests
and my conclusions from looking at the logs and the test scripts:

     FAIL: Test-idn-headers.px
     FAIL: Test-idn-meta.px

   These use EUC_JP encoded file names, but do not pass
   --local-encoding on the wget command line, so the non-ASCII
   characters get mangled by Windows (because Windows tries to convert
   non-Unicode non-ASCII strings to the current system codepage).
   The Test-idn-* tests that do pass --local-encoding succeed.  Is it
   possible that the tests assume something about the local encoding,
   e.g. that it is UTF-8?
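
   To illustrate the kind of mangling I mean, here is a minimal Python
   sketch.  The sample file name and the cp1252 codepage are my own
   assumptions for illustration, not values taken from the test suite
   or from the machine the tests ran on.  It only shows what happens
   when bytes written as EUC-JP are read as if they were in a different
   single-byte codepage; it does not reproduce what wget or Windows do
   internally.

```python
# Illustrative sketch only: the sample name and cp1252 are assumptions,
# not values taken from the wget test suite.
name = "日本語.html"

# Bytes as a test script might write them when it assumes EUC-JP.
raw = name.encode("euc_jp")

# Reading the same bytes under a different single-byte codepage yields
# different characters, i.e. a mangled file name.
mangled = raw.decode("cp1252", errors="replace")

# Decoding with the encoding that produced the bytes recovers the name.
restored = raw.decode("euc_jp")

print(repr(mangled))
print(restored == name)
```

   The point is that the bytes alone do not say which encoding they
   are in, so something has to state it, which is presumably what
   --local-encoding is for.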
   
     FAIL: Test-iri.px
     FAIL: Test-iri-percent.px
     FAIL: Test-iri-forced-remote.px

   These fail due to non-ASCII file names.  It seems that the iri
   tests that succeed state their charset explicitly, like this:

     '/p1_fran%C3%A7ais.html' => {      # UTF-8 encoded
         code => "404",
         msg => "File not found",
         headers => {
             "Content-type" => "text/html; charset=UTF-8",
         },
         content => $page404,
     },

   while those that fail don't state the charset.  Again, is it
   possible that the tests make some implicit assumptions about the
   local encoding?
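
   As a small illustration of why the charset matters for such names,
   the Python sketch below percent-encodes the same path under two
   encodings.  The Latin-1 variant is my own assumption, picked only to
   show a contrasting result; I don't know which encoding the failing
   tests end up using.

```python
from urllib.parse import quote, unquote

path = "/p1_français.html"

# UTF-8 is the default for quote(); this matches the key in the
# test snippet quoted above.
utf8_path = quote(path)

# The same name percent-encoded under Latin-1 (an assumed alternative).
latin1_path = quote(path, encoding="latin-1")

print(utf8_path)
print(latin1_path)

# Decoding with the wrong charset does not give back the original name.
wrong = unquote(latin1_path, encoding="utf-8", errors="replace")
print(wrong == path)
```

   The two encoded paths differ, so a server and a client that
   disagree about the charset will not agree on the file name either.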

For the record: the test suite was run in the MSYS environment (Bash,
Make, Coreutils, etc.) with the MSYS build of Perl 5.8.8.

Please CC me on any replies, as I'm not subscribed to the list.

Thanks.

P.S. The manual's description of the IRI support in wget could use
some improvement.  It seems to assume that the reader is already
familiar with IRI and knows what it provides; that assumption is not
necessarily correct.  E.g., I have only a very vague idea of what IRI
does and what to expect from it.


