Re: [Bug-wget] bad filenames (again)


From: Eli Zaretskii
Subject: Re: [Bug-wget] bad filenames (again)
Date: Sun, 16 Aug 2015 17:43:50 +0300

> Date: Thu, 13 Aug 2015 19:10:41 +0200
> From: "Andries E. Brouwer" <address@hidden>
> Cc: address@hidden, "Andries E. Brouwer" <address@hidden>
> 
> +/* Used to determine whether bytes 128-159 are OK in a filename */
> +static int
> +have_utf8_locale() {
> +#if defined(WINDOWS) || defined(MSDOS) || defined(__CYGWIN__)
> +  /* insert some test for Windows */
> +#else
> +  char *p;
> +
> +  p = getenv("LC_ALL");
> +  if (p == NULL)
> +    p = getenv("LC_CTYPE");
> +  if (p == NULL)
> +    p = getenv("LANG");
> +  if (strstr(p, "UTF-8") != NULL || strstr(p, "UTF8") != NULL ||
> +      strstr(p, "utf-8") != NULL || strstr(p, "utf8") != NULL)
> +    return true;
> +#endif
> +  return false;
> +}
> [...]
> +  opt.restrict_files_highctrl = (have_utf8_locale() ? false : true);

I'm not sure this is the right way to fix this.  First, relying on
the UTF-8 locale being announced in the environment is less portable
than it could be: it's better to call 'setlocale' with NULL as the 2nd
argument to glean the same information.  Then the ugly #ifdef above
could be dropped, and at least Cygwin would not be excluded from this
feature.
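
A minimal sketch of the 'setlocale' query, assuming the program has
already called setlocale (LC_ALL, "") at startup so that the
environment has been applied.  The helper name locale_name_is_utf8 is
hypothetical; have_utf8_locale is the name from the quoted patch.  The
substring test is the same heuristic the patch uses, not a complete
codeset check.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if a locale name mentions a UTF-8 codeset.
   Uses the same substring heuristic as the quoted patch, but tolerates
   a NULL name.  */
static bool
locale_name_is_utf8 (const char *name)
{
  if (name == NULL)
    return false;
  return strstr (name, "UTF-8") != NULL || strstr (name, "UTF8") != NULL
         || strstr (name, "utf-8") != NULL || strstr (name, "utf8") != NULL;
}

/* Ask the C library for the current LC_CTYPE locale name instead of
   reading LC_ALL, LC_CTYPE and LANG by hand.  A NULL 2nd argument only
   queries the locale; it does not change it.  */
static bool
have_utf8_locale (void)
{
  return locale_name_is_utf8 (setlocale (LC_CTYPE, NULL));
}
```

On POSIX systems, nl_langinfo (CODESET) from <langinfo.h> reports the
codeset directly and might be a more robust alternative to matching on
the locale name.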

Moreover, even if the locale is not UTF-8, wget should attempt to
convert the file names to the current locale's encoding using iconv
(which I believe was what Tim suggested).  This will do the right
thing in many more cases than the above UTF-8-centric approach, IMO.
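
A rough sketch of such a conversion, for illustration only.  The
function name convert_file_name and its parameters are hypothetical
and are not taken from wget; the output buffer size is a simplifying
assumption, and a real implementation would grow the buffer on E2BIG
and decide what to do with unconvertible characters.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert NAME from encoding FROM_CODE to encoding
   TO_CODE.  Returns a malloc'd NUL-terminated string, or NULL if the
   conversion is unsupported or fails (for example on invalid input).  */
static char *
convert_file_name (const char *name, const char *from_code,
                   const char *to_code)
{
  iconv_t cd = iconv_open (to_code, from_code);
  if (cd == (iconv_t) -1)
    return NULL;

  size_t in_left = strlen (name);
  /* Assumption: 4 output bytes per input byte is enough; if it is not,
     iconv fails with E2BIG and we return NULL.  */
  size_t out_size = in_left * 4 + 1;
  char *out = malloc (out_size);
  if (out == NULL)
    {
      iconv_close (cd);
      return NULL;
    }

  char *in_ptr = (char *) name;   /* iconv does not modify the input */
  char *out_ptr = out;
  size_t out_left = out_size - 1; /* reserve room for the terminator */

  size_t rc = iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left);
  if (rc != (size_t) -1)
    /* Flush any pending shift sequence for stateful encodings.  */
    rc = iconv (cd, NULL, NULL, &out_ptr, &out_left);
  iconv_close (cd);

  if (rc == (size_t) -1)
    {
      free (out);
      return NULL;
    }
  *out_ptr = '\0';
  return out;
}
```

A caller might pass "UTF-8" as FROM_CODE, if the remote name is known
to be UTF-8, and the result of nl_langinfo (CODESET) as TO_CODE, and
fall back to the existing escaping when the function returns NULL.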

Thanks.


