bug-wget

Re: [Bug-wget] wget / character set


From: Alex Davies
Subject: Re: [Bug-wget] wget / character set
Date: Mon, 7 Sep 2009 12:24:00 +0100

Hi,

You are correct; the problem was the UTF-8 character set used by
Google Sites (which runs that site).

My solution is to run find over all the downloaded files and execute
recode on each one:

find /temp/path -type f -exec /usr/bin/recode -f ascii {} \;
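
For anyone without recode installed, a rough equivalent can be sketched
with tr. This is an illustrative alternative under stated assumptions,
not the command used above: it simply deletes every byte outside the
7-bit ASCII range rather than converting anything, and it writes to a
copy instead of editing in place. The helper name strip_non_ascii and
the sample file are hypothetical.

```shell
# Hypothetical helper: print a file with all non-ASCII bytes removed.
# LC_ALL=C makes tr work on raw bytes rather than multibyte characters.
strip_non_ascii() {
    LC_ALL=C tr -cd '\000-\177' < "$1"
}

# Demo on a scratch file containing "é" (C3 A9) and U+200E (E2 80 8E).
tmp=$(mktemp -d)
printf 'caf\303\251 \342\200\216menu\n' > "$tmp/page.html"

# Write the result to a copy so the original download is preserved.
strip_non_ascii "$tmp/page.html" > "$tmp/page.ascii.html"
cat "$tmp/page.ascii.html"   # prints: caf menu

rm -rf "$tmp"
```

Note that this drops accented letters entirely, so it only suits pages
where the non-ASCII bytes are stray marks such as U+200E.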

Many thanks,

Alex


On Mon, Sep 7, 2009 at 4:01 AM, Micah Cowan <address@hidden> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Alex Davies wrote:
>> Hi All,
>>
>> I am attempting to wget a page, but find that wget is mangling some of
>> the characters inside the page and I'm not quite sure why.
>
> Wget is completely incapable of mangling characters, as it doesn't do
> any sort of conversion on content whatsoever.
>
> I do see <U+200e> characters in vim... probably vim doesn't know what to
> do with those character codes. Viewing the result in a web browser gives
> identical results for me as visiting the site directly (apart from some
> coloring, probably from a CSS file that couldn't be found, and that
> issue's resolved by adding -k).
>
> - --
> Micah J. Cowan
> Programmer, musician, typesetting enthusiast, gamer.
> Maintainer of GNU Wget and GNU Teseq
> http://micah.cowan.name/
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkqkd3sACgkQ7M8hyUobTrEfRgCeJhgXgjWM6HmztaJV113cOJj1
> l+cAoIXMAVyvzo/QISgSDNCY1bTGNgav
> =yIGt
> -----END PGP SIGNATURE-----
>



-- 
Alex Davies




