bug-wget

Re: [Bug-wget] html parsing via wget


From: Micah Cowan
Subject: Re: [Bug-wget] html parsing via wget
Date: Mon, 02 Mar 2009 12:23:35 -0800
User-agent: Thunderbird 2.0.0.19 (X11/20090105)

Роман Мартынович wrote:
> Hello!
>
> I use wget on Windows to download HTML files from the Web to my PC. I live
> in Russia, so I download Russian sites. Sometimes the downloaded files are
> stored in the wrong encoding: they have charset=windows-1251 in their
> <meta> tag, but I have to choose the KOI8 encoding to get them to display
> correctly in Firefox, and in MS Notepad it's impossible to change the
> encoding. I can't find the reason why. I also cannot process these
> files in my applications.
>
> So I ask you to make it possible to choose the encoding of HTML files as an
> option, or, if this is a bug, to fix it.

Wget doesn't transcode files; it stores each one exactly as the server
sent it. We might add a transcoding feature at some point, but probably
not any time soon. We would also like to add arbitrary post-download
filters eventually, which could probably be used to address this sort
of thing.
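
In the meantime, a small script run after the download can do the
conversion. Here is a minimal sketch in Python, assuming the files are
really KOI8-R and that you want them to match the windows-1251 declared
in the meta tag. The function names are made up for illustration and
are not part of Wget.

```python
from pathlib import Path


def transcode_bytes(data: bytes, src: str = "koi8_r", dst: str = "cp1251") -> bytes:
    """Decode bytes from the src encoding and re-encode them in dst.

    Assumption: src is the real encoding of the data. Python raises
    UnicodeEncodeError if a character has no equivalent in dst.
    """
    return data.decode(src).encode(dst)


def transcode_file(path: str, src: str = "koi8_r", dst: str = "cp1251") -> None:
    """Rewrite one downloaded file in place (hypothetical helper)."""
    p = Path(path)
    p.write_bytes(transcode_bytes(p.read_bytes(), src, dst))
```

Calling transcode_file("index.html") on a saved page would rewrite it
so that its bytes agree with the declared charset. Plain ASCII markup
passes through unchanged, since both encodings share the ASCII range.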

The real problem, though, is that whoever created the files set the meta
tag incorrectly; you should contact the site to address this problem.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/