Re: [Bug-wget] Problem with ÅÄÖ and wget
From: Tim Ruehsen
Subject: Re: [Bug-wget] Problem with ÅÄÖ and wget
Date: Tue, 24 Sep 2013 10:38:30 +0200
User-agent: KMail/4.10.5 (Linux/3.10-3-amd64; KDE/4.10.5; x86_64; ; )

On Monday 23 September 2013 23:32:39 Ángel González wrote:
> On 17/09/13 09:49, Tim Ruehsen wrote:
> > On Tuesday 17 September 2013 00:17:21 Ángel González wrote:
> >>> [1] http://nikitathespider.com/articles/EncodingDivination.html
> >>
> >> Note that these steps are outdated now (that was written in 2008 at
> >> the latest).
> >
> > Outdated by exactly what? RFC 3986 is from 2005 and does not contradict
> > [1]. See my explanation above.
>
> By the HTML Living Standard (formerly known as HTML5)
> http://www.whatwg.org/specs/web-apps/current-work/multipage/
>
> The Content-Type header is sometimes overridden, ISO-8859-1 now means
> windows-1252,
> there are some well-defined guessing steps when there's such a need...
Just for completeness: these guessing steps, called the "encoding sniffing
algorithm", are described in section 12.2.2.2.
But they apply only when, as the standard puts it, "In some cases, it might
be impractical to unambiguously determine the encoding before parsing the
document.".
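
To make the idea of such guessing steps concrete, here is a minimal sketch
in Python. It is not the algorithm from the HTML Living Standard: the
function name guess_encoding, the order of the checks (BOM, then the
Content-Type charset, then a <meta> scan of the first 1024 bytes) and the
windows-1252 fallback are all simplifying assumptions made for illustration.

```python
import re

# Byte order marks checked first (assumed order, for illustration only).
_BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16le"),
    (b"\xfe\xff", "utf-16be"),
]
# charset=... in a Content-Type header value.
_CHARSET = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)
# charset=... anywhere inside a <meta ...> tag in the raw bytes.
_META = re.compile(rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)",
                   re.IGNORECASE)


def guess_encoding(body, content_type=None, default="windows-1252"):
    """Hypothetical helper: return a lower-cased encoding label for body."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    if content_type:
        match = _CHARSET.search(content_type)
        if match:
            return match.group(1).lower()
    # Only look at the start of the document (1024 bytes is an assumption).
    match = _META.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return default  # assumed fallback, not taken from the standard
```

For example, guess_encoding(b"<p>x</p>", "text/html; charset=ISO-8859-1")
yields the label from the header, while a body with no BOM, no header
charset and no <meta> tag falls through to the assumed default.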
I found this ISO-8859-1 / windows-1252 issue mentioned on the Wikipedia
'windows-1252' page, but couldn't find it on the HTML Living Standard pages.
Could you give me a pointer, please?
What do you think: how can we address the ISO-8859-1 / windows-1252 encoding
issue (and should we)? As I understand it, it only applies to HTML5...
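
To show where the two encodings actually differ, here is a small Python
sketch using Python's built-in "latin-1" and "cp1252" codecs. The function
name differing_bytes is hypothetical. Note that Python's cp1252 codec
leaves a few bytes unassigned, which the sketch simply counts as differing.

```python
def differing_bytes():
    """Return the byte values that ISO-8859-1 and windows-1252 decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            cp1252 = None  # byte is unassigned in Python's cp1252 codec
        if latin1 != cp1252:
            diffs.append(value)
    return diffs


# The letters from the subject line decode identically in both encodings.
same = b"\xc5\xc4\xd6".decode("latin-1") == b"\xc5\xc4\xd6".decode("cp1252")
```

The differences are confined to the bytes 0x80 to 0x9F, where ISO-8859-1
has control characters and windows-1252 has printable ones such as the
euro sign. ÅÄÖ (0xC5 0xC4 0xD6) lie outside that range.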
Is there a practical need for the sniffing algorithm?
Do you know of any real web sites / pages where the encoding is ambiguous?

Tim