Re: [Bug-wget] Problem with ÅÄÖ and wget

From:	Tim Ruehsen
Subject:	Re: [Bug-wget] Problem with ÅÄÖ and wget
Date:	Tue, 24 Sep 2013 10:38:30 +0200
User-agent:	KMail/4.10.5 (Linux/3.10-3-amd64; KDE/4.10.5; x86_64; ; )

On Monday 23 September 2013 23:32:39 Ángel González wrote:
> On 17/09/13 09:49, Tim Ruehsen wrote:
> > On Tuesday 17 September 2013 00:17:21 Ángel González wrote:
> >>> [1] http://nikitathespider.com/articles/EncodingDivination.html
> >> 
> >> Note that these steps are outdated now (that was written in 2008 at
> >> the latest).
> > 
> > Outdated by what, exactly? RFC 3986 is from 2005 and does not contradict
> > [1]. See my explanation above.
> 
> By the HTML Living Standard (formerly known as HTML5)
> http://www.whatwg.org/specs/web-apps/current-work/multipage/
> 
> The Content-Type header is sometimes overridden, ISO-8859-1 now means
> windows-1252,
> there are some well-defined guessing steps when there's such need...

Just for completeness: these guessing steps, called the "encoding sniffing
algorithm", are described in section 12.2.2.2.
But they apply only where, to quote, "In some cases, it might be impractical
to unambiguously determine the encoding before parsing the document.".
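
To make the discussion concrete, here is a rough Python sketch of what such
guessing steps could look like. It is not the algorithm from the standard:
the function name sniff_encoding, the order of the checks, the regular
expressions, and the windows-1252 fallback are all my own assumptions, and
the 1024-byte window is only meant to illustrate a bounded pre-scan.

```python
import re

# Byte order marks and the encodings they indicate.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# Simplified patterns (assumptions, not taken from any specification).
HEADER_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.I)
META_RE = re.compile(rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.I)


def sniff_encoding(body, content_type=None, fallback="windows-1252"):
    """Guess the encoding of an HTML body (hypothetical helper)."""
    # 1. A byte order mark at the start of the body.
    for bom, name in BOMS:
        if body.startswith(bom):
            return name
    # 2. The charset parameter of the Content-Type header, if present.
    if content_type:
        match = HEADER_RE.search(content_type)
        if match:
            return match.group(1).lower()
    # 3. A <meta> charset declaration near the start of the document.
    match = META_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Nothing found: fall back to a default (assumed here).
    return fallback
```

For example, sniff_encoding(b"<html>", "text/html; charset=ISO-8859-1")
yields "iso-8859-1" from the header alone, without looking at the markup.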

I found this iso-8859-1 / windows-1252 issue mentioned on the Wikipedia
'windows-1252' page, but couldn't find it in the HTML Living Standard.
Could you give me a pointer, please?

What do you think: how can we address the iso-8859-1 / windows-1252 issue,
and should we at all? As I understand it, it only applies to HTML5...
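
A small Python sketch of where the two encodings actually differ, and of
what treating the iso-8859-1 label as windows-1252 would amount to. The
helper resolve_label and its alias set are hypothetical and deliberately
minimal; this is not code from wget.

```python
# Bytes 0x80-0x9F are C1 control characters in ISO-8859-1 but mostly
# printable characters in windows-1252.
data = bytes([0x80, 0x93, 0x94])
as_latin1 = data.decode("iso-8859-1")    # U+0080, U+0093, U+0094 (controls)
as_cp1252 = data.decode("windows-1252")  # euro sign and curly double quotes

# The letters from the subject line are encoded identically in both.
same_bytes = "ÅÄÖ".encode("iso-8859-1") == "ÅÄÖ".encode("windows-1252")

# Assumed, minimal alias set for illustration only.
LATIN1_LABELS = {"iso-8859-1", "latin1"}


def resolve_label(label):
    """Map a declared charset label to the decoder to use (hypothetical)."""
    name = label.strip().lower()
    return "windows-1252" if name in LATIN1_LABELS else name
```

So for text like ÅÄÖ the remapping makes no difference; it only matters
for pages that declare iso-8859-1 but contain bytes in the 0x80-0x9F range.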

Is there a practical need for the sniffing algorithm?
Do you know of any real web sites / pages where the encoding is ambiguous?

Tim
