
Re: lynx-dev HTML4.0 and default charset


From: David Woolley
Subject: Re: lynx-dev HTML4.0 and default charset
Date: Wed, 3 Mar 1999 08:51:45 +0000 (GMT)

> 
> > 4) Although META headers are supposed to control the server, I have
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^???
> no, they are for clients.

This whole area is a mess, and it looks as though the W3C have been
doing a lot of firefighting to try to work round abuses by browser
writers and content providers.

Section 7.4.4 of the HTML 4.0 spec more or less says that http-equiv
is only for servers - it certainly makes it clear that servers can
act upon it.

However, section 5.2.2 makes an exception for charset, subject to
there being no charset specified in the HTTP headers.  It is fairly
clear from the context that this is a hack to force the use of correct
charset information (in spite of our Ukrainian friend who effectively
wanted the browser user to identify and specify the character set in all
cases).
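
As a rough illustration of that precedence (HTTP header first, META
only as a fallback), here is a minimal Python sketch.  The function
name pick_charset, the 1024-byte scan window, the simplistic regular
expressions and the ISO-8859-1 fallback are all my own assumptions for
the example; none of them is taken from the spec or from Lynx.

```python
import re

# Hypothetical helper for illustration only; the patterns are deliberately
# simplistic and do not cover every legal way of writing these attributes.
_CT_CHARSET = re.compile(r'charset\s*=\s*"?([^\s;"]+)', re.I)
_META = re.compile(
    rb'<meta\s+http-equiv\s*=\s*["\']?content-type["\']?\s+'
    rb'content\s*=\s*["\']([^"\']*)["\']',
    re.I,
)


def pick_charset(content_type_header, body, default="ISO-8859-1"):
    """Return the charset to use for a document.

    content_type_header: value of the HTTP Content-Type header, or None.
    body: the raw document bytes.
    default: assumed fallback when nothing is declared anywhere.
    """
    # 1. A charset in the HTTP header always wins.
    if content_type_header:
        m = _CT_CHARSET.search(content_type_header)
        if m:
            return m.group(1)
    # 2. Otherwise look for META http-equiv near the start of the body.
    m = _META.search(body[:1024])
    if m:
        inner = _CT_CHARSET.search(m.group(1).decode("ascii", "replace"))
        if inner:
            return inner.group(1)
    # 3. Nothing declared: fall back to the assumed default.
    return default
```

With a body containing a META element that says iso-8859-5, passing a
header of "text/html; charset=koi8-r" yields koi8-r, while a header of
plain "text/html" yields iso-8859-5.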

I suspect this whole area of http-equiv is a rearguard action to
try to recover from what browser developers have done to weaken the role
of servers.

The consequences of a server failing to act on http-equiv information
are that:

- older browsers won't have access to character set information, even if
  they can handle it in HTTP headers;

- you cannot use META elements to control caching, as the major external
  cache products do not examine the contents of the resources they cache
  (they don't even care whether they are HTML).

I have seen instances of the second problem on the squid mailing list.
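
To make "a server acting on http-equiv" concrete, here is a minimal
Python sketch of the idea: scan the document for META http-equiv
elements before the body starts and turn them into real HTTP response
headers, so that caches and older browsers see them.  The class and
function names are hypothetical, and I am not claiming that any
particular server does it this way.

```python
from html.parser import HTMLParser


class HttpEquivCollector(HTMLParser):
    """Collect (name, value) pairs from <meta http-equiv=... content=...>."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        if self._in_body:
            return  # only elements before <body> are considered
        if tag == "body":
            self._in_body = True
        elif tag == "meta":
            attributes = dict(attrs)  # attribute names arrive lowercased
            name = attributes.get("http-equiv")
            value = attributes.get("content")
            if name and value is not None:
                self.headers.append((name, value))


def headers_from_meta(html_text):
    """Return the HTTP headers a server could emit for this document."""
    collector = HttpEquivCollector()
    collector.feed(html_text)
    collector.close()
    return collector.headers
```

A server using something like this would send the returned pairs as
ordinary response headers, which is the only place an external cache
will look for Expires or similar.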

Also, there are cases (e.g. 16-bit encoded Unicode) where only the
server can specify the charset, as the encoding of the ASCII subset is
not identical to ASCII, so a client cannot read the META element until
it already knows the encoding.
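
A small Python demonstration of the 16-bit case, assuming a client that
naively scans the raw bytes for an ASCII "<meta" (the helper name
ascii_scan_finds_meta is made up for the example):

```python
META = '<meta http-equiv="Content-Type" content="text/html; charset=utf-16">'


def ascii_scan_finds_meta(raw):
    """Naive client-side check: look for an ASCII '<meta' in the raw bytes."""
    return b"<meta" in raw.lower()


# In ISO 8859/1 the ASCII subset is byte-for-byte ASCII, so the scan works.
as_latin1 = META.encode("iso-8859-1")

# In big-endian UTF-16 every ASCII character is preceded by a zero byte,
# so the same scan never sees the element.
as_utf16 = META.encode("utf-16-be")

found_in_latin1 = ascii_scan_finds_meta(as_latin1)
found_in_utf16 = ascii_scan_finds_meta(as_utf16)
```

Here found_in_latin1 is true and found_in_utf16 is false, because the
UTF-16 bytes start 00 3C 00 6D rather than 3C 6D.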

As to an earlier article on defaulting character sets, my guess would
be that the intention was that the lowest level of user who is permitted
to override the character set should have the default set to ISO 8859/1.
This can't be enforced by technical means because the browser cannot
know when a user has multiple roles.  So, if anonymous users are allowed
to override for the current session, the session should always start in
ISO 8859/1, but if they are locked into KOI-8, then it is the installer
who should be presented with the ISO 8859/1 default.  A user with their
own configuration file and the ability to change the default should have
the account created with the charset set to ISO 8859/1.

This is probably politically unacceptable for many service providers 
outside the ISO 8859/1 area, or even within that area where "just send
Windows charset" is the norm, but who don't want to lock users into a
particular default.

Note that RFC 2068 is effectively only an early draft, and at least one
more recent draft expires on March 11th, so there may still be time to
get in comments before HTTP 1.1 is finalised.  (I haven't checked to
see if this has been superseded.)
