lynx-dev
Re: lynx-dev [PATCH][dev21] the binary size battle: disabling charsets


From: Klaus Weide
Subject: Re: lynx-dev [PATCH][dev21] the binary size battle: disabling charsets
Date: Wed, 31 Mar 1999 19:54:44 -0600 (CST)

On Wed, 31 Mar 1999, Bela Lubkin wrote:
> Klaus Weide wrote:
> > On Tue, 30 Mar 1999, Bela Lubkin wrote:
> > > I would guess that the code to handle loading charsets as needed would
> > > be perhaps 1-2K of binary size.  So we'd still get about the same net
> > > gain, except *everyone* would get it, and we wouldn't have to think
> > > about whether to turn it on.
> > 
> > I think you greatly underestimate the effort.  1-2K - no way.  (Pick up
> > the challenge and prove me wrong...)
> > 
> > There's also the runtime overhead of loading those files at program start.
> > Or, if loading is not happening at startup but on demand, additional logic
> > is needed to pre-register available charsets - either at installation time
> > (losing part of the flexibility advantage, and introducing additional error
> > modes) or at startup (requiring at least lots of stat()s).
> 
> I envisioned it as on-demand loading, so that in a session where (say)
> you only looked at the 7-bit ASCII Lynx Help files, other tables
> wouldn't have been loaded at all.
> 
> You imply that we need to know ahead of time which charsets are
> available.  I couldn't find that in the code.  For instance, we
> construct accept-charsets headers only from user input (lynx.cfg,
> .lynxrc, or options menu); those could be checked when they're entered
> (verify that matching charset files exist).  Other than that, we
> construct a menu in LYOptions.c, and several functions in UCdomap.c
> search for a charset by mime name.  In all cases it looked like the
> functions could first search the already-loaded list, then (if
> necessary) the directory where charset data is stored.

How would you fill the "Display character set" and "Assumed document
character set" SELECT lists?  Those strings cannot be derived from just
checking whether a file exists or not.  So you'd either have to open each
file and start reading (whatever format they would be in), or have those
name strings still compiled in.
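
To make the "open each file and start reading" option concrete, here is a
minimal sketch.  It assumes a hypothetical text header in which a line
starting with `M` carries the MIME name and a line starting with `O` carries
the name shown in the options lists; the function names and the header layout
are illustrative assumptions, not existing Lynx code.

```c
#include <stdio.h>
#include <string.h>

/* Copy src up to (not including) the line ending, truncating to fit dst.
 * dstlen must be at least 1. */
static void copy_field(char *dst, size_t dstlen, const char *src)
{
    size_t n = strcspn(src, "\r\n");

    if (n >= dstlen)
        n = dstlen - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/*
 * Read only as far as needed to find both names in an ASSUMED header
 * format ('M' = MIME name, 'O' = options-menu name).
 * Returns 1 if both were found, 0 otherwise.
 */
static int read_charset_names(FILE *fp,
                              char *mime, size_t mime_len,
                              char *opt, size_t opt_len)
{
    char line[256];
    int have_mime = 0, have_opt = 0;

    while (!(have_mime && have_opt) && fgets(line, sizeof(line), fp)) {
        if (line[0] == 'M') {
            copy_field(mime, mime_len, line + 1);
            have_mime = 1;
        } else if (line[0] == 'O') {
            copy_field(opt, opt_len, line + 1);
            have_opt = 1;
        }
    }
    return have_mime && have_opt;
}
```

Even with an early exit like this, filling a SELECT list still costs one
open and a few reads per installed table, which is the overhead in question.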

There are also the UCCan*Translate* and UCNeedNotTranslate functions, which
may want to know whether some table exists (without using it); they might
have to open or at least stat a file quite often.
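
One way to limit the repeated stat() calls would be to remember the answer
per charset name.  The sketch below is only an illustration under stated
assumptions: the `.uct` file extension, the single table directory, and all
identifiers are hypothetical, and the stat counter exists only to show the
effect.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define TBL_CACHE_MAX 64
#define TBL_NAME_MAX  64

enum tbl_state { TBL_PRESENT = 1, TBL_ABSENT = 2 };

struct tbl_cache_entry {
    char name[TBL_NAME_MAX];
    enum tbl_state state;
};

static struct tbl_cache_entry tbl_cache[TBL_CACHE_MAX];
static int tbl_cache_len = 0;
static int tbl_stat_calls = 0;      /* illustration only */

/* Ask the filesystem whether <dir>/<mime_name>.uct is a regular file.
 * The ".uct" extension is a made-up placeholder. */
static int probe_table(const char *dir, const char *mime_name)
{
    char path[1024];
    struct stat sb;
    int n = snprintf(path, sizeof(path), "%s/%s.uct", dir, mime_name);

    if (n < 0 || (size_t) n >= sizeof(path))
        return 0;                   /* path too long: treat as absent */
    tbl_stat_calls++;
    return stat(path, &sb) == 0 && S_ISREG(sb.st_mode);
}

/* Return 1 if a table for mime_name exists, consulting the cache first.
 * The cache is keyed on the name alone, so it assumes one directory. */
static int table_exists(const char *dir, const char *mime_name)
{
    int i, found;

    for (i = 0; i < tbl_cache_len; i++)
        if (strcmp(tbl_cache[i].name, mime_name) == 0)
            return tbl_cache[i].state == TBL_PRESENT;

    found = probe_table(dir, mime_name);
    if (tbl_cache_len < TBL_CACHE_MAX && strlen(mime_name) < TBL_NAME_MAX) {
        strcpy(tbl_cache[tbl_cache_len].name, mime_name);
        tbl_cache[tbl_cache_len].state = found ? TBL_PRESENT : TBL_ABSENT;
        tbl_cache_len++;
    }
    return found;
}
```

This keeps it to one stat() per distinct name, but it is exactly the kind of
additional logic (and additional state that can go stale) that the size
estimate has to account for.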

What format would those dynamically loadable tables be in?  Neither the
.tbl nor the .h format currently used seems appropriate.  Probably
there should be a different intermediate format.  Reading the current
.tbl format directly seems very inefficient.  makeuctb is a bit of a
memory hog now (note the 255 in
/*
 *  Massive overkill, but who cares?
 */
unicode unitable[MAX_FONTLEN][255];
) because, as a separate process that runs only at compile time, it can
afford to waste memory.  It's probably also a runtime hog - it's not
a one-pass process.

If all replacement strings were to be allocated individually when a
table file is read in, there would be lots of small mallocs - or we
need some kind of memory pool mechanism.  Also, at least in principle,
if all the table data is in a read-only segment in the binary, the memory
can be shared by several processes (don't know whether that actually
happens anywhere) - instead of having all lynx processes load the same
duplicated data into their private memory.
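
For illustration, the "memory pool mechanism" could look roughly like the
following: replacement strings are copied into a few large chunks instead of
getting one malloc each.  This is a sketch with hypothetical names, not code
from Lynx, and it uses a C99 flexible array member for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* One large block holding many short strings back to back. */
typedef struct PoolChunk {
    struct PoolChunk *next;
    size_t used;
    size_t size;
    char data[];                /* C99 flexible array member */
} PoolChunk;

typedef struct {
    PoolChunk *head;            /* most recently allocated chunk */
    size_t chunk_size;          /* default payload size per chunk */
} StrPool;

static void pool_init(StrPool *p, size_t chunk_size)
{
    p->head = NULL;
    p->chunk_size = chunk_size;
}

/* Copy s into the pool; returns NULL if malloc fails. */
static const char *pool_strdup(StrPool *p, const char *s)
{
    size_t need = strlen(s) + 1;
    PoolChunk *c = p->head;
    char *dst;

    if (c == NULL || c->size - c->used < need) {
        /* Start a new chunk, large enough even for an oversized string. */
        size_t size = need > p->chunk_size ? need : p->chunk_size;

        c = malloc(sizeof(*c) + size);
        if (c == NULL)
            return NULL;
        c->next = p->head;
        c->used = 0;
        c->size = size;
        p->head = c;
    }
    dst = c->data + c->used;
    memcpy(dst, s, need);
    c->used += need;
    return dst;
}

/* Release every chunk at once; individual strings are never freed. */
static void pool_free(StrPool *p)
{
    PoolChunk *c = p->head;

    while (c != NULL) {
        PoolChunk *next = c->next;

        free(c);
        c = next;
    }
    p->head = NULL;
}
```

Note that pooled data still lives in each process's private heap, so this
addresses only the many-small-mallocs concern, not the sharing of read-only
table data between processes.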

Granted, these may all be rationalizations for the current state of
affairs...  but it seems to me the benefits are at least questionable.
If we had hundreds of chartrans files to deal with, I might see it
differently; but there's still the fact that one could select among
them at compile time.

Having as much as possible compiled _into_ lynx should also be of benefit
especially to you, Bela, with ad hoc lynx binaries on new machines...
unless of course you never need charset translation anyway, but in that
case you shouldn't compile it in at all.

   Klaus

