gnustep-dev

typesetting text


From: Nicola Pero
Subject: typesetting text
Date: Thu, 14 Dec 2000 22:08:01 +0100 (CET)

I have spent some days studying text drawing and editing issues. 

It's a very complex matter.

Ahem - I just reread this mail.  I'd like to see what people think of this
stuff - but don't take this as my last word on it, even if my careless
English makes it seem I'm very convinced of it.  I'm just trying to
collect ideas about how to do things - it's very preliminary.

The matter is already complex for Latin-script languages if we want to do
good typesetting.  Just to get readable typesetting (not even good
typesetting) for Japanese, Arabic or Hebrew (quite important and
widespread languages), the thing becomes extremely complicated.  We can't
ignore this - because we will need to implement it sooner or later - so
it's better to face it from the very beginning.  Still - I don't want to
write a single line of code which does anything more complex than laying
out trivial Latin-1 text without ligatures and kerning :-)

There is quite a set of issues to be resolved.
Basically, the problem is: how do we render a Unicode string on screen?

Personally, as I will repeat below, I think it's a gross mistake to expose
the internals of NSLayoutManager in the API.  I mean the stuff about
glyphs.  It's of no real use - it's too obscure and complex to use
directly or to subclass in practice - and it rudely breaks the
encapsulation of the way NSLayoutManager lays out the text, caches layout
info and draws.

The point is that there is currently a lot of confusion and
experimentation around rendering Unicode.  We want to be able to switch to
a standard, or use some good standard library if there ever is one - or
simply change/improve/extend/adapt the way we lay out text in the future -
and we can't do this if all the private internal layout machinery is
grossly exposed in that way.  I really think NSLayoutManager should hide
all the glyph stuff and everything regarding the way its internal data
structures, algorithms or caches work.  Of course it will have methods
which can be called privately by other NSText* components in certain
circumstances - but this is quite different from making all the details of
the internal private layout machinery available in a public API.

- anyway - 

My resulting idea is that the basic step to successfully render Unicode
stuff is to render by words.  It's difficult to break text below the level
of a word without getting into big trouble with strange languages (we get
into trouble somewhere anyway with strange languages, but perhaps we can
postpone it).  But it seems most if not all languages which could
potentially be typeset on a computer break things into words separated by
spaces (or corresponding characters).  The rendering of a character (or of
a group of characters) in a word depends in unpredictable ways on the
other characters in the word (even in Latin-script languages when you have
a good font with single and double ligatures and kerning tables) - but it
seems words are independent of each other - so we are safe if we render
one word at a time, each independently of the others.

So - I see a design more or less as follows: we break the text into words.
We somehow cache word boundaries if needed for performance.  Then, we
have two `basic' functions in a sort of text backend: one gives us the
bounding box of a rendered word in a given font, and the other one renders
the word in the given font at a given position.  We call the bounding box
function for each word in the text, and cache all this information.  We
compute line breaks simply by going on to the next line when the space on
the current line is used up.  We cache line breaks as well.  This way, we
are caching the starting position and width of each word - this is the
layout information we cache.  When we render a portion of text, we simply
draw each word using the elementary backend function, at the cached
position.  When we have to recompute the layout because the user deleted a
character in a word, we just recompute the bounding box of that word, and
use the already computed and cached bounding boxes of all the other words
to compute the new line breaks and then redraw the following words.
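
To make the word-by-word scheme concrete, here is a minimal sketch in
Python (not GNUstep code).  Everything named here is invented for
illustration: measure_word stands in for the backend bounding-box function
and simply assumes 8 units per character, SPACE_WIDTH is an assumed gap
between words, and PlacedWord is the cached layout record.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the backend bounding-box function:
# assumes every character is 8 units wide in every font.
def measure_word(word, font):
    return 8.0 * len(word)

SPACE_WIDTH = 8.0  # assumed width of the gap between two words

@dataclass
class PlacedWord:
    text: str
    line: int     # index of the line the word was placed on
    x: float      # starting position of the word on that line
    width: float  # cached bounding-box width

def layout_words(text, font, line_width, width_cache=None):
    """Greedy layout: go on to the next line when the space is used up."""
    if width_cache is None:
        width_cache = {}
    placed = []
    line, x = 0, 0.0
    for word in text.split():
        key = (word, font)
        if key not in width_cache:
            # Only words never measured before reach the backend.
            width_cache[key] = measure_word(word, font)
        width = width_cache[key]
        # Break before the word if it does not fit, unless the line is empty.
        if x > 0 and x + width > line_width:
            line += 1
            x = 0.0
        placed.append(PlacedWord(word, line, x, width))
        x += width + SPACE_WIDTH
    return placed
```

Passing the same width_cache dictionary to a later call is what would make
re-layout after an edit cheap in this sketch: only the edited word is
measured again, and all the other widths come from the cache.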

Advantages: we encapsulate all the unicode/charset/ligature/kerning mess
into a couple of elementary functions - so we can build most of the higher
level layout/drawing engine ignoring the problems of
ligatures/kerning/Arabic.  At the beginning, we can implement the two
elementary functions in a trivial way - thus giving us a way to implement
a simple system without ligatures, kerning and support for strange
languages, without preventing the system from being improved and made
better in the future.  Actually, I like the idea of breaking the layout
system into two clearly separated parts with a very small interface
between them.  The ultimate idea is that it could be possible to
dynamically load bundles supporting ligatures, kerning and stuff for your
preferred language - the two basic functions would use the loaded bundle.
More about this below.

Disadvantages: we are not caching information on how each word is drawn! 
This could be fixed in the elementary backend functions, which could keep
a cache of the latest - say - 1000 words which have been drawn, with
information on how to draw them - if not the image ready to be pasted. 
This could even improve performance if you have a text in which words are
repeated - when a word is drawn the second time, the cached info (if not
the word image) is reused - a thing which isn't done in other schemes. 
The other disadvantage is that we are going to make a lot of requests to
the X server to draw little strings.  But if we XFlush only at the end -
it shouldn't be a problem, should it?
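
The "latest 1000 words" cache could look roughly like the following
least-recently-used cache, sketched in Python.  WordCache and its compute
callback are hypothetical names; what is stored per word (drawing
instructions, or a ready image) is left open here, exactly as in the text
above.

```python
from collections import OrderedDict

class WordCache:
    """Keeps rendering info for the most recently used (word, font) pairs."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.misses = 0  # how many times the info had to be computed

    def get(self, word, font, compute):
        key = (word, font)
        if key in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        self.misses += 1
        value = compute(word, font)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value
```

A text with many repeated words would then reach the expensive compute
step only once per distinct (word, font) pair, as long as the pair stays
among the most recently used ones.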

Another big disadvantage is that this scheme is different from the one
described in the NSLayoutManager documentation.  I don't think it would
break the rest of the text system at all - but I need to look into this -
just the layout methods in NSLayoutManager would be different.

My opinions on this point are: 

 1. It's very nice to have the separate text classes to do all the nice 
    effects documented - text displayed in multiple shapes or across 
    columns etc. - so we should strictly adhere to these docs in general;

 2. But most of the NSLayoutManager API is actually private stuff which
    should IMHO not be public.  Many of the documented methods are not 
    for standalone use - you can't call them directly.  The description 
    is complicated and obscure and it's about obscure things you really
    don't want to know about - the way in which characters and glyphs 
    are managed and mapped in memory, rendered, in which line fragments 
    are dealt with etc.  It's very difficult to subclass this stuff and 
    not very meaningful.  Also - in general - this should not be public 
    because they - or we - may want to redesign the internals of this 
    layout and rendering engine without changing the API.  I read that 
    MacOSX is now using ATSUI to render the text - I don't know how 
    they cope with this.  So - I think glyphs should not be public.

    So I think we should only implement a subset of the MacOSX doc 
    for NSLayoutManager.  We should include all meaningful methods - 
    but no methods regarding the way glyphs are actually drawn.

    Or simply - we implement placeholder methods which, when you ask for
    glyphs, return the corresponding chars - ignoring the difference
    between glyphs and chars.  We implement these methods for MacOS X
    compatibility (even if, if they are using ATSUI, I don't understand 
    how they manage to expose glyphs internals themselves, since ATSUI
    has an opaque interface to its own layout functions, and does these
    things internally - it's a completely different layout system).

    At this point, we can also implement drawing/sizing in cells by simply
    calling the basic word functions directly.  The advantage is that when
    someone implements Arabic ligatures in the two word-based functions,
    the ligatures can automatically be used by cells as well, and the cell
    drawing system can immediately render Arabic too.


Now - this outline would be OK for simple drawing, but forgets about
editing.  During editing, we need to keep track of the cursor position,
and to manage editing commands.  Again, I'd like the text system to be
able to manage this very simply and directly - but allowing arbitrarily
complex input and editing stuff to be implemented later on.
We need to: 

 - position the cursor at a certain index of the string - whatever this 
   means.  Possibly, we have a `basic' function - we pass the word, and 
   the index of a character in the word, and the function returns the
   position of the cursor in the word.  There we draw the cursor.  This 
   is needed even just for ligatures and kerning in Latin characters.  In 
   the case of fancy languages, I have no idea how the cursor would 
   be moved - so having a function - which can be refined by someone who 
   knows how to move the cursor in his own language - seems good.

 - get an index in the string corresponding to a mouse click somewhere 
   in the text (for selections etc).  Again, we get the word under the 
   mouse, the position of the mouse inside the word, and ask a `basic' 
   function to tell us a location inside the word string.

 - editing.  This is a mess.  For example, when the user presses `Right
   Arrow', we need to: 

   - move by one character to the right for a simple character such as
     'f' in English;

   - move by two (or more) characters to the right for some characters
     which are grouped when displayed (eg, letter + modifier in 
     fancy languages);

   - move by one character to the right for some characters which are 
     grouped when displayed but are still separate entities (eg 
     the `fi' ligature - I don't know how NSLayoutManager manages moving 
     with this, since `fi' is a glyph, but moving right should move the
     cursor after the `f' but before the `i', while the documentation says 
     the text system moves by glyphs when the user moves the cursor - this 
     is instead a movement of half a glyph - unless the ligature is
     destroyed when the user moves the cursor onto it, and recreated when 
     the cursor goes away - anyway I don't know).

   Again, I would delegate decisions to a `basic' function, which can be 
   made arbitrarily complex or made to load dynamic code to manage foreign 
   languages.  So - you call a basic function, tell it that the cursor is 
   at a certain position in a certain word, and that the user has pressed 
   the right arrow key.  The function returns the new cursor index 
   inside the word, from which we then compute the cursor position by 
   calling one of the preceding functions.
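
As an illustration of such a `basic' function for the right arrow key,
here is a small Python sketch.  index_after_right_arrow is a made-up name,
and the rule it uses (skip over Unicode combining marks) is only an
assumption that covers the simple letter + modifier case; real languages
may need much more than this, which is exactly why the decision lives in
a replaceable function.

```python
import unicodedata

def index_after_right_arrow(word, index):
    """Return the new cursor index in `word` after one Right Arrow press."""
    if index >= len(word):
        return len(word)  # already at the end of the word
    index += 1
    # Skip combining marks, so a letter and its modifiers move as one unit.
    while index < len(word) and unicodedata.combining(word[index]) != 0:
        index += 1
    return index
```

With this rule the two characters of `fi' are still separate cursor stops
(whether or not the font draws them as one ligature), while 'e' followed
by a combining acute accent (U+0301) is stepped over in one key press.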

We can clean this and make it OO by doing the following: 

We have a locale layout manager object.  This object has methods to:
compute the bounding box of a word in a certain font; draw a word in a
certain font somewhere in a certain view; compute the index in a word
string from a cursor position inside the word rendering, and vice versa;
return the resulting new position in the string after the user moves left
or right; and other editing stuff depending on the language.  We assume
all this editing is localized to a single word - so when the locale layout
manager is called, it is always given a single word to work on.
We can provide a basic implementation of this class which just works for
Latin-1 stuff without any ligatures and kerning.  It would cache up to the
last XXX words drawn, and possibly the bounding boxes of up to the last
YYY characters which had to be computed to move the cursor.  If YYY
== 100, for example, and you mainly use, say, 15 lowercase letters + 10
uppercase letters in 4 fonts, that is (15 + 10) * 4 = 100 entries, so all
the letter bounding boxes are cached while the thing is running, and it
should even be fast.  It should not take too much memory to make this
cache bigger - a bounding box is an NSSize, so we can probably go up to
YYY == 1000, and character bounding boxes are then possibly never
recomputed - to move the cursor we look up the table by character number,
get the bounding box, and move the cursor by the bounding box width.
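
A rough Python sketch of the basic (Latin-1, no ligatures, no kerning)
version of this object follows.  The class name, the method names and the
char_width callback are all hypothetical; drawing is left out because it
needs a real backend, and the character cache is unbounded here rather
than limited to YYY entries.

```python
class BasicWordLayout:
    """Hypothetical per-locale word layout object for plain Latin-1 text."""

    def __init__(self, char_width, line_height=12.0):
        # char_width(ch, font) stands in for a real font-metrics lookup.
        self._char_width = char_width
        self._line_height = line_height
        self._width_cache = {}  # (character, font) -> width
        self.lookups = 0        # number of real metrics lookups performed

    def _width(self, ch, font):
        key = (ch, font)
        if key not in self._width_cache:
            self.lookups += 1
            self._width_cache[key] = self._char_width(ch, font)
        return self._width_cache[key]

    def bounding_box(self, word, font):
        """(width, height) of the word; widths simply add up (no kerning)."""
        width = sum(self._width(ch, font) for ch in word)
        return (width, self._line_height)

    def cursor_x(self, word, font, index):
        """Horizontal cursor offset for a character index in the word."""
        return sum(self._width(ch, font) for ch in word[:index])

    def index_at_x(self, word, font, x):
        """Character index nearest to a horizontal offset (mouse click)."""
        left = 0.0
        for i, ch in enumerate(word):
            w = self._width(ch, font)
            if x < left + w / 2:
                return i
            left += w
        return len(word)

    def move_right(self, word, index):
        return min(index + 1, len(word))

    def move_left(self, word, index):
        return max(index - 1, 0)
```

A bundle for another language would provide a different class with the
same methods, and the rest of the layout code would not need to know.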

NB: When the user presses the `down arrow', we would probably move the
cursor vertically down, find the word it's on, then call the basic
function to get the new string index corresponding to the cursor position,
then recompute the cursor position based on the string index.  Yep - quite
possible to do quickly, actually, if we are caching character bounding
boxes in the word layout object.
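
The down-arrow steps could be sketched like this in Python.  The data
layout (each line is a list of (start_x, word) pairs) and the fixed
CHAR_WIDTH are assumptions made only to keep the example short; a real
version would ask the word layout object instead of dividing by a
constant.

```python
CHAR_WIDTH = 8.0  # assumption: every character is 8 units wide

def cursor_down(lines, line_no, cursor_x):
    """Move the cursor one line down, keeping its horizontal position.

    lines: list of lines, each a list of (start_x, word) pairs.
    Returns (line_no, word_no, index_in_word, new_cursor_x), or None
    when there is no usable line below.
    """
    if line_no + 1 >= len(lines) or not lines[line_no + 1]:
        return None
    target = line_no + 1

    def distance(item):
        # Horizontal distance from the cursor to the word's extent.
        start, word = item[1]
        end = start + CHAR_WIDTH * len(word)
        if cursor_x < start:
            return start - cursor_x
        if cursor_x > end:
            return cursor_x - end
        return 0.0

    # Find the word under (or nearest to) the old cursor position.
    word_no, (start, word) = min(enumerate(lines[target]), key=distance)
    # Convert the offset inside the word to a string index, clamped.
    index = round((cursor_x - start) / CHAR_WIDTH)
    index = max(0, min(len(word), index))
    # Recompute the cursor position from the string index.
    return target, word_no, index, start + CHAR_WIDTH * index
```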

In general, the idea is that while NSText, NSTextStorage, NSTextContainer,
and the non-glyph methods of NSLayoutManager would be as per-MacOSX-spec,
we would hide the internals of NSLayoutManager, and actually implement
them in quite a different way - we encapsulate all kerning, ligatures,
language/locale-dependent cursor movements and editing inside a class, so
that people can write loadable bundles to support their own language.  We
can still make the MacOS X glyph methods available with trivial
implementations so as not to break MacOSX portability - but I think these
methods are very rarely used - only in the trivial example of drawing a
string by using NSLayoutManager - and that should work with
glyphs==chars.
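
What glyphs==chars would mean for the placeholder methods can be shown
with a tiny Python sketch.  The class and method names below are
hypothetical stand-ins and not the NSLayoutManager API; the only point is
that every glyph query is answered straight from the character string.

```python
class TrivialGlyphMapping:
    """Placeholder glyph view of a string: one glyph per character."""

    def __init__(self, text):
        self.text = text

    def number_of_glyphs(self):
        return len(self.text)

    def glyph_at_index(self, glyph_index):
        # The "glyph" is just the character code.
        return ord(self.text[glyph_index])

    def character_index_for_glyph(self, glyph_index):
        # Identity mapping: glyph i corresponds to character i.
        return glyph_index

    def glyph_range_for_character_range(self, start, length):
        # Ranges are identical too, given as (start, length).
        return (start, length)
```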

A thing I've not informed myself about - but which is quite important - is
how NSText* stuff is customized/subclassed in practice, eg to color strings
based on syntax, to indent things in particular ways etc.  In any case,
this should have nothing to do with the glyph stuff.

I'd like to know other opinions - particularly opinions against this
solution.  :-) 



