lynx-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: lynx-dev lynx and ncurses and gpm


From: Klaus Weide
Subject: Re: lynx-dev lynx and ncurses and gpm
Date: Fri, 23 Apr 1999 05:59:18 -0500 (CDT)

On Fri, 23 Apr 1999, Alexander V. Lukyanov wrote:

> On Thu, Apr 22, 1999 at 07:37:37AM -0500, Klaus Weide wrote:
> > There is a provision to deal with an interrupted read() in fifo_push()
> > if HIDE_EINTR is defined.  But HIDE_EINTR is explicitly #undef'd in
> > fifo_defs.h, with no explanation except the 980606 NEWS file entry
> > (which I assume corresponds to the #undef)
> > 
> >         + removed the cycle on EINTR, as it seems to be useless.
> > 
> > I don't know how someone came to that conclusion, and disagree with
> > it (as does the lib_tstp.c comment above).
> 
> It is done for purpose. E.g. in case of xterm resize application should
> act immediately, and EINTR for SIGWINCH helps that. I'm sure applications
> can want to react to other signals outside of signal handlers (to avoid
> reentering non-reenterable functions) by setting a flag inside and then
> checking it in main loop.

I can see how this helps applications to react to resize events more
immediately.  So that is the reason why ncurses tries to install its
own handler for SIGWINCH without SA_RESTART, different from those for
SIGTSTP, SIGINT, SIGTERM.  I had wondered about that.

But while it helps some applications, it makes others fail.

> getch can also return ERR on timeout if one was set.

Yes, and that is documented in the getch(3) man page.  But the behavior
you are defending directly *contradicts* the man page:

 PORTABILITY
 .....
       The  behavior of getch and friends in the presence of han-
       dled signals is unspecified in the  SVr4  and  XSI  Curses
       documentation.   Under  historical curses implementations,
       it varied depending  on  whether  the  operating  system's
       implementation  of  handled  signal  receipt  interrupts a
       read(2) call in progress or not, and also (in some  imple-
       mentations)  depending on whether an input timeout or non-
       blocking mode hsd been set.

       Programmers concerned about portability should be prepared
       for  either  of  two  cases:  (a)  signal receipt does not
       interrupt getch; (b) signal receipt interrupts  getch  and
       causes  it  to  return ERR with errno set to EINTR.  Under
       the ncurses implementation, handled signals  never  inter-
       rupt getch.

Note the last sentence.  I handn't seen it before I sent my prvious
message; now I am more convinced that this is a real bug.  If
applications *need* this behavior, they are relying on a bug (if they
were written to the ncurses interface) or on non-portable behavior.
They could and should use no-delay or half-delay mode if they really
want to make sure they get timely notification, or some other means.

I am also concerned that the rationale you states isn't mentioned
anywhere (that I could find) - not in the source code, not in NEWS,
and every piece of comment I *did* find gives the wrong impression
that ncurses is trying hard to never interrupt getch.

> So I beleave applications should not immediately exit on ERR from getch.

Yes, that is probably good advice.  But according to the existing
documentation, an application (if it's using ncurses) that doesn't use
no-delay or half-delay mode has every right to assume that ERR means a
real failure.

Btw. what *portable* way do you suggest for programs to determine whether
an ERR means a real unrecoverable failure?

> What is really bad in getch implementation is lack of distinction
> between ERR as retriable error (such as EINTR or EAGAIN) and hard error
> like terminal hangup.

I see you agree there is a problem...

One partial solution would be adding a function by which the application can
set the desired behavior.  The applications you have in mind would call this
to set 'interrupting' behavior (and then would have to check after each ERR
whether it's a hard error); other more naive applications get the currently
documented behavior.

[One could argue that this is already achieved by the decision whether to set
SA_RESTART or not on handlers; but 1. that doesn't work on all systems, and
2. not all signal handlers are under the application's or ncurses' control
as demonstrated in the gpm case, and 3. a separate function to control global
behavior would allow easy switching between the two modes.]

For the WINCH case, there already is a mechanism for timely notification,
added by yourself; from NEWS:

  970906
  .....
        > patches by Alexander V. Lukyanov:
        + add pseudo-functionkey KEY_RESIZE which is returned by getch() when
          the SIGWINCH handler has been called since the last call to
          doupdate().

> > Possible solutions:
> > 
> > 1. Change gpm code to set SA_RESTART flag.
> > 
> >    Should be done, but this solves the problem only for ncurses+gpm, it
> >    will pop up again for ncurses+<some other code out of ncurses' control>. 
> 
> IMHO this is simplest solution without side effects.

I was specifically having SIGTSTP in mind; but if the change to gpm
code also involved setting SA_RESTART for SIGWINCH, this would indeed
have the side effect of 'breaking' the behavior you wish to see.

> > 2. Change ncurses code to prevent gpm code from installing its own SIGTSTP
> >    handler; (temporarily) setting SIG_IGN at Gpm_Open() time would achieve
> >    this.
> > 
> >    Again this is a workaround specific to gpm, some other code out of
> >    ncurses' control can create the same kind of problem.
> >    Also, the gpm SIGTSTP handler, besides creating a problem, does something
> >    useful; its function would have to be duplicated in ncurses code.
> > 
> > 3. Change ncurses code (back) to deal with interrupted read() system call as
> >    required.
> > 
> >    Obviously this is the preferred solution.
> 
> No, as it would break window resizing.

It shouldn't break it, just break immediate resizing if it is based on
flawed assumptions.  And portable programs already have to deal with
non-immediate resizing on some platforms where signals always restart.
Programs written with ncurses in mind should start using KEY_RESIZE.

> > 4. Change all applications where this may occur to somehow detect the
> >    situation, and work around it.
> 
> I think this is more likely. In fact other curses implementations do
> return ERR on EINTR.
> 
> >    Applications shouldn't have to bother with this kind of low-level
> >    stuff, that's one reason why they use higher-level functions instead
> >    of doing read() etc. directly.  [Although they may still have to
> >    anticipate this situation in order to be portable to other curses
> >    implemetations.]
> 
> But read does return error on signal for some purpose. The problem with
> getch is that it is not possible to determine the error reason. Maybe
> it is possible to clear errno before getch and check it for EINTR after,
> but this looks like unportable hack.
>
> > Problem 1b
> > ----------
> >      Directly after suspending and resuming once, without a following
> >      keystroke - that is:  ^Z, fg
> >      Mouse actions don't work (have no effect, except the default mouse
> >      cursor movement)
> > 
> > This is also caused by the non-restarting SIGTSTP handler installed by
> > gpm.  What gets interrupted by the first ^Z is actually the _nc_timed_wait()
> > in fifo_push(), it returns prematurely instead of waiting until new
> > input is actually available.  This also happens to 'protect' the read()
> 
> In fact input subsystem of ncurses is rusty, and needs to be redesigned.
> I think that timestamped queues of events would be appropriate.

That's beyond me...

> > further below in fifo_push(), it is the reason why Problem 1a only
> > becomes visible on the *second* interrupt.
> 
> This _is_ a bug because of inconsistency. getch should exit with ERR if
> timed_wait fails. (or should ignore EINTR everywhere, which is arguably)

Or (I repeat) make the behavior selectable at runtime.

> > Possible solutions: 1., 2. as above, or 
> > 
> > 3. Change ncurses code to deal with interrupted system call in
> >    _nc_timed_wait() as required.
> > 
> >    Obviously this is the preferred solution.
> 
> Not exactly, for signal handling purpose.

Well I thing we agree it should be consistent with the global
behavior elsewhere.

      --------------------

Anyway, imo either the behavior of the library or the documentation
definitely has to be changed.

The patch I sent doesn't go quite that far. It effectively only
changes things if compiling with gmp support, in which case it
is needed imo, given the current state of libgpm it is likely
to be linked against, to prevent breaking lynx and (I assume)
other programs.

Adding a global flag to control the behavior and a fucntion to set it
would be better and should be straightforward; do you agree that
that's a good approach?

   Klaus


reply via email to

[Prev in Thread] Current Thread [Next in Thread]