Re: Emacs Hangs on Filesystem Operations on Stale NFS
From: Eli Zaretskii
Subject: Re: Emacs Hangs on Filesystem Operations on Stale NFS
Date: Mon, 11 Jun 2018 18:51:43 +0300

> Date: Mon, 11 Jun 2018 14:46:35 +0200
> From: Alexander Shukaev <address@hidden>
> Cc: Emacs-devel <address@hidden>,
> Noam Postavsky <address@hidden>, Emacs developers <address@hidden>
>
> On 2018-06-11 14:40, Andreas Schwab wrote:
> > On Jun 11 2018, Alexander Shukaev <address@hidden> wrote:
> >
> >> signal.signal(signal.SIGALRM, alarm_handler)
> >> signal.alarm(3)
> >> try:
> >>     proc = subprocess.call('stat ' + path,
> >>                            shell=True,
> >>                            stderr=subprocess.PIPE,
> >>                            stdout=subprocess.PIPE)
> >>     stdoutdata, stderrdata = proc.communicate()
> >>     signal.alarm(0)
> >> except Alarm:
> >>     print "Timed out after 3 seconds..."
> >
> > How do you know that 3 seconds is enough?
> >
> > Andreas.
>
> You don't know. You just decide that it's maximum tolerable for
> you/your setup/hardware/connection/preferences/whatever, otherwise you
> are 99.(9)% sure that something is wrong somewhere with your system, but
> you don't give up your Emacs instance for that and rather get indicated
> that there might be a potential problem.

I think there's more here than meets the eye.  Sure, it's quite easy
to come up with a toy program that uses SIGALRM to time out a system
call that went awry.  But Emacs is not a toy program, so doing that
has complications, even if we come up with a suitable number of
seconds to wait (which ain't easy, since some I/O calls could really
need a long time, for example reading a large file or directory).
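
For reference, a working version of such a toy program might look like
the sketch below.  It is Python 3 on a POSIX system, not Emacs code, and
the names run_with_alarm and SLOW_CMD are made up for illustration;
SLOW_CMD is merely a stand-in for a 'stat' that never returns.  The
handler raises an exception, which is what lets the pending call be
abandoned.

```python
import signal
import subprocess
import sys


class Alarm(Exception):
    """Raised by the SIGALRM handler to abandon the pending call."""


def alarm_handler(signum, frame):
    raise Alarm


# Hypothetical stand-in for a command that hangs (e.g. 'stat' on a stale mount).
SLOW_CMD = [sys.executable, "-c", "import time; time.sleep(5)"]


def run_with_alarm(argv, seconds):
    """Run argv, giving up after `seconds`. Returns (timed_out, stdout)."""
    old_handler = signal.signal(signal.SIGALRM, alarm_handler)
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    try:
        signal.alarm(seconds)
        stdoutdata, _ = proc.communicate()
        return False, stdoutdata
    except Alarm:
        proc.kill()
        proc.stdout.close()
        proc.stderr.close()
        try:
            # A child stuck in uninterruptible sleep may not die, so the
            # wait for it is bounded as well.
            proc.wait(timeout=1)
        except subprocess.TimeoutExpired:
            pass
        return True, b""
    finally:
        # Cancel any pending alarm and put the previous handler back.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
```

Calling run_with_alarm(SLOW_CMD, 1) should report a timeout after about
one second.  Note that the sketch takes over the process's only
SIGALRM timer for the duration of the call, and that it only works when
called from the main thread.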
Here are some complications we should keep in mind:

  . Emacs already uses SIGALRM for different purposes, see atimer.c.
    Reusing it for this issue will need some complex logic, to avoid
    breaking the features that use SIGALRM now.

  . You tried this with a single 'stat' call, but that's just the tip
    of the iceberg.  Typically, Emacs will need to read a file after
    it found it readable, and we normally do that in a way that keeps
    looping as long as the system call was interrupted by signals, see,
    e.g., emacs_intr_read.  Then setting up an alarm clock will not
    help if 'read' hangs, we will just loop forever.

  . We usually deliver signals to the main thread, so if the code that
    hangs happens to run in a non-main thread (recall that Emacs 26
    has threads), it will be somewhat tricky, to say the least, to
    deliver a signal there.

  . Even if we somehow succeed in interrupting the hang by a signal,
    it's not clear whether it's safe to continue running the session --
    there's a reason why we stopped doing non-trivial stuff in signal
    handlers.  It may be that the only sensible thing is to shut down,
    and in that case, what did we gain, exactly?

  . This technique is non-portable to MS-Windows.

There are probably other complications.
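
To make the second point above concrete, here is a rough Python 3
sketch (POSIX only, and again not Emacs code) of a read that survives
an alarm.  The names read_despite_alarm and SLOW_WRITER are invented
for the example.  Python 3.5 and later retry an interrupted os.read
internally when the handler returns without raising, which I use here
as a stand-in for the explicit retry loop in emacs_intr_read.

```python
import os
import signal
import subprocess
import sys
import time

# Hypothetical slow I/O source: a child that writes only after a delay.
SLOW_WRITER = [
    sys.executable, "-c",
    "import sys, time; time.sleep(0.5); "
    "sys.stdout.write('late'); sys.stdout.flush()",
]


def read_despite_alarm(alarm_after=0.1):
    """Block in os.read() while SIGALRM arrives mid-call.

    Returns (data, alarm_fired, elapsed_seconds).
    """
    fired = []

    def handler(signum, frame):
        # Only records the signal; it does not raise, so the
        # interrupted read is simply resumed.
        fired.append(signum)

    old_handler = signal.signal(signal.SIGALRM, handler)
    proc = subprocess.Popen(SLOW_WRITER, stdout=subprocess.PIPE)
    start = time.monotonic()
    try:
        signal.setitimer(signal.ITIMER_REAL, alarm_after)
        # The alarm interrupts this call, but it is retried until
        # data actually arrives.
        data = os.read(proc.stdout.fileno(), 16)
        elapsed = time.monotonic() - start
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)
        proc.stdout.close()
        proc.wait()
    return data, bool(fired), elapsed
```

The alarm fires well before the data shows up, yet the call returns
only once the writer has produced something.  If the writer never
wrote, as with a hung 'read' on a stale mount, the call would never
return, alarm or no alarm.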
All in all, I'd be much happier if we could interrupt such hangs,
e.g. by C-g, as Stefan points out (on a TTY frame, this should already
be possible in many cases, since C-g there generates SIGINT). But I'm
not sure this would be possible in general. Maybe Paul will have some
ideas.