


From: Alex Shinn
Subject: Re: [Chicken-hackers] [PATCH] "deep-stack" option, declaration and build-mode
Date: Thu, 2 Aug 2012 12:39:54 +0900

On Thu, Aug 2, 2012 at 5:05 AM, Felix
<address@hidden> wrote:
>>
>> I understand that there are differing viewpoints on this, but I think
>> cutting it off at some arbitrary point when the stack overflows is
>> just as awkward as cutting off memory allocation at some arbitrary point.
>> We let a user use up all available memory too by repeatedly consing onto
>> a list.  I don't see why the C stack (an implementation detail which is
>> irrelevant when coding in just Scheme) has to be treated differently.
>
> I see your point. I just want to be pragmatic and I don't care too much
> about the philosophical view on this - there are many machine limitations
> that we have to deal with and this is just one of many. What do others
> think about this?

There are two reasons the stack can overflow - because
of an infinite loop, or because of deep recursion.

The former, as you've already pointed out, happens even
to experienced Schemers.  I think it happens to me fairly
often.  When I'm lucky it's a tail-recursion and I can just
kill it.  Otherwise, by the time I notice the loop has been
taking a while, it's probably already set my machine to
thrashing, at which point I can't reliably ^C the program.
The whole machine becomes unusable and it's possible
other unrelated programs will be terminated by the OOM
killer.

This is the worst possible scenario.  I'd rather have a
segfault any day.  In both cases it's obvious to anyone
but a first-week beginner what has happened, but the
segfault is better behaved.

The other case is intentional - you have a deep recursion
which just happens to surpass the stack limit (which
may or may not be the same as the heap limit, but
always exists).  In other words, letting

  H := heap limit
  S := stack limit <= H

a recursion of depth R will work properly so long as
R < S.  If R is defined by traversing a data structure,
then R >= H/2 will _always_ fail, regardless of S,
and depending on the actual stack space used it's
likely that an R of any large fraction of H will also
fail.  As a simple example:

  (define (sum ls)
    (if (null? ls)
        0
        (+ (car ls) (sum (cdr ls)))))

will always crash if `ls' takes up half the heap.
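For contrast, an accumulator-passing version of the same
sum runs in constant stack space (a minimal sketch):

  (define (sum ls)
    ;; iterate with an explicit accumulator; no stack growth
    (let loop ((ls ls) (acc 0))
      (if (null? ls)
          acc
          (loop (cdr ls) (+ acc (car ls))))))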

So unbounded recursion is inherently unreliable -
you need to put some restrictions on how deep the
recursion can be.  Similarly, unbounded heap
usage is also unreliable - you need to design
your program with memory usage in mind, and
plan when and how you need to save data to
disk.  And of course, you also need to design
with disk usage in mind, and decide when and how
to archive or throw away data.

A reasonable restriction is that a recursion should
be no larger than the logarithm of any data structure
in the heap, and stacks should always be able to
accommodate this.  For the example in the previous
thread, where the tree was degenerate and was
basically a list, this is an example of a bug in the
code.  The tree should have been balanced.
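As a sketch of what that restriction looks like in
practice (the vector representation and names here are
hypothetical, not from the earlier thread): summing by
splitting the input in half keeps the recursion depth
logarithmic in the size of the data:

  (define (vector-sum vec)
    ;; recursion depth is O(log n), since each call halves the range
    (let sum ((lo 0) (hi (vector-length vec)))
      (cond ((= lo hi) 0)
            ((= (+ lo 1) hi) (vector-ref vec lo))
            (else
             (let ((mid (quotient (+ lo hi) 2)))
               (+ (sum lo mid) (sum mid hi)))))))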

The sum example above is also a bug.  It doesn't
matter if it looks simpler or actually runs faster
than the iterative version; it's allocating an arbitrary
amount of memory that it doesn't need to.  It's
reasonable to work with tens of thousands of threads,
and pass them large shared data structures for
computations.  If they all allocate huge stacks to
be able to traverse these, you will quickly run out
of heap.

There are cases where balancing or bounding are
inherently difficult, and you have no choice but to
manually build a stack.  In these cases it's fine to
just use recursion, but you should be aware of the
inherent limitation that not all inputs can be handled.
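For completeness, a sketch of the manual-stack approach
(the tree representation here, numbers at the leaves and
pairs of subtrees elsewhere, is an assumption for
illustration):

  (define (tree-sum tree)
    ;; keep pending subtrees on a heap-allocated list instead
    ;; of the C stack, so depth is bounded only by the heap
    (let loop ((stack (list tree)) (acc 0))
      (cond ((null? stack) acc)
            ((pair? (car stack))
             (loop (cons (caar stack)
                         (cons (cdar stack) (cdr stack)))
                   acc))
            (else (loop (cdr stack) (+ acc (car stack)))))))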

-- 
Alex


