[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Chicken-hackers] [PATCH] Use pair as loop result handle in compiler-syn
From: |
Evan Hanson |
Subject: |
[Chicken-hackers] [PATCH] Use pair as loop result handle in compiler-syntax for `map` |
Date: |
Mon, 21 Dec 2015 16:25:04 +1300 |
Hi hackers,
Here's a patch for an issue Felix pointed out way back in April, in
<address@hidden> (the
actual message is inline, below). I actually mentioned this a while ago
and meant to bring it up for discussion at ICC, but it slipped my mind
there and then sat neglected for a while.
Anyway, the patch addresses the extra CPS call described in that email,
in the expansion of compiler syntax for `map`. The continuation in
question is caused by a conditional in the body of the expansion's named
let, and its presence prevents that loop from compiling down to a label
and goto.
However, that conditional is really only needed the first time through
the loop, in order to update a `##core#box` to refer to the head of the
mapped list once its first pair has been generated. Rather than wait for
the first pair this way, we can instead pay the constant allocation cost
of one pair up front and use that as a handle to the result, which goes
in its `cdr`. Then, after the loop, the expansion can unconditionally
dereference the second slot of this handle to get the resulting list.
So, the expansion goes from something like this (the current expansion,
tweaked quite a bit for readability)...
(let ((node #f) (result '()) ...)
(let map-loop ((l lst) ...)
(if (and (pair? l) ...)
result
(let ((res (cons (f (car l)) '())))
(if (not node) ; Is this the first iteration?
(set! result res) ; If so, set `result` to the head of
mapped list.
(set-slot node 1 res)) ; If not, just append to the list
in-place.
(set! node res)
(loop (cdr l) ...)))))
... To this:
(let* ((node (cons #f '())) (result node) ...)
(let map-loop ((l lst) ...)
(if (and (pair? l) ...)
(get-slot result 1)
(let ((res (cons (f (car l)) '())))
(set-slot node 1 res)
(set! node res)
(loop (cdr l) ...)))))
Let me know if that explanation made no sense.
Holiday cheers,
Evan
On 2015-04-09 0:24, Felix Winkelmann wrote:
> >> This is a terribly written program. It uses 3-element lists as vectors
> >> (including higher-order "vector" arithmetic using "map") and allocates
> >> like hell. The compiler can not do much with this code, and it
> >> produces CPS calls everywhere.
> >
> > I take it you are referring to the {add, sub, scale, mul, dot,
> > squared-length, normal} functions? It seems clear to me that there's
> > plenty of room for optimization in them, but I wonder if you're being a
> > bit hasty to dismiss the naive implementation so readily. IMHO, use of
> > `map` to transform lists is sort of idiomatically lisp.
>
> Using map to transform lists is indeed, but using lists to store
> fixed-size collections that are expected to be heavily used is
> probably not, I'd say. On the other hand, this benchmark appears to be
> intentionally inefficient to stress the GC by allocating a lot.
>
> >
> > That is to say, in this case there are alternatives for numerical
> > computation, but transforming a list in a similar manner ought to be
> > commonplace in lispy programmes. If it is this sort of behaviour that
> > drives the GC into agony then I expect that there are many Chicken
> > programmes that may benefit from any form of optimization that could be
> > had from `map`.
>
> That's right, of course. Nobody would complain if we tried to catch
> this special case. But every additional optimization complicates the
> optimizer (which is already pretty hairy) and increases the
> possibility of bugs. Still, "(map <op> ...)" must allocate, unless one
> can infer some lifetime information about the argument (or result)
> lists.
>
> CHICKEN may pay an extra penalty by producing relatively bad code
> for certain loops, like those introduced by an already existing optimization
> of "map". So, for example:
>
> (define (foo x y)
> (map + x y))
>
> (foo (read) (read))
>
> gets translated and optimized to:
>
> csc -debug 7 x.scm -O5
> [optimized]
> (lambda (k205)
> (let ((k207 (##core#lambda
> (t33)
> (let ((k209 (##core#lambda
> (t34)
> (let ((k260 (##core#lambda
> (t36)
> (let ((k262 (##core#lambda (t37)
> (k205 (##core#undefined)))))
> (let ((k264 (##core#lambda
> (r265) (r265 k262))))
> (##sys#implicit-exit-handler
> k264))))))
> (let ((k268 (##core#lambda
> (a267)
> (let ((k271 (##core#lambda
> (a270)
> (let ((x1 a267))
> (let ((g919
> '()))
> (let ((g1020
> #f))
> (let
> ((map-loop524 (##core#undefined)))
> (let
> ((t218 (set! map-loop524
>
> (lambda (k220 g1725 g1826)
>
> (let ((r255 (##core#inline "C_i_pairp" g1725)))
>
> (let ((r225 (##core#cond
>
> r255
>
> (##core#inline "C_i_pairp" g1826)
>
> #f)))
>
> (if r225
>
> (let ((a248 (##core#inline "C_slot" g1725 0)))
>
> (let ((a251 (##core#inline "C_slot" g1826 0)))
>
> (let ((a245 (##core#inline_allocate
>
> ("C_a_i_plus" 4)
>
> a248
>
> a251)))
>
> (let ((g628 (##core#inline_allocate
>
> ("C_a_i_cons" 3)
>
> a245
>
> '())))
>
> (let ((k229 (##core#lambda
>
> (t29)
>
> (let ((t231 (set! g1020 g628)))
>
> (let ((a235 (##core#inline "C_slot" g1725
> 1)))
>
> (let ((a238 (##core#inline "C_slot"
> g1826 1)))
>
> (map-loop524 k220 a235 a238)))))))
>
> (if g1020
>
> (k229 (##core#inline "C_i_setslot" g1020 1 g628))
>
> (let ((t244 (set! g919 g628))) (k229 t244))))))))
>
> (let ((f_221274 g919)) (k220 f_221274)))))))))
>
> (map-loop524 k260 x1 a270)))))))))
> (read k271)))))
> (read k268))))))
> (##core#callunit "eval" k209)))))
> (##core#callunit "library" k207)))
>
> Note the unnecessary lambda "k229", which could be beta-reduced in the
> conditional following it (and the conditional simplified), with some
> effort, removing the need for a CPS call and the related allocating of
> a continuation.
>
> Any takers?
>
> Moreover, code produced by CHICKEN already allocates a lot
> (activation-frames are heap-allocated), so allocation-heavy code will
> increase this amount, in other words, CHICKEN allocates more than
> non-CPS implementations. Allocation is already quite fast, which
> compensates the cost in code that allocates in a "normal" manner (yes,
> I know: whatever that means...)
>
>
> cheers
> felix
>
> _______________________________________________
> Chicken-hackers mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/chicken-hackers
0001-Use-pair-as-map-loop-result-handle-in-compiler-synta.master.patch
Description: Text Data
0001-Use-pair-as-map-loop-result-handle-in-compiler-synta.chicken-5.patch
Description: Text Data
- [Chicken-hackers] [PATCH] Use pair as loop result handle in compiler-syntax for `map`,
Evan Hanson <=