gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gcl-devel] Re: fork


From: Camm Maguire
Subject: [Gcl-devel] Re: fork
Date: 01 Nov 2005 10:12:58 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Greetings!

Just a heads up that I've committed some preliminary work in this
regard. I have a dual machine (each real cpu two-way 'hyperthreaded'),
and here is an example of the behavior I see.  YMMV, of course:

=============================================================================
input
=============================================================================
(defun fib (n) (if (<= n 2) 1 (+ (fib (1- n)) (fib (- n 2)))))
(proclaim '(ftype (function (t) t) fib))
(compile 'fib)
;(trace si::select-read)
(time (dotimes (i 10) (values (fib 29) (fib 29) (fib 29))))
(defun foo nil (si::p-let ((a (fib 29)) (b (fib 29)) (c (fib 29))) (values a b 
c)))
(compile 'foo)
(time (dotimes (i 10) (foo)))
(time (dotimes (i 10) (values (fib 29) (fib 29))))
(defun foo1 nil (si::p-let ((a (fib 29)) (b (fib 29))) (values a b)))
(compile 'foo1)
(time (dotimes (i 10) (foo1)))
(time (dotimes (i 10) (and (> (fib 29) 4) (< (fib 29) 7))))
(defun bar nil (si::p-and (> (fib 29) 4) (< (fib 29) 7)))
(compile 'bar)
(time (dotimes (i 10) (bar)))
(time (dotimes (i 10) (or (< (fib 29) 4) (> (fib 29) 7))))
(defun baz nil (si::p-or (< (fib 29) 4) (> (fib 29) 7)))
(compile 'baz)
(time (dotimes (i 10) (baz)))
=============================================================================
output
=============================================================================

FIB

>
NIL

>
;; Compiling ./gazonk0.lsp.
;; End of Pass 1.  
;; End of Pass 2.  
;; OPTIMIZE levels: Safety=0 (No runtime error checking), Space=0, Speed=3, 
(Debug quality ignored)
;; Finished compiling ./gazonk0.o.
Loading /fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
start address -T 0x1230d48 Finished loading 
/fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
#<compiled-function FIB>
NIL
NIL

>
real time       :     24.550 secs
run-gbc time    :     24.540 secs
child run time  :      0.000 secs
gbc time        :      0.000 secs
NIL

>
FOO

>
;; Compiling ./gazonk0.lsp.
;; End of Pass 1.  
;; End of Pass 2.  
;; OPTIMIZE levels: Safety=0 (No runtime error checking), Space=0, Speed=3, 
(Debug quality ignored)
;; Finished compiling ./gazonk0.o.
Loading /fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
start address -T 0x118e110 Finished loading 
/fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
#<compiled-function FOO>
NIL
NIL

>
real time       :     13.840 secs
run-gbc time    :      0.000 secs
child run time  :     36.460 secs
gbc time        :      0.000 secs
NIL

>
real time       :     16.320 secs
run-gbc time    :     16.320 secs
child run time  :      0.000 secs
gbc time        :      0.000 secs
NIL

>
FOO1

>
;; Compiling ./gazonk0.lsp.
;; End of Pass 1.  
;; End of Pass 2.  
;; OPTIMIZE levels: Safety=0 (No runtime error checking), Space=0, Speed=3, 
(Debug quality ignored)
;; Finished compiling ./gazonk0.o.
Loading /fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
start address -T 0x122f008 Finished loading 
/fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
#<compiled-function FOO1>
NIL
NIL

>
real time       :     10.400 secs
run-gbc time    :      0.000 secs
child run time  :     20.640 secs
gbc time        :      0.000 secs
NIL

>
real time       :     15.810 secs
run-gbc time    :     15.810 secs
child run time  :      0.000 secs
gbc time        :      0.000 secs
NIL

>
BAR

>
;; Compiling ./gazonk0.lsp.
;; End of Pass 1.  
;; End of Pass 2.  
;; OPTIMIZE levels: Safety=0 (No runtime error checking), Space=0, Speed=3, 
(Debug quality ignored)
;; Finished compiling ./gazonk0.o.
Loading /fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
start address -T 0x11667f8 Finished loading 
/fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
#<compiled-function BAR>
NIL
NIL

>
real time       :     10.820 secs
run-gbc time    :      0.000 secs
child run time  :     20.080 secs
gbc time        :      0.000 secs
NIL

>
real time       :     15.840 secs
run-gbc time    :     15.840 secs
child run time  :      0.000 secs
gbc time        :      0.000 secs
NIL

>
BAZ

>
;; Compiling ./gazonk0.lsp.
;; End of Pass 1.  
;; End of Pass 2.  
;; OPTIMIZE levels: Safety=0 (No runtime error checking), Space=0, Speed=3, 
(Debug quality ignored)
;; Finished compiling ./gazonk0.o.
Loading /fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
start address -T 0x1144fd8 Finished loading 
/fix/t1/camm/debian/gcl/tmp/tmp/foo/unixport/gazonk0.o
#<compiled-function BAZ>
NIL
NIL

>
real time       :     10.480 secs
run-gbc time    :      0.010 secs
child run time  :     19.430 secs
gbc time        :      0.000 secs
NIL

>(dotimes (i 10) (time (baz)))

real time       :      0.830 secs
run-gbc time    :      0.000 secs
child run time  :      1.640 secs
gbc time        :      0.000 secs
real time       :      0.820 secs
run-gbc time    :      0.000 secs
child run time  :      1.650 secs
gbc time        :      0.000 secs
real time       :      1.520 secs
run-gbc time    :      0.010 secs
child run time  :      3.000 secs
gbc time        :      0.000 secs
real time       :      0.820 secs
run-gbc time    :      0.000 secs
child run time  :      1.640 secs
gbc time        :      0.000 secs
real time       :      0.830 secs
run-gbc time    :      0.000 secs
child run time  :      1.650 secs
gbc time        :      0.000 secs
real time       :      0.820 secs
run-gbc time    :      0.000 secs
child run time  :      1.650 secs
gbc time        :      0.000 secs
real time       :      0.840 secs
run-gbc time    :      0.000 secs
child run time  :      1.660 secs
gbc time        :      0.000 secs
real time       :      0.820 secs
run-gbc time    :      0.000 secs
child run time  :      1.640 secs
gbc time        :      0.000 secs
real time       :      0.820 secs
run-gbc time    :      0.000 secs
child run time  :      0.810 secs
gbc time        :      0.000 secs
real time       :      0.820 secs
run-gbc time    :      0.010 secs
child run time  :      1.630 secs
gbc time        :      0.000 secs
NIL

>=============================================================================

As we can see from the last, there is a little performance variability
which needs tracking down yet.


Robert Boyer <address@hidden> writes:

> Here is a minor suggestion, mainly to help me think.
> 
> Why not have the function si::simple-fork return only a stream, but
> let it be an output stream to the parent and an input stream to the
> child?  So your fib example might be:

Yes, this is the way fork is implemented.  No need for the two way,
and in fact, if one does not close the other channel, eof will not be
delivered to the one you are using.


> 
> (time (let ((s (si::simple-fork)))
>       (cond ((output-stream-p s)
>              ;; In this universe we are the subordinate, so we do
>              ;; our half of the work, print it out, and die.
>              (print (fib 31) s)
>              (bye))
>             ((input-stream-p s)
>              ;; In this universe we are the superior and the
>              ;; subordinate is already working on its half of the
>              ;; job, we trust.  We do our half of the job and then
>              ;; read the subordinate's answer.
>              (values (fib 31)
>                      (or (ignore-errors (read s))
>                          ;; Although using the inferior seems to
>                          ;; have caused an error we can continue.
>                          (fib 31))))
>             (t
>              ;; In this universe, we are (still) the superior but apparently
>              ;; there was no room to fork off a subordinate, so we do both 
> jobs.
>              (values (fib 31) (fib 31))))))
> 
> You could define, say, si::stream-pid, to take an input stream and say
> what process the input was coming from, if any, i.e., the value
> returned by a successful C call to fork.

This is a definite possibility.  As you note, we do need not only the
stream, but the pid as well.  Right now fork returns a cons of the
two, but we might find a way to pack the latter in the former.  As you
know, streams as a concept are already pretty well overlayed, or
'polymorphed', in common lisp.  Making any modification forces one to
consider the implications fot all the other types.  BUt if you feel
this is better here, I'm sure I can be persuaded.

> 
> Why bother with a two-way stream for the very simplest case of
> forking?
> 
> It seems to me that a major drawback to the above example is that I do
> not want to wait forever on the read.  How long am I willing to wait?
> Maybe twice as long as it took me to do my half of the job.  So it
> might be nice to have a macro
> 
>    (with-timeout time-limit form)
> 

OK, the way this is done in the world of C, and the way I've partially
pushed it forward thus far into lisp, is via the new function
si::select-read.  It takes a list of pid/input_stream conses (returned
(one at a time) by fork) and a timeout value in microseconds, and
returns an integer mask representing the pairs with data ready to be
read.  If the timeout is negative, the wait for reading is
indefinite.  The si::p-let, si::p-and and si::p-or macros I have thus
far wait indefinitely, but they can easily be extended to run the form
in the parent should a positive timeout expire and return 0 from
select-read.  The issue is IMHO setting the timeout 'scientifically'
-- surely the user doesn't want trial and error here.  There is a
macro si::with-read-values for lower level access of this sort, but it
is not yet complete:

;; Iterates the body over returned values from backgrounded processes,
;; one process running each form in forms.  Index is bound to the
;; position of the returned form in forms, handles is provided to kill
;; other running processes when needed.  Right now the functionality
;; is the same regardless of timeout, but soon an appropriate index
;; and value calculated in the foreground may be returned should a
;; positive timeout number of microseconds expire with no input
;; available.  (In such a case, handles will be nil and the jobs will
;; be terminated automatically.) 
(with-read-values (index value handles) (forms timeout)
        ....)

si::p-let, si::p-and, and si::p-or are implemented with this macro.
All of this is in gcl_evalmacros.lsp, and is in a quite *preliminary
stage* :-).

This is basically what you are suggesting below, if I understand
correctly.  I'm somewhat of two minds, though, whether these macros
should silently continue in the presence of child error or throw and
exception.  Likewise on fork error, though this can more arguably be
seen as a transient system condition which should not interrupt the
user's job.

> This would be a very strange sort of operator that would try to
> evaluate form and return it as the answer, if it were about to fail to
> finish evaluating form within time-limit units of internal time, it
> would immediately return some timeout error.  Thus we might have:
> 
> (time (let* ((start-time (get-internal-real-time))
>            (s (si::simple-fork)))
>       (cond ((output-stream-p s)
>              ;; In this universe we are (now) the subordinate, so we do
>              ;; our half of the work, print it out, and die.
>              (print (fib 31) s)
>              (bye))
>             ((input-stream-p s)
>              ;; In this universe we are (still) the superior and the
>              ;; subordinate is already working on its half of the
>              ;; job, we trust.  We do our half of the job and then
>              ;; try to read the subordinate's answer.
>              (values (fib 31)  ; that's our half of the job.
>                      (or (ignore-errors
>                            (with-timeout 
>                              (* 2 (- (get-internal-real-time)
>                                      start-time))
>                              (read s)))
>                          ;; Sigh.  Since using the subordinate seems to
>                          ;; have caused an error or taken more time
>                          ;; than we are willing to wait, we continue
>                          ;; by doing the subordinate's job ourself.
>                          (progn ;; Only we can prevent litter litter.
>                                 (si::kill-process (si::stream-pid s)) 
>                                 (fib 31) ;; Doing the work the subordinate 
> didn't get done.   
>                                 ))))
>             (t
>              ;; In this universe, we are (still) the superior but apparently
>              ;; there was no room to fork off a subordinate, so we do both 
> jobs.
>              (values (fib 31) (fib 31))))))
> 
> Bob
> 
> 

All these functions need protoyping to streamline the compiled code.
I've added optimizers for logandc1 and logandc2 toward this end.

There are two other developments which need pursuing here IMHO --
stack allocation in the child, and print/read optimization.  As the
child is destined to return one value and exit without popping its
stack (i.e. without returning from its current function), all new
pages can be allocated on the C stack if possible presenting no load
to the garbage collector.  The only way to do this is to reserve a
certain number of such pages (determined by some global variable
setting) at the beginning of the child, and have alloc_page return
from this block until exhausted, at which point subsequent allocation
proceeds as usual from the heap.  In fact, the beginning free lists in
the child could be set to zero pushing all new allocations onto this
stack.  At return time, we can then see if there is any stack
remaining, if so we are guaranteed that all objects not found in the
parent are on the stack, and then if all objects in the value to be
returned are in the heap, one can just write and read the pointer to
the appropriate object whiich must already by in the parent. This
should greatly reduce the print/read overhead.  

First I'd like to see how bad this overhead really is.  Providing we
can track down the jitter in the performance, it does appear that
scaling at least to levels of 0.5 sec or so can be achieved with this
method, maybe much better.

Take care,


> 
> 

-- 
Camm Maguire                                            address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah




reply via email to

[Prev in Thread] Current Thread [Next in Thread]