mit-scheme-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [MIT-Scheme-devel] Faster ports.


From: Taylor R Campbell
Subject: Re: [MIT-Scheme-devel] Faster ports.
Date: Fri, 24 Jun 2011 03:10:22 +0000
User-agent: IMAIL/1.21; Edwin/3.116; MIT-Scheme/9.1

   Date: Thu, 23 Jun 2011 16:06:59 -0700
   From: Matt Birkholz <address@hidden>

   C-peek-uchar can do it (with a fixnum index) withOUT consing a single
   drop.  Open-coded:

       ; r40 = alien, r41 = index (both already type-masked)
       (load (r 42) (word (r 40) 2))      ; r42 <- (%alien/high-bits r40)
       (load (r 43) (word (r 40) 3))      ; r43 <- (%alien/low-bits r40)
       (lsh (r 42) half-word-size)
       (and (r 43) type-mask)
       (and (r 42) (r 43))                ; r42 <- address of bytes
        ^^^
ADD, surely?

       (load (r 45) (byte (r 42) (r 41))) ; r45 <- your u8 (without type code)

   Acceptable?

Sure, if you want to make the abstraction right, and make it safe, and
implement support for it in the compiler and maybe garbage collector.
I would suggest using not aliens but slices for the purpose.  A slice
is a contiguous subchunk of a contiguous chunk of bytes/octets, on
which the SUBSLICE operation is cheap.  (That way, we can stop the
proliferation of SUBVECTOR-FROBNITZ procedures with start/end
parameters and use VECTOR-FROBNITZ exclusively instead.)  It should
have the following fields, for a native implementation:

- frob
- length
- high address
- low address

The frob is the object that the slice is a slice of -- which might be
in the Scheme heap (vector-8b), in a hypothetical separate Scheme heap
for non-marked data (if we ever get a fancier garbage collector that
needs it), or outside the Scheme heap altogether (external string).

Each frob has a starting address and total length; the address fields
of a slice point to somewhere no more than the total length after the
frob's starting address, into the data represented by the frob.

The frob is stored for two reasons:

1. The garbage collector needs to be able to relocate the address
fields, in case the frob is a vector-8b in the Scheme heap that moves
around.

2. The garbage collector needs to know when it can release the frob if
it is an external string or similar, so there has to be a reference
for the sake of a GC finalizer in that case.

On 64-bit systems, it would be nice if the high address and low
address fields were coalesced into a single field, since all addresses
will fit into fixnums.  But that would require adding a lot of highly
error-prone code which isn't worth experimenting with yet.

   So... if you wanted characters, you'd tickle the port.  If you wanted
   its octets, you'd first get ahold of its octet-source.  Octet-level
   work doesn't involve the port.  OK.

I didn't intend to involve ports at all.  If you want to read Unicode
code points:

(define (source-utf32be source)
  (define (o) (source-octet source))
  (define (<< x c) (shift-left x c))
  (let* ((a (o)) (b (o)) (c (o)) (d (o)))
    (cond ((and a b c d) (bitwise-ior (<< a 24) (<< b 16) (<< c 8) (<< d 0)))
          ((not (or a b c d)) #f)
          (else (error:premature-eof source)))))

If you want a stream of characters decoded by UTF-32BE:

(define (source-character-stream/utf32be source)
  (let recur ()
    (lazy (cond ((source-utf32be source)
                 => (lambda (code-point)
                      (stream-cons (integer->char code-point) (recur))))
                (else stream-nil))))

And so on.

   Maybe parsing a large file of scheme or xml should be included.  I
   don't want a fast cat(1); I want a fast HTML::Tree.  (I also want a
   fast uuencode(1), so I probably want both.)  If cpress.scm used
   octet-sources we would have a challenging test.  We could compare the
   speed of compression using both internal and external buffers.

Effectively, cpress.scm already uses octet sources.  They're just
hand-written for the particular application.  I want to capture that
pattern into an abstraction.

However, there are a lot of problems to be worked out.  For example,
the flow of information betwen a buffered source/sink and a file
descriptor is not unidirectional, in a sense.  Before you reposition a
file descriptor, you need to force any pending buffered output, and
after you reposition it, you need to clear any pending buffered input.
If you want to know the position that you're about to read from, you
need to consider both the file descriptor's position and the
source/sink's buffer index.

There's other similar state in ports too -- transcripts, unreading
characters, line & column tracking, &c. -- and I don't know how to
factor it into composable abstractions.

Another problem is that you need to be able to debug problems and
handle conditions that may arise in the course of I/O, so the stack of
sources/sinks has to be transparent enough not to obscure all that.

Anyway, all this is half-baked -- it needs to be implemented and used
experimentally to make sure it actually works well in practice.  It
may turn out in practice that everything I've been rambling about is a
net lose over using ports as currently designed.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]