[epsilon-devel] Extensible s-expression reader and REPL in epsilon1

epsilon-devel

From:	Luca Saiu
Subject:	[epsilon-devel] Extensible s-expression reader and REPL in epsilon1
Date:	Wed, 26 Mar 2014 00:36:02 +0100
User-agent:	Gnus (Ma Gnus v0.8), GNU Emacs 24.3.50.2, x86_64-unknown-linux-gnu
Hello.

The reader and REPL support is now on master, and I'm killing the reader
branch.  The code could be nicer here and there, but I think the general
architecture is right.

I've fixed the anomaly of eof-checking on ports: the semantics now
follows the fgetc convention.  When reading a character from a port with
input-port:read-character, you get either a valid character or a special
value (io:eof, currently equal to -1).  Only after that value has been
returned is the internal EOF status of the port updated, so that calling
input-port:eof? returns the intended result.  This is the case for file
input ports, string input ports, and the readline input port.
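
The convention just described can be illustrated with a minimal sketch,
written in Python rather than epsilon1.  The class and method names are
hypothetical and only mirror input-port:read-character and
input-port:eof?; this is not the actual implementation.

```python
EOF = -1  # mirrors io:eof, which the message says is currently -1


class StringInputPort:
    """Hypothetical string input port following the fgetc-style convention."""

    def __init__(self, text):
        self._text = text
        self._position = 0
        self._eof_seen = False  # updated only when a read returns EOF

    def read_character(self):
        """Return the next character code, or EOF when the input is exhausted."""
        if self._position >= len(self._text):
            self._eof_seen = True
            return EOF
        code = ord(self._text[self._position])
        self._position += 1
        return code

    def is_eof(self):
        return self._eof_seen


port = StringInputPort("ab")
first = port.read_character()
second = port.read_character()
eof_before = port.is_eof()  # no read has returned EOF yet
third = port.read_character()
eof_after = port.is_eof()
```

Note that after the second read the port is exhausted but eof_before is
still false; the flag changes only once a read has actually returned
EOF, as with feof after fgetc in C.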

The s-expression reader uses a "backtrackable input port" layer on top
of ordinary input ports (see epsilon1.scm), because it needs to
tentatively match some patterns when recognizing s-expressions, going
back to a previous state if some sub-pattern recognition fails and
another has to be attempted.  Backtrackable input ports don't follow the
general interface for ports.  This can be generalized later; I haven't
yet thought of a sufficiently general system for providing several
implementations of the same interface (think of something like ML modules).
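
As an illustration of the idea, here is a minimal Python sketch of a
backtrackable layer over a character source.  The names
(BacktrackablePort, mark, backtrack, try_literal, reader_over) are
hypothetical and are not taken from epsilon1.scm; the sketch simply
buffers every character it has read so that a failed match can rewind.

```python
EOF = -1


class BacktrackablePort:
    """Hypothetical layer that buffers what it reads so a reader can rewind."""

    def __init__(self, read_character):
        self._read_character = read_character  # the underlying port's read
        self._buffer = []    # every character code obtained so far
        self._position = 0   # index of the next character to hand out

    def read_character(self):
        if self._position == len(self._buffer):
            self._buffer.append(self._read_character())
        code = self._buffer[self._position]
        self._position += 1
        return code

    def mark(self):
        """Return a state that backtrack() can later restore."""
        return self._position

    def backtrack(self, mark):
        self._position = mark


def try_literal(port, literal):
    """Consume literal if it comes next; otherwise rewind and report failure."""
    mark = port.mark()
    for expected in literal:
        if port.read_character() != ord(expected):
            port.backtrack(mark)
            return False
    return True


def reader_over(text):
    """Build a read operation over a string, returning EOF at the end."""
    characters = iter(text)

    def read_character():
        character = next(characters, None)
        return EOF if character is None else ord(character)

    return read_character


port = BacktrackablePort(reader_over("#false"))
matched_true = try_literal(port, "#true")    # fails at the second character
matched_false = try_literal(port, "#false")  # succeeds after the rewind
```

The buffer in this sketch grows without bound; a real implementation
would presumably discard buffered characters once a match is committed.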

The reader is not fast: the system directly uses regular expressions and
recognizes by backtracking, rather than converting regular expressions
into deterministic automata first.  This is quite ironic, since I know
how to do that: I work with automata at my day job, and I actually did
the conversion and determinization in an old implementation of
epsilon, even if it was simplistic and naïve.  However, speed is not a
priority now; I can add those optimizations later.  The system is good
enough to have a read function entirely implemented in epsilon1, called
e1:read.  The REPL is also written in epsilon1: look for repl:repl.

For all of this to work I of course had to specify how to read each
s-expression case: unique objects (#t, #f and ()), s-fixnums, s-strings,
s-characters, s-symbols, and even s-fixedpoints; it's easy to add more
cases now.  Escaping is user-extensible for strings and characters;
escaping for symbols is not very satisfactory but I don't need to
improve it right now; I can live without symbols having spaces in their
name for a while.  Comment syntax is of course also extensible, and I've
even defined the #; syntax commenting out the following s-expression,
like in the relevant Scheme SRFI (SRFI 62).  This is non-trivial, and it
requires a recursive call to the reader; this is an example of a piece of
"lexicon" which can't be covered by regular languages.  Prefixes (such
as ', `, , and ,@) are extensible, and I plan to add some of them for
what in Guile are REPL commands -- which in epsilon1 will expand to
ordinary epsilon0 expressions, usable from anywhere in the code.
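
To see why #; needs a recursive call to the reader, consider this
minimal Python sketch of a toy reader.  It is not e1:read: it knows only
lists, atoms, whitespace and #; comments, and the function names are
made up for the example.  Skipping a comment means parsing one complete
datum, however deeply nested, and discarding it.

```python
def skip_ignored(text, i):
    """Skip whitespace and "#;" comments starting at index i."""
    while True:
        while i < len(text) and text[i].isspace():
            i += 1
        if not text.startswith("#;", i):
            return i
        # Recursive call into the reader: parse one datum and throw it away.
        _discarded, i = read_datum(text, i + 2)


def read_datum(text, i=0):
    """Return (datum, next_index); lists become Python lists, atoms strings."""
    i = skip_ignored(text, i)
    if i >= len(text):
        raise EOFError("no datum left")
    if text[i] == ")":
        raise ValueError("unexpected closing parenthesis")
    if text[i] == "(":
        items, i = [], i + 1
        while True:
            i = skip_ignored(text, i)
            if i >= len(text):
                raise EOFError("unterminated list")
            if text[i] == ")":
                return items, i + 1
            item, i = read_datum(text, i)
            items.append(item)
    start = i
    while i < len(text) and not text[i].isspace() and text[i] not in "()":
        i += 1
    return text[start:i], i


datum, _ = read_datum("(a #;(b (c d)) e #;f)")
print(datum)
```

Here the commented-out datum (b (c d)) contains balanced parentheses,
which is exactly what a regular language cannot match.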

It's easy to build an epsilon1 REPL by unexecing.  Get the latest
master source, configure and make, then enter bootstrap/scheme/.  Then
bootstrap from guile+whatever, and unexec a (repl:repl) call:

--8<---------------cut here---------------start------------->8---
address@hidden ~/repos/epsilon]$ ./configure && make
[..Output...]
address@hidden ~/repos/epsilon]$ cd bootstrap/scheme/
address@hidden ~/repos/epsilon/bootstrap/scheme]$ ulimit -s unlimited
address@hidden ~/repos/epsilon/bootstrap/scheme]$ ../../bin/guile+whatever 
guile> (load "bootstrap.scm") (e1:toplevel (e1:unexec "/tmp/repl" (repl:repl)))
[...Lots of output...]
--8<---------------cut here---------------end--------------->8---

Now the /tmp/repl image contains our native REPL, not using Guile's
reader.  Let's quit guile+whatever with C-d, and run the native REPL
with a faster image interpreter.  The tagged version is safe, and
displays epsilon objects in our usual color notation.

--8<---------------cut here---------------start------------->8---
address@hidden ~/repos/epsilon/bootstrap/scheme]$ ../../bin/epsilon-image-interpreter-tagged /tmp/repl
GNU epsilon git-snapshot
Copyright (C) 2012  Université Paris 13
Copyright (C) 2012-2014  Luca Saiu

GNU epsilon comes with ABSOLUTELY NO WARRANTY.  This program is free software
and you are welcome to redistribute it under the terms of the GNU General
Public License version 3 or later.  See the file named COPYING for details.

> 
--8<---------------cut here---------------end--------------->8---

The new REPL really works, of course, and comes with readline support.

--8<---------------cut here---------------start------------->8---
> 42
[You wrote: 42]
42
> '42
[You wrote: (quote 42)]
0x262d4b8[2 42 0]
> (fixnum:+ 1 2 3 4)
[You wrote: (fixnum:+ 1 2 3 4)]
10
> (e1:define (fact n) (e1:if (fixnum:zero? n) 1 (fixnum:* n (fact (fixnum:1- n)))))
[You wrote: (e1:define (fact n) (e1:if (fixnum:zero? n) 1 (fixnum:* n (fact (fixnum:1- n)))))]
Defining the procedure fact...
> (fact 10)
[You wrote: (fact 10)]
3628800
> #\newline
[You wrote: #\newline]
10
> "\n"
[You wrote: "\n"]
0x1aa7728[1 10]
> "\\n"
[You wrote: "\\n"]
0x1aaad58[2 92 110]
> "中国"
[You wrote: "中国"]
0x1c5c0f8[2 20013 22269]
--8<---------------cut here---------------end--------------->8---

There's no more confusion between Guile and epsilon1 procedures and
macros, and no need for e1:toplevel.  I even reimplemented in epsilon1
the few debugging procedures which were written in Guile.  They are in
the "debug:" namespace, rather than the old "meta:"; and notice that
procedures working on names now work with symbols, and not with
s-expressions:

--8<---------------cut here---------------start------------->8---
> (debug:print-procedure-definition (e1:value fixnum:+))
[You wrote: (debug:print-procedure-definition (e1:value fixnum:+))]
Formals: a b
Body: [primitive fixnum:+ a₃₁ b₃₂]₃₃
> (debug:print-macro-definition (e1:value fixnum:+))
[You wrote: (debug:print-macro-definition (e1:value fixnum:+))]
(e1:destructuring-bind many-parameters arguments (quasiquote 
(variadic:call-associative (unquote (quote 0)) (unquote (quote fixnum:+)) 
(unquote-splicing many-parameters))))
--8<---------------cut here---------------end--------------->8---

However, debug:macroexpand has to work with an s-expression, since its
parameter is an s-expression.  In particular that's an *epsilon1*
s-expression, not a Guile s-expression --- but there's no possibility
for confusion any longer, since we don't have Guile s-expressions at all
here:

--8<---------------cut here---------------start------------->8---
> (debug:macroexpand '(fixnum:+ 1 2 3 4))
[You wrote: (debug:macroexpand (quote (fixnum:+ 1 2 3 4)))]
[call fixnum:+ [call fixnum:+ [call fixnum:+ 1₁₇₅₀₉₃ 2₁₇₅₀₉₄]₁₇₅₀₉₅ 
3₁₇₅₀₉₆]₁₇₅₀₉₇ 4₁₇₅₀₉₈]₁₇₅₀₉₉
> (debug:macroexpand '(fixnum:- 1 2 3 4))
[You wrote: (debug:macroexpand (quote (fixnum:- 1 2 3 4)))]
[call fixnum:- [call fixnum:- [call fixnum:- 1₁₇₅₁₄₈ 2₁₇₅₁₄₉]₁₇₅₁₅₀ 
3₁₇₅₁₅₁]₁₇₅₁₅₂ 4₁₇₅₁₅₃]₁₇₅₁₅₄
> (debug:macroexpand '(fixnum:- 1))
[You wrote: (debug:macroexpand (quote (fixnum:- 1)))]
[call fixnum:negate 1₁₇₅₁₇₉]₁₇₅₁₈₀
--8<---------------cut here---------------end--------------->8---

Isn't it cute?
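
The shape of the expansions in the transcript above can be reproduced
with a small Python sketch.  It is only a guess at what
variadic:call-associative does, based on the output shown: the function
name, its parameters and the behavior for zero or one argument without
a unary operator are assumptions.

```python
def expand_variadic(operator, neutral, arguments, unary=None):
    """Expand (operator a b c ...) into left-nested binary calls."""
    if not arguments:
        # Assumption: no arguments yields the neutral element.
        return neutral
    if len(arguments) == 1:
        # Assumption: one argument uses the unary operator when one is given.
        return [unary, arguments[0]] if unary else arguments[0]
    expansion = arguments[0]
    for argument in arguments[1:]:
        expansion = [operator, expansion, argument]
    return expansion


plus = expand_variadic("fixnum:+", 0, [1, 2, 3, 4])
minus = expand_variadic("fixnum:-", 0, [1], unary="fixnum:negate")
print(plus)
print(minus)
```

The nesting is to the left, matching the transcript: the innermost call
combines the first two arguments.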

Among the nice new features, fixed-point numbers have finally become
convenient to use:

--8<---------------cut here---------------start------------->8---
> 1.2
[You wrote: 1.199981689453125]
78642
--8<---------------cut here---------------end--------------->8---

The surprising result is the internal untyped representation of a
fixed-point number: the raw fixnum is the value scaled by 2^16 (that
is, divided by 2^-16), and 78642 / 2^16 is exactly the
1.199981689453125 echoed above.  Of course we can also print the
intended value of fixed-point numbers with the appropriate procedure:

--8<---------------cut here---------------start------------->8---
> (fio:write (f (fixedpoint:+ 1.2 1.8)) "\n")
[You wrote: (fio:write (f (fixedpoint:+ 1.199981689453125 1.7999267578125)) 
"\n")]
2.999908447265625
--8<---------------cut here---------------end--------------->8---
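
The numbers in the two fixed-point transcripts above can be checked by
hand, or with this short Python sketch.  It assumes 16 fractional bits,
as the 2^-16 factor suggests; the helper name decode is made up for the
example.

```python
from fractions import Fraction

SCALE = 2 ** 16  # 16 fractional bits, assumed from the 2^-16 factor


def decode(raw):
    """Interpret a raw fixnum as an exact fixed-point value."""
    return Fraction(raw, SCALE)


raw_1_2 = 78642  # raw value the REPL printed for 1.2
# Raw value for 1.8, recovered from the echoed 1.7999267578125.
raw_1_8 = int(Fraction("1.7999267578125") * SCALE)
raw_sum = raw_1_2 + raw_1_8  # fixed-point addition is integer addition

print(float(decode(raw_1_2)))
print(raw_1_8)
print(float(decode(raw_sum)))
```

Incidentally 1.2 * 2^16 is 78643.2, so the reader's 78642 is slightly
below the nearest representable value; the echoed 1.199981689453125
reflects that.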

For fixnums we support the full Common Lisp/Scheme syntax, with
any radix between 2 and 36:

--8<---------------cut here---------------start------------->8---
> 14
[You wrote: 14]
14
> #b1110
[You wrote: 14]
14
> #2r1110
[You wrote: 14]
14
> #xe
[You wrote: 14]
14
> #16re
[You wrote: 14]
14
--8<---------------cut here---------------end--------------->8---
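
For readers unfamiliar with this notation, here is a minimal Python
sketch of how such literals can be decoded.  It covers only the forms
shown in the transcript above (#b, #x and #NrDIGITS, unsigned); the
function name and the regular expression are illustrative and not taken
from the epsilon1 reader.

```python
import re

# Only the letter prefixes appearing in the transcript; others could be added.
RADIX_LETTERS = {"b": 2, "x": 16}
PREFIXED = re.compile(r"#(?:([bx])|(\d+)r)([0-9a-z]+)", re.IGNORECASE)


def parse_fixnum(text):
    """Parse an unsigned fixnum literal with an optional radix prefix."""
    match = PREFIXED.fullmatch(text)
    if match is None:
        return int(text, 10)
    letter, radix_digits, digits = match.groups()
    radix = RADIX_LETTERS[letter.lower()] if letter else int(radix_digits)
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36")
    return int(digits, radix)  # raises ValueError on a digit out of range


values = [parse_fixnum(s) for s in ["14", "#b1110", "#2r1110", "#xe", "#16re"]]
print(values)
```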

If you have epsilon code in /tmp/q.e you can load it:

--8<---------------cut here---------------start------------->8---
> (e1:load "/tmp/q.e")
[...output...]
--8<---------------cut here---------------end--------------->8---

Of course e1:load is just an ordinary procedure, internally calling the
reader on a file port, expanding and executing each s-expression result
until the reader returns s-EOF.
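
The loop just described is simple enough to sketch in a few lines of
Python.  The three callables stand in for the reader, the macroexpander
and the interpreter; they and the S_EOF marker are placeholders, not
epsilon1 names.

```python
S_EOF = object()  # stand-in for the reader's s-EOF result


def load(read, expand, execute):
    """Call read() until it returns S_EOF, expanding and executing each result."""
    while True:
        sexpression = read()
        if sexpression is S_EOF:
            return
        execute(expand(sexpression))


# Toy usage: "reading" from a list, "expanding" by upper-casing,
# "executing" by recording the result.
forms = iter(["a", "b"])
executed = []
load(lambda: next(forms, S_EOF), str.upper, executed.append)
print(executed)
```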

The native reader could be optimized, and I'd like symbol completion in
the REPL; but I'm already quite happy with the current state.

Now it's late, and I guess the message about bootstrapping will have to
wait.  Good night,

-- 
Luca Saiu
Home page:   http://ageinghacker.net
GNU epsilon: http://www.gnu.org/software/epsilon
Marionnet:   http://marionnet.org