epsilon-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[epsilon-devel] How to write a simple code generation which is not terri


From: Luca Saiu
Subject: [epsilon-devel] How to write a simple code generation which is not terrible
Date: Wed, 30 Jan 2019 03:27:04 +0100
User-agent: Gnus (Gnus v5.13), GNU Emacs 25.1.50.2, x86_64-unknown-linux-gnu

In order to convince myself that the new stack operand idea, not yet
implemented, was actually useful I wrote a new code generator for the
Structured example, emitting new instructions which are not implemented
yet.  The generated code of course is not executable, but is printed out
as text on stderr.

See the commit 9dab58ff8ce7f8e52d7e0feb532f62947c324565 , as usual from
the jitterlisp branch.

The file is example-vms/structured/structured-code-generator.c .

I wonder if this style of compiling, with a "location" for the result of
each phrase of the language to be translated first demanded by the
translator of its container-phrase, and then possibly refined by the
translator of the contained phrase.  It is too simple an idea for it to
be original, but certainly nobody has ever taught me that.

The Structured language currently uses the stack for nested primitive
uses, instead of temporaries.  For example

  a := a + b + 2

get compiled into

        plus %r0 %r1 %s 
        plus %s 2 %r0 

(output operands are on the right.)

It is funny how primitives nested more deeply still get compiled to
temporaries in the correct order, left-to-right, with no effort:

  print (1 + 2) * (3 - 4)

        plus 1 2 %s 
        minus 3 4 %s 
        times %s %s %s 
        print %s 
        exitvm 

(No, there is no optimization.  For some fun play with AST optimization
in a much more complicated setting, see JitterLisp.)

A language as simple as Structured could use registers for temporaries
instead of a stack; the stack is actually not needed at all right now,
in a language without procedures.  But this version is easy, and works
well as a first didactic example.  Introducing registers as temporaries
on the fly while translating makes it slightly more difficult to reason
on their lifetime --- not that it would be terribly difficult: each
temporary gets allocated as the operand of a primitive, and can always
be reused after the primitive has finished running; this means that the
temporary registers currently in use could be kept, for example, in a
stack associated to a primitive AST, and then be disposed of after
compiling each primitive.

I think that doing this efficiently would somehow muddle the clean, very
compositional beauty of this new code generator.  I would certainly use
it as a beginner example, which is its reason for existing in the first
place.

Let us take this Structured program as an example:

  begin
    a := 100;
    b := 0;
    while a <> 0 do
      a := a - 1;
      b := b + 1
    end;
    print b
  end

The old code generator translated it into this revolting mess:

        push             100
        pop              %r0
        push             0
        pop              %r1
        b                $L13
$L5:
        push             %r0
        push             1
        minus            
        pop              %r0
        push             %r1
        push             1
        plus             
        pop              %r1
$L13:
        push             %r0
        push             0
        different        
        bt               $L5
        push             %r1
        print            
        exitvm           

With the current rewrite rules enabled, this turns into:

        set-r            100, %r0
        set-r            0, %r1
        b                $L9
$L3:
        push             %r0
        minusi           1
        pop              %r0
        push             %r1
        plusi            1
        pop              %r1
$L9:
        push             %r0
        push             0
        different        
        bt               $L3
        push             %r1
        print            
        exitvm           

which is faster, but only marginally less ugly.

Instead, the new generator yields, very easily and naturally, this:

        set-r 100 %r0 
        set-r 0 %r1 
        b $L1 
$L0:
        minus %r0 1 %r0 
        plus %r1 1 %r1 
$L1:
        different %r0 0 %s 
        bt %s $L0 
        print %r1 
        exitvm 

, which I almost like.  It still has the defect of materializing a truth
value and then jumping if it's true, instead of directly branching on
the "different" condition.  This change can be made in a clean way, I
think, so I will do it.


In the end stack operands have been an initial motivation for me to play
with this new code generator, even if in the end they were possibly not
the most important thing I learned from it.

Stack operands remain useful, particularly for handling parameters and
result in procedure calls.  I am quite strongly convinced that they are
the right thing.

Do you have any comments?

Regards,

-- 
Luca Saiu
* GNU epsilon:           http://www.gnu.org/software/epsilon
* My personal web site:  http://ageinghacker.net

I support everyone's freedom of mocking any opinion or belief, no
matter how deeply held, with open disrespect and the same unrelented
enthusiasm of a toddler who has just learned the word "poo".

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]