[Guile-commits] 05/86: Inline body.texi and version.texi into lightning.texi


From: Andy Wingo
Subject: [Guile-commits] 05/86: Inline body.texi and version.texi into lightning.texi
Date: Wed, 3 Apr 2019 11:38:48 -0400 (EDT)

wingo pushed a commit to branch lightening
in repository guile.

commit 414f530c1dac1bda58556e5477b578aa2841ddeb
Author: Andy Wingo <address@hidden>
Date:   Tue Oct 30 11:35:01 2018 +0100

    Inline body.texi and version.texi into lightning.texi
---
 doc/body.texi      | 1679 ---------------------------------------------------
 doc/lightning.texi | 1686 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 doc/version.texi   |    4 -
 3 files changed, 1684 insertions(+), 1685 deletions(-)

diff --git a/doc/body.texi b/doc/body.texi
deleted file mode 100644
index 60f5692..0000000
--- a/doc/body.texi
+++ /dev/null
@@ -1,1679 +0,0 @@
-@ifnottex
-@dircategory Software development
-@direntry
-* lightning: (lightning).       Library for dynamic code generation.
-@end direntry
-@end ifnottex
-
-@ifnottex
-@node Top
-@top @lightning{}
-
-@iftex
-@macro comma
address@hidden|,|}
-@end macro
-@end iftex
-
-@ifnottex
-@macro comma
address@hidden|,|}
-@end macro
-@end ifnottex
-
-This document describes @value{TOPIC} the @lightning{} library for
-dynamic code generation.
-
-@menu
-* Overview::                What GNU lightning is
-* Installation::            Configuring and installing GNU lightning
-* The instruction set::     The RISC instruction set used in GNU lightning
-* GNU lightning examples::  GNU lightning's examples
-* Reentrancy::              Re-entrant usage of GNU lightning
-* Customizations::          Advanced code generation customizations
-* Acknowledgements::        Acknowledgements for GNU lightning
-@end menu
-@end ifnottex
-
-@node Overview
-@chapter Introduction to @lightning{}
-
-@iftex
-This document describes @value{TOPIC} the @lightning{} library for
-dynamic code generation.
-@end iftex
-
-Dynamic code generation is the generation of machine code 
-at runtime. It is typically used to strip a layer of interpretation 
-by allowing compilation to occur at runtime.  One of the most
-well-known applications of dynamic code generation is perhaps that
-of interpreters that compile source code to an intermediate bytecode
-form, which is then recompiled to machine code at run-time: this
-approach effectively combines the portability of bytecode
-representations with the speed of machine code.  Another common
-application of dynamic code generation is in the field of hardware
-simulators and binary emulators, which can use the same techniques
-to translate simulated instructions to the instructions of the 
-underlying machine.
-
-Yet other applications come to mind: for example, windowing
-@dfn{bitblt} operations, matrix manipulations, and network packet
-filters.  Albeit very powerful and relatively well known within the
-compiler community, dynamic code generation techniques are rarely
-exploited to their full potential and, with the exception of the
-two applications described above, have remained curiosities because
-of their portability and functionality barriers: binary instructions
-are generated, so programs using dynamic code generation must be
-retargeted for each machine; in addition, coding a run-time code
-generator is a tedious and error-prone task more than a difficult one.
-
-@lightning{} provides a portable, fast and easily retargetable dynamic
-code generation system. 
-
-To be portable, @lightning{} abstracts over current architectures'
-quirks and unorthogonalities.  The interface that it exposes is that
-of a standardized RISC architecture loosely based on the SPARC and MIPS
-chips.  There are a few general-purpose registers (six, not including
-those used to receive and pass parameters between subroutines), and
-arithmetic operations involve three operands---either three registers
-or two registers and an arbitrarily sized immediate value.
-
-On one hand, this architecture is general enough that it is possible to
-generate pretty efficient code even on CISC architectures such as the
-Intel x86 or the Motorola 68k families.  On the other hand, it matches
-real architectures closely enough that, most of the time, the
-compiler's constant folding pass ends up generating code which
-assembles machine instructions without further tests.
-
-@node Installation
-@chapter Configuring and installing @lightning{}
-
-The first thing to do to use @lightning{} is to configure the
-program, picking the set of macros to be used on the host
-architecture; this configuration is automatically performed by
-the @file{configure} shell script; to run it, merely type:
-@example
-     ./configure
-@end example
-
-@file{configure} supports the @code{--enable-disassembler} option, which
-enables linking to GNU binutils and optionally prints a human-readable
-disassembly of the jit code. This option can be disabled with the
-@code{--disable-disassembler} option.
-
-Another option that @file{configure} accepts is
-@code{--enable-assertions}, which enables several consistency checks in
-the run-time assemblers.  These are not usually needed, so you can
-decide to simply forget about it; also remember that these consistency
-checks tend to slow down your code generator.
-
-After you've configured @lightning{}, run @file{make} as usual.
-
-@lightning{} has an extensive set of tests to validate that it works
-correctly on the build host. To run them, type:
-@example
-    make check
-@end example
-
-The next important step is:
-@example
-    make install
-@end example
-
-This ends the process of installing @lightning{}.
-
-@node The instruction set
-@chapter @lightning{}'s instruction set
-
-@lightning{}'s instruction set was designed by deriving instructions
-that closely match those of most existing RISC architectures, or
-that can be easily synthesized if absent.  Each instruction is composed
-of:
-@itemize @bullet
-@item
-an operation, like @code{sub} or @code{mul}
-
-@item
-most times, a register/immediate flag (@code{r} or @code{i})
-
-@item
-an unsigned modifier (@code{u}), a type identifier or two, when applicable.
-@end itemize
-
-Examples of legal mnemonics are @code{addr} (integer add, with three
-register operands) and @code{muli} (integer multiply, with two
-register operands and an immediate operand).  Each instruction takes
-two or three operands; in most cases, one of them can be an immediate
-value instead of a register.
-
-Most @lightning{} integer operations are signed wordsize operations,
-with the exception of operations that convert types, or load or store
-values to/from memory. When applicable, the types and C types are as
-follows:
-
-@example
-     _c         @r{signed char}
-     _uc        @r{unsigned char}
-     _s         @r{short}
-     _us        @r{unsigned short}
-     _i         @r{int}
-     _ui        @r{unsigned int}
-     _l         @r{long}
-     _f         @r{float}
-     _d         @r{double}
-@end example
-
-Most integer operations do not need a type modifier, and when loading or
-storing values to memory there is an alias to the proper operation
-using wordsize operands, that is, if omitted, the type is @r{int} on
-32-bit architectures and @r{long} on 64-bit architectures.  Note
-that lightning also expects @code{sizeof(void*)} to match the wordsize.
-
-When an unsigned operation result differs from the equivalent signed
-operation, there is the @code{_u} modifier.
-
-There are at least seven integer registers, of which six are
-general-purpose, while the last is used to contain the frame pointer
-(@code{FP}).  The frame pointer can be used to allocate and access local
-variables on the stack, using the @code{allocai} or @code{allocar}
-instruction.
-
-Of the general-purpose registers, at least three are guaranteed to be
-preserved across function calls (@code{V0}, @code{V1} and
-@code{V2}) and at least three are not (@code{R0}, @code{R1} and
-@code{R2}).  Six registers are not very much, but this
-restriction was forced by the need to target CISC architectures
-which, like the x86, are short on registers; anyway, backends can
-specify the actual number of available registers with the macros
-@code{JIT_R_NUM} (for caller-save registers) and @code{JIT_V_NUM}
-(for callee-save registers).
-
-There are at least six floating-point registers, named @code{F0} to
-@code{F5}.  These are usually caller-save and are separate from the integer
-registers on the supported architectures; on Intel architectures,
-in 32 bit mode if SSE2 is not available or use of X87 is forced,
-the register stack is mapped to a flat register file.  As for the
-integer registers, the macro @code{JIT_F_NUM} yields the number of
-floating-point registers.
-
-The complete instruction set follows; as you can see, most non-memory
-operations only take integers (either signed or unsigned) as operands;
-this was done in order to reduce the instruction set, and because most
-architectures only provide word and long word operations on registers.
-There are instructions that allow operands to be extended to fit a larger
-data type, both in a signed and in an unsigned way.
-
-@table @b
-@item Binary ALU operations
-These accept three operands; the last one can be an immediate.
-@code{addx} operations must directly follow @code{addc}, and
-@code{subx} must follow @code{subc}; otherwise, results are undefined.
-Most, if not all, architectures do not support @r{float} or @r{double}
-immediate operands; lightning emulates those operations by moving the
-immediate to a temporary register and emitting the call with only
-register operands.
-@example
-addr         _f  _d  O1 = O2 + O3
-addi         _f  _d  O1 = O2 + O3
-addxr                O1 = O2 + (O3 + carry)
-addxi                O1 = O2 + (O3 + carry)
-addcr                O1 = O2 + O3, set carry
-addci                O1 = O2 + O3, set carry
-subr         _f  _d  O1 = O2 - O3
-subi         _f  _d  O1 = O2 - O3
-subxr                O1 = O2 - (O3 + carry)
-subxi                O1 = O2 - (O3 + carry)
-subcr                O1 = O2 - O3, set carry
-subci                O1 = O2 - O3, set carry
-rsbr         _f  _d  O1 = O3 - O2
-rsbi         _f  _d  O1 = O3 - O2
-mulr         _f  _d  O1 = O2 * O3
-muli         _f  _d  O1 = O2 * O3
-divr     _u  _f  _d  O1 = O2 / O3
-divi     _u  _f  _d  O1 = O2 / O3
-remr     _u          O1 = O2 % O3
-remi     _u          O1 = O2 % O3
-andr                 O1 = O2 & O3
-andi                 O1 = O2 & O3
-orr                  O1 = O2 | O3
-ori                  O1 = O2 | O3
-xorr                 O1 = O2 ^ O3
-xori                 O1 = O2 ^ O3
-lshr                 O1 = O2 << O3
-lshi                 O1 = O2 << O3
-rshr     _u          O1 = O2 >> O3@footnote{The sign bit is propagated unless using the @code{_u} modifier.}
-rshi     _u          O1 = O2 >> O3@footnote{The sign bit is propagated unless using the @code{_u} modifier.}
-@end example
-
-@item Four operand binary ALU operations
-These accept two result registers, and two operands; the last one can
-be an immediate. The first two arguments cannot be the same register.
-
-@code{qmul} stores the low word of the result in @code{O1} and the
-high word in @code{O2}. For unsigned multiplication, @code{O2} zero
-means there was no overflow. For signed multiplication, the overflow
-check depends on the sign: there was no overflow if @code{O2} is zero
-or minus one.
-
-@code{qdiv} stores the quotient in @code{O1} and the remainder in
-@code{O2}. It can be used as a quick way to check if a division is
-exact, in which case the remainder is zero.
-
-@example
-qmulr    _u       O1 O2 = O3 * O4
-qmuli    _u       O1 O2 = O3 * O4
-qdivr    _u       O1 O2 = O3 / O4
-qdivi    _u       O1 O2 = O3 / O4
-@end example
-
-@item Unary ALU operations
-These accept two operands, both of which must be registers.
-@example
-negr         _f  _d  O1 = -O2
-comr                 O1 = ~O2
-@end example
-
-These unary ALU operations are only defined for float operands.
-@example
-absr         _f  _d  O1 = fabs(O2)
-sqrtr        _f  _d  O1 = sqrt(O2)
-@end example
-
-All unary operations require the @code{r} modifier; there are no unary
-operations with an immediate operand.
-
-@item Compare instructions
-These accept three operands; again, the last can be an immediate.
-The last two operands are compared, and the first operand, which must be
-an integer register, is set to 1 if the given condition is met and
-to 0 otherwise.
-
-The conditions given below are for the standard behavior of C,
-where the ``unordered'' comparison result is mapped to false.
-
-@example
-ltr       _u  _f  _d  O1 =  (O2 <  O3)
-lti       _u  _f  _d  O1 =  (O2 <  O3)
-ler       _u  _f  _d  O1 =  (O2 <= O3)
-lei       _u  _f  _d  O1 =  (O2 <= O3)
-gtr       _u  _f  _d  O1 =  (O2 >  O3)
-gti       _u  _f  _d  O1 =  (O2 >  O3)
-ger       _u  _f  _d  O1 =  (O2 >= O3)
-gei       _u  _f  _d  O1 =  (O2 >= O3)
-eqr           _f  _d  O1 =  (O2 == O3)
-eqi           _f  _d  O1 =  (O2 == O3)
-ner           _f  _d  O1 =  (O2 != O3)
-nei           _f  _d  O1 =  (O2 != O3)
-unltr         _f  _d  O1 = !(O2 >= O3)
-unler         _f  _d  O1 = !(O2 >  O3)
-ungtr         _f  _d  O1 = !(O2 <= O3)
-unger         _f  _d  O1 = !(O2 <  O3)
-uneqr         _f  _d  O1 = !(O2 <  O3) && !(O2 >  O3)
-ltgtr         _f  _d  O1 = !(O2 >= O3) || !(O2 <= O3)
-ordr          _f  _d  O1 =  (O2 == O2) &&  (O3 == O3)
-unordr        _f  _d  O1 =  (O2 != O2) ||  (O3 != O3)
-@end example
-
-@item Transfer operations
-These accept two operands; for @code{ext} both of them must be
-registers, while @code{mov} accepts an immediate value as the second
-operand.
-
-Unlike @code{movr} and @code{movi}, the other instructions are used
-to truncate a wordsize operand to a smaller integer data type or to
-convert float data types. You can also use @code{extr} to convert an
-integer to a floating point value: the usual options are @code{extr_f}
-and @code{extr_d}.
-
-@example
-movr                                 _f  _d  O1 = O2
-movi                                 _f  _d  O1 = O2
-extr      _c  _uc  _s  _us  _i  _ui  _f  _d  O1 = O2
-truncr                               _f  _d  O1 = trunc(O2)
-@end example
-
-On 64-bit architectures it may be required to use @code{truncr_f_i},
-@code{truncr_f_l}, @code{truncr_d_i} and @code{truncr_d_l} to match
-the equivalent C code.  Only the @code{_i} modifier is available on
-32-bit architectures.
-
-@example
-truncr_f_i    = <int> O1 = <float> O2
-truncr_f_l    = <long>O1 = <float> O2
-truncr_d_i    = <int> O1 = <double>O2
-truncr_d_l    = <long>O1 = <double>O2
-@end example
-
-The float conversion operations are @emph{destination first,
-source second}, but the order of the types is reversed.  This happens
-for historical reasons.
-
-@example
-extr_f_d    = <double>O1 = <float> O2
-extr_d_f    = <float> O1 = <double>O2
-@end example
-
-@item Network extensions
-These accept two operands, both of which must be registers; these
-two instructions actually perform the same task, yet they are
-assigned to two mnemonics for the sake of convenience and
-completeness.  As usual, the first operand is the destination and
-the second is the source.
-The @code{_ul} variant is only available in 64-bit architectures.
-@example
-htonr    _us _ui _ul @r{Host-to-network (big endian) order}
-ntohr    _us _ui _ul @r{Network-to-host order }
-@end example
-
-@item Load operations
-@code{ld} accepts two operands while @code{ldx} accepts three;
-in both cases, the last can be either a register or an immediate
-value. Values are extended (with or without sign, according to
-the data type specification) to fit a whole register.
-The @code{_ui} and @code{_l} types are only available in 64-bit
-architectures.  For convenience, there is a version without a
-type modifier for integer or pointer operands that uses the
-appropriate wordsize call.
-@example
-ldr     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *O2
-ldi     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *O2
-ldxr    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *(O2+O3)
-ldxi    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *(O2+O3)
-@end example
-
-@item Store operations
-@code{st} accepts two operands while @code{stx} accepts three; in
-both cases, the first can be either a register or an immediate
-value. Values are sign-extended to fit a whole register.
-@example
-str     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *O1 = O2
-sti     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *O1 = O2
-stxr    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *(O1+O2) = O3
-stxi    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *(O1+O2) = O3
-@end example
-As for the load operations, the @code{_ui} and @code{_l} types are
-only available in 64-bit architectures, and for convenience, there
-is a version without a type modifier for integer or pointer operands
-that uses the appropriate wordsize call.
-
-@item Argument management
-These are:
-@example
-prepare     (not specified)
-va_start    (not specified)
-pushargr                                   _f  _d
-pushargi                                   _f  _d
-va_push     (not specified)
-arg         _c  _uc  _s  _us  _i  _ui  _l  _f  _d
-getarg      _c  _uc  _s  _us  _i  _ui  _l  _f  _d
-va_arg                                         _d
-putargr                                    _f  _d
-putargi                                    _f  _d
-ret         (not specified)
-retr                                       _f  _d
-reti                                       _f  _d
-va_end      (not specified)
-retval      _c  _uc  _s  _us  _i  _ui  _l  _f  _d
-epilog      (not specified)
-@end example
-As with other operations that use a type modifier, the @code{_ui} and
-@code{_l} types are only available in 64-bit architectures, but there
-are operations without a type modifier that alias to the appropriate
-integer operation with wordsize operands.
-
-@code{prepare}, @code{pusharg}, and @code{retval} are used by the caller,
-while @code{arg}, @code{getarg} and @code{ret} are used by the callee.
-A code snippet that wants to call another procedure and has to pass
-arguments must, in order: use the @code{prepare} instruction and use
-the @code{pushargr} or @code{pushargi} to push the arguments @strong{in
-left to right order}; and use @code{finish} or @code{call} (explained below)
-to perform the actual call.
-
-@code{va_start} returns a @code{C} compatible @code{va_list}. To fetch
-arguments, use @code{va_arg} for integers and @code{va_arg_d} for doubles.
-@code{va_push} is required when passing a @code{va_list} to another function,
-because not all architectures expect it as a single pointer. A known case
-is DEC Alpha, which requires it as a structure passed by value.
-
-@code{arg}, @code{getarg} and @code{putarg} are used by the callee.
-@code{arg} is different from other instructions in that it does not
-actually generate any code: instead, it is a function which returns
-a value to be passed to @code{getarg} or @code{putarg}. @footnote{``Return
-a value'' means that @lightning{} code that compiles these
-instructions returns a value when expanded.} You should call
-@code{arg} as soon as possible, before any function call or, more
-easily, right after the @code{prolog} instruction
-(which is treated later).
-
-@code{getarg} accepts a register argument and a value returned by
-@code{arg}, and will move that argument to the register, extending
-it (with or without sign, according to the data type specification)
-to fit a whole register.  These instructions are more intimately
-related to the usage of the @lightning{} instruction set in code
-that generates other code, so they will be treated more
-specifically in @ref{GNU lightning examples, , Generating code at
-run-time}.
-
-@code{putarg} is a mix of @code{getarg} and @code{pusharg} in that
-it accepts as first argument a register or immediate, and as
-second argument a value returned by @code{arg}. It allows changing
-or restoring an argument to the current function, and is a
-construct required to implement tail call optimization. Note that
-arguments in registers are very cheap, but may be overwritten
-at any moment, including by some operations, for example division,
-that on several ports are implemented as a function call.
-
-Finally, the @code{retval} instruction fetches the return value of a
-called function in a register.  The @code{retval} instruction takes a
-register argument and copies the return value of the previously called
-function in that register.  A function with a return value should use
-@code{retr} or @code{reti} to put the return value in the return register
-before returning.  @xref{Fibonacci, the Fibonacci numbers}, for an example.
-
-@code{epilog} is an optional call that marks the end of a function
-body. It is automatically generated by @lightning{} when starting a new
-function (which should be done after a @code{ret} call) or when finishing
-jit generation.
-Note that, because @code{epilog} is optional, a common mistake is
-possible. Consider this:
-@example
-fun1:
-    prolog
-    ...
-    ret
-fun2:
-    prolog
-@end example
-Because @code{epilog} is added when finding a new @code{prolog},
-this will cause the @code{fun2} label to actually be before the
-return from @code{fun1}, because @lightning{} will actually
-understand it as:
-@example
-fun1:
-    prolog
-    ...
-    ret
-fun2:
-    epilog
-    prolog
-@end example
-
-You should observe a few rules when using these macros.  First of
-all, if calling a varargs function, you should use the @code{ellipsis}
-call to mark the position of the ellipsis in the C prototype.
-
-You should not nest calls to @code{prepare} inside a
-@code{prepare/finish} block.  Doing this will result in undefined
-behavior. Note that for functions with zero arguments you can use
-just @code{call}.
-
-@item Branch instructions
-Like @code{arg}, these also return a value which, in this case,
-is to be used to compile forward branches as explained in
-@ref{Fibonacci, , Fibonacci numbers}.  They accept two operands to be
-compared; of these, the last can be either a register or an immediate.
-They are:
-@example
-bltr      _u  _f  _d  @r{if }(O2 <  O3)@r{ goto }O1
-blti      _u  _f  _d  @r{if }(O2 <  O3)@r{ goto }O1
-bler      _u  _f  _d  @r{if }(O2 <= O3)@r{ goto }O1
-blei      _u  _f  _d  @r{if }(O2 <= O3)@r{ goto }O1
-bgtr      _u  _f  _d  @r{if }(O2 >  O3)@r{ goto }O1
-bgti      _u  _f  _d  @r{if }(O2 >  O3)@r{ goto }O1
-bger      _u  _f  _d  @r{if }(O2 >= O3)@r{ goto }O1
-bgei      _u  _f  _d  @r{if }(O2 >= O3)@r{ goto }O1
-beqr          _f  _d  @r{if }(O2 == O3)@r{ goto }O1
-beqi          _f  _d  @r{if }(O2 == O3)@r{ goto }O1
-bner          _f  _d  @r{if }(O2 != O3)@r{ goto }O1
-bnei          _f  _d  @r{if }(O2 != O3)@r{ goto }O1
-
-bunltr        _f  _d  @r{if }!(O2 >= O3)@r{ goto }O1
-bunler        _f  _d  @r{if }!(O2 >  O3)@r{ goto }O1
-bungtr        _f  _d  @r{if }!(O2 <= O3)@r{ goto }O1
-bunger        _f  _d  @r{if }!(O2 <  O3)@r{ goto }O1
-buneqr        _f  _d  @r{if }!(O2 <  O3) && !(O2 >  O3)@r{ goto }O1
-bltgtr        _f  _d  @r{if }!(O2 >= O3) || !(O2 <= O3)@r{ goto }O1
-bordr         _f  _d  @r{if } (O2 == O2) &&  (O3 == O3)@r{ goto }O1
-bunordr       _f  _d  @r{if } (O2 != O2) ||  (O3 != O3)@r{ goto }O1
-
-bmsr                  @r{if }O2 &  O3@r{ goto }O1
-bmsi                  @r{if }O2 &  O3@r{ goto }O1
-bmcr                  @r{if }!(O2 & O3)@r{ goto }O1
-bmci                  @r{if }!(O2 & O3)@r{ goto }O1@footnote{These mnemonics mean, respectively, @dfn{branch if mask set} and @dfn{branch if mask cleared}.}
-boaddr    _u          O2 += O3@r{, goto }O1@r{ if overflow}
-boaddi    _u          O2 += O3@r{, goto }O1@r{ if overflow}
-bxaddr    _u          O2 += O3@r{, goto }O1@r{ if no overflow}
-bxaddi    _u          O2 += O3@r{, goto }O1@r{ if no overflow}
-bosubr    _u          O2 -= O3@r{, goto }O1@r{ if overflow}
-bosubi    _u          O2 -= O3@r{, goto }O1@r{ if overflow}
-bxsubr    _u          O2 -= O3@r{, goto }O1@r{ if no overflow}
-bxsubi    _u          O2 -= O3@r{, goto }O1@r{ if no overflow}
-@end example
-
-@item Jump and return operations
-These accept one argument except @code{ret} and @code{jmpi} which
-have none; the difference between @code{finishi} and @code{calli}
-is that the latter does not clean the stack from pushed parameters
-(if any) and the former must @strong{always} follow a @code{prepare}
-instruction.
-@example
-callr     (not specified)                @r{function call to register O1}
-calli     (not specified)                @r{function call to immediate O1}
-finishr   (not specified)                @r{function call to register O1}
-finishi   (not specified)                @r{function call to immediate O1}
-jmpr      (not specified)                @r{unconditional jump to register}
-jmpi      (not specified)                @r{unconditional jump}
-ret       (not specified)                @r{return from subroutine}
-retr      _c _uc _s _us _i _ui _l _f _d
-reti      _c _uc _s _us _i _ui _l _f _d
-retval    _c _uc _s _us _i _ui _l _f _d  @r{move return value}
-                                         @r{to register}
-@end example
-
-Like branch instructions, @code{jmpi} also returns a value which is to
-be used to compile forward branches. @xref{Fibonacci, , Fibonacci
-numbers}.
-
-@item Labels
-There are 3 @lightning{} instructions to create labels:
-@example
-label     (not specified)                @r{simple label}
-forward   (not specified)                @r{forward label}
-indirect  (not specified)                @r{special simple label}
-@end example
-
-@code{label} is normally used as @code{patch_at} argument for backward
-jumps.
-
-@example
-        jit_node_t *jump, *label;
-label = jit_label();
-        ...
-        jump = jit_beqr(JIT_R0, JIT_R1);
-        jit_patch_at(jump, label);
-@end example
-
-@code{forward} is used to patch code generation before the actual
-position of the label is known.
-
-@example
-        jit_node_t *jump, *label;
-label = jit_forward();
-        jump = jit_beqr(JIT_R0, JIT_R1);
-        jit_patch_at(jump, label);
-        ...
-        jit_link(label);
-@end example
-
-@code{indirect} is useful when creating jump tables, and tells
-@lightning{} to not optimize out a label that is not the target of
-any jump, because an indirect jump may land where it is defined.
-
-@example
-        jit_node_t *jump, *label;
-        ...
-        jit_jmpr(JIT_R0);                @rem{/* may jump to label */}
-        ...
-label = jit_indirect();
-@end example
-
-@code{indirect} is a special case of @code{note} and @code{name}
-because it is a valid argument to @code{address}.
-
-Note that the usual idiom to write the previous example is
-@example
-        jit_node_t *addr, *jump;
-addr  = jit_movi(JIT_R0, 0);             @rem{/* immediate is ignored */}
-        ...
-        jit_jmpr(JIT_R0);
-        ...
-        jit_patch(addr);                 @rem{/* implicit label added */}
-@end example
-
-that automatically binds the implicit label added by @code{patch} with
-the @code{movi}, but on some special conditions it is required to create
-an "unbound" label.
-
-@item Function prolog
-
-These macros are used to set up a function prolog.  The @code{allocai}
-call accepts a single integer argument and returns an offset value
-for stack storage access.  The @code{allocar} call accepts two register
-arguments: the first is set to the offset for stack access, and the
-second holds the size in bytes.
-
-@example
-prolog    (not specified)                @r{function prolog}
-allocai   (not specified)                @r{reserve space on the stack}
-allocar   (not specified)                @r{allocate space on the stack}
-@end example
-
-@code{allocai} receives the number of bytes to allocate and returns
-the offset from the frame pointer register @code{FP} to the base of
-the area.
-
-@code{allocar} receives two register arguments.  The first is where
-to store the offset from the frame pointer register @code{FP} to the
-base of the area.  The second argument is the size in bytes.  Note
-that @code{allocar} is dynamic allocation, and special attention
-should be taken when using it.  If called in a loop, every iteration
-will allocate stack space.  Stack space is aligned from 8 to 64 bytes
-depending on backend requirements, even if allocating only one byte.
-It is advisable to not use it with @code{frame} and @code{tramp}; it
-should work with @code{frame} with special care to call only once,
-but is not supported if used in @code{tramp}, even if called only
-once.
-
-As a small appetizer, here is a small function that adds 1 to the input
-parameter (an @code{int}).  I'm using an assembly-like syntax here which
-is a bit different from the one used when writing real subroutines with
-@lightning{}; the real syntax will be introduced in @ref{GNU lightning
-examples, , Generating code at run-time}.
-
-@example
-incr:
-     prolog
-in = arg                     @rem{! We have an integer argument}
-     getarg    R0, in        @rem{! Move it to R0}
-     addi      R0, R0, 1     @rem{! Add 1}
-     retr      R0            @rem{! And return the result}
-@end example
-
-And here is another function which uses the @code{printf} function from
-the standard C library to write a number in hexadecimal notation:
-
-@example
-printhex:
-     prolog
-in = arg                     @rem{! Same as above}
-     getarg    R0, in
-     prepare                 @rem{! Begin call sequence for printf}
-     pushargi  "%x"          @rem{! Push format string}
-     ellipsis                @rem{! Varargs start here}
-     pushargr  R0            @rem{! Push second argument}
-     finishi   printf        @rem{! Call printf}
-     ret                     @rem{! Return to caller}
-@end example
-
-@item Trampolines, continuations and tail call optimization
-
-Frequently it is required to generate jit code that must jump to
-code generated later, possibly from another @code{jit_context_t}.
-These require compatible stack frames.
-
-@lightning{} provides two primitives from which trampolines,
-continuations and tail call optimization can be implemented.
-
-@example
-frame   (not specified)                  @r{create stack frame}
-tramp   (not specified)                  @r{assume stack frame}
-@end example
-
-@code{frame} receives an integer argument@footnote{It is not
-automatically computed because it does not know about the
-requirement of later generated code.} that defines the size in
-bytes for the stack frame of the current, @code{C} callable,
-jit function. To calculate this value, a good formula is maximum
-number of arguments to any called native function times
-eight@footnote{Times eight so that it works for double arguments.
-And would not need conditionals for ports that pass arguments in
-the stack.}, plus the sum of the arguments to any call to
-@code{jit_allocai}.  @lightning{} automatically adjusts this value
-for any backend specific stack memory it may need, or any
-alignment constraint.
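
The rule of thumb above can be written down as a small helper. This is only a sketch in plain C: `frame_size_estimate` is a hypothetical name, not part of the library, and it assumes that the "sum of the arguments" refers to the byte counts requested for stack allocations (the `allocai` sizes described elsewhere in this manual).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a lightning API: estimate the value to pass
 * to `frame` from the rule of thumb in the text. */
static size_t frame_size_estimate(size_t max_call_args,
                                  const size_t *alloc_sizes,
                                  size_t n_allocs)
{
    size_t total;
    size_t i;

    assert(alloc_sizes != NULL || n_allocs == 0);

    /* Eight bytes per argument slot, so that doubles also fit. */
    total = max_call_args * 8;

    /* Plus every stack allocation requested by the function. */
    for (i = 0; i < n_allocs; i++)
        total += alloc_sizes[i];

    return total;
}
```

For a function whose widest native call takes four arguments and which reserves 128 and 32 bytes of scratch space, the estimate is 4 * 8 + 128 + 32 bytes; the library is then expected to round this up for alignment on its own.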
-
-@code{frame} also instructs @lightning{} to save all callee
-save registers in the prolog and reload in the epilog.
-
-@example
-main:                        @rem{! jit entry point}
-     prolog                  @rem{! function prolog}
-     frame  256              @rem{! save all callee save registers and}
-                             @rem{! reserve at least 256 bytes in stack}
-main_loop:
-     ...
-     jmpi   handler          @rem{! jumps to external code}
-     ...
-     ret                     @rem{! return to the caller}
-@end example
-
-@code{tramp} differs from @code{frame} only in that a prolog and epilog
-will not be generated. Note that @code{prolog} must still be used.
-The code under @code{tramp} must be ready to be entered with a jump
-at the prolog position, and instead of a return, it must end with
-an unconditional jump. @code{tramp} exists solely because
-it allows optimizing out prolog and epilog code that would
-never be executed.
-
-@example
-handler:                     @rem{! handler entry point}
-     prolog                  @rem{! function prolog}
-     tramp  256              @rem{! assumes all callee save registers}
-                             @rem{! are saved and there is at least}
-                             @rem{! 256 bytes in stack}
-     ...
-     jmpi   main_loop        @rem{! return to the main loop}
-@end example
-
-@lightning{} only supports Tail Call Optimization using the
-@code{tramp} construct. Any other way is not guaranteed to
-work on all ports.
-
-An example of a simple (recursive) tail call optimization:
-
-@example
-factorial:                   @rem{! Entry point of the factorial function}
-     prolog
-in = arg                     @rem{! Receive an integer argument}
-     getarg R0, in           @rem{! Move argument to R0}
-     prepare
-         pushargi 1          @rem{! This is the accumulator}
-         pushargr R0         @rem{! This is the argument}
-     finishi fact            @rem{! Call the tail call optimized function}
-     retval R0               @rem{! Fetch the result}
-     retr R0                 @rem{! Return it}
-     epilog                  @rem{! Epilog *before* label before prolog}
-
-fact:                        @rem{! Entry point of the helper function}
-     prolog
-     frame 16                @rem{! Reserve 16 bytes in the stack}
-fact_entry:                  @rem{! This is the tail call entry point}
-ac = arg                     @rem{! The accumulator is the first argument}
-in = arg                     @rem{! The factorial argument}
-     getarg R0, ac           @rem{! Move the accumulator to R0}
-     getarg R1, in           @rem{! Move the argument to R1}
-     blei fact_out, R1, 1    @rem{! Done if argument is one or less}
-     mulr R0, R0, R1         @rem{! accumulator *= argument}
-     putargr R0, ac          @rem{! Update the accumulator}
-     subi R1, R1, 1          @rem{! argument -= 1}
-     putargr R1, in          @rem{! Update the argument}
-     jmpi fact_entry         @rem{! Tail Call Optimize it!}
-fact_out:
-     retr R0                 @rem{! Return the accumulator}
-@end example
-
-@section Predicates
-@example
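
For readers who prefer C to the assembly-like notation, the control flow of the `fact` helper above can be paraphrased as follows. This is a sketch, not generated code: the function names are illustrative, and the `goto` stands in for the `jmpi fact_entry` that re-enters the function after both argument slots have been overwritten.

```c
#include <assert.h>

/* Plain C paraphrase of the `fact` helper: the argument slots `ac`
 * and `in` are updated in place and control jumps back to the entry
 * label instead of making a new call. */
static long fact_tail(long ac, long in)
{
    assert(in >= 0);
fact_entry:
    if (in <= 1)            /* blei fact_out, R1, 1 */
        return ac;          /* retr R0              */
    ac *= in;               /* mulr, then putargr ac */
    in -= 1;                /* subi, then putargr in */
    goto fact_entry;        /* jmpi fact_entry      */
}

/* Counterpart of the `factorial` wrapper: seeds the accumulator. */
static long factorial(long n)
{
    return fact_tail(1, n);
}
```

Because the recursive step is a jump rather than a call, stack usage stays constant no matter how large the argument is.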
-forward_p      (not specified)           @r{forward label predicate}
-indirect_p     (not specified)           @r{indirect label predicate}
-target_p       (not specified)           @r{used label predicate}
-arg_register_p (not specified)           @r{argument kind predicate}
-callee_save_p  (not specified)           @r{callee save predicate}
-pointer_p      (not specified)           @r{pointer predicate}
-@end example
-
-@code{forward_p} expects a @code{jit_node_t*} argument, and
-returns non zero if it is a forward label reference, that is,
-a label returned by @code{forward}, that still needs a
-@code{link} call.
-
-@code{indirect_p} expects a @code{jit_node_t*} argument, and returns
-non zero if it is an indirect label reference, that is, a label that
-was returned by @code{indirect}.
-
-@code{target_p} expects a @code{jit_node_t*} argument, that is any
-kind of label, and will return non zero if there is at least one
-jump or move referencing it.
-
-@code{arg_register_p} expects a @code{jit_node_t*} argument, that must
-have been returned by @code{arg}, @code{arg_f} or @code{arg_d}, and
-will return non zero if the argument lives in a register. This call
-is useful to know the live range of register arguments, as those
-are very fast to read and write, but have volatile values.
-
-@code{callee_save_p} expects a valid @code{JIT_Rn}, @code{JIT_Vn}, or
-@code{JIT_Fn}, and will return non zero if the register is callee
-save. This call is useful because on several ports, the @code{JIT_Rn}
-and @code{JIT_Fn} registers are actually callee save; no need
-to save and load the values when making function calls.
-
-@code{pointer_p} expects a pointer argument, and will return non
-zero if the pointer is inside the generated jit code. Must be
-called after @code{jit_emit} and before @code{jit_destroy_state}.
-@end table
-
-@node GNU lightning examples
-@chapter Generating code at run-time
-
-To use @lightning{}, you should include the @file{lightning.h} file that
-is put in your include directory by the @samp{make install} command.
-
-Each of the instructions above translates to a macro or function call.
-All you have to do is prepend @code{jit_} (lowercase) to opcode names
-and @code{JIT_} (uppercase) to register names.  Of course, parameters
-are to be put between parentheses.
-
-This small tutorial presents four examples:
-
-@iftex
-@itemize @bullet
-@item
-The @code{incr} function found in @ref{The instruction set, ,
-@lightning{}'s instruction set}:
-
-@item
-A simple function call to @code{printf}
-
-@item
-An RPN calculator.
-
-@item
-Fibonacci numbers
-@end itemize
-@end iftex
-@ifnottex
-@menu
-* incr::             A function which increments a number by one
-* printf::           A simple function call to printf
-* RPN calculator::   A more complex example, an RPN calculator
-* Fibonacci::        Calculating Fibonacci numbers
-@end menu
-@end ifnottex
-
-@node incr
-@section A function which increments a number by one
-
-Let's see how to create and use the sample @code{incr} function created
-in @ref{The instruction set, , @lightning{}'s instruction set}:
-
-@example
-#include <stdio.h>
-#include <lightning.h>
-
-static jit_state_t *_jit;
-
-typedef int (*pifi)(int);    @rem{/* Pointer to Int Function of Int */}
-
-int main(int argc, char *argv[])
-@{
-  jit_node_t  *in;
-  pifi         incr;
-
-  init_jit(argv[0]);
-  _jit = jit_new_state();
-
-  jit_prolog();                    @rem{/* @t{     prolog             } */}
-  in = jit_arg();                  @rem{/* @t{     in = arg           } */}
-  jit_getarg(JIT_R0, in);          @rem{/* @t{     getarg R0          } */}
-  jit_addi(JIT_R0, JIT_R0, 1);     @rem{/* @t{     addi   R0@comma{} R0@comma{} 1   } */}
-  jit_retr(JIT_R0);                @rem{/* @t{     retr   R0          } */}
-
-  incr = jit_emit();
-  jit_clear_state();
-
-  @rem{/* call the generated code@comma{} passing 5 as an argument */}
-  printf("%d + 1 = %d\n", 5, incr(5));
-
-  jit_destroy_state();
-  finish_jit();
-  return 0;
-@}
-@end example
-
-Let's examine the code line by line (well, almost@dots{}):
-
-@table @t
-@item #include <lightning.h>
-You already know about this.  It defines all of @lightning{}'s macros.
-
-@item static jit_state_t *_jit;
-You might wonder what @code{jit_state_t} is.  It is a structure
-that stores jit code generation information.  The name @code{_jit} is
-special: since multiple jit generators can run at the same
-time, you must either @r{#define _jit my_jit_state} or name it
-@code{_jit}.
-
-@item typedef int (*pifi)(int);
-Just a handy typedef for a pointer to a function that takes an
-@code{int} and returns another.
-
-@item jit_node_t  *in;
-Declares a variable to hold an identifier for a function argument. It
-is an opaque pointer, that will hold the return of a call to @code{arg}
-and be used as argument to @code{getarg}.
-
-@item pifi         incr;
-Declares a function pointer variable to a function that receives an
-@code{int} and returns an @code{int}.
-
-@item init_jit(argv[0]);
-You must call this function before creating a @code{jit_state_t}
-object. This function does global state initialization, and may need
-to detect CPU or Operating System features.  It receives a string
-argument that is later used to read symbols from a shared object using
-GNU binutils if disassembly was enabled at configure time. If no
-disassembly will be performed a NULL pointer can be used as argument.
-
-@item _jit = jit_new_state();
-This call initializes a @lightning{} jit state.
-
-@item jit_prolog();
-Ok, so we start generating code for our beloved function@dots{}
-
-@item in = jit_arg();
-@itemx jit_getarg(JIT_R0, in);
-We retrieve the first (and only) argument, an integer, and store it
-into the general-purpose register @code{R0}.
-
-@item jit_addi(JIT_R0, JIT_R0, 1);
-We add one to the content of the register.
-
-@item jit_retr(JIT_R0);
-This instruction generates a standard function epilog that returns
-the contents of the @code{R0} register.
-
-@item incr = jit_emit();
-This instruction is very important.  It actually translates the
-@lightning{} macros used before to machine code, flushes the generated
-code area out of the processor's instruction cache and returns a
-pointer to the start of the code.
-
-@item jit_clear_state();
-This call cleans up any data not required for jit execution. Note
-that it must be called after any call to @code{jit_print} or
-@code{jit_address}, as this call destroys the @lightning{}
-intermediate representation.
-
-@item printf("%d + 1 = %d", 5, incr(5));
-Calling our function is this simple---it is not distinguishable from
-a normal C function call, the only difference being that @code{incr}
-is a variable.
-
-@item jit_destroy_state();
-Releases all memory associated with the jit context. It should be
-called once it is known that the jit code will no longer be called.
-
-@item finish_jit();
-This call cleans up any global state held by @lightning{}, and it is
-advisable to call it once jit code will no longer be generated.
-@end table
-
-@lightning{} abstracts two phases of dynamic code generation: selecting
-instructions that map the standard representation, and emitting binary
-code for these instructions.  The client program has the responsibility
-of describing the code to be generated using the standard @lightning{}
-instruction set.
-
-Let's examine the code generated for @code{incr} on the SPARC and x86_64
-architectures (on the right is the code that an assembly-language
-programmer would write):
-
-@table @b
-@item SPARC
-@example
-      save  %sp, -112, %sp
-      mov  %i0, %g2                 retl
-      inc  %g2                      inc %o0
-      mov  %g2, %i0
-      restore 
-      retl 
-      nop 
-@end example
-In this case, @lightning{} introduces overhead to create a register
-window (not knowing that the procedure is a leaf procedure) and to
-move the argument to the general purpose register @code{R0} (which
-maps to @code{%g2} on the SPARC).
-@end table
-
-@table @b
-@item x86_64
-@example
-    sub   $0x30,%rsp
-    mov   %rbp,(%rsp)
-    mov   %rsp,%rbp
-    sub   $0x18,%rsp
-    mov   %rdi,%rax            mov %rdi, %rax
-    add   $0x1,%rax            inc %rax
-    mov   %rbp,%rsp
-    mov   (%rsp),%rbp
-    add   $0x30,%rsp
-    retq                       retq
-@end example
-In this case, the main overhead is due to the function's prolog and
-epilog, and stack alignment after reserving stack space for word
-to/from float conversions or moving data from/to x87 to/from SSE.
-Note that besides allocating space to save callee saved registers,
-no registers are saved/restored because @lightning{} notices those
-registers are not modified. There is currently no logic to detect
-whether it needs to allocate stack space for type conversions, nor
-proper leaf function detection, but these are subject to change
-(FIXME).
-@end table
-
-@node printf
-@section A simple function call to @code{printf}
-
-Again, here is the code for the example:
-
-@example
-#include <stdio.h>
-#include <lightning.h>
-
-static jit_state_t *_jit;
-
-typedef void (*pvfi)(int);      @rem{/* Pointer to Void Function of Int */}
-
-int main(int argc, char *argv[])
-@{
-  pvfi          myFunction;             @rem{/* ptr to generated code */}
-  jit_node_t    *start, *end;           @rem{/* a couple of labels */}
-  jit_node_t    *in;                    @rem{/* to get the argument */}
-
-  init_jit(argv[0]);
-  _jit = jit_new_state();
-
-  start = jit_note(__FILE__, __LINE__);
-  jit_prolog();
-  in = jit_arg();
-  jit_getarg(JIT_R1, in);
-  jit_prepare();
-  jit_pushargi((jit_word_t)"generated %d bytes\n");
-  jit_ellipsis();
-  jit_pushargr(JIT_R1);
-  jit_finishi(printf);
-  jit_ret();
-  jit_epilog();
-  end = jit_note(__FILE__, __LINE__);
-
-  myFunction = jit_emit();
-
-  @rem{/* call the generated code@comma{} passing its size as argument */}
-  myFunction((char*)jit_address(end) - (char*)jit_address(start));
-  jit_clear_state();
-
-  jit_disassemble();
-
-  jit_destroy_state();
-  finish_jit();
-  return 0;
-@}
-@end example
-
-The function shows how many bytes were generated.  Most of the code
-is not very interesting, as it resembles very closely the program
-presented in @ref{incr, , A function which increments a number by one}.
-
-For this reason, we're going to concentrate on just a few statements.
-
-@table @t
-@item start = jit_note(__FILE__, __LINE__);
-@itemx @r{@dots{}}
-@itemx end = jit_note(__FILE__, __LINE__);
-These two instructions call the @code{jit_note} macro, which creates
-a note in the jit code; arguments to @code{jit_note} usually are a
-filename string and line number integer, but using NULL for the
-string argument is perfectly valid if you only need to create a simple
-marker in the code.
-
-@item jit_ellipsis();
-@code{ellipsis} usually is only required if calling varargs functions
-with double arguments, but it is a good practice to properly describe
-the @r{@dots{}} in the call sequence.
-
-@item jit_pushargi((jit_word_t)"generated %d bytes\n");
-Note the use of the @code{(jit_word_t)} cast, that is used only
-to avoid a compiler warning, due to using a pointer where a
-wordsize integer type was expected.
-
-@item jit_prepare();
-@itemx @r{@dots{}}
-@itemx jit_finishi(printf);
-Once the arguments to @code{printf} have been pushed, which means
-moving them to stack or register arguments, the @code{printf}
-function is called and the stack cleaned.  Note how @lightning{}
-abstracts the differences between different architectures and
-ABIs -- the client program does not know how parameter passing
-works on the host architecture.
-
-@item jit_epilog();
-Usually it is not required to call @code{epilog}, but because it
-is implicitly called when noticing the end of a function, if the
-@code{end} variable was set with a @code{note} call after the
-@code{ret}, it would not consider the function epilog.
-
-@item myFunction((char*)jit_address(end) - (char*)jit_address(start));
-This calls the generated jit function, passing as argument the offset
-difference between the @code{start} and @code{end} notes. The @code{address}
-call must be done after the @code{emit} call or either a fatal error
-will happen (if @lightning{} is built with assertions enabled) or an
-undefined value will be returned.
-
-@item jit_clear_state();
-Note that @code{jit_clear_state} was called after executing jit in
-this example. It was done because it must be called after any call
-to @code{jit_address} or @code{jit_print}.
-
-@item jit_disassemble();
-@code{jit_disassemble} will dump the generated code to standard output,
-unless @lightning{} was built with the disassembler disabled, in which
-case no output will be shown.
-@end table
-
-@node RPN calculator
-@section A more complex example, an RPN calculator
-
-We create a small stack-based RPN calculator which applies a series
-of operators to a given parameter and to other numeric operands.
-Unlike previous examples, the code generator is fully parameterized
-and is able to compile different formulas to different functions.
-Here is the code for the expression compiler; a sample usage will
-follow.
-
-Since @lightning{} does not provide push/pop instructions, this
-example uses a stack-allocated area to store the data.  Such an
-area can be allocated using the macro @code{allocai}, which
-receives the number of bytes to allocate and returns the offset
-from the frame pointer register @code{FP} to the base of the
-area.
-
-Usually, you will use the @code{ldxi} and @code{stxi} instructions
-to access stack-allocated variables.  However, it is possible to
-use operations such as @code{add} to compute the address of the
-variables, and pass the address around.
-
-@example
-#include <stdio.h>
-#include <stdlib.h>
-#include <lightning.h>
-
-typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}
-
-static jit_state_t *_jit;
-
-void stack_push(int reg, int *sp)
-@{
-  jit_stxi_i (*sp, JIT_FP, reg);
-  *sp += sizeof (int);
-@}
-
-void stack_pop(int reg, int *sp)
-@{
-  *sp -= sizeof (int);
-  jit_ldxi_i (reg, JIT_FP, *sp);
-@}
-
-jit_node_t *compile_rpn(char *expr)
-@{
-  jit_node_t *in, *fn;
-  int stack_base, stack_ptr;
-
-  fn = jit_note(NULL, 0);
-  jit_prolog();
-  in = jit_arg();
-  stack_ptr = stack_base = jit_allocai (32 * sizeof (int));
-
-  jit_getarg_i(JIT_R2, in);
-
-  while (*expr) @{
-    char buf[32];
-    int n;
-    if (sscanf(expr, "%[0-9]%n", buf, &n)) @{
-      expr += n - 1;
-      stack_push(JIT_R0, &stack_ptr);
-      jit_movi(JIT_R0, atoi(buf));
-    @} else if (*expr == 'x') @{
-      stack_push(JIT_R0, &stack_ptr);
-      jit_movr(JIT_R0, JIT_R2);
-    @} else if (*expr == '+') @{
-      stack_pop(JIT_R1, &stack_ptr);
-      jit_addr(JIT_R0, JIT_R1, JIT_R0);
-    @} else if (*expr == '-') @{
-      stack_pop(JIT_R1, &stack_ptr);
-      jit_subr(JIT_R0, JIT_R1, JIT_R0);
-    @} else if (*expr == '*') @{
-      stack_pop(JIT_R1, &stack_ptr);
-      jit_mulr(JIT_R0, JIT_R1, JIT_R0);
-    @} else if (*expr == '/') @{
-      stack_pop(JIT_R1, &stack_ptr);
-      jit_divr(JIT_R0, JIT_R1, JIT_R0);
-    @} else @{
-      fprintf(stderr, "cannot compile: %s\n", expr);
-      abort();
-    @}
-    ++expr;
-  @}
-  jit_retr(JIT_R0);
-  jit_epilog();
-  return fn;
-@}
-@end example
-
-The principle on which the calculator is based is easy: the stack top
-is held in R0, while the remaining items of the stack are held in the
-memory area that we allocate with @code{allocai}.  Compiling a numeric
-operand or the argument @code{x} pushes the old stack top onto the
-stack and moves the operand into R0; compiling an operator pops the
-second operand off the stack into R1, and compiles the operation so
-that the result goes into R0, thus becoming the new stack top.
-
-This example allocates a fixed area for 32 @code{int}s.  This is not
-a problem when the function is a leaf like in this case; in a full-blown
-compiler you will want to analyze the input and determine the number
-of needed stack slots---a very simple example of register allocation.
-The area is then managed like a stack using @code{stack_push} and
-@code{stack_pop}.
-
-Source code for the client (which lies in the same source file) follows:
-
-@example
-int main(int argc, char *argv[])
-@{
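
To make the stack discipline described above concrete, here is a small reference interpreter in plain C that evaluates the same RPN strings directly instead of compiling them. It is a sketch for comparison only: `eval_rpn` is a hypothetical name, the local `top` plays the role of R0, `lhs` the role of R1, and the array stands in for the area reserved with `allocai`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical reference interpreter, not part of the example program:
 * evaluates an RPN string for a given value of `x`. */
static int eval_rpn(const char *expr, int x)
{
    int stack[32];          /* same fixed capacity as the jit example */
    int sp = 0;             /* index of the next free slot */
    int top = 0;            /* stack top, kept outside the array */

    while (*expr) {
        if (isdigit((unsigned char)*expr)) {
            char *end;
            long value = strtol(expr, &end, 10);
            assert(sp < 32);
            stack[sp++] = top;      /* push the old top */
            top = (int)value;       /* the operand becomes the new top */
            expr = end;
            continue;
        }
        if (*expr == 'x') {
            assert(sp < 32);
            stack[sp++] = top;
            top = x;
        } else {
            int lhs;
            assert(sp > 0);
            lhs = stack[--sp];      /* pop the second operand */
            switch (*expr) {
            case '+': top = lhs + top; break;
            case '-': top = lhs - top; break;
            case '*': top = lhs * top; break;
            case '/': top = lhs / top; break;
            default:  abort();      /* unknown operator */
            }
        }
        ++expr;
    }
    return top;
}
```

As in the jit version, the first push stores a meaningless initial top; it is never popped by a well-formed expression.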
-  jit_node_t *nc, *nf;
-  pifi c2f, f2c;
-  int i;
-
-  init_jit(argv[0]);
-  _jit = jit_new_state();
-
-  nc = compile_rpn("32x9*5/+");
-  nf = compile_rpn("x32-5*9/");
-  (void)jit_emit();
-  c2f = (pifi)jit_address(nc);
-  f2c = (pifi)jit_address(nf);
-  jit_clear_state();
-
-  printf("\nC:");
-  for (i = 0; i <= 100; i += 10) printf("%3d ", i);
-  printf("\nF:");
-  for (i = 0; i <= 100; i += 10) printf("%3d ", c2f(i));
-  printf("\n");
-
-  printf("\nF:");
-  for (i = 32; i <= 212; i += 18) printf("%3d ", i);
-  printf("\nC:");
-  for (i = 32; i <= 212; i += 18) printf("%3d ", f2c(i));
-  printf("\n");
-
-  jit_destroy_state();
-  finish_jit();
-  return 0;
-@}
-@end example
-
-The client displays a conversion table between Celsius and Fahrenheit
-degrees (both Celsius-to-Fahrenheit and Fahrenheit-to-Celsius). The
-formulas are, @math{F(c) = c*9/5+32} and @math{C(f) = (f-32)*5/9},
-respectively.
-
-Providing the formula as an argument to @code{compile_rpn} effectively
-parameterizes code generation, making it possible to use the same code
-to compile different functions; this is what makes dynamic code
-generation so powerful.
-
-@node Fibonacci
-@section Fibonacci numbers
-
-The code in this section calculates the Fibonacci sequence. That is
-modeled by the recurrence relation:
-@display
-     f(0) = 0
-     f(1) = f(2) = 1
-     f(n) = f(n-1) + f(n-2)
-@end display
-
-The purpose of this example is to introduce branches.  There are two
-kinds of branches: backward branches and forward branches.  We'll
-present the calculation in a recursive and iterative form; the
-former only uses forward branches, while the latter uses both.
-
-@example
-#include <stdio.h>
-#include <lightning.h>
-
-static jit_state_t *_jit;
-
-typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}
-
-int main(int argc, char *argv[])
-@{
-  pifi       fib;
-  jit_node_t *label;
-  jit_node_t *call;
-  jit_node_t *in;                 @rem{/* offset of the argument */}
-  jit_node_t *ref;                @rem{/* to patch the forward reference */}
-  jit_node_t *zero;               @rem{/* to patch the forward reference */}
-
-  init_jit(argv[0]);
-  _jit = jit_new_state();
-
-  label = jit_label();
-        jit_prolog   ();
-  in =  jit_arg      ();
-        jit_getarg   (JIT_R0, in);              @rem{/* R0 = n */}
- zero = jit_beqi     (JIT_R0, 0);
-        jit_movr     (JIT_V0, JIT_R0);          /* V0 = R0 */
-        jit_movi     (JIT_R0, 1);
-  ref = jit_blei     (JIT_V0, 2);
-        jit_subi     (JIT_V1, JIT_V0, 1);       @rem{/* V1 = n-1 */}
-        jit_subi     (JIT_V2, JIT_V0, 2);       @rem{/* V2 = n-2 */}
-        jit_prepare();
-          jit_pushargr(JIT_V1);
-        call = jit_finishi(NULL);
-        jit_patch_at(call, label);
-        jit_retval(JIT_V1);                     @rem{/* V1 = fib(n-1) */}
-        jit_prepare();
-          jit_pushargr(JIT_V2);
-        call = jit_finishi(NULL);
-        jit_patch_at(call, label);
-        jit_retval(JIT_R0);                     @rem{/* R0 = fib(n-2) */}
-        jit_addr(JIT_R0, JIT_R0, JIT_V1);       @rem{/* R0 = R0 + V1 */}
-
-  jit_patch(ref);                               @rem{/* patch jump */}
-  jit_patch(zero);                              @rem{/* patch jump */}
-        jit_retr(JIT_R0);
-
-  @rem{/* call the generated code@comma{} passing 32 as an argument */}
-  fib = jit_emit();
-  jit_clear_state();
-  printf("fib(%d) = %d\n", 32, fib(32));
-  jit_destroy_state();
-  finish_jit();
-  return 0;
-@}
-@end example
-
-As said above, this is the first example of dynamically compiling
-branches.  Branch instructions have two operands containing the
-values to be compared, and return a @code{jit_node_t *} object
-to be patched.
-
-Because the final addresses of labels are only known after calling
-@code{emit}, it is required to call @code{patch} or @code{patch_at},
-which tells @lightning{} that the target to patch is actually a pointer
-to a @code{jit_node_t *} object; otherwise, it would assume that it is
-a pointer to a C function. Note that conditional branches do not
-receive a label argument, so they must be patched.
-
-You need to call @code{patch_at} on the return of value @code{calli},
address@hidden, and @code{calli} if it is actually referencing a label
-in the jit code. All branch instructions do not receive a label
-argument. Note that @code{movi} is a special case, and patching it
-is usually done to get the final address of a label, usually to later
-call @code{jmpr}.
-
-Now, here is the iterative version:
-
-@example
-#include <stdio.h>
-#include <lightning.h>
-
-static jit_state_t *_jit;
-
-typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}
-
-int main(int argc, char *argv[])
-@{
-  pifi       fib;
-  jit_node_t *in;               @rem{/* offset of the argument */}
-  jit_node_t *ref;              @rem{/* to patch the forward reference */}
-  jit_node_t *zero;             @rem{/* to patch the forward reference */}
-  jit_node_t *jump;             @rem{/* jump to start of loop */}
-  jit_node_t *loop;             @rem{/* start of the loop */}
-
-  init_jit(argv[0]);
-  _jit = jit_new_state();
-
-        jit_prolog   ();
-  in =  jit_arg      ();
-        jit_getarg   (JIT_R0, in);              @rem{/* R0 = n */}
- zero = jit_beqi     (JIT_R0, 0);
-        jit_movr     (JIT_R1, JIT_R0);
-        jit_movi     (JIT_R0, 1);
-  ref = jit_blei     (JIT_R1, 2);
-        jit_subi     (JIT_R2, JIT_R1, 2);
-        jit_movr     (JIT_R1, JIT_R0);
-
-  loop= jit_label();
-        jit_subi     (JIT_R2, JIT_R2, 1);       @rem{/* decr. counter */}
-        jit_movr     (JIT_V0, JIT_R0);          /* V0 = R0 */
-        jit_addr     (JIT_R0, JIT_R0, JIT_R1);  /* R0 = R0 + R1 */
-        jit_movr     (JIT_R1, JIT_V0);          /* R1 = V0 */
-  jump= jit_bnei     (JIT_R2, 0);               /* if (R2) goto loop; */
-  jit_patch_at(jump, loop);
-
-  jit_patch(ref);                               @rem{/* patch forward jump */}
-  jit_patch(zero);                              @rem{/* patch forward jump */}
-        jit_retr     (JIT_R0);
-
-  @rem{/* call the generated code@comma{} passing 36 as an argument */}
-  fib = jit_emit();
-  jit_clear_state();
-  printf("fib(%d) = %d\n", 36, fib(36));
-  jit_destroy_state();
-  finish_jit();
-  return 0;
-@}
-@end example
-
-This code calculates the recurrence relation using iteration (a
-@code{for} loop in high-level languages).  There are no function
-calls anymore: instead, there is a backward jump (the @code{bnei} at
-the end of the loop).
-
-Note that the program must remember the address for backward jumps;
-for forward jumps it is only required to remember the jump code,
-and call @code{patch} for the implicit label.
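
The loop structure of the iterative version can be checked against a plain C paraphrase. This sketch rests on two assumptions about the intended behaviour: the counter starts at n - 2, and the early exit covers every n up to 2. The variable names mirror the registers of the example; `fib_iter` itself is an illustrative name.

```c
#include <assert.h>

/* Plain C paraphrase of the iterative jit code; r0, r1, r2 and v0
 * mirror JIT_R0, JIT_R1, JIT_R2 and JIT_V0. */
static int fib_iter(int n)
{
    int r0, r1, r2, v0;

    assert(n >= 0);
    r0 = n;
    if (r0 == 0)            /* forward branch `zero` */
        return r0;
    r1 = r0;
    r0 = 1;
    if (r1 <= 2)            /* forward branch `ref` */
        return r0;
    r2 = r1 - 2;            /* loop counter (assumed to be n - 2) */
    r1 = r0;

    do {                    /* `loop` label */
        r2 -= 1;            /* decrement the counter */
        v0 = r0;            /* V0 = R0 */
        r0 = r0 + r1;       /* R0 = R0 + R1 */
        r1 = v0;            /* R1 = V0 */
    } while (r2 != 0);      /* backward branch: bnei R2, 0 */

    return r0;
}
```

The `do`/`while` form matches the generated code: the body runs once before the counter is tested, which is why the early exit has to catch every input for which the counter would start at zero or below.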
-
-@node Reentrancy
-@chapter Re-entrant usage of @lightning{}
-
-@lightning{} uses the special @code{_jit} identifier. To be able
-to use multiple jit generation states at the same
-time, it is required to use code similar to:
-
-@example
-    struct jit_state lightning;
-    #define lightning _jit
-@end example
-
-This will cause the symbol defined to @code{_jit} to be passed as
-the first argument to the underlying @lightning{} implementation,
-that is usually a function with an @code{_} (underscore) prefix
-and with an argument named @code{_jit}, in the pattern:
-
-@example
-    static void _jit_mnemonic(jit_state_t *, jit_gpr_t, jit_gpr_t);
-    #define jit_mnemonic(u, v) _jit_mnemonic(_jit, u, v);
-@end example
-
-The reason for this is to use the same syntax as the initial lightning
-implementation and to avoid needing the user to keep adding an extra
-argument to every call, as multiple jit states generating code in
-parallel should be very uncommon.
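
The macro pattern described above can be tried out without the library. The sketch below uses mock names throughout (`mock_state_t`, `mock_emit`, `mock_emit_impl`); none of them exist in lightning. It only demonstrates how a macro that mentions `_jit` picks up whichever variable of that name is in scope at the call site.

```c
#include <assert.h>
#include <stddef.h>

/* Mock state for illustration only; not the real jit_state_t. */
typedef struct mock_state {
    int emitted;            /* number of "instructions" recorded */
} mock_state_t;

/* The implementation function takes the state explicitly... */
static void mock_emit_impl(mock_state_t *_jit, int opcode)
{
    assert(_jit != NULL);
    (void)opcode;
    _jit->emitted++;
}

/* ...while the public macro passes whatever `_jit` names in scope. */
#define mock_emit(opcode) mock_emit_impl(_jit, opcode)

/* Emits into two independent states by rebinding `_jit` per block. */
static int emit_into_two_states(void)
{
    mock_state_t a = { 0 };
    mock_state_t b = { 0 };

    {
        mock_state_t *_jit = &a;    /* first generator */
        mock_emit(1);
        mock_emit(2);
    }
    {
        mock_state_t *_jit = &b;    /* second generator */
        mock_emit(3);
    }
    return a.emitted * 10 + b.emitted;
}
```

The call sites stay free of an explicit state argument, and the cost is that every caller must have a suitable `_jit` visible, either as a variable or through a `#define`.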
-
-@node Registers
-@chapter Accessing the whole register file
-
-As mentioned earlier in this chapter, all @lightning{} back-ends are
-guaranteed to have at least six general-purpose integer registers and
-six floating-point registers, but many back-ends will have more.
-
-To access the entire register files, you can use the
-@code{JIT_R}, @code{JIT_V} and @code{JIT_F} macros.  They
-accept a parameter that identifies the register number, which
-must be strictly less than @code{JIT_R_NUM}, @code{JIT_V_NUM}
-and @code{JIT_F_NUM} respectively; the number need not be
-constant.  Of course, expressions like @code{JIT_R0} and
-@code{JIT_R(0)} denote the same register, and likewise for
-integer callee-saved, or floating-point, registers.
-
-@node Customizations
-@chapter Customizations
-
-Frequently it is desirable to have more control over how code is
-generated or how memory is used during jit generation or execution.
-
-@section Memory functions
-To aid in complete control of memory allocation and deallocation
-@lightning{} provides wrappers that default to standard @code{malloc},
-@code{realloc} and @code{free}. These are loosely based on the
-GNU GMP counterparts, with the difference that they use the same
-prototype of the system allocation functions, that is, no @code{size}
-for @code{free} or @code{old_size} for @code{realloc}.
-
address@hidden void jit_set_memory_functions (@* void *(address@hidden) 
(size_t), @* void *(address@hidden) (void *, size_t), @* void (address@hidden) 
(void *))
-@lightning{} guarantees that memory is only allocated or released
-using these wrapped functions, but you must note that if lightning
-was linked to GNU binutils, malloc will probably be called multiple
-times from there when initializing the disassembler.
-
-Because @code{init_jit} may call memory functions, if you need to call
-@code{jit_set_memory_functions}, it must be called before @code{init_jit},
-otherwise, when calling @code{finish_jit}, a pointer allocated with the
-previous or default wrappers will be passed.
-@end deftypefun
-
address@hidden void jit_get_memory_functions (@* void *(address@hidden) 
(size_t), @* void *(address@hidden) (void *, size_t), @* void (address@hidden) 
(void *))
-Get the current memory allocation functions. Also, unlike the GNU GMP
-counterpart, it is an error to pass @code{NULL} pointers as arguments.
-@end deftypefun
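
As an illustration of what replacement allocators might look like, the following sketch defines three wrappers with the same prototypes as `malloc`, `realloc` and `free` that keep a count of live blocks. The names `counting_alloc`, `counting_realloc`, `counting_free` and `live_blocks` are hypothetical; the sketch does not call into lightning, and it assumes such functions would be registered with `jit_set_memory_functions` before `init_jit`, as the text requires.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of blocks currently allocated through the wrappers. */
static size_t live_blocks;

static void *counting_alloc(size_t size)
{
    void *ptr = malloc(size);
    if (ptr != NULL)
        live_blocks++;
    return ptr;
}

static void *counting_realloc(void *old, size_t size)
{
    void *ptr;

    assert(size > 0);       /* keep the sketch away from realloc(p, 0) */
    ptr = realloc(old, size);
    if (ptr != NULL && old == NULL)
        live_blocks++;      /* realloc(NULL, n) behaves like malloc(n) */
    return ptr;
}

static void counting_free(void *ptr)
{
    if (ptr != NULL)
        live_blocks--;
    free(ptr);
}
```

Since the prototypes carry no size argument for `free` and no old size for `realloc`, a wrapper that needs byte totals rather than block counts would have to record sizes itself.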
-
-@section Alternate code buffer
-To instruct @lightning{} to use an alternate code buffer it is required
-to call @code{jit_realize} before @code{jit_emit}, and then query states
-and customize as appropriate.
-
-@deftypefun void jit_realize ()
-Must be called once, before @code{jit_emit}, to instruct @lightning{}
-that no other @code{jit_xyz} call will be made.
-@end deftypefun
-
-@deftypefun jit_pointer_t jit_get_code (jit_word_t *@var{code_size})
-Returns NULL or the previous value set with @code{jit_set_code}, and
-sets the @var{code_size} argument to an appropriate value.
-If @code{jit_get_code} is called before @code{jit_emit}, the
-@var{code_size} argument is set to the expected number of bytes
-required to generate code.
-If @code{jit_get_code} is called after @code{jit_emit}, the
-@var{code_size} argument is set to the exact number of bytes used
-by the code.
-@end deftypefun
-
-@deftypefun void jit_set_code (jit_pointer_t @var{code}, jit_word_t @var{size})
-Instructs @lightning{} to output to the @var{code} argument and
-use @var{size} as a guard to not write to invalid memory. If during
-@code{jit_emit} @lightning{} finds out that the code would not fit
-in @var{size} bytes, it halts code emit and returns @code{NULL}.
-@end deftypefun
-
-A simple example of a loop using an alternate buffer is:
-
address@hidden
-  jit_uint8_t   *code;
-  int           *(func)(int);      @rem{/* function pointer */}
-  jit_word_t     code_size;
-  jit_word_t     real_code_size;
-  @rem{...}
-  jit_realize();                   @rem{/* ready to generate code */}
-  jit_get_code(&code_size);        @rem{/* get expected code size */}
-  code_size = (code_size + 4095) & -4096;
-  do @{
-    code = mmap(NULL, code_size, PROT_EXEC | PROT_READ | PROT_WRITE,
-                MAP_PRIVATE | MAP_ANON, -1, 0);
-    jit_set_code(code, code_size);
-    if ((func = jit_emit()) == NULL) @{
-      munmap(code, code_size);
-      code_size += 4096;
-    @}
-  @} while (func == NULL);
-  jit_get_code(&real_code_size);   @rem{/* query exact size of the code */}
address@hidden example
-
-The first call to @code{jit_get_code} should return @code{NULL} and set
-the @code{code_size} argument to the expected amount of bytes required
-to emit code.
-The second call to @code{jit_get_code} is after a successful call to
address@hidden, and will return the value previously set with
address@hidden and set the @code{real_code_size} argument to the
-exact amount of bytes used to emit the code.
-
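The page rounding used in the example above can be sketched in isolation. This is a minimal illustration in plain C, not part of the library; the helper name round_up_to_page is hypothetical, and the page size is assumed to be a power of two.

```c
#include <stddef.h>

/* Round size up to the next multiple of page.
 * Assumption: page is a power of two (4096 in the example above),
 * so this matches the (code_size + 4095) & -4096 idiom. */
static size_t round_up_to_page(size_t size, size_t page)
{
    return (size + page - 1) & ~(page - 1);
}
```

Growing the buffer by one page after each failed emit, as the loop above does, keeps the size a multiple of the page size.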
address@hidden Alternate data buffer
-Sometimes it may be desirable to customize how, or to prevent
address@hidden from using an extra buffer for constants or debug
-annotation. This is usually done when also using an alternate code buffer.
-
address@hidden jit_pointer_t jit_get_data (jit_word_t address@hidden, jit_word_t address@hidden)
-Returns @code{NULL} or the previous value set with @code{jit_set_data},
-and sets the @var{data_size} argument to how many bytes are required
-for the constants data buffer, and @var{note_size} to how many bytes
-are required to store the debug note information.
-Note that it always preallocates one debug note entry even if
address@hidden or @code{jit_note} are never called, but will return
-zero in the @var{data_size} argument if no constant is required;
-constants are only used for the @code{float} and @code{double} operations
-that have an immediate argument, and not in all @lightning{} ports.
address@hidden deftypefun
-
address@hidden void jit_set_data (jit_pointer_t @var{data}, jit_word_t @var{size}, jit_word_t @var{flags})
-
address@hidden can be NULL if disabling constants and annotations, otherwise,
-a valid pointer must be passed. An assertion is done that the data will
-fit in @var{size} bytes (but that is a noop if @lightning{} was built
-with @code{-DNDEBUG}).
-
address@hidden tells the space in bytes available in @var{data}.
-
address@hidden can be zero to tell to just use the alternate data buffer,
-or a composition of @code{JIT_DISABLE_DATA} and @code{JIT_DISABLE_NOTE}
-
address@hidden @t
address@hidden JIT_DISABLE_DATA
address@hidden JIT_DISABLE_DATA
-Instructs @lightning{} to not use a constant table, but to use an
-alternate method to synthesize those, usually with a larger code
-sequence using stack space to transfer the value from a GPR to a
-FPR register.
-
address@hidden JIT_DISABLE_NOTE
address@hidden JIT_DISABLE_NOTE
-Instructs @lightning{} to not store file or function name, and
-line numbers in the constant buffer.
address@hidden table
address@hidden deftypefun
-
-A simple example of preventing usage of a data buffer is:
-
address@hidden
-  @rem{...}
-  jit_realize();                        @rem{/* ready to generate code */}
-  jit_get_data(NULL, NULL);
-  jit_set_data(NULL, 0, JIT_DISABLE_DATA | JIT_DISABLE_NOTE);
-  @rem{...}
address@hidden example
-
-Or to only use a data buffer, if required:
-
address@hidden
-  jit_uint8_t   *data;
-  jit_word_t     data_size;
-  @rem{...}
-  jit_realize();                        @rem{/* ready to generate code */}
-  jit_get_data(&data_size, NULL);
-  if (data_size)
-    data = malloc(data_size);
-  else
-    data = NULL;
-  jit_set_data(data, data_size, JIT_DISABLE_NOTE);
-  @rem{...}
-  if (data)
-    free(data);
-  @rem{...}
address@hidden example
-
address@hidden Acknowledgements
address@hidden Acknowledgements
-
-As far as I know, the first general-purpose portable dynamic code
-generator is @sc{dcg}, by Dawson R.@: Engler and T.@: A.@: Proebsting.
-Further work by Dawson R. Engler resulted in the @sc{vcode} system;
-unlike @sc{dcg}, @sc{vcode} used no intermediate representation and
-directly inspired @lightning{}.
-
-Thanks go to Ian Piumarta, who kindly agreed to release his own
-program @sc{ccg} under the GNU General Public License, thereby allowing
address@hidden to use the run-time assemblers he had written for @sc{ccg}.
address@hidden provides a way of dynamically assembling programs written in the
-underlying architecture's assembly language.  So it is not portable,
-yet very interesting.
-
-I also thank Steve Byrne for writing GNU Smalltalk, since @lightning{}
-was first developed as a tool to be used in GNU Smalltalk's dynamic
-translator from bytecodes to native code.
diff --git a/doc/lightning.texi b/doc/lightning.texi
index c7d8f98..88f397a 100644
--- a/doc/lightning.texi
+++ b/doc/lightning.texi
@@ -68,8 +68,1690 @@
 @c End of macro section
 @c ---------------------------------------------------------------------
 
address@hidden version.texi
address@hidden body.texi
address@hidden UPDATED 18 June 2018
address@hidden UPDATED-MONTH June 2018
address@hidden EDITION 2.1.2
address@hidden VERSION 2.1.2
+
address@hidden
address@hidden Software development
address@hidden
+* lightning: (lightning).       Library for dynamic code generation.
address@hidden direntry
address@hidden ifnottex
+
address@hidden
address@hidden Top
address@hidden @lightning{}
+
address@hidden
address@hidden comma
address@hidden|,|}
address@hidden macro
address@hidden iftex
+
address@hidden
address@hidden comma
address@hidden|,|}
address@hidden macro
address@hidden ifnottex
+
+This document describes @value{TOPIC} the @lightning{} library for
+dynamic code generation.
+
address@hidden
+* Overview::                What GNU lightning is
+* Installation::            Configuring and installing GNU lightning
+* The instruction set::     The RISC instruction set used in GNU lightning
+* GNU lightning examples::  GNU lightning's examples
+* Reentrancy::              Re-entrant usage of GNU lightning
+* Customizations::          Advanced code generation customizations
+* Acknowledgements::        Acknowledgements for GNU lightning
address@hidden menu
address@hidden ifnottex
+
address@hidden Overview
address@hidden Introduction to @lightning{}
+
address@hidden
+This document describes @value{TOPIC} the @lightning{} library for
+dynamic code generation.
address@hidden iftex
+
+Dynamic code generation is the generation of machine code 
+at runtime. It is typically used to strip a layer of interpretation 
+by allowing compilation to occur at runtime.  One of the most
+well-known applications of dynamic code generation is perhaps that
+of interpreters that compile source code to an intermediate bytecode
+form, which is then recompiled to machine code at run-time: this
+approach effectively combines the portability of bytecode
+representations with the speed of machine code.  Another common
+application of dynamic code generation is in the field of hardware
+simulators and binary emulators, which can use the same techniques
+to translate simulated instructions to the instructions of the 
+underlying machine.
+
+Yet other applications come to mind: for example, windowing
address@hidden operations, matrix manipulations, and network packet
+filters.  Albeit very powerful and relatively well known within the
+compiler community, dynamic code generation techniques are rarely
+exploited to their full potential and, with the exception of the
+two applications described above, have remained curiosities because
+of their portability and functionality barriers: binary instructions
+are generated, so programs using dynamic code generation must be
+retargeted for each machine; in addition, coding a run-time code
+generator is a tedious and error-prone task more than a difficult one.
+
address@hidden provides a portable, fast and easily retargetable dynamic
+code generation system. 
+
+To be portable, @lightning{} abstracts over current architectures'
+quirks and unorthogonalities.  The interface that it exposes is that
+of a standardized RISC architecture loosely based on the SPARC and MIPS
+chips.  There are a few general-purpose registers (six, not including
+those used to receive and pass parameters between subroutines), and
+arithmetic operations involve three operands---either three registers
+or two registers and an arbitrarily sized immediate value.
+
+On one hand, this architecture is general enough that it is possible to
+generate pretty efficient code even on CISC architectures such as the
+Intel x86 or the Motorola 68k families.  On the other hand, it matches
+real architectures closely enough that, most of the time, the
+compiler's constant folding pass ends up generating code which
+assembles machine instructions without further tests.
+
address@hidden Installation
address@hidden Configuring and installing @lightning{}
+
+The first thing to do to use @lightning{} is to configure the
+program, picking the set of macros to be used on the host
+architecture; this configuration is automatically performed by
+the @file{configure} shell script; to run it, merely type:
address@hidden
+     ./configure
address@hidden example
+
address@hidden supports the @code{--enable-disassembler} option, which
+enables linking to GNU binutils and optionally printing a human-readable
+disassembly of the jit code. This option can be disabled by the
address@hidden option.
+
+Another option that @file{configure} accepts is
address@hidden, which enables several consistency checks in
+the run-time assemblers.  These are not usually needed, so you can
+decide to simply forget about it; also remember that these consistency
+checks tend to slow down your code generator.
+
+After you've configured @lightning{}, run @file{make} as usual.
+
address@hidden has an extensive set of tests to validate that it is working
+correctly on the build host. To run them, type:
address@hidden
+    make check
address@hidden example
+
+The next important step is:
address@hidden
+    make install
address@hidden example
+
+This ends the process of installing @lightning{}.
+
address@hidden The instruction set
address@hidden @lightning{}'s instruction set
+
address@hidden's instruction set was designed by deriving instructions
+that closely match those of most existing RISC architectures, or
+that can be easily synthesized if absent.  Each instruction is composed
+of:
address@hidden @bullet
address@hidden
+an operation, like @code{sub} or @code{mul}
+
address@hidden
+most times, a register/immediate flag (@code{r} or @code{i})
+
address@hidden
+an unsigned modifier (@code{u}), a type identifier or two, when applicable.
address@hidden itemize
+
+Examples of legal mnemonics are @code{addr} (integer add, with three
+register operands) and @code{muli} (integer multiply, with two
+register operands and an immediate operand).  Each instruction takes
+two or three operands; in most cases, one of them can be an immediate
+value instead of a register.
+
+Most @lightning{} integer operations are signed wordsize operations,
+with the exception of operations that convert types, or load or store
+values to/from memory. When applicable, the type modifiers and the
+corresponding C types are as follows:
+
address@hidden
+     _c         @r{signed char}
+     _uc        @r{unsigned char}
+     _s         @r{short}
+     _us        @r{unsigned short}
+     _i         @r{int}
+     _ui        @r{unsigned int}
+     _l         @r{long}
+     _f         @r{float}
+     _d         @r{double}
address@hidden example
+
+Most integer operations do not need a type modifier, and when loading or
+storing values to memory there is an alias to the proper operation
+using wordsize operands, that is, if omitted, the type is @r{int} on
+32-bit architectures and @r{long} on 64-bit architectures.  Note
+that lightning also expects @code{sizeof(void*)} to match the wordsize.
+
+When an unsigned operation result differs from the equivalent signed
+operation, there is the @code{_u} modifier.
+
+There are at least seven integer registers, of which six are
+general-purpose, while the last is used to contain the frame pointer
+(@code{FP}).  The frame pointer can be used to allocate and access local
+variables on the stack, using the @code{allocai} or @code{allocar}
+instruction.
+
+Of the general-purpose registers, at least three are guaranteed to be
+preserved across function calls (@code{V0}, @code{V1} and
address@hidden) and at least three are not (@code{R0}, @code{R1} and
address@hidden).  Six registers are not very many, but this
+restriction was forced by the need to target CISC architectures
+which, like the x86, are poor in registers; anyway, backends can
+specify the actual number of available registers with the macros
address@hidden (for caller-save registers) and @code{JIT_V_NUM}
+(for callee-save registers).
+
+There are at least six floating-point registers, named @code{F0} to
address@hidden  These are usually caller-save and are separate from the integer
+registers on the supported architectures; on Intel architectures,
+in 32 bit mode if SSE2 is not available or use of X87 is forced,
+the register stack is mapped to a flat register file.  As for the
+integer registers, the macro @code{JIT_F_NUM} yields the number of
+floating-point registers.
+
+The complete instruction set follows; as you can see, most non-memory
+operations only take integers (either signed or unsigned) as operands;
+this was done in order to reduce the instruction set, and because most
+architectures only provide word and long word operations on registers.
+There are instructions that allow operands to be extended to fit a larger
+data type, both in a signed and in an unsigned way.
+
address@hidden @b
address@hidden Binary ALU operations
+These accept three operands; the last one can be an immediate.
address@hidden operations must directly follow @code{addc}, and
address@hidden must follow @code{subc}; otherwise, results are undefined.
+Most, if not all, architectures do not support @r{float} or @r{double}
+immediate operands; lightning emulates those operations by moving the
+immediate to a temporary register and emitting the call with only
+register operands.
address@hidden
+addr         _f  _d  O1 = O2 + O3
+addi         _f  _d  O1 = O2 + O3
+addxr                O1 = O2 + (O3 + carry)
+addxi                O1 = O2 + (O3 + carry)
+addcr                O1 = O2 + O3, set carry
+addci                O1 = O2 + O3, set carry
+subr         _f  _d  O1 = O2 - O3
+subi         _f  _d  O1 = O2 - O3
+subxr                O1 = O2 - (O3 + carry)
+subxi                O1 = O2 - (O3 + carry)
+subcr                O1 = O2 - O3, set carry
+subci                O1 = O2 - O3, set carry
+rsbr         _f  _d  O1 = O3 - O2
+rsbi         _f  _d  O1 = O3 - O2
+mulr         _f  _d  O1 = O2 * O3
+muli         _f  _d  O1 = O2 * O3
+divr     _u  _f  _d  O1 = O2 / O3
+divi     _u  _f  _d  O1 = O2 / O3
+remr     _u          O1 = O2 % O3
+remi     _u          O1 = O2 % O3
+andr                 O1 = O2 & O3
+andi                 O1 = O2 & O3
+orr                  O1 = O2 | O3
+ori                  O1 = O2 | O3
+xorr                 O1 = O2 ^ O3
+xori                 O1 = O2 ^ O3
+lshr                 O1 = O2 << O3
+lshi                 O1 = O2 << O3
+rshr     _u          O1 = O2 >> address@hidden sign bit is propagated unless using the @code{_u} modifier.}
+rshi     _u          O1 = O2 >> address@hidden sign bit is propagated unless using the @code{_u} modifier.}
address@hidden example
+
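The carry pairing described above (an @code{addc} followed directly by an @code{addx}) is what makes multi-word arithmetic possible. The following is a minimal sketch in plain C that only emulates the idea; the type u128 and the function add_u128 are hypothetical names chosen for illustration and are not part of the library.

```c
#include <stdint.h>

/* A two-word unsigned integer: lo holds the low 64 bits, hi the high 64. */
typedef struct { uint64_t lo, hi; } u128;

/* Add two 128-bit values word by word.
 * The low-word add plays the role of addcr (it produces a carry);
 * the high-word add plays the role of addxr (it consumes that carry). */
static u128 add_u128(u128 a, u128 b)
{
    u128 r;
    uint64_t carry;

    r.lo  = a.lo + b.lo;            /* addcr: O1 = O2 + O3, set carry */
    carry = r.lo < a.lo;            /* unsigned wraparound means a carry out */
    r.hi  = a.hi + (b.hi + carry);  /* addxr: O1 = O2 + (O3 + carry) */
    return r;
}
```

In generated code the carry lives in machine state rather than in a variable, which is presumably why the two instructions have to be adjacent.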
address@hidden Four operand binary ALU operations
+These accept two result registers, and two operands; the last one can
+be an immediate. The first two arguments cannot be the same register.
+
address@hidden stores the low word of the result in @code{O1} and the
+high word in @code{O2}. For unsigned multiplication, @code{O2} zero
+means there was no overflow. For signed multiplication, the overflow
+check depends on the sign of the result: no overflow occurred if
+@code{O2} is zero or minus one.
+
address@hidden stores the quotient in @code{O1} and the remainder in
address@hidden It can be used as a quick way to check if a division is
+exact, in which case the remainder is zero.
+
address@hidden
+qmulr    _u       O1 O2 = O3 * O4
+qmuli    _u       O1 O2 = O3 * O4
+qdivr    _u       O1 O2 = O3 / O4
+qdivi    _u       O1 O2 = O3 / O4
address@hidden example
+
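The result pairs of the four operand operations can be sketched in plain C. This is only an emulation under stated assumptions: a 32-bit word stands in for the machine word, the names qmul_u32 and qdiv_u32 are hypothetical, and only the unsigned variants are shown.

```c
#include <stdint.h>

/* Unsigned multiply returning the low and high words of the product,
 * mirroring the O1 (low) / O2 (high) result pair of qmulr_u.
 * A high word of zero means the product fits in one word. */
static void qmul_u32(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;

    *lo = (uint32_t)product;
    *hi = (uint32_t)(product >> 32);
}

/* Quotient and remainder in one step, mirroring qdivr_u.
 * The divisor must be nonzero; a zero remainder means the division is exact. */
static void qdiv_u32(uint32_t *quot, uint32_t *rem, uint32_t a, uint32_t b)
{
    *quot = a / b;
    *rem  = a % b;
}
```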
address@hidden Unary ALU operations
+These accept two operands, both of which must be registers.
address@hidden
+negr         _f  _d  O1 = -O2
+comr                 O1 = ~O2
address@hidden example
+
+These unary ALU operations are only defined for float operands.
address@hidden
+absr         _f  _d  O1 = fabs(O2)
+sqrtr        _f  _d  O1 = sqrt(O2)
address@hidden example
+
+Besides requiring the @code{r} modifier, there are no unary operations
+with an immediate operand.
+
address@hidden Compare instructions
+These accept three operands; again, the last can be an immediate.
+The last two operands are compared, and the first operand, which must be
+an integer register, is set to 1 if the given condition was met and to
+0 otherwise.
+
+The conditions given below are for the standard behavior of C,
+where the ``unordered'' comparison result is mapped to false.
+
address@hidden
+ltr       _u  _f  _d  O1 =  (O2 <  O3)
+lti       _u  _f  _d  O1 =  (O2 <  O3)
+ler       _u  _f  _d  O1 =  (O2 <= O3)
+lei       _u  _f  _d  O1 =  (O2 <= O3)
+gtr       _u  _f  _d  O1 =  (O2 >  O3)
+gti       _u  _f  _d  O1 =  (O2 >  O3)
+ger       _u  _f  _d  O1 =  (O2 >= O3)
+gei       _u  _f  _d  O1 =  (O2 >= O3)
+eqr           _f  _d  O1 =  (O2 == O3)
+eqi           _f  _d  O1 =  (O2 == O3)
+ner           _f  _d  O1 =  (O2 != O3)
+nei           _f  _d  O1 =  (O2 != O3)
+unltr         _f  _d  O1 = !(O2 >= O3)
+unler         _f  _d  O1 = !(O2 >  O3)
+ungtr         _f  _d  O1 = !(O2 <= O3)
+unger         _f  _d  O1 = !(O2 <  O3)
+uneqr         _f  _d  O1 = !(O2 <  O3) && !(O2 >  O3)
+ltgtr         _f  _d  O1 = !(O2 >= O3) || !(O2 <= O3)
+ordr          _f  _d  O1 =  (O2 == O2) &&  (O3 == O3)
+unordr        _f  _d  O1 =  (O2 != O2) ||  (O3 != O3)
address@hidden example
+
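The difference between the ordered and the ``un'' comparisons only shows up when an operand is a NaN. The sketch below restates four rows of the table as plain C functions; the function names are hypothetical and the code assumes IEEE 754 comparison semantics (no fast-math style compiler options).

```c
#include <math.h>
#include <stdbool.h>

/* Each function restates one row of the table for double operands. */
static bool lt_d(double a, double b)    { return a < b; }            /* ltr_d    */
static bool unlt_d(double a, double b)  { return !(a >= b); }        /* unltr_d  */
static bool ord_d(double a, double b)   { return a == a && b == b; } /* ordr_d   */
static bool unord_d(double a, double b) { return a != a || b != b; } /* unordr_d */
```

With ordinary numbers lt_d and unlt_d agree; with a NaN operand lt_d yields false while unlt_d yields true, because every ordered comparison involving a NaN is false.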
address@hidden Transfer operations
+These accept two operands; for @code{ext} both of them must be
+registers, while @code{mov} accepts an immediate value as the second
+operand.
+
+Unlike @code{movr} and @code{movi}, the other instructions are used
+to truncate a wordsize operand to a smaller integer data type or to
+convert float data types. You can also use @code{extr} to convert an
+integer to a floating point value: the usual options are @code{extr_f}
+and @code{extr_d}.
+
address@hidden
+movr                                 _f  _d  O1 = O2
+movi                                 _f  _d  O1 = O2
+extr      _c  _uc  _s  _us  _i  _ui  _f  _d  O1 = O2
+truncr                               _f  _d  O1 = trunc(O2)
address@hidden example
+
+In 64-bit architectures it may be required to use @code{truncr_f_i},
address@hidden, @code{truncr_d_i} and @code{truncr_d_l} to match
+the equivalent C code.  Only the @code{_i} modifier is available in
+32-bit architectures.
+
address@hidden
+truncr_f_i    = <int> O1 = <float> O2
+truncr_f_l    = <long>O1 = <float> O2
+truncr_d_i    = <int> O1 = <double>O2
+truncr_d_l    = <long>O1 = <double>O2
address@hidden example
+
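Since the text above says these operations are meant to match the equivalent C code, the expected results can be illustrated with ordinary C casts, which truncate toward zero. The function names below are hypothetical and only mirror the mnemonics.

```c
/* C conversions from floating point to integer discard the fractional
 * part (truncation toward zero), for in-range values. */
static int  trunc_d_i(double x) { return (int)x; }   /* like truncr_d_i */
static long trunc_f_l(float x)  { return (long)x; }  /* like truncr_f_l */
```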
+The float conversion operations are @emph{destination first,
+source second}, but the order of the types is reversed.  This happens
+for historical reasons.
+
address@hidden
+extr_f_d    = <double>O1 = <float> O2
+extr_d_f    = <float> O1 = <double>O2
address@hidden example
+
address@hidden Network extensions
+These accept two operands, both of which must be registers; these
+two instructions actually perform the same task, yet they are
+assigned to two mnemonics for the sake of convenience and
+completeness.  As usual, the first operand is the destination and
+the second is the source.
+The @code{_ul} variant is only available in 64-bit architectures.
address@hidden
+htonr    _us _ui _ul @r{Host-to-network (big endian) order}
+ntohr    _us _ui _ul @r{Network-to-host order }
address@hidden example
+
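That the two mnemonics perform the same task can be seen from a plain C emulation: both directions are a byte swap on a little-endian host and a no-op on a big-endian one. This is a sketch for the 32-bit case only; swap32, host_is_little_endian, hton32 and ntoh32 are hypothetical helper names.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Detect the host byte order at run time. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Host-to-network (big endian) order. */
static uint32_t hton32(uint32_t v)
{
    return host_is_little_endian() ? swap32(v) : v;
}

/* Network-to-host order: the very same operation. */
static uint32_t ntoh32(uint32_t v)
{
    return hton32(v);
}
```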
address@hidden Load operations
address@hidden accepts two operands while @code{ldx} accepts three;
+in both cases, the last can be either a register or an immediate
+value. Values are extended (with or without sign, according to
+the data type specification) to fit a whole register.
+The @code{_ui} and @code{_l} types are only available in 64-bit
+architectures.  For convenience, there is a version without a
+type modifier for integer or pointer operands that uses the
+appropriate wordsize call.
address@hidden
+ldr     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *O2
+ldi     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *O2
+ldxr    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *(O2+O3)
+ldxi    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  O1 = *(O2+O3)
address@hidden example
+
address@hidden Store operations
address@hidden accepts two operands while @code{stx} accepts three; in
+both cases, the first can be either a register or an immediate
+value. Values are sign-extended to fit a whole register.
address@hidden
+str     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *O1 = O2
+sti     _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *O1 = O2
+stxr    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *(O1+O2) = O3
+stxi    _c  _uc  _s  _us  _i  _ui  _l  _f  _d  *(O1+O2) = O3
address@hidden example
+As for the load operations, the @code{_ui} and @code{_l} types are
+only available in 64-bit architectures, and for convenience, there
+is a version without a type modifier for integer or pointer operands
+that uses the appropriate wordsize call.
+
address@hidden Argument management
+These are:
address@hidden
+prepare     (not specified)
+va_start    (not specified)
+pushargr                                   _f  _d
+pushargi                                   _f  _d
+va_push     (not specified)
+arg         _c  _uc  _s  _us  _i  _ui  _l  _f  _d
+getarg      _c  _uc  _s  _us  _i  _ui  _l  _f  _d
+va_arg                                         _d
+putargr                                    _f  _d
+putargi                                    _f  _d
+ret         (not specified)
+retr                                       _f  _d
+reti                                       _f  _d
+va_end      (not specified)
+retval      _c  _uc  _s  _us  _i  _ui  _l  _f  _d
+epilog      (not specified)
address@hidden example
+As with other operations that use a type modifier, the @code{_ui} and
address@hidden types are only available in 64-bit architectures, but there
+are operations without a type modifier that alias to the appropriate
+integer operation with wordsize operands.
+
address@hidden, @code{pusharg}, and @code{retval} are used by the caller,
+while @code{arg}, @code{getarg} and @code{ret} are used by the callee.
+A code snippet that wants to call another procedure and has to pass
+arguments must, in order: use the @code{prepare} instruction and use
+the @code{pushargr} or @code{pushargi} to push the arguments @strong{in
+left to right order}; and use @code{finish} or @code{call} (explained below)
+to perform the actual call.
+
address@hidden returns a @code{C} compatible @code{va_list}. To fetch
+arguments, use @code{va_arg} for integers and @code{va_arg_d} for doubles.
address@hidden is required when passing a @code{va_list} to another function,
+because not all architectures expect it as a single pointer. A known case
+is DEC Alpha, which requires it as a structure passed by value.
+
address@hidden, @code{getarg} and @code{putarg} are used by the callee.
address@hidden is different from other instructions in that it does not
+actually generate any code: instead, it is a function which returns
+a value to be passed to @code{getarg} or @code{putarg}. @footnote{``Return
+a value'' means that @lightning{} code that compiles these
+instructions returns a value when expanded.} You should call
address@hidden as soon as possible, before any function call or, more
+easily, right after the @code{prolog} instruction
+(which is treated later).
+
address@hidden accepts a register argument and a value returned by
address@hidden, and will move that argument to the register, extending
+it (with or without sign, according to the data type specification)
+to fit a whole register.  These instructions are more intimately
+related to the usage of the @lightning{} instruction set in code
+that generates other code, so they will be treated more
+specifically in @ref{GNU lightning examples, , Generating code at
+run-time}.
+
address@hidden is a mix of @code{getarg} and @code{pusharg} in that
+it accepts as first argument a register or immediate, and as
+second argument a value returned by @code{arg}. It allows changing,
+or restoring an argument to the current function, and is a
+construct required to implement tail call optimization. Note that
+arguments in registers are very cheap, but may be overwritten
+at any moment, including by some operations, for example division,
+which on several ports is implemented as a function call.
+
+Finally, the @code{retval} instruction fetches the return value of a
+called function in a register.  The @code{retval} instruction takes a
+register argument and copies the return value of the previously called
+function in that register.  A function with a return value should use
address@hidden or @code{reti} to put the return value in the return register
+before returning.  @xref{Fibonacci, the Fibonacci numbers}, for an example.
+
address@hidden is an optional call that marks the end of a function
+body. It is automatically generated by @lightning{} when starting a new
+function (which should be done after a @code{ret} call) or when finishing
+jit generation.
+It is very important to note that, because @code{epilog} is
+optional, a common mistake is possible. Consider this:
address@hidden
+fun1:
+    prolog
+    ...
+    ret
+fun2:
+    prolog
address@hidden example
+Because @code{epilog} is added when a new @code{prolog} is found,
+the @code{fun2} label actually ends up before the return from
+@code{fun1}: @lightning{} understands the code above as:
address@hidden
+fun1:
+    prolog
+    ...
+    ret
+fun2:
+    epilog
+    prolog
address@hidden example
+
+You should observe a few rules when using these macros.  First of
+all, if calling a varargs function, you should use the @code{ellipsis}
+call to mark the position of the ellipsis in the C prototype.
+
+You should not nest calls to @code{prepare} inside a
address@hidden/finish} block.  Doing this will result in undefined
+behavior. Note that for functions with zero arguments you can use
+just @code{call}.
+
address@hidden Branch instructions
+Like @code{arg}, these also return a value which, in this case,
+is to be used to compile forward branches as explained in
address@hidden, , Fibonacci numbers}.  They accept two operands to be
+compared; of these, the last can be either a register or an immediate.
+They are:
address@hidden
+bltr      _u  _f  _d  @r{if }(O2 <  O3)@r{ goto }O1
+blti      _u  _f  _d  @r{if }(O2 <  O3)@r{ goto }O1
+bler      _u  _f  _d  @r{if }(O2 <= O3)@r{ goto }O1
+blei      _u  _f  _d  @r{if }(O2 <= O3)@r{ goto }O1
+bgtr      _u  _f  _d  @r{if }(O2 >  O3)@r{ goto }O1
+bgti      _u  _f  _d  @r{if }(O2 >  O3)@r{ goto }O1
+bger      _u  _f  _d  @r{if }(O2 >= O3)@r{ goto }O1
+bgei      _u  _f  _d  @r{if }(O2 >= O3)@r{ goto }O1
+beqr          _f  _d  @r{if }(O2 == O3)@r{ goto }O1
+beqi          _f  _d  @r{if }(O2 == O3)@r{ goto }O1
+bner          _f  _d  @r{if }(O2 != O3)@r{ goto }O1
+bnei          _f  _d  @r{if }(O2 != O3)@r{ goto }O1
+
+bunltr        _f  _d  @r{if }!(O2 >= O3)@r{ goto }O1
+bunler        _f  _d  @r{if }!(O2 >  O3)@r{ goto }O1
+bungtr        _f  _d  @r{if }!(O2 <= O3)@r{ goto }O1
+bunger        _f  _d  @r{if }!(O2 <  O3)@r{ goto }O1
+buneqr        _f  _d  @r{if }!(O2 <  O3) && !(O2 >  O3)@r{ goto }O1
+bltgtr        _f  _d  @r{if }!(O2 >= O3) || !(O2 <= O3)@r{ goto }O1
+bordr         _f  _d  @r{if } (O2 == O2) &&  (O3 == O3)@r{ goto }O1
+bunordr       _f  _d  @r{if } (O2 != O2) ||  (O3 != O3)@r{ goto }O1
+
+bmsr                  @r{if }O2 &  address@hidden goto }O1
+bmsi                  @r{if }O2 &  address@hidden goto }O1
+bmcr                  @r{if }!(O2 & O3)@r{ goto }O1
+bmci                  @r{if }!(O2 & O3)@r{ goto address@hidden mnemonics mean, respectively, @dfn{branch if mask set} and @dfn{branch if mask cleared}.}
+boaddr    _u          O2 += address@hidden, goto address@hidden if overflow}
+boaddi    _u          O2 += address@hidden, goto address@hidden if overflow}
+bxaddr    _u          O2 += address@hidden, goto address@hidden if no overflow}
+bxaddi    _u          O2 += address@hidden, goto address@hidden if no overflow}
+bosubr    _u          O2 -= address@hidden, goto address@hidden if overflow}
+bosubi    _u          O2 -= address@hidden, goto address@hidden if overflow}
+bxsubr    _u          O2 -= address@hidden, goto address@hidden if no overflow}
+bxsubi    _u          O2 -= address@hidden, goto address@hidden if no overflow}
address@hidden example
+
address@hidden Jump and return operations
+These accept one argument except @code{ret} and @code{jmpi} which
+have none; the difference between @code{finishi} and @code{calli}
+is that the latter does not clean the stack from pushed parameters
+(if any) and the former must @strong{always} follow a @code{prepare}
+instruction.
address@hidden
+callr     (not specified)                @r{function call to register O1}
+calli     (not specified)                @r{function call to immediate O1}
+finishr   (not specified)                @r{function call to register O1}
+finishi   (not specified)                @r{function call to immediate O1}
+jmpr      (not specified)                @r{unconditional jump to register}
+jmpi      (not specified)                @r{unconditional jump}
+ret       (not specified)                @r{return from subroutine}
+retr      _c _uc _s _us _i _ui _l _f _d
+reti      _c _uc _s _us _i _ui _l _f _d
+retval    _c _uc _s _us _i _ui _l _f _d  @r{move return value}
+                                         @r{to register}
address@hidden example
+
+Like the branch instructions, @code{jmpi} also returns a value which is to
+be used to compile forward branches. @xref{Fibonacci, , Fibonacci
+numbers}.
+
address@hidden Labels
+There are 3 @lightning{} instructions to create labels:
address@hidden
+label     (not specified)                @r{simple label}
+forward   (not specified)                @r{forward label}
+indirect  (not specified)                @r{special simple label}
address@hidden example
+
address@hidden is normally used as @code{patch_at} argument for backward
+jumps.
+
address@hidden
+        jit_node_t *jump, *label;
+label = jit_label();
+        ...
+        jump = jit_beqr(JIT_R0, JIT_R1);
+        jit_patch_at(jump, label);
address@hidden example
+
address@hidden is used to patch code generation before the actual
+position of the label is known.
+
address@hidden
+        jit_node_t *jump, *label;
+label = jit_forward();
+        jump = jit_beqr(JIT_R0, JIT_R1);
+        jit_patch_at(jump, label);
+        ...
+        jit_link(label);
address@hidden example
+
address@hidden is useful when creating jump tables, and tells
address@hidden to not optimize out a label that is not the target of
+any jump, because an indirect jump may land where it is defined.
+
address@hidden
+        jit_node_t *jump, *label;
+        ...
+        jit_jmpr(JIT_R0);                @rem{/* may jump to label */}
+        ...
+label = jit_indirect();
address@hidden example
+
address@hidden is a special case of @code{note} and @code{name}
+because it is a valid argument to @code{address}.
+
+Note that the usual idiom to write the previous example is
address@hidden
+        jit_node_t *addr, *jump;
+addr  = jit_movi(JIT_R0, 0);             @rem{/* immediate is ignored */}
+        ...
+        jit_jmpr(JIT_R0);
+        ...
+        jit_patch(addr);                 @rem{/* implicit label added */}
address@hidden example
+
+which automatically binds the implicit label added by @code{patch} to
+the @code{movi}; under some special conditions, however, it is necessary
+to create an "unbound" label.
+
address@hidden Function prolog
+
+These macros are used to set up a function prolog.  The @code{allocai}
+call accepts a single integer argument and returns an offset value
+for stack storage access.  The @code{allocar} call accepts two register
+arguments: the first is set to the offset for stack access, and the
+second holds the size in bytes.
+
address@hidden
+prolog    (not specified)                @r{function prolog}
+allocai   (not specified)                @r{reserve space on the stack}
+allocar   (not specified)                @r{allocate space on the stack}
address@hidden example
+
+@code{allocai} receives the number of bytes to allocate and returns
+the offset from the frame pointer register @code{FP} to the base of
+the area.
+
+@code{allocar} receives two register arguments.  The first is where
+to store the offset from the frame pointer register @code{FP} to the
+base of the area.  The second argument is the size in bytes.  Note
+that @code{allocar} performs dynamic allocation, so special care
+should be taken when using it.  If called in a loop, every iteration
+will allocate stack space.  Stack space is aligned from 8 to 64 bytes
+depending on backend requirements, even if allocating only one byte.
+It is advisable not to use it with @code{frame} and @code{tramp}; it
+should work with @code{frame} provided it is called only once,
+but it is not supported in @code{tramp}, even if called only
+once.
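+
+To get a feeling for the cost of calling @code{allocar} in a loop, the
+following plain C sketch rounds each request up to an alignment and
+multiplies by the number of iterations.  It is only a model: the
+helper names are hypothetical, and the assumption that every request
+is rounded up to a power-of-two alignment between 8 and 64 bytes is
+taken from the paragraph above, not from any particular backend.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (a power of two). */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Stack bytes consumed when a dynamic allocation of `size` bytes
   is executed `iterations` times without being released. */
static size_t stack_used(size_t size, size_t align, size_t iterations)
{
    return align_up(size, align) * iterations;
}
```

+
+Under these assumptions, @code{stack_used(1, 16, 1000)} shows that a
+one-byte request repeated a thousand times already costs 16000 bytes
+of stack.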
+
+As a small appetizer, here is a small function that adds 1 to the input
+parameter (an @code{int}).  I'm using an assembly-like syntax here which
+is a bit different from the one used when writing real subroutines with
+@lightning{}; the real syntax will be introduced in @ref{GNU lightning
+examples, , Generating code at run-time}.
+
address@hidden
+incr:
+     prolog
+in = arg                     @rem{! We have an integer argument}
+     getarg    R0, in        @rem{! Move it to R0}
+     addi      R0, R0, 1     @rem{! Add 1}
+     retr      R0            @rem{! And return the result}
address@hidden example
+
+And here is another function which uses the @code{printf} function from
+the standard C library to write a number in hexadecimal notation:
+
address@hidden
+printhex:
+     prolog
+in = arg                     @rem{! Same as above}
+     getarg    R0, in
+     prepare                 @rem{! Begin call sequence for printf}
+     pushargi  "%x"          @rem{! Push format string}
+     ellipsis                @rem{! Varargs start here}
+     pushargr  R0            @rem{! Push second argument}
+     finishi   printf        @rem{! Call printf}
+     ret                     @rem{! Return to caller}
address@hidden example
+
address@hidden Trampolines, continuations and tail call optimization
+
+Frequently it is required to generate jit code that must jump to
+code generated later, possibly from another @code{jit_context_t}.
+These require compatible stack frames.
+
+@lightning{} provides two primitives from which trampolines,
+continuations and tail call optimization can be implemented.
+
address@hidden
+frame   (not specified)                  @r{create stack frame}
+tramp   (not specified)                  @r{assume stack frame}
address@hidden example
+
+@code{frame} receives an integer argument@footnote{It is not
+automatically computed because it does not know about the
+requirement of later generated code.} that defines the size in
+bytes for the stack frame of the current, @code{C} callable,
+jit function. To calculate this value, a good formula is maximum
+number of arguments to any called native function times
+eight@footnote{Times eight so that it works for double arguments.
+And would not need conditionals for ports that pass arguments in
+the stack.}, plus the sum of the arguments to any call to
+@code{jit_allocai}.  @lightning{} automatically adjusts this value
+for any backend specific stack memory it may need, or any
+alignment constraint.
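+
+The rule of thumb above can be written down as a small plain C helper.
+This is a sketch only: @code{frame_size} is a hypothetical name, and it
+assumes that the stack allocations to be summed are the byte counts
+requested through @code{allocai}.  Any backend specific adjustment is
+left to @lightning{}, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate the argument for `frame`:
   (largest argument count of any called native function) * 8
   + the sum of all stack allocation requests. */
static size_t frame_size(size_t max_call_args,
                         const size_t *alloc_sizes, size_t n_allocs)
{
    size_t total = max_call_args * 8;
    size_t i;

    assert(n_allocs == 0 || alloc_sizes != NULL);
    for (i = 0; i < n_allocs; i++)
        total += alloc_sizes[i];
    return total;
}
```

+
+For instance, code that calls native functions with at most four
+arguments and requests two areas of 128 and 32 bytes would pass 192
+to @code{frame}.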
+
+@code{frame} also instructs @lightning{} to save all callee
+save registers in the prolog and reload them in the epilog.
+
address@hidden
+main:                        @rem{! jit entry point}
+     prolog                  @rem{! function prolog}
+     frame  256              @rem{! save all callee save registers and}
+                             @rem{! reserve at least 256 bytes in stack}
+main_loop:
+     ...
+     jmpi   handler          @rem{! jumps to external code}
+     ...
+     ret                     @rem{! return to the caller}
address@hidden example
+
+@code{tramp} differs from @code{frame} only in that a prolog and epilog
+will not be generated. Note that @code{prolog} must still be used.
+The code under @code{tramp} must be ready to be entered with a jump
+at the prolog position, and instead of a return, it must end with
+an unconditional jump. @code{tramp} exists solely because
+it allows optimizing out prolog and epilog code that would
+never be executed.
+
address@hidden
+handler:                     @rem{! handler entry point}
+     prolog                  @rem{! function prolog}
+     tramp  256              @rem{! assumes all callee save registers}
+                             @rem{! are saved and there is at least}
+                             @rem{! 256 bytes in stack}
+     ...
+     jmpi   main_loop        @rem{! return to the main loop}
address@hidden example
+
address@hidden only supports Tail Call Optimization using the
address@hidden construct. Any other way is not guaranteed to
+work on all ports.
+
+An example of a simple (recursive) tail call optimization:
+
address@hidden
+factorial:                   @rem{! Entry point of the factorial function}
+     prolog
+in = arg                     @rem{! Receive an integer argument}
+     getarg R0, in           @rem{! Move argument to R0}
+     prepare
+         pushargi 1          @rem{! This is the accumulator}
+         pushargr R0         @rem{! This is the argument}
+     finishi fact            @rem{! Call the tail call optimized function}
+     retval R0               @rem{! Fetch the result}
+     retr R0                 @rem{! Return it}
+     epilog                  @rem{! Epilog *before* label before prolog}
+
+fact:                        @rem{! Entry point of the helper function}
+     prolog
+     frame 16                @rem{! Reserve 16 bytes in the stack}
+fact_entry:                  @rem{! This is the tail call entry point}
+ac = arg                     @rem{! The accumulator is the first argument}
+in = arg                     @rem{! The factorial argument}
+     getarg R0, ac           @rem{! Move the accumulator to R0}
+     getarg R1, in           @rem{! Move the argument to R1}
+     blei fact_out, R1, 1    @rem{! Done if argument is one or less}
+     mulr R0, R0, R1         @rem{! accumulator *= argument}
+     putargr R0, ac          @rem{! Update the accumulator}
+     subi R1, R1, 1          @rem{! argument -= 1}
+     putargr R1, in          @rem{! Update the argument}
+     jmpi fact_entry         @rem{! Tail Call Optimize it!}
+fact_out:
+     retr R0                 @rem{! Return the accumulator}
address@hidden example
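+
+For comparison, this is roughly what the listing above computes when
+written in plain C.  It is a sketch, not generated code: the function
+names are hypothetical, and the jump back to @code{fact_entry} is
+modelled as a loop that updates the two arguments in place, which is
+the effect of the @code{putargr} and @code{jmpi} pair.

```c
#include <assert.h>

/* Mirrors `fact`: ac is the accumulator, n the remaining argument. */
static long fact_acc(long ac, long n)
{
    for (;;) {              /* fact_entry:                        */
        if (n <= 1)         /* blei fact_out, R1, 1               */
            return ac;      /* fact_out: retr R0                  */
        ac *= n;            /* mulr R0, R0, R1 ; putargr R0, ac   */
        n  -= 1;            /* subi R1, R1, 1  ; putargr R1, in   */
    }                       /* jmpi fact_entry                    */
}

/* Mirrors `factorial`: start the helper with an accumulator of 1. */
static long factorial(long n)
{
    assert(n >= 0);
    return fact_acc(1, n);
}
```

+
+No stack frame is created per step; the same two argument slots are
+reused on every iteration, which is what makes the jump a tail call.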
+
address@hidden Predicates
address@hidden
+forward_p      (not specified)           @r{forward label predicate}
+indirect_p     (not specified)           @r{indirect label predicate}
+target_p       (not specified)           @r{used label predicate}
+arg_register_p (not specified)           @r{argument kind predicate}
+callee_save_p  (not specified)           @r{callee save predicate}
+pointer_p      (not specified)           @r{pointer predicate}
address@hidden example
+
+@code{forward_p} expects a @code{jit_node_t*} argument, and
+returns non zero if it is a forward label reference, that is,
+a label returned by @code{forward}, that still needs a
+@code{link} call.
+
+@code{indirect_p} expects a @code{jit_node_t*} argument, and returns
+non zero if it is an indirect label reference, that is, a label that
+was returned by @code{indirect}.
+
+@code{target_p} expects a @code{jit_node_t*} argument, that is any
+kind of label, and will return non zero if there is at least one
+jump or move referencing it.
+
+@code{arg_register_p} expects a @code{jit_node_t*} argument, that must
+have been returned by @code{arg}, @code{arg_f} or @code{arg_d}, and
+will return non zero if the argument lives in a register. This call
+is useful to know the live range of register arguments, as those
+are very fast to read and write, but have volatile values.
+
+@code{callee_save_p} expects a valid @code{JIT_Rn}, @code{JIT_Vn}, or
+@code{JIT_Fn}, and will return non zero if the register is callee
+save. This call is useful because on several ports, the @code{JIT_Rn}
+and @code{JIT_Fn} registers are actually callee save, so there is no need
+to save and load the values when making function calls.
+
+@code{pointer_p} expects a pointer argument, and will return non
+zero if the pointer is inside the generated jit code. Must be
+called after @code{jit_emit} and before @code{jit_destroy_state}.
address@hidden table
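+
+The question answered by the pointer predicate, whether an address
+falls inside the generated code, amounts to a range check.  The plain
+C sketch below shows such a check over an arbitrary buffer; the name
+@code{in_code_range} is hypothetical and nothing here claims to be how
+@lightning{} implements the predicate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return non zero if ptr lies within [code, code + length). */
static int in_code_range(const void *ptr, const void *code, size_t length)
{
    uintptr_t p    = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)code;

    assert(code != NULL);
    return p >= base && p - base < length;
}
```

+
+The first byte of the buffer is inside the range, while the address
+one past its end is not.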
+
address@hidden GNU lightning examples
address@hidden Generating code at run-time
+
+To use @lightning{}, you should include the @file{lightning.h} file that
+is put in your include directory by the @samp{make install} command.
+
+Each of the instructions above translates to a macro or function call.
+All you have to do is prepend @code{jit_} (lowercase) to opcode names
+and @code{JIT_} (uppercase) to register names.  Of course, parameters
+are to be put between parentheses.
+
+This small tutorial presents four examples:
+
address@hidden
address@hidden @bullet
address@hidden
+The @code{incr} function found in @ref{The instruction set, ,
+@lightning{}'s instruction set}:
+
address@hidden
+A simple function call to @code{printf}
+
address@hidden
+An RPN calculator.
+
address@hidden
+Fibonacci numbers
address@hidden itemize
address@hidden iftex
address@hidden
address@hidden
+* incr::             A function which increments a number by one
+* printf::           A simple function call to printf
+* RPN calculator::   A more complex example, an RPN calculator
+* Fibonacci::        Calculating Fibonacci numbers
address@hidden menu
address@hidden ifnottex
+
address@hidden incr
address@hidden A function which increments a number by one
+
+Let's see how to create and use the sample @code{incr} function created
+in @ref{The instruction set, , @lightning{}'s instruction set}:
+
address@hidden
+#include <stdio.h>
+#include <lightning.h>
+
+static jit_state_t *_jit;
+
+typedef int (*pifi)(int);    @rem{/* Pointer to Int Function of Int */}
+
+int main(int argc, char *argv[])
address@hidden
+  jit_node_t  *in;
+  pifi         incr;
+
+  init_jit(argv[0]);
+  _jit = jit_new_state();
+
+  jit_prolog();                    @rem{/* @t{     prolog             } */}
+  in = jit_arg();                  @rem{/* @t{     in = arg           } */}
+  jit_getarg(JIT_R0, in);          @rem{/* @t{     getarg R0          } */}
+  jit_addi(JIT_R0, JIT_R0, 1);     @rem{/* @t{     addi   R0@comma{} R0@comma{} 1   } */}
+  jit_retr(JIT_R0);                @rem{/* @t{     retr   R0          } */}
+
+  incr = jit_emit();
+  jit_clear_state();
+
+  @rem{/* call the generated code@comma{} passing 5 as an argument */}
+  printf("%d + 1 = %d\n", 5, incr(5));
+
+  jit_destroy_state();
+  finish_jit();
+  return 0;
address@hidden
address@hidden example
+
+Let's examine the code line by line (well, address@hidden):
+
address@hidden @t
address@hidden #include <lightning.h>
+You already know about this.  It defines all of @lightning{}'s macros.
+
address@hidden static jit_state_t *_jit;
+You might wonder what @code{jit_state_t} is.  It is a structure
+that stores jit code generation information.  The name @code{_jit} is
+special: since multiple jit generators can run at the same
+time, you must either @r{#define _jit my_jit_state} or name it
+@code{_jit}.
+
address@hidden typedef int (*pifi)(int);
+Just a handy typedef for a pointer to a function that takes an
address@hidden and returns another.
+
address@hidden jit_node_t  *in;
+Declares a variable to hold an identifier for a function argument. It
+is an opaque pointer, that will hold the return of a call to @code{arg}
+and be used as argument to @code{getarg}.
+
address@hidden pifi         incr;
+Declares a function pointer variable to a function that receives an
address@hidden and returns an @code{int}.
+
address@hidden init_jit(argv[0]);
+You must call this function before creating a @code{jit_state_t}
+object. This function does global state initialization, and may need
+to detect CPU or Operating System features.  It receives a string
+argument that is later used to read symbols from a shared object using
+GNU binutils if disassembly was enabled at configure time. If no
+disassembly will be performed a NULL pointer can be used as argument.
+
address@hidden _jit = jit_new_state();
+This call initializes a @lightning{} jit state.
+
address@hidden jit_prolog();
+Ok, so we start generating code for our beloved address@hidden
+
address@hidden in = jit_arg();
address@hidden jit_getarg(JIT_R0, in);
+We retrieve the first (and only) argument, an integer, and store it
+into the general-purpose register @code{R0}.
+
address@hidden jit_addi(JIT_R0, JIT_R0, 1);
+We add one to the content of the register.
+
address@hidden jit_retr(JIT_R0);
+This instruction generates a standard function epilog that returns
+the contents of the @code{R0} register.
+
address@hidden incr = jit_emit();
+This instruction is very important.  It actually translates the
+@lightning{} macros used before to machine code, flushes the generated
+code area out of the processor's instruction cache and returns a
+pointer to the start of the code.
+
address@hidden jit_clear_state();
+This call cleans up any data not required for jit execution. Note
+that it must be called after any call to @code{jit_print} or
+@code{jit_address}, as this call destroys the @lightning{}
+intermediate representation.
+
address@hidden printf("%d + 1 = %d", 5, incr(5));
+Calling our function is this simple---it is not distinguishable from
+a normal C function call, the only difference being that @code{incr}
+is a variable.
+
address@hidden jit_destroy_state();
+Releases all memory associated with the jit context. It should be
+called once it is known that the jit code will no longer be called.
+
address@hidden finish_jit();
+This call cleans up any global state held by @lightning{}; it is
+advisable to call it once jit code will no longer be generated.
address@hidden table
+
+@lightning{} abstracts two phases of dynamic code generation: selecting
+instructions that map the standard representation, and emitting binary
+code for these instructions.  The client program has the responsibility
+of describing the code to be generated using the standard @lightning{}
+instruction set.
+
+Let's examine the code generated for @code{incr} on the SPARC and x86_64
+architectures (on the right is the code that an assembly-language
+programmer would write):
+
address@hidden @b
address@hidden SPARC
address@hidden
+      save  %sp, -112, %sp
+      mov  %i0, %g2                 retl
+      inc  %g2                      inc %o0
+      mov  %g2, %i0
+      restore 
+      retl 
+      nop 
address@hidden example
+In this case, @lightning{} introduces overhead to create a register
+window (not knowing that the procedure is a leaf procedure) and to
+move the argument to the general purpose register @code{R0} (which
+maps to @code{%g2} on the SPARC).
address@hidden table
+
address@hidden @b
address@hidden x86_64
address@hidden
+    sub   $0x30,%rsp
+    mov   %rbp,(%rsp)
+    mov   %rsp,%rbp
+    sub   $0x18,%rsp
+    mov   %rdi,%rax            mov %rdi, %rax
+    add   $0x1,%rax            inc %rax
+    mov   %rbp,%rsp
+    mov   (%rsp),%rbp
+    add   $0x30,%rsp
+    retq                       retq
address@hidden example
+In this case, the main overhead is due to the function's prolog and
+epilog, and stack alignment after reserving stack space for word
+to/from float conversions or moving data from/to x87 to/from SSE.
+Note that besides allocating space to save callee saved registers,
+no registers are saved/restored because @lightning{} notices those
+registers are not modified. There is currently no logic to detect
+whether stack space is needed for type conversions, nor
+proper leaf function detection, but these are subject to change
+(FIXME).
address@hidden table
+
address@hidden printf
address@hidden A simple function call to @code{printf}
+
+Again, here is the code for the example:
+
address@hidden
+#include <stdio.h>
+#include <lightning.h>
+
+static jit_state_t *_jit;
+
+typedef void (*pvfi)(int);      @rem{/* Pointer to Void Function of Int */}
+
+int main(int argc, char *argv[])
address@hidden
+  pvfi          myFunction;             @rem{/* ptr to generated code */}
+  jit_node_t    *start, *end;           @rem{/* a couple of labels */}
+  jit_node_t    *in;                    @rem{/* to get the argument */}
+
+  init_jit(argv[0]);
+  _jit = jit_new_state();
+
+  start = jit_note(__FILE__, __LINE__);
+  jit_prolog();
+  in = jit_arg();
+  jit_getarg(JIT_R1, in);
+  jit_prepare();
+  jit_pushargi((jit_word_t)"generated %d bytes\n");
+  jit_ellipsis();
+  jit_pushargr(JIT_R1);
+  jit_finishi(printf);
+  jit_ret();
+  jit_epilog();
+  end = jit_note(__FILE__, __LINE__);
+
+  myFunction = jit_emit();
+
+  @rem{/* call the generated code@comma{} passing its size as argument */}
+  myFunction((char*)jit_address(end) - (char*)jit_address(start));
+  jit_clear_state();
+
+  jit_disassemble();
+
+  jit_destroy_state();
+  finish_jit();
+  return 0;
address@hidden
address@hidden example
+
+The function shows how many bytes were generated.  Most of the code
+is not very interesting, as it resembles very closely the program
+presented in @ref{incr, , A function which increments a number by one}.
+
+For this reason, we're going to concentrate on just a few statements.
+
address@hidden @t
address@hidden start = jit_note(__FILE__, __LINE__);
address@hidden @address@hidden
address@hidden end = jit_note(__FILE__, __LINE__);
+These two instructions call the @code{jit_note} macro, which creates
+a note in the jit code; arguments to @code{jit_note} usually are a
+filename string and line number integer, but using NULL for the
+string argument is perfectly valid if you only need to create a simple
+marker in the code.
+
address@hidden jit_ellipsis();
address@hidden usually is only required if calling varargs functions
+with double arguments, but it is a good practice to properly describe
+the @address@hidden in the call sequence.
+
address@hidden jit_pushargi((jit_word_t)"generated %d bytes\n");
+Note the use of the @code{(jit_word_t)} cast, that is used only
+to avoid a compiler warning, due to using a pointer where a
+wordsize integer type was expected.
+
address@hidden jit_prepare();
address@hidden @address@hidden
address@hidden jit_finishi(printf);
+Once the arguments to @code{printf} have been pushed, which means
+moving them to stack or register arguments, the @code{printf}
+function is called and the stack cleaned.  Note how @lightning{}
+abstracts the differences between different architectures and
+ABIs -- the client program does not know how parameter passing
+works on the host architecture.
+
address@hidden jit_epilog();
+Usually it is not required to call @code{epilog}, because it
+is implicitly called when the end of a function is noticed.  Here,
+however, if the @code{end} variable were set with a @code{note} call
+right after the @code{ret}, it would not cover the function epilog.
+
address@hidden myFunction((char*)jit_address(end) - (char*)jit_address(start));
+This calls the generated jit function, passing as argument the offset
+difference between the @code{start} and @code{end} notes. The @code{address}
+call must be done after the @code{emit} call; otherwise either a fatal error
+will happen (if @lightning{} is built with assertions enabled) or an
+undefined value will be returned.
+
address@hidden jit_clear_state();
+Note that @code{jit_clear_state} was called after executing jit in
+this example. It was done because it must be called after any call
+to @code{jit_address} or @code{jit_print}.
+
address@hidden jit_disassemble();
address@hidden will dump the generated code to standard output,
+unless @lightning{} was built with the disassembler disabled, in which
+case no output will be shown.
address@hidden table
+
address@hidden RPN calculator
address@hidden A more complex example, an RPN calculator
+
+We create a small stack-based RPN calculator which applies a series
+of operators to a given parameter and to other numeric operands.
+Unlike previous examples, the code generator is fully parameterized
+and is able to compile different formulas to different functions.
+Here is the code for the expression compiler; a sample usage will
+follow.
+
+Since @lightning{} does not provide push/pop instructions, this
+example uses a stack-allocated area to store the data.  Such an
+area can be allocated using the macro @code{allocai}, which
+receives the number of bytes to allocate and returns the offset
+from the frame pointer register @code{FP} to the base of the
+area.
+
+Usually, you will use the @code{ldxi} and @code{stxi} instructions
+to access stack-allocated variables.  However, it is possible to
+use operations such as @code{add} to compute the address of the
+variables, and pass the address around.
+
address@hidden
+#include <stdio.h>
+#include <stdlib.h>
+#include <lightning.h>
+
+typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}
+
+static jit_state_t *_jit;
+
+void stack_push(int reg, int *sp)
address@hidden
+  jit_stxi_i (*sp, JIT_FP, reg);
+  *sp += sizeof (int);
address@hidden
+
+void stack_pop(int reg, int *sp)
address@hidden
+  *sp -= sizeof (int);
+  jit_ldxi_i (reg, JIT_FP, *sp);
address@hidden
+
+jit_node_t *compile_rpn(char *expr)
address@hidden
+  jit_node_t *in, *fn;
+  int stack_base, stack_ptr;
+
+  fn = jit_note(NULL, 0);
+  jit_prolog();
+  in = jit_arg();
+  stack_ptr = stack_base = jit_allocai (32 * sizeof (int));
+
+  jit_getarg_i(JIT_R2, in);
+
+  while (*expr) @{
+    char buf[32];
+    int n;
+    if (sscanf(expr, "%[0-9]%n", buf, &n)) @{
+      expr += n - 1;
+      stack_push(JIT_R0, &stack_ptr);
+      jit_movi(JIT_R0, atoi(buf));
+    @} else if (*expr == 'x') @{
+      stack_push(JIT_R0, &stack_ptr);
+      jit_movr(JIT_R0, JIT_R2);
+    @} else if (*expr == '+') @{
+      stack_pop(JIT_R1, &stack_ptr);
+      jit_addr(JIT_R0, JIT_R1, JIT_R0);
+    @} else if (*expr == '-') @{
+      stack_pop(JIT_R1, &stack_ptr);
+      jit_subr(JIT_R0, JIT_R1, JIT_R0);
+    @} else if (*expr == '*') @{
+      stack_pop(JIT_R1, &stack_ptr);
+      jit_mulr(JIT_R0, JIT_R1, JIT_R0);
+    @} else if (*expr == '/') @{
+      stack_pop(JIT_R1, &stack_ptr);
+      jit_divr(JIT_R0, JIT_R1, JIT_R0);
+    @} else @{
+      fprintf(stderr, "cannot compile: %s\n", expr);
+      abort();
+    @}
+    ++expr;
+  @}
+  jit_retr(JIT_R0);
+  jit_epilog();
+  return fn;
address@hidden
address@hidden example
+
+The principle on which the calculator is based is easy: the stack top
+is held in R0, while the remaining items of the stack are held in the
+memory area that we allocate with @code{allocai}.  Compiling a numeric
+operand or the argument @code{x} pushes the old stack top onto the
+stack and moves the operand into R0; compiling an operator pops the
+second operand off the stack into R1, and compiles the operation so
+that the result goes into R0, thus becoming the new stack top.
+
+This example allocates a fixed area for 32 @code{int}s.  This is not
+a problem when the function is a leaf like in this case; in a full-blown
+compiler you will want to analyze the input and determine the number
+of needed stack slots---a very simple example of register allocation.
+The area is then managed like a stack using @code{stack_push} and
address@hidden
+
+Source code for the client (which lies in the same source file) follows:
+
address@hidden
+int main(int argc, char *argv[])
address@hidden
+  jit_node_t *nc, *nf;
+  pifi c2f, f2c;
+  int i;
+
+  init_jit(argv[0]);
+  _jit = jit_new_state();
+
+  nc = compile_rpn("32x9*5/+");
+  nf = compile_rpn("x32-5*9/");
+  (void)jit_emit();
+  c2f = (pifi)jit_address(nc);
+  f2c = (pifi)jit_address(nf);
+  jit_clear_state();
+
+  printf("\nC:");
+  for (i = 0; i <= 100; i += 10) printf("%3d ", i);
+  printf("\nF:");
+  for (i = 0; i <= 100; i += 10) printf("%3d ", c2f(i));
+  printf("\n");
+
+  printf("\nF:");
+  for (i = 32; i <= 212; i += 18) printf("%3d ", i);
+  printf("\nC:");
+  for (i = 32; i <= 212; i += 18) printf("%3d ", f2c(i));
+  printf("\n");
+
+  jit_destroy_state();
+  finish_jit();
+  return 0;
address@hidden
address@hidden example
+
+The client displays a conversion table between Celsius and Fahrenheit
+degrees (both Celsius-to-Fahrenheit and Fahrenheit-to-Celsius). The
+formulas are, @math{F(c) = c*9/5+32} and @math{C(f) = (f-32)*5/9},
+respectively.
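+
+To check what the two formulas should yield without generating any
+code, the same stack discipline can be run directly in plain C.  The
+sketch below is an interpreter, not a compiler: @code{eval_rpn} is a
+hypothetical helper in which the variable @code{top} plays the role of
+R0 and the array @code{slots} plays the role of the area reserved with
+@code{allocai}.  Unlike @code{compile_rpn}, it reports errors through
+its return value instead of calling @code{abort}.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define RPN_SLOTS 32

/* Evaluate expr for a given x.  Returns 0 and stores the value in
   *result on success, or -1 on a malformed expression. */
static int eval_rpn(const char *expr, int x, int *result)
{
    int slots[RPN_SLOTS];   /* stands in for the allocai area */
    int sp  = 0;            /* number of slots in use */
    int top = 0;            /* stands in for R0, the stack top */

    assert(expr != NULL && result != NULL);
    while (*expr) {
        if (isdigit((unsigned char)*expr)) {
            char *end;
            long value = strtol(expr, &end, 10);
            if (sp == RPN_SLOTS)
                return -1;
            slots[sp++] = top;      /* push the old stack top */
            top = (int)value;
            expr = end;
            continue;
        }
        if (*expr == 'x') {
            if (sp == RPN_SLOTS)
                return -1;
            slots[sp++] = top;
            top = x;
        } else if (*expr == '+' || *expr == '-' ||
                   *expr == '*' || *expr == '/') {
            int lhs;
            if (sp == 0)
                return -1;
            lhs = slots[--sp];      /* pop the second operand */
            switch (*expr) {
            case '+': top = lhs + top; break;
            case '-': top = lhs - top; break;
            case '*': top = lhs * top; break;
            default:
                if (top == 0)
                    return -1;      /* refuse to divide by zero */
                top = lhs / top;
                break;
            }
        } else {
            return -1;
        }
        ++expr;
    }
    *result = top;
    return 0;
}
```

+
+With the two expressions used by the client, an input of 100 to the
+Celsius-to-Fahrenheit formula gives 212, and an input of 212 to the
+Fahrenheit-to-Celsius formula gives 100.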
+
+Providing the formula as an argument to @code{compile_rpn} effectively
+parameterizes code generation, making it possible to use the same code
+to compile different functions; this is what makes dynamic code
+generation so powerful.
+
address@hidden Fibonacci
address@hidden Fibonacci numbers
+
+The code in this section calculates the Fibonacci sequence. That is
+modeled by the recurrence relation:
address@hidden
+     f(0) = 0
+     f(1) = f(2) = 1
+     f(n) = f(n-1) + f(n-2)
address@hidden display
+
+The purpose of this example is to introduce branches.  There are two
+kinds of branches: backward branches and forward branches.  We'll
+present the calculation in a recursive and iterative form; the
+former only uses forward branches, while the latter uses both.
+
address@hidden
+#include <stdio.h>
+#include <lightning.h>
+
+static jit_state_t *_jit;
+
+typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}
+
+int main(int argc, char *argv[])
address@hidden
+  pifi       fib;
+  jit_node_t *label;
+  jit_node_t *call;
+  jit_node_t *in;                 @rem{/* offset of the argument */}
+  jit_node_t *ref;                @rem{/* to patch the forward reference */}
+  jit_node_t *zero;               @rem{/* to patch the forward reference */}
+
+  init_jit(argv[0]);
+  _jit = jit_new_state();
+
+  label = jit_label();
+        jit_prolog   ();
+  in =  jit_arg      ();
+        jit_getarg   (JIT_R0, in);              @rem{/* R0 = n */}
+ zero = jit_beqi     (JIT_R0, 0);
+        jit_movr     (JIT_V0, JIT_R0);          /* V0 = R0 */
+        jit_movi     (JIT_R0, 1);
+  ref = jit_blei     (JIT_V0, 2);
+        jit_subi     (JIT_V1, JIT_V0, 1);       @rem{/* V1 = n-1 */}
+        jit_subi     (JIT_V2, JIT_V0, 2);       @rem{/* V2 = n-2 */}
+        jit_prepare();
+          jit_pushargr(JIT_V1);
+        call = jit_finishi(NULL);
+        jit_patch_at(call, label);
+        jit_retval(JIT_V1);                     @rem{/* V1 = fib(n-1) */}
+        jit_prepare();
+          jit_pushargr(JIT_V2);
+        call = jit_finishi(NULL);
+        jit_patch_at(call, label);
+        jit_retval(JIT_R0);                     @rem{/* R0 = fib(n-2) */}
+        jit_addr(JIT_R0, JIT_R0, JIT_V1);       @rem{/* R0 = R0 + V1 */}
+
+  jit_patch(ref);                               @rem{/* patch jump */}
+  jit_patch(zero);                              @rem{/* patch jump */}
+        jit_retr(JIT_R0);
+
+  @rem{/* call the generated code@comma{} passing 32 as an argument */}
+  fib = jit_emit();
+  jit_clear_state();
+  printf("fib(%d) = %d\n", 32, fib(32));
+  jit_destroy_state();
+  finish_jit();
+  return 0;
address@hidden
address@hidden example
+
+As said above, this is the first example of dynamically compiling
+branches.  Branch instructions have two operands containing the
+values to be compared, and return a @code{jit_node_t *} object
+to be patched.
+
+Because the final addresses of labels are only known after calling
+@code{emit}, it is required to call @code{patch} or @code{patch_at},
+which tells @lightning{} that the target to patch is actually a pointer
+to a @code{jit_node_t *} object; otherwise, it would assume that it is
+a pointer to a C function. Note that conditional branches do not
+receive a label argument, so they must be patched.
+
+You need to call @code{patch_at} on the return value of @code{calli}
+or @code{finishi} if it is actually referencing a label
+in the jit code. No branch instruction receives a label
+argument. Note that @code{movi} is a special case, and patching it
+is usually done to get the final address of a label, usually to later
+call @code{jmpr}.
+
+Now, here is the iterative version:
+
address@hidden
+#include <stdio.h>
+#include <lightning.h>
+
+static jit_state_t *_jit;
+
+typedef int (*pifi)(int);       @rem{/* Pointer to Int Function of Int */}
+
+int main(int argc, char *argv[])
address@hidden
+  pifi       fib;
+  jit_node_t *in;               @rem{/* offset of the argument */}
+  jit_node_t *ref;              @rem{/* to patch the forward reference */}
+  jit_node_t *zero;             @rem{/* to patch the forward reference */}
+  jit_node_t *jump;             @rem{/* jump to start of loop */}
+  jit_node_t *loop;             @rem{/* start of the loop */}
+
+  init_jit(argv[0]);
+  _jit = jit_new_state();
+
+        jit_prolog   ();
+  in =  jit_arg      ();
+        jit_getarg   (JIT_R0, in);              @rem{/* R0 = n */}
+ zero = jit_beqi     (JIT_R0, 0);
+        jit_movr     (JIT_R1, JIT_R0);
+        jit_movi     (JIT_R0, 1);
+  ref = jit_blei     (JIT_R1, 2);
+        jit_subi     (JIT_R2, JIT_R1, 2);
+        jit_movr     (JIT_R1, JIT_R0);
+
+  loop= jit_label();
+        jit_subi     (JIT_R2, JIT_R2, 1);       @rem{/* decr. counter */}
+        jit_movr     (JIT_V0, JIT_R0);          /* V0 = R0 */
+        jit_addr     (JIT_R0, JIT_R0, JIT_R1);  /* R0 = R0 + R1 */
+        jit_movr     (JIT_R1, JIT_V0);          /* R1 = V0 */
+  jump= jit_bnei     (JIT_R2, 0);               /* if (R2) goto loop; */
+  jit_patch_at(jump, loop);
+
+  jit_patch(ref);                               @rem{/* patch forward jump */}
+  jit_patch(zero);                              @rem{/* patch forward jump */}
+        jit_retr     (JIT_R0);
+
+  @rem{/* call the generated code@comma{} passing 36 as an argument */}
+  fib = jit_emit();
+  jit_clear_state();
+  printf("fib(%d) = %d\n", 36, fib(36));
+  jit_destroy_state();
+  finish_jit();
+  return 0;
address@hidden
address@hidden example
+
+This code calculates the recurrence relation using iteration (a
address@hidden loop in high-level languages).  There are no function
+calls anymore: instead, there is a backward jump (the @code{bnei} at
+the end of the loop).
+
+Note that the program must remember the address for backward jumps;
+for forward jumps it is only required to remember the jump code,
+and call @code{patch} for the implicit label.
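+
+The control flow of the iterative version may be easier to follow in
+plain C.  The sketch below is a hand translation and not generated
+code: @code{fib_iter} is a hypothetical name, the local variables are
+named after the registers they stand for, and it assumes the loop
+counter starts at n-2 and that any n not greater than 2 takes the
+forward branch.  The two early returns correspond to the forward
+jumps, and the @code{do}/@code{while} condition to the backward jump.

```c
#include <assert.h>

static int fib_iter(int n)
{
    int r0 = n;             /* R0 = n */
    int r1, r2, v0;

    assert(n >= 0);
    if (r0 == 0)            /* forward jump `zero`: return 0 */
        return r0;
    r1 = r0;                /* R1 = n */
    r0 = 1;
    if (r1 <= 2)            /* forward jump `ref`: return 1 */
        return r0;
    r2 = r1 - 2;            /* loop counter */
    r1 = r0;
    do {                    /* `loop` label */
        r2 -= 1;            /* decrement counter */
        v0 = r0;            /* V0 = R0 */
        r0 = r0 + r1;       /* R0 = R0 + R1 */
        r1 = v0;            /* R1 = V0 */
    } while (r2 != 0);      /* backward jump to `loop` */
    return r0;
}
```

+
+The loop body runs n-2 times, so the result for an argument of 10 is
+55, matching the recurrence relation given at the start of this
+section.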
+
address@hidden Reentrancy
address@hidden Re-entrant usage of @lightning{}
+
+@lightning{} uses the special @code{_jit} identifier. To be able
+to use multiple jit generation states at the same
+time, it is required to use code similar to:
+
address@hidden
+    jit_state_t *lightning;
+    #define _jit lightning
address@hidden example
+
+This causes whatever @code{_jit} expands to to be passed as
+the first argument to the underlying @lightning{} implementation,
+which is usually a function with an @code{_} (underscore) prefix
+and with an argument named @code{_jit}, in the pattern:
+
address@hidden
+    static void _jit_mnemonic(jit_state_t *, jit_gpr_t, jit_gpr_t);
+    #define jit_mnemonic(u, v) _jit_mnemonic(_jit, u, v)
address@hidden example
+
+The reason for this is to keep the same syntax as the initial lightning
+implementation and to spare the user from adding an extra
+argument to every call, as multiple jit states generating code in
+parallel should be very uncommon.
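
The macro pattern can be tried in isolation. The sketch below is self-contained and does not use lightning; `state_t`, `emit_add_impl`, `emit_add` and `generate` are hypothetical names standing in for `jit_state_t`, `_jit_mnemonic` and `jit_mnemonic`. It shows that the macro picks up whichever `_jit` is in scope, so two states stay independent without the caller naming the state on every call.

```c
/* Stand-in for jit_state_t; it only counts emitted instructions. */
typedef struct { int emitted; } state_t;

/* The real function takes the state explicitly as its first argument. */
static void emit_add_impl(state_t *_jit, int dst, int src)
{
    (void)dst;
    (void)src;
    _jit->emitted++;
}

/* The macro silently passes whatever `_jit` names at the call site. */
#define emit_add(u, v) emit_add_impl(_jit, u, v)

/* Because the parameter is called `_jit`, calls inside need no state argument. */
static int generate(state_t *_jit, int n)
{
    for (int i = 0; i < n; i++)
        emit_add(0, 1);
    return _jit->emitted;
}
```

Calling `generate` with two different `state_t` objects leaves each with its own count, which is the property the text relies on for re-entrant use.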
+
address@hidden Registers
address@hidden Accessing the whole register file
+
+As mentioned earlier in this chapter, all @lightning{} back-ends are
+guaranteed to have at least six general-purpose integer registers and
+six floating-point registers, but many back-ends will have more.
+
+To access the entire register file, you can use the
+@code{JIT_R}, @code{JIT_V} and @code{JIT_F} macros.  They
+accept a parameter that identifies the register number, which
+must be strictly less than @code{JIT_R_NUM}, @code{JIT_V_NUM}
+and @code{JIT_F_NUM} respectively; the number need not be
+constant.  Of course, expressions like @code{JIT_R0} and
+@code{JIT_R(0)} denote the same register, and likewise for
+integer callee-saved, or floating-point, registers.
+
address@hidden Customizations
address@hidden Customizations
+
+Frequently it is desirable to have more control over how code is
+generated or how memory is used during jit generation or execution.
+
address@hidden Memory functions
+To give complete control over memory allocation and deallocation,
+@lightning{} provides wrappers that default to standard @code{malloc},
+@code{realloc} and @code{free}. These are loosely based on the
+GNU GMP counterparts, with the difference that they use the same
+prototypes as the system allocation functions, that is, no @code{size}
+for @code{free} or @code{old_size} for @code{realloc}.
+
address@hidden void jit_set_memory_functions (@* void *(address@hidden) 
(size_t), @* void *(address@hidden) (void *, size_t), @* void (address@hidden) 
(void *))
+@lightning{} guarantees that memory is only allocated or released
+using these wrapped functions, but note that if lightning
+was linked to GNU binutils, malloc will probably be called multiple
+times from there when initializing the disassembler.
+
+Because @code{init_jit} may call memory functions, if you need to call
+@code{jit_set_memory_functions}, it must be called before @code{init_jit};
+otherwise, when calling @code{finish_jit}, a pointer allocated with the
+previous or default wrappers will be passed.
address@hidden deftypefun
+
address@hidden void jit_get_memory_functions (@* void *(address@hidden) 
(size_t), @* void *(address@hidden) (void *, size_t), @* void (address@hidden) 
(void *))
+Get the current memory allocation functions. Unlike the GNU GMP
+counterpart, it is an error to pass @code{NULL} pointers as arguments.
address@hidden deftypefun
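
As an illustration, here is a minimal sketch of replacement functions with the plain `malloc`/`realloc`/`free` prototypes described above. The names `counting_alloc`, `counting_realloc`, `counting_free` and `live_blocks` are hypothetical, and the sketch assumes nonzero sizes are requested. It only tracks how many blocks are outstanding; the call that would install the functions is shown in a comment because it needs the lightning library.

```c
#include <stdlib.h>

/* Blocks handed out and not yet released (hypothetical bookkeeping). */
static size_t live_blocks;

/* Same prototype as malloc. */
static void *counting_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Same prototype as realloc: no old_size argument. */
static void *counting_realloc(void *ptr, size_t size)
{
    void *p = realloc(ptr, size);
    if (p != NULL && ptr == NULL)
        live_blocks++;          /* realloc(NULL, n) behaves like malloc(n) */
    return p;
}

/* Same prototype as free: no size argument. */
static void counting_free(void *ptr)
{
    if (ptr != NULL)
        live_blocks--;
    free(ptr);
}

/* With lightning available, install these before init_jit():
 *
 *   jit_set_memory_functions(counting_alloc, counting_realloc, counting_free);
 *   init_jit(argv[0]);
 */
```

Installing the functions first matters because anything allocated during `init_jit` would otherwise be released through a different `free` in `finish_jit`.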
+
address@hidden Alternate code buffer
+To instruct @lightning{} to use an alternate code buffer it is required
+to call @code{jit_realize} before @code{jit_emit}, and then query states
+and customize as appropriate.
+
address@hidden void jit_realize ()
+Must be called once, before @code{jit_emit}, to instruct @lightning{}
+that no other @code{jit_xyz} call will be made.
address@hidden deftypefun
+
address@hidden jit_pointer_t jit_get_code (jit_word_t address@hidden)
+Returns @code{NULL} or the previous value set with @code{jit_set_code}, and
+sets the @var{code_size} argument to an appropriate value.
+If @code{jit_get_code} is called before @code{jit_emit}, the
+@var{code_size} argument is set to the expected number of bytes
+required to generate code.
+If @code{jit_get_code} is called after @code{jit_emit}, the
+@var{code_size} argument is set to the exact number of bytes used
+by the code.
address@hidden deftypefun
+
+@deftypefun void jit_set_code (jit_pointer_t @var{code}, jit_word_t @var{size})
+Instructs @lightning{} to output to the @var{code} argument and
+use @var{size} as a guard against writing to invalid memory. If during
+@code{jit_emit} @lightning{} finds out that the code would not fit
+in @var{size} bytes, it halts code emission and returns @code{NULL}.
address@hidden deftypefun
+
+A simple example of a loop using an alternate buffer is:
+
address@hidden
+  jit_uint8_t   *code;
+  int          (*func)(int);       @rem{/* function pointer */}
+  jit_word_t     code_size;
+  jit_word_t     real_code_size;
+  @rem{...}
+  jit_realize();                   @rem{/* ready to generate code */}
+  jit_get_code(&code_size);        @rem{/* get expected code size */}
+  code_size = (code_size + 4095) & -4096;
+  do @{
+    code = mmap(NULL, code_size, PROT_EXEC | PROT_READ | PROT_WRITE,
+                MAP_PRIVATE | MAP_ANON, -1, 0);
+    jit_set_code(code, code_size);
+    if ((func = jit_emit()) == NULL) @{
+      munmap(code, code_size);
+      code_size += 4096;
+    @}
+  @} while (func == NULL);
+  jit_get_code(&real_code_size);   @rem{/* query exact size of the code */}
address@hidden example
+
+The first call to @code{jit_get_code} should return @code{NULL} and set
+the @code{code_size} argument to the expected number of bytes required
+to emit code.
+The second call to @code{jit_get_code} is made after a successful call to
+@code{jit_emit}, and will return the value previously set with
+@code{jit_set_code} and set the @code{real_code_size} argument to the
+exact number of bytes used to emit the code.
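
The rounding and grow-and-retry logic of that example can be exercised without lightning. In the sketch below, `mock_emit` is a hypothetical stand-in for the `jit_set_code` plus `jit_emit` pair (it fails whenever the buffer is smaller than a chosen `needed` size), `malloc` replaces `mmap` for portability, and the 4096-byte page size is an assumption carried over from the example.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE 4096   /* assumed page size; must be a power of two */

/* Round n up to the next multiple of PAGE. */
static size_t round_up_page(size_t n)
{
    return (n + PAGE - 1) & ~(size_t)(PAGE - 1);
}

/* Hypothetical emitter: succeeds only if the buffer holds `needed` bytes. */
static void *mock_emit(void *buf, size_t size, size_t needed)
{
    return size >= needed ? buf : NULL;
}

/* Start from the estimate and grow one page at a time until emission
   succeeds.  Returns the buffer (the caller frees it) and stores its
   final size in *out_size; returns NULL if allocation fails. */
static void *emit_with_retry(size_t estimate, size_t needed, size_t *out_size)
{
    size_t size = round_up_page(estimate);
    for (;;) {
        void *buf = malloc(size);
        if (buf == NULL)
            return NULL;
        if (mock_emit(buf, size, needed) != NULL) {
            *out_size = size;
            return buf;
        }
        free(buf);              /* too small: release and try a larger one */
        size += PAGE;
    }
}
```

With real executable memory, the `malloc`/`free` pair would be the `mmap`/`munmap` calls from the example, and a failed `mmap` should be checked against `MAP_FAILED` before the buffer is handed to `jit_set_code`.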
+
address@hidden Alternate data buffer
+Sometimes it may be desirable to customize how @lightning{} uses an
+extra buffer for constants or debug annotations, or to prevent it from
+using one at all. This is usually done together with an alternate code buffer.
+
address@hidden jit_pointer_t jit_get_data (jit_word_t address@hidden, 
jit_word_t address@hidden)
+Returns @code{NULL} or the previous value set with @code{jit_set_data},
+and sets the @var{data_size} argument to how many bytes are required
+for the constants data buffer, and @var{note_size} to how many bytes
+are required to store the debug note information.
+Note that it always preallocates one debug note entry even if
+@code{jit_name} or @code{jit_note} are never called, but will return
+zero in the @var{data_size} argument if no constant is required;
+constants are only used for the @code{float} and @code{double} operations
+that have an immediate argument, and not in all @lightning{} ports.
address@hidden deftypefun
+
address@hidden void jit_set_data (jit_pointer_t @var{data}, jit_word_t 
@var{size}, jit_word_t @var{flags})
+
+@var{data} can be @code{NULL} if disabling constants and annotations;
+otherwise, a valid pointer must be passed. An assertion checks that the
+data will fit in @var{size} bytes (but that is a no-op if @lightning{}
+was built with @code{-DNDEBUG}).
+
+@var{size} is the space in bytes available in @var{data}.
+
+@var{flags} can be zero to simply use the alternate data buffer,
+or a combination of @code{JIT_DISABLE_DATA} and @code{JIT_DISABLE_NOTE}.
+
address@hidden @t
address@hidden JIT_DISABLE_DATA
address@hidden JIT_DISABLE_DATA
+Instructs @lightning{} to not use a constant table, but to use an
+alternate method to synthesize constants, usually with a larger code
+sequence that uses stack space to transfer the value from a GPR to an
+FPR.
+
address@hidden JIT_DISABLE_NOTE
address@hidden JIT_DISABLE_NOTE
+Instructs @lightning{} to not store file or function name, and
+line numbers in the constant buffer.
address@hidden table
address@hidden deftypefun
+
+A simple example of preventing the use of a data buffer is:
+
address@hidden
+  @rem{...}
+  jit_realize();                        @rem{/* ready to generate code */}
+  jit_get_data(NULL, NULL);
+  jit_set_data(NULL, 0, JIT_DISABLE_DATA | JIT_DISABLE_NOTE);
+  @rem{...}
address@hidden example
+
+Or to only use a data buffer, if required:
+
address@hidden
+  jit_uint8_t   *data;
+  jit_word_t     data_size;
+  @rem{...}
+  jit_realize();                        @rem{/* ready to generate code */}
+  jit_get_data(&data_size, NULL);
+  if (data_size)
+    data = malloc(data_size);
+  else
+    data = NULL;
+  jit_set_data(data, data_size, JIT_DISABLE_NOTE);
+  @rem{...}
+  if (data)
+    free(data);
+  @rem{...}
address@hidden example
+
address@hidden Acknowledgements
address@hidden Acknowledgements
+
+As far as I know, the first general-purpose portable dynamic code
+generator is @sc{dcg}, by Dawson R.@: Engler and T.@: A.@: Proebsting.
+Further work by Dawson R.@: Engler resulted in the @sc{vcode} system;
+unlike @sc{dcg}, @sc{vcode} used no intermediate representation and
+directly inspired @lightning{}.
+
+Thanks go to Ian Piumarta, who kindly agreed to release his own
+program @sc{ccg} under the GNU General Public License, thereby allowing
+@lightning{} to use the run-time assemblers he had written for @sc{ccg}.
+@sc{ccg} provides a way of dynamically assembling programs written in the
+underlying architecture's assembly language.  So it is not portable,
+yet very interesting.
+
+I also thank Steve Byrne for writing GNU Smalltalk, since @lightning{}
+was first developed as a tool to be used in GNU Smalltalk's dynamic
+translator from bytecodes to native code.
 
 @c %**end of header (This is for running Texinfo on a region.)
 
diff --git a/doc/version.texi b/doc/version.texi
deleted file mode 100644
index fb8684c..0000000
--- a/doc/version.texi
+++ /dev/null
@@ -1,4 +0,0 @@
-@set UPDATED 18 June 2018
-@set UPDATED-MONTH June 2018
-@set EDITION 2.1.2
-@set VERSION 2.1.2


