epsilon-devel
Fun (unfinished) journey of trying to run jitter on a micro-controller


From: Mohammad-Reza Nabipoor
Subject: Fun (unfinished) journey of trying to run jitter on a micro-controller
Date: Tue, 30 Nov 2021 22:47:21 +0330

Hello, Luca!

As I explained to you over IRC (in ##jitter), I want to run a minimal VM
on a micro-controller (uC) for fun (at least, for now!).
The target is a Cortex-M0 with 32 KiB of flash memory and 4 KiB of RAM:
a very constrained environment.

I cross-compiled Jitter (on Arch Linux) with:

  ../configure --host=arm-none-eabi       \
    --disable-dispatch-direct-threading   \
    --disable-dispatch-minimal-threading  \
    --disable-dispatch-no-threading       \
    CFLAGS='--specs=nosys.specs -mcpu=cortex-m0 -mthumb -Os -ffunction-sections -fdata-sections -Wl,--gc-sections'

The build problems were all in Gnulib, in these files:
  - gnulib-local/getdtablesize.c: getdtablesize(): there is no `struct rlimit` on this target
  - gnulib-local/getprogname.c: getprogname(): I commented out the #error
  - gnulib-local/sleep.c: sleep(): I changed it to return 0
  - mkdir and rmdir were needed too
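
For reference, bare-metal stand-ins for the functions listed above might look
roughly like this. It is only a sketch: the `uc_stub_` names are hypothetical
(chosen so the snippet also compiles on a hosted system without clashing with
libc), and the fixed descriptor-table size, the program name and the `ENOSYS`
failures are assumptions. Only "`sleep` returns 0" comes from the list above.

```c
#include <errno.h>

/* Hypothetical stand-ins for a target without an operating system.
   On the real target they would replace the Gnulib/libc functions
   of the same name (without the uc_stub_ prefix). */

/* Assumption: no real descriptor table exists, so report a small fixed size. */
int uc_stub_getdtablesize(void)
{
  return 3;
}

/* Assumption: a fixed program name is good enough for error messages. */
const char *uc_stub_getprogname(void)
{
  return "uc";
}

/* As described above: do nothing and report that no time is left to sleep. */
unsigned int uc_stub_sleep(unsigned int seconds)
{
  (void) seconds;
  return 0;
}

/* Assumption: there is no file system, so directory operations always fail. */
int uc_stub_mkdir(const char *path, unsigned int mode)
{
  (void) path;
  (void) mode;
  errno = ENOSYS;
  return -1;
}

int uc_stub_rmdir(const char *path)
{
  (void) path;
  errno = ENOSYS;
  return -1;
}
```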

It compiles, but it fails when run on the uC (I think because of
dynamic memory allocation).

Note that these compiler flags are important to make the final ELF fit into the
flash memory:
  -Os
  -ffunction-sections
  -fdata-sections
  -Wl,--gc-sections

And I think, as you mentioned on IRC, direct threading is also safe with
these options.


# Suggestions


I then tried to measure the memory usage on my PC, which I will explain
in detail. But before that, here are my suggestions (everything that comes
to mind; some of them are surely obvious to you):

  - It'd be nice if the user could compile only `jitter`, without `jitterc`.
    In my case, I don't want to (and even cannot!) run `jitterc` on the target
    platform.

  - User-provided memory allocator.
    (Maybe you don't like one extra level of indirection (because of function
    pointers), but I think that's not a big issue (esp. on modern CPUs);
    unfortunately I don't have any numbers/measurements for or against
    my claim :D).

  - User-provided jitter_fatal (with `__attribute__((noreturn))`).
    (Very useful for reporting error situations.)


(NOTE) With a custom allocator and a custom failure-report function, and
  if `jitter` claims resources (like memory) only through user-provided
  interfaces, the user can recover from fatal errors safely.
  This is useful in a constrained environment. After a failure, the code
  can `longjmp` to a well-known state and reclaim the resources
  which were given to `jitter` after the `setjmp`.
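
To make the note above concrete, here is a minimal sketch of the idea. Nothing
in it is an existing Jitter interface: `struct vm_hooks`, `vm_xmalloc` and
`run_guarded` are hypothetical names, and `vm_xmalloc` only stands in for
library code that allocates through user-provided hooks. The allocator is a
simple bump allocator over a fixed arena, so "reclaiming" after a failure is
just resetting one counter.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical hook table that a library could take from its user. */
struct vm_hooks {
  void *(*alloc)(size_t size);
  void (*fatal)(const char *message);   /* must not return */
};

/* Fixed arena used instead of the C heap (size chosen arbitrarily). */
static _Alignas(max_align_t) unsigned char arena[1024];
static size_t arena_used;

static jmp_buf recovery_point;
static const char *last_fatal_message;

/* Bump allocator: hand out the next aligned chunk, or NULL when full. */
static void *arena_alloc(size_t size)
{
  size_t align = _Alignof(max_align_t);
  size_t rounded = (size + align - 1) & ~(align - 1);

  if (rounded < size || rounded > sizeof arena - arena_used)
    return NULL;
  void *p = arena + arena_used;
  arena_used += rounded;
  return p;
}

/* Fatal handler: remember the message and jump back to the safe point. */
static _Noreturn void fatal_longjmp(const char *message)
{
  last_fatal_message = message;
  longjmp(recovery_point, 1);
}

/* Stand-in for library code that allocates only through the hooks. */
static void *vm_xmalloc(const struct vm_hooks *hooks, size_t size)
{
  void *p = hooks->alloc(size);
  if (p == NULL)
    hooks->fatal("out of memory");
  return p;
}

/* Returns 0 on success, 1 if a fatal error was caught and cleaned up. */
int run_guarded(size_t request)
{
  static const struct vm_hooks hooks = { arena_alloc, fatal_longjmp };

  if (setjmp(recovery_point) != 0) {
    arena_used = 0;   /* reclaim everything handed out from the arena */
    return 1;
  }
  (void) vm_xmalloc(&hooks, request);
  return 0;
}
```

With a 1024-byte arena, `run_guarded(64)` succeeds, `run_guarded(4096)` ends
up in the fatal handler and returns 1, and a later `run_guarded(64)` succeeds
again because the arena was reset.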


  - Replace `printf` and other constructs that assume there's an OS with
    traditional input/output (keyboard/terminal) with some abstracted
    constructs (e.g. user-provided functions).

  - It would be nice if you made some parts of `jitter` optional,
    like signal handling, disassembly, readline, etc.
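
As an illustration of the "user-provided functions" idea for output, a library
could write through a callback instead of calling `printf`. This is a sketch
only: `vm_write_fn`, `struct vm_output`, `vm_put_string` and the buffer sink
are hypothetical names, not part of Jitter. On a uC the sink could just as
well push bytes to a UART.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical output hook: the user decides where the bytes go. */
typedef void (*vm_write_fn)(const char *data, size_t length, void *context);

struct vm_output {
  vm_write_fn write;
  void *context;
};

/* Library side: emit a string through the hook, without any stdio. */
void vm_put_string(const struct vm_output *out, const char *s)
{
  if (out->write != NULL)
    out->write(s, strlen(s), out->context);
}

/* User side: a tiny in-memory log, always kept NUL-terminated. */
struct log_buffer {
  char data[64];
  size_t used;
};

void log_buffer_write(const char *data, size_t length, void *context)
{
  struct log_buffer *buffer = context;
  size_t room = sizeof buffer->data - 1 - buffer->used;

  if (length > room)
    length = room;   /* silently truncate when the buffer is full */
  memcpy(buffer->data + buffer->used, data, length);
  buffer->used += length;
  buffer->data[buffer->used] = '\0';
}
```

A caller would fill a `struct vm_output` with `log_buffer_write` and a pointer
to its own `struct log_buffer`, then pass it to whatever code needs to report
something.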


## My VM

I use this minimal VM:

```jitter
vm
  set prefix "uc"
end

stack s
  long-name "stack"
  c-element-type "unsigned char"
  c-initial-value "0"
  element-no 8
  tos-optimized
  guard-underflow
  guard-overflow
end

state-struct-runtime-c
  code
  end
end

early-header-c
  code
#include <stdint.h>

// get current status of LEDs
uint16_t
led_get(void);

// change the status of LEDs to `v`
void
led_set(uint16_t v);

// Delay for `ms` milliseconds
void
delay_ms(unsigned ms);

extern unsigned char UC_MEM[4];
  end
end

wrapped-functions
  led_set
  led_get
end

wrapped-globals
  UC_MEM
end

instruction mem (?n)
  code
    UC_MEM[JITTER_ARGN0] = UC_TOP_STACK ();
    UC_DROP_STACK ();
  end
end

instruction add (?n, ?n)
  code
    UC_PUSH_STACK (JITTER_ARGN0 + JITTER_ARGN1);
  end
end

instruction push (?n)
  code
    UC_PUSH_STACK (JITTER_ARGN0);
  end
end

instruction adds ()
  code
    jitter_uint tmp = UC_TOP_STACK ();

    UC_DROP_STACK ();
    tmp += UC_TOP_STACK ();
    UC_DROP_STACK ();
    UC_PUSH_STACK ((unsigned char)tmp);
  end
end

instruction ledset ()
  code
    led_set (UC_TOP_STACK ());
    UC_DROP_STACK ();
  end
end

instruction ledget ()
  code
    uint16_t v = led_get ();
    UC_PUSH_STACK (v);
  end
end

instruction delay (?n)
  code
    delay_ms (JITTER_ARGN0);
  end
end

instruction lt (?n)
  code
    UC_PUSH_STACK (UC_TOP_STACK () < JITTER_ARGN0);
  end
end

instruction bnz (?l)
  branching
  code
    unsigned char cond = UC_TOP_STACK ();
    UC_DROP_STACK ();
    JITTER_BRANCH_IF_NONZERO (cond, JITTER_ARGP0);
  end
end
```

Just enough to make some blinking LEDs :)

And this is the C code that I use for measuring memory:

```c
#include "uc-vm.h"

#include <stdio.h>   /* puts, printf */
#include <string.h>  /* memset */

//---

unsigned char UC_MEM[4];

static uint16_t LED;
void
led_set(uint16_t v)
{
  LED = v;
}
uint16_t
led_get(void)
{
  return LED;
}

void
delay_ms(unsigned ms)
{
  (void)ms;
}

//---

static void
memtrack_enable(int);

int
main()
{
  struct uc_state s;
  struct uc_mutable_routine* r;
  struct uc_executable_routine* rx;
  uc_label l0;

  memset(UC_MEM, 0, sizeof(UC_MEM));

  memtrack_enable(1);

#define i(x) UC_MUTABLE_ROUTINE_APPEND_INSTRUCTION(r, x)
#define lu(x) uc_mutable_routine_append_unsigned_literal_parameter(r, x)

  uc_initialize();
  uc_state_initialize(&s);
  r = uc_make_mutable_routine();
  l0 = uc_fresh_label(r);

  i(push);
  lu(0);
  uc_mutable_routine_append_label(r, l0);
  i(ledget);
  i(push);
  lu(1);
  i(adds);
  i(ledset);
  i(delay);
  lu(1000);
  i(push);
  lu(1);
  i(adds);
  i(lt);
  lu(10);
  i(bnz);
  uc_mutable_routine_append_label_parameter(r, l0);
  i(mem);
  lu(0);

#undef lu
#undef i

  rx = uc_make_executable_routine(r);

  puts("---");

  for (int _ = 0; _ < 3; ++_) {
    uc_execute_executable_routine(rx, &s);
    puts("+++");

    for (int i = 0; i < 4; ++i)
      printf("UC_MEM[%d]:%u\n", i, (unsigned)UC_MEM[i]);
    printf("LED:%04x\n", LED);
  }

  uc_destroy_executable_routine(rx);
  uc_destroy_mutable_routine(r);
  uc_state_finalize(&s);
  uc_finalize();

  memtrack_enable(0);
  return 0;
}
```

# Memory Measurements

Below the `main` function (in the same file), I've implemented my memory
measurement machinery, which relies on the `--wrap` option of the linker (the
linker will replace references to `malloc`, `realloc` and `free` with
`__wrap_malloc`, `__wrap_realloc` and `__wrap_free`). This method has one
advantage/disadvantage: it doesn't track all `malloc`s, only the calls that the
linker resolves at link time (for example, calls made inside a shared libc are
not seen).

The compiler flags:

  -Wl,--wrap=malloc -Wl,--wrap=realloc -Wl,--wrap=free

And this is my implementation of them:

```c
// after the main()

static volatile int MEMTRACK_ENABLE_;

static void
memtrack_enable(int en)
{
  MEMTRACK_ENABLE_ = !!en;
}

typedef void (*dummy_func_t)(void*);

void dummy_func_ma(void* p) { (void)p; }
void dummy_func_re(void* p) { (void)p; }
void dummy_func_fr(void* p) { (void)p; }

volatile void* DUMMY_PTR_OLD;
volatile void* DUMMY_PTR;
volatile size_t DUMMY_SIZE;
volatile dummy_func_t DUMMY_FUNC_MA = dummy_func_ma;
volatile dummy_func_t DUMMY_FUNC_RE = dummy_func_re;
volatile dummy_func_t DUMMY_FUNC_FR = dummy_func_fr;

void*
__real_malloc(size_t size);

void*
__wrap_malloc(size_t size)
{
  void* p = __real_malloc(size);
  // fprintf(stderr, "malloc size:%zu; mem:%p\n", size, p);
  DUMMY_PTR = p;
  DUMMY_SIZE = size;
  if (MEMTRACK_ENABLE_)
    DUMMY_FUNC_MA(p);
  return p;
}

void*
__real_realloc(void* ptr, size_t size);

void*
__wrap_realloc(void* ptr, size_t size)
{
  void* p = __real_realloc(ptr, size);
  // fprintf(stderr, "realloc ptr:%p size:%zu; mem:%p\n", ptr, size, p);
  DUMMY_PTR_OLD = ptr;
  DUMMY_PTR = p;
  DUMMY_SIZE = size;
  if (MEMTRACK_ENABLE_)
    DUMMY_FUNC_RE(p);
  return p;
}

void
__real_free(void* ptr);

void
__wrap_free(void* ptr)
{
  __real_free(ptr);
  // fprintf(stderr, "free ptr:%p\n", ptr);
  DUMMY_PTR = ptr;
  if (MEMTRACK_ENABLE_)
    DUMMY_FUNC_FR(ptr);
}
```

The purpose of all these `volatile`s is to keep the variables visible to `gdb`
after compilation (so the optimizer doesn't remove them).
I use a `gdb` script to produce the logging output, and then a Python script
to parse that output and do the actual analysis (I'm using Python 3).

I run `gdb` like this:

  gdb -x dbg-script.gdb a.out


This is the `gdb` script:

```gdb
set logging off
set logging file /tmp/log.memtrack
set logging overwrite on

b dummy_func_ma
commands
  p/x (void*)DUMMY_PTR
  p (size_t)DUMMY_SIZE
  bt
  c
end

b dummy_func_re
commands
  p/x (void*)DUMMY_PTR
  p (size_t)DUMMY_SIZE
  p/x (void*)DUMMY_PTR_OLD
  bt
  c
end

b dummy_func_fr
commands
  p/x (void*)DUMMY_PTR
  bt
  c
end

set logging on
r
q
```

And the analysis is run like this (I suggest using Python's `-i` option to be
able to look at the data yourself, interactively):

  python3 -i memanal-gdb.py /tmp/log.memtrack

This is my Python script (`memanal-gdb.py`):

```python3
#!/usr/bin/python3

import sys
import pprint
import statistics
from collections import namedtuple, Counter
from enum import Enum

try:
    from matplotlib import pyplot as plt

    def plot(*args, **kwargs):
        plt.plot(*args, **kwargs)
        plt.grid()
        plt.show()


except Exception:

    def plot(*args, **kargs):
        print("plot: Not supported: requires matplotlib")


# my pretty-printer
def pp(x):
    if isinstance(x, MemAction):
        x = dict(x._asdict())
        pprint.pprint(x, indent=2, width=79, sort_dicts=False)
    elif isinstance(x, list):
        print("[")
        for y in x:
            pp(y)
            print(",")
        print("]")
    else:
        pprint.pprint(x, indent=2, width=79, sort_dicts=False)


class MemActionKind(Enum):
    MALLOC = 1
    REALLOC = 2
    FREE = 3


InMalloc = namedtuple("InMalloc", "size")
InRealloc = namedtuple("InRealloc", "ptr size")
InFree = namedtuple("InFree", "ptr")
OutMalloc = namedtuple("OutMalloc", "ptr size")
OutRealloc = namedtuple("OutRealloc", "ptr size")
OutFree = namedtuple("OutFree", "size")
MemAction = namedtuple("MemAction", "kind inp out trace")


MEMORY = {}


def group_breakpoint_entries(lines):
    """Group each GDB logging breakpoint data together"""

    bp_idx = [i for i, l in enumerate(lines) if l.startswith("Breakpoint")]
    bp_idx.append(len(lines))
    return [
        [l.strip() for l in lines[b:e] if l.strip()]
        for b, e in zip(bp_idx[0:-1], bp_idx[1:])
    ]


def bp_parse(bp_entry):
    """Returns 'MemAction' from list of lines of a breakpoint"""

    # Sample: "Breakpoint 1, 0x00005555555566f0 in dummy_func_ma ()"
    kind_str = bp_entry[0].split(",")[0]  # Sample: "Breakpoint 1"

    if kind_str == "Breakpoint 1":  # malloc
        assert "dummy_func_ma" in bp_entry[0]
        kind = MemActionKind.MALLOC

        # Sample:
        #   "$1 = 0x5555555682a0"
        #   "$2 = 520"
        ptr = bp_entry[1].split(" = ")[1]  # Sample: "0x5555555682a0"
        size = int(bp_entry[2].split(" = ")[1])  # Sample: 520

        assert ptr not in MEMORY
        MEMORY[ptr] = size

        inp = InMalloc(size)
        out = OutMalloc(ptr, size)

        # Ignore the first two lines of backtrace, and the last line
        trace = [x.split("(")[0].split(" ")[-2] for x in bp_entry[5:-1]]

    elif kind_str == "Breakpoint 2":  # realloc
        assert "dummy_func_re" in bp_entry[0]
        kind = MemActionKind.REALLOC

        # Sample:
        #   "$145 = 0x555555569ba0"
        #   "$146 = 529"
        #   "$147 = 0x5555555695b0"
        ptr = bp_entry[1].split(" = ")[1]  # Sample: "0x555555569ba0"
        size = int(bp_entry[2].split(" = ")[1])  # Sample: 529
        ptr_old = bp_entry[3].split(" = ")[1]  # Sample: "0x5555555695b0"

        # realloc(NULL, n) acts like malloc(n): there is no old block to forget
        size_old = 0 if int(ptr_old, 16) == 0 else MEMORY.pop(ptr_old)
        MEMORY[ptr] = size

        inp = InRealloc(ptr_old, size)
        out = OutRealloc(ptr, size - size_old)
        trace = [x.split("(")[0].split(" ")[-2] for x in bp_entry[6:-1]]

    elif kind_str == "Breakpoint 3":  # free
        assert "dummy_func_fr" in bp_entry[0]
        kind = MemActionKind.FREE

        # Sample: "$152 = 0x555555569af0"
        ptr = bp_entry[1].split(" = ")[1]  # Sample: "0x555555569af0"

        size = -MEMORY[ptr]
        del MEMORY[ptr]

        inp = InFree(ptr)
        out = OutFree(size)
        # Skip dummy_func_fr and __wrap_free, like the malloc/realloc cases
        trace = [x.split("(")[0].split(" ")[-2] for x in bp_entry[4:-1]]

    else:
        assert False, "Impossible!"

    return MemAction(kind, inp, out, trace)


# ---


with open(sys.argv[1], "r") as f:
    lines = f.readlines()

bps_grouped = group_breakpoint_entries(lines)
bps = [bp_parse(g) for g in bps_grouped]

# Check memory usage
mem = 0
for bp in bps:
    mem += bp.out.size
assert mem == sum(v for k, v in MEMORY.items())

leaked = [
    bp for bp in bps if bp.kind != MemActionKind.FREE and bp.out.ptr in MEMORY
]

sizes = [bp.out.size for bp in bps if bp.kind != MemActionKind.FREE]
sizes_median = statistics.median(sizes)
sizes_mmode = statistics.multimode(sizes)
sizes_hist = Counter(sizes)

print("Leaked {")
pp(leaked)
print("}")

print(
    f"""\
# Allocation Size Statistics ({len(sizes)} allocations (malloc/realloc))
Median: {sizes_median}
Multimode: {sizes_mmode}
Min: {min(sizes)}
Max: {max(sizes)}
Most common (size, freq): {sizes_hist.most_common(5)}
""",
    end="",
)

# plot(sizes, "o")  # uncomment this, if you want
# pp(bps)           # uncomment if you want to look at pretty-printed version
```

If you have `python3-matplotlib` installed on your system, you can also look at
the plots (I love looking at them! I think you'll love them even more than I do :D).

And for this program, the memory allocation statistics look like this:

  # Allocation Size Statistics (74 allocations (malloc/realloc))
  Median: 24.0
  Multimode: [24]
  Min: 3
  Max: 1040
  Most common (size, freq): [(24, 18), (16, 12), (64, 11), (8, 8), (256, 5)]
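
The statistics above describe individual allocation sizes. For a 4 KiB RAM
budget the peak number of live bytes matters too, and the same wrapper idea
could track it without `gdb`. The following is a rough sketch under some
assumptions: the `tracked_*` names are hypothetical (with `--wrap` they would be
`__wrap_malloc` and friends, calling `__real_malloc` instead of `malloc`), and
it counts only requested bytes, not the allocator's own bookkeeping. Each block
gets a small header storing its size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block; the union keeps the user
   pointer suitably aligned for any type. */
union track_header {
  size_t size;
  max_align_t align;
};

static size_t heap_current;   /* requested bytes currently live */
static size_t heap_peak;      /* highest value heap_current has reached */

static void note_usage(void)
{
  if (heap_current > heap_peak)
    heap_peak = heap_current;
}

void *tracked_malloc(size_t size)
{
  if (size > SIZE_MAX - sizeof(union track_header))
    return NULL;
  union track_header *h = malloc(sizeof *h + size);
  if (h == NULL)
    return NULL;
  h->size = size;
  heap_current += size;
  note_usage();
  return h + 1;
}

void tracked_free(void *ptr)
{
  if (ptr == NULL)
    return;
  union track_header *h = (union track_header *) ptr - 1;
  heap_current -= h->size;
  free(h);
}

void *tracked_realloc(void *ptr, size_t size)
{
  if (ptr == NULL)
    return tracked_malloc(size);
  if (size > SIZE_MAX - sizeof(union track_header))
    return NULL;
  union track_header *h = (union track_header *) ptr - 1;
  size_t old_size = h->size;
  h = realloc(h, sizeof *h + size);
  if (h == NULL)
    return NULL;   /* the old block is still valid and still counted */
  h->size = size;
  heap_current = heap_current - old_size + size;
  note_usage();
  return h + 1;
}

size_t tracked_heap_current(void) { return heap_current; }
size_t tracked_heap_peak(void) { return heap_peak; }
```

Note that the header itself costs RAM on every allocation, so this is better
suited to measurements on the PC than to the final firmware.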


In the Python REPL you can, e.g., look at the backtraces for allocations of size `256`:

```python-repl
>>> pp([bp for bp in bps if bp.kind != MemActionKind.FREE and bp.out.size == 256])
[
{ 'kind': <MemActionKind.MALLOC: 1>,
  'inp': InMalloc(size=256),
  'out': OutMalloc(ptr='0x555555569070', size=256),
  'trace': [ 'jitter_xmalloc',
             'jitter_dynamic_buffer_initialize_with_allocated_size',
             'jitter_dynamic_buffer_initialize',
             'jitter_initialize_routine',
             'jitter_make_mutable_routine']}
,
{ 'kind': <MemActionKind.MALLOC: 1>,
  'inp': InMalloc(size=256),
  'out': OutMalloc(ptr='0x555555569390', size=256),
  'trace': [ 'jitter_xmalloc',
             'jitter_dynamic_buffer_initialize_with_allocated_size',
             'jitter_dynamic_buffer_initialize',
             'jitter_initialize_routine',
             'jitter_make_mutable_routine']}
,
{ 'kind': <MemActionKind.MALLOC: 1>,
  'inp': InMalloc(size=256),
  'out': OutMalloc(ptr='0x5555555694a0', size=256),
  'trace': [ 'jitter_xmalloc',
             'jitter_dynamic_buffer_initialize_with_allocated_size',
             'jitter_dynamic_buffer_initialize',
             'jitter_initialize_routine',
             'jitter_make_mutable_routine']}
,
{ 'kind': <MemActionKind.MALLOC: 1>,
  'inp': InMalloc(size=256),
  'out': OutMalloc(ptr='0x5555555695b0', size=256),
  'trace': [ 'jitter_xmalloc',
             'jitter_dynamic_buffer_initialize_with_allocated_size',
             'jitter_dynamic_buffer_initialize',
             'jitter_initialize_routine',
             'jitter_make_mutable_routine']}
,
{ 'kind': <MemActionKind.MALLOC: 1>,
  'inp': InMalloc(size=256),
  'out': OutMalloc(ptr='0x5555555696c0', size=256),
  'trace': [ 'jitter_xmalloc',
             'jitter_dynamic_buffer_initialize_with_allocated_size',
             'jitter_dynamic_buffer_initialize',
             'jitter_initialize_routine',
             'jitter_make_mutable_routine']}
,
]
```

I hope this information is useful to you. It has been a wonderful journey for
me up to this point, and I think it'll remain so :)
Thanks for your work on Jitter.


Regards,
Mohammad-Reza


