[Gcl-devel] README.macosx

From:	Aurelien Chanudet
Subject:	[Gcl-devel] README.macosx
Date:	Sun, 18 Jul 2004 18:37:19 +0200

Hi all,

Here is a preliminary README.macosx file. I'm planning to complete this file during the summer. Comments are much appreciated!


Aurelien


Mac OS X implementation notes
aurelien.chanudet <at> m4x.org
July 18, 2004


This file briefly describes how GCL is implemented on Mac OS X.


* Third party malloc(3) calls

In GCL, for the sake of efficient memory management, there should be no calls to standard memory allocation routines such as malloc(3) or free(3). Instead, GCL's own memory allocation routines should be used. In particular, this means that third party calls to malloc(3) (remember that GCL uses, say, gmp, which is likely to call malloc(3)) should be intercepted and re-routed to GCL's own functions. On Linux, this is done at build time, during the linking stage. This is possible because symbols on Linux live in a flat namespace. By contrast, symbols on Mac OS X live in a two-level namespace, in which each symbol reference also records the library expected to define it. This means that it is not possible, on Mac OS X, to override the default implementation of malloc(3) as provided by libc. The trick on Mac OS X is to use Darwin's zone mechanism. Darwin has a poorly documented API allowing advanced memory management (see <objc/malloc.h>). Most applications have only one zone, also called the default zone, which is automatically created the first time the program calls malloc(3). In GCL, an extra zone is created at initialization time and is then made the default zone.
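
The idea of swapping the default zone can be illustrated with a small portable model. This is only a sketch: the names toy_zone, zone_malloc, install_lisp_zone and the bump allocator are hypothetical stand-ins, and none of this is Darwin's actual zone API or GCL's actual code. It merely shows how callers that always go through "the default zone" end up in a different allocator once the default has been replaced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model of a "zone": a named bundle of allocation callbacks.
 * All names are hypothetical; this is NOT Darwin's malloc zone API. */
struct toy_zone {
    const char *name;
    void *(*alloc)(struct toy_zone *zone, size_t size);
    size_t calls;                       /* allocations served so far */
};

/* The system zone simply forwards to the C library. */
void *system_alloc(struct toy_zone *zone, size_t size)
{
    zone->calls++;
    return malloc(size);
}

/* Stand-in for GCL's allocator: a bump allocator over a static arena. */
_Alignas(16) unsigned char lisp_arena[4096];
size_t lisp_used;

void *lisp_alloc(struct toy_zone *zone, size_t size)
{
    if (size > sizeof lisp_arena)
        return NULL;
    size = (size + 15u) & ~(size_t)15u; /* round up to keep 16-byte alignment */
    if (size > sizeof lisp_arena - lisp_used)
        return NULL;
    zone->calls++;
    void *p = lisp_arena + lisp_used;
    lisp_used += size;
    return p;
}

struct toy_zone system_zone = { "system", system_alloc, 0 };
struct toy_zone lisp_zone   = { "lisp",   lisp_alloc,   0 };

/* The default zone is the one an unmodified caller ends up using. */
struct toy_zone *default_zone = &system_zone;

/* What third party code (gmp, say) effectively calls. */
void *zone_malloc(size_t size)
{
    assert(default_zone != NULL);
    return default_zone->alloc(default_zone, size);
}

/* Done once at initialization time: make the Lisp zone the default. */
void install_lisp_zone(void)
{
    default_zone = &lisp_zone;
}

/* Non-zero if p points into the Lisp arena. */
int in_lisp_arena(const void *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)lisp_arena;
    return a >= lo && a < lo + sizeof lisp_arena;
}
```

Before install_lisp_zone() is called, zone_malloc() hands out C library memory; afterwards, the same call returns memory from the Lisp arena without the caller being recompiled or relinked.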


* Broken sbrk(2) replacement strategy

sbrk(2) simply does not work on Mac OS X. Unfortunately, GCL relies heavily on it, because GCL has its own page-level memory management scheme. Regular Mach-O applications have at least three segments: the text segment (__TEXT), the data segment (__DATA) and the link-edit segment (__LINKEDIT). The first two segments have equivalent ELF segments (the third segment contains link-edit information such as references to imported symbols). When the process is bootstrapped in memory by the dynamic loader, these segments are mapped in memory and any subsequent memory allocation takes place after the end of the link-edit segment. This layout turns out to be problematic because GCL assumes that memory allocation takes place immediately after the end of the data segment. That is, I suspect that, on Linux, calling sbrk(2) results in extending the size of this data segment; however, on Mac OS X, the size of the data segment cannot vary at run time. For this reason, an extra data segment is created at initialization time and is inserted between the first data segment and the link-edit segment, resulting in the following memory layout:


+---------------------------------------------------------------------+
| __TEXT segment as created by gcc                                    |
+---------------------------------------------------------------------+ <- DBEGIN
| __DATA segment as created by gcc (size is fixed)                    |
+---------------------------------------------------------------------+ <- mach_mapstart
|                                                                     |
|                                                                     | <- heap_end
|                                                                     |
|                                                                     | <- core_end
|                                                                     |
|                                                                     | <- mach_brkpt (= my_sbrk(0))
|                                                                     |
+---------------------------------------------------------------------+ <- mach_maplimit (= DBEGIN + MAXPAGE)
| __LINKEDIT segment created by gcc but moved toward higher addresses |
+---------------------------------------------------------------------+

The heap ranges from DBEGIN to DBEGIN+MAXPAGE. The area of memory between mach_mapstart and mach_maplimit is our extra data segment. To bridge the gap with our first section (third party malloc(3) calls), we can say that all memory allocation happens between mach_mapstart and mach_brkpt. In particular, this means that the area ranging from mach_brkpt to mach_maplimit is mere wiggle room (memory is set to zero). You might wonder how an extra data segment can be programmatically inserted once the process is mapped in memory: the trick is to write a modified executable file (containing sufficient information for the dynamic loader to know how to set up this memory layout) and then use execv(3).
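
An sbrk-like routine over such a fixed region can be sketched as follows. The names mach_mapstart, mach_brkpt, mach_maplimit and my_sbrk come from the layout above, but the body is only an assumption about how such a routine might look, not GCL's actual implementation. The static array stands in for the extra data segment, and TOY_PAGE_SIZE and TOY_SEGMENT_SIZE are hypothetical values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE    4096u                   /* assumed page size */
#define TOY_SEGMENT_SIZE (64u * TOY_PAGE_SIZE)   /* assumed segment size */

/* Stand-in for the extra data segment. Being static, it starts out zeroed,
 * like the wiggle room between mach_brkpt and mach_maplimit. */
unsigned char extra_segment[TOY_SEGMENT_SIZE];

char *mach_mapstart = (char *)extra_segment;
char *mach_maplimit = (char *)extra_segment + TOY_SEGMENT_SIZE;
char *mach_brkpt    = (char *)extra_segment;     /* current break */

/* Same contract as sbrk(2): move the break by incr bytes and return the
 * previous break, or (void *)-1 if the move would leave the segment. */
void *my_sbrk(intptr_t incr)
{
    char *old = mach_brkpt;

    assert(mach_mapstart <= mach_brkpt && mach_brkpt <= mach_maplimit);
    if (incr > 0 && incr > mach_maplimit - mach_brkpt)
        return (void *)-1;                       /* would pass mach_maplimit */
    if (incr < 0 && incr < mach_mapstart - mach_brkpt)
        return (void *)-1;                       /* would go below mach_mapstart */
    mach_brkpt += incr;
    return old;
}
```

With this contract, my_sbrk(0) reports the current break, which matches the "mach_brkpt (= my_sbrk(0))" annotation in the diagram.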


* Unexec()'ing

Unexec()'ing is the process of capturing the memory footprint of a running process and storing it in an executable file for later re-execution. Fortunately, the whole address space does not have to be saved to disk. Indeed, because the virtual address space is sparse, only non-zero ranges have to be saved. In particular, this means that the region ranging from mach_brkpt to mach_maplimit isn't saved to the file (thus resulting in a segment whose filesize is less than its virtual memory size, see Mach-O Runtime Architecture for details). The bulk of the implementation comes from Andrew Choi's work for Emacs.
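
The relation between the two sizes can be made concrete with a small sketch. According to the text above, the cut falls at mach_brkpt; the code below instead finds the cut by scanning for the last non-zero byte, which is a swapped-in illustrative technique, not the method used by GCL or Emacs. The struct toy_segment, plan_segment and TOY_PAGE_SIZE are hypothetical names for this example only.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical, simplified description of one segment to be written out. */
struct toy_segment {
    size_t vmsize;    /* bytes the segment occupies once mapped */
    size_t filesize;  /* bytes actually stored in the executable file */
};

/* Decide how much of [start, start + vmsize) must be stored: everything up
 * to the last non-zero byte, rounded up to a page boundary. The zero tail is
 * not stored and is expected to read back as zeroes when mapped. */
struct toy_segment plan_segment(const unsigned char *start, size_t vmsize)
{
    size_t used = vmsize;

    assert(start != NULL || vmsize == 0);
    while (used > 0 && start[used - 1] == 0)
        used--;                                  /* drop trailing zero bytes */

    size_t filesize = (used + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE * TOY_PAGE_SIZE;
    if (filesize > vmsize)
        filesize = vmsize;                       /* never store more than is mapped */

    struct toy_segment seg = { vmsize, filesize };
    return seg;
}
```

For a four-page region in which only the first page holds data, this yields a filesize of one page against a vmsize of four pages.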


* BFD Mach-O port

GCL has the ability to compile Lisp code to native object code, load the compiled code into the running image, link the code and execute it. Most of the time, this kind of functionality is achieved by compiling shared object libraries (.so files on Linux, .dylib files or .bundle files on Mac OS X) and then loading the shared object library using the dlopen(3) interface (note, however, that Mac OS X has its own replacement solution for dlopen(3), which is the dyld(3) interface, but the concept remains the same). Handy as the dlopen strategy is, it is fairly time consuming because the external linker has to be called in order to turn the raw object files (.o) output by the compiler into shared object files. To speed things up at the expense of more development, GCL implements its own toy linker on top of the BFD interface. BFD is the Binary File Descriptor library. The official BFD distribution supports ELF well, but does not really support Mach-O. For this reason, I had to extend the official Mach-O back-end in order to support linking (see files "mach-o.c" and "mach-o-reloc.c" in the local BFD tree). There's nothing special to say about this port, except that the job of adding relocation support was tough.

To summarize, there are two dynamic loading mechanisms available for Mac OS X at this stage. The old one, slow and no longer supported, relies on dyld(3). The new one is built on top of my work to extend the Mach-O back-end. At this stage, there are no known bugs in the relocation code (that is, Maxima and ACL2 compile fine), but there's a high probability that as yet hidden bugs remain.
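
To give an idea of what "relocation support" involves, here is a deliberately simplified sketch of the patching step a linker performs once a section has been loaded. The record layout (struct toy_reloc), the function apply_relocs and the two relocation kinds are hypothetical; BFD's real data structures and the actual Mach-O relocation types are considerably richer and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation record. */
struct toy_reloc {
    size_t  offset;       /* byte offset of the 32-bit field to patch */
    size_t  symbol;       /* index into the table of resolved symbols */
    int32_t addend;       /* constant added to the symbol address */
    int     pc_relative;  /* non-zero: store the distance from the field */
};

/* Patch every relocated field of a section loaded at address load_addr.
 * symbol_addr[i] is the final address of symbol i. */
void apply_relocs(unsigned char *section, size_t section_size,
                  uint32_t load_addr,
                  const struct toy_reloc *relocs, size_t nrelocs,
                  const uint32_t *symbol_addr)
{
    for (size_t i = 0; i < nrelocs; i++) {
        const struct toy_reloc *r = &relocs[i];

        /* The 4-byte field must lie entirely inside the section. */
        assert(section_size >= 4 && r->offset <= section_size - 4);

        uint32_t value = symbol_addr[r->symbol] + (uint32_t)r->addend;
        if (r->pc_relative)
            value -= load_addr + (uint32_t)r->offset;

        /* Stored in host byte order; memcpy avoids unaligned access. */
        memcpy(section + r->offset, &value, sizeof value);
    }
}
```

An absolute relocation stores the symbol address plus the addend, while a PC-relative one stores the distance from the patched field to that target. Real instruction sets add further cases (split or shifted fields, for example), which this sketch does not attempt to model.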


* Stratified Garbage Collection

To be completed.


* Further references

- Mach-O Runtime Architecture
- Linkers & Loaders by John R. Levine
