Size and length limits for Emacs primitive types and etc data?
From: Oleksandr Gavenko
Subject: Size and length limits for Emacs primitive types and etc data?
Date: Wed, 23 Jan 2013 00:06:04 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)
While searching, I found these sources of information about the limits of the
Emacs runtime:
(info "(elisp)Programming Types")
Programming Types
http://www.emacswiki.org/emacs/EmacsFileSizeLimit
EmacsFileSizeLimit
http://article.gmane.org/gmane.emacs.devel/139119
Re: stack overflow limit
The value of re_max_failures we use now needs 4MB of stack on
a 32-bit machine, twice as much on a 64-bit machine. We also
need stack space for GC.
From the official docs:
For integers: 28 bits + sign.
For chars: 22-bit.
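To make the integer figure concrete, here is a small C sketch. It is my own illustration, not Emacs code, and the names fixnum_max and fixnum_min are made up. It assumes the integer is a plain two's-complement value of the given total width, so 28 bits + sign means 29 bits in total:

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of a two's-complement integer that is `bits` wide in
   total (sign bit included).  Only meaningful for 2 <= bits <= 63.  */
int64_t fixnum_max(int bits)
{
    assert(bits >= 2 && bits <= 63);
    return ((int64_t)1 << (bits - 1)) - 1;
}

/* Smallest (most negative) value for the same width.  */
int64_t fixnum_min(int bits)
{
    assert(bits >= 2 && bits <= 63);
    return -((int64_t)1 << (bits - 1));
}
```

With bits = 29 this gives the range -268435456 .. 268435455. The width to plug in is whatever the manual for the Emacs version in question states.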
The following types have no size limits stated in the manual, but:
================================================================
For floats: Emacs uses the IEEE floating point standard where possible. But
which precision exactly (half/single/double, see
http://en.wikipedia.org/wiki/IEEE_754#Basic_formats)?
/* Lisp floating point type.  */
struct Lisp_Float                 /* src/lisp.h */
{
  union
  {
    double data;
    struct Lisp_Float *chain;
  } u;
};
So it seems to use 64-bit (double precision) IEEE 754 on most 32-bit platforms.
Is there any function available at runtime that returns the number of digits
and the exponent width of a float?
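On the C side, at least, the format of double can be queried through <float.h>. The sketch below is standalone C, not something callable from Lisp, and the function names are hypothetical. It assumes a binary floating-point format (FLT_RADIX == 2):

```c
#include <assert.h>
#include <float.h>

/* Number of significand bits in a C double, counting the implicit
   leading bit.  */
int double_mantissa_bits(void)
{
    return DBL_MANT_DIG;
}

/* Width of the exponent field, derived from DBL_MAX_EXP under the
   assumption that DBL_MAX_EXP == 2^(width - 1), as in IEEE 754.  */
int double_exponent_bits(void)
{
    assert(FLT_RADIX == 2);
    int bits = 1;
    for (int e = DBL_MAX_EXP; e > 1; e >>= 1)
        bits++;
    return bits;
}
```

On a platform where double is IEEE 754 binary64 these return 53 and 11.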
================================================================
For lists: I think their length is not limited at all.
================================================================
But how many bytes does a symbol take? For example 'foo'?
From src/lisp.h:
typedef struct { EMACS_INT i; } Lisp_Object;

struct Lisp_Symbol
{
  unsigned gcmarkbit : 1;
  ENUM_BF (symbol_redirect) redirect : 3;
  unsigned constant : 2;
  unsigned interned : 2;
  unsigned declared_special : 1;
  Lisp_Object name;
  union {
    Lisp_Object value;
    struct Lisp_Symbol *alias;
    struct Lisp_Buffer_Local_Value *blv;
    union Lisp_Fwd *fwd;
  } val;
  Lisp_Object function;
  Lisp_Object plist;
  struct Lisp_Symbol *next;
};
For a 32-bit arch I count 4*6=24 bytes.
It seems that a Lisp_Object is an index into a hash table of the actual values
(like the actual name or function code...).
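The count can be checked with sizeof on a stand-in struct. This is only a sketch: mock_symbol and mock_symbol_words are made-up names, the pointed-to Emacs types are replaced by void *, ENUM_BF is replaced by a plain unsigned bit-field, and Lisp_Object is assumed to be exactly one pointer-sized word (intptr_t standing in for EMACS_INT):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: EMACS_INT is pointer-sized.  */
typedef struct { intptr_t i; } Lisp_Object;

struct mock_symbol
{
    unsigned gcmarkbit : 1;
    unsigned redirect : 3;
    unsigned constant : 2;
    unsigned interned : 2;
    unsigned declared_special : 1;
    Lisp_Object name;
    union {
        Lisp_Object value;
        struct mock_symbol *alias;
        void *blv;   /* stands in for struct Lisp_Buffer_Local_Value * */
        void *fwd;   /* stands in for union Lisp_Fwd * */
    } val;
    Lisp_Object function;
    Lisp_Object plist;
    struct mock_symbol *next;
};

/* Size of the stand-in struct, measured in pointer-sized words.  */
size_t mock_symbol_words(void)
{
    assert(sizeof(struct mock_symbol) % sizeof(void *) == 0);
    return sizeof(struct mock_symbol) / sizeof(void *);
}
```

On common ABIs the nine flag bits share one word with padding, so the result is six words: 24 bytes with 4-byte pointers, 48 bytes with 8-byte pointers. The string holding the symbol's name is a separate object and is not included.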
================================================================
How much memory does a cons cell take?
struct Lisp_Cons
{
  Lisp_Object car;
  union
  {
    Lisp_Object cdr;
    struct Lisp_Cons *chain;
  } u;
};
For a 32-bit arch I count 4*2=8 bytes.
================================================================
How much memory does a plist take to store a single property?
From src/fns.c:
DEFUN ("plist-put", Fplist_put, Splist_put, 3, 3, 0,
  (Lisp_Object plist, register Lisp_Object prop, Lisp_Object val)
{
  register Lisp_Object tail, prev;
  Lisp_Object newcell;
  prev = Qnil;
  for (tail = plist; CONSP (tail) && CONSP (XCDR (tail));
       tail = XCDR (XCDR (tail)))
it seems that it takes 2 conses, or 8*2=16 bytes.
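The two-conses-per-property count can be reproduced with a toy property list. This is a sketch under simplifying assumptions, not the Emacs implementation: struct cell, cons, toy_plist_put and toy_cells_allocated are invented names, keys and values are C strings, and keys are compared with strcmp where Emacs compares with eq:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct cell { const void *car; struct cell *cdr; };

static size_t cells_allocated;   /* how many cells cons() has created */

static struct cell *cons(const void *car, struct cell *cdr)
{
    struct cell *c = malloc(sizeof *c);
    if (c == NULL)
        abort();
    c->car = car;
    c->cdr = cdr;
    cells_allocated++;
    return c;
}

/* Walk the list two cells at a time.  Overwrite the value if `prop`
   is already present, otherwise append a new (prop val) pair.  */
struct cell *toy_plist_put(struct cell *plist, const char *prop, const char *val)
{
    assert(prop != NULL);
    struct cell *tail, *prev = NULL;
    for (tail = plist; tail && tail->cdr; tail = tail->cdr->cdr) {
        if (strcmp(tail->car, prop) == 0) {
            tail->cdr->car = val;        /* existing property: no allocation */
            return plist;
        }
        prev = tail;
    }
    struct cell *newcell = cons(prop, cons(val, NULL));  /* two new cells */
    if (prev == NULL)
        return newcell;
    prev->cdr->cdr = newcell;
    return plist;
}

size_t toy_cells_allocated(void)
{
    return cells_allocated;
}
```

Adding a new property allocates two cells and updating an existing one allocates none, so the list costs 2 * sizeof (struct cell) per stored property, not counting the key and value objects themselves.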
================================================================
How much memory does a string take (this covers buffer strings and symbol names)?
typedef struct interval *INTERVAL;

struct Lisp_String
{
  ptrdiff_t size;
  ptrdiff_t size_byte;
  INTERVAL intervals;             /* Text properties in this string.  */
  unsigned char *data;
};
It seems to be 4*4=16 bytes for the four fields on a 32-bit arch, plus
lengthOf(data) bytes.
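A rough footprint estimate, counting all four fields of the struct (including the data pointer) plus the data bytes. This is a sketch with made-up names (mock_string, mock_string_footprint). It assumes one terminating NUL byte after the data and ignores any allocator bookkeeping:

```c
#include <assert.h>
#include <stddef.h>

struct mock_string
{
    ptrdiff_t size;
    ptrdiff_t size_byte;
    void *intervals;          /* stands in for INTERVAL */
    unsigned char *data;
};

/* Header plus `nbytes` of data plus an assumed trailing NUL byte.  */
size_t mock_string_footprint(size_t nbytes)
{
    assert(nbytes < (size_t)-1 - sizeof(struct mock_string));
    return sizeof(struct mock_string) + nbytes + 1;
}
```

With 4-byte pointers the header part is 16 bytes, with 8-byte pointers it is 32 bytes.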
The manual says that "strings really contain integers" and "strings are arrays, and
therefore sequences as well".
So does each char (in data) use 4 bytes? It seems not, as the manual also says:
To conserve memory, Emacs does not hold fixed-length 22-bit numbers that
are codepoints of text characters within buffers and strings. Rather, Emacs
uses a variable-length internal representation of characters, that stores
each character as a sequence of 1 to 5 8-bit bytes, depending on the
magnitude of its codepoint.
and:
Encoded text is not really text, as far as Emacs is concerned, but rather a
sequence of raw 8-bit bytes. We call buffers and strings that hold encoded
text "unibyte" buffers and strings, because Emacs treats them as a sequence
of individual bytes.
With unibyte I understand that it is easy to get a char by index.
But with multibyte I don't understand how that works, and I don't understand
why strings are called arrays in this case. Is it an inefficient array?
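To show why indexing a variable-length encoding is not a simple pointer addition, here is a sketch that finds the byte offset of the N-th character by scanning. It assumes plain UTF-8 (the Emacs internal representation is described as a variable-length encoding, but this code does not model its extensions). The names are hypothetical, and it says nothing about any caching Emacs itself may do:

```c
#include <assert.h>
#include <stddef.h>

/* In UTF-8, continuation bytes have the bit pattern 10xxxxxx.  */
static int is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Byte offset at which character number `char_index` (0-based) starts.
   Returns `nbytes` if the text has fewer characters than that.  */
size_t char_to_byte_offset(const unsigned char *s, size_t nbytes,
                           size_t char_index)
{
    assert(s != NULL || nbytes == 0);
    size_t pos = 0;
    while (pos < nbytes && char_index > 0) {
        pos++;                                   /* skip the lead byte */
        while (pos < nbytes && is_continuation(s[pos]))
            pos++;                               /* skip continuation bytes */
        char_index--;
    }
    return pos;
}
```

The loop visits every byte before the requested character, so a naive lookup is linear in the position, whereas in a unibyte string the offset is simply the index.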
It seems that buffer text is stored much like a string:
struct buffer_text                /* from src/buffer.h */
{
  unsigned char *beg;
  ptrdiff_t gpt;                  /* Char pos of gap in buffer.  */
  ptrdiff_t z;                    /* Char pos of end of buffer.  */
  ptrdiff_t gpt_byte;             /* Byte pos of gap in buffer.  */
  ptrdiff_t z_byte;               /* Byte pos of end of buffer.  */
  ptrdiff_t gap_size;             /* Size of buffer's gap.  */
  EMACS_INT modiff;               /* This counts buffer-modification events ... */
  EMACS_INT chars_modiff;         /* This is modified with character change ... */
  EMACS_INT save_modiff;          /* Previous value of modiff, as of last ... */
  EMACS_INT overlay_modiff;       /* Counts modifications to overlays.  */
  EMACS_INT compact;              /* Set to modiff each time when compact_buffer ... */
  ptrdiff_t beg_unchanged;
  ptrdiff_t end_unchanged;
  EMACS_INT unchanged_modified;
  EMACS_INT overlay_unchanged_modified;
  INTERVAL intervals;
  struct Lisp_Marker *markers;
  bool inhibit_shrinking;
};
So opening a 10 KiB Russian file in cp1251 actually takes 2*10 KiB for the
buffer, as each Russian char in a multibyte string takes 2 bytes... (just type
C-u C-x = and look at "buffer code: #xD0 #x91").
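The doubling can be estimated with a sketch like the following. It assumes the buffer holds the text as plain UTF-8, and it handles only ASCII bytes and the cp1251 range 0xC0..0xFF, which maps to U+0410..U+044F (the basic Russian letters). Other cp1251 bytes are merely counted as skipped. The function names are made up:

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes UTF-8 needs for one code point.  */
static size_t utf8_length(unsigned long cp)
{
    assert(cp <= 0x10FFFFul);
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Estimated UTF-8 size of `n` bytes of cp1251 text.  Bytes outside
   ASCII and 0xC0..0xFF are not converted; their count is stored in
   *skipped when `skipped` is not NULL.  */
size_t cp1251_utf8_size(const unsigned char *buf, size_t n, size_t *skipped)
{
    size_t total = 0, unknown = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char b = buf[i];
        if (b < 0x80)
            total += utf8_length(b);                      /* ASCII: 1 byte */
        else if (b >= 0xC0)
            total += utf8_length(0x0410ul + (b - 0xC0));  /* Russian letter: 2 bytes */
        else
            unknown++;
    }
    if (skipped != NULL)
        *skipped = unknown;
    return total;
}
```

Byte 0xC1 maps to U+0411, whose UTF-8 form is the two bytes D0 91. Spaces, digits, punctuation and newlines stay at one byte each, so a real text file grows by somewhat less than a factor of two.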
I think that strings have no length limit (except the 28-bit limit for the
index on a 32-bit platform).
================================================================
It seems that arrays/vectors also have no length limit (except the 28-bit limit
for the index on a 32-bit platform):
/* Regular vector is just a header plus array of Lisp_Objects.  */
struct Lisp_Vector                /* src/lisp.h */
{
  struct vectorlike_header header;
  Lisp_Object contents[1];
};

/* A boolvector is a kind of vectorlike, with contents are like a string.  */
struct Lisp_Bool_Vector
{
  struct vectorlike_header header;
  /* This is the size in bits.  */
  EMACS_INT size;
  /* This contains the actual bits, packed into bytes.  */
  unsigned char data[1];
};
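The payload sizes implied by these two structs can be sketched as follows. The function names are invented, the header is deliberately left out because its size is not shown here, and the word size is passed in as a parameter instead of being taken from Emacs:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bytes needed to hold `nbits` bits packed into bytes, rounding up.  */
size_t bool_vector_data_bytes(size_t nbits)
{
    assert(nbits <= (size_t)-1 - (CHAR_BIT - 1));
    return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}

/* Bytes needed for `nelems` Lisp_Object slots of `word_size` bytes each.  */
size_t vector_contents_bytes(size_t nelems, size_t word_size)
{
    assert(word_size > 0 && nelems <= (size_t)-1 / word_size);
    return nelems * word_size;
}
```

For example, a vector of 1000 elements with 4-byte words needs 4000 bytes of contents, while a bool vector of 1000 bits needs 125 bytes of data.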
================================================================
Hash tables are a more complex data type, and I don't understand what limits
the number of key-value pairs from:
struct Lisp_Hash_Table
{
  struct vectorlike_header header;
  Lisp_Object weak;
  Lisp_Object rehash_size;
  Lisp_Object rehash_threshold;
  Lisp_Object hash;
  Lisp_Object next;
  Lisp_Object next_free;
  Lisp_Object index;
  ptrdiff_t count;
  Lisp_Object key_and_value;
  struct hash_table_test test;
  struct Lisp_Hash_Table *next_weak;
};
================================================================
Please correct me and answer the questions...
--
Best regards!