gnustep-dev

Re: New ABI NSConstantString


From: Ivan Vučica
Subject: Re: New ABI NSConstantString
Date: Thu, 5 Apr 2018 18:52:45 +0100

Thank you, this was very informative!

On Thu, Apr 5, 2018 at 6:41 PM, David Chisnall
<address@hidden> wrote:
> On 5 Apr 2018, at 17:01, Ivan Vučica <address@hidden> wrote:
>>
>> Layman question: does it make sense to optimize for space, too, and have a 
>> smaller structure for tiny constant strings?
>
> With the new ABI, we get much better deduplication across compilation units 
> for selectors and protocols, which should extend to constant strings.
>
> At run time, on 64-bit platforms, we generate GSTinyString instances, which 
> are 64 bits and are hidden inside a pointer.  I’m tempted to make the 
> compiler generate those directly.
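
A rough sketch of how a short string can be hidden inside a 64-bit, pointer-sized word, in the spirit of the GSTinyString instances mentioned above. The bit layout (3-bit tag, 5-bit length, eight 7-bit ASCII characters), the tag value, and every name (`tiny_pack`, `tiny_unpack`, `TINY_TAG`) are assumptions made for illustration and may not match what -base actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, for illustration only:
 *   bits 0-2    tag marking the word as a tiny string (value assumed)
 *   bits 3-7    length in characters, 0..8
 *   bits 8-63   up to eight 7-bit ASCII characters, first one highest
 */
#define TINY_TAG        UINT64_C(4)
#define TINY_TAG_MASK   UINT64_C(7)
#define TINY_LEN_SHIFT  3
#define TINY_LEN_MASK   UINT64_C(0x1f)
#define TINY_MAX_CHARS  8

/* Pack a string into one word; returns 0 if it does not fit.
 * A valid word is never 0 because the tag bits are non-zero. */
static uint64_t tiny_pack(const char *s)
{
    size_t len = strlen(s);
    if (len > TINY_MAX_CHARS)
        return 0;
    uint64_t word = TINY_TAG | ((uint64_t)len << TINY_LEN_SHIFT);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c > 0x7f)
            return 0;                       /* only 7-bit ASCII fits */
        word |= (uint64_t)c << (57 - 7 * i);
    }
    return word;
}

/* Read the length field back out of a packed word. */
static size_t tiny_length(uint64_t word)
{
    assert((word & TINY_TAG_MASK) == TINY_TAG);
    return (size_t)((word >> TINY_LEN_SHIFT) & TINY_LEN_MASK);
}

/* Expand a packed word into a NUL-terminated buffer. */
static void tiny_unpack(uint64_t word, char out[TINY_MAX_CHARS + 1])
{
    size_t len = tiny_length(word);
    for (size_t i = 0; i < len; i++)
        out[i] = (char)((word >> (57 - 7 * i)) & 0x7f);
    out[len] = '\0';
}
```

With this layout, a string such as "NSString" (8 characters) needs no storage beyond the pointer-sized word, while "NSConstantString" is too long and would fall back to a regular constant string structure.
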
>
>> For 32-bit pointers and longs, this would be 20 bytes without the string 
>> itself. I don't think that's a lot, but I thought I'd ask.
>
> 20 bytes isn’t too bad, 36 (for 64-bit platforms) is a bit more.  On a 
> CHERI-like platform, it grows to 52 bytes, which starts to feel a bit 
> excessive.
>
> The absolute minimum structure is an isa pointer immediately followed by the 
> character data, with a null terminator.  That’s not a great idea, because the 
> isa pointer needs to be mutable, which would make the constant string also 
> accidentally mutable.
>
> The next smallest would be an isa pointer and a null-terminated string 
> pointer, so 8 / 16 / 32 bytes on the respective architectures.
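
The 8 / 16 / 32 figures are just two pointers at each pointer width. A minimal sketch, assuming pointer sizes of 4, 8 and 16 bytes (the 16-byte case is inferred from the 32-byte figure for a CHERI-like platform); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The "next smallest" layout as it would look on the host platform. */
struct minimal_const_string {
    const void *isa;    /* class pointer */
    const char *data;   /* NUL-terminated character data */
};

/* Size in bytes of an isa pointer plus a data pointer,
 * for a given pointer width in bytes. */
static size_t pointer_pair_size(size_t pointer_bytes)
{
    assert(pointer_bytes > 0);
    return 2 * pointer_bytes;
}
```

Calling `pointer_pair_size` with 4, 8 and 16 gives the three sizes quoted above.
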
>
> Recomputing the hash is expensive enough that it’s probably worth keeping 
> at least the 28 bits that we already provide for string hashes.
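
To make the trade-off concrete, here is a sketch of a header that carries a precomputed 28-bit hash, so that hashing a constant string becomes a field read instead of a pass over the characters. FNV-1a is only a stand-in hash function, and the struct layout and names are hypothetical rather than the ABI's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 28
#define HASH_MASK ((UINT32_C(1) << HASH_BITS) - 1)

/* FNV-1a, used here purely as a placeholder hash function. */
static uint32_t fnv1a(const char *s)
{
    assert(s != NULL);
    uint32_t h = UINT32_C(2166136261);
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= UINT32_C(16777619);
    }
    return h;
}

/* Truncate the hash to the 28 bits available for string hashes. */
static uint32_t hash28(const char *s)
{
    return fnv1a(s) & HASH_MASK;
}

/* Hypothetical header: isa, data pointer, and a cached 28-bit hash. */
struct hashed_const_string {
    const void *isa;
    const char *data;
    uint32_t    hash;   /* only the low 28 bits are meaningful */
};

/* Stand-in for what the compiler would emit as static data. */
static struct hashed_const_string
make_const_string(const void *isa, const char *data)
{
    struct hashed_const_string s = { isa, data, hash28(data) };
    return s;
}
```

The cost is one extra 32-bit field per string; the saving is that looking up a constant string in a dictionary never has to rescan its characters.
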
>
> I’ve done some measurements in -base.  The compiled binary contains 84976 
> bytes of string data in 3307 strings, an average of just under 26 bytes per 
> string, so 36 bytes of overhead seems quite a lot and even 20 is quite 
> noticeable.  Excluding strings of 8 or fewer characters leaves 81637 bytes in 
> 2586 strings, an average length of just under 32 bytes, so 36 bytes is still 
> more than 100% overhead and adds up to about 90KB in the final binary.
>
> With the current encoding, each constant string is 24 bytes, so that adds up 
> to about 60KB (excluding the string data itself) on 64-bit platforms.  That’s 
> about 0.5% of the total binary size, so I’m not too worried about making it 
> bigger.  Even making it 80KB is a lot of overhead per string (roughly 100%), 
> but isn’t that much of the total binary size.
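
The arithmetic behind these figures can be checked with a few helper functions. This is a sketch only: the function names are hypothetical, and using the 2586 longer strings as the count behind the 60KB figure is an assumption, since the message does not say which count was used.

```c
#include <assert.h>

/* Average string length in bytes. */
static double average_length(unsigned long total_bytes, unsigned long count)
{
    assert(count > 0);
    return (double)total_bytes / (double)count;
}

/* Total bytes spent on per-string headers. */
static unsigned long header_total(unsigned long header_bytes,
                                  unsigned long count)
{
    return header_bytes * count;
}

/* Header size relative to the average payload (1.0 means 100%). */
static double overhead_ratio(unsigned long header_bytes, double avg_len)
{
    assert(avg_len > 0.0);
    return (double)header_bytes / avg_len;
}
```

For example, `header_total(36, 2586)` gives 93096 bytes (the "about 90KB"), `header_total(24, 2586)` gives 62064 bytes (the "about 60KB"), and `overhead_ratio(36, average_length(81637, 2586))` comes out above 1.0, matching the "more than 100% overhead" remark.
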
>
>
> David
>
