gnustep-dev

Re: New ABI NSConstantString


From: Richard Frith-Macdonald
Subject: Re: New ABI NSConstantString
Date: Sun, 8 Apr 2018 16:27:12 +0100


> On 8 Apr 2018, at 12:41, David Chisnall <address@hidden> wrote:
> 
> On 8 Apr 2018, at 10:55, Richard Frith-Macdonald <address@hidden> wrote:
>> 
>> 
>> 
>>> On 6 Apr 2018, at 11:00, David Chisnall <address@hidden> wrote:
>>> 
>>> It would probably help catch more bugs if we made use of NSString’s 
>>> class-cluster nature more in -base.  I have just fixed a bug in GSString 
>>> where we were checking one object matched a particular class before 
>>> dereferencing the _flags ivar of the other.  I caught this because the 
>>> other was a GSTinyString, which is almost never a valid pointer.
>> 
>> Possibly, but performance *is* an issue here.  The NSString code was 
>> rewritten some years ago (moving away from the use of class cluster 
>> features) as a result of extensive profiling of real-world applications 
>> which were running too slow, precisely because NSString methods are very 
>> heavily used in real apps.  At the time something like 20% of the CPU was 
>> wasted in method dispatch overheads (the -characterAtIndex: method is one of 
>> the cluster primitives and a major culprit) but there were also performance 
>> issues due to buffer allocation and copying of internal representations.  
>> The changes made a substantial improvement in general performance as well as 
>> causing multiple orders of magnitude improvement in a few pathological 
>> cases.
> 
> I agree that we should be improving performance for critical code, but 
> unfortunately it appears that we have done so at the expense of correctness 
> in a number of places.

Good guess, but you lack the perspective to appreciate quite how old GNUstep is 
... originally there was no unicode, only latin1, then we had both in a class 
'cluster' without using a set of primitive methods (wholly different 
implementations), then we had reorganisations combining the simple methods, then 
separating out again (but keeping common layout) somewhat.  So correctness has 
nothing to do with performance and everything to do with history.  On the one 
hand the last major reorganisation fixed severe performance problems, on the 
other it hid (or kept hidden) a few remaining issues.

> I also note that a lot of the NSString method implementations are not well 
> optimised.

Yes ... because they are almost never used as we historically had unicode 
string methods and latin1 string methods.  I did optimise the more 'important' 
ones (i.e. those causing trouble in the test applications I tried) though.

> In a number of places, -characterAtIndex: is called repeatedly, when 
> -getCharacters:range: is normally significantly more efficient.

You have to be very careful about using -getCharacters:range: to give more 
efficiency, and also worry about extra complexity to put buffers on stack or 
heap (or work in subsections of strings copied to a stack buffer etc).  I 
remember quite a few cases where more complex code 'optimised' to work that way 
turned out to be slower for common cases.
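
To make the trade-off concrete, here is a minimal sketch in plain C rather 
than Objective-C.  It is not GNUstep code: the str struct, make_str, and the 
two count_spaces functions are hypothetical names, and the function pointers 
only stand in for the cost of a message send.  The first loop makes one 
indirect call per character (the -characterAtIndex: pattern); the second makes 
one indirect call per chunk and scans a fixed-size stack buffer (the 
-getCharacters:range: pattern).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a string class: all access goes through
 * function pointers, loosely modelling dynamic method dispatch. */
typedef struct str str;
struct str {
    const uint16_t *chars;   /* UTF-16 code units (not owned) */
    size_t length;           /* number of code units */
    uint16_t (*char_at)(const str *s, size_t i);
    void (*get_chars)(const str *s, uint16_t *buf, size_t start, size_t len);
};

static uint16_t utf16_char_at(const str *s, size_t i)
{
    assert(i < s->length);
    return s->chars[i];
}

static void utf16_get_chars(const str *s, uint16_t *buf,
                            size_t start, size_t len)
{
    assert(start <= s->length && len <= s->length - start);
    memcpy(buf, s->chars + start, len * sizeof(uint16_t));
}

static str make_str(const uint16_t *chars, size_t length)
{
    str s = { chars, length, utf16_char_at, utf16_get_chars };
    return s;
}

/* Per-character access: one indirect call for every code unit. */
static size_t count_spaces_per_char(const str *s)
{
    size_t n = 0;
    for (size_t i = 0; i < s->length; i++) {
        if (s->char_at(s, i) == ' ')
            n++;
    }
    return n;
}

/* Chunked access: one indirect call per CHUNK code units, copied into
 * a stack buffer so no heap allocation is needed. */
#define CHUNK 64

static size_t count_spaces_chunked(const str *s)
{
    uint16_t buf[CHUNK];
    size_t n = 0;
    for (size_t start = 0; start < s->length; start += CHUNK) {
        size_t len = s->length - start;
        if (len > CHUNK)
            len = CHUNK;
        s->get_chars(s, buf, start, len);
        for (size_t i = 0; i < len; i++) {
            if (buf[i] == ' ')
                n++;
        }
    }
    return n;
}
```

Both functions return the same count; the chunked one trades fewer indirect 
calls for an extra copy and more bookkeeping.  One plausible reason such code 
can lose in common cases: an operation that usually finishes within the first 
few characters (say, a search that tends to match early) still pays for 
copying a whole chunk before it looks at anything.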

>  The ICU UText interface provides something very similar to 
> -getCharacters:range: as its primitive method (a callback that fills a buffer 
> with UTF-16 characters) and has some carefully optimised routines.

Yes, I have been thinking about implementing an ICU subclass of NSString (on 
platforms where ICU is available) for some time.  My assumption/hope is that it 
might be both more correct (in odd parts of unicode that people writing our 
stuff have been unaware of) and faster than our UTF16 code.  Even if 
performance turned out to be poor, it would be good to have a reference 
implementation for testing for correctness.




