[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: New ABI NSConstantString
From: |
Fred Kiefer |
Subject: |
Re: New ABI NSConstantString |
Date: |
Sun, 8 Apr 2018 17:19:59 +0200 |
> Am 08.04.2018 um 16:11 schrieb David Chisnall <address@hidden>:
>
> On 8 Apr 2018, at 12:41, David Chisnall <address@hidden> wrote:
>>
>> On 8 Apr 2018, at 10:55, Richard Frith-Macdonald <address@hidden> wrote:
>>>
>>>
>>>
>>>> On 6 Apr 2018, at 11:00, David Chisnall <address@hidden> wrote:
>>>>
>>>> It would probably help catch more bugs if we made use of NSString’s
>>>> class-cluster nature more in -base. I have just fixed a bug in GSString
>>>> where we were checking one object matched a particular class before
>>>> dereferencing the _flags ivar of the other. I caught this because the
>>>> other was a GSTinyString, which is almost never a valid pointer.
>>>
>>> Possibly, but performance *is* an issue here. The NSString code was
>>> rewritten some years ago (moving away from them use of class cluster
>>> features) as a result of extensive profiling of real-world applications
>>> which were running too slow, precisely because NSString methods are very
>>> heavily used in real apps. At the time somethjing like 20% of the CPU was
>>> wasted in method dispatch overheads (the -characterAtIndex: method is one
>>> of the cluster primitives and a major culprit) but there were also
>>> performance issues due to buffer allocation and copying of internal
>>> representations. The changes made a substantial improvement in general
>>> performance as well as causing multipler orders of magnitude improvement in
>>> a few pathological cases.
>>
>> I agree that we should be improving performance for critical code, but
>> unfortunately it appears that we have done so at the expense of correctness
>> in a number of places. As per my other email,
>> -rangeOfComposedCharacterSequenceAtIndex: appears to give the wrong results
>> in almost every nontrivial case, and is unfortunately one of the primitive
>> methods for a lot of things.
>>
>> I also note that a lot of the NSString method implementations are not well
>> optimised. In a number of places, -characterAtIndex: is called repeatedly,
>> when -getCharacters:range: is normally significantly more efficient. The
>> ICU UText interface provides something very similar to -getCharacters:range:
>> as its primitive method (a callback that fills a buffer with UTF-16
>> characters) and has some carefully optimised routines.
>
> I’ve pushed my WIP changes to the newabi branch - review is welcome! This
> branch disables the GSString implementation of
> -rangeOfComposedCharacterSequenceAtIndex: and falls back to the NSString one
> (which is also wrong, but now consistently wrong).
I think your implementation of characterAtIndex: is wrong for the UTF8 case.
Sadly things are more complicated. Please have a look at the (slow) way the old
code is handling this.
- (unichar) characterAtIndex: (NSUInteger)index
{
NSUInteger l = 0;
unichar u;
unichar n = 0;
unsigned i = 0;
while (i < nxcslen || n > 0)
{
u = nextUTF8((const uint8_t *)nxcsptr, nxcslen, &i, &n);
if (l++ == index)
{
return u;
}
}
[NSException raise: NSInvalidArgumentException
format: @"-characterAtIndex: index out of range"];
return 0;
}
- Re: New ABI NSConstantString, (continued)
- Re: New ABI NSConstantString, David Chisnall, 2018/04/07
- Re: New ABI NSConstantString, Ivan Vučica, 2018/04/07
- Re: New ABI NSConstantString, David Chisnall, 2018/04/07
- Re: New ABI NSConstantString, Richard Frith-Macdonald, 2018/04/07
- Re: New ABI NSConstantString, Ivan Vučica, 2018/04/07
- Re: New ABI NSConstantString, Stefan Bidigaray, 2018/04/07
Re: New ABI NSConstantString, David Chisnall, 2018/04/06
Re: New ABI NSConstantString, Richard Frith-Macdonald, 2018/04/08
Re: New ABI NSConstantString, David Chisnall, 2018/04/08