bug-gnu-emacs

bug#68244: hash-table improvements


From: Mattias Engdegård
Subject: bug#68244: hash-table improvements
Date: Sat, 6 Jan 2024 12:34:05 +0100

On 5 Jan 2024 at 16:41, Dmitry Gutov <dmitry@gutov.dev> wrote:

>> That's a good question and it all comes down to how we interpret 
>> `consing_until_gc`. Here we take the view that it should encompass all parts 
>> of an allocation and this seems to be consistent with existing code.
> 
> But the existing code used objects that would need to be collected by GC, 
> right? And the new one, seemingly, does not.

But it does, in much the same way that we deal with string data.

> So I don't quite see the advantage of increasing consing_until_gc then. It's 
> like the difference between creating new strings and inserting strings into a 
> buffer: new memory is used either way, but the latter doesn't increase 
> consing.

Since we don't know exactly when objects die, we use object allocation as a 
proxy, assuming that on average A bytes die for every B bytes allocated and 
make an informed (and adjusted) guess as to what the A/B ratio might be. That 
is the basis for the GC clock.
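
To make the idea concrete, here is a toy model of such a clock. It is only a sketch under simplifying assumptions, not the actual Emacs code: the names `gc_clock`, `gc_clock_charge` and `gc_clock_reset` are made up for illustration, and the real threshold adjustment is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy GC clock (hypothetical names, not taken from the Emacs sources).
   Allocation is used as a proxy for object deaths: every allocated
   byte moves the clock closer to the next collection.  */
struct gc_clock
{
  long long threshold;  /* bytes that may be allocated between collections */
  long long until_gc;   /* bytes left before the next collection is due */
};

static void
gc_clock_init (struct gc_clock *c, long long threshold)
{
  assert (threshold > 0);
  c->threshold = threshold;
  c->until_gc = threshold;
}

/* Charge NBYTES of allocation to the clock.
   Return true if a collection is now due.  */
static bool
gc_clock_charge (struct gc_clock *c, size_t nbytes)
{
  c->until_gc -= (long long) nbytes;
  return c->until_gc <= 0;
}

/* Restart the countdown; to be called after a collection has run.  */
static void
gc_clock_reset (struct gc_clock *c)
{
  c->until_gc = c->threshold;
}
```

In this model, the question under discussion is simply which allocations call `gc_clock_charge`: object allocation does, buffer text growth does not, and the patch takes the view that hash-table storage should.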

Buffer memory is indeed treated differently and does not advance the GC clock 
as far as I can tell. Presumably the reasoning is that buffer size changes make 
a poor proxy for object deaths.

Of course we could reason that growing an existing hash table is also a bad 
proxy for object deaths, but the evidence for that is weak so I used the same 
metric as for other data structures just to be on the safe side.

This reminds me that the `gcstat` bookkeeping should probably include the 
hash-table ancillary arrays as well, since those counters are used to adjust 
the GC clock (see total_bytes_of_live_objects and consing_threshold). Will fix!
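
Roughly, the point is the following. This is a simplified sketch with hypothetical names (`live_stats`, `live_bytes_total`, `next_threshold`), assuming that the next threshold is the larger of a fixed minimum and a fraction of the live bytes; the real logic in the sources has more detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of what survived the last collection.  */
struct live_stats
{
  size_t object_bytes;    /* conses, strings, vectors and so on */
  size_t hash_aux_bytes;  /* hash-table ancillary arrays */
};

/* Total live bytes, counting the hash-table arrays as well.  */
static size_t
live_bytes_total (const struct live_stats *s)
{
  return s->object_bytes + s->hash_aux_bytes;
}

/* Bytes of allocation allowed before the next collection: at least
   MIN_THRESHOLD, and at least PERCENTAGE of what is currently live.  */
static size_t
next_threshold (const struct live_stats *s, size_t min_threshold,
                double percentage)
{
  assert (percentage >= 0.0);
  size_t scaled = (size_t) ((double) live_bytes_total (s) * percentage);
  return scaled > min_threshold ? scaled : min_threshold;
}
```

If the ancillary arrays are charged to the clock when allocated but left out of the live total, the scaled threshold comes out smaller than intended for programs with large tables, which is why both sides of the bookkeeping should agree.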

> It's great that the new hash tables are garbage-collected more easily and 
> produce less garbage overall, but in a real program any GC cycle will have to 
> traverse the other data structures anyway. So we might be leaving free 
> performance gains on the table when we induce GC cycles while no managed 
> allocations are done. I could be missing something, of course.

So could I, and please know that your questions are much appreciated. Are you 
satisfied by my replies above, or did I misunderstand your concerns?





