Hello Edmund, and everyone else following this,
Here are my thoughts, not as a gate-keeper, but rather in the spirit of peer review.
In my own tcc hacking, I have looked closely at the cstring handling in token streams and always found this sort of code troubling:
nb_words = (sizeof(CString) + cv->cstr->size + 3) >> 2;
Doesn't this assume that ints are 4 bytes? Of course, if ints are actually 8 bytes, the ensuing malloc simply allocates more room than we need, so it has not been a problem, but it nonetheless seemed a bit too loose for my taste. Your patch touches the offending lines, and it looks like it handles them correctly.
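To make the concern concrete, here is a minimal sketch of the same rounding written without the hard-coded 4. The helper names are hypothetical and are not taken from tcc or from Edmund's patch; they only illustrate the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many int-sized slots are needed to hold
   nbytes, rounding up, whatever sizeof(int) happens to be. */
size_t words_for_bytes(size_t nbytes)
{
    /* Guard against wrap-around in the rounding addition. */
    assert(nbytes <= (size_t)-1 - (sizeof(int) - 1));
    return (nbytes + sizeof(int) - 1) / sizeof(int);
}

/* The form used in the quoted line: rounds up to a multiple of 4 and
   divides by 4, i.e. it silently assumes sizeof(int) == 4. */
size_t words_for_bytes_assuming_4(size_t nbytes)
{
    return (nbytes + 3) >> 2;
}
```

When sizeof(int) is 4 the two functions agree; when it is larger, the second one returns a count that is bigger than necessary, which is the harmless over-allocation described above.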
What I'm not sure about, and would appreciate somebody checking, is whether changing the contents of the union might substantially increase memory consumption. How many CValues are typically allocated and used during a regular compilation? The proposed change alters a union whose largest member used to be a pointer or 64-bit integer, so that its largest member is now a struct containing an int and two pointers. On the other hand, allocating no more room than necessary for cstrings should reduce memory consumption on 64-bit architectures. I take Edmund's word that this fixes alignment issues, and it may also lead to lower memory consumption, at least on 64-bit.
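For anyone who wants to check the size question, a rough sketch along these lines might help. The two unions are hypothetical stand-ins built only from the description above (a pointer or 64-bit integer before, a struct with an int and two pointers after); they are not the actual CValue definitions, and the member names are made up.

```c
#include <stddef.h>

/* Hypothetical "before" layout: the largest member is a pointer
   or a 64-bit integer. */
union OldValueSketch {
    long long ll;
    void *ptr;
};

/* Hypothetical "after" layout: same members plus a struct holding
   an int and two pointers. */
union NewValueSketch {
    long long ll;
    void *ptr;
    struct {
        int size;
        const void *data;
        void *data_allocated;
    } str;
};

size_t old_value_size(void) { return sizeof(union OldValueSketch); }
size_t new_value_size(void) { return sizeof(union NewValueSketch); }

/* Extra bytes consumed if n such values are live at once. */
size_t extra_bytes_for(size_t n)
{
    return n * (new_value_size() - old_value_size());
}
```

On a typical LP64 target these sketch unions come out at 8 and 24 bytes, so the real cost depends entirely on how many values are live at the same time, which is the number I would like somebody to measure.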