bug-gnu-utils

Re: sed bug when installing texlive-latex-base


From: Paolo Bonzini
Subject: Re: sed bug when installing texlive-latex-base
Date: Fri, 02 Dec 2011 14:43:01 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20110930 Thunderbird/7.0.1

On 12/02/2011 02:26 PM, Terje Bråten wrote:

> Was it a glibc bug? What was it?
>
> I have done a little programming, so I am just a bit curious.

It's a buffer overflow. I cannot walk through every step of an example
that actually causes a SIGSEGV, but I can explain the idea. Let's say you
have to match something like this:

kjihgfedcba

against "[hi][^ ]*0" (yes, it doesn't match; it doesn't matter).

glibc first notices the match must start with "h" or "i", so it skips
the first two characters; it tries matching "ihgfedcba" and, for some
complicated reasons, copies that into a buffer instead of just
remembering "I have to start from the third character". You get

ihgfedcba

and matching fails. So it tries again with "hgfedcba". It places it into
the same buffer. glibc knows that the length is 8, so it doesn't bother
overwriting the rest of the buffer, and the buffer becomes

hgfedcbaa

with a leftover "a" from the previous attempt. Again, normally it
doesn't matter because glibc knows that the length is 8.
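To make the leftover byte concrete, here is a toy model of the reused buffer in Python. It is only a sketch of the idea described above, not glibc's code: the helper name load and the 9-byte buffer size are assumptions made for illustration.

```python
# Toy model of the reused scratch buffer; NOT glibc's actual code.
# `load` and the buffer size are hypothetical names/values for illustration.
buf = bytearray(9)  # big enough for the longest attempt, "ihgfedcba"

def load(buf, text):
    """Copy text to the start of buf and return its valid length.

    Bytes past the returned length are deliberately left untouched,
    mirroring the "doesn't bother overwriting the rest" behaviour.
    """
    data = text.encode("ascii")
    buf[:len(data)] = data  # same-length slice assignment, buf stays 9 bytes
    return len(data)

first_len = load(buf, "ihgfedcba")   # first attempt, 9 valid bytes
first_view = bytes(buf)
second_len = load(buf, "hgfedcba")   # second attempt, 8 valid bytes
second_view = bytes(buf)             # the 9th byte is stale

print(first_len, first_view)
print(second_len, second_view)
```

The second print shows the whole buffer, including the stale final byte. Code that respects second_len never looks at it.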

Now, "[^ ]" doesn't advance one character at a time: it advances one
"collation element" at a time. In Norwegian Bokmål, "aa" is a collation
element.
The bug is that, when looking for collation elements, glibc forgets that
the length is 8. So "[^ ]" is taken to match "aa".  After every step
glibc has to note down its state.  In this case it says "I have matched
7 bytes and the next element is 2 bytes, so I'll note down my state
after matching 9 bytes". Of course the array of remembered states only
has room for 8 states, so the memory after the array is corrupted.
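The faulty step can be sketched in Python as well. Again this is a simplified model under stated assumptions: COLLATION_ELEMENTS and element_len_buggy are made-up names, and the toy locale has "aa" as its only multi-byte collation element. Python would raise an exception instead of corrupting memory, so the sketch only computes the out-of-range position.

```python
# Simplified model of the bug; names are hypothetical, not glibc's.
COLLATION_ELEMENTS = [b"aa"]  # assumed: only multi-byte element in this toy locale

def element_len_buggy(buf, pos):
    """Length of the collation element starting at pos.

    Bug being modelled: the lookup reads the raw buffer and never
    consults the valid length, so it can see stale bytes.
    """
    for elem in COLLATION_ELEMENTS:
        if bytes(buf[pos:pos + len(elem)]) == elem:
            return len(elem)
    return 1  # fall back to a single byte

buf = bytearray(b"hgfedcbaa")  # 8 valid bytes plus one stale "a"
valid_len = 8

pos = 7                                   # 7 bytes matched so far
step = element_len_buggy(buf, pos)        # sees "aa" thanks to the stale byte
next_pos = pos + step                     # position whose state would be recorded
past_end = next_pos > valid_len           # True: no slot exists for this position

print(step, next_pos, past_end)
```

In C, recording a state for a position beyond the valid length is a write past the end of the array, which is where the corruption comes from.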

The fix is to remember the length when looking for collation elements.
glibc still checks if there is an "aa", but the answer now is "no,
because there's not enough data".
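In the same toy model, the fix amounts to passing the valid length into the lookup and refusing any element that would extend past it. The names element_len and COLLATION_ELEMENTS are hypothetical, and this is a sketch of the idea, not the actual glibc patch.

```python
# Sketch of the fixed lookup in the toy model; not the real glibc patch.
COLLATION_ELEMENTS = [b"aa"]  # assumed: only multi-byte element in this toy locale

def element_len(buf, pos, valid_len):
    """Length of the collation element at pos, bounded by valid_len."""
    for elem in COLLATION_ELEMENTS:
        end = pos + len(elem)
        # Only accept the element if all of its bytes are valid input.
        if end <= valid_len and bytes(buf[pos:end]) == elem:
            return len(elem)
    return 1  # not enough data for a longer element: single byte

buf = bytearray(b"hgfedcbaa")  # 8 valid bytes plus one stale "a"

short_step = element_len(buf, 7, 8)  # stale byte is out of bounds, so no "aa"
full_step = element_len(buf, 7, 9)   # with 9 valid bytes, "aa" is a real match

print(short_step, full_step)
```

With a valid length of 8 the lookup advances by one byte and stops exactly at the end of the input. A real "aa" inside the valid data is still recognized.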

Paolo


