bug-glibc

Nasty consequence of fclose followed by ferror


From: Garance A Drosihn
Subject: Nasty consequence of fclose followed by ferror
Date: Wed, 22 Nov 2000 17:32:40 -0500

Hi.

I was porting code which is used on several OSes (FreeBSD, SunOS,
Solaris, AIX, IRIX) to Red Hat Linux 7.0, and ran into a number
of odd bugs which were time-consuming to track down.

I eventually found that the basic cause for all those bugs was
in the code I was porting.  It has:
    fclose(fp);
    if (ferror(fp)) { ... do stuff ... }

Please note I am quite happy to admit this code is wrong.  Also
note I did not write this code; I am just porting it from other
places.  I do not need any lecture on why this code is bad.  I
can read the "Single Unix Specification" as well as anyone else.
I would be perfectly happy if that call to ferror would cause a
SEGV or some other disaster -- as long as the disaster is at
THAT call and not far-far-away from it.

The trick is that the call to ferror with the freed FILE pointer
causes random memory to be overwritten.  When tracking down
this problem, I first had calls to inet_aton() which would
fail.  I also had some calls to syslog which were generating
garbage messages.  When I commented out the call to inet_aton(),
I had failures in some other system routine (which I forget
now).  Debugging was complicated by the fact that sometimes
everything would work fine, and other times the process in
question would just disappear on me (in the middle of one
of these system routines).

Eventually I commented out enough things that the failing
call was another fopen, which at least got me on the right
track.  I then used that to trace the problem back to the bad
code above.  As I sit here writing
this message, I have already fixed this code I am porting,
and it is working fine (and on all the above platforms).
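
For anyone who hits the same pattern, here is a minimal sketch of one
portable ordering: query ferror while the FILE pointer is still valid,
then close, and check the return value of fclose as well.  The helper
name close_checked is hypothetical, and this is not necessarily the
exact change made to the code being ported.

```c
#include <stdio.h>

/*
 * Hypothetical helper (name is an assumption, not from the ported code).
 * Checks the stream's error indicator BEFORE closing, then closes it.
 * Returns 0 if no error was pending and the close succeeded, -1 otherwise.
 */
static int close_checked(FILE *fp)
{
    /* ferror is only defined while fp still refers to an open stream */
    int had_error = ferror(fp);

    /* fclose flushes buffered output, so it can report an error too */
    int close_failed = (fclose(fp) == EOF);

    /* fp is invalid from here on and must not be passed to stdio again */
    return (had_error || close_failed) ? -1 : 0;
}
```

A caller would replace the fclose/ferror pair with something like
"if (close_checked(fp) != 0) { ... do stuff ... }".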

However, for the sanity of other people who might run into
this problem, I was wondering if there was some simple and
inexpensive change which could be made to fclose or ferror
which would make debugging this much less painful.

My first thought would be ferror might do some sanity checks
of the area pointed to, but I can understand if this might
be too expensive (performance-wise).

When I reported this to Red Hat, they said that fclose had to
obtain a lock for the pointer.  To me, this implies it is
picking up some value from the file-pointer, and doing a
lock based on that value.  Perhaps fclose could just trash
that value, such that any attempt to call lock with that
value (assuming there is such a value) will fail.
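
The same "trash the value" idea can be approximated in application
code without any change to glibc.  The sketch below uses a hypothetical
helper, fclose_and_clear, which closes the stream and overwrites the
caller's pointer with NULL.  The assumption is that a later misuse such
as ferror(fp) then fails at that call on most platforms, rather than
touching freed memory; the C standard does not guarantee this.

```c
#include <stdio.h>

/*
 * Hypothetical helper (name is an assumption).
 * Closes the stream that *fpp refers to and sets *fpp to NULL so the
 * stale pointer cannot be reused by accident.  Calling it again on an
 * already-cleared pointer does nothing and returns 0.
 * Returns the result of fclose: 0 on success, EOF on failure.
 */
static int fclose_and_clear(FILE **fpp)
{
    int rc = 0;

    if (fpp != NULL && *fpp != NULL) {
        rc = fclose(*fpp);
        *fpp = NULL;   /* clear even on failure: the stream is gone either way */
    }
    return rc;
}
```

Usage is "fclose_and_clear(&fp);" in place of "fclose(fp);".  This only
protects the one pointer variable passed in; copies of the pointer held
elsewhere are still stale.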

As it is, that call to lock apparently writes over random
locations in memory.  In my case, those random changes to
other memory can (and do sometimes) cause fatal errors
very far away from the real problem.

So, it would be nice if some simple change could be done,
which would make this fail closer to the real problem code.
I am sorry I do not have the time to suggest a more specific
patch, but I spend most of my programming time on the platforms
listed above.  I am not set up to debug changes to glibc, as
much as I like to contribute code to any open-source project.

If it is infeasible to improve this, that is fine too.  Just
spare me any lectures on the code.  I know the code is wrong,
and that glibc does not "need" to do anything different than
it already does.  I just think it would be helpful to other
developers if the error could be moved closer to the bad code.

--
Garance Alistair Drosehn            =   address@hidden
Senior Systems Programmer           or  address@hidden
Rensselaer Polytechnic Institute    or  address@hidden


