bug-glibc

glibc 2.3.2 pthread_create() mixed with fork() - problem with free()


From: Burton M. Strauss III
Subject: glibc 2.3.2 pthread_create() mixed with fork() - problem with free()
Date: Fri, 6 Jun 2003 13:54:00 -0500

First off, I realize that combining pthread_create() and fork() isn't
recommended.  It is, however, the only way to accomplish a specific task,
and, judging by the comments on the net, fairly common.  I therefore expect
this problem to occur frequently and to cause problems for many others.  So
if nothing else, this is documentation...

The product I work on creates a master thread which runs detached.  There
are a number of other service threads created to perform specific tasks,
such as data collection and web serving.  All threads share an internal data
structure, with mutexes used to protect updates.  The startup progression is:

exec program

which fork()s to create the main thread:

'exec thread'
  -> main thread

Which detaches from the terminal to become:

main thread (detached)

The main thread then creates (pthread_create) the worker threads.

main thread
  -> data1 thread
  -> data2 thread
  -> resolver thread
  -> web server thread
etc.

So to this point, we have not actively mixed fork() and pthread_create().

However, in the web server, once it has been determined that the request is
for a simple, static report, fork() is used.  This allows the report to be
created from a snapshot of the internal tables.  By virtue of copy-on-write,
the cost of the static copy is minimal.  The report runs, returns data to
the user, and the forked process exits, e.g.:

main thread
  -> data1 thread
  -> data2 thread
  -> resolver thread
  -> web server thread
       -> report process (fork())
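
A minimal sketch of the snapshot idea above, not the product's actual code:
the child writes to its copy of a table and the parent checks that its own
copy is unchanged.  The names shared_table and child_update_is_private are
hypothetical, and the real tables are of course larger than one buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared internal tables. */
static char shared_table[64] = "original";

/*
 * Fork a child, let it overwrite its copy of shared_table with child_value,
 * then check in the parent that the original contents are untouched.
 * Returns 0 if the child's write stayed private, -1 otherwise.
 */
int child_update_is_private(const char *child_value)
{
    assert(child_value != NULL && strlen(child_value) < sizeof shared_table);

    fflush(NULL);               /* do not duplicate pending stdio output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's copy-on-write page only. */
        strcpy(shared_table, child_value);
        _exit(strcmp(shared_table, child_value) == 0 ? 0 : 1);
    }

    /* Parent: wait for the report process, then inspect our own copy. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return strcmp(shared_table, "original") == 0 ? 0 : -1;
}
```

Calling child_update_is_private("changed in child") from the web server
thread's position in the diagram should return 0 if the snapshot behaves as
described.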

Unfortunately - and I'll be the first to admit this is a bug - deep in the
code for the report, a common routine of the 'if we haven't found x out yet,
inquire and update the data structure' kind is invoked.  Because this runs in
the fork()ed static copy, the expected behavior is:

   1. the data structure is updated in the static copy.
   2. the report - built from the static copy - should reflect the data.
   3. the main and worker threads (pthread_create()ed) would never see the
      update.

However, the specific field being updated is a pointer to a string.  And
(properly if called from one of the worker threads), the original value is
free()ed before strdup() is used to create the new value.

In the previous environments (FreeBSD, Linux, Mac(Darwin), etc.) this has
not caused a problem.  To the best of my knowledge, the Linux environments
were all based on glibc 2.2.x (e.g. RedHat Linux 8.0 glibc 2.2.93, etc.);
FreeBSD and Darwin ship their own C libraries rather than glibc.

Under RedHat Linux 9 (glibc 2.3.2) (and if one updates the RH8 glibc to
2.3.2), a problem surfaces.  The free() appears to corrupt the memory
allocator's state, causing the fork()ed process to segfault at random
times and locations.  I'm Ivory soap sure (99.44%) that the free() is the
problem - if I change the code to skip the free() in the fork()ed child, the
'random' problems ALL disappear.
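
A sketch of that workaround, under assumptions: the struct host_entry type,
the in_forked_child flag and both function names are hypothetical and only
illustrate the free()-then-strdup() update with the free() skipped in the
fork()ed child.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag: set to 1 in the child immediately after fork(). */
static int in_forked_child = 0;

/* Hypothetical record with a lazily resolved string field. */
struct host_entry {
    char *name;
};

/* Call this first thing in the fork()ed report process. */
void mark_forked_child(void)
{
    in_forked_child = 1;
}

/*
 * Replace entry->name with a copy of new_name.
 * Returns 0 on success, -1 if strdup() fails (entry is left unchanged).
 */
int update_name(struct host_entry *entry, const char *new_name)
{
    assert(entry != NULL && new_name != NULL);

    char *copy = strdup(new_name);
    if (copy == NULL)
        return -1;

    if (!in_forked_child)
        free(entry->name);      /* normal path in the worker threads */
    /* In the child the old string is deliberately leaked; the report
       process exits shortly, so the leak is bounded. */

    entry->name = copy;
    return 0;
}
```

The leak in the child is the price of not touching the allocator's free path
there; it does not explain why the free() misbehaves.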


I note this from the glibc changelog:

Version 2.3

<snip />

* The malloc functions were completely rewritten by Wolfram Gloger based
  on Doug Lea's malloc-2.7.0.c.


So I'm not surprised to find different behavior in edge/undefined cases.


My working hypothesis is, therefore, that something in the new malloc() et
al. code is sharing allocator state with the child regardless of how it was
created: pthread_create(), which should share memory, or fork(), which
should not.

I think this is a bug.  According to the definition of fork() the sharing
should not occur:

From man fork:

       fork  creates a child process that differs from the parent process only
       in its PID and PPID, and in the fact that resource utilizations are set
       to 0.  File locks and pending signals are not inherited.

       Under Linux, fork is implemented using copy-on-write pages, so the only
       penalty incurred by fork is the time and memory required  to  duplicate
       the parent's page tables, and to create a unique task structure for the
       child.

I've looked at pthread_atfork, but according to its man page, it's really
aimed at resolving cross-fork() mutex issues:

       To understand the purpose of pthread_atfork, recall that fork(2)
       duplicates the whole memory space, including mutexes in their current
       locking state, but only the calling thread: other threads are not
       running in the child process.  The mutexes are not usable after the
       fork and must be initialized with pthread_mutex_init in the child
       process.  This is a limitation of the current implementation and might
       or might not be present in future versions.

Nothing in there indicates anything about memory allocation routines.
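
For reference, a sketch of the mutex handling that man page describes.  It
covers an application mutex only and says nothing about the allocator; the
names table_lock, install_fork_handlers and fork_and_lock_in_child are
hypothetical.  The child handler re-initializes the mutex as the man page
text says; some code unlocks it in the child instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical mutex protecting the shared internal tables. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): take the lock so that no
   other thread holds it at the instant the memory image is copied. */
static void prepare(void)
{
    pthread_mutex_lock(&table_lock);
}

/* Runs in the parent after fork(): release the lock again. */
static void in_parent(void)
{
    pthread_mutex_unlock(&table_lock);
}

/* Runs in the child after fork(): re-initialize, per the man page. */
static void in_child(void)
{
    int rc = pthread_mutex_init(&table_lock, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Register the three handlers; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(prepare, in_parent, in_child);
}

/* Fork and check that the child can take and release table_lock.
   Returns 0 if it could, -1 otherwise. */
int fork_and_lock_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int ok = pthread_mutex_lock(&table_lock) == 0 &&
                 pthread_mutex_unlock(&table_lock) == 0;
        _exit(ok ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

install_fork_handlers() would be called once at startup, before any worker
thread can reach the fork() in the web server.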


There may be nothing that can be done.  It may be that you just document
the abnormality and move on, or maybe there's a fix?


-----Burton

US-based commercial support for ntop:
     http://www.ntopsupport.com
     mailto:address@hidden

Search the ntop mailing lists at gmane:
     http://search.gmane.org

HowTo Ask for Help at
     http://snapshot.ntop.org/faq.php#83




