bug in thread support

bug-glibc

From:	Balazs Scheidler
Subject:	bug in thread support
Date:	Sun, 21 Oct 2001 14:22:21 +0200
User-agent:	Mutt/1.2.5i

Hi,

I sent this information and an example program to the Linux kernel
folks, but they responded that this must be a libc bug instead, so I'm
sending it to you. (The thread on the linux-kernel mailing list should
give you further background beyond this message.)

So the problem: we are developing a massively multithreaded application.
This application sends syslog() messages from its threads. The problem I'm
encountering seems to be related to SIGPIPE handling (either the kernel
signal code, the libc signal code or the linuxthreads signal code).

Our application starts a new thread for each new TCP session. When the
remote end closes its socket, writing to the socket may cause a SIGPIPE to
be delivered and EPIPE to be returned from write(). If this SIGPIPE happens
at about the same time as a syslog() libc call, a segmentation fault occurs.
Since core dumping of multithreaded programs does not work reliably, I
implemented a quick&dirty backtrace function, which dumps the stack when a
signal occurs. (See the attached test program.)

My backtrace function reports that the SIGSEGV occurs at virtual address
0x1:

address@hidden:~$ cc -g -lpthread stressthreads.c 
address@hidden:~$ ./a.out 
Signal (11) received, stackdump follows; eax='ffffffe0', ebx='0000001d', 
ecx='bc5ff96c', edx='00000400', eip='00000001'
retaddr=0x1, ebp=0xbc5ff944
retaddr=0x8048a2a, ebp=0xbc5ffd74
retaddr=0x4001bc9f, ebp=0xbc5ffe34
address@hidden:~$ gdb a.out 
GNU gdb 19990928
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb) info line *0x8048a2a 
Line 80 of "stressthreads.c" starts at address 0x8048a12 <thread_func+118>
   and ends at 0x8048a2d <thread_func+145>.
(gdb) l stressthreads.c:80
75      #endif
76      
77        memset(buf, 'a', sizeof(buf));
78        for (i = 0; i < 1024; i++)
79          {
80            write(fd, buf, sizeof(buf));
81          }
82        close(fd);
83        //syslog(LOG_DEBUG, "thread stopped...%p\n", pthread_self());
84        free(arg);
(gdb) x/2i 0x8048a25
0x8048a25 <thread_func+137>:    call   0x8048680 <write>
0x8048a2a <thread_func+142>:    add    $0x10,%esp

So the virtual address 0x8048a2a is where the write() call returns.

The attached test program reproduces the SIGSEGV, although the time needed
to do this depends on whether you are using an SMP or a non-SMP kernel. An
SMP kernel with more than a single processor crashes within 1 second.

Some instructions on how to use the attached test programs:
1) stressthreads.c is the server, which crashes. Compile it with
     gcc stressthreads.c -lpthread

   and run it. It binds itself to 0.0.0.0:10000 and listens for
   incoming connections. For each connection it will syslog() a message and
   write 1MB of data to the opened socket. The syslog() call is protected by
   a mutex (which I don't think is necessary, at least glibc seems to do
   locking on its own).

2) test-zorp.py is a small Python script that starts several parallel
   threads, each of which connects to the server, reads 1024 bytes of data,
   and closes the connection. (This will cause a nice SIGPIPE in the server
   process.)

   Since this script was only put together to reproduce the problem, no
   argument parsing is done. You will need to adjust the IP address of the
   server at the end of the script (test() function call.)
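The mutex around syslog() mentioned in item 1 can be sketched as follows.
This is not the code from stressthreads.c; locked_log, log_lock and the call
counter are names invented for this example, and the counter exists only so
the wrapper returns something observable.

```c
#include <pthread.h>
#include <syslog.h>

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long log_calls;

/* Serialize syslog() calls behind one process-wide mutex.
 * Returns the number of messages logged so far (illustrative only). */
unsigned long locked_log(int priority, const char *msg)
{
    unsigned long n;

    pthread_mutex_lock(&log_lock);
    syslog(priority, "%s", msg);   /* never pass msg as the format string */
    n = ++log_calls;
    pthread_mutex_unlock(&log_lock);
    return n;
}
```

Each session thread would call locked_log(LOG_DEBUG, "thread started")
instead of calling syslog() directly.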

The application sets the SIGPIPE handler to a dummy function doing nothing
but a return. (Earlier it was SIG_IGNed, but since I suspected that to be
the source of the problems, I changed the code to use an empty function.)

The crash does _NOT_ occur if the threads do not send log messages via
syslog(). I implemented my own syslog() routines for the time being, and
with them the crash doesn't occur. I tried to narrow down the problem even
more, but simply changing SIGPIPE handlers during the thread execution (this
is what syslog() is doing) was not enough to trigger it.

There are several defines changing the behaviour of stressthreads.c:

BACKTRACE when #defined, it uses my backtrace function reporting the exact
          location of the SIGSEGV, otherwise SIGSEGV is not masked.
SYSLOG    when #defined, the threads send info to syslog. The crash doesn't
          occur with this undefined.
SIGACTION when #defined, it uses SIGPIPE set/reset code similar to what is
          found in the syslog() function. The crash didn't occur for me.

The environment I have here is Debian GNU/Linux potato:

ii  libc6          2.1.3-18       GNU C Library: Shared libraries and Timezone
address@hidden:~$ uname -a
Linux hugefw 2.2.19 #2 SMP Thu Sep 27 17:23:56 CEST 2001 i686 unknown

(hugefw has two PIII 800 MHz processors)

If you need more information, please tell me; I'd be glad to help.

Thanks in advance.
-- 
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1

stressthreads.c
Description: Text Data

test-zorp.py
Description: Text document
