[Gluster-devel] gsyncd deadlocks in log_raise_exception


From: 蒋凯
Subject: [Gluster-devel] gsyncd deadlocks in log_raise_exception
Date: Sun, 26 Jan 2014 10:46:41 +0000

Hi,

Normally, when gsyncd encounters an exception, it logs the exception and restarts. In some cases, however, it deadlocks instead. This happens in my environment about once a week: replication stops, but the geo-replication status command still shows OK.

I checked the processes on the master. The gsyncd process hangs with the backtrace below, and its ssh subprocess does not terminate. When I manually kill the ssh subprocess with signal 9 (SIGKILL), geo-replication exits and restarts.

#3 file '/usr/lib64/python2.6/subprocess.py', in '_eintr_retry_call'
#7 file '/usr/lib64/python2.6/subprocess.py', in 'wait'
#11 file '/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py', in 'log_raise_exception'
#14 file '/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py', in 'twrap'
#19 file '/usr/lib64/python2.6/threading.py', in 'run'
#22 file '/usr/lib64/python2.6/threading.py', in '__bootstrap_inner'
#25 file '/usr/lib64/python2.6/threading.py', in '__bootstrap'

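I think the problem is that log_raise_exception calls Popen.wait here, which can deadlock when the child writes more output to a pipe than the OS pipe buffer can hold: the child blocks on the write while the parent blocks in wait. See the subprocess documentation at http://docs.python.org/2/library/subprocess.html, which recommends using Popen.communicate instead.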
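
To illustrate, here is a minimal sketch of the pattern the documentation recommends. It is not taken from syncdutils.py: the helper name run_and_collect and the CHILD_CODE stand-in for the ssh subprocess are hypothetical, and I am assuming the child's stdout and stderr are both pipes.

```python
import subprocess
import sys

# Hypothetical child standing in for the ssh subprocess: it writes 1 MiB
# to stdout, far more than a typical OS pipe buffer (often 64 KiB on Linux).
CHILD_CODE = "import sys; sys.stdout.write('x' * (1024 * 1024))"


def run_and_collect(argv):
    """Run argv, drain its pipes, and return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    # communicate() reads both pipes until EOF and then reaps the child,
    # so the child never stays blocked on a full pipe. Calling proc.wait()
    # here without reading the pipes first could hang forever.
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    rc, out, err = run_and_collect([sys.executable, "-c", CHILD_CODE])
    print("exit code %d, %d bytes of stdout" % (rc, len(out)))
```

If proc.communicate() were replaced by proc.wait() in this sketch, I would expect the child to block on its write once the pipe fills, and the parent to block in wait, which matches the backtrace above.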

Thanks.