
Re: [Gluster-devel] rpc problems when using syncops in callbacks


From: Krishnan Parthasarathi
Subject: Re: [Gluster-devel] rpc problems when using syncops in callbacks
Date: Mon, 29 Apr 2013 11:47:33 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

Hi Fog,

RPC callbacks are executed in the epoll thread[1]. Calling synctask_yield there blocks the epoll thread until the corresponding "wake" is called. Most likely, the code that calls "wake" is itself triggered by a network event, which won't be 'noticed' until the epoll thread is unblocked. This is a classic deadlock.
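To make the cycle concrete, here is a minimal sketch in plain C with pthreads. It does not use the GlusterFS API: a pipe stands in for the network socket, a single reader thread stands in for the epoll thread, and task_yield/task_wake are hypothetical stand-ins for the yield/wake pair. The 200 ms timeout exists only so the demo terminates; a real yield would simply hang.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a synctask: yield blocks until wake. */
struct task { pthread_mutex_t lock; pthread_cond_t cond; bool woken; };

struct ctx {
    int rfd, wfd;            /* pipe ends: stand-in for the network socket */
    struct task task;
    bool block_in_loop;      /* true: yield on the event thread itself */
    int result;              /* 0 = woken in time, -1 = timed out */
    pthread_t worker;
    bool worker_started;
};

static void post_event(int fd, char ev) {
    if (write(fd, &ev, 1) != 1) perror("write");
}

static void task_wake(struct task *t) {
    pthread_mutex_lock(&t->lock);
    t->woken = true;
    pthread_cond_signal(&t->cond);
    pthread_mutex_unlock(&t->lock);
}

/* Returns 0 if woken, -1 on timeout (the timeout is demo-only). */
static int task_yield(struct task *t, long timeout_ms) {
    struct timespec ts;
    int ret = 0;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += timeout_ms / 1000;
    ts.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) { ts.tv_sec++; ts.tv_nsec -= 1000000000L; }
    pthread_mutex_lock(&t->lock);
    while (!t->woken && ret == 0)
        ret = pthread_cond_timedwait(&t->cond, &t->lock, &ts);
    bool ok = t->woken;
    pthread_mutex_unlock(&t->lock);
    return ok ? 0 : -1;
}

/* The "syncop": send a request, then block until its reply wakes us. */
static void *nested_call(void *arg) {
    struct ctx *c = arg;
    post_event(c->wfd, 'W');              /* reply arrives as a 'W' event */
    c->result = task_yield(&c->task, 200);
    post_event(c->wfd, 'Q');              /* ask the loop to stop */
    return NULL;
}

/* Stand-in for the epoll thread: the only place events are noticed. */
static void *event_loop(void *arg) {
    struct ctx *c = arg;
    char ev;
    while (read(c->rfd, &ev, 1) == 1) {
        if (ev == 'R') {                  /* outer reply: run its callback */
            if (c->block_in_loop) {
                nested_call(c);           /* blocks the loop: 'W' goes unread */
            } else {
                int rc = pthread_create(&c->worker, NULL, nested_call, c);
                assert(rc == 0);
                c->worker_started = true;
            }
        } else if (ev == 'W') {
            task_wake(&c->task);
        } else if (ev == 'Q') {
            break;
        }
    }
    return NULL;
}

static int run_scenario(bool block_in_loop) {
    struct ctx c = { .block_in_loop = block_in_loop, .result = -1 };
    int fds[2];
    pthread_t loop;
    if (pipe(fds) != 0) return -2;
    c.rfd = fds[0];
    c.wfd = fds[1];
    pthread_mutex_init(&c.task.lock, NULL);
    pthread_cond_init(&c.task.cond, NULL);
    int rc = pthread_create(&loop, NULL, event_loop, &c);
    assert(rc == 0);
    post_event(c.wfd, 'R');               /* deliver the outer reply */
    pthread_join(loop, NULL);
    if (c.worker_started) pthread_join(c.worker, NULL);
    pthread_cond_destroy(&c.task.cond);
    pthread_mutex_destroy(&c.task.lock);
    close(fds[0]);
    close(fds[1]);
    return c.result;
}
```

Calling run_scenario(true) times out and returns -1, because the only thread that could read the 'W' event is the one waiting for it. run_scenario(false) returns 0, because the wait happens on a separate thread and the loop stays free to deliver the wake.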

You could attach gdb to the hung process and check if synctask_yield was called on the epoll thread. For further analysis, you might want to paste the output of "thread apply all bt full" from gdb, attached to the hung process.

[1] "epoll thread" is short for the thread executing epoll(), 'listening' for network events.

HTH,
krish

On 04/26/2013 03:10 PM, fog - wrote:
Hello everyone,

I am trying to use syncops in a custom translator to keep my code at least borderline readable, but I am having limited success.

Problem Symptoms:
Using a syncop in a regular fop is fine. However, in a callback it causes a 'freeze' (synctask_yield called by the SYNCOP macro doesn't return).

What seems to be the Problem:
Looking at the traces, there is no corresponding trace from rpc_clnt_reply_init on the client to the trace from rpcsvc_submit_generic on the server. In other words, the rpc reply gets sent but isn't correctly received. Obviously this is not really a networking problem but something else... I'd guess it's a deadlock somewhere on the client?
From the point of the syncop call onwards the client doesn't 'get' any rpc replies any more (the next GlusterFS Handshake sent by the client, which is received by the server and replied to, leads to a disconnection accordingly).

Again: this problem only occurs when calling a syncop from a callback function inside my translator. If I call the same syncop in a fop call, it completes fine.

I hope you can make sense out of the above problem description.
Thanks for your time ~


