
Re: [Gluster-devel] add-brick crashes client


From: Amar Tumballi
Subject: Re: [Gluster-devel] add-brick crashes client
Date: Fri, 03 Aug 2012 14:27:34 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1

On 08/03/2012 02:23 PM, Emmanuel Dreyfus wrote:
On Fri, Aug 03, 2012 at 05:13:02AM +0000, Emmanuel Dreyfus wrote:
It seems there is a race condition here. Can someone knowledgeable
confirm?

I tried this. It does not crash anymore, but the volume gets broken,
with lookups returning EINVAL (log below). It is therefore probably the
wrong fix, but hints are welcome.

--- syncop.c.orig       2012-08-03 08:02:35.000000000 +0200
+++ syncop.c    2012-08-03 10:43:28.000000000 +0200
@@ -116,8 +116,10 @@
          /* Do not trust the pointer received. It may be
             wrong and can lead to crashes. */

          task = synctask_get ();
+       assert(task != NULL);
+
          task->ret = task->syncfn (task->opaque);
         if (task->synccbk)
                 task->synccbk (task->ret, task->frame, task->opaque);

@@ -211,8 +213,14 @@

          newtask->ctx.uc_stack.ss_sp   = newtask->stack;
          newtask->ctx.uc_stack.ss_size = env->stacksize;

+       /*
+        * synctask_wrap does not trust its argument, and
+        * uses synctask_get()
+        */
+       synctask_set (newtask);
+
          makecontext (&newtask->ctx, (void *) synctask_wrap, 2, newtask);

         newtask->state = SYNCTASK_INIT;
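
The ordering issue the patch works around can be illustrated outside GlusterFS. The following is a minimal sketch, not the real syncop.c code: `demo_task`, `demo_task_set`, `demo_task_get`, `demo_task_wrap` and `run_demo_task` are hypothetical names, and a plain thread-local pointer is assumed in place of whatever storage synctask_set()/synctask_get() actually use. It only shows that a wrapper which looks up the "current task" itself will dereference NULL unless that pointer is registered before the context first runs. It does not say where the real code ought to register it.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

/* Hypothetical stand-in for a synctask; not the GlusterFS structure. */
struct demo_task {
    ucontext_t ctx;      /* context the task body runs in */
    ucontext_t caller;   /* context resumed when the task body returns */
    int (*fn)(void *);   /* work function */
    void *opaque;        /* argument passed to fn */
    int ret;             /* result of fn */
    char *stack;         /* private stack for ctx */
};

/* Assumed per-thread "current task" slot. */
static _Thread_local struct demo_task *current_task;

static void demo_task_set(struct demo_task *t) { current_task = t; }
static struct demo_task *demo_task_get(void) { return current_task; }

/* Entry point of the new context. Like the wrapper in the patch, it
 * ignores any argument and asks for the current task instead. */
static void demo_task_wrap(void)
{
    struct demo_task *t = demo_task_get();

    /* Fails if nobody registered the task before the switch. */
    assert(t != NULL);

    t->ret = t->fn(t->opaque);
    /* Returning resumes uc_link, i.e. t->caller. */
}

static int demo_double(void *opaque) { return 2 * *(int *)opaque; }

/* Runs demo_double(input) inside its own context; returns -1 on setup failure. */
int run_demo_task(int input)
{
    struct demo_task t = {0};
    size_t stacksize = 64 * 1024;

    t.stack = malloc(stacksize);
    if (t.stack == NULL)
        return -1;
    t.fn = demo_double;
    t.opaque = &input;

    if (getcontext(&t.ctx) != 0) {
        free(t.stack);
        return -1;
    }
    t.ctx.uc_stack.ss_sp = t.stack;
    t.ctx.uc_stack.ss_size = stacksize;
    t.ctx.uc_link = &t.caller;
    makecontext(&t.ctx, demo_task_wrap, 0);

    /* Register the task before the first switch into its context. */
    demo_task_set(&t);
    if (swapcontext(&t.caller, &t.ctx) != 0)
        t.ret = -1;
    demo_task_set(NULL);

    free(t.stack);
    return t.ret;
}
```

Calling `run_demo_task(21)` returns 42. Removing the `demo_task_set(&t)` call makes the assertion in `demo_task_wrap` fire, which mirrors the crash described above.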


[2012-08-03 10:46:03.709177] E [afr-common.c:3664:afr_notify] 0-pfs-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2012-08-03 10:46:03.825505] W [dht-layout.c:186:dht_layout_search] 1-pfs-dht: no subvolume for hash (value) = 4177819066
[2012-08-03 10:46:03.825652] E [dht-common.c:1372:dht_lookup] 1-pfs-dht: Failed to get hashed subvol for /manu
[2012-08-03 10:46:03.826315] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 12944: LOOKUP() /manu => -1 (Invalid argument)
[2012-08-03 10:46:03.827107] W [dht-layout.c:186:dht_layout_search] 1-pfs-dht: no subvolume for hash (value) = 4177819066


Looking at the logs, I think this is caused by bug 815227. Can you run a 'rebalance' operation and see whether everything returns to normal?

-Amar



