
Re: [Gluster-devel] Two annoying bugs in 3.2.5


From: Pranith Kumar K
Subject: Re: [Gluster-devel] Two annoying bugs in 3.2.5
Date: Tue, 20 Dec 2011 18:47:46 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20111115 Thunderbird/8.0

On 12/17/2011 12:40 AM, Emmanuel Dreyfus wrote:
> Pranith Kumar K <address@hidden> wrote:
>
>>> 2) When using AFR, if a peer goes down, processes that have I/O pending
>>> will see an error. Just retrying the same operation is fine, but that is
>>> a bit frustrating.
>>
>> Emmanuel,
>>       Could you give the test case for the afr issue.
> It is quite bold, I have not yet narrowed it down to something simple.
> My test case is building NetBSD. You can grab the tarballs here:
> http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/src.tgz
> http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/sharesrc.tgz
> http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/syssrc.tgz
> http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/gnusrc.tgz
>
> Unpack, then cd usr/src && ./build.sh -U release
> I wait for the build to actually start, then pkill glusterfsd on a
> replica, and the build stops because of an I/O error.

Emmanuel,
I should get a chance to take a closer look at this issue next week, if you are willing to wait. I suspect the following code. Could you apply the change below and check whether this is the issue you are hitting? Please also send me the client logs from the next run.

         struct iatt     *buf = NULL;
         struct iatt     *postparent = NULL;
         dict_t          **xattr = NULL;
+        afr_private_t   *priv = NULL;

         GF_ASSERT (local);
+        priv = this->private;

         buf = &local->cont.lookup.buf;
         postparent = &local->cont.lookup.postparent;
@@ -787,6 +789,9 @@ afr_lookup_build_response_params (afr_local_t *local, xlator_t *this)
         read_child = afr_read_child (this, local->cont.lookup.inode);
         gf_log (this->name, GF_LOG_DEBUG, "Building lookup response from %d",
                 read_child);
+        GF_ASSERT (afr_is_child_present (local->cont.lookup.child_success,
+                                         priv->child_count, read_child));
+        GF_ASSERT (local->cont.lookup.sources[read_child]);
         //honor the xattr set by data-self-heal
         if (!*xattr)
                 *xattr = dict_ref (local->cont.lookup.xattrs[read_child]);
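
For illustration only, the condition the two added assertions test can be sketched as a standalone C check. This is not the GlusterFS code: child_is_present, read_child_is_usable and assert_read_child_usable are hypothetical names, and the sketch assumes that the list of successful children holds child indices and is terminated by -1 when fewer than child_count children replied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the membership test: success_children holds the
 * indices of the children that answered the lookup. It is ASSUMED here to be
 * terminated by -1 when fewer than child_count children succeeded. */
static bool
child_is_present (const int32_t *success_children, int32_t child_count,
                  int32_t child)
{
        int32_t i;

        for (i = 0; i < child_count; i++) {
                if (success_children[i] == -1)
                        break;          /* end of the list of successes */
                if (success_children[i] == child)
                        return true;
        }
        return false;
}

/* True when read_child both answered the lookup and is marked as a source,
 * i.e. when both assertions in the patch would hold. */
static bool
read_child_is_usable (const int32_t *success_children, const int32_t *sources,
                      int32_t child_count, int32_t read_child)
{
        if (read_child < 0 || read_child >= child_count)
                return false;
        return child_is_present (success_children, child_count, read_child)
               && sources[read_child] != 0;
}

/* Assertion form, mirroring the shape of the two GF_ASSERT lines. */
static void
assert_read_child_usable (const int32_t *success_children,
                          const int32_t *sources, int32_t child_count,
                          int32_t read_child)
{
        assert (child_is_present (success_children, child_count, read_child));
        assert (sources[read_child]);
}
```

With two replicas where only child 1 replied, read_child_is_usable returns true for child 1 and false for child 0. If an assertion fails after a brick is killed, that would suggest the response is being built from a child that did not reply or is not a source.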

Pranith


