
[Gluster-devel] Re: more bugs (was Re: io-threads...)


From: Brent A Nelson
Subject: [Gluster-devel] Re: more bugs (was Re: io-threads...)
Date: Tue, 1 May 2007 16:27:37 -0400 (EDT)

Wow, it almost looked like the patch fixed the stat-prefetch issue, but see below. I was almost unable to get it to crash with du's or rm's on complex directories, whereas before it crashed fairly easily.

Also, I think it fixed a tiny anomaly that I had noticed but ignored. Previously, even without stat-prefetch, multiple du's on a complex directory could give slightly different total sizes (a few KB out of many GB). Now, there is no such fluctuation at all.

I WAS able to get a crash with one machine running du while a different machine removed files in the same area. The du machine is the one where the glusterfs client died (the first du completed; the second died). The glusterfs client left a backtrace in the log but no core, perhaps because I compiled with CFLAGS=-O3. See the attached backtrace.

Stat-prefetch did at least withstand a great deal more torture than before the patch, so the patch seems to be a significant improvement. Note that I haven't tried the new patch without stat-prefetch, so heavy testing might be able to kill it even then; I'm not sure.

Thanks,

Brent

PS Alas, there was no effect on the NFS reexport issue.
PPS The AFR client failover works pretty well, but I noticed something: the first attempt to access the glusterfs after losing contact with a glusterfsd is sometimes faulty (e.g., the first df may say it's not connected or report a smaller size for the volume, and catting a file may fail on the first try). The very next attempt succeeds, however.

On Tue, 1 May 2007, Anand Avati wrote:


I was wondering if you could describe patch-134 a little. I was curious whether it could be related to the stat-prefetch or the NFS reexport issues.

This was a bug in AFR which could have been triggered by anybody who used AFR and accessed a directory. The functions forming the reply path of a transaction are called through function pointers, and AFR's opendir reply callback prototype had an extra parameter; the callback dereferenced that pointer, which is a junk pointer. So far all of us were lucky that the dereferenced pointer happened to point to some allocated memory (though nothing was altered or used). It is very much possible that this could be related to the stat-prefetch issue.
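
Roughly, the failure mode looks like this (a minimal sketch, not the actual AFR code; the type and function names are made up for illustration):

#include <stdio.h>

/* The reply path invokes callbacks through a generic pointer type. */
typedef int (*reply_fn_t) (void *frame, int op_ret);

/* Buggy prototype: one extra parameter compared to reply_fn_t.
 * The caller never passes it, so 'extra' holds whatever junk is
 * left in that register or stack slot. */
static int
opendir_cbk (void *frame, int op_ret, int *extra)
{
        /* Dereferencing the junk pointer may happen to hit allocated
         * memory (and "work" by luck) or may segfault. */
        printf ("op_ret=%d extra=%d\n", op_ret, *extra);
        return 0;
}

int
main (void)
{
        reply_fn_t fn = (reply_fn_t) opendir_cbk;  /* prototype mismatch */
        fn (NULL, 0);  /* undefined behaviour: 'extra' is junk */
        return 0;
}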
The latest glusterfs codebase now prints a backtrace of a segfault in the log as well as dumping a core; next time you get a segfault, please pass on the core and/or the log. I do not see how NFS re-export could be affected, but you never know whether this could have triggered a side effect somewhere else.
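
For reference, the backtrace-on-segfault behaviour can be had with a handler along these lines (a rough sketch using glibc's execinfo, not the actual glusterfs code):

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void
segv_handler (int signum)
{
        void *frames[32];
        int   count;

        /* backtrace_symbols_fd writes straight to an fd and avoids
         * malloc, which matters inside a signal handler. */
        count = backtrace (frames, 32);
        backtrace_symbols_fd (frames, count, STDERR_FILENO);

        /* Restore the default action and re-raise, so the kernel
         * still dumps a core (subject to the core-size ulimit). */
        signal (signum, SIG_DFL);
        raise (signum);
}

int
main (void)
{
        signal (SIGSEGV, segv_handler);
        /* ... */
        return 0;
}

Note that whether a core actually appears also depends on the core-size ulimit on the client machine.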

I have done only very limited checking of NFS re-export. Once the next 1.3 release is done, I will do a more thorough check.

regards,
avati

--
ultimate_answer_t
deep_thought (void)
{
 sleep (years2secs (7500000));
 return 42;
}

Attachment: glusterfs-crash.txt
Description: Text document

