[Gluster-devel] Error: Transport endpoint is not connected


From: JV
Subject: [Gluster-devel] Error: Transport endpoint is not connected
Date: Thu, 23 Apr 2009 17:12:17 +0300
User-agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103)



Hello.

I am seeing an error when I run

find /mnt/glusterfs/ -type f -exec md5sum {} \; >/dev/null

It fails with:
md5sum: /mnt/glusterfs/2009/1/18/13/E4E4EF76/AF3A768D/F777FC5E: Transport endpoint is not connected

The problem does not seem to be tied to any single file: it has already failed twice on different files, each time after about 60 minutes. The file above exists on both storage nodes, the md5sums match, and the extended attributes are identical:

trusted.afr.g1b2=0sAAAAAAAAAAAAAAAA
trusted.afr.g2b2=0sAAAAAAAAAAAAAAAA
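
For reference, this is roughly how I read those attributes on the backend, run as root on each storage node (the backend path is my guess from the brick layout, since the afr xattrs name g1b2/g2b2, i.e. brick2):

# dump the trusted.afr.* attributes for the file on brick2 of each server
getfattr -d -m trusted.afr -e base64 /export/gdisk2/storage/2009/1/18/13/E4E4EF76/AF3A768D/F777FC5E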

The backend filesystem is ext4; XFS was failing earlier for some strange reason.


The only thing in the client log is this crash report:

pending frames:

patchset: 82394d484803e02e28441bc0b988efaaff60dd94
signal received: 6
configuration details:argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0rc8
[0xb8067400]
/lib/i686/cmov/libc.so.6(abort+0x188)[0xb7ede018]
/lib/i686/cmov/libc.so.6(__assert_fail+0xee)[0xb7ed55be]
/lib/i686/cmov/libpthread.so.0(pthread_mutex_lock+0x5d4)[0xb8013f54]
/usr/lib/libfuse.so.2[0xb75d3147]
/usr/lib/libfuse.so.2(fuse_session_process+0x26)[0xb75d4bf6]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/mount/fuse.so[0xb75eef31]
/lib/i686/cmov/libpthread.so.0[0xb80124c0]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xb7f916de]
---------

There are no errors in the storage server logs.

Systems:
Debian Stable,
kernel 2.6.29.1
gcc (Debian 4.3.2-1.1) 4.3.2
glusterfs 2.0.0rc8 built on Apr 20 2009 23:04:47
Repository revision: 82394d484803e02e28441bc0b988efaaff60dd94

fuse 2.7.4 (Debian package)

Configuration:
2 storage nodes (Dual Core, 2GB RAM, 3x1TB SATA)
1 client (Dual Core, 6GB RAM)

Storage config:
===========================================
volume gdisk1
  type storage/posix
  option directory /export/gdisk1/storage/
end-volume

volume brick1
  type features/posix-locks
  subvolumes gdisk1
end-volume

volume gdisk2
  type storage/posix
  option directory /export/gdisk2/storage/
end-volume

volume brick2
  type features/posix-locks
  subvolumes gdisk2
end-volume


volume gdisk3
  type storage/posix
  option directory /export/gdisk3/storage/
end-volume

volume brick3
  type features/posix-locks
  subvolumes gdisk3
end-volume


volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow *
  option auth.addr.brick2.allow *
  option auth.addr.brick3.allow *
  subvolumes brick1 brick2 brick3
end-volume


Client config:
===========================

volume g1b1
        type protocol/client
        option transport-type tcp/client
        option remote-host 10.10.70.2
        option remote-subvolume brick1
end-volume
volume g1b2
        type protocol/client
        option transport-type tcp
        option remote-host 10.10.70.2
        option remote-subvolume brick2
end-volume
volume g1b3
        type protocol/client
        option transport-type tcp
        option remote-host 10.10.70.2
        option remote-subvolume brick3
end-volume

volume g2b1
        type protocol/client
        option transport-type tcp/client
        option remote-host 10.10.70.3
        option remote-subvolume brick1
end-volume
volume g2b2
        type protocol/client
        option transport-type tcp
        option remote-host 10.10.70.3
        option remote-subvolume brick2
end-volume
volume g2b3
        type protocol/client
        option transport-type tcp
        option remote-host 10.10.70.3
        option remote-subvolume brick3
end-volume

volume replicate1
        type cluster/replicate
        subvolumes g1b1 g2b1
end-volume

volume replicate2
        type cluster/replicate
        subvolumes g1b2 g2b2
end-volume

volume replicate3
        type cluster/replicate
        subvolumes g1b3 g2b3
end-volume



volume distribute
  type cluster/distribute
  subvolumes replicate1 replicate2 replicate3
end-volume

volume readahead
  type performance/read-ahead
  option page-size 128kB        # 256KB is the default option
  option page-count 4           # 2 is default option
  option force-atime-update off # default is off
  subvolumes distribute
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 64MB             # default is 32MB
  option page-size 1MB               #128KB is default option
  option cache-timeout 2             # default is 1 second
  subvolumes readahead
end-volume


volume writebehind
  type performance/write-behind
  option aggregate-size 128KB # default is 0bytes
  option window-size 4MB    # default is equal to aggregate-size
  option flush-behind on    # default is 'off'
  subvolumes io-cache
end-volume


Is there anything I can do to help debug this?
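
I can also remount the client with debug-level logging if that would help, something like this (if I have the flags right for 2.0.0rc8; the volfile path is a placeholder for mine):

umount /mnt/glusterfs
# run the client with a separate debug log: -L sets the log level, -l the log file
glusterfs -f /etc/glusterfs/client.vol -l /var/log/glusterfs/client-debug.log -L DEBUG /mnt/glusterfs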

Also, it seems there have been some significant changes to the write-behind translator; as I understand it, aggregate-size is no longer used?
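
If so, should the writebehind volume on my client simply be reduced to something like this (keeping only the window-size and flush-behind options)?

volume writebehind
  type performance/write-behind
  option window-size 4MB
  option flush-behind on
  subvolumes io-cache
end-volume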

Thanks.
JV



