[Gluster-devel] crash test result: Input/output error


From: tapczan
Subject: [Gluster-devel] crash test result: Input/output error
Date: Wed, 08 Feb 2012 13:33:45 +0100
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0) Gecko/20120129 Thunderbird/10.0

This is my crash test scenario:

1. hosts
server1 - member of gluster volume
server2 - member of gluster volume
client1 - gluster storage activity - reads: ~10/s, writes: ~10/s
client2 - gluster storage activity - reads: ~10/s, writes: ~10/s

2. AFR gluster storage (tested 3.2.5 and 3.3beta2)
# gluster volume info

Volume Name: data
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: server1:/fs/data
Brick2: server2:/fs/data

3. storage /fs/data:
~ 300 000 files (size < 10KB)
~ 3 GB

4. crash test scenario
- server1 goes down
- clients get a few "Input/output error" messages (on reads and writes) and continue working - fine
- server1 recovers (after ~3 minutes)
- clients get a few "Input/output error" messages (on reads and writes) - fine
- access to gluster storage from the clients is blocked (self-healing in progress; it takes a few minutes with my hardware configuration)
- during this self-healing process server2 goes down
- the self-healing process is interrupted and clients regain access to gluster data
- server2 recovers and the real problems start
- clients: data is inaccessible, with a permanent "Input/output error" for files and directories

client1:
# ls -la a
ls: cannot access a: Input/output error

# ls -la
??????????  ? ?    ?         ?            ? a

server1:
# getfattr -d -m . a
# file: a
trusted.afr.data2-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.data2-client-1=0sAAAAAAAAAAAAAAAq
trusted.gfid=0sfdlzd6TeRxelnMeCG9ut/w==

server2:
# getfattr -d -m . a
# file: a
trusted.afr.data2-client-0=0sAAAAAAAAAAAAAAA1
trusted.afr.data2-client-1=0sAAAAAAAAAAAAAAAA
trusted.gfid=0sfdlzd6TeRxelnMeCG9ut/w==
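
For reference, the trusted.afr values above can be decoded by hand. The following is only a minimal sketch in Python, not a gluster tool. It assumes that each trusted.afr.* value is 12 bytes holding three big-endian 32-bit pending counters (data, metadata, entry), that the "0s" prefix is getfattr's marker for base64 output, and that client-0 maps to Brick1 (server1) and client-1 to Brick2 (server2). The helper names decode_afr_changelog and blames are made up for this illustration.

```python
import base64
import struct


def decode_afr_changelog(value):
    """Decode one trusted.afr.* value as printed by getfattr.

    Assumption: the value is 12 bytes, i.e. three big-endian 32-bit
    pending counters in the order data, metadata, entry.
    """
    if not value.startswith("0s"):
        raise ValueError("expected getfattr base64 output (prefix '0s')")
    raw = base64.b64decode(value[2:])
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def blames(xattrs, client):
    """Return True if this brick records pending operations against client."""
    return any(decode_afr_changelog(xattrs[client]).values())


# Values copied from the getfattr output above.
server1 = {
    "trusted.afr.data2-client-0": "0sAAAAAAAAAAAAAAAA",
    "trusted.afr.data2-client-1": "0sAAAAAAAAAAAAAAAq",
}
server2 = {
    "trusted.afr.data2-client-0": "0sAAAAAAAAAAAAAAA1",
    "trusted.afr.data2-client-1": "0sAAAAAAAAAAAAAAAA",
}

# Assumption: client-0 is Brick1 (server1), client-1 is Brick2 (server2).
C0 = "trusted.afr.data2-client-0"
C1 = "trusted.afr.data2-client-1"

server1_blames_server2 = blames(server1, C1)
server2_blames_server1 = blames(server2, C0)
mutual_accusation = server1_blames_server2 and server2_blames_server1

print("server1 about client-1:", decode_afr_changelog(server1[C1]))
print("server2 about client-0:", decode_afr_changelog(server2[C0]))
print("each brick blames the other:", mutual_accusation)
```

Under those assumptions, server1 records 42 pending entry operations against client-1 and server2 records 53 against client-0. Each brick marks the other as stale, so neither copy can be chosen as the self-heal source.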

clients /var/log/glusterfs/data.log:
[2012-02-08 13:24:16.837976] I [afr-self-heal-common.c:705:afr_mark_sources] 0-data2-replicate-0: split-brain possible, no source detected
[2012-02-08 13:24:16.838079] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 565416: LOOKUP() /a => -1 (Input/output error)


Issues like this make gluster unusable in a production system.

--
Robert


