[Gluster-devel] solutions for split brain situation

2009/9/16 Anand Avati <address@hidden>

> No, not really. In fact every other comment about glusterfs(d) reads like
> "this is a standard application regarding the fs, therefore it cannot be
> responsible for problem A or bug B". Now, if it is to be judged as one of many
> applications on the one hand, then it should be able to cope with situations
> that every standard application can cope with as well, such as other
> applications using the same fs.

glusterfs is a standard application regarding the fs, therefore it
cannot be responsible for problems showing up in the kernel. glusterfs
is not expected to work properly if you modify the backend export
directory directly bypassing the mountpoint. This is the baseline
premise for using glusterfs.

> _The_ advantage of the whole glusterfs concept is exactly that it is _no_ fs
> with its own special disk layout. It (should) run(s) on top of an existing
> fs that can be used just like a fs may be used - including backup (with rsync
> or whatever), restore and file operations of any kind.

glusterfs uses a disk based filesystem as its backend. This in no way
implies that it can share the backend with other applications and work
without problems. glusterfs needs _exclusive_ access to this export
directory. That is how it is designed to work. If you back up one
backend, you can restore it only as that very backend. What you are
trying to do is use one backend as another subvolume's backend. If
you expect that copying over the backend while skipping the xattrs,
and modifying those very files from the mountpoint at the same time,
will just work, then that expectation is misplaced. Please copy in
all your content only through the mountpoint.

> If subvolumes are indeed closed storages then they would be in no way
> different than nbd, enbd, whatever-nbd. For various reasons we don't want
> these solutions.

GlusterFS is surely not a solution where you can freely modify the
backend directly. For proper operation of the filesystem, the only
supported mode of usage is through the mountpoint. Whatever
modification you do with the backend is done at your own risk.

Avati


Stephan,
As Avati has tried to make clear, GlusterFS with the cluster/replicate translator relies very heavily on the backend filesystem's support for extended attributes. When performing a self-heal, GlusterFS uses these extended attributes to decide whether a file is more up to date on one brick than on another.

I feel that you have not read the "Understanding AFR translator" link. It should explain exactly how GlusterFS performs self-heal, and should help you understand its use of extended attributes. There used to be a way to initialise a brick for use with GlusterFS by manually setting the appropriate extended attributes. I do not know if this is still supported.

You are welcome to read the source code for a more in-depth understanding of the particular extended attributes used by GlusterFS for the replicate translator.
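To see what a plain copy leaves behind, it can help to look at the extended attributes on the same file in two backends. The following is only a rough sketch for inspection, not something GlusterFS ships: the helper names are made up here, and the "trusted." prefix is an assumption about where the replicate attributes live, so check the source for the exact names. It reads attributes and never writes to the backend.

```python
import os
import sys


def read_xattrs(path, prefix="trusted."):
    """Return {name: value} for extended attributes on `path`.

    The default prefix is an assumption, not taken from the GlusterFS
    source. os.listxattr/os.getxattr are Linux-only, and attributes in
    the trusted.* namespace are only visible to root.
    """
    attrs = {}
    for name in os.listxattr(path):
        if name.startswith(prefix):
            attrs[name] = os.getxattr(path, name)
    return attrs


def diff_xattrs(a, b):
    """Return the sorted attribute names that are missing on one side
    or carry different values on the two sides."""
    return sorted(name for name in set(a) | set(b) if a.get(name) != b.get(name))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (as root): python xattr_diff.py /export1/file /export2/file
    left, right = read_xattrs(sys.argv[1]), read_xattrs(sys.argv[2])
    for name in diff_xattrs(left, right):
        print(name, left.get(name), right.get(name))
```

A file that was copied into a backend with a tool that skips xattrs would show up here with attributes present on one side and absent on the other.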

Don't try bypassing the mountpoint to perform file operations, _period_. You can always have a replicate mountpoint configured on the server (i.e. a client for replicate), alongside the server side. NFS should run on top of this replicate mountpoint. This (poor) graphic may help. Note that everything is running on the same machine:

------------------
|      NFS       |
------------------
|GlusterFS Client|
------------------
|GlusterFS Server|
------------------
| POSIX Storage  |
------------------
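
If backup or restore scripts run on the same machine as the bricks, one way to keep them honest is to have them refuse any path that is not under the mountpoint. This is just a sketch of such a guard; the function names and the example paths (/mnt/glusterfs, /data/export) are hypothetical and not part of GlusterFS.

```python
from pathlib import Path


def is_inside(path, directory):
    """True if `path` is `directory` itself or lies somewhere below it."""
    p = Path(path).resolve()
    d = Path(directory).resolve()
    return p == d or d in p.parents


def check_target(path, mountpoint, export_dir):
    """Return the resolved path if it goes through the mountpoint.

    Raises ValueError for paths inside the backend export directory or
    outside the mountpoint, so a script cannot bypass GlusterFS by accident.
    """
    if is_inside(path, export_dir):
        raise ValueError(f"{path} is inside the backend export directory")
    if not is_inside(path, mountpoint):
        raise ValueError(f"{path} is not under the mountpoint")
    return Path(path).resolve()
```

A restore script would call check_target() on every destination before writing, e.g. check_target("/mnt/glusterfs/home/file", "/mnt/glusterfs", "/data/export").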

Regards,
Michael Cassaniti

From:	Michael Cassaniti
Subject:	[Gluster-devel] solutions for split brain situation
Date:	Wed, 16 Sep 2009 09:45:43 +1000