
Re: [Gluster-devel] Suggestions


From: Gordan Bobic
Subject: Re: [Gluster-devel] Suggestions
Date: Wed, 08 Jun 2011 16:22:45 +0100
User-agent: Mozilla-Thunderbird 2.0.0.24 (X11/20100328)

Hans K. Rosbach wrote:
> On Wed, 2011-06-08 at 12:34 +0100, Gordan Bobic wrote:
>> Hans K. Rosbach wrote:
>>>
>>> -SCTP support, this might not be a silver bullet but it feels
>>> [...]
>>>
>>> Features that might need GlusterFS code changes:
>>> [...]
>>>  -Multihoming (failover when one NIC dies)
>>
>> How is this different to what can be achieved (probably much more
>> cleanly) with NIC bonding?
>
> NIC bonding is nice for a small network, but routed networks might
> benefit from this. It is not something I feel I need, but I am sure
> it would be an advantage for some other users. It could possibly
> help in geo-replication setups, for example.

I'm not sure what routedness has to do with this. If you need route failover, that is probably best done by having an HA/cluster service change the routing table accordingly.
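For the plain NIC-failure case, something like the following is all it takes; a rough sketch, and the interface names and address are made up, so adjust to taste. It needs no GlusterFS support at all:

  # Load the bonding driver; it creates bond0 in active-backup mode
  # with MII link monitoring every 100ms
  modprobe bonding mode=active-backup miimon=100
  # Slaves must be down before enslaving them via sysfs
  ip link set eth0 down
  ip link set eth1 down
  echo +eth0 > /sys/class/net/bond0/bonding/slaves
  echo +eth1 > /sys/class/net/bond0/bonding/slaves
  ip addr add 192.168.0.10/24 dev bond0
  ip link set bond0 up

If either NIC dies, traffic fails over to the other slave transparently, and GlusterFS never notices.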

>>> -Ability to have the storage nodes autosync themselves.
>>>  In our setup the normal nodes have 2x1Gbit connections while the
>>>  storage boxes have 2x10Gbit connections, so having the storage
>>>  boxes use their own bandwidth and resources to sync would be nice.
>>
>> Sounds like you want server-side rather than client-side replication.
>> You could do this by using afr/replicate on the servers and exporting
>> via NFS to the clients. Have failover handled as for any normal NFS
>> server.
>
> We have considered this, and might decide to go down this route
> eventually; however, it seems strange that this cannot also be done
> using the native client.

Is the current NFS wheel not quite round enough for you? ;)
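For what it's worth, a minimal sketch of the server-side replica plus NFS approach; the volume, host and brick names are made up:

  # On one storage server: create and start a 2-way replicated volume
  gluster volume create mailvol replica 2 \
      store1:/export/brick1 store2:/export/brick1
  gluster volume start mailvol
  # On each client: mount over plain NFSv3 via the built-in NFS server
  # (it only speaks v3, and the MOUNT protocol wants TCP)
  mount -t nfs -o vers=3,mountproto=tcp store1:/mailvol /mnt/mail

Failover is then just a floating IP (keepalived or your cluster manager of choice), as with any other NFS server.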

> The fact that each client writes to both servers is fine, but the
> fact that the clients need to do the re-sync work whenever the
> storage nodes are out of sync (because one of them rebooted, for
> example) seems strange and feels very unreliable, especially since
> this is a manual operation.

There is a plan C, though. You can make the servers clients as well, and then have a process that runs "ls -laR" periodically, or upon failure, to trigger self-heal.
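Something as crude as this in cron on each server would do; the mount point is hypothetical:

  # /etc/cron.d/gluster-selfheal -- stat every file via the local
  # client mount; touching each file makes the replicate translator
  # compare the copies and heal any that are stale
  */30 * * * * root ls -laR /mnt/glusterfs >/dev/null 2>&1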

>>> -An ability for the clients to subscribe to metadata updates for
>>>  a specific directory would also be nice, so that a client can
>>>  cache that folder's stats while working there and still know it
>>>  will not miss any changes. This would perhaps increase overhead
>>>  in large clusters, but could improve performance a lot in
>>>  clusters where several nodes work in the same folder (a mail
>>>  spool folder, for example).
>>
>> You have a shared mail spool on your nodes? How do you avoid race
>> conditions on deferred mail?
>
> Several nodes can deliver mail to the spool folder, and dedicated
> queue runners will pick the messages up and deliver them to local
> and/or remote hosts. I am not certain what race conditions you are
> referring to, but locking should make sure that no more than one
> queue runner touches a file at a time. Am I missing something?

Are you sure your MTA applies locks suitably? I wouldn't bet on it; I would expect most of them to assume an unshared spool. Also remember that locking is a _major_ performance bottleneck on cluster file systems. Multiple nodes locking and doing r/w in the same directory will have an inverse scaling impact on performance, especially with the small I/O you are likely to see on a mail spool.

If there is no file locking, you will likely see non-deterministic multiple sending of mail, especially deferred mail. Depending on how your MTA generates spool file names, you may also see non-deterministic silent clobbering if it doesn't lock the parent directory on file creation/deletion. If there is locking, you will likely see performance degrade as you add more servers, due to lock contention.
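If you want to convince yourself either way, this is the sort of locking a safe queue runner needs; the paths and delivery command are hypothetical, and it is only safe if the cluster fs honours POSIX locks coherently across nodes:

  # Take an exclusive, non-blocking lock before touching a queue
  # file; if another node's queue runner holds it, skip rather than
  # block (and rather than deliver the message twice)
  (
    flock -xn 9 || exit 1
    deliver-message /var/spool/queue/msg.12345
  ) 9>>/var/spool/queue/msg.12345.lock

Every one of those flock() calls has to round-trip through the lock translator, which is exactly where the scaling falls over.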

Gordan


