Re: [Gluster-devel] Server Side AFR gets transport endpoint is not connected


From: James E Warner
Subject: Re: [Gluster-devel] Server Side AFR gets transport endpoint is not connected
Date: Thu, 28 Aug 2008 09:33:18 -0400

Thanks for the prompt reply.  One final question: is the HA translator
still planned for the upcoming 1.4 release, and if not, do you have a rough
idea of which release it will go into?

Thanks Again,

James Warner
Computer Sciences Corporation
Registered Office: 3170 Fairview Park Drive, Falls Church, Virginia 22042,
USA
Registered in Nevada, USA No: C-489-59

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------

This is a PRIVATE message. If you are not the intended recipient, please
delete without copying and kindly advise us by e-mail of the mistake in
delivery.
NOTE: Regardless of content, this e-mail shall not operate to bind CSC to
any order or other contract unless pursuant to explicit written agreement
or government initiative expressly permitting the use of e-mail for such
purpose.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------




From:    "Krishna Srinivas" <address@hidden>
Sent by: krishna.srinivas@gmail.com
To:      James E Warner/DEF/address@hidden
cc:      address@hidden
Date:    08/28/2008 01:03 AM
Subject: Re: [Gluster-devel] Server Side AFR gets transport endpoint is not
         connected




On Thu, Aug 28, 2008 at 12:45 AM, James E Warner <address@hidden> wrote:
>
> Hi,
>
> I'm currently testing gluster to see if I can make it work for our HA
> filesystem needs.  In initial testing things seem to be very good,
> especially with client side AFR performing replication to our server
> nodes.  However, we would like to keep our client network free of
> replication traffic, so I set up server side AFR with three storage bricks
> replicating data between themselves and round robin DNS for the node
> failover.  The round robin DNS is working and the failover between the
> nodes is kind of working, but if I pull the network cable on the currently
> active server (the host that the glusterfs client is connected to) the
> next filesystem operation (such as ls /mnt/glusterfs) fails with a
> "transport endpoint is not connected" error.  Similarly, if I have a large
> copy operation in progress the copy will exit with a failure.  All of the
> operations after that work fine and netstat shows that the node has failed
> over to the next server in the list, but by that point the current file
> system operation has failed.  Anyway, this leads me to a few questions:
>
> 0.  Do my config files look OK or does it look like I've configured this
> thing incorrectly? :)
> 1.  Is this the expected behavior or is this a bug?  From reading the
> mailing list I had the impression that on failure the operation would be
> tried on the remaining ip's that were cached in the clients list, so I was
> surprised that the operation failed and I think that it is probably a bug,
> but I could see an argument for how this might be considered normal
> operation.

That is the expected behavior.

>
> 2.  If this is expected behavior is there any plan to change the behavior
> in the future or is server side AFR always expected to work this way?
> I've seen references to round robin dns being an interim measure on the
> mailing list, so I'm not sure if there is another translator in the works
> or not.  If there is something in the works is that available in the
> current glusterfs 1.4 snapshot releases or is that planned for a much
> later version?

Yes, we plan to bring in an HA translator, which will make this work fine.

>
> 3.  Can you think of any option that I might have missed that would
> correct the problem and allow the currently running file operation to
> succeed during a failover?
>
> 4.  Once again if this is as designed can you explain the reason that it
> works this way?  As I said I really expected it to transparently failover
> in much the same way that client side afr seems to, so I was surprised
> that it didn't.

If AFR is on the client side, it maintains a separate connection to each
of its subvolumes, so if one node fails it still has connections to the
other subvolumes. However, if AFR is on the server side and the server
that the client is connected to goes down, AFR cannot do anything about it.
Once the HA xlator comes into the picture, it will sit on the client and
take care of a seamless transition when the connection fails.
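
To make the difference concrete, here is a toy model in Python. It is only
a sketch of the behavior described above, not GlusterFS code: the class
names (Server, ClientSideReplica, SingleConnectionClient) are hypothetical,
and it assumes every server holds a full replica.

```python
class Server:
    """Hypothetical storage node holding a full replica."""

    def __init__(self, name):
        self.name = name
        self.up = True

    def read(self, path):
        if not self.up:
            raise ConnectionError("transport endpoint is not connected")
        return f"{path} served by {self.name}"


class ClientSideReplica:
    """Toy stand-in for AFR on the client: one connection per subvolume."""

    def __init__(self, servers):
        self.servers = servers

    def read(self, path):
        last_error = ConnectionError("no subvolumes configured")
        for server in self.servers:
            try:
                # The same operation is retried on the next subvolume.
                return server.read(path)
            except ConnectionError as err:
                last_error = err
        raise last_error


class SingleConnectionClient:
    """Toy stand-in for server side AFR reached through round robin DNS."""

    def __init__(self, servers):
        self.servers = servers
        self.active = 0  # index of the one server we are connected to

    def read(self, path):
        try:
            return self.servers[self.active].read(path)
        except ConnectionError:
            # Move to the next address for later operations, but the
            # operation that was in flight still fails.
            self.active = (self.active + 1) % len(self.servers)
            raise
```

With the first server down, ClientSideReplica.read() succeeds on the first
call, while SingleConnectionClient.read() raises once and only succeeds on
the following call. That matches what was reported: one failed operation,
then normal service from the next server in the list.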

>
> Since I hope that this is a bug, the configuration files and the relevant
> sections of the client log are below.  I have used this configuration on
> the gluster 1.3.11 version and the latest snapshot from August 27, 2008.
>
> Client Log Snippet:
> ================
>
> 2008-08-27 12:53:34 D [fuse-bridge.c:839:fuse_err_cbk] glusterfs-fuse: 62:





