Re: Efficiency of remote copy


From: Daniel Riek
Subject: Re: Efficiency of remote copy
Date: Mon, 8 Jul 2002 18:30:18 +0200
User-agent: Mutt/1.3.28i

Hi Mark,

With scp I get the following results:
[root@www2 opt]# time scp -qpr www1:/usr/local/source mirror_test

real    6m4.608s
user    0m1.350s
sys     0m18.440s
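
To put this next to the first-run numbers from my earlier mail (quoted below), here is a small sketch that converts the "real" values to seconds and compares them. It assumes the scp run copied the same tree as the cfagent and rsync runs, and it subtracts the 16 s that cfagent needs without the copy; the helper name to_seconds is just made up for this example.

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a `time`-style value such as '6m4.608s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# Wall-clock ("real") times of the first run, copied from this thread.
first_run = {
    "cfagent": to_seconds("8m52.785s"),
    "scp": to_seconds("6m4.608s"),
    "rsync": to_seconds("3m43.060s"),
}

# cfagent -K without the copy test; subtracted so that only the copy counts.
baseline = to_seconds("0m16.229s")
first_run["cfagent"] -= baseline

# Print the tools from fastest to slowest, relative to rsync.
for tool, secs in sorted(first_run.items(), key=lambda kv: kv[1]):
    ratio = secs / first_run["rsync"]
    print(f"{tool:8s} {secs:8.1f} s   {ratio:.2f} x rsync")
```

This only compares wall-clock time of single runs, so it says nothing about where the time is actually spent.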

I have no time for a more detailed code analysis over the next few weeks,
but besides the fact that cfservd uses very few resources on the
server, I found another obvious difference.
The rsync algorithm, as far as I understand it, looks like this:

1. Generate a request on the client and send it to the server
2. Process the request on the server and generate a file-list (with
   additional data like checksums, obviously cached) and send it back to
   the client
3. Process this filelist on the client to find the real differences,
   generate the final transfer request and transfer this request to the
   server.
4. Collect the requested files and send them back to the client (via a
   compressed stream?)
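
The four steps above could be sketched roughly as follows. This is only an illustration of my understanding, not rsync's actual code: both "hosts" are local directories, the function names build_manifest and plan_transfer are invented, and rsync's delta transfer of changed blocks within a file is left out completely.

```python
import hashlib
import os
import tempfile

def build_manifest(root):
    """Step 2: walk the tree and map each relative path to a checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            manifest[os.path.relpath(full, root)] = digest
    return manifest

def plan_transfer(server_manifest, client_manifest):
    """Step 3: list the paths that are missing or different on the client."""
    return sorted(
        path
        for path, digest in server_manifest.items()
        if client_manifest.get(path) != digest
    )

def demo():
    """Build two small trees and return the single batched request."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        for name, text in [("a.txt", "same"), ("b.txt", "new"), ("c.txt", "v2")]:
            with open(os.path.join(src, name), "w") as fh:
                fh.write(text)
        for name, text in [("a.txt", "same"), ("c.txt", "v1")]:
            with open(os.path.join(dst, name), "w") as fh:
                fh.write(text)
        return plan_transfer(build_manifest(src), build_manifest(dst))

print(demo())
```

The point is that the client sends one request for everything it needs (step 3) instead of asking about each file separately.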

Cfengine and scp, by contrast, seem to check each individual file and
immediately transfer it, one file after the other. Couldn't this be a
reason for the differences and the low load on the server? A "burst"
transfer of the files could also eliminate the problem of files too
small for encryption...

What do you think?
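
A very rough cost model of what I mean, as a sketch only. All numbers in it are invented for illustration (I have not counted the files or measured the latency), and it ignores disk and CPU time on both sides:

```python
def per_file_time(n_files, total_bytes, rtt_s, bandwidth_bps, round_trips_per_file=1):
    """One request/response exchange per file, then the payload itself."""
    return n_files * round_trips_per_file * rtt_s + total_bytes / bandwidth_bps

def batched_time(total_bytes, rtt_s, bandwidth_bps, round_trips=2):
    """One exchange for the file list and one for the final request."""
    return round_trips * rtt_s + total_bytes / bandwidth_bps

# Hypothetical values, NOT measurements from our setup.
n_files = 50_000          # number of files in the tree
total_bytes = 500e6       # 500 MB of data
rtt_s = 0.001             # 1 ms round trip on the LAN
bandwidth_bps = 10e6      # 10 MB/s effective throughput (bytes per second)

print(f"per file: {per_file_time(n_files, total_bytes, rtt_s, bandwidth_bps):.1f} s")
print(f"batched : {batched_time(total_bytes, rtt_s, bandwidth_bps):.1f} s")
```

With many small files the per-file term dominates while the server mostly waits, which would fit the low load I see on cfservd. Whether this is really what happens would have to be checked in the code.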

Regards, Daniel


On Sun, Jul 07, 2002 at 01:02:51PM +0200, Mark.Burgess@iu.hio.no wrote:
> 
> Daniel,
> 
> I tried to reproduce some numbers to check how cfengine copying
> was in relation to other tools. I don't have rsync set up here, so the
> closest I got to your test was scp: The results on 500MB were
> 
> SCP --
> 
> real    22m10.360s  (plus-minus 15sec)
> user    0m34.320s
> sys     0m14.180s
> 
> daneel# time /local/sbin/cfagent -K -f cftest
> 
> real    20m26.420s (plus minus 15 sec)
> user    1m11.800s
> sys     0m23.570s
> 
> 
> This is not a huge difference, so I cannot see that anything is
> actually wrong. I don't fully understand why cfagent/cfservd don't work
> up more of a sweat and make things go faster. Possibly the network
> is the bottleneck. I'm afraid I don't have any appropriate tools
> (or time) to try to evaluate the efficiency at the moment, but
> I would be interested to know of anyone who does and could do this,
> e.g. using some tool like a code analyzer to find out where
> programs are spending most of their time, and looking at resource
> usage in detail on both hosts.
> 
> This would be a good exercise -- perhaps a paper for LISA 2003?
> 
> cheers,
> Mark
> 
> 
> 
> On  6 Jul, Daniel Riek wrote:
> > Hi Mark,
> > 
> > I can't say anything on the general performance but in our case, it
> > looks like this:
> > 
> > Cfengine config:
> > [...]
> >     www.!www1.do_mirror_test::
> >             /usr/local/sourcedir
> >             dest=/opt/mirror_test
> >             syslog=false
> >             encrypt=false
> >             purge=true
> >             timestamps=preserve
> >             type=ctime
> >             backup=false
> >             recurse=inf
> >             trustkey=true
> >             server=www1.mydomain
> > [...]
> > 
> > 
> > - First run without existing dest dir:
> > 
> > [root@www2 root]# time cfagent -K -D do_mirror_test
> > real    8m52.785s
> > user    0m1.270s
> > sys     0m12.810s
> > 
> > Ah, one more thing, the first data in my mail to Adrian was with
> > "syslog=true". - So without logging it looks much better...
> > 
> > - Second run with unchanged source dir:
> > 
> > [root@www2 root]# time cfagent -K -D do_mirror_test
> > 
> > real    7m20.412s
> > user    0m0.430s
> > sys     0m0.120s
> > 
> > And now RSync:
> > 
> > First run with nonexistent dest dir:
> > [root@www2 opt]# time rsync -az -e ssh root@www1:/usr/local/sourcedir mirror_test
> > 
> > real    3m43.060s
> > user    0m37.400s
> > sys     0m12.590s
> > 
> > 
> > And unchanged sourcedir:
> > [root@www2 opt]# time rsync -az -e ssh root@www1:/usr/local/sourcedir mirror_test
> > 
> > real    0m1.404s
> > user    0m0.610s
> > sys     0m0.240s
> > 
> > 
> > To be fair, we have to take into account that cfengine is doing more
> > than just copying that dir. So here is cfagent without that test:
> > [root@www2 opt]# time cfagent -K
> > 
> > real    0m16.229s
> > user    0m0.320s
> > sys     0m0.170s
> > 
> > 
> > So the difference is still quite big. You are probably right regarding the
> > security checks, but I think those are not really necessary in this scenario,
> > where we just mirror in an HA environment. So rsync seems to be
> > doing just the right things here...
> > 
> > All tests have been run several times with differences <=10 sec. Encryption
> > does not seem to make a difference for cfagent.
> > 
> > One thing is that "ps aux" shows very different behaviour on the server:
> > while rsync takes about 50% of the CPU on the server, cfservd stays under 2%.
> > 
> > For me this seems to be the reason for the big difference. Did I miss any
> > configuration option?
> > 
> > 
> > Regards, Daniel
> > 
> > On Fri, Jul 05, 2002 at 05:30:59PM +0200, Mark.Burgess@iu.hio.no wrote:
> >> 
> >> In a paper (not written by me) in 2001, with the older (slower)
> >> protocol, it was shown that cfengine was *faster* than rsync
> >> at distributing files first time around. Rsync is faster at
> >> updating certain kinds of changes (that the algorithm was designed
> >> for -- small changes to large files). If cfengine is slower
> >> at certain things it is because it is doing extra checking
> >> for security reasons, but that applies to secure copy. This
> >> is also much faster since version 2, so I do not know of any
> >> real studies on this.
> >> 
> >> In short, I just do not believe the assertion that cfengine
> >> is slow at copying large amounts of filespace, compared
> >> to rsync. It doesn't tally with experiments done. Is this
> >> just an assumption, or have you actually tried to measure
> >> it and compare?
> >> 
> >> I am not keen on the idea of including librsync in cfengine.
> >> It would not be a straightforward task, and cfengine is much
> >> more security conscious than rsync. Sometimes there is a reason
> >> to take your time and check stuff.
> >> 
> >> M
> >> 
> >> 
> >> On  5 Jul, Adrian Phillips wrote:
> >> >>>>>> "Daniel" == Daniel Riek <riek@de.alcove.com> writes:
> >> > 
> >> >     Daniel> Hi, we are using Cfengine in an environment where we need
> >> >     Daniel> to copy large amounts of data from one machine to
> >> >     Daniel> another. There are mainly two scenarios: software
> >> >     Daniel> distribution (rpm packages and tarballs) and mirroring for
> >> >     Daniel> a failover cluster.
> >> > 
> >> > Large as in ? I use copy for the whole cfengine "setup" from one
> >> > machine to a backup, approximately 1GB which takes some minutes. I can
> >> > understand anyone trying to do anything more than this having problems.
> >> > 
> >> >     Daniel> One way would be to use rsync. That is what we would do in
> >> >     Daniel> this environment if we had no Cfengine. But as we have
> >> >     Daniel> some security issues and rsync would require at least
> >> >     Daniel> minimal root access from the mirroring machine, we would
> >> >     Daniel> prefer to use Cfengine.
> >> > 
> >> >     Daniel> Another reason for using Cfengine to copy the data is the
> >> >     Daniel> possibility to have services restarted depending on the
> >> >     Daniel> copy...
> >> > 
> >> >     Daniel> Unfortunately Cfengine seems to be very slow when doing
> >> >     Daniel> such things.  This raises the question if anyone else
> >> >     Daniel> tried to use Cfengine in this manner and what his
> >> >     Daniel> experience is like?
> >> > 
> >> > One thing I'd like to do, if possible, is link in librsync and have
> >> > some additional option to copy to make it use it instead. How much
> >> > work this is and how much it will help I have no idea.
> >> > 
> >> > Sincerely,
> >> > 
> >> > Adrian Phillips
> >> > 
> >> 
> >> 
> >> 
> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >> Work: +47 22453272            Email:  Mark.Burgess@iu.hio.no
> >> Fax : +47 22453205            WWW  :  http://www.iu.hio.no/~mark
> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >> 
> >> 
> >> 
> >> _______________________________________________
> >> Help-cfengine mailing list
> >> Help-cfengine@gnu.org
> >> http://mail.gnu.org/mailman/listinfo/help-cfengine
> >> 
> > 
> 
> 
> 
> 

-- 
Daniel Riek <riek@de.alcove.com>   -    http://www.alcove.com/de/
* Technical Manager                -    Tel.:   +49 (0)2 28 / 9 08 69 85
* ALCOVE Deutschland GmbH          -    Fax:    +49 (0)2 28 / 9 08 69 84
* Liberating Software              -    Mobil:  +49 (0)1 71 / 2 80 08 79



