help-cfengine
Re: cfengine 1.x protocol inefficiencies


From: Gregory P. Smith
Subject: Re: cfengine 1.x protocol inefficiencies
Date: Wed, 25 Oct 2000 01:36:10 -0700
User-agent: Mutt/1.2.5i

> > (Side note: the cfengine protocol is not designed well for doing lots
> > quickly; it sends lots of null padding in the protocol in both
> > directions and is latency sensitive by always waiting for the response
> > from one server operation before issuing another request.  Despite all
> > this, it does get the job done, just not nearly as well as it could)
> > 
> > Greg
> 
> cfd 1.6.0 is extremely stable and the claims
> of its inefficiency are slightly exaggerated I think...
> 
> Mark

For most people's use, Mark is right and the inefficiencies are hardly
noticed.

You'll notice performance issues when syncing large directory trees.
They make cfengine itself not worth using to keep big trees in sync
across a large number of hosts, other than to copy and extract an
updated tar file [a very good trick] or to spawn an rsync process.

The padding problem:  Every message to the server is padded to 4 KB
with zeros.  So if you're asking for the status of 2000 files, you're
sending 8 MB of data to the cfd server to do it, only ~80 KB of which
contains useful data (the rest is zeros).  Now multiply this by a
good number of clients (let's say 400 in this example) doing this once
an hour and you've got 3.2 GB of data being sent to your cfd server
just to -ask- it about file timestamps.  In addition to this, all
server responses are padded the same way, so the server has to send out
at least that much data in response.  This just about -saturates- a
full duplex 10 Mbit/s network connection on the server for such a
small task.  Add to this the traffic for actually getting updated files
when some or many of them have changed and you can say goodbye to your
network.
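
As a quick check of the arithmetic above, here is a small Python
sketch.  It is only a back-of-the-envelope model: "4 KB" is taken as
4000 bytes to match the round figures in this message, and the 40
bytes of useful payload per request is an assumption derived from
~80 KB / 2000 files, not a measured value.

```python
# Back-of-the-envelope model of the padding overhead described above.
PAD_BYTES = 4000      # padded message size; "4 KB" taken as 4000 bytes
USEFUL_BYTES = 40     # assumed useful payload per request (~80 KB / 2000)


def padding_cost(files, clients, pad=PAD_BYTES, useful=USEFUL_BYTES):
    """Return request traffic for one hourly run, in one direction only."""
    per_client = files * pad            # bytes one client sends
    useful_total = files * useful       # bytes that actually carry data
    per_hour = per_client * clients     # bytes arriving at the server
    mbit_per_sec = per_hour * 8 / 3600 / 1e6   # averaged over the hour
    return per_client, useful_total, per_hour, mbit_per_sec


per_client, useful_total, per_hour, rate = padding_cost(2000, 400)
print(f"per client:  {per_client / 1e6:.1f} MB ({useful_total / 1e3:.0f} KB useful)")
print(f"all clients: {per_hour / 1e9:.1f} GB/hour, about {rate:.1f} Mbit/s sustained")
```

The responses are padded the same way, so the same volume flows back
out of the server in the other direction.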

The non-pipelining/latency problem:  On a 10 Mbit/s network the
latency is about 8-9 ms to send 4 KB to a server and get a 4 KB response
from the server (it's 1-2 ms for a tiny amount of data).  The current
client sends one request and waits for its response before sending the
next one.  To stat 2000 files this way would take about 17 seconds
with the padding and 3 seconds without.  If the requests were
pipelined (i.e., all stat requests were just sent in a row without
waiting for the previous ones' responses), the time for the stat would
be bounded by the bandwidth between the client and server rather than
by the latency.  Pipelining would also take advantage of full duplex
communications, halving the stat time.  The stat time is not dramatic
with only 2000 files, but multiply that by 10 and you see the problem...
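
The timings above can be reproduced with a rough model, sketched here
in Python.  The round-trip figures (8.5 ms padded, 1.5 ms unpadded) are
the midpoints of the ranges quoted above; the pipelined estimate is a
simplification that assumes a full duplex link where responses overlap
with outgoing requests, and the 40-byte unpadded request size is an
assumption.

```python
def serial_stat_seconds(files, round_trip_ms):
    """One request at a time: every file costs a full round trip."""
    return files * round_trip_ms / 1000.0


def pipelined_stat_seconds(files, bytes_per_message, link_mbit, round_trip_ms):
    """All requests sent back to back: bounded by bandwidth, plus one
    round trip for the last response to arrive (rough estimate)."""
    transfer = files * bytes_per_message * 8 / (link_mbit * 1e6)
    return transfer + round_trip_ms / 1000.0


files = 2000
print("serial, padded:     ", serial_stat_seconds(files, 8.5))
print("serial, unpadded:   ", serial_stat_seconds(files, 1.5))
print("pipelined, padded:  ", pipelined_stat_seconds(files, 4000, 10, 8.5))
print("pipelined, unpadded:", pipelined_stat_seconds(files, 40, 10, 1.5))
```

Under these assumptions the serial case scales linearly with the file
count, while the pipelined, unpadded case stays well under a second
for 2000 files.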

These days many if not most sites have fast 100 Mbit/s Ethernet
between the majority of their hosts.  Even so, cfengine uses a
disproportionate amount of bandwidth and becomes much less useful over
leased lines/WANs (where latency is much higher) if a large number of
file stats/copies are desired.

I hope that any work on cfengine 2.x will include a more efficient
protocol.  If anything is done, at least stop padding messages and use
much simpler length-prefixed or newline-terminated ones like the
rest of the protocols in the world.  (I'm in favor of length-prefixing
everything myself; newline termination is asking for disaster when
someone forgets to length-check the data anywhere in the code.)
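
To illustrate what length-prefixed messages could look like, here is a
minimal Python sketch.  It is not the cfengine protocol: the 4-byte
big-endian length header, the MAX_MESSAGE limit, the function names
and the "STAT ..." request text are all hypothetical choices made for
the example.

```python
import io
import struct

MAX_MESSAGE = 64 * 1024   # assumed upper bound; reject anything larger


def encode_message(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    if len(payload) > MAX_MESSAGE:
        raise ValueError("payload too large")
    return struct.pack("!I", len(payload)) + payload


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes, or raise EOFError if the stream ends early."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed mid-message")
        buf += chunk
    return buf


def decode_message(stream) -> bytes:
    """Read one message; the length is checked before any payload is read."""
    (length,) = struct.unpack("!I", read_exact(stream, 4))
    if length > MAX_MESSAGE:
        raise ValueError("declared length exceeds limit")
    return read_exact(stream, length)


# Two hypothetical requests sent back to back on one stream.
wire = encode_message(b"STAT /etc/passwd") + encode_message(b"STAT /etc/hosts")
stream = io.BytesIO(wire)
print(decode_message(stream), decode_message(stream), len(wire))
```

Each request here costs its own length plus 4 bytes on the wire
instead of a fixed 4 KB, and the receiver always knows how much to
read before it touches the payload.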

Greg

-- 
Gregory P. Smith   gnupg/pgp: http://suitenine.com/greg/keys/
                   C379 1F92 3703 52C9 87C4  BE58 6CDA DB87 105D 9163


