From: Brett Dikeman
Subject: Re: [Mldonkey-users] HDD use
Date: Sat, 15 Mar 2003 23:15:34 -0500
At 1:53 PM +0100 3/13/03, Ficlu wrote:
> I am wondering about the use of the disk by the mldonkey client. A friend
> of mine (a PC seller) told me that a lot of p2p users crash their disks
> because the client makes too many accesses to the disk.
This is pure BS. The head motion involved in, say, launching an
application (especially if the system is low on memory and the drive
is fragmented) is certainly more than the activity generated by a p2p
client, which reads and writes in a very predictable pattern.
Assuming no fragmentation, almost everything a p2p client does is
very linear. Hashing a chunk is a perfect example: you read one chunk,
start to finish. Little head motion is involved.
Desktop drives are designed for linear access; server-market and
enterprise drives are optimized for random access (e.g. databases).
If you've got a desktop drive, chances are its firmware
will favor leaving the head where it is after a read or write, since
you're likely to access another block nearby next. This
assumption is almost -always- correct for a file transfer, unless
the disk is fragmented. This is why, for example, an IDE drive in a
file server will appear to offer better performance with only one
client than its much more expensive SCSI cousin. However, toss 10
clients at the IDE-powered server, or fragment the drive, and watch
it slow to a crawl; the IDE drive's firmware isn't designed for that
sort of thing, and the SCSI drive will wipe the floor with it. Give
me 4 SCSI drives and a SCSI RAID card, and I'll embarrass the pants
off your IDE RAID card with 4 IDE drives on damn near everything except
sequential IO.
Things are changing, however. With the market and economy in a crunch,
IDE is expanding upscale, and we're seeing IDE drives aimed at the
low-end server market, with higher MTBFs and enterprise-class
firmware borrowed from their SCSI cousins; to simplify, only the
interface is different.
> After 6 months of 24h/day utilization, a kazaa client could destroy the
> disk (it happened to a significant number of people, and that's the reason
> I am wondering about the mldonkey client's case).
Pure BS; your reseller friend is picking the wrong common factor
among the users. A gross analogy: "everyone who eats tomatoes
dies!" It's a very common mistake to pick the wrong common factor
when trying to figure out why something happens.
I suspect a greater factor is that p2p power users may have more
than one drive, and each drive you add increases your
chances of a drive failure; this is why plain RAID striping is almost
never done anymore. As other people mentioned, folks running p2p
apps may also leave their computers on longer. It has nothing to do
with the drive not being "designed" to run continuously; consumer
drives are simply not designed to have as high an MTBF as
server/enterprise drives.
Basically, take the MTBF of your drives (for simplicity, we'll
assume they're all the same) and divide by the number of drives.
That's your effective MTBF for the cluster as a whole; one of them
will kick the bucket by approximately that time.
Combine longer running time with more drives (say, two drives at 24x7
instead of one drive at 8x5), and you set yourself up for a much
higher chance of failure (roughly 8x more likely).
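The back-of-the-envelope math above can be sketched like this (my own
arithmetic, not a rigorous reliability model; dividing MTBF by the drive
count assumes identical, independent drives, and the MTBF figure is made up
for illustration):

```python
def effective_mtbf(mtbf_hours, n_drives):
    """Effective MTBF of a cluster of n identical, independent drives."""
    return mtbf_hours / n_drives

# Two identical drives halve the cluster's effective MTBF:
print(effective_mtbf(500_000, 2))   # 250000.0

# Hours of wear per week for the two usage patterns in the text:
hours_24x7 = 24 * 7   # 168 h/week
hours_8x5 = 8 * 5     # 40 h/week

# Two drives at 24x7 vs one drive at 8x5:
exposure_ratio = (2 * hours_24x7) / (1 * hours_8x5)
print(exposure_ratio)   # 8.4 -> roughly "8x more likely"
```

This only counts hours of exposure and drive count; it ignores thermal
cycling, vibration, and the like, so treat it as a rough estimate.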
Dirty Harry said it best- "Are you feelin' lucky, punk? Well, are ya?" :-)
> Is there any policy in the mldonkey client when writing to the disk? I
> know that Linux can be configured to synchronize the buffer cache with
> the disk less often.
What you are referring to is bdflush. On laptops, some users prefer
to run it with a longer period between flushes, to minimize
clashes between the drive's power-saving mechanisms and the
repetitive flushing. (Buffers are not to be confused with the drive's
own cache.)
What affects head movement more is the disk IO elevator. Its
goal is to cluster reads and writes in a queue, and to delay requests
based on what the drive is currently doing (reading or writing), since
it is 'expensive' to switch between reading and writing. A -grossly-
simplified example:
r w r w r w r w
rrrr wwww
Notice one sequence is significantly shorter than the other. For ALL
non-zero switch costs, the first scenario will always be
slower --overall--, which is why Linux tries to group IO by type.
Also, in this particular example, ALL the requests, with the
exception of the first write, were executed sooner in the second
sequence! (This is due mostly to the fact that in my example a 'switch'
is as expensive as a single IO transaction; I don't think that is a
realistic assumption on my part, so there's your caveat.)
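The toy example above can be turned into a tiny cost model (mine, for
illustration only; the real elevator is far more sophisticated). Each IO
costs one time unit, and switching between reading and writing costs
`switch_cost` units:

```python
def total_time(ops, switch_cost):
    """Total time to execute a sequence of 'r'/'w' ops in order,
    paying switch_cost whenever the IO direction changes."""
    time = 0
    prev = None
    for op in ops:
        if prev is not None and op != prev:
            time += switch_cost   # mode-switch penalty
        time += 1                 # the IO itself
        prev = op
    return time

interleaved = list("rwrwrwrw")   # r w r w r w r w  -> 7 switches
grouped = list("rrrrwwww")       # rrrr wwww        -> 1 switch

for cost in (1, 2, 5):
    print(cost, total_time(interleaved, cost), total_time(grouped, cost))
```

With a switch cost of 1, the interleaved order takes 15 units versus 9 for
the grouped order, and the gap only widens as switching gets more expensive.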
The maximum age a disk IO request can reach before it must be serviced
can be tuned, but it's best left alone unless you really know what
you're doing; I believe in 2.4 it is even self-tuning to some degree,
but I'm not sure about that.
IMHO, mldonkey shouldn't do anything specific or special with regard
to IO; it is the job of the operating system (and its drivers) to
decide what the 'best' way is. If you start 'tuning', you also set
yourself up for having to re-tune later when things change, or for
making things worse in situations you didn't think of or dismissed.
Case in point: the 2.5 kernel. The anticipatory disk IO scheduler
that has been introduced behaves rather differently from 2.4's
scheduler. Wouldn't it suck to optimize for 2.4, then find your
tricks don't work for 2.5, or maybe even make things worse? Whoooops.
Brett
--
----
"They that give up essential liberty to obtain temporary
safety deserve neither liberty nor safety." - Ben Franklin
http://www.users.cloud9.net/~brett/