From: root@hip-hop.suvl.ca.us (Remco Treffkorn)
Subject: Re: SCSI Performance (Yet Again)
Date: Sun, 22 Aug 1993 20:01:38 GMT
bsa@kf8nh.wariat.org (Brandon S. Allbery) writes:
:
: Any cacheing scheme has pathological cases where the result is actually slower
: than direct I/O. For Linux the pathological case is iozone... a sequential
: write of a large file followed by a sequential read of the same file, if it's
: larger than the cache, results in Linux having to force cache writes so that
: there are enough free blocks to read into.
:
I agree in part. The read case certainly gets all messed up by the fact that
the cache is full. Whether you flush write buffers (costs you right then) or
decide to *not* cache the reads (might cost you later) is a losing proposition
anyway. That is what I agree with. I do *not* agree with the statement that
iozone makes the performance look worse than direct i/o. If it does, there
is something to fix right here. If you look at the scenario you will see:
1. iozone starts writing to an empty (ideally) cache. Lightning fast.
   Only memory moves involved. The only limiting factor is how fast
   your system can move data *plus* kernel/cache overhead.
   The overhead of a cleanly designed cache is small (whatever that means).
   The kernel (meaning everything BUT cache) overhead should be the same
   as in direct i/o.
2. The cache gets full; now writes to disk actually start. Performance
   now should equal direct i/o plus cache overhead. Meaning: close to
   direct i/o performance if the cache overhead is really low.
3. iozone is done writing and, if compiled with fsync, will hang around till
   all blocks are written to disk. The time saved in phase one by not
   writing to disk but into the cache instead will now be spent. This
   *is* direct i/o and as such should be *FAST* (?). Plus there is some
   overhead...
That is the write part if iozone is compiled with the fsync call in.
I bet most people do not have that, but anyway, I have! If you add it up
you get the time for x megabytes of direct i/o writes *plus* the cache
overhead for x megabytes. If this value is significantly different from
what you expect the time for purely direct i/o should be, you face a dilemma.
Either your cache overhead is higher than it should be *or* (and?) the
direct i/o performance is worse than expected.
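The fsync case described above can be sketched in C. This is a minimal
sketch assuming a POSIX system; it is not iozone's own code, and the names
timed_write and now_sec are hypothetical helpers made up for illustration.
The point it shows is only where the clock stops: with do_fsync set, the time
spent draining the cache is charged to the write phase.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Current time in seconds on the monotonic clock. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: write `total` bytes to `path` in `chunk`-byte
 * pieces and return the elapsed time in seconds, or -1.0 on error.
 * If `do_fsync` is nonzero the clock keeps running until fsync()
 * returns, so flushing the cache counts as part of the write.
 */
double timed_write(const char *path, size_t total, size_t chunk, int do_fsync)
{
    if (chunk == 0)
        return -1.0;

    char *buf = malloc(chunk);
    if (buf == NULL)
        return -1.0;
    memset(buf, 'x', chunk);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        free(buf);
        return -1.0;
    }

    double start = now_sec();
    size_t done = 0;
    int ok = 1;
    while (ok && done < total) {
        size_t want = (total - done < chunk) ? (total - done) : chunk;
        ssize_t n = write(fd, buf, want);
        if (n <= 0)
            ok = 0;                 /* treat any failed write as fatal */
        else
            done += (size_t)n;
    }
    if (ok && do_fsync && fsync(fd) != 0)
        ok = 0;
    double elapsed = now_sec() - start;  /* includes fsync when requested */

    close(fd);
    free(buf);
    return ok ? elapsed : -1.0;
}
```

Calling it once with do_fsync set to 0 and once with 1 on a file larger than
the cache should show the difference discussed here: the first number leaves
out whatever is still sitting in the cache.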
The read case is purely direct i/o performance *plus* cache overhead for
the entire file. (if the file was larger than the cache!). Remember, we
have already written all the buffers to disk with the fsync.
Again, if what you get here is not what you want, you have a bottleneck
somewhere.
Now let's assume the fsync does not happen. That is probably the case
for most people using iozone. But let us also assume the file is larger
than the cache. The write phase will end early now, giving you an unrealistic
impression about write performance. Now we start reading. Actually the
write for a buffer has to take place now, in order to make room for
the requested block. In addition to the read time you will spend time writing.
What you gained during the write part by not flushing the cache will now
be lost. On top of that you will probably get *much* worse numbers
if the writes are done in small chunks. The disk head will continuously
move. Very bad! So the read performance will look *much* worse than it
really is. BUT the write performance will also look *MUCH* better than
it is.
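The distortion can be put into numbers with a deliberately crude model,
sketched here in C. The assumptions are mine, not measurements: memory
copies and cache overhead cost nothing, seeks are free, the disk moves
disk_rate bytes per second in either direction, and the file is larger
than the cache. The struct and function names are hypothetical.

```c
/* Rates a benchmark would report when no fsync follows the write phase. */
struct apparent {
    double write_rate;  /* reported for the write phase */
    double read_rate;   /* reported for the read phase  */
};

/*
 * Simplified model (assumes file_size > cache_size > 0, disk_rate > 0).
 * All three arguments must use consistent units, e.g. kB and kB/s.
 */
struct apparent no_fsync_model(double file_size, double cache_size,
                               double disk_rate)
{
    struct apparent a;

    /* The write phase ends with cache_size still dirty in memory, so
     * only file_size - cache_size actually reached the disk. */
    double write_time = (file_size - cache_size) / disk_rate;

    /* The read phase has to flush those dirty buffers to make room,
     * and also read the whole file back. */
    double read_time = (cache_size + file_size) / disk_rate;

    a.write_rate = file_size / write_time;
    a.read_rate = file_size / read_time;
    return a;
}
```

With illustrative figures of an 8000 kB file, a 4000 kB cache and a
750 kB/s disk, the model reports 1500 kB/s for writes and 500 kB/s for
reads, although the disk never did better or worse than 750 kB/s. Real
numbers would be worse on the read side because of the head movement.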
The numbers I have seen here suggest that people do not have iozone
flushing the cache to disk before starting to read. Thus the WRITE
performance is measured too high! If that is so, then the write
performance is very disappointing.
There are some critical assumptions here:
The cache is empty when beginning to write. Normally the case.
Do a sync to be sure.
The file written is larger than the cache. If not you get *very*
good performance readings. Well, I guess we do not have to worry
about that.
You should do an fsync after writing and include the time till
fsync returns. If you don't, then the read performance is too low
and the write performance too high. BUT between the two you will see
a glimmer of reality.
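One way to read "between the two", sketched in C: both phases move the
same number of bytes, so total bytes over total time is the harmonic mean
of the two reported rates. The function name is hypothetical, and treating
this as the "real" rate assumes the flush work skipped during the write
phase is paid in full during the read phase.

```c
/*
 * Throughput over both phases together. Each phase moves the same
 * `size` bytes, so 2*size / (size/write_rate + size/read_rate)
 * reduces to the harmonic mean of the two rates.
 * Both rates must be positive and in the same unit.
 */
double combined_rate(double write_rate, double read_rate)
{
    return 2.0 / (1.0 / write_rate + 1.0 / read_rate);
}
```

For made-up readings of 1500 kB/s write and 500 kB/s read this gives
750 kB/s, well below the plain average of 1000 kB/s.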
Bottom line: Under DOS I can see something like 750kB/s. Under ideal
circumstances I should be able to see at least that under Linux. The
only problem is to define 'ideal'. System in single user mode and only
one process writing/reading and not much else running qualifies for me.
If you do not agree, please explain where the bandwidth gets burned!
I still think something is fishy here!
Cheers
Remco
P.s.: This posting is too long already. I know that there are some
crucial omissions! I hope I still made my point. Especially
the term 'direct i/o' could use some definition.
remco@hip-hop.suvl.ca.us DC2XT