From: drew@romeo.cs.colorado.edu (Drew Eckhardt)
Subject: Re: Bernoulli/SyQuest
Date: 27 Jan 1993 01:12:48 GMT

In article <C1FCLA.1vFD@austin.ibm.com> jstump@auntbea.austin.ibm.com (John E. Stump) writes:
>In article <727748701.AA34908@remote.halcyon.com> Chris.Bugosh@f340.n226.z1.fidonet.org (Chris Bugosh) writes:
>>Can I have some horror/success stories on setting up Linux on either of
>>these two boxes? I'd also like to know what controller you're using... I hear
>>that the controller that comes with the Bernoulli from Hard Drives
>>International doesn't work. Is the adapter that comes with the SyQuest
>>compatible? Thanks for any/all info.
>>
>>Chris
>>
>> * Origin: The MIDI Exchange - Columbus, Ohio - (614) 846-1274
>>(1:226/340)
>>
>
>I have a SyQuest 44MB attached to a Seagate ST-01 SCSI card in my 486DX
>clone. Up until Linux 0.99pl4, it would not allow me to do any writes to
>the disk without crashing. Now it works, except for a few spurious error
>messages that I believe are harmless.
>
>There are a couple of weaknesses in it though: (1) when a rather large
>buffer cache is being flushed to the Syquest, interrupts are turned off
>and the system is literally locked up until the write is done (which is
>slow since this is a Syquest), and
Technically, this statement is invalid. Interrupts are enabled
in the Seagate driver, and I/O to other disks, serial / ether ports,
etc. will continue uninterrupted.

However, with an unbuffered disk, user processes will not be run for an
unbearably long time.

With an IDE setup, we only need to hang around in kernel code, not
running user processes, while we are actually sending data to the disk
as fast as we can. All of the time spent waiting for rotational latency
and seeks goes to running user code.
With the Seagate / unbuffered SCSI, things are a little more involved.
After issuing the SCSI command (involving ARBITRATION for the SCSI bus, a
MESSAGE phase, and a COMMAND phase), the disk may DISCONNECT if it needs
to seek. User processes will run during this time, probably ~8-30ms.
If we did DISCONNECT, we have to go through RESELECTION and MESSAGE
phases. Finally, we can start transferring data AS DICTATED BY THE DRIVE.
An unbuffered disk will force us to twiddle our thumbs in kernel mode,
with no user processes running, until our sector becomes available. This
will be the rotational latency of the disk: ~8ms average, ~16ms worst
case on a 3600 RPM disk. Due to the high overhead of the SCSI protocol,
we can't stream the disk when reading a SINGLE block per SCSI command. So,
for every block on a track after the first one, we waste our worst-case
rotational delay of ~16ms in kernel code. Many SCSI disks have 60 sectors,
or 30 blocks, on a track, which means nearly .5 seconds per track. With
only 50K of dirty blocks in the cache, user processes may be suspended for
nearly a second, running only during those ~8-30ms periods when the drive
is doing a long seek (which should be infrequent, since ideally files are
allocated contiguously and the requests have been sorted to minimize head
movement).
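
The figures above can be checked with a few lines of arithmetic. This is only
a rough sketch: the function names are made up for illustration, and it
assumes 1K blocks (two 512-byte sectors) and one full revolution lost for
every block after the first.

```python
def revolution_ms(rpm):
    """Time for one full disk revolution, in milliseconds."""
    return 60_000.0 / rpm


def stall_seconds(blocks, rpm=3600):
    """Kernel stall if every block after the first costs a full revolution."""
    return (blocks - 1) * revolution_ms(rpm) / 1000.0


rev = revolution_ms(3600)   # worst-case rotational latency
avg = rev / 2               # average rotational latency
print(f"revolution {rev:.1f} ms, average latency {avg:.1f} ms")
print(f"30 blocks (one track): {stall_seconds(30):.2f} s")
print(f"50 blocks (50K dirty): {stall_seconds(50):.2f} s")
```
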
Like I said, this is what happens using a SINGLE block per command.
Scatter-gather solves this problem by allowing us to read/write all
contiguous blocks on a disk to/from noncontiguous buffers with a single
command.
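
For readers who have not met scatter-gather, here is a user-space analogue
using readv(2) through Python's os.readv. It is not the kernel SCSI code, only
a sketch of the same idea: one request moves a contiguous run of blocks into
several separate buffers. The 1K block size, the helper names, and the
temporary file standing in for the disk are assumptions for the demo.

```python
import os
import tempfile

BLOCK = 1024  # assumed block size in bytes


def read_scattered(fd, nblocks):
    """Read nblocks contiguous blocks with ONE call into separate buffers."""
    buffers = [bytearray(BLOCK) for _ in range(nblocks)]  # separate allocations
    nread = os.readv(fd, buffers)  # buffers are filled in order
    return nread, buffers


def demo():
    """Write four distinct blocks to a scratch file, then gather them back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"".join(bytes([i]) * BLOCK for i in range(4)))
        os.lseek(fd, 0, os.SEEK_SET)
        return read_scattered(fd, 4)
    finally:
        os.close(fd)
        os.unlink(path)


nread, bufs = demo()
print(nread, [b[0] for b in bufs])
```
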
Another problem is in how commands were being propagated to the low-level
SCSI drivers. Basically, it looked like this:

    disk driver translates request to SCSI command
    midlevel passes command to Seagate driver
    Seagate driver issues command to disk
    command completes, callback to midlevel
    midlevel callback to disk driver

From within the callback, the next command is propagated in the
same way back to the Seagate driver. Real painful.

This destroys our chances of getting a command to the disk in time
for non-contiguous sectors on the same track.
The solution is to overlap things somewhat, having the disk driver
issue a large number of commands at once, so it looks like this:

    disk driver translates requests to SCSI commands
    midlevel passes commands to Seagate driver
        issue command
        command completes
        issue next command
        callbacks
        command completes

and so on.
The code for multiple outstanding commands per LUN makes this
possible.
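
To see why the turnaround between commands matters so much, here is a toy
timing model. It is not the driver code. It assumes blocks laid out back to
back on one track, and the 555us slot and 100us turnaround are invented
figures for illustration. A turnaround of zero stands for a command that is
already queued at the Seagate driver when the previous one completes.

```python
def track_time_us(blocks, slot_us, turnaround_us):
    """Time to transfer every block of one track in order, in microseconds.

    slot_us is how long one block takes to pass under the head.
    turnaround_us is the delay between one command completing and the
    next one being ready at the drive.
    """
    rev_us = blocks * slot_us
    now = 0
    for k in range(blocks):
        start = k * slot_us          # first time block k reaches the head
        while start < now:           # missed it, so wait another revolution
            start += rev_us
        now = start + slot_us        # transfer of block k finishes
        if k < blocks - 1:
            now += turnaround_us     # next command is not ready until then
    return now


print("queued commands :", track_time_us(30, 555, 0), "us")
print("callback chain  :", track_time_us(30, 555, 100), "us")
```

In this model even a small turnaround misses the start of the next block
every time, so each block costs a full revolution.
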
All of the high and midlevel code is there, thanks to Eric. The code
in the Seagate driver has also been written, but not yet integrated,
because I was running into the stack overflow problem that Eric tracked
down (I didn't know that was the cause at the time). It's blazingly fast
compared to the current code and will stream a 34 sector / track
drive interleaved at 3:1, but it was a trifle unstable.
Another note :
The current hand-coded assembler loop in the Seagate routines transfers
at ~500K/sec (I think wde verified this with a scope), but you can
go much faster if you let the Seagate use the SCSI bus handshake signals
to generate wait states as needed with the 0ws jumper and just dump
data down its throat.

The code for this is also done, but not yet integrated or tested.
As far as the Seagate performance improvements becoming publicly available:
"RSN" (which means an arbitrary time somewhere between a few hours and six
months) if I do it, or some other time frame (hint hint) if someone wants to
borrow my code and / or do it themselves.
Another note :
You may be happier if you discover the optimal interleave and format your
cartridges at something less than 1:1.
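
As a rough guide to what "optimal" means here: at n:1 interleave, logically
consecutive sectors sit n slots apart, which leaves n-1 sector times for the
next command to arrive. The sketch below picks the smallest n that covers a
given turnaround. The function names and the 490us / 900us figures are
invented for illustration, not measured on a SyQuest.

```python
def min_interleave(slot_us, turnaround_us):
    """Smallest n (as in n:1) whose gap of n-1 slots covers the turnaround."""
    return 1 + -(-turnaround_us // slot_us)   # 1 + ceil(turnaround / slot)


def track_read_revolutions(interleave, sectors, slot_us, turnaround_us):
    """Approximate revolutions needed to read one track in logical order."""
    if (interleave - 1) * slot_us >= turnaround_us:
        return interleave   # each pass picks up every n-th sector
    return sectors          # every sector is missed once: about one rev each


slot = 490         # assumed: roughly 34 sectors per track at 3600 RPM
turnaround = 900   # assumed per-command overhead, in microseconds
n = min_interleave(slot, turnaround)
print(f"{n}:1 ->", track_read_revolutions(n, 34, slot, turnaround), "revolutions")
print("1:1 ->", track_read_revolutions(1, 34, slot, turnaround), "revolutions")
```
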
>(2) although the Syquest is a
>"mountable" hard disk, the partition table on the cartridge is only
>read once during boot up, so you are out of luck if you want to umount
>and mount a different cartridge while the system is up and running.
Nope, you can change cartridges. Sending a BLKRRPART ioctl to the disk will
reread the partition table.
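
A minimal sketch of issuing that ioctl from user space, for anyone who wants
to try it. Python's fcntl module does not export BLKRRPART, so the value is
spelled out from its <linux/fs.h> definition, _IO(0x12, 95), as it encodes on
x86. The helper name and the device path in the comment are assumptions.
Unmount the old cartridge first, and expect to need root.

```python
import fcntl
import os

# _IO(0x12, 95) from <linux/fs.h>; on x86 this encodes as 0x125F.
BLKRRPART = (0x12 << 8) | 95


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table of a block device."""
    fd = os.open(device, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKRRPART)
    finally:
        os.close(fd)


# Example (hypothetical device name):
#   reread_partition_table("/dev/sda")
```
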
-- 
Boycott AT&T for their absurd anti-BSDI lawsuit. | Drew Eckhardt
Condemn Colorado for Amendment Two.              | drew@cs.colorado.edu
Use Linux, the fast, flexible, and free 386 unix |