From: sct@dcs.ed.ac.uk (Stephen Tweedie)
Subject: Re: Another weird panic (Re: 0.99p8-8 - ext2fs-problem)
Date: 27 Apr 1993 16:35:16 GMT
In article <Ifr4LGm00VIFILrF5e@andrew.cmu.edu>, fl0p+@andrew.cmu.edu (Frank T Lofaro) writes:
> Hmmm, another panic where the system stays (sort of) up. I had
> one too, but I was messing with filesystem code when it happened to
> me. Very weird, that a panic would let the system still run. Maybe
> since it wasn't in task[0]?
It's unlikely to be in task[0] if it is because of the filesystem
code; the filesystems usually run under a user process (but in kernel
mode, of course).
While developing filesystem code myself, I am continually amazed at
how robust the kernel is during errors. Whenever the filesystem gets
a kernel-mode segmentation fault, I just sync() and reboot, then look
in the syslog output to see the register dump... all from within X,
which never misses a beat. :-)
> Anyone know how safe it is to do *anything* after one of these, or
> should one go for the reset switch? Judging from the fact that when it
> happened to me, I could still create files on unaffected filesystems,
> and syncing worked, then hung (i.e. sync would hang after writing the
> disk blocks), and rebooting had the created files still there and the fs
> clean, I would sync &, kill -TSTP 1, kill -9 -1, sync and then reboot, and
> try to get a fairly clean shutdown.
It should be OK to continue, but you might get into trouble if the
kernel panic left filesystem resources in an indeterminate state and
the filesystem tries to reuse those resources later.
Syncing buffers to disk doesn't go through the filesystem (except for
loading the sync(1) binary, of course!); a full sync() will also try
to update inodes (which *does* go through the filesystem), but this is
normally a fairly safe operation since inodes are non-relocatable -
the filesystem doesn't have to work hard to decide what to do in this
case. Ditto for syncing the superblock.
So, sync() at least is pretty safe after a filesystem panic. If the
panic has left a buffer (unlikely - buffers should only be left locked
after a device error), inode or superblock (more probable) locked,
then sync() will hang when it gets to the locked resource. However,
as the buffers themselves *are* usually left OK by a filesystem crash
(not necessarily their contents, though... :-( ), sync()ing after a
filesystem crash ought at the very least to leave your filesystem in a
decent state for good recovery by fsck.
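
Since sync() may hang on a locked inode or superblock, one way to
avoid wedging your shell is to run it in a child process and give up
waiting after a while. What follows is only a rough sketch, not
anything from the kernel sources: the helper name sync_with_timeout()
and the 100 ms polling interval are my own assumptions, and it uses
only standard POSIX calls (fork, sync, waitpid, nanosleep, kill).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel or libc function): run sync() in a
 * child process and wait at most timeout_sec seconds for it to finish.
 *
 * Returns  0 if the child completed sync() and exited normally,
 *          1 if it was still running when the timeout expired,
 *         -1 if fork()/waitpid() failed or the child died abnormally.
 */
int sync_with_timeout(int timeout_sec)
{
    assert(timeout_sec >= 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this is the call that may block on a locked resource. */
        sync();
        _exit(0);
    }

    /* Parent: poll every 100 ms instead of blocking in waitpid(). */
    struct timespec tick = { 0, 100L * 1000L * 1000L };
    for (long waited_ms = 0; waited_ms <= timeout_sec * 1000L; waited_ms += 100) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
        if (r < 0)
            return -1;
        nanosleep(&tick, NULL);
    }

    /*
     * Timed out. Ask for the child to be killed, but do not wait for it:
     * a process blocked inside the kernel may not act on the signal
     * until (and unless) the lock it is sleeping on is released.
     */
    kill(pid, SIGKILL);
    return 1;
}
```

A return value of 1 would suggest that the sync got stuck partway
through, which is about the point to stop touching the affected
filesystem and reboot. The stuck child is deliberately not reaped,
since waiting for it could block the caller as well.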
Cheers,
Stephen Tweedie.