From: as@comlab (Andrew Stevens)
Subject: Re: Is linux telling me what's wrong with my machine?
Date: 20 Jul 1992 10:02:59 GMT
In article <21969@venera.isi.edu> grande@isi.edu (Jim Grande) writes:
Well, having just decided to toss (most of) DOS and switch to Linux, I
downloaded the installation kit (based on 0.96c) that's on sc.tamu.edu.
Booting from the boot floppy worked the first couple times and I was
getting ready to repartition the hd and load all of linux. Well, I didn't
make it that far. All of a sudden I can no longer boot linux.
Let me preface this by saying that I've had some mysterious hardware
problems (intermittent screen corruption, crashes, etc) for some time,
but I've never been able to pin down the exact problem.
...
Could this be bad ram or possibly the DMA controller?
It may *just* be that you have a similar problem to one I experienced a
few months back. Basically, I had just bought a 486/33 and was fairly
new to ISA machines. My machine suffered sporadic hangs just infrequent
enough to be hard to pin down. I fiddled around for ages getting
nowhere. I suspected RAM faults, but various DOS diagnostics found nothing.
They lied. I eventually decided to bring up Linux regardless of
hardware problems (I had *bought* the machine for Linux) and found it
promptly crashed my machine. This made me wonder...
Question: what's the difference between running my machine under
DOS/Windows and Linux?
Answer: under Linux it spends *all* its time in 386 protected mode doing
32 bit memory accesses for code and data, whereas under DOS/Windows it
spends very little time that way.
I took a much closer look at the machine configuration which the manufacturer
had ``set up'' for me. Comparison with the Taiwanese-English motherboard
manual (eventually) revealed that I had been running my RAM too fast: 0
wait states rather than the 1 my el cheapo RAM demanded. Once I
switched in the extra wait states my machine was utterly reliable. The
RAM, presumably, was just marginal enough that running 32 bit accesses
occasionally pushed it over the edge. Probably the extra cross-talk
etc...
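
To put rough numbers on what one wait state buys, here is a minimal sketch in C. It assumes that a basic memory bus cycle takes a fixed number of clocks and that each wait state adds exactly one more clock; the 2-clock base cycle used in the usage note is an assumption for illustration, not a figure from the motherboard manual, and the function names are made up for this example.

```c
#include <stdio.h>

/* Length of one CPU clock in nanoseconds for a clock given in MHz. */
double clock_period_ns(double mhz)
{
    return 1000.0 / mhz;
}

/*
 * Time the memory gets to complete one bus cycle.
 * ASSUMPTION: a cycle with no wait states takes base_clocks clocks,
 * and every wait state stretches it by one additional clock.
 */
double bus_cycle_ns(double mhz, int base_clocks, int wait_states)
{
    return (base_clocks + wait_states) * clock_period_ns(mhz);
}

/* Print the cycle time for 0..max_wait wait states. */
void print_wait_state_table(double mhz, int base_clocks, int max_wait)
{
    for (int ws = 0; ws <= max_wait; ws++)
        printf("%d wait state(s): %.1f ns per cycle\n",
               ws, bus_cycle_ns(mhz, base_clocks, ws));
}
```

Calling `print_wait_state_table(33.0, 2, 1)` under those assumptions shows the cycle growing by one clock period (about 30 ns at 33 MHz) when a wait state is added, which is a lot of extra slack for RAM that is only marginally too slow.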
Anyway, the damn useless parity checking hadn't detected a thing.
(DOS turns it off anyway, but Windows -- supposedly -- turns it on).
Probably the timing was such that the parity check just made it, but the CPU's read did not.
The lessons:
1. Linux is a great diagnostic for marginal memory.
2. DOS diagnostics are cr*p.
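
For anyone wondering what "exercising memory with 32 bit accesses" looks like in practice, the following is a rough sketch in C of a pattern test: write full 32-bit words, read them back, count mismatches. The function names and the choice of patterns are assumptions made for illustration; this is not what Linux or any particular diagnostic does. Run as an ordinary program it only touches whatever buffer it is handed, and caches may hide faults, so treat it as an illustration of the idea rather than a real hardware test.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fill buf using 32-bit writes, alternating the pattern and its
 * complement on adjacent words so every data bit toggles between
 * consecutive accesses, then read everything back.
 * Returns the number of words that did not read back correctly.
 */
size_t check_pattern(volatile uint32_t *buf, size_t words, uint32_t pattern)
{
    size_t errors = 0;

    for (size_t i = 0; i < words; i++)
        buf[i] = (i & 1) ? (uint32_t)~pattern : pattern;

    for (size_t i = 0; i < words; i++) {
        uint32_t expected = (i & 1) ? (uint32_t)~pattern : pattern;
        if (buf[i] != expected)
            errors++;
    }
    return errors;
}

/* Run a few classic patterns over the buffer; returns total mismatches. */
size_t test_buffer(volatile uint32_t *buf, size_t words)
{
    /* Hypothetical pattern set: all zeros and alternating bits. */
    static const uint32_t patterns[] = { 0x00000000u, 0xAAAAAAAAu, 0x33333333u };
    size_t errors = 0;

    for (size_t p = 0; p < sizeof patterns / sizeof patterns[0]; p++)
        errors += check_pattern(buf, words, patterns[p]);
    return errors;
}
```

On healthy memory `test_buffer` returns 0; any non-zero count means a word came back different from what was written.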
My *guess* is that you have marginal RAM as I did. I'd try adding a
wait state or two on cache and main memory to see if that helps. If the
machine is still flaky, try removing SIMMs (if you have enough to run
the machine with less) or swapping them with begged / borrowed / stolen
ones to see if you can track down a bad one.
Andrew