From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
Subject: Re: 0.97p2
Date: Sat, 29 Aug 1992 17:55:10 GMT
   From: torvalds@klaava.Helsinki.FI (Linus Benedict Torvalds)
   Date: 28 Aug 92 23:05:26 GMT

   Hmm. I'd like to hear more about the problem - especially if you can
   pinpoint it more closely (ie having used 0.97, 0.97.pl1 and now pl2) to
   a specific patch. As most people said 0.97.pl1 was fast, I'm assuming
   it's specific to patch2, but I'd like to have some confirmation before I
   start looking into the problem.
Whether or not a particular version is fast probably depends on how
much memory you have. I ran a controlled series of tests, running
0.96c, 0.97, 0.97pl1, and 0.97pl2 on my 40 MHz 386 machine (with 16 meg
of memory), and noticed no appreciable difference in times:
Ver.       Time to compile the stock 0.97
           kernel after doing a "make clean"

0.96c       9:35 (*)
            9:33
            9:34
0.97       10:21 (*)
            9:41
0.97pl1    10:10 (*)
            9:45
            9:32
0.97pl2    10:36 (*)
            9:25
            9:30
           10:11 (*)
            9:41
All of these times were measured by doing (date;make;date) >& MAKELOG
and then measuring the difference between the first and second time.
The only processes running on the machine other than the
compile were the X server and a single xterm. The (*) times indicate the
first compile after a reboot; they are higher because the
buffer cache hasn't been primed yet.
So at least if you have a lot of memory, there is no appreciable
difference between 0.96c, 0.97, 0.97pl1, and 0.97pl2. If I had to make a
guess, I would guess that the problem happens on machines with less
memory (say, 4 or 8 megabytes), and I further guess that it might be
related to the buffer changes. It could very well be that the people
who said that 0.97pl1 was fast were running with a lot of memory.
   If it's patch2, the problem is probably the changed mm code: having
   different page tables for each process might be costlier than I thought.
   The old (pre-0.97.pl2) mm was very simple and efficient - TLB flushes
   happened reasonably seldom. With the new mm, the TLB gets flushed at
   every task-switch (not due to any explicit flushing code, but just
   because that's how the 386 does things when tasks have different
   cr3's).
I don't think the TLB cache flush would be much of a problem. Consider:
There are 32 entries in the TLB, and if you reference a page which is
not in the TLB, you pay a penalty of between 0 and 5 cycles. So the
maximum penalty you incur by flushing the TLB is 5 x 32, or 160 cycles. If
you further assume the worst case, that you are switching contexts on every
tick of the 100 Hz clock, then you will be flushing the TLB 100 times a
second, for a penalty of 16,000 cycles/second. On a 16 MHz machine,
there are 16 x 10**6 cycles/second. So the worst case extra time
incurred by flushing the TLB is (16 x 10**3) / (16 x 10**6) == 10**-3,
or an overhead of 0.1%. On a 40 MHz machine, this overhead declines to
0.04%.
Now, these times do assume that the page tables/directories haven't
gotten paged out to disk. Since each process must now have at least one
page directory and two page tables (one for low memory and one for the
stack segment in high memory), a 2 meg system with 8-9 processes running
is using at least 24 4k pages (96k), or roughly 10% of its user memory,
to hold the page tables/directories. This has two effects. The first is
to increase memory usage, which may increase thrashing. The second
is that if these pages get swapped out, the kernel will have to bring
them in again the moment that process starts executing again, since the
TLB cache will be empty.
   I can optimize things a bit - it's reasonably easy to fake away some of
   the TLB flushes by simply forcing the idle task to always use the same
   cr3 as the last task did (as the idle task runs only in kernel memory,
   and kernel memory is the same for all processes). So, I'd be interested
   to hear if this simple patch speeds linux up at all:
Given my back of the envelope calculations above, I doubt that
this patch speeds up Linux by any appreciable amount. And any speed
improvement will probably be eaten up by the extra time needed to do the
extra check in the scheduler. But this is only a theoretical guess;
someone should probably gather experimental evidence to make sure.
- Ted