From: Paul H. Rubin (phr@soda.berkeley.edu)
Date: 03/15/92


From: phr@soda.berkeley.edu (Paul H. Rubin)
Subject: Re: 'pklite' for Linux.
Date: 16 Mar 1992 02:28:50 GMT

In article <1992Mar16.021652.11051@colorado.edu> drew@cs.colorado.edu (Drew Eckhardt) writes:

   Disk is cheap, memory is expensive. If you can't share pages,
   you're going to lose memory, and start swapping to disk much
   sooner. I sometimes look at the memory allocation, and see 100+
   shared pages when I'm heavily loaded - that's 400K.

Actually, this is a faulty analysis. Disk is cheaper than ram, but
the program takes up memory only when it is running; it takes up disk
space whether it is running or not. If your system spends much of its
time thrashing, you don't have enough ram and you should get some
more. If you're not thrashing, why worry about a few unshared pages?

For me, a strong motivation for exec file compression is to be able
to put a Linux setup/demo system onto a single floppy. A smaller
shell would help, too, as people have been discussing.

> 3. Thousands of copies of this decompression routine, one in
> every executable, like a virus. Gross. Plus, the kernel
> is constantly loading the same decompression routine from
> disk. Wasteful.

   If it's in the kernel, there aren't "thousands of copies of this routine"

I agree that it should be in the kernel, but another alternative
is for it to be in a shared library. Then it is out of the kernel
but there aren't thousands of copies around.

> 1. Edit the kernel code for exec, to look for a new 'compressed
> executable' magic number.
> 2. Have it then decompress/load the file into ram, and
> 3. Have it then proceed as if it had just loaded a non-demand-paged
> executable.
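
The three steps quoted above can be modelled in user space. This is
only a sketch: the magic value, the names load_image and pack, and the
use of zlib as the decompressor are all assumptions made for
illustration, not anything taken from the Linux exec code.

```python
import zlib

# Hypothetical magic number marking a compressed executable (an assumption
# for this sketch; not a real Linux or a.out magic value).
COMPRESSED_MAGIC = b"\x1fCX"


def load_image(raw: bytes) -> bytes:
    """Return the in-memory image for an executable file's contents.

    Step 1: look for the 'compressed executable' magic number.
    Step 2: if present, decompress the whole file into memory.
    Step 3: hand the result on exactly as for a non-demand-paged load.
    """
    if raw.startswith(COMPRESSED_MAGIC):
        # zlib stands in for whatever decompressor the kernel would carry.
        return zlib.decompress(raw[len(COMPRESSED_MAGIC):])
    # Ordinary executable: use the bytes unchanged.
    return raw


def pack(image: bytes) -> bytes:
    """Build a compressed executable in the hypothetical format above."""
    return COMPRESSED_MAGIC + zlib.compress(image)
```

Files without the magic number pass through untouched, so old
executables would keep working under a kernel that does this.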

   Much better. You still have shared pages, etc. The only real problem
   you still have is not being able to page from the file, and text gets
   swapped like data.

This doesn't seem bad to me at all, and I see no way around it.
Currently exec won't allow this, but I think there will eventually be a
way to run executables like this. For one thing, I'd like to be able
to mount ms-dog filesystems through VFS, and those are hard to seek
around in, making paging directly from the file difficult.

>The disadvantages.
>
> 1. The kernel is getting bigger (although not much,
> assuming that the decompression routine is small).
> 2. You have to have the decompressing-kernel to run
> compressed programs.

The "uncompress" part of the Unix 'compress' utility is VERY small--
maybe 1k of code. See the famous "shark" archiver for a super-compact,
unreadable implementation.
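
To give a feel for how little code the decoding side needs, here is a
textbook LZW decoder (LZW being the algorithm 'compress' uses). It is a
simplified sketch, not the real thing: it works on a list of integer
codes and leaves out the .Z file header, the variable-width bit packing
and the table reset that the actual utility handles. The encoder is
there only so the decoder can be tried out; both function names are
made up for this example.

```python
def lzw_decompress(codes):
    """Decode a list of integer LZW codes back into bytes."""
    # The table starts with one entry per possible byte value.
    table = [bytes([i]) for i in range(256)]
    if not codes:
        return b""
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code < len(table):
            entry = table[code]
        elif code == len(table):
            # The one special case: the code names the entry being built.
            entry = prev + prev[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        # Each step adds one string: previous entry + first byte of this one.
        table.append(prev + entry[:1])
        prev = entry
    return b"".join(out)


def lzw_compress(data):
    """Matching encoder, included only so the decoder can be exercised."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)
            current = candidate[-1:]
    if current:
        codes.append(table[current])
    return codes
```

The decoder never searches for anything; it only indexes into the table
and appends to it, which is why it can be so much smaller than the
compressor.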

Actually there IS another alternative, which is perhaps not in the
spirit of Un*x but may be practical: put the decompression code in the
FILE SYSTEM (through vfs). Then we could add additional decompression
algorithms, etc., by adding vfs drivers. The FS would recognize
files with a "compressed" bit set; these files would be read-only
and sequential-access-only.
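
A rough user-space model of what such a file would look like to a
program: reads work and deliver decompressed data, while seeks and
writes are refused. The class name CompressedFile, the 512-byte chunk
size and the choice of zlib are assumptions for this sketch; a real vfs
driver would be kernel C and could use any algorithm.

```python
import io
import zlib


class CompressedFile(io.RawIOBase):
    """Read-only, sequential-access view of a zlib-compressed byte stream."""

    def __init__(self, raw):
        self._raw = raw                        # underlying compressed file
        self._inflater = zlib.decompressobj()
        self._pending = b""                    # decompressed, not yet read

    def readable(self):
        return True

    def writable(self):
        return False

    def seekable(self):
        return False

    def seek(self, offset, whence=io.SEEK_SET):
        raise io.UnsupportedOperation("compressed files are sequential-only")

    def write(self, data):
        raise io.UnsupportedOperation("compressed files are read-only")

    def readinto(self, buffer):
        # Decompress more input until there is data to return or input ends.
        while not self._pending:
            chunk = self._raw.read(512)
            if not chunk:
                self._pending = self._inflater.flush()
                break
            self._pending = self._inflater.decompress(chunk)
        count = min(len(buffer), len(self._pending))
        buffer[:count] = self._pending[:count]
        self._pending = self._pending[count:]
        return count
```

Because the decompressor keeps its state between reads, the only cheap
operation is "give me the next bytes", which is exactly the restriction
described above.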