Re: Re: Swap Compression

From: rmoser <mlmoser@comcast.net>
To: linux-kernel@vger.kernel.org
Subject: Re: Re:  Swap Compression
Date: Sun, 27 Apr 2003 16:10:07 -0400	[thread overview]
Message-ID: <200304271610070120.01FCDE97@smtp.comcast.net> (raw)
In-Reply-To: 200304271609460030.01FC8C2B@smtp.comcast.net

*********** REPLY SEPARATOR  ***********

On 4/27/2003 at 9:04 PM Jörn Engel wrote:

>On Sun, 27 April 2003 14:31:25 -0400, rmoser wrote:
>> 
>> Yeah I know.  I really need to get a working user-space version of this
>> thing so I can bench it against source tarballs and extractions from
>> /dev/random a bunch of times.
>
>Your compression won't be too good on /dev/random. ;)
>But /dev/kmem might be useful test data.
>

Ahh okay.

>> >Why should someone decide on an algorithm before measuring?
>> 
>> Erm.  You can use any one of the other algos you want, there's a lot
>> out there.  Go ahead and try zlib/gzip/bzip2/7zip/compress if you
>> want.  I just figured, the simplest algo would hopefully be the fastest.
>
>Hopefully, yes. Also, the one that doesn't trash the L[12] caches too
>much will be "fast" in that it doesn't slow down the rest of the
>system. This aspect really favors current uncompressing code, as it is
>very easy on the CPU.
>

Mine doesn't do anything :)  It's the lazy-boy of compression!  I don't know
enough to trash anything

>> But really we need some finished code to make this work :-)
>
>Yes!
>
>> The real reason I'm working on this is because I favor speed completely
>> over size in this application.  It's all modular; make the interface for
>the
>> kernel module flexible enough to push in gzip/bzip2 modules or whatever
>> else you want:
>> 
>> <M> Swap partition support
>> ____<M> Compressed Swap
>> ________<M> fcomp
>> ____________<M> fcomp-extended support
>> ________<M> zlib
>> ________<M> gzip
>> ________<M> bzip2
>> ________<M> compress
>> ____<M> Swap on RAM
>
>Exactly. It might also be possible to choose the algorithm at bootup
>or during runtime.
>

(note:  Using swaponram as an example instead of a regular swap to
show how to avoid making the user actually give a swap device name
for that particular feature; this works the same if you replace
 " -o swaponram=24MB" with a device like /dev/hda4)

Okay, again, swapon -o foo[=bar] should pass whatever foo and bar are
to the kernel, and let the kernel evaluate it.  This way you never update
swapon again ;-) Except of the usual (bugs, security fixes, etc)

swapon -o compressed=fcomp-extended -o algoflags=fcomp-mdist=1024 \
  -o swaponram=24MB

That should be completely evaluated by the kernel to mean (as long as
fcomp-extended support is enabled and loaded) to swapon compressed
with fcomp-extended with a max backpointer distance of 1024 bytes (which
is 16 bit pointers; always use the minimum), and do a swaponram of 24MB.
The kernel will look at it, as long as all modules needed are loaded it goes
"Okay", tells swapon what the name (swaponram0 ?) of the swaponram
is, and then that gets handled in userspace as a normal

swapon swaponram0

would (assuming that device exists; think shmfs), meaning

swapoff swaponram0

would turn that swaponram off.

>> As far as I know, zlib, gzip, and compress use Huffman trees.  I am
>pretty
>> sure about zlib, not sure about the others.  gzip I believe uses 16 bit
>> backpointers as well, which means you need a 64k processing window
>> for [de]compression, not to mention that it takes more work.  bzip2 we
>> all know is CPU-intensive, or at least it was last time I checked.
>
>Yes, zlib eats up several 100k of memory. You really notice this when
>you add it to a bootloader that was (once) supposed to be small. :)
>

Bick.

>> Yeah true.  But for guessing the decompressed size I meant like when
>> you don't want to load the WHOLE block into RAM at once.  Ahh, so you
>> swap in pages?  Well whatever unit you swap in, that's how you should
>> compress things.  Look I'm confusing myself here, just ignore anything
>> that sounds retarded--I'm just a kid after all :-p
>
>Will do. :)
>It should be fine to load a page (maybe several pages) at once. There
>is read-ahead code all over the kernel and this is nothing else. Plus
>it simplifies things.
>

Heh, cool.  Compressing groups of pages I think is a bad idea; you have
to waste time RECOMPRESSING when you pull a page off the disk (or
flag it free)

>> >a) Even with 4M, two pages of 4k each don't hurt that much. If they
>> >do, the whole compression trick doesn't pay off at all.
>> >b) Compression ratios usually suck with small buffers.
>> >
>> a)
>> True, two chunks of 4k don't hurt.  But how big are swap pages?  Assuming
>> the page can't be compressed at all, it's [page size / 256] * 3 + [page
>size]
>> for the maximum compressed data size.  (4144 bytes for 4k of absolutely
>> non-redundant data within 256 bytes).
>
>4k + sizeof(header). If the compression doesn't manage to shrink the
>code, it should return an error. The calling code will then put an
>uncompressed flag in the header and we're done.
>The header may be as small as one byte.
>

>> b)
>> Yeah they do.  But to find the compression performance (ratio) loss, you
>> do [max pointer distance]/[block size], meaning like for this one
>> 256/[page size].  If you do a 4k page size, you lose 6.25% of the
>compression
>> performance (so if we average 2:1, we'll average 1.875:1).  What IS the
>page
>> size the kernel uses?
>
>4k on most architectures, 8k on alpha.
>

Let's make this a swapon option at some point, i.e.

swapon -o page_size=16K /dev/hda5

>> >I didn't look that far yet. What you could do, is read through
>> >/usr/src/linux/Documentation/CodingStyle. It is just so much nicer
>> >(and faster) to read and sticking to it usually produces better code.
>> >
>> 
>> Eh, I should just crack open the kernel source and immitate it.  But I
>have
>> that file on my hard disk, so mebbe I should look.  Mebbe I should take a
>> whack at getting the compression algo to work too, instead of pushing it
>> on someone else.
>
>:)
>
>Imitating the kernel source may or may not be a good idea, btw. It is
>very diverse in style and quality. Some drivers are absolutely
>horrible, but they are just drivers, so noone without that hardware
>cares.
>

LOL!  True.

>Another thing: Did you look at the links John Bradford gave yet? It
>doesn't hurt to try something alone first, but once you have a good
>idea about what the problem is and how you would solve it, look for
>existing code.
>

Saw it.

>Most times, someone else already had the same idea and the same
>general solution. Good, less work. Sometimes you were stupid and the
>existing solution is much better. Good to know. And on very rare
>occasions, your solution is better, at least in some details.
>

It happens.

>Well, in this case, the sourceforge project appears to be silent since
>half a year or so, whatever that means.
>

It's dead I'd guess, unless someone can prove me wrong.

>Jörn
>
>-- 
>Data dominates. If you've chosen the right data structures and organized
>things well, the algorithms will almost always be self-evident. Data
>structures, not algorithms, are central to programming.
>-- Rob Pike
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/

--Bluefox Icy