* Re: better/faster kernel tarball compression
@ 2010-03-22 10:09 Tomasz Chmielewski
2010-03-22 12:00 ` Alexander Clouter
From: Tomasz Chmielewski @ 2010-03-22 10:09 UTC
To: linux-kernel; +Cc: lacos
> 403804160 linux-2.6.34-rc2.tar
> 67479563 linux-2.6.34-rc2.tar.bz2
> 58452531 linux-2.6.34-rc2.tar.lz
Speaking of file sizes, xz[1] already provides better compression:
xz -k -9 linux-2.6.34-rc2.tar
55320408 linux-2.6.34-rc2.tar.xz
xz -e -k -9 linux-2.6.34-rc2.tar
54800808 linux-2.6.34-rc2.tar.xz
One drawback of xz is that it's not multi-threaded, much like bzip2 or
gzip; would be great if it could be changed.
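
For readers who want to repeat this kind of comparison on their own data, here is a minimal sketch using Python's standard bz2 and lzma modules. It is not how the figures above were produced. The function name compressed_sizes and the sample input are hypothetical, and preset=9 | lzma.PRESET_EXTREME is assumed to be the closest equivalent of "xz -9 -e".

```python
import bz2
import lzma


def compressed_sizes(data: bytes, level: int = 9) -> dict:
    """Return the compressed size in bytes for bzip2, xz and xz extreme."""
    return {
        "bzip2": len(bz2.compress(data, compresslevel=level)),
        "xz": len(lzma.compress(data, preset=level)),
        # PRESET_EXTREME is OR-ed into the preset, like adding -e to xz.
        "xz -e": len(lzma.compress(data, preset=level | lzma.PRESET_EXTREME)),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a tarball: repetitive source-like text.
    sample = b"static int foo(struct bar *b) { return b->baz; }\n" * 20000
    print(f"{'original':10s} {len(sample):>10d} bytes")
    for name, size in compressed_sizes(sample).items():
        print(f"{name:10s} {size:>10d} bytes")
```

Everything runs in a single thread here, matching the drawback mentioned above. The extreme flag is not guaranteed to give a smaller result on every input, so the sizes are worth checking rather than assuming.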
[1] http://tukaani.org/xz/
--
Tomasz Chmielewski
http://wpkg.org
* Re: better/faster kernel tarball compression
2010-03-22 10:09 better/faster kernel tarball compression Tomasz Chmielewski
@ 2010-03-22 12:00 ` Alexander Clouter
2010-03-23 23:21 ` Ersek, Laszlo
From: Alexander Clouter @ 2010-03-22 12:00 UTC
To: linux-kernel
Tomasz Chmielewski <mangoo@wpkg.org> wrote:
>
>> 403804160 linux-2.6.34-rc2.tar
>> 67479563 linux-2.6.34-rc2.tar.bz2
>> 58452531 linux-2.6.34-rc2.tar.lz
>
> Speaking of file sizes, xz[1] already provides better compression:
>
> xz -k -9 linux-2.6.34-rc2.tar
>
> 55320408 linux-2.6.34-rc2.tar.xz
>
> xz -e -k -9 linux-2.6.34-rc2.tar
>
> 54800808 linux-2.6.34-rc2.tar.xz
>
> One drawback of xz is that it's not multi-threaded, much like bzip2 or
> gzip; would be great if it could be changed.
>
There has been a multi-threaded bzip2 called pbzip2[1] for some time;
hell even Debian has it :)
I have no idea why the original poster is trying to say how "all teh
awesome" his code is being faster, well 'duh' it is using all the cores
on $BOX rather than just a single one.
I would be interested in comparisons against pbzip2 and the amusingly
named pigz[2]...plus a bunch of memory use comparisons, my AR7 board
only has 16MB of RAM :)
Cheers
[1] http://compression.ca/pbzip2/ - supports stdio (de)compression
[2] http://www.zlib.net/pigz/ - no idea if this supports stdio
--
Alexander Clouter
.sigmonster says: Approved for veterans.
* Re: better/faster kernel tarball compression
2010-03-22 12:00 ` Alexander Clouter
@ 2010-03-23 23:21 ` Ersek, Laszlo
From: Ersek, Laszlo @ 2010-03-23 23:21 UTC
To: linux-kernel
On Mon, 22 Mar 2010, Alexander Clouter wrote:
> There has been a multi-threaded bzip2 called pbzip2[1] for some time;
> hell even Debian has it :)
>
> I have no idea why the original poster is trying to say how "all teh
> awesome" his code is being faster, well 'duh' it is using all the cores
> on $BOX rather than just a single one.
I asked explicitly to be CC'd. I saw somewhere that more than 200 messages
are posted to lkml each single day, so excuse me for not subscribing to
it. Of course nobody needs to give a damn about what I ask for, but then
it's difficult for me to respond.
You didn't quote where I said 'how "all teh awesome" my code is being
faster'.
Anyway, please download a bz2 kernel tarball, and decompress it with
pbzip2 and then with lbzip2. Eg. on a quad-core AMD or similar. Then
please re-read the last sentence of section (2) of my previous mail.
Or please check my site which I linked to previously, and look at some
test reports and their analysis on the debian-mentors mailing list (link
on my site, likewise).
I could go on about how the multiple workers decompressor of lbzip2 is
designed, or about its signal handling, but since you decided upfront that
I'm an attention whoring idiot, I won't struggle. I'll paste at the end
one such test report I generated now on a 2x2 core Opteron 275, with the
test file being "linux-2.6.33.1.tar".
I'd call lbzip2 useful especially if kernel.org continued to offer the
full tarballs in the current .bz2 format. ("I dare to recommend lbzip2 in
order to shorten both compression and decompression times for whomever
works with the .bz2 tarball.") See [0] for example.
> I would be interested in comparisons against pbzip2 and the amusingly
> named pigz[2]...plus a bunch of memory use comparisons, my AR7 board
> only has 16MB of RAM :)
lbzip2's malloc() peaks were like this during the correctness (not the
performance) test phase -- not that you'd care:
- single-threaded decompression of bzip2-compressed file: 9,955,696
- same but with four worker threads: 71,308,096
- single-threaded decompression of lbzip2-compressed file: 9,955,696
- same but with four worker threads: 65,079,712
- compression with four worker threads: 49,049,265
(lbzip2 has a compile-time configurable "backlog factor", sort of the
maximum number of buffered input chunks per thread. It is 4 per default
which I deem an okay tradeoff between memory usage and presumable,
unexpected bursts in (de)compression performance due to "inhomogeneous"
input data.)
Additionally [1],
----v----
How is it pronounced?
I'm glad you asked. It is pronounced "pig-zee". It is /not/ pronounced
like the plural of pig.
----^----
lacos
[0] http://permalink.gmane.org/gmane.linux.kernel/949924
[1] http://zlib.net/pigz/
(Note that "7za" denotes the (C language) bzip2 module of p7zip -- option
"-tbzip2" -- which is *not* based on libbz2. Furthermore, "ws" means
"workers stalled", ie. the ratio of unfruitful condvar predicate
evaluations in worker threads, each resulting in the given worker going to
sleep until the next condvar broadcast.)
+-----------------------------------------------------------------------
|Version
| |bzip2 1.0.5
| |lbzip2 0.23
| |pbzip2 1.1.0
| |7za 9.04
+-----------------------------------------------------------------------
|File size [B]
| |original 395089920
| |bzip2 66219178
| |lbzip2 66488716
| |pbzip2 66491545
| |7za 66230645
+-----------------------------------------------------------------------
|Compr. size [%]
| |lbzip2:bzip2 100.40
| |pbzip2:bzip2 100.41
| |7za:bzip2 100.01
+-----------------------------------------------------------------------
|Compr. time [s]
| +----------------------------------------------------
| |from regf
| | |bzip2 113.59
| | |lbzip2 31.52
| | |pbzip2 31.55
| | |7za 52.22
| +----------------------------------------------------
| |from pipe
| | |bzip2 116.95
| | |lbzip2 31.44
| | |pbzip2 31.31
| | |7za 49.72
+-----------------------------------------------------------------------
|Compr. speed [%]
| +----------------------------------------------------
| |from regf
| | |lbzip2:bzip2 360.37
| | |pbzip2:bzip2 360.03
| | |7za:bzip2 217.52
| +----------------------------------------------------
| |from pipe
| | |lbzip2:bzip2 371.97
| | |pbzip2:bzip2 373.52
| | |7za:bzip2 235.21
+-----------------------------------------------------------------------
|"lbzip2" ws [%]
| |from regf 1.11
| |from pipe 1.55
+-----------------------------------------------------------------------
|Decompr. time [s]
| +----------------------------------------------------
| |from regf
| | +---------------------------------------
| | |from bzip2
| | | |by bzip2 22.21
| | | |by lbzip2 8.87
| | | |by pbzip2 41.76
| | | |by 7za 14.83
| | +---------------------------------------
| | |from lbzip2
| | | |by bzip2 23.46
| | | |by lbzip2 8.84
| | | |by pbzip2 7.58
| | | |by 7za 22.81
| | +---------------------------------------
| | |from pbzip2
| | | |by bzip2 23.47
| | | |by lbzip2 8.79
| | | |by pbzip2 7.52
| | | |by 7za 24.40
| | +---------------------------------------
| | |from 7za
| | | |by bzip2 22.24
| | | |by lbzip2 8.90
| | | |by pbzip2 51.07
| | | |by 7za 14.92
| +----------------------------------------------------
| |from pipe
| | +---------------------------------------
| | |from bzip2
| | | |by bzip2 22.38
| | | |by lbzip2 8.87
| | | |by pbzip2 51.27
| | | |by 7za 0.00
| | +---------------------------------------
| | |from lbzip2
| | | |by bzip2 23.65
| | | |by lbzip2 8.78
| | | |by pbzip2 7.53
| | | |by 7za 0.00
| | +---------------------------------------
| | |from pbzip2
| | | |by bzip2 28.65
| | | |by lbzip2 8.85
| | | |by pbzip2 7.57
| | | |by 7za 0.00
| | +---------------------------------------
| | |from 7za
| | | |by bzip2 27.53
| | | |by lbzip2 8.88
| | | |by pbzip2 41.91
| | | |by 7za 0.00
+-----------------------------------------------------------------------
|Decompr. speed [%]
| +----------------------------------------------------
| |from regf
| | +---------------------------------------
| | |from bzip2
| | | |lbzip2:bzip2 250.39
| | | |pbzip2:bzip2 53.18
| | | |7za:bzip2 149.76
| | +---------------------------------------
| | |from lbzip2
| | | |lbzip2:bzip2 265.38
| | | |pbzip2:bzip2 309.49
| | | |7za:bzip2 102.84
| | +---------------------------------------
| | |from pbzip2
| | | |lbzip2:bzip2 267.00
| | | |pbzip2:bzip2 312.10
| | | |7za:bzip2 96.18
| | +---------------------------------------
| | |from 7za
| | | |lbzip2:bzip2 249.88
| | | |pbzip2:bzip2 43.54
| | | |7za:bzip2 149.06
| +----------------------------------------------------
| |from pipe
| | +---------------------------------------
| | |from bzip2
| | | |lbzip2:bzip2 252.31
| | | |pbzip2:bzip2 43.65
| | | |7za:bzip2
| | +---------------------------------------
| | |from lbzip2
| | | |lbzip2:bzip2 269.36
| | | |pbzip2:bzip2 314.07
| | | |7za:bzip2
| | +---------------------------------------
| | |from pbzip2
| | | |lbzip2:bzip2 323.72
| | | |pbzip2:bzip2 378.46
| | | |7za:bzip2
| | +---------------------------------------
| | |from 7za
| | | |lbzip2:bzip2 310.02
| | | |pbzip2:bzip2 65.68
| | | |7za:bzip2
+-----------------------------------------------------------------------
|"lbzip2 -d" ws [%]
| +----------------------------------------------------
| |from regf
| | |from bzip2 1.58
| | |from lbzip2 1.00
| | |from pbzip2 1.49
| | |from 7za 1.23
| +----------------------------------------------------
| |from pipe
| | |from bzip2 1.58
| | |from lbzip2 1.16
| | |from pbzip2 .83
| | |from 7za 1.58
(Report ends.)
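
The "speed [%]" rows in the report appear to be the bzip2 time divided by the other tool's time, times 100. That is an inference from the numbers, not something the report states. A small sketch that recomputes the "from regf" compression rows under that assumption (speed_percent is a hypothetical helper name):

```python
def speed_percent(reference_seconds: float, tool_seconds: float) -> float:
    """Speed of a tool relative to the reference, in percent (higher is faster)."""
    return 100.0 * reference_seconds / tool_seconds


# Compression times in seconds, from regular file, copied from the report.
compr_regf = {"bzip2": 113.59, "lbzip2": 31.52, "pbzip2": 31.55, "7za": 52.22}

for tool in ("lbzip2", "pbzip2", "7za"):
    pct = speed_percent(compr_regf["bzip2"], compr_regf[tool])
    print(f"{tool}:bzip2 {pct:7.2f}")
```

Some rows in the report differ from this recomputation in the last digit, presumably because the report was generated from unrounded timings.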
* better/faster kernel tarball compression
@ 2010-03-21 21:27 Ersek, Laszlo
From: Ersek, Laszlo @ 2010-03-21 21:27 UTC
To: linux-kernel; +Cc: Antonio Diaz Diaz
Dear lkml Reader,
please allow me to spam you a bit with two compression programs.
I just downloaded the Linux 2.6.34-rc2 tarball:
5d8a6005280e54cd6e590916c9d7a900 linux-2.6.34-rc2.tar
570da63bf2c0c2e199f4a5616c15f52b linux-2.6.34-rc2.tar.bz2
403804160 linux-2.6.34-rc2.tar
67479563 linux-2.6.34-rc2.tar.bz2
I'd like to recommend two programs to compress the tarball. Allow me to
list mostly the PRO arguments, as I'm sure you have the CON arguments
ready.
(1) The program I recommend primarily is "plzip" [0]. Since kernel.org's
energy consumption and upload costs must surely be staggering, you'll be
delighted to know that the lzlib library compresses much better than the
bzip2 library. Decompression is very fast. The lzip program [1] -- being
the natural choice for decompression -- is very widely available (among
others, in GNU/Linux distributions).
Now one counter-argument might be that lzip compresses much more slowly
than bzip2. Obviously, Linus (or his trustee) has to compress the tarball
only once, but users download and decompress the tarball thousands of
times. Still, this alone would *not* suffice for me to spam you. I wish to
make you aware of plzip, which is a parallel (multi-threaded) version of
lzip. I figure Linus (or his trustee) couldn't care less if compression
suddenly started to take eg. four times as long for him (or him/her).
However, with plzip one can compress the tarball *both* faster and more
efficiently, given enough cores.
Here's the thing. I recompressed the uncompressed tarball with bzip2, and
then with plzip, using 16 worker threads. Note that the platform and
kernel are a Sun Fire E25K and a Solaris 9. This should not deter you from
trying it yourself, as my only reason not to execute this test on a
GNU/Linux box is that I have no access to any Linux box with 16 cores. All
tested binaries are 32bit (although all sources are 64bit-clean).
Command being timed: "bzip2 --keep linux-2.6.34-rc2.tar"
User time (seconds): 130.82
System time (seconds): 1.68
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:12.51
Command being timed: "plzip.32 --threads=16 --keep linux-2.6.34-rc2.tar"
User time (seconds): 1009.95
System time (seconds): 13.55
Percent of CPU this job got: 1145%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:29.33
403804160 linux-2.6.34-rc2.tar
67479563 linux-2.6.34-rc2.tar.bz2
58452531 linux-2.6.34-rc2.tar.lz
About 13% space was saved with plzip's default compression level (-6)
against bzip2's best compression level (-9), and about 32% wall clock time
was saved.
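
The percentages can be rechecked from the figures quoted above. This is only a sketch of the arithmetic. The helper names are hypothetical, and the CPU figure assumes the timing tool computes (user + system) / elapsed.

```python
def wall_seconds(elapsed: str) -> float:
    """Convert an 'm:ss.ss' or 'h:mm:ss' elapsed-time field to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def saved_percent(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


def cpu_percent(user: float, system: float, wall: float) -> float:
    """CPU utilisation as (user + system) / wall clock, in percent."""
    return 100.0 * (user + system) / wall


# Figures copied from the measurements quoted above.
bz2_size, lz_size = 67479563, 58452531
bzip2_wall = wall_seconds("2:12.51")
plzip_wall = wall_seconds("1:29.33")

print(f"space saved:      {saved_percent(bz2_size, lz_size):.1f}%")
print(f"wall clock saved: {saved_percent(bzip2_wall, plzip_wall):.1f}%")
print(f"plzip CPU share:  {cpu_percent(1009.95, 13.55, plzip_wall):.2f}%")
```

The CPU share comes out just under 1146%, which matches the reported 1145% if the tool truncates rather than rounds. In other words, on average between 11 and 12 of the 16 worker threads were busy.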
Decompression times to /dev/null follow. The .tar.lz file was decompressed
with the single-threaded "minilzip" utility coming with lzlib. I also
verified, in a separate test, that the .tar.lz file decompresses back to
the original tarball (sanity check).
Command being timed: "bzip2 -dc linux-2.6.34-rc2.tar.bz2"
User time (seconds): 31.99
System time (seconds): 0.35
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.35
Command being timed: "minilzip.32 -dc linux-2.6.34-rc2.tar.lz"
User time (seconds): 16.18
System time (seconds): 0.23
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.43
Hence users would benefit not only from the smaller download, but the
faster decompression too. (plzip supports multi-threaded decompression as
well, but I didn't measure it for now.) AFAICT, multiple GNU/Linux
distributions are considering an lzip-compressed package format, too.
Let me cite H. Peter Anvin's mail from Sep 21, 2006 [2]:
----v----
I have been holding out on implementing LZMA on kernel.org, because just
as zip (deflate) didn't become common in the Unix world until an
encapsulation format that handles things expected in the Unix world, e.g.
streaming, was created (gzip), I don't think LZMA is going to be widely
used until there is an "lzip" which does the same thing. I actually
started the work of adding LZMA support to gzip, but then realized it
would be better if a new encapsulation format with proper 64-bit support
everywhere was created.
----^----
In reflection on the followups in said thread, please note that the file
format is very simple, 64bit-clean and CRC-protected [3]. For streaming
properties, see section (3) below.
(2) The program I recommend secondarily, *only* in case the kernel.org
admins are determined to stick with .bz2, is "lbzip2" [4]. I'll mention
one drawback up-front (which I consider irrelevant, truth be told): the
compressed output looks like the concatenation of many bzip2 outputs. This
is irrelevant for bunzip2, since the compressed output is still a
perfectly valid bz2 file. Programs decompressing such files with libbz2
will see multiple end-of-bzip2-stream conditions, however. I dare to
recommend lbzip2 in order to shorten both compression and decompression
times for whomever works with the .bz2 tarball. (Though see my disclaimer
at the end.)
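
To illustrate the multi-stream point, here is a minimal sketch using Python's bz2 module, which wraps libbz2. It is not lbzip2 code. The two small streams stand in for lbzip2's output, and decompress_all_streams is a hypothetical helper. A single decompressor object stops at the first end-of-stream and leaves the remainder untouched, so the caller has to start again on the leftover bytes.

```python
import bz2


def decompress_all_streams(blob: bytes) -> bytes:
    """Decompress a concatenation of bzip2 streams, one stream at a time."""
    pieces = []
    while blob:
        decomp = bz2.BZ2Decompressor()
        pieces.append(decomp.decompress(blob))
        if not decomp.eof:
            raise ValueError("truncated bzip2 stream")
        # Bytes after the end-of-stream marker belong to the next stream.
        blob = decomp.unused_data
    return b"".join(pieces)


# Two independent streams, concatenated like a multi-stream .bz2 file.
part1 = bz2.compress(b"first chunk\n")
part2 = bz2.compress(b"second chunk\n")
blob = part1 + part2

# One decompressor object only sees the first stream.
single = bz2.BZ2Decompressor()
first_only = single.decompress(blob)
print(first_only, single.eof, len(single.unused_data))

# Restarting on the leftover data recovers everything.
print(decompress_all_streams(blob))
```

The module-level bz2.decompress() performs this restart loop internally (since Python 3.3), which is the same reason bunzip2 handles such files without complaint.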
Compression times (32 bit binaries, 16 worker threads; re-pasting the
(single-threaded) bzip2 result from above, and moving the downloaded
.tar.bz2 under a subdirectory called "orig" before starting lbzip2):
Command being timed: "bzip2 --keep linux-2.6.34-rc2.tar"
User time (seconds): 130.82
System time (seconds): 1.68
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:12.51
Command being timed: "lbzip2 -n 16 --keep linux-2.6.34-rc2.tar"
User time (seconds): 144.08
System time (seconds): 2.86
Percent of CPU this job got: 1405%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.45
Sizes:
403804160 linux-2.6.34-rc2.tar
67479563 orig/linux-2.6.34-rc2.tar.bz2
67691446 linux-2.6.34-rc2.tar.bz2
For less than half a percent size sacrifice, we saved 92% wall clock time.
Both bzip2 and lbzip2 decompress both archives back to the original
tarball (sanity check). Decompression times to /dev/null:
Command being timed: "bzip2 -dc orig/linux-2.6.34-rc2.tar.bz2"
User time (seconds): 29.81
System time (seconds): 0.29
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:30.12
Command being timed: "bzip2 -dc linux-2.6.34-rc2.tar.bz2"
User time (seconds): 31.57
System time (seconds): 0.40
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:31.97
Command being timed: "lbzip2 -n 16 -dc orig/linux-2.6.34-rc2.tar.bz2"
User time (seconds): 54.18
System time (seconds): 2.37
Percent of CPU this job got: 1259%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.48
Command being timed: "lbzip2 -n 16 -dc linux-2.6.34-rc2.tar.bz2"
User time (seconds): 53.62
System time (seconds): 1.93
Percent of CPU this job got: 1349%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.11
(Note that in the third case, lbzip2 parallelizes the decompression of the
downloaded (single-stream) .tar.bz2 file too.)
(3) Both plzip and lbzip2 parallelize both compression and decompression
from non-seekable input to non-seekable output (eg. pipes and SOCK_STREAM
sockets). Additionally, they strive to follow the Utility Syntax
Guidelines laid down in The Single UNIX(R) Specification, Version 2 [5].
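
The general idea of parallel compression from a non-seekable stream can be sketched as follows. This is an illustration only, not the design of lbzip2 or plzip. The chunk size, the function names and the use of a thread pool are all assumptions, and the "backlog" parameter merely imitates the notion of a bounded number of buffered chunks per worker. Each chunk becomes its own bzip2 stream and the streams are written out in input order.

```python
import bz2
from collections import deque
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 900_000  # hypothetical chunk size in bytes


def read_chunks(stream, size):
    """Yield successive chunks from a stream that only supports read()."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            return
        yield chunk


def parallel_bzip2(src, dst, workers=4, backlog=4, chunk_size=CHUNK_SIZE):
    """Compress src to dst as concatenated bzip2 streams, in input order."""
    limit = workers * backlog  # upper bound on chunks held in memory
    pending = deque()
    submitted = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in read_chunks(src, chunk_size):
            pending.append(pool.submit(bz2.compress, chunk))
            submitted += 1
            if len(pending) >= limit:
                # Wait for the oldest chunk so the output stays ordered.
                dst.write(pending.popleft().result())
        while pending:
            dst.write(pending.popleft().result())
    if submitted == 0:
        # Empty input still needs one valid (empty) bzip2 stream.
        dst.write(bz2.compress(b""))
```

Only read() and write() are used, so the same function works on pipes, for example parallel_bzip2(sys.stdin.buffer, sys.stdout.buffer). The output is a multi-stream file of the kind described in section (2).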
+----------+
|DISCLAIMER|
+----------+
- I am the author of lbzip2. Therefore this mail qualifies as shameless
self-promotion. I've got no problem with that; a rightful public
humiliation will do me only good. I hope the subject pertains well enough
to the payload so that nobody is lured into reading the mail spuriously.
- Originally, I forked plzip from lbzip2 under a different name ("llzip").
From the start, it was based on lzlib, written by Antonio Diaz Diaz. (Just
as lbzip2 is based on Julian Seward's libbz2. I'm not throwing around
these names to gain credibility, I'm rather trying to give credit.)
Shortly after the fork, Antonio Diaz Diaz took over llzip's
maintenance as planned, and renamed it to plzip, much more fittingly. He
has in effect completely rewritten it since then. He knew nothing of this
email beforehand. The blame is entirely mine. Still, I'm convinced people
would benefit if the kernel tarball switched to .lz compression.
- The quoted measurements were done on the "regina" supercomputer node of
the NIIFI [6]. For a scaling test somewhat related to the ones listed
above, see [7]. I'm currently preparing to repeat those tests with plzip.
(Disclaimer ends.)
Thank you very much for considering, and I apologize for being off-topic,
Laszlo Ersek
PS. As permitted by the lkml FAQ 3.3, I'm not subscribed to the list.
Please keep me CC'd (and also poor victim Antonio). Thanks.
[0] http://www.nongnu.org/lzip/plzip.html
[1] http://www.nongnu.org/lzip/lzip.html
[2] http://lkml.indiana.edu/hypermail/linux/kernel/0609.2/1598.html
[3] http://www.nongnu.org/lzip/manual/plzip_manual.html#File-Format
[4] http://lacos.hu/
[5] http://www.opengroup.org/onlinepubs/007908799/xbd/utilconv.html#tag_009_002
[6] http://www.niif.hu/en/niif_institute/supercomputing_service
[7] http://lacos.hu/lbzip2-scaling/scaling.html