* Re: better/faster kernel tarball compression
@ 2010-03-22 10:09 Tomasz Chmielewski
  2010-03-22 12:00 ` Alexander Clouter
  0 siblings, 1 reply; 4+ messages in thread
From: Tomasz Chmielewski @ 2010-03-22 10:09 UTC
  To: linux-kernel; +Cc: lacos

> 403804160 linux-2.6.34-rc2.tar
>   67479563 linux-2.6.34-rc2.tar.bz2
>   58452531 linux-2.6.34-rc2.tar.lz

Speaking of file sizes, xz[1] already provides better compression:

xz -k -9 linux-2.6.34-rc2.tar

     55320408 linux-2.6.34-rc2.tar.xz

xz -e -k -9 linux-2.6.34-rc2.tar

     54800808 linux-2.6.34-rc2.tar.xz

One drawback of xz is that it's not multi-threaded, much like bzip2 or 
gzip; would be great if it could be changed.

[1] http://tukaani.org/xz/


-- 
Tomasz Chmielewski
http://wpkg.org



* Re: better/faster kernel tarball compression
  2010-03-22 10:09 better/faster kernel tarball compression Tomasz Chmielewski
@ 2010-03-22 12:00 ` Alexander Clouter
  2010-03-23 23:21   ` Ersek, Laszlo
  0 siblings, 1 reply; 4+ messages in thread
From: Alexander Clouter @ 2010-03-22 12:00 UTC
  To: linux-kernel

Tomasz Chmielewski <mangoo@wpkg.org> wrote:
>
>> 403804160 linux-2.6.34-rc2.tar
>>   67479563 linux-2.6.34-rc2.tar.bz2
>>   58452531 linux-2.6.34-rc2.tar.lz
> 
> Speaking of file sizes, xz[1] already provides better compression:
> 
> xz -k -9 linux-2.6.34-rc2.tar
> 
>     55320408 linux-2.6.34-rc2.tar.xz
> 
> xz -e -k -9 linux-2.6.34-rc2.tar
> 
>     54800808 linux-2.6.34-rc2.tar.xz
> 
> One drawback of xz is that it's not multi-threaded, much like bzip2 or 
> gzip; would be great if it could be changed.
> 
For some time there has been a multi-threaded bzip2 called pbzip2[1]; 
hell, even Debian has it :)

I have no idea why the original poster is trying to say how "all teh 
awesome" his code is being faster, well 'duh' it is using all the cores 
on $BOX rather than just a single one.

I would be interested in comparisons against pbzip2 and the amusingly 
named pigz[2]...plus a bunch of memory use comparisons, my AR7 board 
only has 16MB of RAM :)

Cheers

[1] http://compression.ca/pbzip2/ - supports stdio (de)compression
[2] http://www.zlib.net/pigz/ - no idea if this supports stdio

-- 
Alexander Clouter
.sigmonster says: Approved for veterans.



* Re: better/faster kernel tarball compression
  2010-03-22 12:00 ` Alexander Clouter
@ 2010-03-23 23:21   ` Ersek, Laszlo
  0 siblings, 0 replies; 4+ messages in thread
From: Ersek, Laszlo @ 2010-03-23 23:21 UTC
  To: linux-kernel

On Mon, 22 Mar 2010, Alexander Clouter wrote:

> For some time there has been a multi-threaded bzip2 called pbzip2[1]; 
> hell, even Debian has it :)
>
> I have no idea why the original poster is trying to say how "all teh 
> awesome" his code is being faster, well 'duh' it is using all the cores 
> on $BOX rather than just a single one.

I asked explicitly to be CC'd. I saw somewhere that more than 200 messages 
are posted to lkml each single day, so excuse me for not subscribing to 
it. Of course nobody needs to give a damn about what I ask for, but then 
it's difficult for me to respond.

You didn't quote where I said 'how "all teh awesome" my code is being 
faster'.

Anyway, please download a bz2 kernel tarball, and decompress it with 
pbzip2 and then with lbzip2. Eg. on a quad-core AMD or similar. Then 
please re-read the last sentence of section (2) of my previous mail.

Or please check my site which I linked to previously, and look at some 
test reports and their analysis on the debian-mentors mailing list (link 
on my site, likewise).

I could go on about how the multiple workers decompressor of lbzip2 is 
designed, or about its signal handling, but since you decided upfront that 
I'm an attention whoring idiot, I won't struggle. I'll paste at the end 
one such test report I generated now on a 2x2 core Opteron 275, with the 
test file being "linux-2.6.33.1.tar".

I'd call lbzip2 useful especially if kernel.org continued to offer the 
full tarballs in the current .bz2 format. ("I dare to recommend lbzip2 in 
order to shorten both compression and decompression times for whoever 
works with the .bz2 tarball.") See [0] for example.


> I would be interested in comparisons against pbzip2 and the amusingly
> named pigz[2]...plus a bunch of memory use comparisons, my AR7 board
> only has 16MB of RAM :)

lbzip2's malloc() peaks were like this during the correctness (not the 
performance) test phase -- not that you'd care:

- single-threaded decompression of bzip2-compressed file:   9,955,696
- same but with four worker threads:                       71,308,096
- single-threaded decompression of lbzip2-compressed file:  9,955,696
- same but with four worker threads:                       65,079,712
- compression with four worker threads:                    49,049,265

(lbzip2 has a compile-time configurable "backlog factor", roughly the 
maximum number of buffered input chunks per thread. It is 4 by default, 
which I deem an okay tradeoff between memory usage and the ability to 
absorb unexpected bursts in (de)compression performance due to 
"inhomogeneous" input data.)
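
To make the backlog idea concrete, here is a minimal Python sketch. It is 
an illustration under stated assumptions, not lbzip2's actual code: the 
names WORKERS, BACKLOG_FACTOR, CHUNK_SIZE and run_pipeline are 
hypothetical, and a bounded queue stands in for whatever buffering lbzip2 
really does. The point is only that the reader stalls once 
workers * backlog chunks are waiting, which caps the buffered input.

```python
import bz2
import queue
import threading

WORKERS = 4            # hypothetical worker count
BACKLOG_FACTOR = 4     # buffered input chunks allowed per worker
CHUNK_SIZE = 1 << 16   # hypothetical chunk size in bytes


def run_pipeline(data, workers=WORKERS, backlog=BACKLOG_FACTOR,
                 chunk_size=CHUNK_SIZE):
    """Compress data chunk by chunk with a bounded input backlog."""
    # put() blocks once workers * backlog items are queued, so buffered
    # input stays near workers * backlog * chunk_size bytes.
    todo = queue.Queue(maxsize=workers * backlog)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = todo.get()
            if item is None:          # sentinel: no more work
                return
            index, chunk = item
            compressed = bz2.compress(chunk)
            with lock:
                results[index] = compressed

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()

    count = 0
    for offset in range(0, len(data), chunk_size):
        todo.put((count, data[offset:offset + chunk_size]))
        count += 1
    for _ in threads:
        todo.put(None)                # one sentinel per worker
    for thread in threads:
        thread.join()

    # Reassemble in input order. Output buffering is unbounded in this
    # sketch; a real tool would have to bound that side as well.
    return b"".join(results[i] for i in range(count))


sample = b"lkml " * 100_000
packed = run_pipeline(sample)
```

A larger backlog lets the reader run further ahead of slow workers at the 
cost of more memory; a smaller one does the opposite.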


Additionally [1],

----v----
How is it pronounced?

I'm glad you asked. It is pronounced "pig-zee". It is /not/ pronounced 
like the plural of pig.
----^----

lacos

[0] http://permalink.gmane.org/gmane.linux.kernel/949924
[1] http://zlib.net/pigz/


(Note that "7za" denotes the (C language) bzip2 module of p7zip -- option 
"-tbzip2" -- which is *not* based on libbz2. Furthermore, "ws" means 
"workers stalled", ie. the ratio of unfruitful condvar predicate 
evaluations in worker threads, each resulting in the given worker going to 
sleep until the next condvar broadcast.)

+-----------------------------------------------------------------------
|Version
|                  |bzip2                                          1.0.5
|                  |lbzip2                                          0.23
|                  |pbzip2                                         1.1.0
|                  |7za                                             9.04
+-----------------------------------------------------------------------
|File size [B]
|                  |original                                   395089920
|                  |bzip2                                       66219178
|                  |lbzip2                                      66488716
|                  |pbzip2                                      66491545
|                  |7za                                         66230645
+-----------------------------------------------------------------------
|Compr. size [%]
|                  |lbzip2:bzip2                                  100.40
|                  |pbzip2:bzip2                                  100.41
|                  |7za:bzip2                                     100.01
+-----------------------------------------------------------------------
|Compr. time [s]
|                  +----------------------------------------------------
|                  |from regf
|                  |            |bzip2                            113.59
|                  |            |lbzip2                            31.52
|                  |            |pbzip2                            31.55
|                  |            |7za                               52.22
|                  +----------------------------------------------------
|                  |from pipe
|                  |            |bzip2                            116.95
|                  |            |lbzip2                            31.44
|                  |            |pbzip2                            31.31
|                  |            |7za                               49.72
+-----------------------------------------------------------------------
|Compr. speed [%]
|                  +----------------------------------------------------
|                  |from regf
|                  |            |lbzip2:bzip2                     360.37
|                  |            |pbzip2:bzip2                     360.03
|                  |            |7za:bzip2                        217.52
|                  +----------------------------------------------------
|                  |from pipe
|                  |            |lbzip2:bzip2                     371.97
|                  |            |pbzip2:bzip2                     373.52
|                  |            |7za:bzip2                        235.21
+-----------------------------------------------------------------------
|"lbzip2" ws [%]
|                  |from regf                                       1.11
|                  |from pipe                                       1.55
+-----------------------------------------------------------------------
|Decompr. time [s]
|                  +----------------------------------------------------
|                  |from regf
|                  |            +---------------------------------------
|                  |            |from bzip2
|                  |            |            |by bzip2             22.21
|                  |            |            |by lbzip2             8.87
|                  |            |            |by pbzip2            41.76
|                  |            |            |by 7za               14.83
|                  |            +---------------------------------------
|                  |            |from lbzip2
|                  |            |            |by bzip2             23.46
|                  |            |            |by lbzip2             8.84
|                  |            |            |by pbzip2             7.58
|                  |            |            |by 7za               22.81
|                  |            +---------------------------------------
|                  |            |from pbzip2
|                  |            |            |by bzip2             23.47
|                  |            |            |by lbzip2             8.79
|                  |            |            |by pbzip2             7.52
|                  |            |            |by 7za               24.40
|                  |            +---------------------------------------
|                  |            |from 7za
|                  |            |            |by bzip2             22.24
|                  |            |            |by lbzip2             8.90
|                  |            |            |by pbzip2            51.07
|                  |            |            |by 7za               14.92
|                  +----------------------------------------------------
|                  |from pipe
|                  |            +---------------------------------------
|                  |            |from bzip2
|                  |            |            |by bzip2             22.38
|                  |            |            |by lbzip2             8.87
|                  |            |            |by pbzip2            51.27
|                  |            |            |by 7za                0.00
|                  |            +---------------------------------------
|                  |            |from lbzip2
|                  |            |            |by bzip2             23.65
|                  |            |            |by lbzip2             8.78
|                  |            |            |by pbzip2             7.53
|                  |            |            |by 7za                0.00
|                  |            +---------------------------------------
|                  |            |from pbzip2
|                  |            |            |by bzip2             28.65
|                  |            |            |by lbzip2             8.85
|                  |            |            |by pbzip2             7.57
|                  |            |            |by 7za                0.00
|                  |            +---------------------------------------
|                  |            |from 7za
|                  |            |            |by bzip2             27.53
|                  |            |            |by lbzip2             8.88
|                  |            |            |by pbzip2            41.91
|                  |            |            |by 7za                0.00
+-----------------------------------------------------------------------
|Decompr. speed [%]
|                  +----------------------------------------------------
|                  |from regf
|                  |            +---------------------------------------
|                  |            |from bzip2
|                  |            |            |lbzip2:bzip2        250.39
|                  |            |            |pbzip2:bzip2         53.18
|                  |            |            |7za:bzip2           149.76
|                  |            +---------------------------------------
|                  |            |from lbzip2
|                  |            |            |lbzip2:bzip2        265.38
|                  |            |            |pbzip2:bzip2        309.49
|                  |            |            |7za:bzip2           102.84
|                  |            +---------------------------------------
|                  |            |from pbzip2
|                  |            |            |lbzip2:bzip2        267.00
|                  |            |            |pbzip2:bzip2        312.10
|                  |            |            |7za:bzip2            96.18
|                  |            +---------------------------------------
|                  |            |from 7za
|                  |            |            |lbzip2:bzip2        249.88
|                  |            |            |pbzip2:bzip2         43.54
|                  |            |            |7za:bzip2           149.06
|                  +----------------------------------------------------
|                  |from pipe
|                  |            +---------------------------------------
|                  |            |from bzip2
|                  |            |            |lbzip2:bzip2        252.31
|                  |            |            |pbzip2:bzip2         43.65
|                  |            |            |7za:bzip2
|                  |            +---------------------------------------
|                  |            |from lbzip2
|                  |            |            |lbzip2:bzip2        269.36
|                  |            |            |pbzip2:bzip2        314.07
|                  |            |            |7za:bzip2
|                  |            +---------------------------------------
|                  |            |from pbzip2
|                  |            |            |lbzip2:bzip2        323.72
|                  |            |            |pbzip2:bzip2        378.46
|                  |            |            |7za:bzip2
|                  |            +---------------------------------------
|                  |            |from 7za
|                  |            |            |lbzip2:bzip2        310.02
|                  |            |            |pbzip2:bzip2         65.68
|                  |            |            |7za:bzip2
+-----------------------------------------------------------------------
|"lbzip2 -d" ws [%]
|                  +----------------------------------------------------
|                  |from regf
|                  |            |from bzip2                         1.58
|                  |            |from lbzip2                        1.00
|                  |            |from pbzip2                        1.49
|                  |            |from 7za                           1.23
|                  +----------------------------------------------------
|                  |from pipe
|                  |            |from bzip2                         1.58
|                  |            |from lbzip2                        1.16
|                  |            |from pbzip2                        0.83
|                  |            |from 7za                           1.58

(Report ends.)
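
As far as I can tell from the figures, the "speed [%]" rows are simply the 
bzip2 time divided by the tool's time. The following Python sketch 
recomputes a few of them from the "from regf" rows; the dictionary and 
function names are hypothetical, and the formula is my reading of the 
table, not something stated in the report.

```python
# Wall clock times in seconds, copied from the "from regf" rows above.
COMPR_TIME = {"bzip2": 113.59, "lbzip2": 31.52, "pbzip2": 31.55, "7za": 52.22}
DECOMPR_TIME_FROM_BZIP2 = {
    "bzip2": 22.21, "lbzip2": 8.87, "pbzip2": 41.76, "7za": 14.83,
}


def speed_percent(times, tool, baseline="bzip2"):
    """Speed of tool relative to baseline in percent (above 100 is faster)."""
    return 100.0 * times[baseline] / times[tool]


for tool in ("lbzip2", "pbzip2", "7za"):
    print(f"compr   {tool}:bzip2 {speed_percent(COMPR_TIME, tool):7.2f}")
    print(f"decompr {tool}:bzip2 "
          f"{speed_percent(DECOMPR_TIME_FROM_BZIP2, tool):7.2f}")
```

Read this way, a value below 100 (pbzip2 decompressing a single-stream 
bzip2 file) means the tool was slower than plain bzip2 on that input.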


* better/faster kernel tarball compression
@ 2010-03-21 21:27 Ersek, Laszlo
  0 siblings, 0 replies; 4+ messages in thread
From: Ersek, Laszlo @ 2010-03-21 21:27 UTC
  To: linux-kernel; +Cc: Antonio Diaz Diaz

Dear lkml Reader,

please allow me to spam you a bit with two compression programs.

I just downloaded the Linux 2.6.34-rc2 tarball:

5d8a6005280e54cd6e590916c9d7a900  linux-2.6.34-rc2.tar
570da63bf2c0c2e199f4a5616c15f52b  linux-2.6.34-rc2.tar.bz2

403804160 linux-2.6.34-rc2.tar
  67479563 linux-2.6.34-rc2.tar.bz2

I'd like to recommend two programs to compress the tarball. Allow me to 
list mostly the PRO arguments, as I'm sure you have the CON arguments 
ready.


(1) The program I recommend primarily is "plzip" [0]. Since kernel.org's 
energy consumption and upload costs must surely be staggering, you'll be 
delighted to know that the lzlib library compresses much better than the 
bzip2 library. Decompression is very fast. The lzip program [1] -- being 
the natural choice for decompression -- is very widely available (among 
others, in GNU/Linux distributions).

Now one counter-argument might be that lzip compresses much more slowly 
than bzip2. Obviously, Linus (or his trustee) has to compress the tarball 
only once, but users download and decompress the tarball thousands of 
times. Still, this alone would *not* suffice for me to spam you. I wish to 
make you aware of plzip, which is a parallel (multi-threaded) version of 
lzip. I figure Linus (or his trustee) couldn't care less if compression 
suddenly started to take eg. four times as long for him (or for the trustee). 
However, with plzip one can compress the tarball *both* faster and more 
efficiently, given enough cores.

Here's the thing. I recompressed the uncompressed tarball with bzip2, and 
then with plzip, using 16 worker threads. Note that the platform and 
kernel are a Sun Fire E25K and a Solaris 9. This should not deter you from 
trying it yourself, as my only reason not to execute this test on a 
GNU/Linux box is that I have no access to any Linux box with 16 cores. All 
tested binaries are 32bit (although all sources are 64bit-clean).

         Command being timed: "bzip2 --keep linux-2.6.34-rc2.tar"
         User time (seconds): 130.82
         System time (seconds): 1.68
         Percent of CPU this job got: 99%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 2:12.51

         Command being timed: "plzip.32 --threads=16 --keep linux-2.6.34-rc2.tar"
         User time (seconds): 1009.95
         System time (seconds): 13.55
         Percent of CPU this job got: 1145%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 1:29.33

403804160 linux-2.6.34-rc2.tar
  67479563 linux-2.6.34-rc2.tar.bz2
  58452531 linux-2.6.34-rc2.tar.lz

About 13% space was saved with plzip's default compression level (-6) 
against bzip2's best compression level (-9), and about 32% wall clock time 
was saved.
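
For anyone who wants to check the two percentages, a short Python sketch 
follows. The constant and function names are hypothetical; the sizes and 
wall clock times are the ones quoted above.

```python
# Sizes in bytes and wall clock times in seconds, copied from above.
BZ2_SIZE = 67479563
LZ_SIZE = 58452531
BZ2_WALL = 2 * 60 + 12.51   # 2:12.51
LZ_WALL = 1 * 60 + 29.33    # 1:29.33


def saved_percent(baseline, candidate):
    """Return how much smaller candidate is than baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


space_saved = saved_percent(BZ2_SIZE, LZ_SIZE)
time_saved = saved_percent(BZ2_WALL, LZ_WALL)
print(f"space saved:           {space_saved:.1f}%")
print(f"wall clock time saved: {time_saved:.1f}%")
```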

Decompression times to /dev/null follow. The .tar.lz file was decompressed 
with the single-threaded "minilzip" utility coming with lzlib. I also 
verified, in a separate test, that the .tar.lz file decompresses back to 
the original tarball (sanity check).

         Command being timed: "bzip2 -dc linux-2.6.34-rc2.tar.bz2"
         User time (seconds): 31.99
         System time (seconds): 0.35
         Percent of CPU this job got: 99%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.35

         Command being timed: "minilzip.32 -dc linux-2.6.34-rc2.tar.lz"
         User time (seconds): 16.18
         System time (seconds): 0.23
         Percent of CPU this job got: 99%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.43

Hence users would benefit not only from the smaller download, but the 
faster decompression too. (plzip supports multi-threaded decompression as 
well, but I didn't measure it for now.) AFAICT, multiple GNU/Linux 
distributions are considering an lzip-compressed package format, too.

Let me cite H. Peter Anvin's mail from Sep 21, 2006 [2]:

----v----
I have been holding out on implementing LZMA on kernel.org, because just 
as zip (deflate) didn't become common in the Unix world until an 
encapsulation format that handles things expected in the Unix world, e.g. 
streaming, was created (gzip), I don't think LZMA is going to be widely 
used until there is an "lzip" which does the same thing. I actually 
started the work of adding LZMA support to gzip, but then realized it 
would be better if a new encapsulation format with proper 64-bit support 
everywhere was created.
----^----

In reflection on the followups in said thread, please note that the file 
format is very simple, 64bit-clean and CRC-protected [3]. For streaming 
properties, see section (3) below.


(2) The program I recommend secondarily, *only* in case the kernel.org 
admins are determined to stick with .bz2, is "lbzip2" [4]. I'll mention 
one drawback up-front (which I consider irrelevant, truth be told): the 
compressed output looks like the concatenation of many bzip2 outputs. This 
is irrelevant for bunzip2, since the compressed output is still a 
perfectly valid bz2 file. Programs decompressing such files with libbz2 
will see multiple end-of-bzip2-stream conditions, however. I dare to 
recommend lbzip2 in order to shorten both compression and decompression 
times for whoever works with the .bz2 tarball. (Though see my disclaimer 
at the end.)
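
The multi-stream point can be illustrated with Python's standard bz2 
module. This is a sketch of the general technique only (compress chunks 
independently, concatenate the results), not a description of how lbzip2 
splits its input; CHUNK_SIZE and all function names are hypothetical.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 900_000  # hypothetical chunk size in bytes


def compress_in_chunks(data, workers=4, chunk_size=CHUNK_SIZE):
    """Compress each chunk as an independent bzip2 stream and concatenate."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the streams line up with the chunks.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


def decompress_first_stream_only(blob):
    """What a naive reader gets: it stops at the first end-of-stream."""
    dec = bz2.BZ2Decompressor()
    out = dec.decompress(blob)
    return out, dec.unused_data


def decompress_all_streams(blob):
    """Start a fresh decompressor on the leftover bytes after each stream."""
    out = []
    while blob:
        dec = bz2.BZ2Decompressor()
        out.append(dec.decompress(blob))
        blob = dec.unused_data
    return b"".join(out)


payload = bytes(range(256)) * 10_000          # 2,560,000 bytes, three chunks
archive = compress_in_chunks(payload)
partial, leftover = decompress_first_stream_only(archive)
restored = decompress_all_streams(archive)
```

A reader that stops at the first end-of-stream recovers only the first 
chunk and is left with unread input; one that restarts on the leftover 
bytes recovers the whole payload.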

Compression times (32 bit binaries, 16 worker threads; re-pasting the 
(single-threaded) bzip2 result from above, and moving the downloaded 
.tar.bz2 under a subdirectory called "orig" before starting lbzip2):

         Command being timed: "bzip2 --keep linux-2.6.34-rc2.tar"
         User time (seconds): 130.82
         System time (seconds): 1.68
         Percent of CPU this job got: 99%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 2:12.51

         Command being timed: "lbzip2 -n 16 --keep linux-2.6.34-rc2.tar"
         User time (seconds): 144.08
         System time (seconds): 2.86
         Percent of CPU this job got: 1405%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.45

Sizes:

403804160 linux-2.6.34-rc2.tar
  67479563 orig/linux-2.6.34-rc2.tar.bz2
  67691446 linux-2.6.34-rc2.tar.bz2

For less than half a percent size sacrifice, we saved 92% wall clock time.

Both bzip2 and lbzip2 decompress both archives back to the original 
tarball (sanity check). Decompression times to /dev/null:

         Command being timed: "bzip2 -dc orig/linux-2.6.34-rc2.tar.bz2"
         User time (seconds): 29.81
         System time (seconds): 0.29
         Percent of CPU this job got: 99%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:30.12

         Command being timed: "bzip2 -dc linux-2.6.34-rc2.tar.bz2"
         User time (seconds): 31.57
         System time (seconds): 0.40
         Percent of CPU this job got: 99%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:31.97

         Command being timed: "lbzip2 -n 16 -dc orig/linux-2.6.34-rc2.tar.bz2"
         User time (seconds): 54.18
         System time (seconds): 2.37
         Percent of CPU this job got: 1259%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.48

         Command being timed: "lbzip2 -n 16 -dc linux-2.6.34-rc2.tar.bz2"
         User time (seconds): 53.62
         System time (seconds): 1.93
         Percent of CPU this job got: 1349%
         Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.11

(Note that in the third case, lbzip2 parallelizes the decompression of the 
downloaded (single-stream) .tar.bz2 file too.)


(3) Both plzip and lbzip2 parallelize both compression and decompression 
from non-seekable input to non-seekable output (eg. pipes and SOCK_STREAM 
sockets). Additionally, they strive to follow the Utility Syntax 
Guidelines laid down in The Single UNIX(R) Specification, Version 2 [5].
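
For the streaming half of that claim, here is a single-threaded Python 
sketch of decompressing from a source that only supports read() to a sink 
that only supports write(), handling concatenated streams along the way. 
It shows the non-seekable I/O pattern only and makes no attempt at the 
parallelism plzip and lbzip2 provide; READ_SIZE and stream_decompress are 
hypothetical names.

```python
import bz2
import io

READ_SIZE = 64 * 1024  # hypothetical read size in bytes


def stream_decompress(src, dst, read_size=READ_SIZE):
    """Copy bzip2 data from src to dst using only read() and write().

    A new decompressor is started whenever the current one reports
    end-of-stream, so concatenated streams are handled. Returns the
    number of decompressed bytes written.
    """
    dec = bz2.BZ2Decompressor()
    written = 0
    while True:
        block = src.read(read_size)
        if not block:
            break
        while block:
            if dec.eof:
                dec = bz2.BZ2Decompressor()
            out = dec.decompress(block)
            dst.write(out)
            written += len(out)
            # Bytes past an end-of-stream belong to the next stream.
            block = dec.unused_data if dec.eof else b""
    if not dec.eof:
        raise EOFError("compressed input ended in the middle of a stream")
    return written


demo = bz2.compress(b"hello ") + bz2.compress(b"world")
sink = io.BytesIO()
count = stream_decompress(io.BytesIO(demo), sink, read_size=7)
```

In a pipeline one would pass sys.stdin.buffer and sys.stdout.buffer 
instead of the in-memory objects used here.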


+----------+
|DISCLAIMER|
+----------+

- I am the author of lbzip2. Therefore this mail qualifies as shameless 
self-promotion. I've got no problem with that; a rightful public 
humiliation will do me only good. I hope the subject pertains well enough 
to the payload so that nobody is lured into reading the mail spuriously.

- Originally, I forked plzip from lbzip2 under a different name ("llzip"). 
From the start, it was based on lzlib, written by Antonio Diaz Diaz. (Just 
as lbzip2 is based on Julian Seward's libbz2. I'm not throwing around 
these names to gain credibility, I'm rather trying to give credit.) 
Shortly after the fork, Antonio Diaz Diaz took over llzip's 
maintenance as planned, and renamed it to plzip, much more fittingly. He 
has in effect completely rewritten it since then. He knew nothing of this 
email beforehand. The blame is entirely mine. Still, I'm convinced people 
would benefit if the kernel tarball switched to .lz compression.

- The quoted measurements were done on the "regina" supercomputer node of 
the NIIFI [6]. For a scaling test somewhat related to the ones listed 
above, see [7]. I'm currently preparing to repeat those tests with plzip.

(Disclaimer ends.)

Thank you very much for considering, and I apologize for being off-topic,
Laszlo Ersek

PS. As permitted by the lkml FAQ 3.3, I'm not subscribed to the list. 
Please keep me CC'd (and also poor victim Antonio). Thanks.


[0] http://www.nongnu.org/lzip/plzip.html
[1] http://www.nongnu.org/lzip/lzip.html
[2] http://lkml.indiana.edu/hypermail/linux/kernel/0609.2/1598.html
[3] http://www.nongnu.org/lzip/manual/plzip_manual.html#File-Format
[4] http://lacos.hu/
[5] http://www.opengroup.org/onlinepubs/007908799/xbd/utilconv.html#tag_009_002
[6] http://www.niif.hu/en/niif_institute/supercomputing_service
[7] http://lacos.hu/lbzip2-scaling/scaling.html

