linux-kernel.vger.kernel.org archive mirror
* Linux Kernel Source Compression
@ 2006-05-21 14:35 Justin Piszcz
  2006-05-21 18:40 ` Jan Engelhardt
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Justin Piszcz @ 2006-05-21 14:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: apiszcz

Was curious as to which utilities would offer the best compression ratio 
for the kernel source, I thought it'd be bzip2 or rar but lzma wins, 
roughly 6 MiB smaller than bzip2.

$ du -sk * | sort -n
33520   linux-2.6.16.17.tar.lzma
33760   linux-2.6.16.17.tar.rar
38064   linux-2.6.16.17.tar.rz
39472   linux-2.6.16.17.tar.szip
39520   linux-2.6.16.17.tar.bz
39936   linux-2.6.16.17.tar.bz2
40000   linux-2.6.16.17.tar.bicom
40656   linux-2.6.16.17.tar.sit
47664   linux-2.6.16.17.tar.lha
49968   linux-2.6.16.17.tar.dzip
50000   linux-2.6.16.17.tar.gz
51344   linux-2.6.16.17.tar.arj
57552   linux-2.6.16.17.tar.lzo
57984   linux-2.6.16.17.tar.F
81136   linux-2.6.16.17.tar.Z
94544   linux-2.6.16.17.tar.zoo
101216  linux-2.6.16.17.tar.arc
228608  linux-2.6.16.17.tar

$ du -sh * | sort -n
  33M    linux-2.6.16.17.tar.lzma
  33M    linux-2.6.16.17.tar.rar
  37M    linux-2.6.16.17.tar.rz
  39M    linux-2.6.16.17.tar.bicom
  39M    linux-2.6.16.17.tar.bz
  39M    linux-2.6.16.17.tar.bz2
  39M    linux-2.6.16.17.tar.szip
  40M    linux-2.6.16.17.tar.sit
  47M    linux-2.6.16.17.tar.lha
  49M    linux-2.6.16.17.tar.dzip
  49M    linux-2.6.16.17.tar.gz
  50M    linux-2.6.16.17.tar.arj
  56M    linux-2.6.16.17.tar.lzo
  57M    linux-2.6.16.17.tar.F
  79M    linux-2.6.16.17.tar.Z
  92M    linux-2.6.16.17.tar.zoo
  99M    linux-2.6.16.17.tar.arc
223M    linux-2.6.16.17.tar
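
The "roughly 6 MiB" figure can be recomputed from the `du -sk` listing. A minimal Python sketch, assuming the KiB values above are accurate (the dictionary copies four rows of the listing; the `saving` helper is made up for illustration, and `du` reports disk usage, not exact byte counts):

```python
# Sizes in KiB, copied from the `du -sk` listing above.
SIZES_KIB = {
    "tar": 228608,
    "gz": 50000,
    "bz2": 39936,
    "lzma": 33520,
}

def saving(smaller, larger):
    """Return (MiB saved, percent saved) of `smaller` relative to `larger`."""
    diff_kib = SIZES_KIB[larger] - SIZES_KIB[smaller]
    return diff_kib / 1024, 100 * diff_kib / SIZES_KIB[larger]

mib, pct = saving("lzma", "bz2")
print(f"lzma vs bz2: {mib:.1f} MiB smaller ({pct:.1f}%)")

# Each format as a fraction of the uncompressed tarball.
for name in ("gz", "bz2", "lzma"):
    print(f"{name}: {100 * SIZES_KIB[name] / SIZES_KIB['tar']:.1f}% of the tarball")
```

With these numbers the lzma file comes out about 6.3 MiB, or roughly 16%, smaller than the bz2 file.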


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Linux Kernel Source Compression
  2006-05-21 14:35 Linux Kernel Source Compression Justin Piszcz
@ 2006-05-21 18:40 ` Jan Engelhardt
  2006-05-21 18:56   ` Kasper Sandberg
  2006-05-21 19:28   ` Justin Piszcz
  2006-05-21 19:03 ` Alistair John Strachan
  2006-05-26  4:11 ` Bruce Guenter
  2 siblings, 2 replies; 36+ messages in thread
From: Jan Engelhardt @ 2006-05-21 18:40 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, apiszcz

>
> Was curious as to which utilities would offer the best compression ratio for
> the kernel source, I thought it'd be bzip2 or rar but lzma wins, roughly 6 MiB
> smaller than bzip2.
>
You forgot:
  - .7z    7zip
  - .j     JAR (www.arjsoftware.com)
  - .ice   LHICE (some sort of "brother" to lharc aka lzh)
  - .ace   ACE (www.winace.com)
  -        UPX (yes!, you just need to put '#!/\n' at the front)
  - .cab   MS CAB (use winace)
  - .bh    BlackHole
  - .pak   PKARC 2.51
  - .sqz   SqueezeIt
  - "LZEXE"

ftp://camelot.spsl.nsc.ru/pub/win32/arc/ - you'll find some there
happy packing :)

> 38064   linux-2.6.16.17.tar.rz

  - is this rzip with _maximum_ distance?


Jan Engelhardt
-- 


* Re: Linux Kernel Source Compression
  2006-05-21 18:40 ` Jan Engelhardt
@ 2006-05-21 18:56   ` Kasper Sandberg
  2006-05-21 19:28   ` Justin Piszcz
  1 sibling, 0 replies; 36+ messages in thread
From: Kasper Sandberg @ 2006-05-21 18:56 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Justin Piszcz, linux-kernel, apiszcz

On Sun, 2006-05-21 at 20:40 +0200, Jan Engelhardt wrote:
> >
> > Was curious as to which utilities would offer the best compression ratio for
> > the kernel source, I thought it'd be bzip2 or rar but lzma wins, roughly 6 MiB
> > smaller than bzip2.
> >
> You forgot:
>   - .7z    7zip
>   - .j     JAR (www.arjsoftware.com)
>   - .ice   LHICE (some sort of "brother" to lharc aka lzh)
>   - .ace   ACE (www.winace.com)
>   -        UPX (yes!, you just need to put '#!/\n' at the front)
>   - .cab   MS CAB (use winace)
>   - .bh    BlackHole
>   - .pak   PKARC 2.51
>   - .sqz   SqueezeIt
>   - "LZEXE"
and also lzx, which was, in the Amiga days, the best there was, although
I know of no compressor for Linux

> 
> ftp://camelot.spsl.nsc.ru/pub/win32/arc/ - you'll find some there
> happy packing :)
> 
> > 38064   linux-2.6.16.17.tar.rz
> 
>   - is this rzip with _maximum_ distance?
> 
> 
> Jan Engelhardt



* Re: Linux Kernel Source Compression
  2006-05-21 14:35 Linux Kernel Source Compression Justin Piszcz
  2006-05-21 18:40 ` Jan Engelhardt
@ 2006-05-21 19:03 ` Alistair John Strachan
  2006-05-21 21:00   ` Chris Wedgwood
  2006-05-22 18:58   ` H. Peter Anvin
  2006-05-26  4:11 ` Bruce Guenter
  2 siblings, 2 replies; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-21 19:03 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, apiszcz

On Sunday 21 May 2006 15:35, Justin Piszcz wrote:
> Was curious as to which utilities would offer the best compression ratio
> for the kernel source, I thought it'd be bzip2 or rar but lzma wins,
> roughly 6 MiB smaller than bzip2.
>
> $ du -sk * | sort -n
> 33520   linux-2.6.16.17.tar.lzma

Somebody needs to make lzma userspace tools (like p7zip) faster, not crash, 
and behave like a regular UNIX program. Then we need a patch to GNU tar to 
emerge, and for it to persist for at least 4 years. Then maybe people will 
adopt this format..

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-21 18:40 ` Jan Engelhardt
  2006-05-21 18:56   ` Kasper Sandberg
@ 2006-05-21 19:28   ` Justin Piszcz
  2006-05-22  2:05     ` Stefan Smietanowski
  1 sibling, 1 reply; 36+ messages in thread
From: Justin Piszcz @ 2006-05-21 19:28 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: linux-kernel, apiszcz

Compressed with -9.

      -9            slowest (best) compression

Unsure on the maximum distance.

Version info:

rzip 2.1
Copright (C) Andrew Tridgell 1998-2003


On Sun, 21 May 2006, Jan Engelhardt wrote:

>>
>> Was curious as to which utilities would offer the best compression ratio for
>> the kernel source, I thought it'd be bzip2 or rar but lzma wins, roughly 6 MiB
>> smaller than bzip2.
>>
> You forgot:
>  - .7z    7zip
>  - .j     JAR (www.arjsoftware.com)
>  - .ice   LHICE (some sort of "brother" to lharc aka lzh)
>  - .ace   ACE (www.winace.com)
>  -        UPX (yes!, you just need to put '#!/\n' at the front)
>  - .cab   MS CAB (use winace)
>  - .bh    BlackHole
>  - .pak   PKARC 2.51
>  - .sqz   SqueezeIt
>  - "LZEXE"
>
> ftp://camelot.spsl.nsc.ru/pub/win32/arc/ - you'll find some there
> happy packing :)
>
>> 38064   linux-2.6.16.17.tar.rz
>
>  - is this rzip with _maximum_ distance?
>
>
> Jan Engelhardt
> -- 
>


* Re: Linux Kernel Source Compression
  2006-05-21 19:03 ` Alistair John Strachan
@ 2006-05-21 21:00   ` Chris Wedgwood
  2006-05-21 21:22     ` Alistair John Strachan
  2006-05-21 21:42     ` Sam Vilain
  2006-05-22 18:58   ` H. Peter Anvin
  1 sibling, 2 replies; 36+ messages in thread
From: Chris Wedgwood @ 2006-05-21 21:00 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: Justin Piszcz, linux-kernel, apiszcz

On Sun, May 21, 2006 at 08:03:32PM +0100, Alistair John Strachan wrote:

> Somebody needs to make lzma userspace tools (like p7zip) faster, not
> crash, and behave like a regular UNIX program. Then we need a patch
> to GNU tar to emerge, and for it to persist for at least 4
> years. Then maybe people will adopt this format..

why?

the gains aren't that great


* Re: Linux Kernel Source Compression
  2006-05-21 21:00   ` Chris Wedgwood
@ 2006-05-21 21:22     ` Alistair John Strachan
  2006-05-21 21:42     ` Sam Vilain
  1 sibling, 0 replies; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-21 21:22 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Justin Piszcz, linux-kernel, apiszcz

On Sunday 21 May 2006 22:00, Chris Wedgwood wrote:
> On Sun, May 21, 2006 at 08:03:32PM +0100, Alistair John Strachan wrote:
> > Somebody needs to make lzma userspace tools (like p7zip) faster, not
> > crash, and behave like a regular UNIX program. Then we need a patch
> > to GNU tar to emerge, and for it to persist for at least 4
> > years. Then maybe people will adopt this format..
>
> why?
>
> the gains aren't that great

If it was less than 5%, I'd agree with you. The fact is, it's 17% better on a 
regular kernel tarball (not exactly a contrived test), so there would be 
reason to use it. It's also faster to decompress.

http://tukaani.org/lzma/

This utility appears to address most of my original concerns (i.e., it works 
with stream LZMA and has a bzip2/gzip-esque frontend). I could see LZMA 
replacing bzip2, but not gzip, due to the compression performance issues.
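
To illustrate the stream-style usage described above, here is a minimal sketch using Python's standard `lzma` module. This is an assumption for illustration only: it is not the tukaani.org tool, and the sample data is made up. `FORMAT_ALONE` selects the legacy `.lzma` stream container.

```python
import lzma

# Made-up, highly repetitive sample input standing in for source code.
data = b"static int foo(void) { return 0; }\n" * 2000

# FORMAT_ALONE is the legacy .lzma stream container (it carries no integrity check).
packed = lzma.compress(data, format=lzma.FORMAT_ALONE, preset=9)

# The default FORMAT_AUTO recognises both .xz and legacy .lzma input.
unpacked = lzma.decompress(packed)

print(len(data), "->", len(packed), "bytes; round trip ok:", unpacked == data)
```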

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-21 21:00   ` Chris Wedgwood
  2006-05-21 21:22     ` Alistair John Strachan
@ 2006-05-21 21:42     ` Sam Vilain
  2006-05-21 21:57       ` Alistair John Strachan
  2006-05-21 21:59       ` Diego Calleja
  1 sibling, 2 replies; 36+ messages in thread
From: Sam Vilain @ 2006-05-21 21:42 UTC (permalink / raw)
  To: Chris Wedgwood
  Cc: Alistair John Strachan, Justin Piszcz, linux-kernel, apiszcz

Chris Wedgwood wrote:

>On Sun, May 21, 2006 at 08:03:32PM +0100, Alistair John Strachan wrote:
>
>  
>
>>Somebody needs to make lzma userspace tools (like p7zip) faster, not
>>crash, and behave like a regular UNIX program. Then we need a patch
>>to GNU tar to emerge, and for it to persist for at least 4
>>years. Then maybe people will adopt this format..
>>    
>>
>
>why?
>
>the gains aren't that great
>

Exactly, and while I know my network connection isn't exactly
representative of the general population of people building the kernel,
it's currently faster for me to download and unpack a .gz than to wait
the extra time for bzip2 to decompress. I've always found it quicker
dealing with .gz's for incremental patches. I thought the speed issue is
more of a speed / compression ratio trade-off, ie a caveat of
compression in general.

Mind you, 'git fetch' is even faster, even for people who aren't close
enough to their mirror to fetch a full .gz kernel tarball in <5s.
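
The trade-off described here (a larger file that unpacks quickly versus a smaller file that unpacks slowly) can be put into a small model. A sketch under stated assumptions: the archive sizes are the gz and bz2 rows from the `du -sk` listing at the top of the thread, while the decompression times are invented placeholders that would need to be measured on a real machine. Both function names are hypothetical.

```python
def total_seconds(size_kib, link_kib_per_s, unpack_s):
    """Time to fetch an archive and then decompress it (no overlap assumed)."""
    return size_kib / link_kib_per_s + unpack_s

def break_even_kib_per_s(size_a, unpack_a, size_b, unpack_b):
    """Link speed at which archives A and B take the same total time.

    Assumes A is the larger but faster-to-unpack archive: on links faster
    than the returned speed A wins, on slower links B wins.
    """
    return (size_a - size_b) / (unpack_b - unpack_a)

# Sizes in KiB from the du -sk listing at the top of the thread.
GZ_KIB, BZ2_KIB = 50000, 39936
# Hypothetical decompression times in seconds; not measured values.
GZ_UNPACK_S, BZ2_UNPACK_S = 3.0, 15.0

speed = break_even_kib_per_s(GZ_KIB, GZ_UNPACK_S, BZ2_KIB, BZ2_UNPACK_S)
print(f"break-even link speed: {speed:.0f} KiB/s")
```

The break-even speed follows from setting the two totals equal and solving for the link speed, so it scales directly with whatever decompression times are plugged in.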

Sam.


* Re: Linux Kernel Source Compression
  2006-05-21 21:42     ` Sam Vilain
@ 2006-05-21 21:57       ` Alistair John Strachan
  2006-05-21 22:22         ` Sam Vilain
  2006-05-21 21:59       ` Diego Calleja
  1 sibling, 1 reply; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-21 21:57 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Chris Wedgwood, linux-kernel

On Sunday 21 May 2006 22:42, Sam Vilain wrote:
> Chris Wedgwood wrote:
> >On Sun, May 21, 2006 at 08:03:32PM +0100, Alistair John Strachan wrote:
> >>Somebody needs to make lzma userspace tools (like p7zip) faster, not
> >>crash, and behave like a regular UNIX program. Then we need a patch
> >>to GNU tar to emerge, and for it to persist for at least 4
> >>years. Then maybe people will adopt this format..
> >
> >why?
> >
> >the gains aren't that great
>
> Exactly, and while I know my network connection isn't exactly
> representative of the general population of people building the kernel,
> it's currently faster for me to download and unpack a .gz than to wait
> the extra time for bzip2 to decompress. I've always found it quicker
> dealing with .gz's for incremental patches. I thought the speed issue is
> more of a speed / compression ratio trade-off, ie a caveat of
> compression in general.

Actually, you're making false assumptions about LZMA. It is in fact quicker to 
decompress than bzip2, which largely addresses this issue. Compression speeds 
are the problem, but the end user won't do a lot of that.

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-21 21:42     ` Sam Vilain
  2006-05-21 21:57       ` Alistair John Strachan
@ 2006-05-21 21:59       ` Diego Calleja
  1 sibling, 0 replies; 36+ messages in thread
From: Diego Calleja @ 2006-05-21 21:59 UTC (permalink / raw)
  To: Sam Vilain; +Cc: cw, s0348365, jpiszcz, linux-kernel, apiszcz

El Mon, 22 May 2006 09:42:28 +1200,
Sam Vilain <sam@vilain.net> escribió:

> it's currently faster for me to download and unpack a .gz than to wait
> the extra time for bzip2 to decompress. I've always found it quicker


For kernel patches and kernel releases it sure doesn't make a lot of
sense to switch; you don't gain much.

LZMA has its gains, though. It's probably an interesting choice
for packaging software: you may get some extra space on the CD thanks
to the extra compression, and the faster decompression could make
installs a bit faster. While LZMA is slow as hell when compressing in
the "best compression" mode, it is faster than bzip2 at both compressing
and decompressing at the same compression levels as bzip2 (according to
the previously linked web page). That pretty much means it's just better.


* Re: Linux Kernel Source Compression
  2006-05-21 21:57       ` Alistair John Strachan
@ 2006-05-21 22:22         ` Sam Vilain
  2006-05-21 22:29           ` Alistair John Strachan
  0 siblings, 1 reply; 36+ messages in thread
From: Sam Vilain @ 2006-05-21 22:22 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: Chris Wedgwood, linux-kernel

Alistair John Strachan wrote:

>>Exactly, and while I know my network connection isn't exactly
>>representative of the general population of people building the kernel,
>>it's currently faster for me to download and unpack a .gz than to wait
>>the extra time for bzip2 to decompress. I've always found it quicker
>>dealing with .gz's for incremental patches. I thought the speed issue is
>>more of a speed / compression ratio trade-off, ie a caveat of
>>compression in general.
>>    
>>
>
>Actually, you're making false assumptions about LZMA. It is in fact quicker to 
>decompress than bzip2, which largely addresses this issue. Compression speeds 
>are the problem, but the end user won't do a lot of that.
>

Interesting.  Googling a bit;  from http://tukaani.org/lzma/benchmarks:

In terms of speed, gzip is the winner again. lzma comes right behind it
two to three times slower than gzip. bzip2 is a lot slower taking
usually two to six times more time than lzma, that is, four to twelve
times more than gzip. One interesting thing is that gzip and lzma
decompress the faster the smaller the compressed size is, while bzip2
gets slower when the compression ratio gets better.
[...]
neither bzip2 nor lzma can compete with gzip in terms of speed or memory
usage.

Also this:

"lzmash -8" and "lzmash -9" require lots of memory and are practical
only on newer computers; the files compressed with them are probably a
pain to decompress on systems with less than 32 MB or 64 MB of memory.
[...]
The files compressed with the default "lzmash -7" can still be
decompressed, even on machines with only 16 MB of RAM
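
Claims like the ones quoted above are easy to check on local data. A rough sketch using Python's standard `zlib`, `bz2` and `lzma` modules; these are library bindings, not the command-line tools that were benchmarked, `lzma.compress` writes the newer `.xz` container by default, and the sample input is made up, so the numbers will not match the quoted page.

```python
import bz2
import lzma
import time
import zlib

def measure(name, compress, decompress, data):
    """Return (name, compressed size, compress seconds, decompress seconds)."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    if restored != data:
        raise ValueError(f"{name}: round trip failed")
    return name, len(packed), t1 - t0, t2 - t1

# Made-up sample input; substitute the bytes of a real tarball for real numbers.
sample = b"".join(b"int field_%d; /* padding */\n" % i for i in range(50000))

results = [
    measure("zlib -9", lambda d: zlib.compress(d, 9), zlib.decompress, sample),
    measure("bz2 -9", lambda d: bz2.compress(d, 9), bz2.decompress, sample),
    measure("lzma -6", lambda d: lzma.compress(d, preset=6), lzma.decompress, sample),
]
for name, size, c_s, d_s in results:
    print(f"{name:8s} {size:8d} bytes  compress {c_s:.3f}s  decompress {d_s:.3f}s")
```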

Sam.


* Re: Linux Kernel Source Compression
  2006-05-21 22:22         ` Sam Vilain
@ 2006-05-21 22:29           ` Alistair John Strachan
  2006-05-22 19:00             ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-21 22:29 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Chris Wedgwood, linux-kernel

On Sunday 21 May 2006 23:22, Sam Vilain wrote:
[snip]
> Interesting.  Googling a bit;  from http://tukaani.org/lzma/benchmarks:
>
> In terms of speed, gzip is the winner again. lzma comes right behind it
> two to three times slower than gzip. bzip2 is a lot slower taking
> usually two to six times more time than lzma, that is, four to twelve
> times more than gzip. One interesting thing is that gzip and lzma
> decompress the faster the smaller the compressed size is, while bzip2
> gets slower when the compression ratio gets better.
> [...]
> neither bzip2 nor lzma can compete with gzip in terms of speed or memory
> usage.
>
> Also this:
>
> "lzmash -8" and "lzmash -9" require lots of memory and are practical
> only on newer computers; the files compressed with them are probably a
> pain to decompress on systems with less than 32 MB or 64 MB of memory.
> [...]
> The files compressed with the default "lzmash -7" can still be
> decompressed, even on machines with only 16 MB of RAM

Interesting info. I agree that LZMA is not a replacement for gzip/zlib, 
because gzip is extremely size/time efficient.

However, as noted in another thread, it is almost certainly a viable 
replacement for bzip2, since people that use bzip2 are generally interested 
in a size optimisation, not a compression speed one, and even if compression 
speed is relevant, LZMA's options scale to be approximately as good (or as 
bad??) as bzip2.

This is all fairly academic. I think the issue still boils down to widespread 
adoption; bzip2 took a while to get off the ground, people don't like messing 
with new formats, and distributors have to pick up the tools.

I think kernel.org switching formats would be one of the last things that 
could, or indeed should, happen.

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-21 19:28   ` Justin Piszcz
@ 2006-05-22  2:05     ` Stefan Smietanowski
  0 siblings, 0 replies; 36+ messages in thread
From: Stefan Smietanowski @ 2006-05-22  2:05 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: Jan Engelhardt, linux-kernel, apiszcz


Justin Piszcz wrote:
> Compressed with -9.
> 
>      -9            slowest (best) compression
> 
> Unsure on the maximum distance.
> 
> Version info:
> 
> rzip 2.1
> Copright (C) Andrew Tridgell 1998-2003
> 
> 
> On Sun, 21 May 2006, Jan Engelhardt wrote:
> 
>>>
>>> Was curious as to which utilities would offer the best compression
>>> ratio for
>>> the kernel source, I thought it'd be bzip2 or rar but lzma wins,
>>> roughly 6 MiB
>>> smaller than bzip2.
>>>
>> You forgot:
>>  - .7z    7zip
>>  - .j     JAR (www.arjsoftware.com)
>>  - .ice   LHICE (some sort of "brother" to lharc aka lzh)
>>  - .ace   ACE (www.winace.com)
>>  -        UPX (yes!, you just need to put '#!/\n' at the front)
>>  - .cab   MS CAB (use winace)
>>  - .bh    BlackHole
>>  - .pak   PKARC 2.51
>>  - .sqz   SqueezeIt
>>  - "LZEXE"
>>
>> ftp://camelot.spsl.nsc.ru/pub/win32/arc/ - you'll find some there
>> happy packing :)
>>
>>> 38064   linux-2.6.16.17.tar.rz
>>
>>
>>  - is this rzip with _maximum_ distance?
>>
>>
>> Jan Engelhardt

Don't forget about .lzx! You probably need an Amiga to use it though :)

I could be remembering wrong, but I think they use a newer version of
LZX compression in .CAB, since that guy got a job at MS back in the day.

// Stefan



* Re: Linux Kernel Source Compression
  2006-05-21 19:03 ` Alistair John Strachan
  2006-05-21 21:00   ` Chris Wedgwood
@ 2006-05-22 18:58   ` H. Peter Anvin
  2006-05-22 19:07     ` Alistair John Strachan
  1 sibling, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2006-05-22 18:58 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <200605212003.32063.s0348365@sms.ed.ac.uk>
By author:    Alistair John Strachan <s0348365@sms.ed.ac.uk>
In newsgroup: linux.dev.kernel
> 
> Somebody needs to make lzma userspace tools (like p7zip) faster, not crash, 
> and behave like a regular UNIX program. Then we need a patch to GNU tar to 
> emerge, and for it to persist for at least 4 years. Then maybe people will 
> adopt this format..
> 

The patch to GNU tar isn't necessary.  If the "not crash, and behave
like a regular UNIX program" can be satisfied, I'd be happy to support
7zip/lzma on kernel.org.  Unfortunately, as far as I can tell:

a) right now the standard encapsulation format for LZMA is 7zip, which
only comes in the form of hideously ugly code.  lzma-tools are
cleaner, but incompatible.

b) Even lzma-tools relies on a shell script to behave like a Unix
program.

Personally, I would like to suggest adding LZMA capability to gzip.
The gzip format already has support for multiple compression formats.

	-hpa


* Re: Linux Kernel Source Compression
  2006-05-21 22:29           ` Alistair John Strachan
@ 2006-05-22 19:00             ` H. Peter Anvin
  0 siblings, 0 replies; 36+ messages in thread
From: H. Peter Anvin @ 2006-05-22 19:00 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <200605212329.37719.s0348365@sms.ed.ac.uk>
By author:    Alistair John Strachan <s0348365@sms.ed.ac.uk>
In newsgroup: linux.dev.kernel
> 
> I think kernel.org switching formats would be one of the last things that 
> could, or indeed should, happen.
> 

kernel.org already has a multi-format infrastructure in place.  It
wouldn't be much more work to add a third format to the mix.

	-hpa



* Re: Linux Kernel Source Compression
  2006-05-22 18:58   ` H. Peter Anvin
@ 2006-05-22 19:07     ` Alistair John Strachan
  2006-05-22 19:10       ` H. Peter Anvin
  0 siblings, 1 reply; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-22 19:07 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

On Monday 22 May 2006 19:58, H. Peter Anvin wrote:
[snip]
> Personally, I would like to suggest adding LZMA capability to gzip.
> The gzip format already has support for multiple compression formats.

Any idea why this wasn't done for bzip2?

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-22 19:07     ` Alistair John Strachan
@ 2006-05-22 19:10       ` H. Peter Anvin
  2006-05-22 19:15         ` Alistair John Strachan
  0 siblings, 1 reply; 36+ messages in thread
From: H. Peter Anvin @ 2006-05-22 19:10 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: linux-kernel

Alistair John Strachan wrote:
> On Monday 22 May 2006 19:58, H. Peter Anvin wrote:
> [snip]
>> Personally, I would like to suggest adding LZMA capability to gzip.
>> The gzip format already has support for multiple compression formats.
> 
> Any idea why this wasn't done for bzip2?

Yes, I have been told the bzip2 author was originally planning to do that, but then 
thought it would be harder to deploy that way (because gzip is a core utility, and people 
are nervous about making it larger.)

You'd have to ask him for the details, though.

It *is* true that there is a fair bit of code out there which sees a gzip magic number and 
expects to call deflate functions on it, without ever checking the compression type field. 
However, even if there is a need for a new magic number, this can be done within the 
gzip code, or by forking gzip.
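
For context, the gzip header (RFC 1952) starts with the magic bytes 0x1f 0x8b followed by a one-byte compression method (CM), where 8 means deflate. The following is a minimal sketch of the check that careless code skips; the function names are made up for illustration, and the "fake" stream uses a CM value the RFC does not define.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"
CM_DEFLATE = 8  # the only compression method RFC 1952 defines

def gzip_method(header: bytes) -> int:
    """Return the CM byte of a gzip member, checking the magic number first."""
    if len(header) < 3 or header[:2] != GZIP_MAGIC:
        raise ValueError("not a gzip stream")
    return header[2]

def is_deflate(header: bytes) -> bool:
    """True only if the stream is gzip *and* declares the deflate method."""
    return gzip_method(header) == CM_DEFLATE

blob = gzip.compress(b"hello, kernel.org\n")
print(is_deflate(blob))

# A hypothetical stream with the gzip magic but an undefined method value.
fake = GZIP_MAGIC + bytes([9]) + blob[3:]
print(is_deflate(fake))
```

Code that only looks at the first two bytes would hand `fake` straight to a deflate decoder, which is the deployment risk described above.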

	-hpa


* Re: Linux Kernel Source Compression
  2006-05-22 19:10       ` H. Peter Anvin
@ 2006-05-22 19:15         ` Alistair John Strachan
  2006-05-22 20:24           ` Jan Engelhardt
  2006-05-31 22:56           ` Bill Davidsen
  0 siblings, 2 replies; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-22 19:15 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

On Monday 22 May 2006 20:10, H. Peter Anvin wrote:
> Alistair John Strachan wrote:
> > On Monday 22 May 2006 19:58, H. Peter Anvin wrote:
> > [snip]
> >
> >> Personally, I would like to suggest adding LZMA capability to gzip.
> >> The gzip format already has support for multiple compression formats.
> >
> > Any idea why this wasn't done for bzip2?
>
> Yes, the bzip2 author I have been told was originally planning to do that,
> but then thought it would be harder to deploy that way (because gzip is a
> core utility, and people are nervous about making it larger.)
>
> You'd have to ask him for the details, though.
>
> It *is* true that there is a fair bit of code out there which sees a gzip
> magic number and expects to call deflate functions on it, without ever
> checking the compression type field. However, even if there is a need for a
> new magic number, this can be done within the gzip code, or by forking
> gzip.

One trivial solution (that comes to mind) is by symlinking gunzip->unlzma (or 
similar) and having gzip's defaults change according to argv[0].
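
A rough sketch of that argv[0] idea, written in Python purely for illustration (the real gzip is a C program, and the name-to-defaults table here is hypothetical):

```python
import os

# Hypothetical program names mapped to (action, codec) defaults.
DEFAULTS = {
    "gzip": ("compress", "deflate"),
    "gunzip": ("decompress", "deflate"),
    "lzma": ("compress", "lzma"),
    "unlzma": ("decompress", "lzma"),
}

def defaults_for(argv0: str):
    """Pick defaults from the name the program was invoked as."""
    name = os.path.basename(argv0)
    # Unknown names fall back to plain gzip behaviour.
    return DEFAULTS.get(name, ("compress", "deflate"))

print(defaults_for("/usr/bin/unlzma"))
print(defaults_for("./gzip"))
```

A real front end would pass `sys.argv[0]` here, so each symlink to the same binary picks up its own defaults.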

It's a bit of a shame bzip2 even exists, really. It really would be better if 
there was one unified, pluggable archiver on UNIX (and portables).

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-22 19:15         ` Alistair John Strachan
@ 2006-05-22 20:24           ` Jan Engelhardt
  2006-05-22 20:41             ` H. Peter Anvin
  2006-05-22 21:00             ` Alistair John Strachan
  2006-05-31 22:56           ` Bill Davidsen
  1 sibling, 2 replies; 36+ messages in thread
From: Jan Engelhardt @ 2006-05-22 20:24 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: H. Peter Anvin, linux-kernel

>> > Any idea why this wasn't done for bzip2?
>>
>> Yes, the bzip2 author I have been told was originally planning to do that,
>> but then thought it would be harder to deploy that way (because gzip is a
>> core utility, and people are nervous about making it larger.)

I'd say that concern is valid.

>It's a bit of a shame bzip2 even exists, really. It really would be better if 
>there was one unified, pluggable archiver on UNIX (and portables).

Would You Like To Contribute(tm)? :)
Whenever a program is missing, someone is there to write it.



Jan Engelhardt
-- 


* Re: Linux Kernel Source Compression
  2006-05-22 20:24           ` Jan Engelhardt
@ 2006-05-22 20:41             ` H. Peter Anvin
  2006-05-22 21:00             ` Alistair John Strachan
  1 sibling, 0 replies; 36+ messages in thread
From: H. Peter Anvin @ 2006-05-22 20:41 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Alistair John Strachan, linux-kernel

Jan Engelhardt wrote:
> 
> Would You Like To Contribute(tm)? :)
> Whenever a program is missing, someone is there to write it.
> 

I don't have time, sorry.  Between klibc, syslinux, kernel.org and being sick, my time has 
been very sparse recently :(

	-hpa


* Re: Linux Kernel Source Compression
  2006-05-22 20:24           ` Jan Engelhardt
  2006-05-22 20:41             ` H. Peter Anvin
@ 2006-05-22 21:00             ` Alistair John Strachan
  2006-05-22 21:04               ` H. Peter Anvin
                                 ` (2 more replies)
  1 sibling, 3 replies; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-22 21:00 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: H. Peter Anvin, linux-kernel

On Monday 22 May 2006 21:24, Jan Engelhardt wrote:
> >> > Any idea why this wasn't done for bzip2?
> >>
> >> Yes, the bzip2 author I have been told was originally planning to do
> >> that, but then thought it would be harder to deploy that way (because
> >> gzip is a core utility, and people are nervous about making it larger.)
>
> I'd say that concern is valid.
>
> >It's a bit of a shame bzip2 even exists, really. It really would be better
> > if there was one unified, pluggable archiver on UNIX (and portables).
>
> Would You Like To Contribute(tm)? :)
> Whenever a program is missing, someone is there to write it.

I would, but if it's a "valid concern" that gzip is a few hundred KB larger, 
and the community would not graciously receive such work, there's not much 
point, is there? :-)

Seriously, though, if I understand gzip correctly, it uses deflate/zlib 
internally. Why, in that case, does /bin/gzip not (dynamically) link against 
libz? If a first step was fixing that, a second could be linking dynamically 
against libbz2 and 'liblzma', and making it all compile-time configurable.

That should keep everybody happy.
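
As a loose analogue of one front end with several library backends, here is a sketch that picks a decompressor by magic number using Python's standard modules. It is an illustration only, not how gzip is built: the backend table is an assumption, and xz stands in for LZMA because the legacy `.lzma` container has no fixed magic number to match on.

```python
import bz2
import gzip
import lzma

# (magic prefix, name, decompress function) for each pluggable backend.
BACKENDS = [
    (b"\x1f\x8b", "gzip", gzip.decompress),
    (b"BZh", "bzip2", bz2.decompress),
    (b"\xfd7zXZ\x00", "xz", lzma.decompress),
]

def unpack(blob: bytes):
    """Detect the container by its magic bytes and decompress with that backend."""
    for magic, name, decompress in BACKENDS:
        if blob.startswith(magic):
            return name, decompress(blob)
    raise ValueError("unrecognised compression format")

payload = b"one front end, several codecs\n" * 100
for pack in (gzip.compress, bz2.compress, lzma.compress):
    name, data = unpack(pack(payload))
    print(name, "round trip ok:", data == payload)
```

Adding a format then means adding one row to the table, which is roughly what a compile-time option would toggle in a C implementation.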

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-22 21:00             ` Alistair John Strachan
@ 2006-05-22 21:04               ` H. Peter Anvin
  2006-05-22 21:11                 ` Joshua Hudson
  2006-05-23 13:37                 ` Jan Engelhardt
  2006-05-23  2:16               ` Nuri Jawad
  2006-05-23 13:38               ` Jan Engelhardt
  2 siblings, 2 replies; 36+ messages in thread
From: H. Peter Anvin @ 2006-05-22 21:04 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: Jan Engelhardt, linux-kernel

Alistair John Strachan wrote:
> 
> I would, but if it's a "valid concern" that gzip is a few hundred KB larger, 
> and the community would not graciously receive such work, there's not much 
> point, is there? :-)
> 
> Seriously, though, if I understand gzip correctly, it uses deflate/zlib 
> internally. Why, in that case, does /bin/gzip not (dynamically) link against 
> libz? If a first step was fixing that, a second could be linking dynamically 
> against libbz2 and 'liblzma', and making it all compile-time configurable.
> 

Because gzip predates zlib...

	-hpa


* Re: Linux Kernel Source Compression
  2006-05-22 21:04               ` H. Peter Anvin
@ 2006-05-22 21:11                 ` Joshua Hudson
  2006-05-23 13:37                 ` Jan Engelhardt
  1 sibling, 0 replies; 36+ messages in thread
From: Joshua Hudson @ 2006-05-22 21:11 UTC (permalink / raw)
  To: linux-kernel

On 5/22/06, H. Peter Anvin <hpa@zytor.com> wrote:
> Alistair John Strachan wrote:
[snip]
> > Seriously, though, if I understand gzip correctly, it uses deflate/zlib
> > internally. Why, in that case, does /bin/gzip not (dynamically) link against
> > libz? If a first step was fixing that, a second could be linking dynamically
> > against libbz2 and 'liblzma', and making it all compile-time configurable.
> >
>
> Because gzip predates zlib...
>
Also because it runs faster than zlib on x86 due to register pressure.
Relocatable code costs one register. x86 only has 7 that an algorithm can
scribble over (8 if it doesn't have any stack).


* Re: Linux Kernel Source Compression
  2006-05-22 21:00             ` Alistair John Strachan
  2006-05-22 21:04               ` H. Peter Anvin
@ 2006-05-23  2:16               ` Nuri Jawad
  2006-05-23  2:55                 ` Stefan Smietanowski
                                   ` (2 more replies)
  2006-05-23 13:38               ` Jan Engelhardt
  2 siblings, 3 replies; 36+ messages in thread
From: Nuri Jawad @ 2006-05-23  2:16 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: Jan Engelhardt, H. Peter Anvin, linux-kernel

Hi,
just wanted to remark that I never liked that bzip was replaced by bzip2 
(were there license issues?) since bzip's compression was/is often 
stronger:

39843104 Mar 28 09:33 linux-2.6.15.7.tar.bz2
39423739 Mar 28 09:33 linux-2.6.15.7.tar.bz
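
For scale, the gap between those two files can be worked out directly. A tiny Python sketch using only the two byte counts listed above:

```python
# Byte counts from the listing above.
BZ2_BYTES = 39843104
BZ_BYTES = 39423739

diff = BZ2_BYTES - BZ_BYTES
print(f"bzip is {diff} bytes ({100 * diff / BZ2_BYTES:.2f}%) smaller")
```

That is 419365 bytes, or about 1.05% of the bzip2 file.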

Not a big difference in this case but still a step back. I for one am 
keeping my bzip binary. Does anyone know where the source can still be 
found?

Regards, Nuri
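
For scale, the gap between the two file sizes quoted above can be worked out directly. This is only a small arithmetic sketch; the variable names are made up for illustration and the byte counts are the ones from the listing in this message.

```python
# Byte counts copied from the listing above (linux-2.6.15.7 tarballs).
bz2_size = 39843104  # linux-2.6.15.7.tar.bz2
bz_size = 39423739   # linux-2.6.15.7.tar.bz

# Absolute and relative saving of the old bzip format over bzip2.
diff = bz2_size - bz_size
pct = 100 * diff / bz2_size

print(f"{diff} bytes smaller, {pct:.2f}% of the bzip2 size")
# prints: 419365 bytes smaller, 1.05% of the bzip2 size
```

So for this tarball the difference is roughly 410 KiB, or about one percent.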


* Re: Linux Kernel Source Compression
  2006-05-23  2:16               ` Nuri Jawad
@ 2006-05-23  2:55                 ` Stefan Smietanowski
  2006-05-23 14:15                 ` Ivan Novick
  2006-05-31 22:51                 ` Bill Davidsen
  2 siblings, 0 replies; 36+ messages in thread
From: Stefan Smietanowski @ 2006-05-23  2:55 UTC (permalink / raw)
  To: Nuri Jawad
  Cc: Alistair John Strachan, Jan Engelhardt, H. Peter Anvin, linux-kernel

Nuri Jawad wrote:
> Hi,
> just wanted to remark that I never liked that bzip was replaced by bzip2
> (were there license issues?) since bzip's compression was/is often
> stronger:

bzip was (possibly) infringing a patent so a method bzip used was
removed and subsequently bzip2 was created.

http://lists.debian.org/debian-devel/1997/12/msg00778.html

// Stefan



* Re: Linux Kernel Source Compression
  2006-05-22 21:04               ` H. Peter Anvin
  2006-05-22 21:11                 ` Joshua Hudson
@ 2006-05-23 13:37                 ` Jan Engelhardt
  1 sibling, 0 replies; 36+ messages in thread
From: Jan Engelhardt @ 2006-05-23 13:37 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Alistair John Strachan, linux-kernel

>> Seriously, though, if I understand gzip correctly, it uses deflate/zlib
>> internally. Why, in that case, does /bin/gzip not (dynamically) link
>> against libz? If a first step was fixing that, a second could be linking
>> dynamically against libbz2 and 'liblzma', and making it all compile-time
>> configurable.
>
> Because gzip predates zlib...
>
So we are carrying cruft around.


Jan Engelhardt
-- 


* Re: Linux Kernel Source Compression
  2006-05-22 21:00             ` Alistair John Strachan
  2006-05-22 21:04               ` H. Peter Anvin
  2006-05-23  2:16               ` Nuri Jawad
@ 2006-05-23 13:38               ` Jan Engelhardt
  2006-05-23 15:28                 ` Alistair John Strachan
  2 siblings, 1 reply; 36+ messages in thread
From: Jan Engelhardt @ 2006-05-23 13:38 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: H. Peter Anvin, linux-kernel

>> >> > Any idea why this wasn't done for bzip2?
>> >>
>> >> Yes, the bzip2 author I have been told was originally planning to do
>> >> that, but then thought it would be harder to deploy that way (because
>> >> gzip is a core utility, and people are nervous about making it larger.)
>>
>> I'd say that concern is valid.
>>
>> >It's a bit of a shame bzip2 even exists, really. It really would be better
>> > if there was one unified, pluggable archiver on UNIX (and portables).
>>
>> Would You Like To Contribute(tm)? :)
>> Whenever a program is missing, someone is there to write it.
>
>I would, but if it's a "valid concern" that gzip is a few hundred KB larger, 
>and the community would not graciously receive such work, there's not much 
>point, is there? :-)
>
Make it use shared libraries (did I already mention that?)

BTW, "a few hundred KB" is really overestimated if it's just about bzip2:
-rwxr-xr-x  1 root root 27640 Apr 23 02:20 /usr/bin/bzip2
-rwxr-xr-x  1 root root 66864 Apr 23 02:20 /lib/libbz2.so.1.0.0
That's not even _one_ hundred KB. Oh, just keep it as .so. :)
And of course, compile with klibc, it has less loader bloat than glibc (as 
someone had found out...I think it was Greg.)



Jan Engelhardt
-- 


* Re: Linux Kernel Source Compression
  2006-05-23  2:16               ` Nuri Jawad
  2006-05-23  2:55                 ` Stefan Smietanowski
@ 2006-05-23 14:15                 ` Ivan Novick
  2006-05-23 14:23                   ` Olivier Galibert
  2006-05-31 22:51                 ` Bill Davidsen
  2 siblings, 1 reply; 36+ messages in thread
From: Ivan Novick @ 2006-05-23 14:15 UTC (permalink / raw)
  To: Nuri Jawad, Alistair John Strachan
  Cc: Jan Engelhardt, H. Peter Anvin, linux-kernel, julian

cc'ing Julian Seward the author of bzip2

----- Original message -----
Hi,
just wanted to remark that I never liked that bzip was replaced by bzip2 
(were there license issues?) since bzip's compression was/is often 
stronger:

39843104 Mar 28 09:33 linux-2.6.15.7.tar.bz2
39423739 Mar 28 09:33 linux-2.6.15.7.tar.bz

Not a big difference in this case but still a step back. I for one am 
keeping my bzip binary. Does anyone know where the source can still be 
found?


* Re: Linux Kernel Source Compression
  2006-05-23 14:15                 ` Ivan Novick
@ 2006-05-23 14:23                   ` Olivier Galibert
  2006-05-23 14:47                     ` Julian Seward
  2006-05-25 11:42                     ` Jan Engelhardt
  0 siblings, 2 replies; 36+ messages in thread
From: Olivier Galibert @ 2006-05-23 14:23 UTC (permalink / raw)
  To: Ivan Novick
  Cc: Nuri Jawad, Alistair John Strachan, Jan Engelhardt,
	H. Peter Anvin, linux-kernel, julian

> just wanted to remark that I never liked that bzip was replaced by bzip2 
> (were there license issues?) since bzip's compression was/is often 
> stronger:

bzip1 uses arithmetic encoding which is heavily patented.  bzip2 uses
huffman instead, which isn't, but is slightly (10% is often quoted)
less efficient.  I guess bzip3 could use range coding which is
supposedly patent-free[1] and has a compression ratio similar to that of
arithmetic coding.

  OG.

[1] I guess everything is in the way it is written, since I have a
very hard time understanding where the difference is between range coding
and arithmetic coding.
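
The Huffman-versus-arithmetic efficiency gap mentioned above comes from Huffman codes spending a whole number of bits per symbol, while an arithmetic (or range) coder can approach the entropy of the source. The sketch below illustrates that on a made-up, heavily skewed four-symbol distribution. It builds a plain static Huffman code, which is an assumption for illustration only and is not the scheme bzip2 actually uses; all names here are hypothetical, and the gap on real bzip2 data is far smaller than in this toy case.

```python
import heapq
import math
from collections import Counter


def entropy_bits(counts):
    """Shannon entropy in bits per symbol: the bound an ideal coder approaches."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a static Huffman code."""
    if len(counts) == 1:
        return {sym: 1 for sym in counts}
    # Heap entries are (weight, tiebreak, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def huffman_avg_bits(counts):
    """Average Huffman code length in bits per symbol."""
    total = sum(counts.values())
    lengths = huffman_code_lengths(counts)
    return sum(counts[s] * lengths[s] for s in counts) / total


# Toy distribution (an assumption, not measured data): one dominant symbol.
skewed = Counter({"a": 90, "b": 5, "c": 3, "d": 2})
h = entropy_bits(skewed)
avg = huffman_avg_bits(skewed)
print(f"entropy: {h:.3f} bits/symbol, Huffman: {avg:.3f} bits/symbol")
```

The dominant symbol carries well under one bit of information, but a Huffman code cannot give it a code shorter than one bit, which is where the loss comes from in this example.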


* Re: Linux Kernel Source Compression
  2006-05-23 14:23                   ` Olivier Galibert
@ 2006-05-23 14:47                     ` Julian Seward
  2006-05-23 16:35                       ` Nuri Jawad
  2006-05-25 11:42                     ` Jan Engelhardt
  1 sibling, 1 reply; 36+ messages in thread
From: Julian Seward @ 2006-05-23 14:47 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Ivan Novick, Nuri Jawad, Alistair John Strachan, Jan Engelhardt,
	H. Peter Anvin, linux-kernel


> bzip1 uses arithmetic encoding which is heavily patented.  bzip2 uses
> huffman instead, which isn't, but is slightly (10% is often quoted)
> less efficient.

It uses an adaptive huffman scheme devised by David Wheeler, which usually
gets within 1% of the arithmetic coder that bzip1 used.

bzip2, especially the 1.0.X series, is superior to bzip1 in terms of speed,
memory use, robustness against bad-case inputs, recoverability of data 
from damaged compressed streams, and that it can be used as a library.
Moving back to bzip1 would IMO be a big step backwards.

J


* Re: Linux Kernel Source Compression
  2006-05-23 13:38               ` Jan Engelhardt
@ 2006-05-23 15:28                 ` Alistair John Strachan
  0 siblings, 0 replies; 36+ messages in thread
From: Alistair John Strachan @ 2006-05-23 15:28 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: H. Peter Anvin, linux-kernel

On Tuesday 23 May 2006 14:38, Jan Engelhardt wrote:
> >> >> > Any idea why this wasn't done for bzip2?
> >> >>
> >> >> Yes, the bzip2 author I have been told was originally planning to do
> >> >> that, but then thought it would be harder to deploy that way (because
> >> >> gzip is a core utility, and people are nervous about making it
> >> >> larger.)
> >>
> >> I'd say that concern is valid.
> >>
> >> >It's a bit of a shame bzip2 even exists, really. It really would be
> >> > better if there was one unified, pluggable archiver on UNIX (and
> >> > portables).
> >>
> >> Would You Like To Contribute(tm)? :)
> >> Whenever a program is missing, someone is there to write it.
> >
> >I would, but if it's a "valid concern" that gzip is a few hundred KB
> > larger, and the community would not graciously receive such work, there's
> > not much point, is there? :-)
>
> Make it use shared libraries (did I already mention that?)

Actually I did, in the paragraph that you just snipped.

> BTW, "a few hundred KB" is really overestimated if it's just about bzip2:
> -rwxr-xr-x  1 root root 27640 Apr 23 02:20 /usr/bin/bzip2
> -rwxr-xr-x  1 root root 66864 Apr 23 02:20 /lib/libbz2.so.1.0.0
> That's not even _one_ hundred KB. Oh, just keep it as .so. :)
> And of course, compile with klibc, it has less loader bloat than glibc (as
> someone had found out...I think it was Greg.)

Agreed. However gzip is such an ancient (and presumably now secure) tool that 
it might be unpopular to modify it so heavily. It might also be desirable for 
embedded folks to statically link code.

But, this is now seriously OT for LKML, so I might just email the GNU gzip 
folks and see whether it's been done before and/or whether it's a good idea.

-- 
Cheers,
Alistair.

Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


* Re: Linux Kernel Source Compression
  2006-05-23 14:47                     ` Julian Seward
@ 2006-05-23 16:35                       ` Nuri Jawad
  0 siblings, 0 replies; 36+ messages in thread
From: Nuri Jawad @ 2006-05-23 16:35 UTC (permalink / raw)
  To: Julian Seward
  Cc: Olivier Galibert, Ivan Novick, Alistair John Strachan,
	Jan Engelhardt, H. Peter Anvin, linux-kernel

On Tue, 23 May 2006, Julian Seward wrote:

> It uses an adaptive huffman scheme devised by David Wheeler, which usually
> gets within 1% of the arithmetic coder that bzip1 used.

If that coder has patent issues, it shouldn't be used, of course, 
regardless of performance.

> bzip2, especially the 1.0.X series, is superior to bzip1 in terms of speed,
> memory use, robustness against bad-case inputs, recoverability of data
> from damaged compressed streams, and that it can be used as a library.

Superior in most aspects, yes, but not regarding compression ratio. 
Anyway, calling bzip2 a step backwards was a bit of provocation and not 
really meant seriously, but it does have slightly reduced compression ratio.

Maybe bzip2 could be updated to make more use of today's fast CPUs? Much 
larger dictionary or other computationally expensive improvements.

Regards, Nuri


* Re: Linux Kernel Source Compression
  2006-05-23 14:23                   ` Olivier Galibert
  2006-05-23 14:47                     ` Julian Seward
@ 2006-05-25 11:42                     ` Jan Engelhardt
  1 sibling, 0 replies; 36+ messages in thread
From: Jan Engelhardt @ 2006-05-25 11:42 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Ivan Novick, Nuri Jawad, Alistair John Strachan, H. Peter Anvin,
	linux-kernel, julian

>> just wanted to remark that I never liked that bzip was replaced by bzip2 
>> (were there license issues?) since bzip's compression was/is often 
>> stronger:
>
>bzip1 uses arithmetic encoding which is heavily patented.  bzip2 uses
>huffman instead, which isn't, but is slightly (10% is often quoted)
>less efficient.  I guess bzip3 could use range coding which is
>supposedly patent-free[1] and has a compression ratio similar to that of
>arithmetic coding.
>
Although plans for a bzip3 have been posted (I think removing the MTF and 
so on...), it has not been done yet. Maybe I am wrong here.


Jan Engelhardt
-- 


* Re: Linux Kernel Source Compression
  2006-05-21 14:35 Linux Kernel Source Compression Justin Piszcz
  2006-05-21 18:40 ` Jan Engelhardt
  2006-05-21 19:03 ` Alistair John Strachan
@ 2006-05-26  4:11 ` Bruce Guenter
  2 siblings, 0 replies; 36+ messages in thread
From: Bruce Guenter @ 2006-05-26  4:11 UTC (permalink / raw)
  To: linux-kernel

On Sun, May 21, 2006 at 10:35:00AM -0400, Justin Piszcz wrote:
> Was curious as to which utilities would offer the best compression ratio 
> for the kernel source, I thought it'd be bzip2 or rar but lzma wins, 
> roughly 6 MiB smaller than bzip2.
> 
> $ du -sk * | sort -n
> 33520   linux-2.6.16.17.tar.lzma

Since it was requested by somebody:

$ du -sk linux-2.6.16.17.*
32904	linux-2.6.16.17.7z
39919	linux-2.6.16.17.tar.bz2

This was done with: 7z -mx=9
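
A comparison of this kind can be approximated without the command-line tools by using the compressors in the Python standard library. This is a rough sketch under stated assumptions: the sample input is synthetic C-like text rather than a kernel tarball, `lzma.compress` writes an .xz container rather than a .7z or .lzma file, and the resulting numbers will not match the figures in this thread.

```python
import bz2
import lzma
import zlib

# Synthetic stand-in for source code (an assumption; not the kernel tree).
sample = b"".join(
    b"static int f%d(void) { return %d; }\n" % (i, i) for i in range(20000)
)

# Compress the same bytes with each algorithm at its strongest standard level.
blobs = {
    "zlib (deflate, level 9)": zlib.compress(sample, 9),
    "bz2 (level 9)": bz2.compress(sample, 9),
    "lzma/xz (preset 9)": lzma.compress(sample, preset=9),
}

print(f"{'original':26s} {len(sample):9d} bytes")
for name, blob in sorted(blobs.items(), key=lambda item: len(item[1])):
    ratio = 100 * len(blob) / len(sample)
    print(f"{name:26s} {len(blob):9d} bytes ({ratio:.1f}% of original)")
```

To try it on real data, replace `sample` with the bytes of an actual tar file.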
-- 
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA  2E2A E96F B2DC 6999 80E8



* Re: Linux Kernel Source Compression
  2006-05-23  2:16               ` Nuri Jawad
  2006-05-23  2:55                 ` Stefan Smietanowski
  2006-05-23 14:15                 ` Ivan Novick
@ 2006-05-31 22:51                 ` Bill Davidsen
  2 siblings, 0 replies; 36+ messages in thread
From: Bill Davidsen @ 2006-05-31 22:51 UTC (permalink / raw)
  To: Nuri Jawad; +Cc: linux-kernel

Nuri Jawad wrote:
> Hi,
> just wanted to remark that I never liked that bzip was replaced by bzip2 
> (were there license issues?) since bzip's compression was/is often 
> stronger:
> 
> 39843104 Mar 28 09:33 linux-2.6.15.7.tar.bz2
> 39423739 Mar 28 09:33 linux-2.6.15.7.tar.bz
> 
> Not a big difference in this case but still a step back. I for one am 
> keeping my bzip binary. Does anyone know where the source can still be 
> found?

I know I have a copy backed up, but I'm rather disorganized at the 
moment, having moved two out-of-town offices into this one, after 
spending 12 years on a ten week contract... but I doubt you want it, 
it's slow as hell and violates all manner of patents. Mind you, I think 
the patents are held by IBM, so they might be negotiable, but I think 
the original is dead. Used either fractal or arithmetic compression IIRC.
-- 
Bill Davidsen <davidsen@tmr.com>
   Obscure bug of 2004: BASH BUFFER OVERFLOW - if bash is being run by a
normal user and is setuid root, with the "vi" line edit mode selected,
and the character set is "big5," an off-by-one error occurs during
wildcard (glob) expansion.



* Re: Linux Kernel Source Compression
  2006-05-22 19:15         ` Alistair John Strachan
  2006-05-22 20:24           ` Jan Engelhardt
@ 2006-05-31 22:56           ` Bill Davidsen
  1 sibling, 0 replies; 36+ messages in thread
From: Bill Davidsen @ 2006-05-31 22:56 UTC (permalink / raw)
  To: Alistair John Strachan; +Cc: linux-kernel

Alistair John Strachan wrote:

> It's a bit of a shame bzip2 even exists, really. It really would be better if 
> there was one unified, pluggable archiver on UNIX (and portables).
> 
All the people with slow connections bless bzip2. If you or someone else 
wants a new compressor, write a program for it, call it something unique 
so people will know it's different, and be happy.

Even with a fast line, I can only pull as fast as the source can push, 
so smaller is better for all of us. The wall-clock times to unpack a 
tar.bz2 and a tar.gz are very similar: the CPU time for bzip2 is about 
double, but the time to create the directories and write the files is the 
same in either case.
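
The decompression-cost part of that comparison can be measured roughly with the standard library. This is a minimal sketch under assumptions: it times in-memory decompression only (no directory creation or file writes), uses synthetic input rather than a kernel tarball, and `time_decompress` is a hypothetical helper, so the ratio it reports may differ from the "about double" figure above.

```python
import bz2
import time
import zlib


def time_decompress(decompress, blob, repeats=5):
    """Return (output, best wall-clock seconds) over a few repeated runs."""
    best = float("inf")
    out = b""
    for _ in range(repeats):
        start = time.perf_counter()
        out = decompress(blob)
        best = min(best, time.perf_counter() - start)
    return out, best


# Synthetic stand-in for source code (an assumption; not the kernel tree).
sample = b"".join(
    b"static int f%d(void) { return %d; }\n" % (i, i) for i in range(20000)
)

gz_blob = zlib.compress(sample, 9)  # deflate, the algorithm behind gzip
bz_blob = bz2.compress(sample, 9)   # bzip2

gz_out, gz_time = time_decompress(zlib.decompress, gz_blob)
bz_out, bz_time = time_decompress(bz2.decompress, bz_blob)

print(f"deflate: {gz_time * 1000:.2f} ms, bzip2: {bz_time * 1000:.2f} ms")
```

Timings vary by machine and input, so treat the output as indicative only.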

-- 
Bill Davidsen <davidsen@tmr.com>
   Obscure bug of 2004: BASH BUFFER OVERFLOW - if bash is being run by a
normal user and is setuid root, with the "vi" line edit mode selected,
and the character set is "big5," an off-by-one error occurs during
wildcard (glob) expansion.



end of thread, other threads:[~2006-05-31 22:55 UTC | newest]

Thread overview: 36+ messages
2006-05-21 14:35 Linux Kernel Source Compression Justin Piszcz
2006-05-21 18:40 ` Jan Engelhardt
2006-05-21 18:56   ` Kasper Sandberg
2006-05-21 19:28   ` Justin Piszcz
2006-05-22  2:05     ` Stefan Smietanowski
2006-05-21 19:03 ` Alistair John Strachan
2006-05-21 21:00   ` Chris Wedgwood
2006-05-21 21:22     ` Alistair John Strachan
2006-05-21 21:42     ` Sam Vilain
2006-05-21 21:57       ` Alistair John Strachan
2006-05-21 22:22         ` Sam Vilain
2006-05-21 22:29           ` Alistair John Strachan
2006-05-22 19:00             ` H. Peter Anvin
2006-05-21 21:59       ` Diego Calleja
2006-05-22 18:58   ` H. Peter Anvin
2006-05-22 19:07     ` Alistair John Strachan
2006-05-22 19:10       ` H. Peter Anvin
2006-05-22 19:15         ` Alistair John Strachan
2006-05-22 20:24           ` Jan Engelhardt
2006-05-22 20:41             ` H. Peter Anvin
2006-05-22 21:00             ` Alistair John Strachan
2006-05-22 21:04               ` H. Peter Anvin
2006-05-22 21:11                 ` Joshua Hudson
2006-05-23 13:37                 ` Jan Engelhardt
2006-05-23  2:16               ` Nuri Jawad
2006-05-23  2:55                 ` Stefan Smietanowski
2006-05-23 14:15                 ` Ivan Novick
2006-05-23 14:23                   ` Olivier Galibert
2006-05-23 14:47                     ` Julian Seward
2006-05-23 16:35                       ` Nuri Jawad
2006-05-25 11:42                     ` Jan Engelhardt
2006-05-31 22:51                 ` Bill Davidsen
2006-05-23 13:38               ` Jan Engelhardt
2006-05-23 15:28                 ` Alistair John Strachan
2006-05-31 22:56           ` Bill Davidsen
2006-05-26  4:11 ` Bruce Guenter
