linux-kernel.vger.kernel.org archive mirror
From: Gao Xiang <hsiangkao@aol.com>
To: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: linux-kernel@vger.kernel.org,
	Nick Terrell <nickrterrell@gmail.com>,
	Nick Terrell <terrelln@fb.com>,
	Norbert Lange <nolange79@gmail.com>, Chris Mason <clm@fb.com>,
	linux-kbuild@vger.kernel.org, x86@kernel.org,
	gregkh@linuxfoundation.org, Petr Malat <oss@malat.biz>,
	Kees Cook <keescook@chromium.org>,
	Kernel Team <Kernel-team@fb.com>,
	Adam Borowski <kilobyte@angband.pl>,
	Patrick Williams <patrickw3@fb.com>,
	rmikey@fb.com, mingo@kernel.org,
	Patrick Williams <patrick@stwcx.xyz>,
	Sedat Dilek <sedat.dilek@gmail.com>
Subject: Re: Kernel compression benchmarks
Date: Wed, 1 Jul 2020 23:50:05 +0800
Message-ID: <20200701153028.GA30962@hsiangkao-HP-ZHAN-66-Pro-G1>
In-Reply-To: <1588791882.08g1378g67.none@localhost>

Hi Alex,

(sorry... maybe my @gmx.com email is broken again...)

On Wed, Jul 01, 2020 at 10:35:48AM -0400, Alex Xu (Hello71) wrote:
> 
> My conclusions:
> 
> - zstd is an improvement on almost all metrics.
> - bzip2 and lzma should be removed post-haste.

I'm somewhat familiar with LZ4 and LZMA (xz) internals.

I'd like to add some notes on the underlying principles,
but I'm not sure I'll be able to join further discussion
on this topic...

 XZ is built on LZMA2, which in turn is based on LZMA.
 It uses a range coder. In principle, it has the best
 compression ratio but the slowest speed (due to a multiplication
 per bit rather than a table lookup). Zstd, by contrast, uses Huffman
 (much like deflate, a.k.a. gzip) and FSE (if I'm not
 wrong about Zstd)...
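
 To illustrate the "multiplication per bit vs. table lookup" point,
 here is a rough toy sketch in Python. It is not the xz or zstd code:
 the function names, the 3-symbol Huffman table and the constants are
 made up for illustration, and the range-decoder step only follows
 the general shape of an LZMA-style adaptive binary decoder (11-bit
 probabilities, shift-by-5 adaptation).

```python
# Toy sketch only; names and tables are hypothetical, not from xz or zstd.

RC_TOP = 1 << 24          # renormalize when the range drops below this
PROB_BITS = 11            # probabilities are 11-bit values (0..2047)
MOVE_BITS = 5             # adaptation speed of the probability model


def rc_decode_bit(rng, code, prob, next_byte=0):
    """Decode ONE bit with an LZMA-style adaptive binary range decoder.

    Returns (bit, rng, code, prob). Note the multiplication that is
    needed for every single decoded bit.
    """
    bound = (rng >> PROB_BITS) * prob
    if code < bound:
        rng = bound
        prob += ((1 << PROB_BITS) - prob) >> MOVE_BITS
        bit = 0
    else:
        rng -= bound
        code -= bound
        prob -= prob >> MOVE_BITS
        bit = 1
    if rng < RC_TOP:      # pull in one more input byte
        rng = (rng << 8) & 0xFFFFFFFF
        code = ((code << 8) | next_byte) & 0xFFFFFFFF
    return bit, rng, code, prob


# Toy prefix code: a=0, b=10, c=11, expanded into a 2-bit lookup table
# that maps the next two bits to (symbol, code length).
HUFF_TABLE = {"00": ("a", 1), "01": ("a", 1), "10": ("b", 2), "11": ("c", 2)}


def huff_decode(bits, count):
    """Decode `count` WHOLE symbols, one table lookup per symbol."""
    padded = bits + "0"   # so the final 2-bit peek never runs short
    out, pos = [], 0
    for _ in range(count):
        sym, length = HUFF_TABLE[padded[pos:pos + 2]]
        out.append(sym)
        pos += length
    return "".join(out)
```

 The range decoder pays one multiply (plus a model update) for each
 bit, while the table-driven decoder emits a whole symbol per lookup.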

 So in general (apart from any specific implementation),
 the ordering from fastest decompression / lowest ratio to
 slowest decompression / highest ratio is
  LZ4  -  Zstd  -  LZMA

 Some parameters, such as the compression level, affect the
 LZ match finder (yes, except for bzip2, all of these algorithms
 are LZ-based) and the dictionary size. Also, some specific
 implementations aren't well optimized (e.g. zlib).
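
 As a small illustration of the dictionary-size effect, the sketch
 below uses the Python standard-library lzma module (not the kernel
 build tooling). The test data and the xz_with_dict helper are
 assumptions of mine: a 256 KiB pseudo-random block is repeated
 twice, so the repeat can only be matched if the dictionary reaches
 back at least 256 KiB.

```python
import lzma
import random

# Hypothetical test data: one incompressible 256 KiB block, stored twice.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(1 << 18))
data = block * 2


def xz_with_dict(payload, dict_size):
    """Compress with LZMA2, overriding only the dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return lzma.compress(payload, format=lzma.FORMAT_XZ, filters=filters)


small = xz_with_dict(data, 1 << 16)   # 64 KiB: the repeat is out of reach
large = xz_with_dict(data, 1 << 23)   # 8 MiB: the repeat is one long match

print(len(data), len(small), len(large))
```

 With the small dictionary the second copy cannot be referenced, so
 the output stays close to the input size; with the large one it
 shrinks to roughly half. The decompressor in turn needs a buffer as
 large as the dictionary that was used.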

 Anyway, I think LZMA (xz) is still useful, and for now it is
 friendlier to fixed-sized output compression than Zstd (but
 yeah, I'm not familiar with all the Zstd internals. I will dig
 into that if I have more spare time).

> - lzo should be removed once zstd is merged.
> - compression level is important to consider for compression speed: the 
>   default lz4 -1 compresses very fast but has a very poor compression 
>   ratio. zstd -19 compresses barely better than zstd -18, but takes 
>   significantly longer to compress.
> - compression level should be configurable: lz4 -1 is useful, but so is 
>   lz4 -9. zstd -1 is useful, but so is zstd -19. zstd -1 is useful for 
>   developers who want kernel builds as fast as possible, zstd -19 for 
>   everybody else.
> - gzip is by far not the fastest compressor (even excluding cat)
> - modern compressors (xz, lz4, zstd) decompress about as fast for each 
>   compression level, only requiring more memory

 lz4 has a fixed sliding window (dictionary, 64k), so it won't
 require more memory at different compression levels when
 decompressing.
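
 The fixed window follows from the LZ4 block format, where a match
 offset is a 2-byte little-endian field, so a match can never point
 back further than 65535 bytes whatever level the compressor used.
 A minimal sketch of just the match-copy step (copy_match is a
 made-up helper, not the kernel or liblz4 API):

```python
import struct

MAX_OFFSET = 0xFFFF  # largest value a 2-byte offset field can hold


def copy_match(out, offset_field, length):
    """Append `length` bytes copied from `offset` bytes back in `out`.

    `out` is a bytearray of already-decoded data; `offset_field` is the
    raw 2-byte little-endian offset as it appears in the stream.
    """
    (offset,) = struct.unpack("<H", offset_field)
    if offset == 0 or offset > len(out):
        raise ValueError("invalid match offset")
    start = len(out) - offset
    for i in range(length):
        # Byte-by-byte so that overlapping matches (length > offset) work.
        out.append(out[start + i])
    return out


decoded = copy_match(bytearray(b"abc"), b"\x03\x00", 6)
print(decoded)  # bytearray(b'abcabcabc')
```

 Since the decoder only ever looks at the last 64k of output, that is
 all the history it has to keep around.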

Thanks,
Gao Xiang


