LKML Archive on lore.kernel.org
From: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
To: linux-kernel@vger.kernel.org,
	Nick Terrell <nickrterrell@gmail.com>,
	Nick Terrell <terrelln@fb.com>,
	Norbert Lange <nolange79@gmail.com>
Cc: Chris Mason <clm@fb.com>,
	linux-kbuild@vger.kernel.org, x86@kernel.org,
	gregkh@linuxfoundation.org, Petr Malat <oss@malat.biz>,
	Kees Cook <keescook@chromium.org>,
	Kernel Team <Kernel-team@fb.com>,
	Adam Borowski <kilobyte@angband.pl>,
	Patrick Williams <patrickw3@fb.com>,
	rmikey@fb.com, mingo@kernel.org,
	Patrick Williams <patrick@stwcx.xyz>,
	Sedat Dilek <sedat.dilek@gmail.com>
Subject: Kernel compression benchmarks
Date: Wed, 01 Jul 2020 10:35:48 -0400
Message-ID: <1588791882.08g1378g67.none@localhost>
In-Reply-To: <1588791882.08g1378g67.none.ref@localhost>


[-- Attachment #1: Type: text/plain, Size: 3687 bytes --]

Hi all,

ZSTD compression patches have been sent in a number of times over the 
past few years. Every time, someone asks for benchmarks. Every time, 
someone is concerned about compression time. Sometimes, someone provides 
benchmarks.

But, as far as I can tell, nobody has considered the compression
parameters, which have a significant impact on compression time and ratio.

So, I did some benchmarks myself, including all the compression levels 
for each compressor.
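
For illustration only (this is not the script used for the attached
results), a per-level sweep could be sketched as below. It assumes Python's
standard-library zlib, bz2 and lzma modules as stand-ins for the gzip, bzip2
and xz command-line tools; zstd, lz4 and lzop have no standard-library
bindings and are left out. The names CODECS and sweep are hypothetical.

```python
import bz2
import lzma
import time
import zlib

# Standard-library stand-ins for the gzip, bzip2 and xz command-line tools.
# Each entry: name -> (levels to sweep, compress(data, level), decompress(blob)).
CODECS = {
    "zlib": (range(1, 10), lambda d, lvl: zlib.compress(d, lvl), zlib.decompress),
    "bz2": (range(1, 10), lambda d, lvl: bz2.compress(d, lvl), bz2.decompress),
    "lzma": (range(0, 10), lambda d, lvl: lzma.compress(d, preset=lvl), lzma.decompress),
}


def sweep(data, codecs=CODECS):
    """Compress and decompress `data` once per (codec, level) pair.

    Each pair is timed a single time, so individual rows are noisy.
    """
    rows = []
    for name, (levels, compress, decompress) in codecs.items():
        for level in levels:
            t0 = time.perf_counter()
            blob = compress(data, level)
            t1 = time.perf_counter()
            restored = decompress(blob)
            t2 = time.perf_counter()
            if restored != data:
                raise ValueError(f"{name} level {level} did not round-trip")
            rows.append({
                "codec": name,
                "level": level,
                "original_bytes": len(data),
                "compressed_bytes": len(blob),
                "compress_s": t1 - t0,
                "decompress_s": t2 - t1,
            })
    return rows
```

Calling sweep(open("vmlinux.bin", "rb").read()) would return one row per
level, ready to dump as CSV. Note that the highest lzma presets need several
hundred MiB of memory to compress.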

Results:

The results are attached as SVG graphs and CSV data.

Summary:

- compression level, predictably, has a huge impact on compression time.
- compression level has virtually no impact on decompression time for
  lz4 and zstd, and some effect on the others. interestingly, xz
  decompresses slightly faster at higher compression levels (perhaps
  cache-related).
- gzip compresses slightly faster than zstd at medium compression levels.
- bzip2 sucks: slow compression, very slow decompression, poor ratio.
- lzma decompresses slightly faster than xz, but is also slightly larger.
- xz is smallest but with very slow compression and decompression.
- lz4 decompresses fastest.
- zstd is a good balanced default.
- 7z is much faster than xz, even with wine overhead.
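
One way to make "good balanced" concrete is to keep only the settings that
no other setting beats on both time and size (the Pareto front). The sketch
below is an assumption about how the CSV rows could be analysed, not
something done for this post; pareto_front and the (label, seconds, bytes)
tuple layout are hypothetical.

```python
def pareto_front(points):
    """Return the (label, seconds, size) entries not beaten on both axes.

    An entry is dropped if an already-kept entry is at least as fast and
    at least as small.
    """
    front = []
    # Visit the fastest entries first; keep one only if it is strictly
    # smaller than everything faster than it.
    for label, seconds, size in sorted(points, key=lambda p: (p[1], p[2])):
        if not front or size < front[-1][2]:
            front.append((label, seconds, size))
    return front
```

Feeding it one tuple per (compressor, level) pair leaves the candidates
worth arguing about; everything else is both slower and larger than some
alternative.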

Files:

For the kernel, I did "make allmodconfig; sed -i -e '/=m$/d' .config" 
with a 5.6 kernel and gcc 9.3.0 on x86_64, then concatenated vmlinux.bin 
and vmlinux.relocs. For the initramfs, I used the Arch Linux fallback 
initramfs with default hooks.
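
The two preparation steps above (dropping every "=m" option from .config,
then concatenating the two build outputs) can be sketched in Python as
follows. This only mirrors the sed expression and a plain cat; the function
names and the output file name are hypothetical.

```python
from pathlib import Path


def drop_module_options(config_text):
    """Mimic `sed -e '/=m$/d'`: drop every line that ends in "=m"."""
    kept = [line for line in config_text.splitlines() if not line.endswith("=m")]
    return "\n".join(kept) + "\n"


def concatenate(parts, output):
    """Write the files in `parts` back to back into `output`.

    Returns the total number of bytes written.
    """
    total = 0
    with open(output, "wb") as out:
        for part in parts:
            data = Path(part).read_bytes()
            out.write(data)
            total += len(data)
    return total
```

For example, concatenate(["vmlinux.bin", "vmlinux.relocs"], "kernel.input")
would produce a single benchmark input file.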

Versions:

gzip 1.10
bzip2, a block-sorting file compressor.  Version 1.0.8, 13-Jul-2019.
xz (XZ Utils) 5.2.5
*** LZ4 command line interface 64-bits v1.9.2, by Yann Collet ***
lzop 1.04
LZO library 2.10
*** zstd command line interface 64-bits v1.4.4, by Yann Collet ***
7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Notes:

I used the userspace versions of the decompressors, not the kernel 
version. This is particularly relevant for xz, as the kernel xzminidec 
is significantly slower than xz.

pigz is faster than gzip, but I used gzip as a common baseline.

7-Zip was run through wine with a persistent wineserver.

I ran the benchmark on a Ryzen 1600, with turbo boost turned off. Each
test was run only once, on the basis that any noise wouldn't disrupt the
overall curve, and I didn't want to spend hours waiting for the results.

The current compression level defaults are:

- gzip -9
- bzip2 -9
- lzma -9
- xz --check=crc32 --x86 --lzma2=,dict=32MiB # except on ppc
- lzop -9
- lz4 -l -1
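
To show what the xz line asks for, here is a rough equivalent using Python's
lzma module: a CRC32 integrity check and an x86 BCJ filter in front of LZMA2
with a 32 MiB dictionary. It assumes the empty preset in "--lzma2=," means
xz's default preset 6; xz_kernel_style is a hypothetical name, and this is
not the kernel's build script.

```python
import lzma


def xz_kernel_style(data, dict_size=32 * 1024 * 1024):
    """Compress `data` roughly like
    `xz --check=crc32 --x86 --lzma2=,dict=32MiB`.

    Assumption: the omitted LZMA2 preset is the default, 6.
    """
    filters = [
        {"id": lzma.FILTER_X86},  # BCJ filter for x86 machine code
        {"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size},
    ]
    return lzma.compress(
        data,
        format=lzma.FORMAT_XZ,
        check=lzma.CHECK_CRC32,
        filters=filters,
    )
```

lzma.decompress() reads the filter chain from the stream, so the output
round-trips without repeating the options. The dict_size parameter is
exposed only so that the sketch can be tried with less memory.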

My conclusions:

- zstd is an improvement on almost all metrics.
- bzip2 and lzma should be removed post-haste.
- lzo should be removed once zstd is merged.
- compression level is important to consider for compression speed: the 
  default lz4 -1 compresses very fast but has a very poor compression 
  ratio. zstd -19 compresses barely better than zstd -18, but takes 
  significantly longer to compress.
- compression level should be configurable: lz4 -1 is useful, but so is 
  lz4 -9. zstd -1 is useful, but so is zstd -19. zstd -1 is useful for 
  developers who want kernel builds as fast as possible, zstd -19 for 
  everybody else.
- gzip is far from the fastest compressor (even excluding cat)
- modern compressors (xz, lz4, zstd) decompress at about the same speed
  at every compression level; higher levels only require more memory
- 7-Zip is much faster than xz, needs more research
- 7-Zip BCJ2 is slightly better than xz/BCJ. better filters for all
  archs would probably be a good area of research, as apparently
  BCJ/BCJ2 are intended only for 32-bit x86.
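
The zstd -18 versus -19 point is about diminishing returns, which can be
read off the CSV data by comparing each level with the one below it. A
minimal sketch, assuming rows of (level, compressed_bytes, seconds) for a
single compressor; marginal_gains is a hypothetical helper, not part of the
attached data.

```python
def marginal_gains(results):
    """Compare each level with the next lower one.

    `results` is a list of (level, compressed_bytes, seconds) tuples in
    any order. Returns (level, percent_saved, extra_seconds) tuples, where
    percent_saved is the size reduction relative to the previous level.
    """
    ordered = sorted(results)
    gains = []
    for (_, prev_size, prev_secs), (level, size, secs) in zip(ordered, ordered[1:]):
        saved_pct = 100.0 * (prev_size - size) / prev_size
        gains.append((level, saved_pct, secs - prev_secs))
    return gains
```

A level whose percent_saved is close to zero while extra_seconds is large
is a poor default, even if it is the nominal maximum.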

Thanks,
Alex.

[-- Attachment #2: kernel-compression-benchmarks.tar.gz --]
[-- Type: application/x-compressed-tar, Size: 56733 bytes --]

