From: Nick Terrell <terrelln@fb.com>
To: Arvind Sankar <nivedita@alum.mit.edu>
Cc: Pavel Machek <pavel@denx.de>,
Nick Terrell <nickrterrell@gmail.com>,
"Ingo Molnar" <mingo@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
X86 ML <x86@kernel.org>, Kernel Team <Kernel-team@fb.com>,
Yann Collet <yann.collet.73@gmail.com>,
Gao Xiang <gaoxiang25@huawei.com>,
Sven Schmidt <4sschmid@informatik.uni-hamburg.de>,
Andrew Morton <akpm@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] lz4: Fix kernel decompression speed
Date: Tue, 4 Aug 2020 17:18:55 +0000
Message-ID: <BA4151A3-83AB-4481-8A89-D9B645578995@fb.com>
In-Reply-To: <20200804151654.GA2326348@rani.riverdale.lan>
> On Aug 4, 2020, at 8:16 AM, Arvind Sankar <nivedita@alum.mit.edu> wrote:
>
> On Tue, Aug 04, 2020 at 10:32:36AM +0200, Pavel Machek wrote:
>> Hi!
>>
>>>>> I've measured the kernel decompression speed using QEMU before and after
>>>>> this patch for the x86_64 and i386 architectures. The speed-up is about
>>>>> 10x as shown below.
>>>>>
>>>>> Code   Arch    Kernel Size  Time    Speed
>>>>> v5.8   x86_64  11504832 B   148 ms   79 MB/s
>>>>> patch  x86_64  11503872 B    13 ms  885 MB/s
>>>>> v5.8   i386     9621216 B    91 ms  106 MB/s
>>>>> patch  i386     9620224 B    10 ms  962 MB/s
>>>>>
>>>>> I also measured the time to decompress the initramfs on x86_64, i386,
>>>>> and arm. All three show the same decompression speed before and after,
>>>>> as expected.
>>>>>
>>>>> [1] https://github.com/lz4/lz4/pull/890
>>>>>
>>>>
>>>> Hi Nick, would you be able to test the below patch's performance to
>>>> verify it gives the same speedup? It removes the #undef in misc.c which
>>>> causes the decompressors to not use the builtin version. It should be
>>>> equivalent to yours except for applying it to all the decompressors.
>>>>
>>>> Thanks.
>>>
>>> I will measure it. I would expect it to provide the same speed up. It would be great to fix
>>> the problem for x86/i386 in general.
>>>
>>> But, I believe that this is also a problem for ARM, though I have a hard time measuring
>>> because I can’t get pre-boot print statements in QEMU. I will attempt to take a look at the
>>> assembly, because I’m fairly certain that memcpy() isn’t inlined in master.
>>>
>>> Even if we fix all the architectures, I would still like to merge the LZ4 patch. It seems like it
>>> is pretty easy to merge a patch that is a boot speed regression, because people aren’t
>>> actively measuring it. So I prefer a layered defense.
>>
>>
>> Layered defense against performance-only problem, happening on
>> emulation-only?
>>
>> IMO that's a bit of overkill.
>
> Why would it be emulation-only? QEMU is just being used for ease of
> testing, but the performance impact should be similar on bare metal.
In addition, I want the decompressors to be fast in the pre-boot environment
on all architectures. Not everyone is going to know that zstd and lz4 are
about 10x slower unless memcpy() is inlined; that is an implementation
detail of the library. It is a performance gotcha that I’d rather not have.
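
To make the gotcha concrete, here is a minimal sketch. It is not code from
either patch: the names outofline_memcpy(), copy_chunks_builtin(), and
copy_chunks_call() are hypothetical, and the 8-byte chunk size is an
assumption picked for illustration. Both loops copy the same bytes; the only
difference is whether each small fixed-size copy can be expanded inline or
has to go through a real function call.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a memcpy() that the compiler cannot inline,
 * for example one defined out of line by a pre-boot environment.
 */
__attribute__((noinline))
void *outofline_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/*
 * Copy in fixed 8-byte chunks with the compiler builtin. With a constant
 * size, GCC and Clang typically lower each copy to a plain load and store.
 * Rounds len up to a multiple of 8, so both buffers need up to 7 bytes
 * of slack past len.
 */
static void copy_chunks_builtin(unsigned char *dst, const unsigned char *src,
				size_t len)
{
	size_t i;

	for (i = 0; i < len; i += 8)
		__builtin_memcpy(dst + i, src + i, 8);
}

/*
 * Same loop, but every 8-byte chunk pays for a function call.
 * Same slack requirement as above.
 */
static void copy_chunks_call(unsigned char *dst, const unsigned char *src,
			     size_t len)
{
	size_t i;

	for (i = 0; i < len; i += 8)
		outofline_memcpy(dst + i, src + i, 8);
}
```

Comparing the generated assembly for the two loops (for example with
objdump -d) is one way to check whether a given build has the problem.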