From: Nick Terrell <terrelln@fb.com>
To: Masahiro Yamada <masahiroy@kernel.org>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>,
	Michael Forney <forney@google.com>,
	Michal Marek <michal.lkml@markovi.net>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Ingo Molnar <mingo@kernel.org>,
	Sedat Dilek <sedat.dilek@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	Linux Kbuild mailing list <linux-kbuild@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] kbuild: pass --stream-size --no-content-size to zstd
Date: Mon, 6 Dec 2021 18:42:57 +0000
Message-ID: <F49C6875-FFDD-4314-A202-0C428B525A6A@fb.com>
In-Reply-To: <CAK7LNASO_EmCp2zR_sBq_YNiw83Px8pKhcW78HKv1My7eKB+2w@mail.gmail.com>



> On Dec 5, 2021, at 2:52 PM, Masahiro Yamada <masahiroy@kernel.org> wrote:
> 
> On Thu, Nov 25, 2021 at 12:30 AM Alex Xu (Hello71) <alex_y_xu@yahoo.ca> wrote:
>> 
>> Otherwise, it allocates 2 GB of memory at once. Even though the majority
>> of this memory is never touched, the default heuristic overcommit
>> refuses this request if less than 2 GB of RAM+swap is currently
>> available. This results in "zstd: error 11 : Allocation error : not
>> enough memory" and the kernel failing to build.
>> 
>> When the size is specified, zstd will reduce the memory request
>> appropriately. For typical kernel sizes of ~32 MB, the largest mmap
>> request will be reduced to 512 MB, which will succeed on all but the
>> smallest devices.
>> 
>> For inputs around this size, --stream-size --no-content-size may
>> slightly decrease the compressed size, or slightly increase it:
>> https://github.com/facebook/zstd/issues/2848.
>> 
>> Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
> 
> 
> 
> 
> The reason why we need this workaround is just because we do
> "cat and compress".  zstd must allocate a huge amount of memory beforehand
> since it cannot predict how long the stream it receives will be.
> 
> If zstd is given a file name, it can fstat the file to learn its size
> and allocate the minimal amount of memory.
> 
> 
> This is my test.
> I used 'ulimit' to set the upper limit of the memory the zstd can use.
> 
> 
> [test steps]
> 
>  # Create a 1kB file
>  $ truncate --size=1k dummy
> 
>  # Set the memory size limit to 10MB
>  $ ulimit -S -v 10240
> 
>  # Pass the file as an argument; success
>  $ zstd -19 -o dummy.zst dummy
>  dummy                :  2.15%   (  1024 =>     22 bytes, dummy.zst)
> 
>  # cat and zstd; fail
>  $ cat dummy | zstd -19 > dummy.zst
>  zstd: error 11 : Allocation error : not enough memory
> 
>  # cat and zstd --stream-size; success
>  $ cat dummy | zstd -19 --stream-size=1024 > dummy.zst
> 
> 
> 
> 
> scripts/Makefile.modinst was written in such a way
> that zstd can know the file size by itself.
> 
>      cmd_zstd = $(ZSTD) -T0 --rm -f -q $<
> 
> 
> We cannot rewrite scripts/Makefile.lib in that way because
> arch/x86/boot/compressed/Makefile concatenates two files before
> compression. And this is the only use-case of this feature.
> 
> So, I am seriously considering reverting this commit:
> 
> commit d3dd3b5a29bb9582957451531fed461628dfc834
> Author: H. Peter Anvin <hpa@zytor.com>
> Date:   Tue May 5 21:17:15 2009 -0700
> 
>    kbuild: allow compressors (gzip, bzip2, lzma) to take multiple inputs
> 
> 
> 
> 
> With that commit reverted, zstd will take a single input file,
> and we can do "zstd -o <output> <input>".
> 
> 
> So, I will take some time to investigate that approach.

This will definitely work from a zstd perspective. All versions
of zstd will downsize their memory usage to match the file size.
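
For comparison, the piped case that the patch targets can be sketched in
shell as below. This is only an illustration, not the kbuild
implementation: total_size and zstd_concat are hypothetical helper names,
and --stream-size needs zstd 1.4.4 or newer (per the comment in the patch).

```shell
# Hypothetical helper: print the combined size in bytes of all input files.
total_size() {
    # wc -c may pad its output with spaces on some systems; strip them.
    cat -- "$@" | wc -c | tr -d ' '
}

# Hypothetical helper: compress the concatenation of the inputs to stdout,
# telling zstd the stream length up front so it can size its buffers
# instead of assuming an unbounded stream.
zstd_concat() {
    size=$(total_size "$@")
    cat -- "$@" | zstd -19 --stream-size="$size" --no-content-size
}
```

With placeholder file names, this would be used as
"zstd_concat part1.bin part2.bin > out.zst". When there is only a single
input, passing the file name directly ("zstd -19 -o out.zst in.bin") avoids
the need for any of this.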

Best,
Nick Terrell

>> ---
>> scripts/Makefile.lib | 12 ++++++++++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>> 
>> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
>> index ca901814986a..c98a82ca38e6 100644
>> --- a/scripts/Makefile.lib
>> +++ b/scripts/Makefile.lib
>> @@ -466,12 +466,20 @@ quiet_cmd_xzmisc = XZMISC  $@
>> # single pass, so zstd doesn't need to allocate a window buffer. When streaming
>> # decompression is used, like initramfs decompression, zstd22 should likely not
>> # be used because it would require zstd to allocate a 128 MB buffer.
>> +#
>> +# --stream-size to reduce zstd memory usage (otherwise zstd -22 --ultra
>> +# allocates, but does not use, 2 GB) and potentially improve compression.
>> +#
>> +# --no-content-size to save three bytes which we do not use (we use size_append).
>> +
>> +# zstd --stream-size is only supported since 1.4.4
>> +zstd_stream_size = $(shell $(ZSTD) -1c --stream-size=0 --no-content-size </dev/null >/dev/null 2>&1 && printf '%s' '--stream-size=$(total_size) --no-content-size')
>> 
>> quiet_cmd_zstd = ZSTD    $@
>> -      cmd_zstd = { cat $(real-prereqs) | $(ZSTD) -19; $(size_append); } > $@
>> +      cmd_zstd = { cat $(real-prereqs) | $(ZSTD) $(zstd_stream_size) -19; $(size_append); } > $@
>> 
>> quiet_cmd_zstd22 = ZSTD22  $@
>> -      cmd_zstd22 = { cat $(real-prereqs) | $(ZSTD) -22 --ultra; $(size_append); } > $@
>> +      cmd_zstd22 = { cat $(real-prereqs) | $(ZSTD) $(zstd_stream_size) -22 --ultra; $(size_append); } > $@
>> 
>> # ASM offsets
>> # ---------------------------------------------------------------------------
>> --
>> 2.34.0
>> 
> 
> 
> -- 
> Best Regards
> Masahiro Yamada

