* [PATCH] kbuild: modinst: Enable multithread xz compression
@ 2023-02-23  0:16 André Almeida
  2023-02-23 20:16 ` Nathan Chancellor
  2023-02-24  5:38 ` Masahiro Yamada
  0 siblings, 2 replies; 5+ messages in thread
From: André Almeida @ 2023-02-23  0:16 UTC (permalink / raw)
  To: linux-kernel, linux-kbuild, Masahiro Yamada
  Cc: kernel-dev, Nathan Chancellor, Nick Desaulniers, Nicolas Schier,
	André Almeida

As it's done for zstd compression, enable multithread compression for
xz to speed up module installation.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---

On my setup xz is a bottleneck during module installation. Here are the
numbers to install it in a local directory, before and after this patch:

$ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
Executed in  100.08 secs

$ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
Executed in   28.60 secs
---
 scripts/Makefile.modinst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst
index 4815a8e32227..28dcc523d2ee 100644
--- a/scripts/Makefile.modinst
+++ b/scripts/Makefile.modinst
@@ -99,7 +99,7 @@ endif
 quiet_cmd_gzip = GZIP    $@
       cmd_gzip = $(KGZIP) -n -f $<
 quiet_cmd_xz = XZ      $@
-      cmd_xz = $(XZ) --lzma2=dict=2MiB -f $<
+      cmd_xz = $(XZ) --lzma2=dict=2MiB -f -T0 $<
 quiet_cmd_zstd = ZSTD    $@
       cmd_zstd = $(ZSTD) -T0 --rm -f -q $<
 
-- 
2.39.2



* Re: [PATCH] kbuild: modinst: Enable multithread xz compression
  2023-02-23  0:16 [PATCH] kbuild: modinst: Enable multithread xz compression André Almeida
@ 2023-02-23 20:16 ` Nathan Chancellor
  2023-02-24  5:38 ` Masahiro Yamada
  1 sibling, 0 replies; 5+ messages in thread
From: Nathan Chancellor @ 2023-02-23 20:16 UTC (permalink / raw)
  To: André Almeida
  Cc: linux-kernel, linux-kbuild, Masahiro Yamada, kernel-dev,
	Nick Desaulniers, Nicolas Schier

On Wed, Feb 22, 2023 at 09:16:07PM -0300, André Almeida wrote:
> As it's done for zstd compression, enable multithread compression for
> xz to speed up module installation.
> 
> Signed-off-by: André Almeida <andrealmeid@igalia.com>

This seems reasonable to me.

Reviewed-by: Nathan Chancellor <nathan@kernel.org>

If for some reason Masahiro does not want to take this, you could set
XZ_OPT=-T0 in your build environment, which should accomplish the same
thing.
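
A minimal sketch of that workaround, assuming the installed xz honors
the XZ_OPT environment variable as described in xz(1). The install path
is a hypothetical placeholder, and the command is only printed, not
executed:

```shell
# Hypothetical install path for illustration; substitute your own.
INSTALL_DIR="${INSTALL_DIR:-/tmp/kernel_deploy}"

# xz parses the options in XZ_OPT before its command-line options, so
# every xz process started by modules_install would see -T0.
export XZ_OPT="-T0"

# Print the command instead of running it, so the sketch is safe to paste.
echo "XZ_OPT=$XZ_OPT make INSTALL_MOD_PATH=$INSTALL_DIR modules_install -j16"
```

Because the variable lives in the build environment, it stays a per-user
choice and Makefile.modinst is left untouched.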

> ---
> 
> On my setup xz is a bottleneck during module installation. Here are the
> numbers to install it in a local directory, before and after this patch:
> 
> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> Executed in  100.08 secs
> 
> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> Executed in   28.60 secs
> ---
>  scripts/Makefile.modinst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst
> index 4815a8e32227..28dcc523d2ee 100644
> --- a/scripts/Makefile.modinst
> +++ b/scripts/Makefile.modinst
> @@ -99,7 +99,7 @@ endif
>  quiet_cmd_gzip = GZIP    $@
>        cmd_gzip = $(KGZIP) -n -f $<
>  quiet_cmd_xz = XZ      $@
> -      cmd_xz = $(XZ) --lzma2=dict=2MiB -f $<
> +      cmd_xz = $(XZ) --lzma2=dict=2MiB -f -T0 $<
>  quiet_cmd_zstd = ZSTD    $@
>        cmd_zstd = $(ZSTD) -T0 --rm -f -q $<
>  
> -- 
> 2.39.2
> 


* Re: [PATCH] kbuild: modinst: Enable multithread xz compression
  2023-02-23  0:16 [PATCH] kbuild: modinst: Enable multithread xz compression André Almeida
  2023-02-23 20:16 ` Nathan Chancellor
@ 2023-02-24  5:38 ` Masahiro Yamada
  2023-02-24 12:12   ` André Almeida
  1 sibling, 1 reply; 5+ messages in thread
From: Masahiro Yamada @ 2023-02-24  5:38 UTC (permalink / raw)
  To: André Almeida
  Cc: linux-kernel, linux-kbuild, kernel-dev, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier

On Thu, Feb 23, 2023 at 9:17 AM André Almeida <andrealmeid@igalia.com> wrote:
>
> As it's done for zstd compression, enable multithread compression for
> xz to speed up module installation.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
>
> On my setup xz is a bottleneck during module installation. Here are the
> numbers to install it in a local directory, before and after this patch:
>
> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> Executed in  100.08 secs
>
> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> Executed in   28.60 secs


Heh, this is an interesting benchmark.

Without this patch, you ran 16 processes of 'xz' in parallel
since you gave -j16.

You created multi-threads in each xz process, then you got 3x faster.
What made it happen? How many threads can your system run?



I did not get such an improvement in my testing.
On my machine $(nproc) is 24.


[Without this patch]

$ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)

real 0m33.965s
user 10m6.118s
sys 0m37.231s

[With this patch]

$ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)

real 0m32.568s
user 10m4.472s
sys 0m39.132s



Given that GNU Make provides the parallel execution environment,
you can control the number of processes of 'xz'.

There is no point in forcing multi-threading, which the user
did not ask for and may not want.
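
For comparison, the wall-clock speedups implied by the two sets of
numbers in this thread can be computed as below. This is only a sketch:
`speedup` is a hypothetical helper name, and the inputs are the elapsed
times reported above, converted to seconds.

```shell
# speedup BEFORE AFTER: print the ratio of two elapsed times in seconds.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", before / after }'
}

speedup 100.08 28.60    # the -j16 run from the patch submission
speedup 33.965 32.568   # the -j$(nproc) run on the 24-thread machine
```

The first ratio is about 3.5 and the second about 1.04, so the gain
depends heavily on the machine and the set of modules being installed.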

> ---
>  scripts/Makefile.modinst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst
> index 4815a8e32227..28dcc523d2ee 100644
> --- a/scripts/Makefile.modinst
> +++ b/scripts/Makefile.modinst
> @@ -99,7 +99,7 @@ endif
>  quiet_cmd_gzip = GZIP    $@
>        cmd_gzip = $(KGZIP) -n -f $<
>  quiet_cmd_xz = XZ      $@
> -      cmd_xz = $(XZ) --lzma2=dict=2MiB -f $<
> +      cmd_xz = $(XZ) --lzma2=dict=2MiB -f -T0 $<
>  quiet_cmd_zstd = ZSTD    $@
>        cmd_zstd = $(ZSTD) -T0 --rm -f -q $<
>
> --
> 2.39.2
>


-- 
Best Regards
Masahiro Yamada


* Re: [PATCH] kbuild: modinst: Enable multithread xz compression
  2023-02-24  5:38 ` Masahiro Yamada
@ 2023-02-24 12:12   ` André Almeida
  2023-02-25 10:21     ` Masahiro Yamada
  0 siblings, 1 reply; 5+ messages in thread
From: André Almeida @ 2023-02-24 12:12 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-kernel, linux-kbuild, kernel-dev, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier

Hi Masahiro,

On 24/02/2023 02:38, Masahiro Yamada wrote:
> On Thu, Feb 23, 2023 at 9:17 AM André Almeida <andrealmeid@igalia.com> wrote:
>>
>> As it's done for zstd compression, enable multithread compression for
>> xz to speed up module installation.
>>
>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>> ---
>>
>> On my setup xz is a bottleneck during module installation. Here are the
>> numbers to install it in a local directory, before and after this patch:
>>
>> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
>> Executed in  100.08 secs
>>
>> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
>> Executed in   28.60 secs
> 
> 
> Heh, this is an interesting benchmark.
> 
> Without this patch, you ran 16 processes of 'xz' in parallel
> since you gave -j16.
> 
> You created multi-threads in each xz process, then you got 3x faster.
> What made it happen?
> 
> 

During module installation on my setup, the build system spends most
of its time compressing big modules (such as the 350M amdgpu.ko) in a
single thread while the other 15 threads sit idle. Enabling
multithreading allowed amdgpu.ko to be compressed much faster.

The real performance improvement during module compression comes not
from compressing as many small modules as possible in parallel, but
from compressing the big ones with multiple threads, since those proved
to be the bottleneck in my setup.
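
One way to check whether a few large modules dominate on a given tree is
to list the biggest .ko files. A minimal sketch, where `largest_modules`
is a hypothetical helper and the directory argument is whatever build or
install tree you want to inspect:

```shell
# largest_modules DIR [COUNT]: list the COUNT largest *.ko files under
# DIR (default 5), biggest first, as "size-in-bytes path".
largest_modules() {
    dir=$1
    count=${2:-5}
    # wc -c prints a "total" line when given several files; drop it.
    find "$dir" -type f -name '*.ko' -exec wc -c {} + 2>/dev/null |
        grep -v ' total$' | sort -rn | head -n "$count"
}
```

Running `largest_modules . 10` from the top of a build tree would show
whether a single module is far larger than the rest.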

> How many threads can your system run?

$ nproc
16

> 
> I did not get such an improvement in my testing.
> On my machine $(nproc) is 24.
> 
> 
> [Without this patch]
> 
> $ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)
> 
> real 0m33.965s
> user 10m6.118s
> sys 0m37.231s
> 
> [With this patch]
> 
> $ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)
> 
> real 0m32.568s
> user 10m4.472s
> sys 0m39.132s
> 
> 

I can see that my patch did not introduce performance regressions to 
your setup, at least.

> 
> Given that GNU Make provides the parallel execution environment,
> you can control the number of processes of 'xz'.
> 
> There is no point in forcing multi-threading, which the user
> did not ask for and may not want.
> 
> 

Should we drop -T0 from zstd then? It currently forces multi-threading.

> 
>> ---
>>   scripts/Makefile.modinst | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst
>> index 4815a8e32227..28dcc523d2ee 100644
>> --- a/scripts/Makefile.modinst
>> +++ b/scripts/Makefile.modinst
>> @@ -99,7 +99,7 @@ endif
>>   quiet_cmd_gzip = GZIP    $@
>>         cmd_gzip = $(KGZIP) -n -f $<
>>   quiet_cmd_xz = XZ      $@
>> -      cmd_xz = $(XZ) --lzma2=dict=2MiB -f $<
>> +      cmd_xz = $(XZ) --lzma2=dict=2MiB -f -T0 $<
>>   quiet_cmd_zstd = ZSTD    $@
>>         cmd_zstd = $(ZSTD) -T0 --rm -f -q $<
>>
>> --
>> 2.39.2
>>
> 
> 

Thanks,
André Almeida


* Re: [PATCH] kbuild: modinst: Enable multithread xz compression
  2023-02-24 12:12   ` André Almeida
@ 2023-02-25 10:21     ` Masahiro Yamada
  0 siblings, 0 replies; 5+ messages in thread
From: Masahiro Yamada @ 2023-02-25 10:21 UTC (permalink / raw)
  To: André Almeida
  Cc: linux-kernel, linux-kbuild, kernel-dev, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier

On Fri, Feb 24, 2023 at 9:13 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> Hi Masahiro,
>
> On 24/02/2023 02:38, Masahiro Yamada wrote:
> > On Thu, Feb 23, 2023 at 9:17 AM André Almeida <andrealmeid@igalia.com> wrote:
> >>
> >> As it's done for zstd compression, enable multithread compression for
> >> xz to speed up module installation.
> >>
> >> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> >> ---
> >>
> >> On my setup xz is a bottleneck during module installation. Here are the
> >> numbers to install it in a local directory, before and after this patch:
> >>
> >> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> >> Executed in  100.08 secs
> >>
> >> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> >> Executed in   28.60 secs
> >
> >
> > Heh, this is an interesting benchmark.
> >
> > Without this patch, you ran 16 processes of 'xz' in parallel
> > since you gave -j16.
> >
> > You created multi-threads in each xz process, then you got 3x faster.
> > What made it happen?
> >
> >
>
> During module installation on my setup, the build system spends most
> of its time compressing big modules (such as the 350M amdgpu.ko) in a
> single thread while the other 15 threads sit idle. Enabling
> multithreading allowed amdgpu.ko to be compressed much faster.

It is a corner case, isn't it?
amdgpu.ko appears early in modules.order.
In most use-cases, other *.ko will fill the idle threads.


xz(1) says
  Setting threads to a special value 0 makes xz use up to as many threads
  as the processor(s) on the system support.


So, 'make -j$(nproc) modules_install'
will have (nproc * nproc) threads at maximum.

Of course, this is a theoretical calculation.
The actual number of spawned threads will be much less,
but spawning too many threads may not be nice.
For your case, Nathan's suggestion will do.
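
The worst case above is simple multiplication. A small sketch, assuming
every make job is an xz process running with -T0 at the same moment
(`worst_case_threads` is a hypothetical helper name):

```shell
# worst_case_threads JOBS CPUS: upper bound on concurrent xz threads when
# make runs JOBS xz processes and each may start up to CPUS threads.
worst_case_threads() {
    jobs=$1
    cpus=$2
    echo $((jobs * cpus))
}

worst_case_threads 16 16   # -j16 on a 16-thread machine
worst_case_threads 24 24   # -j24 on a 24-thread machine
```

This prints 256 and 576. As noted, it is only a theoretical ceiling and
the number of threads actually spawned will be much lower.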




>
> The real performance improvement during module compression comes not
> from compressing as many small modules as possible in parallel, but
> from compressing the big ones with multiple threads, since those proved
> to be the bottleneck in my setup.
>
> > How many threads can your system run?
>
> $ nproc
> 16
>
> >
> > I did not get such an improvement in my testing.
> > On my machine $(nproc) is 24.
> >
> >
> > [Without this patch]
> >
> > $ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)
> >
> > real 0m33.965s
> > user 10m6.118s
> > sys 0m37.231s
> >
> > [With this patch]
> >
> > $ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)
> >
> > real 0m32.568s
> > user 10m4.472s
> > sys 0m39.132s
> >
> >
>
> I can see that my patch did not introduce performance regressions to
> your setup, at least.
>
> >
> > Given that GNU Make provides the parallel execution environment,
> > you can control the number of processes of 'xz'.
> >
> > There is no point in forcing multi-threading, which the user
> > did not ask for and may not want.
> >
> >
>
> Should we drop -T0 from zstd then? It currently forces multi-threading.


I think yes.



--
Best Regards
Masahiro Yamada
