* build reproducibility
From: Julia Lawall @ 2021-10-17 18:12 UTC
  To: linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild

Hello,

If I do the following:

git clean -dfx
cp saved_config .config
make olddefconfig && make && make modules_install && make install

Should I always end up with the same kernel, regardless of the kernel that
is currently running on the machine?

I see a large performance difference between Linux 5.10 and all versions
afterwards for a particular benchmark.  I am unable to bisect the problem
e.g., between 5.10 and 5.11, because as soon as I come to a kernel that gives
the bad performance, all of the kernels that I generate subsequently in
the bisecting process (using the above commands) also have the bad
performance.

It could of course be that I have completely misinterpreted the problem,
and it has nothing to do with the kernel.  But I have tested the program a
lot when only working on variants of Linux 5.9.  I only start to have
problems when I use versions >= 5.11.

thanks,
julia
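
One way to test the "same kernel" question directly is to compare saved build outputs byte for byte. The following is a minimal sketch, not something posted in this thread: `same_artifact` is a hypothetical helper, and it assumes copies of an artifact such as vmlinux were saved from two separate builds. Because the kernel embeds build metadata such as a timestamp by default, two builds of identical source will usually report "different" unless the settings described in Documentation/kbuild/reproducible-builds.rst are applied.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): report whether two
# saved build artifacts are byte-identical.
same_artifact() {
    # cmp -s prints nothing and exits 0 only when both files match exactly.
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}
```

Example use, with placeholder paths: `same_artifact build-a/vmlinux build-b/vmlinux`.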

* Re: build reproducibility
From: Randy Dunlap @ 2021-10-17 18:32 UTC
  To: Julia Lawall, linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild

On 10/17/21 11:12 AM, Julia Lawall wrote:
> Hello,
> 
> If I do the following:
> 
> git clean -dfx
> cp saved_config .config
> make olddefconfig && make && make modules_install && make install
> 
> Should I always end up with the same kernel, regardless of the kernel that
> is currently running on the machine?
> 
> I see a large performance difference between Linux 5.10 and all versions
> afterwards for a particular benchmark.  I am unable to bisect the problem
> eg between 5.10 and 5.11, because as soon as I come to a kernel that gives
> the bad performance, all of the kernels that I generate subsequently in
> the bisecting process (using the above commands) also have the bad
> performance.
> 
> It could of course be that I have completely misinterpreted the problem,
> and it has nothing to do with the kernel.  But I have tested the program a
> lot when only working on variants of Linux 5.9.  I only start to have
> problems when I use versions >= 5.11.

Hi,

My "guess" is that this has something to do with the build
reusing some current file(s) that need to be rebuilt.
I.e., adding a "make clean" or "make mrproper" might be needed.

I say this only because sometimes I cannot even reproduce
a build that has errors or warnings unless I prefix it with
make clean or mrproper. (i.e., nothing to do with booting
and running the new kernel)
Even though the .config file has changed and I do
"make olddefconfig", the same build errors do not show up
unless I do the clean or mrproper step also.


-- 
~Randy

* Re: build reproducibility
From: Julia Lawall @ 2021-10-17 18:42 UTC
  To: Randy Dunlap; +Cc: linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild



On Sun, 17 Oct 2021, Randy Dunlap wrote:

> On 10/17/21 11:12 AM, Julia Lawall wrote:
> > Hello,
> >
> > If I do the following:
> >
> > git clean -dfx
> > cp saved_config .config
> > make olddefconfig && make && make modules_install && make install
> >
> > Should I always end up with the same kernel, regardless of the kernel that
> > is currently running on the machine?
> >
> > I see a large performance difference between Linux 5.10 and all versions
> > afterwards for a particular benchmark.  I am unable to bisect the problem
> > eg between 5.10 and 5.11, because as soon as I come to a kernel that gives
> > the bad performance, all of the kernels that I generate subsequently in
> > the bisecting process (using the above commands) also have the bad
> > performance.
> >
> > It could of course be that I have completely misinterpreted the problem,
> > and it has nothing to do with the kernel.  But I have tested the program a
> > lot when only working on variants of Linux 5.9.  I only start to have
> > problems when I use versions >= 5.11.
>
> Hi,
>
> My "guess" is that this has something to do with the build
> reusing some current file(s) that need to be rebuilt.
> I.e., adding a "make clean" or "make mrproper" might be needed.

This was my guess too.  But I already run git clean -dfx.  I compared it
with make distclean, and git clean -dfx removes a little more (mostly some
files in tools).

thanks,
julia

>
> I say this only because sometimes I cannot even reproduce
> a build that has errors or warnings unless I prefix it with
> make clean or mrproper. (i.e., nothing to do with booting
> and running the new kernel)
> Even though the .config file has changed and I do
> "make olddefconfig", the same build errors do not show up
> unless I do the clean or mrproper step also.
>
>
> --
> ~Randy
>

* Re: build reproducibility
From: Masahiro Yamada @ 2021-10-18  2:26 UTC
  To: Julia Lawall
  Cc: Randy Dunlap, Linux Kernel Mailing List, Michal Marek,
	Linux Kbuild mailing list

On Mon, Oct 18, 2021 at 3:42 AM Julia Lawall <julia.lawall@inria.fr> wrote:
>
>
>
> On Sun, 17 Oct 2021, Randy Dunlap wrote:
>
> > On 10/17/21 11:12 AM, Julia Lawall wrote:
> > > Hello,
> > >
> > > If I do the following:
> > >
> > > git clean -dfx
> > > cp saved_config .config
> > > make olddefconfig && make && make modules_install && make install
> > >
> > > Should I always end up with the same kernel, regardless of the kernel that
> > > is currently running on the machine?
> > >
> > > I see a large performance difference between Linux 5.10 and all versions
> > > afterwards for a particular benchmark.  I am unable to bisect the problem
> > > eg between 5.10 and 5.11, because as soon as I come to a kernel that gives
> > > the bad performance, all of the kernels that I generate subsequently in
> > > the bisecting process (using the above commands) also have the bad
> > > performance.
> > >
> > > It could of course be that I have completely misinterpreted the problem,
> > > and it has nothing to do with the kernel.  But I have tested the program a
> > > lot when only working on variants of Linux 5.9.  I only start to have
> > > problems when I use versions >= 5.11.
> >
> > Hi,
> >
> > My "guess" is that this has something to do with the build
> > reusing some current file(s) that need to be rebuilt.
> > I.e., adding a "make clean" or "make mrproper" might be needed.
>
> This was my guess too.  But I have the git clean -dfx.  I did a comparison
> with make distclean and this does a little more (mostly some files in
> tools).
>
> thanks,
> julia
>


'git clean -dfx' is a very thorough cleaning.
So, you are doing a full build at every step of the bisection.

I have no idea how to explain the symptom you observed:
 "as soon as I come to a kernel that gives
 the bad performance, all of the kernels that I generate subsequently in
 the bisecting process"


If you desire perfect reproducibility, you can check
   Documentation/kbuild/reproducible-builds.rst
But, I doubt slight differences such as timestamps
can explain the large performance difference.
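
As a rough illustration of what that document covers, the sketch below pins the build metadata that Kbuild would otherwise take from the build host. The variable names are the ones Kbuild reads; the values are arbitrary placeholders chosen for this example, not recommendations, and the document lists further sources of non-determinism that this does not address.

```shell
#!/bin/sh
# Pin build metadata so it no longer varies between builds or hosts.
# The values below are placeholders picked for illustration only.
export KBUILD_BUILD_TIMESTAMP='2021-01-01 00:00:00 UTC'
export KBUILD_BUILD_USER='builder'
export KBUILD_BUILD_HOST='buildhost'
```

These would be exported in the shell before running the make commands from the first message.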


If you are chasing the performance issue,
the log of commit cf536e185869d4815 says that
CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_*
might be useful to rule out code alignment
as the cause.


Otherwise, I have no more ideas...

-- 
Best Regards
Masahiro Yamada

* Re: build reproducibility
From: Willy Tarreau @ 2021-10-18  2:40 UTC
  To: Julia Lawall
  Cc: Randy Dunlap, linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild

Hello Julia,

On Sun, Oct 17, 2021 at 08:42:31PM +0200, Julia Lawall wrote:
> On Sun, 17 Oct 2021, Randy Dunlap wrote:
> > My "guess" is that this has something to do with the build
> > reusing some current file(s) that need to be rebuilt.
> > I.e., adding a "make clean" or "make mrproper" might be needed.
> 
> This was my guess too.  But I have the git clean -dfx.  I did a comparison
> with make distclean and this does a little more (mostly some files in
> tools).

Have you tried power-cycling the machine between boots, or just
rebooting on a working kernel before booting again on a faulty one?
It is possible that "something" changes a hardware setting that
the BIOS does not touch, leaving your machine in a different state
after you've booted the first problematic kernel. For example, it's
possible to set some CPU MSRs that affect the maximum CPU power, hence
its performance. Normally the BIOS should reset them, but for this it
must know about the ones your kernel (or even userland) would set.

Hoping this helps,
Willy
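
To make "a different state" something that can be compared across boots, one option is to record a few observable settings after each boot and diff the results. The sketch below is only an assumption about what might be worth recording: `snapshot_state` is a hypothetical helper, it reads standard procfs/sysfs files only where they exist, and it does not read MSRs, so it cannot rule out the scenario described above on its own.

```shell
#!/bin/sh
# Hypothetical helper: print a small snapshot of the running system so
# that snapshots taken after different boots can be compared with diff.
snapshot_state() {
    echo "kernel=$(uname -r)"
    for f in /proc/cmdline \
             /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver \
             /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor; do
        # Skip files that are absent on this machine or kernel.
        if [ -r "$f" ]; then
            echo "$f=$(cat "$f")"
        fi
    done
}
```

Example use, with a placeholder file name: `snapshot_state > state-after-boot.txt`.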

* Re: build reproducibility
From: Julia Lawall @ 2021-10-18  5:51 UTC
  To: Willy Tarreau
  Cc: Randy Dunlap, linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild



On Mon, 18 Oct 2021, Willy Tarreau wrote:

> Hello Julia,
>
> On Sun, Oct 17, 2021 at 08:42:31PM +0200, Julia Lawall wrote:
> > On Sun, 17 Oct 2021, Randy Dunlap wrote:
> > > My "guess" is that this has something to do with the build
> > > reusing some current file(s) that need to be rebuilt.
> > > I.e., adding a "make clean" or "make mrproper" might be needed.
> >
> > This was my guess too.  But I have the git clean -dfx.  I did a comparison
> > with make distclean and this does a little more (mostly some files in
> > tools).
>
> Have you tried power-cycling the machine between boots, or just
> rebooting on a working kernel before booting again on a faulty one ?
> It could be possible that "something" changes a hardware setting that
> the BIOS does not touch, leaving your machine in a different state
> after you've booted the first problematic kernel. For example, it's
> possible to set some CPU MSRs that affect the maximum CPU power, hence
> its performance. Normally the BIOS should reset them, but for this it
> must know about the one your kernel (or even userland) would set.

OK, thanks for the suggestions.  My impression is that there is a real
performance problem in 5.11.  The part I don't understand is why once I
have booted that kernel, all of the kernels I make afterwards have the
same performance characteristics.

If I do git clean -dfx, then copy a fixed configuration to .config, and
then use make olddefconfig, should anything about the currently running
kernel have an impact on the kernel that is produced?

I'll try simply rebooting the machine on each git bisect step.  That
should eliminate one more aspect of local state.

thanks,
julia

* Re: build reproducibility
From: Willy Tarreau @ 2021-10-18  5:59 UTC
  To: Julia Lawall
  Cc: Randy Dunlap, linux-kernel, Masahiro Yamada, Michal Marek, linux-kbuild

On Mon, Oct 18, 2021 at 07:51:13AM +0200, Julia Lawall wrote:
> My impression is that there is a real
> performance problem in 5.11.  The part I don't understand is why once I
> have booted that kernel, all of the kernels I make afterwards have the
> same performance characteristics.
> 
> If I do git clean -dfx, then copy a fixed configuration to .config, and
> then use make olddefconfig, should anything about the currently running
> kernel have an impact on the kernel that is produced?

Normally not at all, especially if you restart from a fixed .config. By
the way, you should compare the resulting .config after "make olddefconfig"
for all your kernels, in case you spot a difference there, but there is
no reason for that difference to depend on the currently running kernel.
Or maybe it detects something related to your machine and adjusts the
.config accordingly, and that detection depends on the running kernel
(e.g. CPU affecting default optimizations, etc.)? If that's the case
you'll see it in the final .config.
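
A minimal sketch of such a comparison, assuming the final .config of each build has been saved under a separate name: `config_diff` is a hypothetical helper that keeps only the option lines, so the version comment at the top of each file does not show up as a difference. The kernel tree also ships scripts/diffconfig for this purpose.

```shell
#!/bin/sh
# Hypothetical helper: show the option lines that differ between two
# kernel .config files, ignoring the generated header comments.
config_diff() {
    a=$(mktemp)
    b=$(mktemp)
    # Keep "CONFIG_FOO=..." and "# CONFIG_FOO is not set" lines only.
    grep -E '^(CONFIG_|# CONFIG_)' "$1" | sort > "$a"
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | sort > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    # Like diff: 0 when the option sets match, non-zero otherwise.
    return "$status"
}
```

Example use, with placeholder file names: `config_diff config-good config-bad`.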

> I'll try simply rebooting the machine on each git bisect step.  That
> should eliminate one more aspect of local state.

Just to avoid wasting your time, perform a cold reboot (reset button).
If you just do a warm reboot and the problem persists, there will still
be a small doubt, a "what if" in your mind that will make
you want to run it all again.

Willy
