* [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option
@ 2020-11-11 19:48 John Keeping
  2020-11-11 20:58 ` Yann E. MORIN
  0 siblings, 1 reply; 7+ messages in thread
From: John Keeping @ 2020-11-11 19:48 UTC (permalink / raw)
  To: buildroot

The only build-id style that is not reproducible is "uuid", but the
default is "sha1" so packages would have to go out of their way to
choose a non-reproducible build-id.  Having build IDs in general is
useful as it can be used to match split debuginfo.

I haven't seen any packages that do this, and if there are any that use
non-reproducible build-ids then that should be treated as a bug in that
package like any other reproducibility issue; there is no reason to
globally disable build IDs when BR2_REPRODUCIBLE is set.

Signed-off-by: John Keeping <john@metanate.com>
---
 toolchain/toolchain-wrapper.mk | 1 -
 1 file changed, 1 deletion(-)

diff --git a/toolchain/toolchain-wrapper.mk b/toolchain/toolchain-wrapper.mk
index 8b551e3a18..56d86fa1ea 100644
--- a/toolchain/toolchain-wrapper.mk
+++ b/toolchain/toolchain-wrapper.mk
@@ -22,7 +22,6 @@ TOOLCHAIN_WRAPPER_OPTS = \
 	$(call qstrip,$(BR2_TARGET_OPTIMIZATION))
 
 ifeq ($(BR2_REPRODUCIBLE),y)
-TOOLCHAIN_WRAPPER_OPTS += -Wl,--build-id=none
 ifeq ($(BR2_TOOLCHAIN_GCC_AT_LEAST_8),y)
 TOOLCHAIN_WRAPPER_OPTS += -ffile-prefix-map=$(BASE_DIR)=buildroot
 else
-- 
2.29.2


* [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option
  2020-11-11 19:48 [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option John Keeping
@ 2020-11-11 20:58 ` Yann E. MORIN
  2020-11-11 21:00   ` Yann E. MORIN
  2020-11-12 11:07   ` John Keeping
  0 siblings, 2 replies; 7+ messages in thread
From: Yann E. MORIN @ 2020-11-11 20:58 UTC (permalink / raw)
  To: buildroot

Jihn, all,

On 2020-11-11 19:48 +0000, John Keeping spake thusly:
> The only build-id style that is not reproducible is "uuid", but the
> default is "sha1" so packages would have to go out of their way to
> choose a non-reproducible build-id.  Having build IDs in general is
> useful as it can be used to match split debuginfo.

I would think we would then want to make sha1 the explicit build-id,
rather than merely depend on the default (even though it has been sha1
since 2007, it may change with future binutils versions).

> I haven't seen any packages that do this, and if there are any that use

Try not to use first person in commit messages; instead, use a neutral
third-person. Also, I find your commit message to have things a bit
shuffled around. We usually like to have commit logs that explain the
current status, explain why this is a problem, and explain what is done
to overcome the problem. For example:

    When a reproducible build is attempted (with BR2_REPRODUCIBLE=y), we forcibly
    disable the use of build-ids, on the assumption that they are not reproducible,
    as explained in b285c80143 (toolchain/toolchain-wrapper: explicitly pass
    --build-id=none if BR2_REPRODUCIBLE).

    However, only the 'uuid' build-id is not reproducible. 'sha1' (and 'md5') are
    based on the "normative parts of the output contents", so are stable across
    builds.

    Re-enable use of the build-id even in reproducible builds. We do ensure that
    the same build-id type is used, by forcing it to 'sha1'.

However, I would still doubt that build-ids are reproducible, even when
explicitly set to 'sha1', which has been the default since 2007 now, and
given that the commit referenced above was done because indeed they were
not reproducible...

So, I am not sure what the "normative parts of the output contents" are,
but it seems that there is something that is not reproducible, that
influences the build-id. See also the snippet referenced from that
commit https://gitlab.com/snippets/1886180/raw , which found that the
only delta between two reproducible builds was exactly the sha1-based
build-id...

Regards,
Yann E. MORIN.

> non-reproducible build-ids then that should be treated as a bug in that
> package like any other reproducibility issue; there is no reason to
> globally disable build IDs when BR2_REPRODUCIBLE is set.
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 561 099 427 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'


* [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option
  2020-11-11 20:58 ` Yann E. MORIN
@ 2020-11-11 21:00   ` Yann E. MORIN
  2020-11-12 11:07   ` John Keeping
  1 sibling, 0 replies; 7+ messages in thread
From: Yann E. MORIN @ 2020-11-11 21:00 UTC (permalink / raw)
  To: buildroot

John, All,

On 2020-11-11 21:58 +0100, Yann E. MORIN spake thusly:
> Jihn, all,

Woops, sorry... Fat finger struck again... My apologies.


* [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option
  2020-11-11 20:58 ` Yann E. MORIN
  2020-11-11 21:00   ` Yann E. MORIN
@ 2020-11-12 11:07   ` John Keeping
  2020-11-12 20:03     ` John Keeping
  2020-11-12 20:58     ` Yann E. MORIN
  1 sibling, 2 replies; 7+ messages in thread
From: John Keeping @ 2020-11-12 11:07 UTC (permalink / raw)
  To: buildroot

On Wed, 11 Nov 2020 21:58:16 +0100
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> On 2020-11-11 19:48 +0000, John Keeping spake thusly:
> > The only build-id style that is not reproducible is "uuid", but the
> > default is "sha1" so packages would have to go out of their way to
> > choose a non-reproducible build-id.  Having build IDs in general is
> > useful as it can be used to match split debuginfo.  
> 
> I would think we would then want to make sha1 the explicit build-id,
> rather than merely depend on the default (even though it has been sha1
> since 2007, it may change with future binutils versions).

Unfortunately this is difficult to do - we can't just force
--build-id=sha1 globally as some files really do need to be compiled
without a build ID (the glibc build breaks if you try this).

We could intercept the command line and override the type if --build-id
is specified, but I don't think that's really necessary.  Given the
wider push for reproducible builds, it seems that the defaults are
likely to remain reproducible - so as long as the toolchain is the same
then the output will be too.
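
For illustration, the interception mentioned above could look roughly like the following. This is only a Python sketch of the idea: the real Buildroot toolchain wrapper is a C program, and `force_sha1_build_id` is a hypothetical helper, not existing code. It rewrites any explicit build-id style to sha1 but leaves `--build-id=none` and commands without a build-id option alone, since some objects must be linked without one.

```python
def _rewrite(opt):
    """Map one linker option to its sha1 form if it selects a build-id style."""
    if opt == "--build-id":
        return "--build-id=sha1"
    if opt.startswith("--build-id=") and opt != "--build-id=none":
        return "--build-id=sha1"
    return opt


def force_sha1_build_id(argv):
    """Return a copy of argv with any explicit build-id style replaced by sha1.

    Hypothetical helper: handles the -Wl,--build-id[=style] form given to the
    compiler driver and the bare --build-id[=style] form given to the linker.
    """
    out = []
    for arg in argv:
        if arg.startswith("-Wl,"):
            # -Wl, carries comma-separated linker options; rewrite each one.
            parts = arg[len("-Wl,"):].split(",")
            out.append("-Wl," + ",".join(_rewrite(p) for p in parts))
        else:
            out.append(_rewrite(arg))
    return out


print(force_sha1_build_id(["gcc", "-Wl,-O1,--build-id=uuid", "-o", "app", "app.c"]))
```

Commands that never mention `--build-id` pass through unchanged, so the linker default still applies to them.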

> > I haven't seen any packages that do this, and if there are any that use  
> 
> Try not to use first person in commit messages; instead, use a neutral
> third-person. Also, I find your commit message to have things a bit
> shuffled around. We usually like to have commit logs that explain the
> current status, explain why this is a problem, and explain what is done
> to overcome the problem. For example:
> 
>     When a reproducible build is attempted (with BR2_REPRODUCIBLE=y), we forcibly
>     disable the use of build-ids, on the assumption that they are not reproducible,
>     as explained in b285c80143 (toolchain/toolchain-wrapper: explicitly pass
>     --build-id=none if BR2_REPRODUCIBLE).
> 
>     However, only the 'uuid' build-id is not reproducible. 'sha1' (and 'md5') are
>     based on the "normative parts of the output contents", so are stable across
>     builds.
> 
>     Re-enable use of the build-id even in reproducible builds. We do ensure that
>     the same build-id type is used, by forcing it to 'sha1'.
> 
> However, I would still doubt that build-ids are reproducible, even when
> explicitly set to 'sha1', which has been the default since 2007 now, and
> given that the commit referenced above was done because indeed they were
> not reproducible...

The comment on that commit says it's about building in different output
directories, but commit 71d6901 a month later added -ffile-prefix-map
(or overrides of __FILE__/__BASE_FILE__ for older compilers), so perhaps
that has resolved the problem that b285c80143 saw?

I ran two builds (admittedly of a minimal configuration - although it
does include libmount.so.1.1.0 which is referenced in the snippet below)
and the only differences I saw were in .pyc files which seems to be a
result of https://bugs.python.org/issue37596 - I'm planning to
experiment with exporting PYTHONHASHSEED when BR2_REPRODUCIBLE is set to
see if that works around the issue here.
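
As a small illustration of what a fixed PYTHONHASHSEED changes, the sketch below starts two fresh interpreters with the same seed and checks that they agree on a string hash. `str_hash_in_child` is a hypothetical helper name, and whether this setting is enough to make the .pyc files identical in a Buildroot build is exactly the part still to be tested.

```python
import os
import subprocess
import sys


def str_hash_in_child(seed):
    """Run a fresh interpreter with PYTHONHASHSEED=seed and return hash('buildroot')."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('buildroot'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


# With a fixed seed, independent interpreter runs compute the same str hash,
# so any ordering that depends on those hashes can repeat across runs.
first = str_hash_in_child(0)
second = str_hash_in_child(0)
print(first == second)
```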

> So, I am not sure what the "normative parts of the output contents" are,
> but it seems that there is something that is not reproducible, that
> influences the build-id. See also the snippet referenced from that
> commit https://gitlab.com/snippets/1886180/raw , which found that the
> only delta between two reproducible builds was exactly the sha1-based
> build-id...

I'm not sure there's any way to find out without trying it more widely
:-(

Having done a bit more research to spot the -ffile-prefix-map issue, I'm
actually more confident that this should be okay, so what would you
think about a patch with an improved commit message explaining why
b285c80143 may no longer apply?


Thanks,
John


* [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option
  2020-11-12 11:07   ` John Keeping
@ 2020-11-12 20:03     ` John Keeping
  2020-11-12 21:11       ` Yann E. MORIN
  2020-11-12 20:58     ` Yann E. MORIN
  1 sibling, 1 reply; 7+ messages in thread
From: John Keeping @ 2020-11-12 20:03 UTC (permalink / raw)
  To: buildroot

On Thu, 12 Nov 2020 11:07:17 +0000
John Keeping <john@metanate.com> wrote:

> On Wed, 11 Nov 2020 21:58:16 +0100
> "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
> 
> > However, I would still doubt that build-ids are reproducible, even when
> > explicitly set to 'sha1', which has been the default since 2007 now, and
> > given that the commit referenced above was done because indeed they were
> > not reproducible...
> 
> The comment on that commit says it's about building in different output
> directories, but commit 71d6901 a month later added -ffile-prefix-map
> (or overrides of __FILE__/__BASE_FILE__ for older compilers), so perhaps
> that has resolved the problem that b285c80143 saw?
> 
> I ran two builds (admittedly of a minimal configuration - although it
> does include libmount.so.1.1.0 which is referenced in the snippet below)
> and the only differences I saw were in .pyc files which seems to be a
> result of https://bugs.python.org/issue37596 - I'm planning to
> experiment with exporting PYTHONHASHSEED when BR2_REPRODUCIBLE is set to
> see if that works around the issue here.

I've done some more testing on this and unfortunately building in a
different directory does lead to non-reproducible output, including of
build-ids.

For the build-id, the issue is that .symtab includes STT_FILE references
to crti.o & crtn.o (and possibly start.o and others for executables) by
absolute path.  These entries are removed by strip(1) so the end result
doesn't include it, but it seems that this is part of the content used
to generate the build-id.
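
A toy model of that hypothesis, for illustration only: it is not ld's actual algorithm, and the section contents, paths, and `toy_*` names are all made up. It hashes every section at "link time", including a symbol table that records crti.o by absolute path, and then shows that the stripped contents are identical even though the ids are not.

```python
import hashlib


def toy_build_id(sections):
    """SHA-1 over the concatenated section contents (toy stand-in for ld)."""
    digest = hashlib.sha1()
    for name in sorted(sections):
        digest.update(sections[name])
    return digest.hexdigest()


def toy_link(build_dir):
    """Hypothetical output: identical code, but .symtab names crti.o by absolute path."""
    return {
        ".text": b"\x00\x01\x02\x03",
        ".symtab": (build_dir + "/sysroot/usr/lib/crti.o").encode() + b"\x00",
    }


def toy_strip(sections):
    """Drop .symtab, loosely mimicking what strip(1) removes."""
    return {name: data for name, data in sections.items() if name != ".symtab"}


a = toy_link("/home/user/build-a")
b = toy_link("/home/user/build-b")
print(toy_build_id(a) == toy_build_id(b))  # False: ids computed before stripping differ
print(toy_strip(a) == toy_strip(b))        # True: what remains after stripping matches
```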

Weirdly, I don't see that for x86_64 but I do for ARM32.

However, ignoring build-ids, building in a different directory (or at
least a directory with a different path length) leads to other
differences in the output.  I see output like:

	-  [ 6] .dynstr           STRTAB          00010f10 000f10 0006dc 00   A  0   0  1
	+  [ 6] .dynstr           STRTAB          00010f10 000f10 0006dd 00   A  0   0  1

where the size differs by one (because my build directories happened to
differ in length by one I think) and then the offsets of everything else
are shifted correspondingly.

The diff for .dynstr there is:

	# readelf --wide --decompress --hex-dump=.dynstr {}
	@@ -105,9 +105,9 @@
	   0x00011570 005f5f6c 6962635f 73746172 745f6d61 .__libc_start_ma
	   0x00011580 696e0073 7973636f 6e660047 4c494243 in.sysconf.GLIBC
	   0x00011590 5f322e34 00474c49 42435f32 2e323800 _2.4.GLIBC_2.28.
	   0x000115a0 00585858 58585858 58585858 58585858 .XXXXXXXXXXXXXXX
	   0x000115b0 58585858 58585858 58585858 58585858 XXXXXXXXXXXXXXXX
	   0x000115c0 58585858 58585858 58585858 58585858 XXXXXXXXXXXXXXXX
	   0x000115d0 58585858 58585858 58585858 58585858 XXXXXXXXXXXXXXXX
	-  0x000115e0 58585858 58585858 58585800          XXXXXXXXXXX.
	+  0x000115e0 58585858 58585858 58585858 00       XXXXXXXXXXXX.

So it looks like if the build path is different, then there will be
differences in the output independent of the build-id.
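
The size arithmetic can be sketched as follows. The string table entries and paths here are hypothetical, and the layout ignores alignment padding, but it shows how one extra character in an embedded path grows the table by one byte and shifts everything placed after it.

```python
def strtab_size(strings):
    """Size of an ELF string table: a leading NUL, then each string NUL-terminated."""
    return 1 + sum(len(s.encode()) + 1 for s in strings)


def section_offsets(sizes, start=0):
    """Lay sections out back to back and return each one's file offset."""
    offsets, pos = [], start
    for size in sizes:
        offsets.append(pos)
        pos += size
    return offsets


# Hypothetical entries: the last one embeds the build path.
common = ["__libc_start_main", "sysconf", "GLIBC_2.4", "GLIBC_2.28"]
short = strtab_size(common + ["/tmp/out-a/host/lib"])
longer = strtab_size(common + ["/tmp/out-ab/host/lib"])

print(longer - short)
print(section_offsets([short, 64, 32]))
print(section_offsets([longer, 64, 32]))
```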

My conclusion from this is that BR2_REPRODUCIBLE is more broken than I
thought :-(   But build-id is not itself a problem as far as I can tell
- when the build-id is different then generally there is also some other
difference in the file (even when stripped).


John


* [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option
  2020-11-12 11:07   ` John Keeping
  2020-11-12 20:03     ` John Keeping
@ 2020-11-12 20:58     ` Yann E. MORIN
  1 sibling, 0 replies; 7+ messages in thread
From: Yann E. MORIN @ 2020-11-12 20:58 UTC (permalink / raw)
  To: buildroot

John, All,

On 2020-11-12 11:07 +0000, John Keeping spake thusly:
> On Wed, 11 Nov 2020 21:58:16 +0100
> "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
> 
> > On 2020-11-11 19:48 +0000, John Keeping spake thusly:
> > > The only build-id style that is not reproducible is "uuid", but the
> > > default is "sha1" so packages would have to go out of their way to
> > > choose a non-reproducible build-id.  Having build IDs in general is
> > > useful as it can be used to match split debuginfo.  
> > I would think we would then want to make sha1 the explicit build-id,
> > rather than merely depend on the default (even though it has been sha1
> > since 2007, it may change with future binutils versions).
> Unfortunately this is difficult to do - we can't just force
> --build-id=sha1 globally as some files really do need to be compiled
> without a build ID (the glibc build breaks if you try this).

This is... "interesting"...

> We could intercept the command line and override the type if --build-id
> is specified, but I don't think that's really necessary.

Indeed, this would be going a bit too far...

>  Given the
> wider push for reproducible builds, it seems that the defaults are
> likely to remain reproducible

Well, I would not be so sure... sha1 is now an old hash, it would not be
too surprising that sha2 build-ids appear at some point in the future...

> - so as long as the toolchain is the same
> then the output will be too.

We're obviously only considering reproducible builds with the same input
*and* the same tools.

> >     When a reproducible build is attempted (with BR2_REPRODUCIBLE=y), we forcibly
> >     disable the use of build-ids, on the assumption that they are not reproducible,
> >     as explained in b285c80143 (toolchain/toolchain-wrapper: explicitly pass
> >     --build-id=none if BR2_REPRODUCIBLE).
> The comment on that commit says it's about building in different output
> directories, but commit 71d6901 a month later added -ffile-prefix-map
> (or overrides of __FILE__/__BASE_FILE__ for older compilers), so perhaps
> that has resolved the problem that b285c80143 saw?

Maybe; it would indeed be interesting to analyse the impact of 71d6901
on the build id.

> > So, I am not sure what the "normative parts of the output contents" are,
> > but it seems that there is something that is not reproducible, that
> > influences the build-id. See also the snippet referenced from that
> > commit https://gitlab.com/snippets/1886180/raw , which found that the
> > only delta between two reproducible builds was exactly the sha1-based
> > build-id...
> 
> I'm not sure there's any way to find out without trying it more widely
> :-(
> 
> Having done a bit more research to spot the -ffile-prefix-map issue, I'm
> actually more confident that this should be okay, so what would you
> think about a patch with an improved commit message explaining why
> b285c80143 may no longer apply?

Yes, referencing past commits and explaining why they no longer apply,
and when they were superseded, is a very valuable resource.

Regards,
Yann E. MORIN.


* [Buildroot] [RFC PATCH] toolchain/toolchain-wrapper: remove --build-id=none option
  2020-11-12 20:03     ` John Keeping
@ 2020-11-12 21:11       ` Yann E. MORIN
  0 siblings, 0 replies; 7+ messages in thread
From: Yann E. MORIN @ 2020-11-12 21:11 UTC (permalink / raw)
  To: buildroot

John, All,

On 2020-11-12 20:03 +0000, John Keeping spake thusly:
> On Thu, 12 Nov 2020 11:07:17 +0000
> John Keeping <john@metanate.com> wrote:
> > On Wed, 11 Nov 2020 21:58:16 +0100
> > "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
> > > However, I would still doubt that build-ids are reproducible, even when
> > > explicitly set to 'sha1', which has been the default since 2007 now, and
> > > given that the commit referenced above was done because indeed they were
> > > not reproducible...
> > The comment on that commit says it's about building in different output
> > directories, but commit 71d6901 a month later added -ffile-prefix-map
> > (or overrides of __FILE__/__BASE_FILE__ for older compilers), so perhaps
> > that has resolved the problem that b285c80143 saw?
> > 
> > I ran two builds (admittedly of a minimal configuration - although it
> > does include libmount.so.1.1.0 which is referenced in the snippet below)
> > and the only differences I saw were in .pyc files which seems to be a
> > result of https://bugs.python.org/issue37596 - I'm planning to
> > experiment with exporting PYTHONHASHSEED when BR2_REPRODUCIBLE is set to
> > see if that works around the issue here.
> 
> I've done some more testing on this and unfortunately building in a
> different directory does lead to non-reproducible output, including of
> build-ids.
> 
> For the build-id, the issue is that .symtab includes STT_FILE references
> to crti.o & crtn.o (and possibly start.o and others for executables) by
> absolute path.  These entries are removed by strip(1) so the end result
> doesn't include it, but it seems that this is part of the content used
> to generate the build-id.
> 
> Weirdly, I don't see that for x86_64 but I do for ARM32.
> 
> However, ignoring build-ids, building in a different directory (or at
> least a directory with a different path length) leads to other
> differences in the output.  I see output like:
> 
> 	-  [ 6] .dynstr           STRTAB          00010f10 000f10 0006dc 00   A  0   0  1
> 	+  [ 6] .dynstr           STRTAB          00010f10 000f10 0006dd 00   A  0   0  1
> 
> where the size differs by one (because my build directories happened to
> differ in length by one I think) and then the offsets of everything else
> are shifted correspondingly.

Ah, but the reproducibility efforts so far only try with paths of the
same length. We're not even trying (yet!) with paths of arbitrary length.

> So it looks like if the build path is different, then there will be
> differences in the output independent of the build-id.

The delta you observed is not unexpected. This should not be a
deterrent to trying to change the current option in the wrapper.

> My conclusion from this is that BR2_REPRODUCIBLE is more broken than I
> thought :-(   But build-id is not itself a problem as far as I can tell
> - when the build-id is different then generally there is also some other
> difference in the file (even when stripped).

Thanks for the experiment, but again, this is not a proper argument
against your initial change, as a delta in this case is not unexpected.

So, please, can you retry this experiment with two directories of the
same length and compare the build ids?

If the result does not show a delta in build ids, then you should respin
your initial patch, with a commit log that references the commits we
already talked about, plus the results of your experiment. That should
be it, I believe.
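
For the comparison itself, a minimal sketch of pulling the build id out of raw ELF note bytes is shown below. It assumes the contents of the .note.gnu.build-id section have already been extracted from each binary (`readelf -n` prints the same value); `build_id_from_notes` and `make_note` are hypothetical helpers, and the notes used here are synthetic.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for the GNU build id


def _align4(n):
    """Round n up to the next multiple of 4, as note fields are padded."""
    return (n + 3) & ~3


def build_id_from_notes(data, little_endian=True):
    """Return the GNU build id as hex from raw ELF note bytes, or None if absent."""
    fmt = "<III" if little_endian else ">III"
    pos = 0
    while pos + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from(fmt, data, pos)
        pos += 12
        name = data[pos:pos + namesz]
        pos += _align4(namesz)
        desc = data[pos:pos + descsz]
        pos += _align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None


def make_note(build_id):
    """Build a synthetic little-endian build-id note for demonstration."""
    name = b"GNU\x00"
    padding = b"\x00" * (_align4(len(build_id)) - len(build_id))
    header = struct.pack("<III", len(name), len(build_id), NT_GNU_BUILD_ID)
    return header + name + build_id + padding


note_a = make_note(bytes(range(20)))
note_b = make_note(bytes(range(20)))
print(build_id_from_notes(note_a) == build_id_from_notes(note_b))
```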

Thanks for the back-n-forth on the topic! :-)

Regards,
Yann E. MORIN.
