* The state of reproducible Builds
@ 2019-07-01 15:58 Joshua Watt
  2019-07-02  0:43 ` Douglas Royds
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Joshua Watt @ 2019-07-01 15:58 UTC (permalink / raw)
  To: OE-core

All,

I've been working on making OE builds reproducible (that is, two given 
builds can have binary-identical outputs). The current "test" for 
reproducibility involves building core-image-minimal in two different 
build directories, then doing a binary diff of the resulting target 
Debian package files and reporting if any of them differ (I'd like to 
expand this test, see below). I believe that we are very close to 
achieving this level of reproducibility, with a few caveats as shown below:

1. Both builds must be clean builds from scratch

2. Neither build can use sstate (sstate isn't currently reproducible for 
a variety of reasons, more on that later)

3. The QA test for reproducibility takes about 4 hours on my 4/8 core 
i7-3770 CPU @ 3.40GHz. I'm not sure how "expensive" a test has to be 
before it can't reasonably be run on the autobuilders, but I'm guessing 
this isn't a QA test that would currently be able to be run very often 
(if at all). If sstate were reproducible, this would effectively be cut 
in half, since you would only need one clean build from scratch (if that 
would even matter).
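
The comparison step of the test described above could be sketched roughly as follows. This is only an illustration and not the actual QA test: the function names and the `*.deb` glob are assumptions, and the two directory arguments stand in for the package deploy directories of the two builds.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_packages(deploy_dir, pattern="*.deb"):
    """Map each package path (relative to deploy_dir) to its digest."""
    root = Path(deploy_dir)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob(pattern))
    }


def compare_builds(dir_a, dir_b, pattern="*.deb"):
    """Return (differing, only_in_a, only_in_b) as sorted name lists."""
    a = index_packages(dir_a, pattern)
    b = index_packages(dir_b, pattern)
    differing = sorted(n for n in a.keys() & b.keys() if a[n] != b[n])
    only_in_a = sorted(a.keys() - b.keys())
    only_in_b = sorted(b.keys() - a.keys())
    return differing, only_in_a, only_in_b
```

A build pair would count as reproducible under this sketch only if all three returned lists are empty.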


The current test is obviously deficient in a few areas, but I believe 
that is at the very least a good starting point since it has already 
uncovered numerous reproducibility issues. The places where I think it 
needs to be improved are:

1. Testing RPM and IPK package formats. I think RPMs will be pretty 
easy; IPKs might be more challenging since AFAIK the tools that make 
them don't generate reproducible output to begin with.

2. Testing more images than core-image-minimal. This should be pretty 
straightforward to add to the QA test; it's mostly a matter of fixing 
all the issues that come up.

3. Test for binary reproducible images (e.g. check that the entire ext4 
image produced is binary identical). This one also might be pretty easy 
for some formats, and hard for others (e.g. ext4 I think would be easy, 
squashfs might be hard).

4. Improve the test to better exercise timestamp changes. Currently, the 
QA test runs the two test builds serially, which ensures that they have 
different timestamps when building. However, some packages are 
non-reproducible only with respect to the day, month, or year, none of 
which is likely to differ between two serial test builds. I would like 
to figure out a way to force the two builds to be separated by a 
sufficient amount of time to tease out these issues. This might be as 
easy as running bitbake under faketime, or it might be more involved.

5. I don't know if anyone is clamoring for reproducible nativesdk builds?

6. We should also be testing if sstate objects are reproducible, 
otherwise sstate can't really be relied on when doing a reproducible 
build (In fact, I think the original reproducible build work that I took 
over was focused on making sstate reproducible).
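
To illustrate the timestamp concern in item 4, here is a small sketch of why two builds run a few hours apart can hide a date dependency. The `build_stamp` function is hypothetical; it stands in for any package that embeds the build date, and it honors SOURCE_DATE_EPOCH the way well-behaved tools are expected to.

```python
import os
import time


def build_stamp(now, env=None, fmt="%Y-%m-%d"):
    """Return the date string a (hypothetical) package embeds at build time.

    If SOURCE_DATE_EPOCH is set in `env`, it overrides the clock value `now`.
    """
    env = os.environ if env is None else env
    epoch = env.get("SOURCE_DATE_EPOCH")
    seconds = int(epoch) if epoch is not None else now
    return time.strftime(fmt, time.gmtime(seconds))


first = 1561939200 + 6 * 3600      # 2019-07-01 06:00 UTC
second = first + 4 * 3600          # a serial rebuild four hours later
shifted = first + 400 * 86400      # a rebuild with the clock moved ~13 months

# The serial rebuild embeds the same date, so the dependency goes unnoticed.
same_day = build_stamp(first, env={}) == build_stamp(second, env={})
# Only the large clock shift exposes it.
exposed = build_stamp(first, env={}) != build_stamp(shifted, env={})
```

With a fixed SOURCE_DATE_EPOCH in the environment, all three calls would return the same string, which is the behaviour a reproducible package needs.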


I think that OE has some significant advantages in being able to make 
reproducible builds compared to other projects attempting the same 
thing; primarily, we are capable of building up all (or most) of the 
required build tools internally, then using these internal tools to 
build up the target (e.g. we build GCC for the target, then use this 
built GCC to compile target source). This means that we have a great 
opportunity to isolate the build from the host environment and truly 
achieve "simple" reproducible builds; any given set of layers at their 
respective SHAs should be able to build binary-identical output on 
any given host, with (ideally) no dependency on the host. We can't do 
this today, and I've identified a number of roadblocks that will need to 
be resolved (this is not a complete list; there will be more):

1. HOSTTOOLS differences. There are a lot of tools listed in HOSTTOOLS, 
and unfortunately some of them have version dependent output and are 
used for target builds (the one I've currently stumbled upon is pod2man, 
but I'm sure there are others). Unfortunately, one could probably argue 
that HOSTTOOLS is somewhat antithetical to the above statement, at least 
in regard to target builds. Any host tool output that "leaks" into the 
target build output can result in a non-reproducible build across hosts, 
and possibly should be avoided; the alternative is to use (or mandate) 
the corresponding -native recipe that provides that tool as a DEPENDS so 
that the controlled internally built version is used instead. Note that 
this only really applies to target builds, not -native (or nativesdk 
right now). -native recipes would obviously need more HOSTTOOLS to help 
bootstrap the system. I suspect this would require reworking how 
HOSTTOOLS works so that they can be split into two categories somehow; 
the tools that have "ubiquitous and stable" interfaces and are fine for 
all recipes (e.g. cat, sed, true, rm, etc.) and those that are variable 
and should only be used for -native builds (e.g. pod2man, rpcgen(?), 
chrpath(?), tar(?)... others?). Anyone have thoughts on this?

2. sstate currently isn't reproducible. This is at least partially 
related to why non-clean rebuilds aren't reproducible[1]. The two 
are related because AFAIK there isn't really any way of knowing if an 
sstate object came from a clean build of a recipe or a rebuild of a 
recipe, so as long as rebuilds aren't reproducible, neither will sstate 
be reproducible. The simplest fix for these problems is to add more 
-native tools to DEPENDS if they are used by the builds, so that they are 
"stable" across all the tasks where it matters, but there might also be 
some more "tricky" things that can/should be done with the 
recipe-specific sysroot (RSS) to help mitigate the problem. The 
HOSTTOOLS issue also makes sstate 
non-reproducible, since AFAIK, there isn't necessarily a way to ensure 
that a sstate object came from a specific host. In fact, I would 
speculate that most core reproducibility issues will also make sstate 
non-reproducible. Reproducible sstate also plays directly into hash 
equivalence, since it is based on sstate and would be *much* more 
effective if sstate were reproducible.
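
The two-category HOSTTOOLS split proposed in item 1 could be modelled along these lines. This is a sketch under loud assumptions: the two sets below are illustrative only (several of the "variable" tools were listed with a question mark in the discussion), and `host_leaks` is a hypothetical helper, not an existing bitbake or OE-core function.

```python
# Illustrative classification only; not an agreed or official list.
STABLE_HOSTTOOLS = {"cat", "sed", "true", "rm"}
VARIABLE_HOSTTOOLS = {"pod2man", "rpcgen", "chrpath", "tar"}


def host_leaks(tools_used, native_provided, is_native_recipe=False):
    """Return the version-sensitive tools a recipe would take from the host.

    tools_used:       names of tools the recipe's tasks invoke
    native_provided:  tools supplied by the recipe's -native DEPENDS
    is_native_recipe: -native recipes may use host tools to bootstrap
    """
    if is_native_recipe:
        return set()
    return {
        tool
        for tool in tools_used
        if tool in VARIABLE_HOSTTOOLS and tool not in native_provided
    }
```

For example, a target recipe that calls `sed` and `pod2man` without a -native dependency providing `pod2man` would be flagged for `pod2man` only; adding that dependency clears the result.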


Many of the remaining problems can be solved by adding more -native 
recipes to DEPENDS, but this has met with some (justified) pushback; 
doing this will likely increase the build time since more -native 
dependencies will mean more -native tools have to be built, and more 
serialization of the builds waiting for those tools to be built. I 
suspect this is more true for replacing HOSTTOOLS with -native recipes, 
since many of them may not have needed to be built at all. For the 
sstate/rebuild reproducibility this is likely to have less impact since 
those recipes would eventually have been built anyway for inclusion in 
the RSS. Adding them to DEPENDS just pulls them in sooner.


I'm curious what people think about all this. How important is 
reproducibility? How reproducible do we want to be? How hard should it 
be to have reproducible builds? What trade-offs are we willing to make 
for reproducible builds? Are there smart ways we can mitigate some of 
the potential performance impacts of reproducible builds?


Thanks for your time. I know this was a long e-mail.

Joshua Watt

[1]: https://bugzilla.yoctoproject.org/show_bug.cgi?id=13378






* Re: The state of reproducible Builds
  2019-07-01 15:58 The state of reproducible Builds Joshua Watt
@ 2019-07-02  0:43 ` Douglas Royds
  2019-07-02  0:57   ` Joshua Watt
  2019-07-02 13:26 ` Adrian Bunk
  2019-07-02 15:39 ` Martin Jansa
  2 siblings, 1 reply; 7+ messages in thread
From: Douglas Royds @ 2019-07-02  0:43 UTC (permalink / raw)
  To: Joshua Watt, OE-core

On 2/07/19 3:58 AM, Joshua Watt wrote:

> 1. Testing RPM and IPK package formats. I think RPMs will be pretty 
> easy; IPKs might be more challenging since AFAIK the tools that make 
> them don't generate reproducible output to begin with.


This has not been my experience. I have been building reproducible ipks; 
indeed, it is the hashsums of the ipks that I've been examining. In most 
cases, the correct SOURCE_DATE_EPOCH is enough, but there have been 
cases where I've had to correct upstream projects to cope with the 
SOURCE_DATE_EPOCH or avoid the effect of differing uname settings.
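
As a rough illustration of why a correct SOURCE_DATE_EPOCH is often enough, the sketch below packs files into a tar archive with normalized metadata. It shows the general technique only and is not how the ipk tooling itself works; `build_archive` and its arguments are assumptions made for the example.

```python
import io
import tarfile


def build_archive(files, source_date_epoch):
    """Pack {name: bytes} into an uncompressed tar with normalized metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):              # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = source_date_epoch      # same timestamp in every build
            info.mode = 0o644
            info.uid = info.gid = 0             # no builder-specific ownership
            info.uname = info.gname = "root"
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Two calls with the same contents and the same epoch return identical bytes regardless of the order in which the files were supplied, so their hashsums match.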


> 1. HOSTTOOLS differences. There are a lot of tools listed in 
> HOSTTOOLS, and unfortunately some of them have version dependent 
> output and are used for target builds (the one I've currently stumbled 
> upon is pod2man, but I'm sure there are others). Unfortunately, one 
> could probably argue that HOSTTOOLS is somewhat antithetical to the 
> above statement, at least in regard to target builds. Any host tool 
> output that "leaks" into the target build output can result in a 
> non-reproducible build across hosts, and possibly should be avoided; 
> the alternative is to use (or mandate) the corresponding -native 
> recipe that provides that tool as a DEPENDS so that the controlled 
> internally built version is used instead. Note that this only really 
> applies to target builds, not -native (or nativesdk right now). -native 
> recipes would obviously need more HOSTTOOLS to help bootstrap the 
> system. I suspect this would require reworking how HOSTTOOLS works so 
> that they can be split into two categories somehow; the tools that 
> have "ubiquitous and stable" interfaces and are fine for all recipes 
> (e.g. cat, sed, true, rm, etc.) and those that are variable and should 
> only be used for -native builds (e.g. pod2man, rpcgen(?), chrpath(?), 
> tar(?)... others?). Anyone have thoughts on this?


Perhaps reproducibility is the decision point for adding a tool to 
HOSTTOOLS: if the precise version of the tool has no impact on 
reproducibility (e.g. cat, sed, and even gawk), it is a good candidate 
for HOSTTOOLS. pod2man shouldn't be in HOSTTOOLS, because we 
need to control the version.




* Re: The state of reproducible Builds
  2019-07-02  0:43 ` Douglas Royds
@ 2019-07-02  0:57   ` Joshua Watt
  0 siblings, 0 replies; 7+ messages in thread
From: Joshua Watt @ 2019-07-02  0:57 UTC (permalink / raw)
  To: Douglas Royds; +Cc: OE-core


On Mon, Jul 1, 2019, 7:43 PM Douglas Royds <douglas.royds@taitradio.com>
wrote:

> On 2/07/19 3:58 AM, Joshua Watt wrote:
>
> > 1. Testing RPM and IPK package formats. I think RPMs will be pretty
> > easy; IPKs might be more challenging since AFAIK the tools that make
> > them don't generate reproducible output to begin with.
>
>
> This has not been my experience. I have been building reproducible ipks,
> indeed, it is the hashsums of the ipks that I've been examining. In most
> cases, the correct SOURCE_DATE_EPOCH is enough, but there have been
> cases where I've had to correct upstream projects to cope with the
> SOURCE_DATE_EPOCH or avoid the effect of differing uname settings.
>

Ah, fair enough. I must have misremembered something.


>
> > 1. HOSTTOOLS differences. There are a lot of tools listed in
> > HOSTTOOLS, and unfortunately some of them have version dependent
> > output and are used for target builds (the one I've currently stumbled
> > upon is pod2man, but I'm sure there are others). Unfortunately, one
> > could probably argue that HOSTTOOLS is somewhat antithetical to the
> > above statement, at least in regard to target builds. Any host tool
> > output that "leaks" into the target build output can result in a
> > non-reproducible build across hosts, and possibly should be avoided;
> > the alternative is to use (or mandate) the corresponding -native
> > recipe that provides that tool as a DEPENDS so that the controlled
> > internally built version is used instead. Note that this only really
> > applies to target builds, not -native (or nativesdk right now). -native
> > recipes would obviously need more HOSTTOOLS to help bootstrap the
> > system. I suspect this would require reworking how HOSTTOOLS works so
> > that they can be split into two categories somehow; the tools that
> > have "ubiquitous and stable" interfaces and are fine for all recipes
> > (e.g. cat, sed, true, rm, etc.) and those that are variable and should
> > only be used for -native builds (e.g. pod2man, rpcgen(?), chrpath(?),
> > tar(?)... others?). Anyone have thoughts on this?
>
>
> Perhaps reproducibility is the decision-point for adding a tool to the
> HOSTTOOLS: If the precise version of the tool has no impact on
> reproducibility (eg. cat, sed, and even gawk), it is a good candidate
> for the HOSTTOOLS. pod2man shouldn't be in the HOSTTOOLS, because we
> need to control the version.
>
>



* Re: The state of reproducible Builds
  2019-07-01 15:58 The state of reproducible Builds Joshua Watt
  2019-07-02  0:43 ` Douglas Royds
@ 2019-07-02 13:26 ` Adrian Bunk
  2019-07-02 14:13   ` Joshua Watt
  2019-07-02 15:39 ` Martin Jansa
  2 siblings, 1 reply; 7+ messages in thread
From: Adrian Bunk @ 2019-07-02 13:26 UTC (permalink / raw)
  To: Joshua Watt; +Cc: OE-core

On Mon, Jul 01, 2019 at 10:58:04AM -0500, Joshua Watt wrote:
>...
> 1. HOSTTOOLS differences. There are a lot of tools listed in HOSTTOOLS, and
> unfortunately some of them have version dependent output and are used for
> target builds (the one I've currently stumbled upon is pod2man, but I'm sure
> there are others). Unfortunately, one could probably argue that HOSTTOOLS is
> somewhat antithetical to the above statement, at least in regard to target
> builds. Any host tool output that "leaks" into the target build output can
> result in a non-reproducible build across hosts, and possibly should be
> avoided; the alternative is to use (or mandate) the corresponding -native
> recipe that provides that tool as a DEPENDS so that the controlled
> internally built version is used instead. Note that this only really applies
> to target builds, not -native (or nativesdk right now). -native recipes would
> obviously need more HOSTTOOLS to help bootstrap the system. I suspect this
> would require reworking how HOSTTOOLS works so that they can be split into
> two categories somehow; the tools that have "ubiquitous and stable"
> interfaces and are fine for all recipes (e.g. cat, sed, true, rm, etc.) and
> those that are variable and should only be used for -native builds (e.g.
> pod2man, rpcgen(?), chrpath(?), tar(?)... others?). Anyone have thoughts on
> this?
>...

What is the goal?

1. being able to prove that a given binary has actually been 
   built from the correct sources, or
2. builds on all hosts have the same output

With 1. you can just record all host properties like installed packages
and running kernel, and it isn't a problem if different hosts result in
different output.

With 2. any kind of differences due to host differences is a problem.
You need -native for nearly everything, and then fix all other kinds of 
differences like the version of the running kernel recorded somewhere.

For detecting malicious binaries not built from the claimed sources 1. is 
sufficient. For distributions like Debian that build natively this is 
even the only option available since the host compiler is used.

Doing 2. would of course be more desirable, but it can also be done in 
a second step after all issues related to building on exactly the same
host have been sorted out.
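
Recording host properties for goal 1 might look something like the sketch below. It is a minimal illustration: the chosen fields and the `extra` argument (where a distro-specific installed-package list could be passed in) are assumptions, not an established format.

```python
import json
import platform
import sys


def host_manifest(extra=None):
    """Collect a few host properties that could be archived with a build."""
    uname = platform.uname()
    manifest = {
        "system": uname.system,
        "kernel_release": uname.release,
        "machine": uname.machine,
        "python": sys.version.split()[0],
    }
    # Caller-supplied data, e.g. an installed-package list (hypothetical).
    manifest.update(extra or {})
    return manifest


def dump_manifest(manifest):
    """Serialize with sorted keys so the record itself is stable."""
    return json.dumps(manifest, sort_keys=True, indent=2)
```

Archiving such a record next to the build output documents which host produced it, without requiring different hosts to produce identical output.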

> Joshua Watt
>...

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed




* Re: The state of reproducible Builds
  2019-07-02 13:26 ` Adrian Bunk
@ 2019-07-02 14:13   ` Joshua Watt
  2019-07-02 14:32     ` Martin Hundebøll
  0 siblings, 1 reply; 7+ messages in thread
From: Joshua Watt @ 2019-07-02 14:13 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: OE-core


On 7/2/19 8:26 AM, Adrian Bunk wrote:
> On Mon, Jul 01, 2019 at 10:58:04AM -0500, Joshua Watt wrote:
>> ...
>> 1. HOSTTOOLS differences. There are a lot of tools listed in HOSTTOOLS, and
>> unfortunately some of them have version dependent output and are used for
>> target builds (the one I've currently stumbled upon is pod2man, but I'm sure
>> there are others). Unfortunately, one could probably argue that HOSTTOOLS is
>> somewhat antithetical to the above statement, at least in regard to target
>> builds. Any host tool output that "leaks" into the target build output can
>> result in a non-reproducible build across hosts, and possibly should be
>> avoided; the alternative is to use (or mandate) the corresponding -native
>> recipe that provides that tool as a DEPENDS so that the controlled
>> internally built version is used instead. Note that this only really applies
>> to target builds, not -native (or nativesdk right now). -native recipes would
>> obviously need more HOSTTOOLS to help bootstrap the system. I suspect this
>> would require reworking how HOSTTOOLS works so that they can be split into
>> two categories somehow; the tools that have "ubiquitous and stable"
>> interfaces and are fine for all recipes (e.g. cat, sed, true, rm, etc.) and
>> those that are variable and should only be used for -native builds (e.g.
>> pod2man, rpcgen(?), chrpath(?), tar(?)... others?). Anyone have thoughts on
>> this?
>> ...
> What is the goal?
>
> 1. being able to prove that a given binary has actually been
>     built from the correct sources, or
> 2. builds on all hosts have the same output
I'm not sure there is just one goal...
> With 1. you can just record all host properties like installed packages
> and running kernel, and it isn't a problem if different hosts result in
> different output.

Right... I know that my employer would really like this sort of binary 
reproducibility; that is, we should be able to pull some archived code 
out of our salt mine, build it, and know it's the same binary that our 
customers have. I think if you combine what we have today and some sort 
of reproducible host image (archived Docker container, virtual machine, 
et al.) we are pretty close to that.


>
> With 2. any kind of differences due to host differences is a problem.
> You need -native for nearly everything, and then fix all other kinds of
> differences like the version of the running kernel recorded somewhere.

Yes. I would hope that after using mostly -native tools where 
applicable, the currently running kernel wouldn't figure into the build 
of target packages... if it does I would venture to say that is a 
cross-compiling/reproducibility bug in the package.

Also, to be clear, I'm hoping we don't need to go so far as to say that 
-native recipes themselves need to be reproducible; as long as they 
always generate reproducible output regardless of which host they were 
built on, I suspect they don't need to be.

>
> For detecting malicious binaries not built from the claimed sources 1. is
> sufficient. For distributions like Debian that build natively this is
> even the only option available since the host compiler is used.
>
> Doing 2. would of course be more desirable, but it can also be done in
> a second step after all issues related to building on exactly the same
> host have been sorted out.

I think there are also other use cases for #2 besides detecting 
malicious binaries/source code, such as hash equivalence, or even being 
able to use sstate when making a reproducible build. You are correct that 
this can be done in a second step, but I think that everyone needs to be 
aware of the limitations that will present when #2 is not present (the 
main one being that you probably can't make a reproducible build if you 
use sstate).

>
>> Joshua Watt
>> ...
> cu
> Adrian
>



* Re: The state of reproducible Builds
  2019-07-02 14:13   ` Joshua Watt
@ 2019-07-02 14:32     ` Martin Hundebøll
  0 siblings, 0 replies; 7+ messages in thread
From: Martin Hundebøll @ 2019-07-02 14:32 UTC (permalink / raw)
  To: openembedded-core

Hi,

On 02/07/2019 16.13, Joshua Watt wrote:
>> For detecting malicious binaries not built from the claimed sources 1. is
>> sufficient. For distributions like Debian that build natively this is
>> even the only option available since the host compiler is used.
>>
>> Doing 2. would of course be more desirable, but it can also be done in
>> a second step after all issues related to building on exactly the same
>> host have been sorted out.
> 
> I think there are also other use cases for #2 besides detecting 
> malicious binaries/source code, such as hash equivalence, or even being 
> able to use sstate when making a reproducible build. You are correct that 
> this can be done in a second step, but I think that everyone needs to be 
> aware of the limitations that will present when #2 is not present (the 
> main one being that you probably can't make a reproducible build if you 
> use sstate).

Our use case for reproducible builds is to limit delta update sizes, 
i.e. updating one package shouldn't change the binary output of other, 
independent packages.

// Martin



* Re: The state of reproducible Builds
  2019-07-01 15:58 The state of reproducible Builds Joshua Watt
  2019-07-02  0:43 ` Douglas Royds
  2019-07-02 13:26 ` Adrian Bunk
@ 2019-07-02 15:39 ` Martin Jansa
  2 siblings, 0 replies; 7+ messages in thread
From: Martin Jansa @ 2019-07-02 15:39 UTC (permalink / raw)
  To: Joshua Watt; +Cc: OE-core


On Mon, Jul 01, 2019 at 10:58:04AM -0500, Joshua Watt wrote:
> I'm curious what people think about all this. How important is 
> reproducibility? How reproducible do we want to be? How hard should it 
> be to have reproducible builds? What trade-offs are we willing to make 
> for reproducible builds? Are there smart ways we can mitigate some of 
> the potential performance impacts of reproducible builds?

For me 100% reproducibility isn't a hard requirement, but every
reproducible bit makes it more useful.

Once we upgrade to a newer Yocto release which includes more of these
fixes, my plan is to achieve these milestones:

1) no changes in buildhistory reports (especially files-in-image.txt)
between 2 clean builds on the same host
2) same as above, but on different hosts, but with the same OS (now we use
Ubuntu 18.04 on all LGE builds)
3) same as above but with a different OS on the host (not important to us, but
interesting to see which host differences cause differences in the
target image).

A) no changes in installed files (not only in their ls -l output shown in
the buildhistory reports), again first on the same host and, once that
works, then maybe on different hosts with the same OS and maybe then
also with a different OS

B) no changes in the .ipk files (after the packaged bits are identical)

C) with the hash equivalence server we might get rid of .ipk files
having different EXTENDPRAUTO from PRserv when they are rebuilt just
because some dependency changed the signature.

And all these milestones also have another scope axis (it's great to
have everything reproducible in core-image-minimal, but there might
still be a lot of differences in bigger images, and our images are really
big) - but again every reproducible bit helps. Once the low-hanging
fruit is fixed, it will be easier to see what is causing most of the
remaining differences, or even to filter out the bits known to be
non-reproducible when comparing 2 images.
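
Filtering out known non-reproducible bits when comparing two images could be sketched as below. The input format is an assumption: each listing is taken to be a dict mapping a file path to whatever details are recorded for it, and the ignore patterns are hypothetical shell-style globs. Parsing a real buildhistory report into that shape is left out.

```python
from fnmatch import fnmatch


def diff_listings(old, new, ignore=()):
    """Compare two {path: details} listings, skipping ignored path patterns.

    Returns a sorted list of (kind, path) tuples, where kind is one of
    "added", "removed", or "changed".
    """
    def ignored(path):
        return any(fnmatch(path, pattern) for pattern in ignore)

    changes = []
    for path in sorted(set(old) | set(new)):
        if ignored(path):
            continue
        if path not in new:
            changes.append(("removed", path))
        elif path not in old:
            changes.append(("added", path))
        elif old[path] != new[path]:
            changes.append(("changed", path))
    return changes
```

As more recipes become reproducible, the ignore list can shrink, and anything left in the result points at a genuine difference between the two builds.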

We don't hide any source code in salt mines, so reproducing some very
old binary (on possibly very different host OS) is less important for
us. Similarly detecting the maliciously modified binaries is less
important for us because we control the whole pipeline from source to
the bits installed on the TVs.

Being able to see that the diff between 2 official builds doesn't contain
any unexpected changes is probably the most important aspect for us.

It also matters in the opposite direction: when QA reports a new issue
in the latest build and we need to compare it with the previous one to
find the cause, too many random changes that exist just because recipes
were rebuilt make it difficult to spot the significant difference which
caused the new issue.

> Thanks for your time. I know this was a long e-mail.

Thanks for working on this, I believe this issue is really important and
I really like your changes. Once we get closer to master I hope I'll be
able to contribute some fixes back.

Cheers,
-- 
Martin 'JaMa' Jansa     jabber: Martin.Jansa@gmail.com


