* [PATCH v2 0/6] Reproducible binaries
@ 2017-05-01 20:58 Juro Bystricky
  2017-05-01 20:58 ` [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES Juro Bystricky
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Juro Bystricky @ 2017-05-01 20:58 UTC (permalink / raw)
  To: openembedded-core; +Cc: jurobystricky

This patch set (v2) contains several patches aimed at achieving reproducible binaries.
Building reproducible binaries may remove certain intentional
randomness intended for increased security. Hence, it is reasonable
to expect there will be cases where this is not desirable.
The user can select his/her preferences via the variable
BUILD_REPRODUCIBLE_BINARIES. The variable defaults to "0" (do not
build reproducible binaries) in order to minimize any potential
regressions.

For Debian packages, we get many binary-identical packages simply by
exporting SOURCE_DATE_EPOCH. This is done automatically when
BUILD_REPRODUCIBLE_BINARIES="1".

For the rootfs, we get far fewer differences with modified prelinking and by
ensuring that various timestamps are reproducible.

For example, building core-image-minimal with this patchset,
using the following settings in the local.conf:

    BUILD_REPRODUCIBLE_BINARIES="1"
    LDCONFIGDEPEND=""
    IMAGE_CMD_TAR="tar -v --sort=name"

    # Optional user-specified timestamps:
    REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK="1483228800"
    REPRODUCIBLE_TIMESTAMP_ROOTFS="1483228800"

we can build binary-identical core-image-minimal-rootfs.tar.bz2 images.
(Tested on the same machine, in two different build folders, with the images
built at different times.)
Eventually, it will be possible to build identical core-image-minimal-rootfs.ext4
images as well. (Note that in this test case the rootfs is built without a
pre-built ldconfig aux-cache.)
This patch set does not address the reproducibility of the Linux kernel or
of Linux kernel modules.
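
One way to check the "binary identical" claim is to hash the images produced
in the two build folders and compare the digests. The following is a minimal
sketch, not part of this patch set; the helper names sha256_of and
images_identical are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large images do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_identical(path_a, path_b):
    """Return True if both files have the same SHA-256 digest (hypothetical helper)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling images_identical() with the paths of the two rootfs tarballs should
return True if the builds are bit-for-bit reproducible.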



Juro Bystricky (6):
  bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES
  base.bbclass: initial support for binary reproducibility
  image-prelink.bbclass: support binary reproducibility
  rootfs-postcommands.bbclass: support binary reproducibility
  busybox.inc: improve reproducibility
  image.bbclass: support binary reproducibility

 meta/classes/base.bbclass                | 82 ++++++++++++++++++++++++++++++++
 meta/classes/image-prelink.bbclass       | 12 ++++-
 meta/classes/image.bbclass               | 12 +++++
 meta/classes/rootfs-postcommands.bbclass | 24 ++++++++--
 meta/conf/bitbake.conf                   | 11 +++++
 meta/recipes-core/busybox/busybox.inc    |  3 ++
 6 files changed, 140 insertions(+), 4 deletions(-)

-- 
2.7.4




* [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES
  2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
@ 2017-05-01 20:58 ` Juro Bystricky
  2017-05-01 23:13   ` Richard Purdie
  2017-05-01 20:59 ` [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility Juro Bystricky
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Juro Bystricky @ 2017-05-01 20:58 UTC (permalink / raw)
  To: openembedded-core; +Cc: jurobystricky

Building reproducible binaries may remove certain intentional
randomness intended for increased security. Hence, it is reasonable
to expect there will be cases where this is not desirable.
The user can select his/her preferences via the variable
BUILD_REPRODUCIBLE_BINARIES. The variable defaults to "0" (do not
build reproducible binaries) in order to minimize any potential
regressions. (Once the reproducible binaries code is mature enough,
it can be set to "1".)
If the variable BUILD_REPRODUCIBLE_BINARIES is set to "1",
timestamp values taken from additional variables will be optionally used
when building binary reproducible images:

    REPRODUCIBLE_TIMESTAMP_ROOTFS
        If the value is specified, the mtime of all files will be set to this value.
        In addition, /etc/timestamp and /etc/version will both contain the value.
        If no value is specified, the timestamp will be derived from the top git commit.

    REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK
        Value passed via the environment variable PRELINK_TIMESTAMP to the prelink program.
        If the value is specified, the value will be used.
        If no value is specified, the timestamp will be derived from the top git commit.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/conf/bitbake.conf | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
index 227babd..6ce1a1a 100644
--- a/meta/conf/bitbake.conf
+++ b/meta/conf/bitbake.conf
@@ -859,3 +859,14 @@ BB_SIGNATURE_EXCLUDE_FLAGS ?= "doc deps depends \
 
 MLPREFIX ??= ""
 MULTILIB_VARIANTS ??= ""
+
+BUILD_REPRODUCIBLE_BINARIES ??= "0"
+BUILD_REPRODUCIBLE_BINARIES[export] = "1"
+
+# Unix timestamp
+REPRODUCIBLE_TIMESTAMP_ROOTFS ??= ""
+REPRODUCIBLE_TIMESTAMP_ROOTFS[export] = "1"
+
+# Unix timestamp
+REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK ??= ""
+REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK[export] = "1"
-- 
2.7.4




* [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility
  2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
  2017-05-01 20:58 ` [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES Juro Bystricky
@ 2017-05-01 20:59 ` Juro Bystricky
  2017-06-14 20:30   ` Martin Jansa
  2017-05-01 20:59 ` [PATCH v2 3/6] image-prelink.bbclass: support " Juro Bystricky
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Juro Bystricky @ 2017-05-01 20:59 UTC (permalink / raw)
  To: openembedded-core; +Cc: jurobystricky

Conditionally set some environment variables in order to achieve
improved binary reproducibility. Provided BUILD_REPRODUCIBLE_BINARIES is
set to "1", we set the following environment variables:

export PYTHONHASHSEED=0
export PERL_HASH_SEED=0
export TZ="UTC"

We also export and set SOURCE_DATE_EPOCH. The value for this variable
is obtained after the source code for a recipe has been unpacked, but before it is
patched. If the code comes from a git repo, we get the timestamp from the top
commit. (This usually corresponds to the mtime of "changelog".)
Otherwise we go through all files and get the timestamp from the most recently
modified one. We create a timestamp for each recipe. The timestamp is stored in the file
'src_date_epoch.txt'. Later on, each task reads this file and sets SOURCE_DATE_EPOCH
based on the value found in the file.
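
For context, a tool that honors SOURCE_DATE_EPOCH typically uses its value in
place of the current time whenever it embeds a build date. A minimal sketch of
that consumer side, assuming the variable holds an integer Unix timestamp;
build_timestamp is a hypothetical helper, not code from this patch:

```python
import os
import time


def build_timestamp(environ=None):
    """Return the Unix time a tool should embed as its build date.

    Uses SOURCE_DATE_EPOCH when it is set and non-empty, otherwise falls
    back to the current time (which makes the output non-reproducible).
    """
    env = os.environ if environ is None else environ
    value = env.get("SOURCE_DATE_EPOCH", "").strip()
    if value:
        return int(value)
    return int(time.time())
```

Passing the environment as a parameter is only a convenience for testing;
a real tool would simply read os.environ.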

[YOCTO#11178]
[YOCTO#11179]

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/classes/base.bbclass | 82 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
index e29821f..f2b2d97 100644
--- a/meta/classes/base.bbclass
+++ b/meta/classes/base.bbclass
@@ -10,6 +10,52 @@ inherit utility-tasks
 inherit metadata_scm
 inherit logging
 
+def get_git_src_date_epoch(d, path):
+    import subprocess
+    saved_cwd = os.getcwd()
+    os.chdir(path)
+    src_date_epoch = int(subprocess.check_output(['git','log','-1','--pretty=%ct']))
+    os.chdir(saved_cwd)
+    return src_date_epoch
+
+def create_src_date_epoch_stamp(d):
+    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
+        path = d.getVar('S')
+        src_date_epoch = 0
+        filename_dbg = None
+
+        if path.endswith('/git'):
+            src_date_epoch = get_git_src_date_epoch(d, path)
+        else:
+            exclude = set(["temp", "licenses", "patches", "recipe-sysroot-native", "recipe-sysroot" ])
+            for root, dirs, files in os.walk(path, topdown=True):
+                dirs[:] = [d for d in dirs if d not in exclude]
+                if root.endswith('/git'):
+                    src_date_epoch = get_git_src_date_epoch(d, root)
+                    break
+
+                for fname in files:
+                    filename = os.path.join(root, fname)
+                    try:
+                        mtime = int(os.path.getmtime(filename))
+                    except:
+                        mtime = 0
+                    if mtime > src_date_epoch:
+                        src_date_epoch = mtime
+                        filename_dbg = filename
+
+        # Most likely an empty folder
+        if src_date_epoch == 0:
+            bb.warn("Unable to determine src_date_epoch! path:%s" % path)
+
+        f = open(os.path.join(path,'src_date_epoch.txt'), 'w')
+        f.write(str(src_date_epoch))
+        f.close()
+
+        if filename_dbg != None:
+            bb.debug(1," src_date_epoch %d derived from: %s" % (src_date_epoch, filename_dbg))
+
+
 OE_IMPORTS += "os sys time oe.path oe.utils oe.types oe.package oe.packagegroup oe.sstatesig oe.lsb oe.cachedpath oe.license"
 OE_IMPORTS[type] = "list"
 
@@ -173,6 +219,7 @@ python base_do_unpack() {
     try:
         fetcher = bb.fetch2.Fetch(src_uri, d)
         fetcher.unpack(d.getVar('WORKDIR'))
+        create_src_date_epoch_stamp(d)
     except bb.fetch2.BBFetchException as e:
         bb.fatal(str(e))
 }
@@ -383,9 +430,43 @@ def set_packagetriplet(d):
 
     settriplet(d, "PKGMLTRIPLETS", archs, tos, tvs)
 
+
+export PYTHONHASHSEED
+export PERL_HASH_SEED
+export SOURCE_DATE_EPOCH
+
+BB_HASHBASE_WHITELIST += "SOURCE_DATE_EPOCH PYTHONHASHSEED PERL_HASH_SEED "
+
 python () {
     import string, re
 
+    # Create reproducible_environment
+
+    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
+        import subprocess
+        d.setVar('PYTHONHASHSEED', '0')
+        d.setVar('PERL_HASH_SEED', '0')
+        d.setVar('TZ', 'UTC')
+
+        path = d.getVar('S')
+        epochfile = os.path.join(path,'src_date_epoch.txt')
+        if os.path.isfile(epochfile):
+            f = open(epochfile, 'r')
+            src_date_epoch = f.read()
+            f.close()
+            bb.debug(1, "src_date_epoch stamp found ---> stamp %s" % src_date_epoch)
+            d.setVar('SOURCE_DATE_EPOCH', src_date_epoch)
+        else:
+            bb.debug(1, "src_date_epoch stamp not found.")
+            d.setVar('SOURCE_DATE_EPOCH', '0')
+    else:
+        if 'PYTHONHASHSEED' in os.environ:
+            del os.environ['PYTHONHASHSEED']
+        if 'PERL_HASH_SEED' in os.environ:
+            del os.environ['PERL_HASH_SEED']
+        if 'SOURCE_DATE_EPOCH' in os.environ:
+            del os.environ['SOURCE_DATE_EPOCH']
+
     # Handle PACKAGECONFIG
     #
     # These take the form:
@@ -678,6 +759,7 @@ python () {
             bb.warn("Recipe %s is marked as only being architecture specific but seems to have machine specific packages?! The recipe may as well mark itself as machine specific directly." % d.getVar("PN"))
 }
 
+
 addtask cleansstate after do_clean
 python do_cleansstate() {
         sstate_clean_cachefiles(d)
-- 
2.7.4




* [PATCH v2 3/6] image-prelink.bbclass: support binary reproducibility
  2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
  2017-05-01 20:58 ` [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES Juro Bystricky
  2017-05-01 20:59 ` [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility Juro Bystricky
@ 2017-05-01 20:59 ` Juro Bystricky
  2017-05-01 20:59 ` [PATCH v2 4/6] rootfs-postcommands.bbclass: " Juro Bystricky
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Juro Bystricky @ 2017-05-01 20:59 UTC (permalink / raw)
  To: openembedded-core; +Cc: jurobystricky

Conditionally support binary reproducibility in built images.
If BUILD_REPRODUCIBLE_BINARIES = 1 then:

1. Do not randomize library addresses
2. Set/export PRELINK_TIMESTAMP to a reproducible value.
   If REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK is specified, then the value will
   be used. Otherwise the timestamp will be derived from the top git commit.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/classes/image-prelink.bbclass | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/meta/classes/image-prelink.bbclass b/meta/classes/image-prelink.bbclass
index 4157df0..046660e 100644
--- a/meta/classes/image-prelink.bbclass
+++ b/meta/classes/image-prelink.bbclass
@@ -36,7 +36,17 @@ prelink_image () {
 	dynamic_loader=$(linuxloader)
 
 	# prelink!
-	${STAGING_SBINDIR_NATIVE}/prelink --root ${IMAGE_ROOTFS} -amR -N -c ${sysconfdir}/prelink.conf --dynamic-linker $dynamic_loader
+	if [ "$BUILD_REPRODUCIBLE_BINARIES" = "1" ]; then
+		bbnote " prelink: BUILD_REPRODUCIBLE_BINARIES..."
+		if [ "$REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK" = "" ]; then
+			export PRELINK_TIMESTAMP=`git log -1 --pretty=%ct `
+		else
+			export PRELINK_TIMESTAMP=$REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK
+		fi
+		${STAGING_SBINDIR_NATIVE}/prelink --root ${IMAGE_ROOTFS} -am -N -c ${sysconfdir}/prelink.conf --dynamic-linker $dynamic_loader
+	else
+		${STAGING_SBINDIR_NATIVE}/prelink --root ${IMAGE_ROOTFS} -amR -N -c ${sysconfdir}/prelink.conf --dynamic-linker $dynamic_loader
+	fi
 
 	# Remove the prelink.conf if we had to add it.
 	if [ "$dummy_prelink_conf" = "true" ]; then
-- 
2.7.4




* [PATCH v2 4/6] rootfs-postcommands.bbclass: support binary reproducibility
  2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
                   ` (2 preceding siblings ...)
  2017-05-01 20:59 ` [PATCH v2 3/6] image-prelink.bbclass: support " Juro Bystricky
@ 2017-05-01 20:59 ` Juro Bystricky
  2017-05-01 20:59 ` [PATCH v2 5/6] busybox.inc: improve reproducibility Juro Bystricky
  2017-05-01 20:59 ` [PATCH v2 6/6] image.bbclass: support binary reproducibility Juro Bystricky
  5 siblings, 0 replies; 13+ messages in thread
From: Juro Bystricky @ 2017-05-01 20:59 UTC (permalink / raw)
  To: openembedded-core; +Cc: jurobystricky

Conditionally support binary reproducibility of rootfs images.
If BUILD_REPRODUCIBLE_BINARIES = 1 then:

1. set /etc/timestamp to a reproducible value
2. set /etc/version to a reproducible value

The reproducible value is taken from the variable REPRODUCIBLE_TIMESTAMP_ROOTFS.
If the variable is not specified, the timestamp value is derived from
the top git commit.
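
The conversion from a Unix timestamp to the /etc/timestamp format can be
sketched in Python as follows. This only illustrates the YYYYmmddHHMMSS (UTC)
layout for four-digit years; format_rootfs_timestamp is a hypothetical name,
and the example value is the one used in the cover letter.

```python
from datetime import datetime, timezone


def format_rootfs_timestamp(epoch):
    """Format a Unix timestamp (int or numeric string) as YYYYmmddHHMMSS in UTC."""
    moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S")


print(format_rootfs_timestamp("1483228800"))  # 20170101000000
```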

[YOCTO#11176]

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/classes/rootfs-postcommands.bbclass | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/meta/classes/rootfs-postcommands.bbclass b/meta/classes/rootfs-postcommands.bbclass
index 498174a..072b43d 100644
--- a/meta/classes/rootfs-postcommands.bbclass
+++ b/meta/classes/rootfs-postcommands.bbclass
@@ -48,6 +48,7 @@ ROOTFS_POSTPROCESS_COMMAND_append_qemuall = "${SSH_DISABLE_DNS_LOOKUP}"
 SORT_PASSWD_POSTPROCESS_COMMAND ??= " sort_passwd; "
 python () {
     d.appendVar('ROOTFS_POSTPROCESS_COMMAND', '${SORT_PASSWD_POSTPROCESS_COMMAND}')
+    d.appendVar('ROOTFS_POSTPROCESS_COMMAND', 'rootfs_reproducible;')
 }
 
 systemd_create_users () {
@@ -243,10 +244,12 @@ python write_image_manifest () {
         os.symlink(os.path.basename(manifest_name), manifest_link)
 }
 
-# Can be use to create /etc/timestamp during image construction to give a reasonably
+# Can be used to create /etc/timestamp during image construction to give a reasonably
 # sane default time setting
 rootfs_update_timestamp () {
-	date -u +%4Y%2m%2d%2H%2M%2S >${IMAGE_ROOTFS}/etc/timestamp
+	if [ "$BUILD_REPRODUCIBLE_BINARIES" = "0" ]; then
+		date -u +%4Y%2m%2d%2H%2M%2S >${IMAGE_ROOTFS}/etc/timestamp
+	fi
 }
 
 # Prevent X from being started
@@ -286,7 +289,6 @@ rootfs_sysroot_relativelinks () {
 	sysroot-relativelinks.py ${SDK_OUTPUT}/${SDKTARGETSYSROOT}
 }
 
-
 # Generated test data json file
 python write_image_test_data() {
     from oe.data import export2json
@@ -302,3 +304,19 @@ python write_image_test_data() {
        os.remove(testdata_link)
     os.symlink(os.path.basename(testdata), testdata_link)
 }
+
+# Perform any additional adjustments needed to make rootfs binary reproducible
+rootfs_reproducible () {
+	if [ "$BUILD_REPRODUCIBLE_BINARIES" = "1" ]; then
+		if [ "$REPRODUCIBLE_TIMESTAMP_ROOTFS" = "" ]; then
+			REPRODUCIBLE_TIMESTAMP_ROOTFS=`git log -1 --pretty=%ct`
+		fi
+
+		# Convert UTC into %4Y%2m%2d%2H%2M%2S
+		sformatted=`date -u -d @$REPRODUCIBLE_TIMESTAMP_ROOTFS +%4Y%2m%2d%2H%2M%2S`
+		echo $sformatted > ${IMAGE_ROOTFS}/etc/version
+		bbnote "rootfs_reproducible: set /etc/version to $sformatted"
+		echo $sformatted > ${IMAGE_ROOTFS}/etc/timestamp
+		bbnote "rootfs_reproducible: set /etc/timestamp to $sformatted"
+	fi
+}
-- 
2.7.4




* [PATCH v2 5/6] busybox.inc: improve reproducibility
  2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
                   ` (3 preceding siblings ...)
  2017-05-01 20:59 ` [PATCH v2 4/6] rootfs-postcommands.bbclass: " Juro Bystricky
@ 2017-05-01 20:59 ` Juro Bystricky
  2017-05-02  0:31   ` Andre McCurdy
  2017-05-01 20:59 ` [PATCH v2 6/6] image.bbclass: support binary reproducibility Juro Bystricky
  5 siblings, 1 reply; 13+ messages in thread
From: Juro Bystricky @ 2017-05-01 20:59 UTC (permalink / raw)
  To: openembedded-core; +Cc: jurobystricky

For reproducible builds do not generate build timestamp as part of
the version string.

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/recipes-core/busybox/busybox.inc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/meta/recipes-core/busybox/busybox.inc b/meta/recipes-core/busybox/busybox.inc
index 375632d..3e991f9 100644
--- a/meta/recipes-core/busybox/busybox.inc
+++ b/meta/recipes-core/busybox/busybox.inc
@@ -138,6 +138,9 @@ do_configure () {
 
 do_compile() {
 	unset CFLAGS CPPFLAGS CXXFLAGS LDFLAGS
+	if [ "$BUILD_REPRODUCIBLE_BINARIES" = "1" ]; then
+		export KCONFIG_NOTIMESTAMP=1
+	fi
 	if [ "${BUSYBOX_SPLIT_SUID}" = "1" -a x`grep "CONFIG_FEATURE_INDIVIDUAL=y" .config` = x ]; then
 	# split the .config into two parts, and make two busybox binaries
 		if [ -e .config.orig ]; then
-- 
2.7.4




* [PATCH v2 6/6] image.bbclass: support binary reproducibility
  2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
                   ` (4 preceding siblings ...)
  2017-05-01 20:59 ` [PATCH v2 5/6] busybox.inc: improve reproducibility Juro Bystricky
@ 2017-05-01 20:59 ` Juro Bystricky
  5 siblings, 0 replies; 13+ messages in thread
From: Juro Bystricky @ 2017-05-01 20:59 UTC (permalink / raw)
  To: openembedded-core; +Cc: jurobystricky

Added a new task "reproducible_final_image_task".
If binary reproducibility is desired ("$BUILD_REPRODUCIBLE_BINARIES" = "1"),
then recursively modify the mtimes of all files to a reproducible value.
The value is obtained via REPRODUCIBLE_TIMESTAMP_ROOTFS.
This task is executed as the very last step in image creation, once all
the files in the image have been finalized.
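
The mtime normalization is roughly equivalent to the following Python sketch,
which walks a tree and sets the timestamps of every entry without following
symlinks (comparable to "touch -h"). It assumes a platform where os.utime
supports follow_symlinks=False (Linux does); normalize_mtimes is a
hypothetical helper, not code from this patch.

```python
import os


def normalize_mtimes(root, epoch):
    """Set atime and mtime of `root` and everything below it to `epoch`."""
    epoch = int(epoch)
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not descend into symlinked directories by default,
        # but it still lists them, so the links themselves get touched too.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            os.utime(path, (epoch, epoch), follow_symlinks=False)
    os.utime(root, (epoch, epoch), follow_symlinks=False)
```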

[YOCTO#11176]

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/classes/image.bbclass | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
index 405fd73..c311de5 100644
--- a/meta/classes/image.bbclass
+++ b/meta/classes/image.bbclass
@@ -617,3 +617,15 @@ do_bundle_initramfs () {
 	:
 }
 addtask bundle_initramfs after do_image_complete
+
+reproducible_final_image_task () {
+    if [ "$BUILD_REPRODUCIBLE_BINARIES" = "1" ]; then
+        if [ "$REPRODUCIBLE_TIMESTAMP_ROOTFS" = "" ]; then
+            REPRODUCIBLE_TIMESTAMP_ROOTFS=`git log -1 --pretty=%ct`
+        fi
+        # Set mtime of all files to a reproducible value
+        bbnote "reproducible_final_image_task: mtime set to $REPRODUCIBLE_TIMESTAMP_ROOTFS"
+        find  ${IMAGE_ROOTFS} -exec touch -h  --date=@$REPRODUCIBLE_TIMESTAMP_ROOTFS {} \;
+    fi
+}
+IMAGE_PREPROCESS_COMMAND_append = " reproducible_final_image_task; "
-- 
2.7.4




* Re: [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES
  2017-05-01 20:58 ` [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES Juro Bystricky
@ 2017-05-01 23:13   ` Richard Purdie
  2017-05-02  0:35     ` Bystricky, Juro
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Purdie @ 2017-05-01 23:13 UTC (permalink / raw)
  To: Juro Bystricky, openembedded-core; +Cc: jurobystricky

On Mon, 2017-05-01 at 13:58 -0700, Juro Bystricky wrote:
> Building reproducible binaries may remove certain intentional
> randomness intended for increased security. Hence, it is reasonable
> to expect there will be cases where this is not desirable.
> The user can select his/her preferences via the variable
> BUILD_REPRODUCIBLE_BINARIES. The variable defaults to "0" (do not
> build reproducible binaries) in order to minimize any potential
> regressions. (Once the reproducible binaries code is mature enough,
> it can be set to "1".)
> If the variable BUILD_REPRODUCIBLE_BINARIES is set to "1",
> timestamp values taken from additional variables will be optionally
> used
> when building binary reproducible images:
> 
>     REPRODUCIBLE_TIMESTAMP_ROOTFS
>         If the value is specified, all files mtime will be set to
> this value.
>         In addition, /etc/timestamp and /etc/version will both
> contain the value.
>         If no value is specified, timestamp will be derived from the
> top git commit.
> 
>     REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK
>         Value passed via environment variable PRELINK_TIMESTAMP to
> the prelink program.
>         If the value is specified, the value will be used.
>         If no value is specified, timestamp will be derived from the
> top git commit.
> 
> Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
> ---
>  meta/conf/bitbake.conf | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
> index 227babd..6ce1a1a 100644
> --- a/meta/conf/bitbake.conf
> +++ b/meta/conf/bitbake.conf
> @@ -859,3 +859,14 @@ BB_SIGNATURE_EXCLUDE_FLAGS ?= "doc deps depends
> \
>  
>  MLPREFIX ??= ""
>  MULTILIB_VARIANTS ??= ""
> +
> +BUILD_REPRODUCIBLE_BINARIES ??= "0"
> +BUILD_REPRODUCIBLE_BINARIES[export] = "1"
> +
> +# Unix timestamp
> +REPRODUCIBLE_TIMESTAMP_ROOTFS ??= ""
> +REPRODUCIBLE_TIMESTAMP_ROOTFS[export] = "1"
> +
> +# Unix timestamp
> +REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK ??= ""
> +REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK[export] = "1"

Please don't add new global exports in bitbake.conf. Changing the value
of this will cause everything to rebuild (e.g. recompile) since the
exported environment goes to all tasks. We really don't want to do that
if it only affects the image generation.

I'll give this a bit more thought/review but wanted to comment on this
whilst I see it/remember.

Cheers,

Richard




* Re: [PATCH v2 5/6] busybox.inc: improve reproducibility
  2017-05-01 20:59 ` [PATCH v2 5/6] busybox.inc: improve reproducibility Juro Bystricky
@ 2017-05-02  0:31   ` Andre McCurdy
  0 siblings, 0 replies; 13+ messages in thread
From: Andre McCurdy @ 2017-05-02  0:31 UTC (permalink / raw)
  To: Juro Bystricky; +Cc: Juro Bystricky, OE Core mailing list

On Mon, May 1, 2017 at 1:59 PM, Juro Bystricky <juro.bystricky@intel.com> wrote:
> For reproducible builds do not generate build timestamp as part of
> the version string.

Maybe just do that by default?

> Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
> ---
>  meta/recipes-core/busybox/busybox.inc | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/meta/recipes-core/busybox/busybox.inc b/meta/recipes-core/busybox/busybox.inc
> index 375632d..3e991f9 100644
> --- a/meta/recipes-core/busybox/busybox.inc
> +++ b/meta/recipes-core/busybox/busybox.inc
> @@ -138,6 +138,9 @@ do_configure () {
>
>  do_compile() {
>         unset CFLAGS CPPFLAGS CXXFLAGS LDFLAGS
> +       if [ "$BUILD_REPRODUCIBLE_BINARIES" = "1" ]; then
> +               export KCONFIG_NOTIMESTAMP=1
> +       fi
>         if [ "${BUSYBOX_SPLIT_SUID}" = "1" -a x`grep "CONFIG_FEATURE_INDIVIDUAL=y" .config` = x ]; then
>         # split the .config into two parts, and make two busybox binaries
>                 if [ -e .config.orig ]; then
> --
> 2.7.4



* Re: [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES
  2017-05-01 23:13   ` Richard Purdie
@ 2017-05-02  0:35     ` Bystricky, Juro
  2017-05-02  5:55       ` Martin Jansa
  0 siblings, 1 reply; 13+ messages in thread
From: Bystricky, Juro @ 2017-05-02  0:35 UTC (permalink / raw)
  To: Richard Purdie, openembedded-core; +Cc: jurobystricky

I see your point. The original idea was to keep all related variables in one place. There is
one variable (BUILD_REPRODUCIBLE_BINARIES) that I think should be global,
as it should be visible to all tasks (well, a lot of tasks). The rest can be moved to more appropriate places.






* Re: [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES
  2017-05-02  0:35     ` Bystricky, Juro
@ 2017-05-02  5:55       ` Martin Jansa
  0 siblings, 0 replies; 13+ messages in thread
From: Martin Jansa @ 2017-05-02  5:55 UTC (permalink / raw)
  To: Bystricky, Juro; +Cc: openembedded-core, jurobystricky

I think you can define them in bitbake.conf, but then export them only
where needed.

On Tue, May 2, 2017 at 2:35 AM, Bystricky, Juro <juro.bystricky@intel.com>
wrote:

> I see your point. The original idea was to keep all related variables in
> one place. There is
> one variable ( BUILD_REPRODUCIBLE_BINARIES ) that I think should be global,
> as it should be visible by all tasks (well, a lot of tasks). The rest can
> be moved to more appropriate places.
>
>
> ________________________________________
> From: Richard Purdie [richard.purdie@linuxfoundation.org]
> Sent: Monday, May 01, 2017 4:13 PM
> To: Bystricky, Juro; openembedded-core@lists.openembedded.org
> Cc: joshua.g.lock@linux.intel.com; Burton, Ross; martin.jansa@gmail.com;
> raj.khem@gmail.com; jurobystricky@hotmail.com
> Subject: Re: [PATCH v2 1/6] bitbake.conf: new variable
> BUILD_REPRODUCIBLE_BINARIES
>
> On Mon, 2017-05-01 at 13:58 -0700, Juro Bystricky wrote:
> > Building reproducible binaries may remove certain intentional
> > randomness intended for increased security. Hence, it is reasonable
> > to expect there will be cases where this is not desirable.
> > The user can select his/her preferences via the variable
> > BUILD_REPRODUCIBLE_BINARIES. The variable defaults to "0" (do not
> > build reproducible binaries) in order to minimize any potential
> > regressions. (Once the reproducible binaries code is mature enough,
> > it can be set to "1".)
> > If the variable BUILD_REPRODUCIBLE_BINARIES is set to "1",
> > timestamp values taken from additional variables will be optionally
> > used
> > when building binary reproducible images:
> >
> >     REPRODUCIBLE_TIMESTAMP_ROOTFS
> >         If the value is specified, all files mtime will be set to
> > this value.
> >         In addition, /etc/timestamp and /etc/version will both
> > contain the value.
> >         If no value is specified, timestamp will be derived from the
> > top git commit.
> >
> >     REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK
> >         Value passed via environment variable PRELINK_TIMESTAMP to
> > the prelink program.
> >         If the value is specified, the value will be used.
> >         If no value is specified, timestamp will be derived from the
> > top git commit.
> >
> > Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
> > ---
> >  meta/conf/bitbake.conf | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
> > index 227babd..6ce1a1a 100644
> > --- a/meta/conf/bitbake.conf
> > +++ b/meta/conf/bitbake.conf
> > @@ -859,3 +859,14 @@ BB_SIGNATURE_EXCLUDE_FLAGS ?= "doc deps depends
> > \
> >
> >  MLPREFIX ??= ""
> >  MULTILIB_VARIANTS ??= ""
> > +
> > +BUILD_REPRODUCIBLE_BINARIES ??= "0"
> > +BUILD_REPRODUCIBLE_BINARIES[export] = "1"
> > +
> > +# Unix timestamp
> > +REPRODUCIBLE_TIMESTAMP_ROOTFS ??= ""
> > +REPRODUCIBLE_TIMESTAMP_ROOTFS[export] = "1"
> > +
> > +# Unix timestamp
> > +REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK ??= ""
> > +REPRODUCIBLE_TIMESTAMP_IMAGE_PRELINK[export] = "1"
>
> Please don't add new global exports in bitbake.conf. Changing the value
> of this will cause everything to rebuild (e.g. recompile) since the
> exported environment goes to all tasks. We really don't want to do that
> if it only affects the image generation.
>
> I'll give this a bit more thought/review but wanted to comment on this
> whilst I see it/remember.
>
> Cheers,
>
> Richard
>
>
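
The rebuild concern raised above can be illustrated with a toy model. The sketch below is not BitBake's actual signature algorithm; the functions `task_signature` and `signatures` and the variable name `TS` are hypothetical, introduced only for illustration. It assumes a task's signature is a hash over its body plus every variable visible to it, and shows that changing a globally exported variable changes every task's hash, while a variable scoped to one task changes only that task's hash.

```python
import hashlib

def task_signature(task_body, env):
    """Hash a task body together with the variables visible to it (toy model)."""
    h = hashlib.sha256(task_body.encode())
    for name in sorted(env):
        h.update(("%s=%s" % (name, env[name])).encode())
    return h.hexdigest()

def signatures(tasks, global_exports, per_task_vars):
    """Compute one signature per task.

    global_exports are visible to every task; per_task_vars maps a task
    name to variables visible to that task only.
    """
    result = {}
    for name, body in tasks.items():
        env = dict(global_exports)
        env.update(per_task_vars.get(name, {}))
        result[name] = task_signature(body, env)
    return result

# Hypothetical task bodies, for illustration only.
tasks = {"do_compile": "make", "do_rootfs": "create image"}

# Global export: changing the value alters the signature of every task.
before = signatures(tasks, {"TS": ""}, {})
after = signatures(tasks, {"TS": "1483228800"}, {})
changed_globally = [t for t in tasks if before[t] != after[t]]

# Scoped to do_rootfs: only the image task's signature changes.
before = signatures(tasks, {}, {"do_rootfs": {"TS": ""}})
after = signatures(tasks, {}, {"do_rootfs": {"TS": "1483228800"}})
changed_scoped = [t for t in tasks if before[t] != after[t]]
```

In this model `changed_globally` lists both tasks and `changed_scoped` lists only `do_rootfs`, which is the distinction the review comment draws between a global export and a setting that only affects image generation.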


* Re: [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility
  2017-05-01 20:59 ` [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility Juro Bystricky
@ 2017-06-14 20:30   ` Martin Jansa
  2017-06-14 20:50     ` Bystricky, Juro
  0 siblings, 1 reply; 13+ messages in thread
From: Martin Jansa @ 2017-06-14 20:30 UTC (permalink / raw)
  To: Juro Bystricky
  Cc: Patches and discussions about the oe-core layer, Juro Bystricky

For some recipes (llvm, chromium, chromium-wayland) I've noticed this
function to fail with:

ERROR: llvm3.3-3.3-r0 do_unpack: Error executing a python function in exec_python_func() autogenerated:

The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_python_func() autogenerated', lineno: 2, function: <module>
     0001:
 *** 0002:base_do_unpack(d)
     0003:
File: '/home/jenkins/oe/world/shr-core/openembedded-core/meta/classes/base.bbclass', lineno: 215, function: base_do_unpack
     0211:
     0212:    try:
     0213:        fetcher = bb.fetch2.Fetch(src_uri, d)
     0214:        fetcher.unpack(d.getVar('WORKDIR'))
 *** 0215:        create_src_date_epoch_stamp(d)
     0216:    except bb.fetch2.BBFetchException as e:
     0217:        bb.fatal(str(e))
     0218:}
     0219:
File: '/home/jenkins/oe/world/shr-core/openembedded-core/meta/classes/base.bbclass', lineno: 34, function: create_src_date_epoch_stamp
     0030:            exclude = set(["temp", "licenses", "patches", "recipe-sysroot-native", "recipe-sysroot" ])
     0031:            for root, dirs, files in os.walk(path, topdown=True):
     0032:                dirs[:] = [d for d in dirs if d not in exclude]
     0033:                if root.endswith('/git'):
 *** 0034:                    src_date_epoch = get_git_src_date_epoch(d, root)
     0035:                    break
     0036:
     0037:                for fname in files:
     0038:                    filename = os.path.join(root, fname)
File: '/home/jenkins/oe/world/shr-core/openembedded-core/meta/classes/base.bbclass', lineno: 17, function: get_git_src_date_epoch
     0013:def get_git_src_date_epoch(d, path):
     0014:    import subprocess
     0015:    saved_cwd = os.getcwd()
     0016:    os.chdir(path)
 *** 0017:    src_date_epoch = int(subprocess.check_output(['git','log','-1','--pretty=%ct']))
     0018:    os.chdir(saved_cwd)
     0019:    return src_date_epoch
     0020:
     0021:def create_src_date_epoch_stamp(d):
File: '/usr/lib/python3.5/subprocess.py', lineno: 626, function: check_output
     0622:        # empty string. That is maintained here for backwards compatibility.
     0623:        kwargs['input'] = '' if kwargs.get('universal_newlines', False) else b''
     0624:
     0625:    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
 *** 0626:               **kwargs).stdout
     0627:
     0628:
     0629:class CompletedProcess(object):
     0630:    """A process that has finished running.
File: '/usr/lib/python3.5/subprocess.py', lineno: 708, function: run
     0704:            raise
     0705:        retcode = process.poll()
     0706:        if check and retcode:
     0707:            raise CalledProcessError(retcode, process.args,
 *** 0708:                                     output=stdout, stderr=stderr)
     0709:    return CompletedProcess(process.args, retcode, stdout, stderr)
     0710:
     0711:
     0712:def list2cmdline(seq):
Exception: subprocess.CalledProcessError: Command '['git', 'log', '-1', '--pretty=%ct']' returned non-zero exit status 128

ERROR: llvm3.3-3.3-r0 do_unpack: Function failed: base_do_unpack
ERROR: Logfile of failure stored in: /home/jenkins/oe/world/shr-core/tmp-glibc/work/i586-oe-linux/llvm3.3/3.3-r0/temp/log.do_unpack.25005
NOTE: recipe llvm3.3-3.3-r0: task do_unpack: Failed
ERROR: Task (/home/jenkins/oe/world/shr-core/meta-openembedded/meta-oe/recipes-core/llvm/llvm3.3_3.3.bb:do_unpack) failed with exit code '1'


Maybe I don't have the latest version of this patch (I'm not using
your poky-contrib branch yet), but it should fail with a nicer message
when git log fails for whatever reason.
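
One way the failure above could be turned into a readable message is sketched below. This is a minimal illustration, not the code that was eventually merged: the function name `git_commit_time` and its `cmd` parameter are hypothetical, and the caller is assumed to decide whether to warn or abort. It uses `cwd=` instead of `os.chdir()` so that the working directory is never left changed when the command fails.

```python
import subprocess

def git_commit_time(path, cmd=("git", "log", "-1", "--pretty=%ct")):
    """Return (epoch, error); exactly one of the two is None.

    Hypothetical helper: runs the command inside `path` and converts
    every failure mode into a short message instead of a stack trace.
    """
    printable = " ".join(cmd)
    try:
        out = subprocess.check_output(list(cmd), cwd=path,
                                      stderr=subprocess.PIPE)
        return int(out.decode().strip()), None
    except OSError as e:
        # The binary is missing from PATH, or `path` cannot be entered.
        return None, "could not run '%s' in %s: %s" % (cmd[0], path, e)
    except subprocess.CalledProcessError as e:
        # For example exit status 128 when `path` is not a git repository.
        detail = e.stderr.decode(errors="replace").strip()
        return None, "'%s' failed in %s (exit %d): %s" % (
            printable, path, e.returncode, detail)
    except ValueError:
        # The command succeeded but did not print an integer timestamp.
        return None, "unexpected output from '%s' in %s" % (printable, path)
```

A caller could then report the returned message (for instance through `bb.warn`) and continue with a fallback value rather than failing the whole do_unpack task.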


On Mon, May 1, 2017 at 10:59 PM, Juro Bystricky <juro.bystricky@intel.com>
wrote:

> Conditionally set some environment variables in order to achieve
> improved binary reproducibility. Providing BUILD_REPRODUCIBLE_BINARIES is
> set to "1", we set the following environment variables:
>
> export PYTHONHASHSEED=0
> export PERL_HASH_SEED=0
> export TZ="UTC"
>
> We also export and set SOURCE_DATE_EPOCH. The value for this variable
> is obtained after source code for a recipe has been unpacked, but before it is
> patched. If the code comes from a GIT repo, we get the timestamp from the top
> commit. (This usually corresponds to the mktime of "changelog".)
> Otherwise we go through all files and get the timestamp from the youngest
> one. We create a timestamp for each recipe. The timestamp is stored in the file
> 'src_date_epoch.txt'. Later on, each task reads this file and sets SOURCE_DATE_EPOCH
> based on the value found in the file.
>
> [YOCTO#11178]
> [YOCTO#11179]
>
> Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
> ---
>  meta/classes/base.bbclass | 82 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 82 insertions(+)
>
> diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
> index e29821f..f2b2d97 100644
> --- a/meta/classes/base.bbclass
> +++ b/meta/classes/base.bbclass
> @@ -10,6 +10,52 @@ inherit utility-tasks
>  inherit metadata_scm
>  inherit logging
>
> +def get_git_src_date_epoch(d, path):
> +    import subprocess
> +    saved_cwd = os.getcwd()
> +    os.chdir(path)
> +    src_date_epoch = int(subprocess.check_output(['git','log','-1','--pretty=%ct']))
> +    os.chdir(saved_cwd)
> +    return src_date_epoch
> +
> +def create_src_date_epoch_stamp(d):
> +    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
> +        path = d.getVar('S')
> +        src_date_epoch = 0
> +        filename_dbg = None
> +
> +        if path.endswith('/git'):
> +            src_date_epoch = get_git_src_date_epoch(d, path)
> +        else:
> +            exclude = set(["temp", "licenses", "patches", "recipe-sysroot-native", "recipe-sysroot" ])
> +            for root, dirs, files in os.walk(path, topdown=True):
> +                dirs[:] = [d for d in dirs if d not in exclude]
> +                if root.endswith('/git'):
> +                    src_date_epoch = get_git_src_date_epoch(d, root)
> +                    break
> +
> +                for fname in files:
> +                    filename = os.path.join(root, fname)
> +                    try:
> +                        mtime = int(os.path.getmtime(filename))
> +                    except:
> +                        mtime = 0
> +                    if mtime > src_date_epoch:
> +                        src_date_epoch = mtime
> +                        filename_dbg = filename
> +
> +        # Most likely an empty folder
> +        if src_date_epoch == 0:
> +            bb.warn("Unable to determine src_date_epoch! path:%s" % path)
> +
> +        f = open(os.path.join(path,'src_date_epoch.txt'), 'w')
> +        f.write(str(src_date_epoch))
> +        f.close()
> +
> +        if filename_dbg != None:
> +            bb.debug(1," src_date_epoch %d derived from: %s" % (src_date_epoch, filename_dbg))
> +
> +
>  OE_IMPORTS += "os sys time oe.path oe.utils oe.types oe.package oe.packagegroup oe.sstatesig oe.lsb oe.cachedpath oe.license"
>  OE_IMPORTS[type] = "list"
>
> @@ -173,6 +219,7 @@ python base_do_unpack() {
>      try:
>          fetcher = bb.fetch2.Fetch(src_uri, d)
>          fetcher.unpack(d.getVar('WORKDIR'))
> +        create_src_date_epoch_stamp(d)
>      except bb.fetch2.BBFetchException as e:
>          bb.fatal(str(e))
>  }
> @@ -383,9 +430,43 @@ def set_packagetriplet(d):
>
>      settriplet(d, "PKGMLTRIPLETS", archs, tos, tvs)
>
> +
> +export PYTHONHASHSEED
> +export PERL_HASH_SEED
> +export SOURCE_DATE_EPOCH
> +
> +BB_HASHBASE_WHITELIST += "SOURCE_DATE_EPOCH PYTHONHASHSEED PERL_HASH_SEED "
> +
>  python () {
>      import string, re
>
> +    # Create reproducible_environment
> +
> +    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
> +        import subprocess
> +        d.setVar('PYTHONHASHSEED', '0')
> +        d.setVar('PERL_HASH_SEED', '0')
> +        d.setVar('TZ', 'UTC')
> +
> +        path = d.getVar('S')
> +        epochfile = os.path.join(path,'src_date_epoch.txt')
> +        if os.path.isfile(epochfile):
> +            f = open(epochfile, 'r')
> +            src_date_epoch = f.read()
> +            f.close()
> +            bb.debug(1, "src_date_epoch stamp found ---> stamp %s" % src_date_epoch)
> +            d.setVar('SOURCE_DATE_EPOCH', src_date_epoch)
> +        else:
> +            bb.debug(1, "src_date_epoch stamp not found.")
> +            d.setVar('SOURCE_DATE_EPOCH', '0')
> +    else:
> +        if 'PYTHONHASHSEED' in os.environ:
> +            del os.environ['PYTHONHASHSEED']
> +        if 'PERL_HASH_SEED' in os.environ:
> +            del os.environ['PERL_HASH_SEED']
> +        if 'SOURCE_DATE_EPOCH' in os.environ:
> +            del os.environ['SOURCE_DATE_EPOCH']
> +
>      # Handle PACKAGECONFIG
>      #
>      # These take the form:
> @@ -678,6 +759,7 @@ python () {
>              bb.warn("Recipe %s is marked as only being architecture specific but seems to have machine specific packages?! The recipe may as well mark itself as machine specific directly." % d.getVar("PN"))
>  }
>
> +
>  addtask cleansstate after do_clean
>  python do_cleansstate() {
>          sstate_clean_cachefiles(d)
> --
> 2.7.4
>
>


* Re: [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility
  2017-06-14 20:30   ` Martin Jansa
@ 2017-06-14 20:50     ` Bystricky, Juro
  0 siblings, 0 replies; 13+ messages in thread
From: Bystricky, Juro @ 2017-06-14 20:50 UTC (permalink / raw)
  To: Martin Jansa
  Cc: Patches and discussions about the oe-core layer, Juro Bystricky

Thanks. It seems the error checking here is inadequate. The code assumes that a folder named "git"
always contains a git repo (and that the "git" binary is in PATH), and that assumption is evidently wrong.
I'll add some error checking and an improved error message. The problem with git repos is that checkout
modifies the files' modification timestamps, so it is not possible to find the file that is truly the youngest.
The code therefore tries to get the timestamp of the top commit instead.
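
The check described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the follow-up patch: `newest_mtime` and `pick_source_date_epoch` are hypothetical names, and `git_time` stands for any callable that returns the top commit time as an integer or None on failure. The directory is treated as a git checkout only if it actually contains a `.git` entry, rather than merely being named "git"; otherwise, or when git fails, the youngest file mtime is used. As noted above, mtimes in a git checkout reflect checkout time, so that fallback is only a last resort there.

```python
import os

def newest_mtime(path, exclude=(".git", "temp", "licenses", "patches")):
    """Return the largest integer mtime of any file under path, or 0 if none."""
    newest = 0
    for root, dirs, files in os.walk(path, topdown=True):
        # Prune excluded directories in place so os.walk skips them.
        dirs[:] = [name for name in dirs if name not in exclude]
        for fname in files:
            try:
                mtime = int(os.path.getmtime(os.path.join(root, fname)))
            except OSError:
                continue  # e.g. a dangling symlink
            newest = max(newest, mtime)
    return newest

def pick_source_date_epoch(path, git_time):
    """Prefer the top commit time for real git checkouts, else file mtimes.

    git_time is a hypothetical callable: git_time(path) -> int or None.
    """
    # .git may be a directory or, for worktrees and submodules, a file.
    if os.path.exists(os.path.join(path, ".git")):
        epoch = git_time(path)
        if epoch is not None:
            return epoch
    return newest_mtime(path)
```

With this shape, a source directory that happens to end in "/git" but holds an unpacked tarball never invokes git at all, which is the case that produced exit status 128 in the report.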

________________________________
From: Martin Jansa [martin.jansa@gmail.com]
Sent: Wednesday, June 14, 2017 1:30 PM
To: Bystricky, Juro
Cc: Patches and discussions about the oe-core layer; Richard Purdie; Joshua G Lock; Burton, Ross; Khem Raj; Juro Bystricky
Subject: Re: [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility

For some recipes (llvm, chromium, chromium-wayland) I've noticed this function to fail with:


ERROR: llvm3.3-3.3-r0 do_unpack: Error executing a python function in exec_python_func() autogenerated:

The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_python_func() autogenerated', lineno: 2, function: <module>
     0001:
 *** 0002:base_do_unpack(d)
     0003:
File: '/home/jenkins/oe/world/shr-core/openembedded-core/meta/classes/base.bbclass', lineno: 215, function: base_do_unpack
     0211:
     0212:    try:
     0213:        fetcher = bb.fetch2.Fetch(src_uri, d)
     0214:        fetcher.unpack(d.getVar('WORKDIR'))
 *** 0215:        create_src_date_epoch_stamp(d)
     0216:    except bb.fetch2.BBFetchException as e:
     0217:        bb.fatal(str(e))
     0218:}
     0219:
File: '/home/jenkins/oe/world/shr-core/openembedded-core/meta/classes/base.bbclass', lineno: 34, function: create_src_date_epoch_stamp
     0030:            exclude = set(["temp", "licenses", "patches", "recipe-sysroot-native", "recipe-sysroot" ])
     0031:            for root, dirs, files in os.walk(path, topdown=True):
     0032:                dirs[:] = [d for d in dirs if d not in exclude]
     0033:                if root.endswith('/git'):
 *** 0034:                    src_date_epoch = get_git_src_date_epoch(d, root)
     0035:                    break
     0036:
     0037:                for fname in files:
     0038:                    filename = os.path.join(root, fname)
File: '/home/jenkins/oe/world/shr-core/openembedded-core/meta/classes/base.bbclass', lineno: 17, function: get_git_src_date_epoch
     0013:def get_git_src_date_epoch(d, path):
     0014:    import subprocess
     0015:    saved_cwd = os.getcwd()
     0016:    os.chdir(path)
 *** 0017:    src_date_epoch = int(subprocess.check_output(['git','log','-1','--pretty=%ct']))
     0018:    os.chdir(saved_cwd)
     0019:    return src_date_epoch
     0020:
     0021:def create_src_date_epoch_stamp(d):
File: '/usr/lib/python3.5/subprocess.py', lineno: 626, function: check_output
     0622:        # empty string. That is maintained here for backwards compatibility.
     0623:        kwargs['input'] = '' if kwargs.get('universal_newlines', False) else b''
     0624:
     0625:    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
 *** 0626:               **kwargs).stdout
     0627:
     0628:
     0629:class CompletedProcess(object):
     0630:    """A process that has finished running.
File: '/usr/lib/python3.5/subprocess.py', lineno: 708, function: run
     0704:            raise
     0705:        retcode = process.poll()
     0706:        if check and retcode:
     0707:            raise CalledProcessError(retcode, process.args,
 *** 0708:                                     output=stdout, stderr=stderr)
     0709:    return CompletedProcess(process.args, retcode, stdout, stderr)
     0710:
     0711:
     0712:def list2cmdline(seq):
Exception: subprocess.CalledProcessError: Command '['git', 'log', '-1', '--pretty=%ct']' returned non-zero exit status 128

ERROR: llvm3.3-3.3-r0 do_unpack: Function failed: base_do_unpack
ERROR: Logfile of failure stored in: /home/jenkins/oe/world/shr-core/tmp-glibc/work/i586-oe-linux/llvm3.3/3.3-r0/temp/log.do_unpack.25005
NOTE: recipe llvm3.3-3.3-r0: task do_unpack: Failed
ERROR: Task (/home/jenkins/oe/world/shr-core/meta-openembedded/meta-oe/recipes-core/llvm/llvm3.3_3.3.bb:do_unpack) failed with exit code '1'


Maybe I don't have the latest version of this patch (I'm not using your poky-contrib branch yet), but it should fail with nicer message when git log fails for whatever reason.

On Mon, May 1, 2017 at 10:59 PM, Juro Bystricky <juro.bystricky@intel.com> wrote:
Conditionally set some environment variables in order to achieve
improved binary reproducibility. Providing BUILD_REPRODUCIBLE_BINARIES is
set to "1", we set the following environment variables:

export PYTHONHASHSEED=0
export PERL_HASH_SEED=0
export TZ="UTC"

We also export and set SOURCE_DATE_EPOCH. The value for this variable
is obtained after source code for a recipe has been unpacked, but before it is
patched. If the code comes from a GIT repo, we get the timestamp from the top
commit. (This usually corresponds to the mktime of "changelog".)
Otherwise we go through all files and get the timestamp from the youngest
one. We create a timestamp for each recipe. The timestamp is stored in the file
'src_date_epoch.txt'. Later on, each task reads this file and sets SOURCE_DATE_EPOCH
based on the value found in the file.

[YOCTO#11178]
[YOCTO#11179]

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/classes/base.bbclass | 82 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
index e29821f..f2b2d97 100644
--- a/meta/classes/base.bbclass
+++ b/meta/classes/base.bbclass
@@ -10,6 +10,52 @@ inherit utility-tasks
 inherit metadata_scm
 inherit logging

+def get_git_src_date_epoch(d, path):
+    import subprocess
+    saved_cwd = os.getcwd()
+    os.chdir(path)
+    src_date_epoch = int(subprocess.check_output(['git','log','-1','--pretty=%ct']))
+    os.chdir(saved_cwd)
+    return src_date_epoch
+
+def create_src_date_epoch_stamp(d):
+    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
+        path = d.getVar('S')
+        src_date_epoch = 0
+        filename_dbg = None
+
+        if path.endswith('/git'):
+            src_date_epoch = get_git_src_date_epoch(d, path)
+        else:
+            exclude = set(["temp", "licenses", "patches", "recipe-sysroot-native", "recipe-sysroot" ])
+            for root, dirs, files in os.walk(path, topdown=True):
+                dirs[:] = [d for d in dirs if d not in exclude]
+                if root.endswith('/git'):
+                    src_date_epoch = get_git_src_date_epoch(d, root)
+                    break
+
+                for fname in files:
+                    filename = os.path.join(root, fname)
+                    try:
+                        mtime = int(os.path.getmtime(filename))
+                    except:
+                        mtime = 0
+                    if mtime > src_date_epoch:
+                        src_date_epoch = mtime
+                        filename_dbg = filename
+
+        # Most likely an empty folder
+        if src_date_epoch == 0:
+            bb.warn("Unable to determine src_date_epoch! path:%s" % path)
+
+        f = open(os.path.join(path,'src_date_epoch.txt'), 'w')
+        f.write(str(src_date_epoch))
+        f.close()
+
+        if filename_dbg != None:
+            bb.debug(1," src_date_epoch %d derived from: %s" % (src_date_epoch, filename_dbg))
+
+
 OE_IMPORTS += "os sys time oe.path oe.utils oe.types oe.package oe.packagegroup oe.sstatesig oe.lsb oe.cachedpath oe.license"
 OE_IMPORTS[type] = "list"

@@ -173,6 +219,7 @@ python base_do_unpack() {
     try:
         fetcher = bb.fetch2.Fetch(src_uri, d)
         fetcher.unpack(d.getVar('WORKDIR'))
+        create_src_date_epoch_stamp(d)
     except bb.fetch2.BBFetchException as e:
         bb.fatal(str(e))
 }
@@ -383,9 +430,43 @@ def set_packagetriplet(d):

     settriplet(d, "PKGMLTRIPLETS", archs, tos, tvs)

+
+export PYTHONHASHSEED
+export PERL_HASH_SEED
+export SOURCE_DATE_EPOCH
+
+BB_HASHBASE_WHITELIST += "SOURCE_DATE_EPOCH PYTHONHASHSEED PERL_HASH_SEED "
+
 python () {
     import string, re

+    # Create reproducible_environment
+
+    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
+        import subprocess
+        d.setVar('PYTHONHASHSEED', '0')
+        d.setVar('PERL_HASH_SEED', '0')
+        d.setVar('TZ', 'UTC')
+
+        path = d.getVar('S')
+        epochfile = os.path.join(path,'src_date_epoch.txt')
+        if os.path.isfile(epochfile):
+            f = open(epochfile, 'r')
+            src_date_epoch = f.read()
+            f.close()
+            bb.debug(1, "src_date_epoch stamp found ---> stamp %s" % src_date_epoch)
+            d.setVar('SOURCE_DATE_EPOCH', src_date_epoch)
+        else:
+            bb.debug(1, "src_date_epoch stamp not found.")
+            d.setVar('SOURCE_DATE_EPOCH', '0')
+    else:
+        if 'PYTHONHASHSEED' in os.environ:
+            del os.environ['PYTHONHASHSEED']
+        if 'PERL_HASH_SEED' in os.environ:
+            del os.environ['PERL_HASH_SEED']
+        if 'SOURCE_DATE_EPOCH' in os.environ:
+            del os.environ['SOURCE_DATE_EPOCH']
+
     # Handle PACKAGECONFIG
     #
     # These take the form:
@@ -678,6 +759,7 @@ python () {
             bb.warn("Recipe %s is marked as only being architecture specific but seems to have machine specific packages?! The recipe may as well mark itself as machine specific directly." % d.getVar("PN"))
 }

+
 addtask cleansstate after do_clean
 python do_cleansstate() {
         sstate_clean_cachefiles(d)
--
2.7.4




end of thread, other threads:[~2017-06-14 20:50 UTC | newest]

Thread overview: 13+ messages
2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
2017-05-01 20:58 ` [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES Juro Bystricky
2017-05-01 23:13   ` Richard Purdie
2017-05-02  0:35     ` Bystricky, Juro
2017-05-02  5:55       ` Martin Jansa
2017-05-01 20:59 ` [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility Juro Bystricky
2017-06-14 20:30   ` Martin Jansa
2017-06-14 20:50     ` Bystricky, Juro
2017-05-01 20:59 ` [PATCH v2 3/6] image-prelink.bbclass: support " Juro Bystricky
2017-05-01 20:59 ` [PATCH v2 4/6] rootfs-postcommands.bbclass: " Juro Bystricky
2017-05-01 20:59 ` [PATCH v2 5/6] busybox.inc: improve reproducibility Juro Bystricky
2017-05-02  0:31   ` Andre McCurdy
2017-05-01 20:59 ` [PATCH v2 6/6] image.bbclass: support binary reproducibility Juro Bystricky
