All of lore.kernel.org
 help / color / mirror / Atom feed
From: Juro Bystricky <juro.bystricky@intel.com>
To: openembedded-core@lists.openembedded.org
Cc: jurobystricky@hotmail.com
Subject: [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility
Date: Mon,  1 May 2017 13:59:00 -0700	[thread overview]
Message-ID: <1493672344-21965-3-git-send-email-juro.bystricky@intel.com> (raw)
In-Reply-To: <1493672344-21965-1-git-send-email-juro.bystricky@intel.com>

Conditionally set some environment variables in order to achieve
improved binary reproducibility. Providing BUILD_REPRODUCIBLE_BINARIES is
set to "1", we set the following environment variables:

export PYTHONHASHSEED=0
export PERL_HASH_SEED=0
export TZ="UTC"

We also export and set SOURCE_DATE_EPOCH. The value for this variable
is obtained after source code for a recipe has been unpacked, but before it is
patched. If the code comes from a GIT repo, we get the timestamp from the top
commit. (This usually corresponds to the mktime of "changelog".)
Otherwise we go through all files and get the timestamp from the youngest
one. We create a timestamp for each recipe. The timestamp is stored in the file
'src_date_epoch.txt'. Later on, each task reads this file and sets SOURCE_DATE_EPOCH
based on the value found in the file.

[YOCTO#11178]
[YOCTO#11179]

Signed-off-by: Juro Bystricky <juro.bystricky@intel.com>
---
 meta/classes/base.bbclass | 82 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
index e29821f..f2b2d97 100644
--- a/meta/classes/base.bbclass
+++ b/meta/classes/base.bbclass
@@ -10,6 +10,52 @@ inherit utility-tasks
 inherit metadata_scm
 inherit logging
 
+def get_git_src_date_epoch(d, path):
+    import subprocess
+    saved_cwd = os.getcwd()
+    os.chdir(path)
+    src_date_epoch = int(subprocess.check_output(['git','log','-1','--pretty=%ct']))
+    os.chdir(saved_cwd)
+    return src_date_epoch
+
+def create_src_date_epoch_stamp(d):
+    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
+        path = d.getVar('S')
+        src_date_epoch = 0
+        filename_dbg = None
+
+        if path.endswith('/git'):
+            src_date_epoch = get_git_src_date_epoch(d, path)
+        else:
+            exclude = set(["temp", "licenses", "patches", "recipe-sysroot-native", "recipe-sysroot" ])
+            for root, dirs, files in os.walk(path, topdown=True):
+                dirs[:] = [d for d in dirs if d not in exclude]
+                if root.endswith('/git'):
+                    src_date_epoch = get_git_src_date_epoch(d, root)
+                    break
+
+                for fname in files:
+                    filename = os.path.join(root, fname)
+                    try:
+                        mtime = int(os.path.getmtime(filename))
+                    except:
+                        mtime = 0
+                    if mtime > src_date_epoch:
+                        src_date_epoch = mtime
+                        filename_dbg = filename
+
+        # Most likely an empty folder
+        if src_date_epoch == 0:
+            bb.warn("Unable to determine src_date_epoch! path:%s" % path)
+
+        f = open(os.path.join(path,'src_date_epoch.txt'), 'w')
+        f.write(str(src_date_epoch))
+        f.close()
+
+        if filename_dbg != None:
+            bb.debug(1," src_date_epoch %d derived from: %s" % (src_date_epoch, filename_dbg))
+
+
 OE_IMPORTS += "os sys time oe.path oe.utils oe.types oe.package oe.packagegroup oe.sstatesig oe.lsb oe.cachedpath oe.license"
 OE_IMPORTS[type] = "list"
 
@@ -173,6 +219,7 @@ python base_do_unpack() {
     try:
         fetcher = bb.fetch2.Fetch(src_uri, d)
         fetcher.unpack(d.getVar('WORKDIR'))
+        create_src_date_epoch_stamp(d)
     except bb.fetch2.BBFetchException as e:
         bb.fatal(str(e))
 }
@@ -383,9 +430,43 @@ def set_packagetriplet(d):
 
     settriplet(d, "PKGMLTRIPLETS", archs, tos, tvs)
 
+
+export PYTHONHASHSEED
+export PERL_HASH_SEED
+export SOURCE_DATE_EPOCH
+
+BB_HASHBASE_WHITELIST += "SOURCE_DATE_EPOCH PYTHONHASHSEED PERL_HASH_SEED "
+
 python () {
     import string, re
 
+    # Create reproducible_environment
+
+    if d.getVar('BUILD_REPRODUCIBLE_BINARIES') == '1':
+        import subprocess
+        d.setVar('PYTHONHASHSEED', '0')
+        d.setVar('PERL_HASH_SEED', '0')
+        d.setVar('TZ', 'UTC')
+
+        path = d.getVar('S')
+        epochfile = os.path.join(path,'src_date_epoch.txt')
+        if os.path.isfile(epochfile):
+            f = open(epochfile, 'r')
+            src_date_epoch = f.read()
+            f.close()
+            bb.debug(1, "src_date_epoch stamp found ---> stamp %s" % src_date_epoch)
+            d.setVar('SOURCE_DATE_EPOCH', src_date_epoch)
+        else:
+            bb.debug(1, "src_date_epoch stamp not found.")
+            d.setVar('SOURCE_DATE_EPOCH', '0')
+    else:
+        if 'PYTHONHASHSEED' in os.environ:
+            del os.environ['PYTHONHASHSEED']
+        if 'PERL_HASH_SEED' in os.environ:
+            del os.environ['PERL_HASH_SEED']
+        if 'SOURCE_DATE_EPOCH' in os.environ:
+            del os.environ['SOURCE_DATE_EPOCH']
+
     # Handle PACKAGECONFIG
     #
     # These take the form:
@@ -678,6 +759,7 @@ python () {
             bb.warn("Recipe %s is marked as only being architecture specific but seems to have machine specific packages?! The recipe may as well mark itself as machine specific directly." % d.getVar("PN"))
 }
 
+
 addtask cleansstate after do_clean
 python do_cleansstate() {
         sstate_clean_cachefiles(d)
-- 
2.7.4



  parent reply	other threads:[~2017-05-01 20:59 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-01 20:58 [PATCH v2 0/6] Reproducible binaries Juro Bystricky
2017-05-01 20:58 ` [PATCH v2 1/6] bitbake.conf: new variable BUILD_REPRODUCIBLE_BINARIES Juro Bystricky
2017-05-01 23:13   ` Richard Purdie
2017-05-02  0:35     ` Bystricky, Juro
2017-05-02  5:55       ` Martin Jansa
2017-05-01 20:59 ` Juro Bystricky [this message]
2017-06-14 20:30   ` [PATCH v2 2/6] base.bbclass: initial support for binary reproducibility Martin Jansa
2017-06-14 20:50     ` Bystricky, Juro
2017-05-01 20:59 ` [PATCH v2 3/6] image-prelink.bbclass: support " Juro Bystricky
2017-05-01 20:59 ` [PATCH v2 4/6] rootfs-postcommands.bbclass: " Juro Bystricky
2017-05-01 20:59 ` [PATCH v2 5/6] busybox.inc: improve reproducibility Juro Bystricky
2017-05-02  0:31   ` Andre McCurdy
2017-05-01 20:59 ` [PATCH v2 6/6] image.bbclass: support binary reproducibility Juro Bystricky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1493672344-21965-3-git-send-email-juro.bystricky@intel.com \
    --to=juro.bystricky@intel.com \
    --cc=jurobystricky@hotmail.com \
    --cc=openembedded-core@lists.openembedded.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.