From: Jacob Kroon <jacob.kroon@gmail.com>
To: "André Draszik" <git@andred.net>
Cc: openembedded-core <openembedded-core@lists.openembedded.org>
Subject: Re: [RFC][PATCH 2/2] buildhistory: support generating md5sum of files
Date: Mon, 7 Jan 2019 10:38:10 +0100
Message-ID: <CAPbeDC=vBTFU6G+CjKUwe_CjhqHqyN6mKxU3t+cr+aw+Wu+5DA@mail.gmail.com>
In-Reply-To: <7561c555f243ddaaa45bdd28eed97394c35d67e5.camel@andred.net>

Hi André,

On Mon, Jan 7, 2019 at 12:09 AM André Draszik <git@andred.net> wrote:
>
> Hi,
>
> On Sun, 2019-01-06 at 19:13 +0100, Jacob Kroon wrote:
> > Introduce 'md5' in BUILDHISTORY_FEATURES and enable it by default
> > when doing reproducible builds.
> >
> > When enabled this will additionally create:
> >
> >   files-in-package-md5.txt
> >   files-in-image-md5.txt
> >   files-in-sdk-md5.txt
> >
> > containing the md5 checksums of regular files.
> >
> > Signed-off-by: Jacob Kroon <jacob.kroon@gmail.com>
> > ---
> >  meta/classes/buildhistory.bbclass | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/meta/classes/buildhistory.bbclass b/meta/classes/buildhistory.bbclass
> > index 33eb1b00f6..00f0701dec 100644
> > --- a/meta/classes/buildhistory.bbclass
> > +++ b/meta/classes/buildhistory.bbclass
> > @@ -7,7 +7,8 @@
> >  # Copyright (C) 2007-2011 Koen Kooi <koen@openembedded.org>
> >  #
> >
> > -BUILDHISTORY_FEATURES ?= "image package sdk"
> > +BUILDHISTORY_FEATURES ?= "image package sdk \
> > +  ${@ "md5" if bb.utils.to_boolean(d.getVar('BUILD_REPRODUCIBLE_BINARIES')) else ""}"
> >  BUILDHISTORY_DIR ?= "${TOPDIR}/buildhistory"
> >  BUILDHISTORY_DIR_IMAGE = "${BUILDHISTORY_DIR}/images/${MACHINE_ARCH}/${TCLIBC}/${IMAGE_BASENAME}"
> >  BUILDHISTORY_DIR_PACKAGE = "${BUILDHISTORY_DIR}/packages/${MULTIMACH_TARGET_SYS}/${PN}"
> > @@ -526,7 +527,12 @@ buildhistory_list_files() {
> >               eval ${FAKEROOTENV} ${FAKEROOTCMD} $find_cmd
> >       else
> >               eval $find_cmd
> > -     fi | sort -k5 | sed 's/ * -> $//' > $2 )
> > +     fi | sort -k5 | sed 's/ * -> $//' > $2
> > +     if [ "${@bb.utils.contains('BUILDHISTORY_FEATURES', 'md5', '1', '0', d)}" = "1" ] ; then
> > +             md5filename=$(echo $2 | sed 's/\.txt$/-md5.txt/')
> > +             find -type f | xargs -I{} -n1 md5sum {} | sort -k2 > $md5filename
>
> Why don't you
>   find . -type f -exec md5sum {} + | sort -sk2 > $md5filename
> ?
> It'll be quite a bit faster because way fewer processes will be spawned.
>
> Am I missing something?

You're right, I will update the patch. I assume I don't need the stable
sort (-s), since the filenames should all be unique.
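
For illustration, a minimal standalone sketch of the suggested pipeline with
-s dropped. The helper name list_md5 and the demo tree are hypothetical and
not part of buildhistory.bbclass; the sketch assumes GNU find and md5sum.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the patch.
# list_md5 DIR OUTFILE: write one "checksum  ./path" line per regular
# file under DIR to OUTFILE, sorted by path.
list_md5() {
    # "-exec md5sum {} +" passes many paths to each md5sum invocation,
    # so far fewer processes are spawned than with "xargs -n1".
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k2 ) > "$2"
}

# Small demo tree; the output file is kept outside the tree so that it
# is not listed itself.
demo=$(mktemp -d)
mkdir -p "$demo/tree/usr/bin"
printf 'world\n' > "$demo/tree/a"
printf 'hello\n' > "$demo/tree/usr/bin/b"

list_md5 "$demo/tree" "$demo/files-md5.txt"
cat "$demo/files-md5.txt"
```

Each path appears exactly once in the find output, so the sort keys are
unique and a stable sort would not change the result.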

> I don't know what the intended use-case of the md5 files is, but could
> sha256 or similar maybe be more appropriate?

I thought it would be a good idea to store some sort of checksum of files
in the buildhistory when doing reproducible builds, so that it is easier to
detect when a rebuild produces changed files. But perhaps there is already
some way to do this that I am missing?
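
As a rough sketch of that use case: given sorted checksum lists from two
builds, a plain diff shows which files changed. The directory and file names
below are made up for the example and do not reflect the actual buildhistory
layout; the checksums helper is hypothetical as well.

```shell
#!/bin/sh
# Hypothetical helper: print "checksum  ./path" lines for DIR, sorted by path.
checksums() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k2 )
}

# Two stand-in "builds" that differ in a single file.
work=$(mktemp -d)
mkdir -p "$work/build1/lib" "$work/build2/lib"
printf 'same\n' > "$work/build1/lib/foo"
printf 'same\n' > "$work/build2/lib/foo"
printf 'old\n'  > "$work/build1/lib/bar"
printf 'new\n'  > "$work/build2/lib/bar"

checksums "$work/build1" > "$work/build1-md5.txt"
checksums "$work/build2" > "$work/build2-md5.txt"

# diff prefixes lines found only in the second list with "> ";
# strip that prefix and the checksum to leave just the path.
changed=$(diff "$work/build1-md5.txt" "$work/build2-md5.txt" \
    | sed -n 's/^> [0-9a-f]*  //p')
echo "$changed"
```

Because both lists are sorted by path, an unchanged file produces identical
lines in both and drops out of the diff.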

But I have no real motivation for choosing md5, other than that I assumed
it would be less CPU-intensive than sha256, and the fact that I'm not too
worried about collisions.

Thanks for the feedback,
Jacob

> Cheers,
> Andre'
>
>
> > +             [ -s $md5filename ] || rm $md5filename # remove result if empty
> > +     fi )
> >  }
> >
> >  buildhistory_list_pkg_files() {
> > --
> > 2.11.0
> >
>
> --
> _______________________________________________
> Openembedded-core mailing list
> Openembedded-core@lists.openembedded.org
> http://lists.openembedded.org/mailman/listinfo/openembedded-core
