* Mis-generation of shell script (run.do_install)?
@ 2018-12-11 13:42 Jason Andryuk
  2018-12-11 15:02 ` Richard Purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Andryuk @ 2018-12-11 13:42 UTC (permalink / raw)
  To: openembedded-core

Hi,

Has anyone ever seen a generated shell script missing functions?

I have an OpenXT/OpenEmbedded setup where I had run many successful
builds.  I made a change and then re-ran the build - it failed in
binutils' do_install with "autotools_do_install: command not found".

core2-64-oe-linux/binutils/2.28-r0/temp/run.do_install.11776: line
124: autotools_do_install: command not found

Sure enough, autotools_do_install is not in run.do_install.

I had not changed binutils or any relevant variable, as far as I can
tell.  If I run with '-e' I see the full autotools_do_install
function in the output.  For some reason, the generated script wasn't
including autotools_do_install.

I tried binutils -c cleansstate, but that didn't work.  I tried
pruning the sstate-cache dir, but that didn't work.  I tried deleting
tmp-glibc and sstate-cache, but it had the same error when I rebuilt.

Modifying binutils' do_install by adding a comment and `true` lets it build.

I saw something similar one other time where the generated script was
missing a function.  I can't recall the details, but it was a
different package and MACHINE.

Any suggestions on debugging this?

Thanks,
Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-11 13:42 Mis-generation of shell script (run.do_install)? Jason Andryuk
@ 2018-12-11 15:02 ` Richard Purdie
  2018-12-14 19:30   ` Jason Andryuk
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Purdie @ 2018-12-11 15:02 UTC (permalink / raw)
  To: Jason Andryuk, openembedded-core

On Tue, 2018-12-11 at 08:42 -0500, Jason Andryuk wrote:
> Has anyone ever seen a generated shell script missing functions?
> 
> I have an OpenXT/OpenEmbedded setup where I had run many successful
> builds.  I made a change and then re-ran the build - it failed in
> binutil's do_install with autotools_do_install command not found.
> 
> core2-64-oe-linux/binutils/2.28-r0/temp/run.do_install.11776: line
> 124: autotools_do_install: command not found
> 
> Sure enough, autotools_do_install is not in run.do_install.
> 
> I had not changed binutils or any relevant variable, as far as I can
> tell.  If I run with '-e' I see the full autotools_do_install
> function in the output.  For some reason, the generated script wasn't
> including autotools_do_install.
> 
> I tried binutils -c cleansstate, but that didn't work.  I tried
> pruning the sstate-cache dir, but that didn't work.  I tried deleting
> tmp-glibc and sstate-cache, but it had the same error when I rebuilt.
> 
> Modifying binutils do_install by adding a comment and `true` lets it
> builds.
> 
> I saw something similar one other time where the generated script was
> missing a function.  I can't recall the details, but it was a
> different package and MACHINE.
> 
> Any suggestions on debugging this?

It sounds like pysh in bitbake wasn't able to see a dependency on the
function in question. Creating a small, reproducible test case would be
how I'd approach it; there are tests for the pysh code in
bitbake-selftest, for example.

Once I had a test case which failed, I'd then use that to debug and see
if I could figure out a fix.
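
For what it's worth, something along these lines is roughly the shape a
standalone reproducer could take. This is an untested sketch based on the
existing tests in bitbake/lib/bb/tests/codeparser.py; the ShellParser usage
and the "do_install.txt" input file are assumptions, so adjust for the
bitbake version you have:

import logging
import sys

# Assumption: path to a bitbake checkout, so bb.codeparser is importable.
sys.path.insert(0, "bitbake/lib")

import bb.codeparser

# Hypothetical file holding the shell function body to test.
body = open("do_install.txt").read()

parser = bb.codeparser.ShellParser("reproducer", logging.getLogger("BitBake"))
parser.parse_shell(body)

# For the binutils do_install this should include autotools_do_install.
print(sorted(parser.execs))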

Cheers,

Richard



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-11 15:02 ` Richard Purdie
@ 2018-12-14 19:30   ` Jason Andryuk
  2018-12-15 10:51     ` richard.purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Andryuk @ 2018-12-14 19:30 UTC (permalink / raw)
  To: richard.purdie; +Cc: openembedded-core

On Tue, Dec 11, 2018 at 10:02 AM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> On Tue, 2018-12-11 at 08:42 -0500, Jason Andryuk wrote:
> > Has anyone ever seen a generated shell script missing functions?
> >
> > I have an OpenXT/OpenEmbedded setup where I had run many successful
> > builds.  I made a change and then re-ran the build - it failed in
> > binutil's do_install with autotools_do_install command not found.
> >
> > core2-64-oe-linux/binutils/2.28-r0/temp/run.do_install.11776: line
> > 124: autotools_do_install: command not found
> >
> > Sure enough, autotools_do_install is not in run.do_install.
> >
> > I had not changed binutils or any relevant variable, as far as I can
> > tell.  If I run with '-e' I see the full autotools_do_install
> > function in the output.  For some reason, the generated script wasn't
> > including autotools_do_install.
> >
> > I tried binutils -c cleansstate, but that didn't work.  I tried
> > pruning the sstate-cache dir, but that didn't work.  I tried deleting
> > tmp-glibc and sstate-cache, but it had the same error when I rebuilt.
> >
> > Modifying binutils do_install by adding a comment and `true` lets it
> > builds.
> >
> > I saw something similar one other time where the generated script was
> > missing a function.  I can't recall the details, but it was a
> > different package and MACHINE.
> >
> > Any suggestions on debugging this?
>
> It sounds like pysh in bitbake wasn't able to see a dependency on the
> function in question. Creating a small/reproducible test case would be
> how I'd approach it, there are tests on the pysh code in bitbake-
> selftest for example.
>
> Once I had a test case which failed, I'd then use that to debug and see
> if I could figure out a fix.

Thanks, Richard.

I wasn't sure how to tie into the pysh stuff, but that got me poking
around in bitbake/lib/bb/codeparser.py.  Adding debug messages to
parse_shell(), I see that do_install is found in the CodeParserCache,
bb_codeparser.dat, but the returned `execs` do not include
autotools_do_install.  Strangely, it includes a path to python -
...core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python.
It looks like `execs` could be for `distutils_do_install`.  And again,
strangely, python-async is not in my tmp-glibc.  It must have been
built at some point which left the entry in bb_codeparser.dat.

I built python-async, but its distutils_do_install hash value does not
match the one in the cache.
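
(For reference, the way I am comparing hashes is roughly the sketch below.
It assumes the shellCacheLine keys are plain md5 hex digests of the function
text, and "do_install.txt" is just a hypothetical file holding the function
body:)

import hashlib

# Assumption: the cache key is the md5 hex digest of the function text.
body = open("do_install.txt").read()
print(hashlib.md5(body.encode("utf-8")).hexdigest())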

Moving cache/bb_codeparser.dat out of the way, bitbake complains:
ERROR: When reparsing
/home/build/openxt/build/repos/openembedded-core/meta/recipes-devtools/binutils/binutils_2.28.bb.do_install,
the basehash value changed from 80812e0772cf901b51790c205564070d to
493152cd3740c5420d0bf7a5d09df001. The metadata is not deterministic
and this needs to be fixed.

`cleanall` does not clear out the message, but the package builds.

Regards,
Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-14 19:30   ` Jason Andryuk
@ 2018-12-15 10:51     ` richard.purdie
  2018-12-16  1:19       ` Jason Andryuk
  0 siblings, 1 reply; 15+ messages in thread
From: richard.purdie @ 2018-12-15 10:51 UTC (permalink / raw)
  To: Jason Andryuk; +Cc: openembedded-core

On Fri, 2018-12-14 at 14:30 -0500, Jason Andryuk wrote:
> I wasn't sure how to tie into the pysh stuff, but that got me poking
> around in bitbake/lib/bb/codeparser.py .  Adding debug messages to
> parse_shell(), I see that do_install is found in the CodeParserCache,
> bb_codeparser.dat, but the returned `execs` do not include
> autotools_do_install.  Strangely, it includes a path to python -
> ...core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-
> native/usr/bin/python-native/python.
> It looks like `execs` could be for `distutils_do_install`.  And
> again,
> strangely, python-async is not in my tmp-glibc.  It must have been
> built at some point which left the entry in bb_codeparser.dat.
> 
> I built python-async, but its distutils_do_install hash value does
> not
> match the one in the cache.
> 
> Moving cache/bb_codeparser.dat out of the way, bitbake complains:
> ERROR: When reparsing
> /home/build/openxt/build/repos/openembedded-core/meta/recipes-
> devtools/binutils/binutils_2.28.bb.do_install,
> the basehash value changed from 80812e0772cf901b51790c205564070d to
> 493152cd3740c5420d0bf7a5d09df001. The metadata is not deterministic
> and this needs to be fixed.
> 
> `cleanall` does not clear out the message, but the package builds.

It's a little hard to make sense of this. If you move the cache out of
the way it should simply get regenerated. It is long-lived, so things
from old builds being in there is expected.

Were you able to isolate this into a smaller test case someone else
could reproduce?

Cheers,

Richard



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-15 10:51     ` richard.purdie
@ 2018-12-16  1:19       ` Jason Andryuk
  2018-12-17 14:44         ` richard.purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Andryuk @ 2018-12-16  1:19 UTC (permalink / raw)
  To: Richard Purdie; +Cc: openembedded-core

On Sat, Dec 15, 2018, 5:51 AM <richard.purdie@linuxfoundation.org> wrote:
>
> On Fri, 2018-12-14 at 14:30 -0500, Jason Andryuk wrote:
> > I wasn't sure how to tie into the pysh stuff, but that got me poking
> > around in bitbake/lib/bb/codeparser.py .  Adding debug messages to
> > parse_shell(), I see that do_install is found in the CodeParserCache,
> > bb_codeparser.dat, but the returned `execs` do not include
> > autotools_do_install.  Strangely, it includes a path to python -
> > ...core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-
> > native/usr/bin/python-native/python.
> > It looks like `execs` could be for `distutils_do_install`.  And
> > again,
> > strangely, python-async is not in my tmp-glibc.  It must have been
> > built at some point which left the entry in bb_codeparser.dat.
> >
> > I built python-async, but its distutils_do_install hash value does
> > not
> > match the one in the cache.
> >
> > Moving cache/bb_codeparser.dat out of the way, bitbake complains:
> > ERROR: When reparsing
> > /home/build/openxt/build/repos/openembedded-core/meta/recipes-
> > devtools/binutils/binutils_2.28.bb.do_install,
> > the basehash value changed from 80812e0772cf901b51790c205564070d to
> > 493152cd3740c5420d0bf7a5d09df001. The metadata is not deterministic
> > and this needs to be fixed.
> >
> > `cleanall` does not clear out the message, but the package builds.
>
> Its a little hard to make sense of this. If you move the cache out the
> way it should simply get regenerated. It is long lived so things from
> old builds in there is expected.
>
> Were you able to isolate this into a smaller test case someone else
> could reproduce?

As far as I can tell, pysh is working properly - it's just the
bb_codeparser.dat which is returning the incorrect shellCacheLine
entry.  It seems like I have an md5 collision between a pyro core2-64
binutils do_install and core2-32 python-async distutils_do_install in
the shellCacheLine.  python-async's entry got in first, so that's why
binutils run.do_install doesn't include autotools_do_install - the
shellCacheLine `execs` entry doesn't include it.  Or somehow the
`bb_codeparser.dat` file was corrupted to have an incorrect `execs`
for the binutils do_install hash.

I briefly tried to reproduce the python-async distutils_do_install
with the same hash, but could not get it to match.  Also I tried to
manually unpickle bb_codeparser.dat, but it threw a stack underflow
error - maybe I just didn't have all the necessary imports?
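
(Roughly what I attempted, for reference - it assumes the file is an
ordinary pickle and that bitbake's lib directory is on sys.path so the
pickled classes can be imported, which may be exactly what I was missing:)

import pickle
import sys

# Assumption: path to the bitbake checkout, so bb.codeparser is importable.
sys.path.insert(0, "bitbake/lib")
import bb.codeparser

with open("cache/bb_codeparser.dat", "rb") as f:
    data = pickle.load(f)
print(type(data))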

I'm not sure where the basehash/"metadata is not deterministic"
message comes from.  I am using two different x86-64 machine types
that both fall back to the core2-64 binutils.  Could that be an issue?

Regards,
Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-16  1:19       ` Jason Andryuk
@ 2018-12-17 14:44         ` richard.purdie
  2018-12-17 20:21           ` Andre McCurdy
  0 siblings, 1 reply; 15+ messages in thread
From: richard.purdie @ 2018-12-17 14:44 UTC (permalink / raw)
  To: Jason Andryuk; +Cc: openembedded-core

On Sat, 2018-12-15 at 20:19 -0500, Jason Andryuk wrote:
> As far as I can tell, pysh is working properly - it's just the
> bb_codeparser.dat which is returning the incorrect shellCacheLine
> entry.  It seems like I have an md5 collision between a pyro core2-64
> binutils do_install and core2-32 python-async distutils_do_install in
> the shellCacheLine.  python-async's entry got in first, so that's why
> binutils run.do_install doesn't include autotools_do_install - the
> shellCacheLine `execs` entry doesn't include it.  Or somehow the
> `bb_codeparser.dat` file was corrupted to have an incorrect `execs`
> for the binutils do_install hash.

That is rather worrying. Looking at the known issues with md5, I can
see how this could happen though.

I think this means we need to switch to a better hash mechanism. I've
sent a patch changing this to sha256 on the bitbake list.

We also probably need to change over the code in siggen for the sstate
hashes too but one step at a time...
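
(Illustrative only - the real patch is on the bitbake list - but the gist
is just moving the cache key from md5 to sha256 over the function text:)

import hashlib

def old_key(funcbody):
    return hashlib.md5(funcbody.encode("utf-8")).hexdigest()

def new_key(funcbody):
    return hashlib.sha256(funcbody.encode("utf-8")).hexdigest()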

Cheers,

Richard



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-17 14:44         ` richard.purdie
@ 2018-12-17 20:21           ` Andre McCurdy
  2018-12-17 21:24             ` richard.purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Andre McCurdy @ 2018-12-17 20:21 UTC (permalink / raw)
  To: Richard Purdie; +Cc: OE Core mailing list

On Mon, Dec 17, 2018 at 6:44 AM <richard.purdie@linuxfoundation.org> wrote:
>
> On Sat, 2018-12-15 at 20:19 -0500, Jason Andryuk wrote:
> > As far as I can tell, pysh is working properly - it's just the
> > bb_codeparser.dat which is returning the incorrect shellCacheLine
> > entry.  It seems like I have an md5 collision between a pyro core2-64
> > binutils do_install and core2-32 python-async distutils_do_install in
> > the shellCacheLine.  python-async's entry got in first, so that's why
> > binutils run.do_install doesn't include autotools_do_install - the
> > shellCacheLine `execs` entry doesn't include it.  Or somehow the
> > `bb_codeparser.dat` file was corrupted to have an incorrect `execs`
> > for the binutils do_install hash.
>
> That is rather worrying. Looking at the known issues with md5, I can
> see how this could happen though.

How do you see this could happen? By random bad luck?

Despite md5 now being susceptible to targeted attacks, the chances of
accidentally hitting a collision between two 128-bit hashes are as
unlikely as they've always been.

  http://big.info/2013/04/md5-hash-collision-probability-using.html

"It is not that easy to get hash collisions when using MD5 algorithm.
Even after you have generated 26 trillion hash values, the probability
of the next generated hash value to be the same as one of those 26
trillion previously generated hash values is 1/1trillion (1 out of 1
trillion)."

It seems much more likely that there's a bug somewhere in the way the
hashes are used. Unless we understand that, switching to a longer
hash might not solve anything.
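
(Back-of-the-envelope check of that figure, using the usual birthday
approximation for n random 128-bit values:)

# P(any collision among n random 128-bit hashes) ~= n*(n-1) / 2**129
n = 26 * 10**12               # the 26 trillion hashes from the quote above
print(n * (n - 1) / 2**129)   # ~1e-12, the "1 in a trillion" the quote
                              # appears to be describing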

> I think this means we need to switch to a better hash mechanism. I've
> sent a patch changing this to sha256 on the bitbake list.
>
> We also probably need to change over the code in siggen for the sstate
> hashes too but one step at a time...
>
> Cheers,
>
> Richard
>
> --
> _______________________________________________
> Openembedded-core mailing list
> Openembedded-core@lists.openembedded.org
> http://lists.openembedded.org/mailman/listinfo/openembedded-core


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-17 20:21           ` Andre McCurdy
@ 2018-12-17 21:24             ` richard.purdie
  2018-12-18 17:45               ` Jason Andryuk
  0 siblings, 1 reply; 15+ messages in thread
From: richard.purdie @ 2018-12-17 21:24 UTC (permalink / raw)
  To: Andre McCurdy; +Cc: OE Core mailing list

On Mon, 2018-12-17 at 12:21 -0800, Andre McCurdy wrote:
> On Mon, Dec 17, 2018 at 6:44 AM <richard.purdie@linuxfoundation.org>
> wrote:
> > On Sat, 2018-12-15 at 20:19 -0500, Jason Andryuk wrote:
> > > As far as I can tell, pysh is working properly - it's just the
> > > bb_codeparser.dat which is returning the incorrect shellCacheLine
> > > entry.  It seems like I have an md5 collision between a pyro
> > > core2-64
> > > binutils do_install and core2-32 python-async
> > > distutils_do_install in
> > > the shellCacheLine.  python-async's entry got in first, so that's
> > > why
> > > binutils run.do_install doesn't include autotools_do_install -
> > > the
> > > shellCacheLine `execs` entry doesn't include it.  Or somehow the
> > > `bb_codeparser.dat` file was corrupted to have an incorrect
> > > `execs`
> > > for the binutils do_install hash.
> > 
> > That is rather worrying. Looking at the known issues with md5, I
> > can
> > see how this could happen though.
> 
> How do you see this could happen? By random bad luck?
> 
> Despite md5 now being susceptible to targeted attacks, the chances of
> accidentally hitting a collision between two 128bit hashes is as
> unlikely as it's always been.
> 
>   http://big.info/2013/04/md5-hash-collision-probability-using.html
> 
> "It is not that easy to get hash collisions when using MD5 algorithm.
> Even after you have generated 26 trillion hash values, the
> probability of the next generated hash value to be the same as one of
> those 26 trillion previously generated hash values is 1/1trillion (1
> out of 1 trillion)."
> 
> It seems much more likely that there's a bug somewhere in the way the
> hashes are used. Unless we understand that then switching to a longer
> hash might not solve anything.

The md5 collision generators have demonstrated it's possible to get
colliding checksums where there is a block of contiguous fixed data and
a block of arbitrary data, in ratios of up to about 75% to 25%.

That pattern nearly exactly matches our function templating mechanism
where two functions may be nearly identical except for a name or a
small subset of it.

Two random hashes colliding is less interesting than the chance of
two very similar but subtly different pieces of code getting the same
hash. I don't have a mathematical proof of it, but looking at the
way you can generate collisions, I suspect our data is susceptible, and
the fact you can do it at all with such large blocks is concerning.

I would love to have definitive proof. I'd be really interested if
Jason has the "bad" checksum and one of the inputs which matches it as
I'd probably see if we could brute force the other. I've read enough to
lose faith in our current code though.

Also, though, there is the human factor. What I don't want is people
being put off the project and deeming it "insecure". I already get raised
eyebrows at the use of md5. It's probably time to switch and be done
with any such perception anyway, particularly now that questions are
being asked, valid or not, as the performance hit, whilst noticeable on a
profile, is not earth-shattering.

Finally, by all means please do audit the codepaths and see if there is
another explanation. Our hash use is fairly simple, but it's possible
there is some other logic error, and if there is we should fix it.

Cheers,

Richard




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-17 21:24             ` richard.purdie
@ 2018-12-18 17:45               ` Jason Andryuk
  2019-01-08 18:26                 ` richard.purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Andryuk @ 2018-12-18 17:45 UTC (permalink / raw)
  To: Richard Purdie; +Cc: OE Core mailing list

[-- Attachment #1: Type: text/plain, Size: 5261 bytes --]

On Mon, Dec 17, 2018 at 4:24 PM <richard.purdie@linuxfoundation.org> wrote:
>
> On Mon, 2018-12-17 at 12:21 -0800, Andre McCurdy wrote:
> > On Mon, Dec 17, 2018 at 6:44 AM <richard.purdie@linuxfoundation.org>
> > wrote:
> > > On Sat, 2018-12-15 at 20:19 -0500, Jason Andryuk wrote:
> > > > As far as I can tell, pysh is working properly - it's just the
> > > > bb_codeparser.dat which is returning the incorrect shellCacheLine
> > > > entry.  It seems like I have an md5 collision between a pyro
> > > > core2-64
> > > > binutils do_install and core2-32 python-async
> > > > distutils_do_install in
> > > > the shellCacheLine.  python-async's entry got in first, so that's
> > > > why
> > > > binutils run.do_install doesn't include autotools_do_install -
> > > > the
> > > > shellCacheLine `execs` entry doesn't include it.  Or somehow the
> > > > `bb_codeparser.dat` file was corrupted to have an incorrect
> > > > `execs`
> > > > for the binutils do_install hash.
> > >
> > > That is rather worrying. Looking at the known issues with md5, I
> > > can
> > > see how this could happen though.
> >
> > How do you see this could happen? By random bad luck?
> >
> > Despite md5 now being susceptible to targeted attacks, the chances of
> > accidentally hitting a collision between two 128bit hashes is as
> > unlikely as it's always been.
> >
> >   http://big.info/2013/04/md5-hash-collision-probability-using.html
> >
> > "It is not that easy to get hash collisions when using MD5 algorithm.
> > Even after you have generated 26 trillion hash values, the
> > probability of the next generated hash value to be the same as one of
> > those 26 trillion previously generated hash values is 1/1trillion (1
> > out of 1 trillion)."
> >
> > It seems much more likely that there's a bug somewhere in the way the
> > hashes are used. Unless we understand that then switching to a longer
> > hash might not solve anything.
>
> The md5 collision generators have demonstrated its possible to get
> checksums where there is a block of contiguous fixed data and a block
> of arbitrary data in ratios of up to about 75% to 25%.
>
> That pattern nearly exactly matches our function templating mechanism
> where two functions may be nearly identical except for a name or a
> small subset of it.
>
> Two random hashes colliding are less interesting than the chances of
> two very similar but subtly different pieces of code getting the same
> hash. I don't have a mathematical level proof of it but looking at the
> way you can generate collisions, I suspect our data is susceptible and
> the fact you can do it at all with such large blocks is concerning.
>
> I would love to have definitive proof. I'd be really interested if
> Jason has the "bad" checksum and one of the inputs which matches it as
> I'd probably see if we could brute force the other. I've read enough to
> lose faith in our current code though.
>
> Also though, there is the human factor. What I don't want to have is
> people put off the project deeming it "insecure". I already get raised
> eyebrows at the use of md5. Its probably time to switch and be done
> with any perception anyway, particularly now questions are being asked,
> valid or not as the performance hit, whilst noticeable on a profile is
> not earth shattering.
>
> Finally, by all means please do audit the codepaths and see if there is
> another explanation. Our hash use is fairly simple but its possible
> there is some other logic error and if there is we should fix it.

I can definitively state I have a hash in bb_codeparser.dat with an
incorrect shellCacheLine entry and I don't know how it got there.

The bad hash is 3df9018676de219bb3e46e88eea09c98.  I've attached a
file with the binutils do_install() contents which hash to that value.

The bad 3df9018676de219bb3e46e88eea09c98 entry in the bb_codeparser.dat returned
DEBUG: execs [
DEBUG: execs rm
DEBUG: execs install
DEBUG: execs test
DEBUG: execs sed
DEBUG: execs rmdir
DEBUG: execs bbfatal_log
DEBUG: execs mv
DEBUG: execs /home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python
DEBUG: execs find

These execs look like they could be from a distutils_do_install(),
but that's just a guess.  python-async was not in my tmp-glibc
directory when I started this investigation.  I don't know how it got
there.  I built it manually, but the resulting distutils_do_install
has a different hash :(

The correct shellCacheLine entry for core2-64 binutils do_install returns:
DEBUG: execs basename
DEBUG: execs rm
DEBUG: execs oe_multilib_header
DEBUG: execs ln
DEBUG: execs install
DEBUG: execs echo
DEBUG: execs cd
DEBUG: execs autotools_do_install
DEBUG: execs sed
DEBUG: execs tr

Is it an md5 collision?  I don't know - I don't have a second
colliding input for 3df9018676de219bb3e46e88eea09c98.

Any hashing can potentially have collisions.  A longer and stronger
algorithm reduces the chances, but there is no absolute fix.  Without
comparing the original inputs, you can't know if two inputs collided.

This OpenXT 8 build is based on pyro, FYI.

Regards,
Jason

[-- Attachment #2: binutils_do_install-3df9018676de219bb3e46e88eea09c98 --]
[-- Type: application/octet-stream, Size: 1705 bytes --]

	autotools_do_install

	# We don't really need these, so we'll remove them...
	rm -rf /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/image/usr/lib/ldscripts

	# Fix the /usr/x86_64-oe-linux/bin/* links
	for l in /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/image/usr/x86_64-oe-linux/bin/*; do
		rm -f $l
		ln -sf `echo /usr/x86_64-oe-linux/bin \
			| tr -s / \
			| sed -e 's,^/,,' -e 's,[^/]*,..,g'`/usr/bin/x86_64-oe-linux-`basename $l` $l
	done

	# Install the libiberty header
	install -d /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/image/usr/include
	install -m 644 /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/git/include/ansidecl.h /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/image/usr/include
	install -m 644 /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/git/include/libiberty.h /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/image/usr/include

	cd /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/image/usr/bin

	# Symlinks for ease of running these on the native target
	for p in x86_64-oe-linux-* ; do
		ln -sf $p `echo $p | sed -e s,x86_64-oe-linux-,,`
	done

	for alt in  	addr2line 	ar 	as 	c++filt 	elfedit 	gprof 	ld 	ld.bfd 	ld.gold dwp 	nm 	objcopy 	objdump 	ranlib 	readelf 	size 	strings 	strip ; do
		rm -f /home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/binutils/2.28-r0/image/usr/bin/$alt
	done

	oe_multilib_header bfd.h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2018-12-18 17:45               ` Jason Andryuk
@ 2019-01-08 18:26                 ` richard.purdie
  2019-01-16 13:55                   ` Jason Andryuk
  0 siblings, 1 reply; 15+ messages in thread
From: richard.purdie @ 2019-01-08 18:26 UTC (permalink / raw)
  To: Jason Andryuk; +Cc: OE Core mailing list

On Tue, 2018-12-18 at 12:45 -0500, Jason Andryuk wrote:
> I can definitively state I have a hash in bb_codeparser.dat with an
> incorrect shellCacheLine entry and I don't know how it got there.
> 
> The bad hash is 3df9018676de219bb3e46e88eea09c98.  I've attached a
> file with the binutils do_install() contents which hash to that
> value.
> 
> The bad 3df9018676de219bb3e46e88eea09c98 entry in the
> bb_codeparser.dat returned
> DEBUG: execs [
> DEBUG: execs rm
> DEBUG: execs install
> DEBUG: execs test
> DEBUG: execs sed
> DEBUG: execs rmdir
> DEBUG: execs bbfatal_log
> DEBUG: execs mv
> DEBUG: execs /home/build/openxt-compartments/build/tmp-
> glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-
> native/usr/bin/python-native/python
> DEBUG: execs find

This is useful data (along with the attachment), thanks.

I agree that this looks likely to have come from a core2-32 tuned
machine (e.g. genericx86) from python-async do_install.

How old was this build directory? Can you remember any details of the
update history for it?

I'd be very interested to try and reproduce that hash. I locally
blacklisted your collision from my cache and tried to reproduce this. I
can generate a matching hash for the binutils do_install but I can't
produce one matching the above.

Can you remember the history of this build directory and which updates
it may have had? The python-async recipe is confined to OE-Core, so it's
probably the revision history of the oe-core repo which is most
interesting. Is there anything in its .git/logs directory which would
help us replay the different versions you might have built?

Cheers,

Richard



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2019-01-08 18:26                 ` richard.purdie
@ 2019-01-16 13:55                   ` Jason Andryuk
  2019-01-16 14:02                     ` Richard Purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Andryuk @ 2019-01-16 13:55 UTC (permalink / raw)
  To: Richard Purdie; +Cc: OE Core mailing list

On Tue, Jan 8, 2019 at 1:26 PM <richard.purdie@linuxfoundation.org> wrote:
>
> On Tue, 2018-12-18 at 12:45 -0500, Jason Andryuk wrote:
> > I can definitively state I have a hash in bb_codeparser.dat with an
> > incorrect shellCacheLine entry and I don't know how it got there.
> >
> > The bad hash is 3df9018676de219bb3e46e88eea09c98.  I've attached a
> > file with the binutils do_install() contents which hash to that
> > value.
> >
> > The bad 3df9018676de219bb3e46e88eea09c98 entry in the
> > bb_codeparser.dat returned
> > DEBUG: execs [
> > DEBUG: execs rm
> > DEBUG: execs install
> > DEBUG: execs test
> > DEBUG: execs sed
> > DEBUG: execs rmdir
> > DEBUG: execs bbfatal_log
> > DEBUG: execs mv
> > DEBUG: execs /home/build/openxt-compartments/build/tmp-
> > glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-
> > native/usr/bin/python-native/python
> > DEBUG: execs find
>
> This is useful data (along with the attachment), thanks.
>
> I agree that this looks likely to have come from a core2-32 tuned
> machine (e.g. genericx86) from python-async do_install.
>
> How old was this build directory? Can you remember any details of the
> update history for it?

I think the build directory was from the beginning of October 30th,
and I guess I hit the collision December 10th or so.

> I'd be very interested to try and reproduce that hash. I locally
> blacklisted your collision from my cache and tried to reproduce this. I
> can generate a matching hash for the binutils do_install but I can't
> produce one matching the above.

I tried around December 18th to generate the collision again.  I set
up a new container with an identical openxt path.  There, python-async
was built, but it did not have the colliding hash.  When core2-64
binutils was built, it had the expected hash.

> Can you remember the history of this build directory and which updates
> it may have had? The python-async recipe is confined to OE-Core so its
> probably the revision history for the oe-core repo which is most
> interesting. Anything in the .git/logs directory for that which would
> help us replay the different versions you might have built?

oe-core is checked out at 819aa151bd634122a46ffdd822064313c67f5ba5.
It's a git submodule locked at a fixed revision, and it had not
changed in the build directory.

OpenXT builds 8 or 9 different MACHINEs and images in sequence in the
same build directory.  Maybe 6 are core2-32 and two are core2-64. The
32bit ones run first.

I think the problem first manifested after I added an additional local
layer to BBLAYERS.  At that time, I started building an additional
MACHINE.  Along with the mis-generated run.do_install script, bitbake
was complaining about the binutils base hash mismatch which triggered
the rebuild.  The first 64-bit MACHINE included TUNE_CCARGS +=
"-mstackrealign" while the second did not.  Could that be a reason why
bitbake complained about the base hash mismatch?

Without reproducing the hash, I'm more puzzled.

Regards,
Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2019-01-16 13:55                   ` Jason Andryuk
@ 2019-01-16 14:02                     ` Richard Purdie
  2019-01-16 20:20                       ` Jason Andryuk
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Purdie @ 2019-01-16 14:02 UTC (permalink / raw)
  To: Jason Andryuk; +Cc: OE Core mailing list

On Wed, 2019-01-16 at 08:55 -0500, Jason Andryuk wrote:
> On Tue, Jan 8, 2019 at 1:26 PM <richard.purdie@linuxfoundation.org>
> wrote:
> > On Tue, 2018-12-18 at 12:45 -0500, Jason Andryuk wrote:
> > > I can definitively state I have a hash in bb_codeparser.dat with
> > > an
> > > incorrect shellCacheLine entry and I don't know how it got there.
> > > 
> > > The bad hash is 3df9018676de219bb3e46e88eea09c98.  I've attached
> > > a
> > > file with the binutils do_install() contents which hash to that
> > > value.
> > > 
> > > The bad 3df9018676de219bb3e46e88eea09c98 entry in the
> > > bb_codeparser.dat returned
> > > DEBUG: execs [
> > > DEBUG: execs rm
> > > DEBUG: execs install
> > > DEBUG: execs test
> > > DEBUG: execs sed
> > > DEBUG: execs rmdir
> > > DEBUG: execs bbfatal_log
> > > DEBUG: execs mv
> > > DEBUG: execs /home/build/openxt-compartments/build/tmp-
> > > glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-
> > > sysroot-
> > > native/usr/bin/python-native/python
> > > DEBUG: execs find
> > 
> > This is useful data (along with the attachment), thanks.
> > 
> > I agree that this looks likely to have come from a core2-32 tuned
> > machine (e.g. genericx86) from python-async do_install.
> > 
> > How old was this build directory? Can you remember any details of
> > the
> > update history for it?
> 
> I think the build directory was from the beginning of October 30th,
> and I guess I hit the collision December 10th or so.
> 
> > I'd be very interested to try and reproduce that hash. I locally
> > blacklisted your collision from my cache and tried to reproduce
> > this. I
> > can generate a matching hash for the binutils do_install but I
> > can't
> > produce one matching the above.
> 
> I tried around December 18th to generate the collision again.  I set
> up a new container with an identical openxt path.  There, python-
> async was built, but it did not have the colliding hash.  When core2-
> 64 binutils was built, it had the expected hash.
> 
> > Can you remember the history of this build directory and which
> > updates
> > it may have had? The python-async recipe is confined to OE-Core so
> > its
> > probably the revision history for the oe-core repo which is most
> > interesting. Anything in the .git/logs directory for that which
> > would
> > help us replay the different versions you might have built?
> 
> oe-core is checked out at 819aa151bd634122a46ffdd822064313c67f5ba5
> It's a git submodule locked at a fixed revision, and it had not
> changed in the build directory.
> 
> OpenXT builds 8 or 9 different MACHINEs and images in sequence in the
> same build directory.  Maybe 6 are core2-32 and two are core2-64. The
> 32bit ones run first.

The hash we don't have is from a core2-32 MACHINE. I'm wondering which
configurations you might have parsed for a core2-32 MACHINE between
October and December?

Was TMPDIR ever cleaned? If not, do you have the python-async WORKDIR
for core2-32? The TMPDIR/logs directory may also have useful hints
about the configurations built...

> I think the problem first manifest after I added an additional local
> layer to BBLAYERS.  At that time, I started building an additional
> MACHINE.  Along with the mis-generated run.do_install script, bitbake
> was complaining about the binutils base hash mismatch which triggered
> the re-build.  The first 64bit MACHINE included TUNE-CCARGS +=
> "-mstackrealign" while the second did not.  Could that be a reason
> why bitbake complained about the base hash mismatch?

By the time the binutils error happens, the underlying problem is kind of
lost in history and must have been introduced some time prior to that.

We know it's a build of python-async for a core2-32 MACHINE. Did you
also try building those with the "-mstackrealign" option? Were there
any other changes you can think of that would have applied to the
core2-32 MACHINE builds?

Cheers,

Richard





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2019-01-16 14:02                     ` Richard Purdie
@ 2019-01-16 20:20                       ` Jason Andryuk
  2019-01-16 20:28                         ` Richard Purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Andryuk @ 2019-01-16 20:20 UTC (permalink / raw)
  To: Richard Purdie; +Cc: OE Core mailing list

On Wed, Jan 16, 2019 at 9:02 AM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> On Wed, 2019-01-16 at 08:55 -0500, Jason Andryuk wrote:
> > On Tue, Jan 8, 2019 at 1:26 PM <richard.purdie@linuxfoundation.org>
> > wrote:
> > > On Tue, 2018-12-18 at 12:45 -0500, Jason Andryuk wrote:
> > > > I can definitively state I have a hash in bb_codeparser.dat with
> > > > an
> > > > incorrect shellCacheLine entry and I don't know how it got there.
> > > >
> > > > The bad hash is 3df9018676de219bb3e46e88eea09c98.  I've attached
> > > > a
> > > > file with the binutils do_install() contents which hash to that
> > > > value.
> > > >
> > > > The bad 3df9018676de219bb3e46e88eea09c98 entry in the
> > > > bb_codeparser.dat returned
> > > > DEBUG: execs [
> > > > DEBUG: execs rm
> > > > DEBUG: execs install
> > > > DEBUG: execs test
> > > > DEBUG: execs sed
> > > > DEBUG: execs rmdir
> > > > DEBUG: execs bbfatal_log
> > > > DEBUG: execs mv
> > > > DEBUG: execs /home/build/openxt-compartments/build/tmp-
> > > > glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-
> > > > sysroot-
> > > > native/usr/bin/python-native/python
> > > > DEBUG: execs find
> > >
> > > This is useful data (along with the attachment), thanks.
> > >
> > > I agree that this looks likely to have come from a core2-32 tuned
> > > machine (e.g. genericx86) from python-async do_install.
> > >
> > > How old was this build directory? Can you remember any details of
> > > the
> > > update history for it?
> >
> > I think the build directory was from the beginning of October 30th,
> > and I guess I hit the collision December 10th or so.
> >
> > > I'd be very interested to try and reproduce that hash. I locally
> > > blacklisted your collision from my cache and tried to reproduce
> > > this. I
> > > can generate a matching hash for the binutils do_install but I
> > > can't
> > > produce one matching the above.
> >
> > I tried around December 18th to generate the collision again.  I set
> > up a new container with an identical openxt path.  There, python-
> > async was built, but it did not have the colliding hash.  When core2-
> > 64 binutils was built, it had the expected hash.
> >
> > > Can you remember the history of this build directory and which
> > > updates
> > > it may have had? The python-async recipe is confined to OE-Core so
> > > its
> > > probably the revision history for the oe-core repo which is most
> > > interesting. Anything in the .git/logs directory for that which
> > > would
> > > help us replay the different versions you might have built?
> >
> > oe-core is checked out at 819aa151bd634122a46ffdd822064313c67f5ba5
> > It's a git submodule locked at a fixed revision, and it had not
> > changed in the build directory.
> >
> > OpenXT builds 8 or 9 different MACHINEs and images in sequence in the
> > same build directory.  Maybe 6 are core2-32 and two are core2-64. The
> > 32bit ones run first.
>
> The hash we don't have is from a core2-32 MACHINE. I'm wondering which
> configurations you might have parsed for a core2-32 MACHINE between
> October and December?

Which "configurations" are you asking about?

The standard OpenXT build loops through building all 8 images and
packaging them up into an installer iso.  Often I run that build
script, but sometimes I just build individual machines manually.

I was mainly working on the core2-64 machines immediately prior to
this event.  I was very surprised when it occurred since 1) I didn't
expect binutils to be re-built and 2) I wasn't working on the
openxt-installer machine which failed.

> Was TMPDIR ever cleaned? If not, do you have the python-async WORKDIR
> for core2-32? The TMPDIR/logs directory may also have useful hints
> about the configurations built...

Unfortunately, yes, I cleaned TMPDIR when I hit the build error.  Same
with the sstate-cache.

In general, I don't see python-async in TMPDIR after running through
the OpenXT build.  Would that be because an early machine builds
python-async, but then it gets cleared out of TMPDIR when a later
machine/image is built?

> > I think the problem first manifest after I added an additional local
> > layer to BBLAYERS.  At that time, I started building an additional
> > MACHINE.  Along with the mis-generated run.do_install script, bitbake
> > was complaining about the binutils base hash mismatch which triggered
> > the re-build.  The first 64bit MACHINE included TUNE-CCARGS +=
> > "-mstackrealign" while the second did not.  Could that be a reason
> > why bitbake complained about the base hash mismatch?
>
> By the time the binutils error happens, the error is kind of lost in
> history and must have been added some time prior to that.
>
> We know its a build of python-async for a core2-32 MACHINE. Did you
> also try building those with the "-mstackrealign" option? Were there
> any other changes you can think of that would have applied to the
> core2-32 MACHINE builds?

All the base OpenXT machines have "-mstackrealign" in their conf.  My
new 64bit machines do not have it.  I don't recall working with
core2-32 MACHINES at the time.  The new layer I pulled in only had a
layer.conf and a 64bit machine.conf.

In my second container, I `rm -rf cache/ tmp-glibc/ sstate-cache/`.
Running the build of the first OpenXT machine, bb_codeparser.dat gets
populated with python-async:
'3c6fe664c51d2f793f8fd0eb103d68cb': frozenset({'find', 'sed',
'install', 'mv', 'bbfatal_log', 'rmdir', '[', 'rm',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python',
'test'})

python-async is not in tmp-glibc/work and `grep -r tmp-glibc/log`
doesn't turn up anything.  If I run `bitbake -g`, python-async doesn't
appear in any of the output files.  Is it expected that bb_codeparser.dat
gets populated without building the recipe?

Regards,
Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2019-01-16 20:20                       ` Jason Andryuk
@ 2019-01-16 20:28                         ` Richard Purdie
  2019-01-17 17:10                           ` Jason Andryuk
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Purdie @ 2019-01-16 20:28 UTC (permalink / raw)
  To: Jason Andryuk; +Cc: OE Core mailing list

On Wed, 2019-01-16 at 15:20 -0500, Jason Andryuk wrote:
> On Wed, Jan 16, 2019 at 9:02 AM Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> > On Wed, 2019-01-16 at 08:55 -0500, Jason Andryuk wrote:
> > > On Tue, Jan 8, 2019 at 1:26 PM <
> > > richard.purdie@linuxfoundation.org>
> > > wrote:
> > > OpenXT builds 8 or 9 different MACHINEs and images in sequence in
> > > the
> > > same build directory.  Maybe 6 are core2-32 and two are core2-64. 
> > > The
> > > 32bit ones run first.
> > 
> > The hash we don't have is from a core2-32 MACHINE. I'm wondering
> > which configurations you might have parsed for a core2-32 MACHINE
> > between October and December?
> 
> Which "configurations" are you asking about?

I mean the machine configurations. It sounds like it was just the
standard ones from OpenXT which most of your builds would loop through.
I did try and reproduce the hash using those but I could be missing
something.

> The standard OpenXT build loops through building all 8 images and
> packaging them up into an installer iso.  Often I run that build
> script, but sometimes I just build individual machines manually.
>
> I was mainly working on the core2-64 machines immediately prior to
> this event.  I was very surprised when it occured since 1) I didn't
> expect binutils to be re-built and 2) I wasn't working on the
> openxt-installer machine which failed.
> 
> > Was TMPDIR ever cleaned? If not, do you have the python-async
> > WORKDIR
> > for core2-32? The TMPDIR/logs directory may also have useful hints
> > about the configurations built...
> 
> Unfortunately, yes, I cleaned TMPDIR when I hit the build
> error.  Same with the sstate-cache.
> 
> In general, I don't see python-async in TMPDIR after running through
> the OpenXT build.  Would that be because an early machine builds
> python-async, but then it gets cleared out of TMPDIR when a later
> machine/image are built?
>
> > > 
[...]
> All the base OpenXT machines have "-mstackrealign" in their conf.  My
> new 64bit machines do not have it.  I don't recall working with
> core2-32 MACHINES at the time.  The new layer I pulled in only had a
> layer.conf and a 64bit machine.conf.
> 
> In my second container, I `rm -rf cache/ tmp-glibc/ sstate-cache/`.
> Running the build of the first OpenXT machine, bb_codeparse.dat gets
> populated with python-async:
> '3c6fe664c51d2f793f8fd0eb103d68cb': frozenset({'find', 'sed',
> 'install', 'mv', 'bbfatal_log', 'rmdir', '[', 'rm',
> '/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-
> linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-
> native/python',
> 'test'})
> 
> python-async is not in tmp-glibc/work and `grep -r tmp-glibc/log`
> doesn't turn up anything.  If I run `bitbake -g`, python-async
> doesn't
> appear in any of the output files.  Is bb_codeparser.data getting
> populated without building the recipe to be expected?

The data is in the codeparser cache, which is first populated at parse
time, so it's enough just to parse the machine+recipe in question, not
build it. I think that explains the answer to a few of your questions
above.

Sorry for asking so many questions btw, I'd just really love to be able
to reproduce this issue! Thanks for trying to help answer them too!

Is the bitbake-cookerdaemon.log file still there for this build (in the
top level build directory)?

Cheers,

Richard



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Mis-generation of shell script (run.do_install)?
  2019-01-16 20:28                         ` Richard Purdie
@ 2019-01-17 17:10                           ` Jason Andryuk
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Andryuk @ 2019-01-17 17:10 UTC (permalink / raw)
  To: Richard Purdie; +Cc: OE Core mailing list

On Wed, Jan 16, 2019 at 3:28 PM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
> The data is in the codeparser cache which is first populated at parse
> time so its enough just to parse the machine+recipe in question, not
> build it. I think that explains the answer to a ew of your questions
> above.

Yes, thanks.

> Sorry for asking so many questions btw, I'd just really love to be able
> to reproduce this issue! Thanks for trying to help answer them too!
>
> Is the bitbake-cookerdeamon.log file still there for this build (in the
> top level build directory)?

I don't seem to have this file in any of my OpenXT builds.

I still have the "bad" bb_codeparser.dat file.  It is 30MB whereas the
new one is only 6.5MB.  I thought it may be excessively large, but I
actually have an 80MB one in a different build directory.

Anyway, it has 4 different entries that look like core2-32
python-async do_install()s:

3c6fe664c51d2f793f8fd0eb103d68cb - reproduces currently
3df9018676de219bb3e46e88eea09c98 - one matching binutils core2-64 do_install
382871fb17743ba9635d7efc4db7d993
ee6850bdcf70ba63dea37e09c78c599f

They all have
frozenset({'[', 'mv', 'test',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python',
'sed', 'install', 'bbfatal_log', 'find', 'rm', 'rmdir'})

Eyeballing distutils_do_install, I don't see what could produce so
many variations.

Going into the new, clean build container, I can see those last two
hashes with different entries:
>>> d['ee6850bdcf70ba63dea37e09c78c599f']
frozenset({'tr', 'rm', 'sed', 'ln', 'cd', 'oe_multilib_header',
'autotools_do_install', 'echo', 'basename', 'install'})
>>> d['382871fb17743ba9635d7efc4db7d993']
frozenset({'tr', 'rm', 'sed', 'ln', 'cd', 'oe_multilib_header',
'autotools_do_install', 'echo', 'basename', 'install'})

and the expected core2-32 python-async do_install
>>> d['3c6fe664c51d2f793f8fd0eb103d68cb']
frozenset({'bbfatal_log', 'rm', 'test', 'sed', '[', 'rmdir',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python',
'find', 'install', 'mv'})

I've only run one core2-32 build in the fresh container, so there is no
64-bit binutils entry at the original collision hash
3df9018676de219bb3e46e88eea09c98.

Ok, I hacked up a script to check two bb_codeparser.dat files for
collisions.  Compare the current one with the "bad" one:
$ ./pickle-cmp.py cache/bb_codeparser.dat cache/bb_codeparser.dat.old-bad-one
Collision ee6850bdcf70ba63dea37e09c78c599f
frozenset({'echo', 'rm', 'autotools_do_install', 'tr',
'oe_multilib_header', 'cd', 'basename', 'sed', 'ln', 'install'})
frozenset({'find', 'test', 'rm', 'bbfatal_log', '[', 'sed', 'mv',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python',
'rmdir', 'install'})
Collision 382871fb17743ba9635d7efc4db7d993
frozenset({'echo', 'rm', 'autotools_do_install', 'tr',
'oe_multilib_header', 'cd', 'basename', 'sed', 'ln', 'install'})
frozenset({'find', 'test', 'rm', 'bbfatal_log', '[', 'sed', 'mv',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python',
'rmdir', 'install'})
Collision 5254083eac08e32fc68bc9421d7df287
frozenset({'autotools_do_install', 'rm', 'sed', 'touch', 'install'})
frozenset({'/etc/init.d/xenclient-boot-sound', 'true', ':', '['})
Collision d0701fd5c05175aeafc06d8ce34d3532
frozenset({'create-cracklib-dict', 'autotools_do_install'})
frozenset({'/etc/init.d/gateone', 'true', ':', '['})
Collision ec332415bd96520823ba383494e7a9a7
frozenset({'ln', 'popd', ':', 'pushd'})
frozenset({'DEPLOY_DIR', 'useradd_preinst', 'perform_useradd', 'PKGD',
'PKGDEST', 'pkg_preinst', 'MLPREFIX', 'perform_groupadd', 'PN',
'perform_groupmems', 'PACKAGES', 'NOAUTOPACKAGEDEBUG',
'USERADD_PACKAGES', 'WORKDIR'})
Collision 3df9018676de219bb3e46e88eea09c98
frozenset({'echo', 'rm', 'autotools_do_install', 'tr',
'oe_multilib_header', 'cd', 'basename', 'sed', 'ln', 'install'})
frozenset({'find', 'test', 'rm', 'bbfatal_log', '[', 'sed', 'mv',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-async/0.6.2-r0/recipe-sysroot-native/usr/bin/python-native/python',
'rmdir', 'install'})
Collision 0aa15eb469ad8854cda0b0675217b8f6
frozenset({'find', 'test', 'rm', 'bbfatal_log', '[', 'sed', 'mv',
'/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/python-mock/2.0.0-r0/recipe-sysroot-native/usr/bin/python-native/python',
'rmdir', 'install'})
frozenset({'oe_runmake', 'find', 'true', 'test', 'echo', 'chmod',
'rm', 'mkdir', '[', 'oe_multilib_header', 'cd', 'lnr', 'basename',
'continue', 'mv', 'ln', 'local', 'install'})

Compare the current one with the fresh one from the other container (build4):
$ ./pickle-cmp.py cache/bb_codeparser.dat build4-codeparser.dat
Collision d0701fd5c05175aeafc06d8ce34d3532
frozenset({'create-cracklib-dict', 'autotools_do_install'})
frozenset({'[', ':', '/etc/init.d/gateone', 'true'})
Collision 5254083eac08e32fc68bc9421d7df287
frozenset({'touch', 'install', 'sed', 'autotools_do_install', 'rm'})
frozenset({'[', '/etc/init.d/xenclient-boot-sound', ':', 'true'})
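
(The script does nothing clever - roughly the check below, assuming d1 and
d2 are the hash -> frozenset-of-execs mappings pulled out of the two
bb_codeparser.dat files; the loading step is left out here:)

def compare_caches(d1, d2):
    # Report keys present in both caches whose execs sets disagree.
    for key in sorted(set(d1) & set(d2)):
        if d1[key] != d2[key]:
            print("Collision %s" % key)
            print(d1[key])
            print(d2[key])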

I figured I could reproduce the hash collisions for
d0701fd5c05175aeafc06d8ce34d3532, but I cannot.

gateone's update-rc.d updatercd_prerm matches d0701fd5c05175aeafc06d8ce34d3532.
(Below, it should be a tab before /etc/init.d/gateone, but I can't
insert that because of gmail.)
'''
# Begin section update-rc.d
if true && [ -z "$D" -a -x "/etc/init.d/gateone" ]; then
        /etc/init.d/gateone stop || :
fi
# End section update-rc.d
'''

cracklib's do_install is only two lines, but I cannot reproduce the
d0701fd5c05175aeafc06d8ce34d3532 hash from it.
(Below, it should be a tab before create-cracklib-dict.)
$ sed -n '/^do_install/,/^}/p'
tmp-glibc/work/core2-32-oe-linux/cracklib/2.9.5-r0/temp/run.do_install
| head -n -7 | tail -n 2 | tee /dev/stderr | md5sum
    autotools_do_install
       create-cracklib-dict -o
/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/cracklib/2.9.5-r0/image/usr/share/cracklib/pw_dict
/home/build/openxt-compartments/build/tmp-glibc/work/core2-32-oe-linux/cracklib/2.9.5-r0/image/usr/share/cracklib/cracklib-small
8ca3daacb394a11f38c851a8d71ed4de  -

#current
$ ./i-m-in-a-pickle.py cache/bb_codeparser.dat cracklib
3933316
'8ca3daacb394a11f38c851a8d71ed4de': frozenset({'create-cracklib-dict',
'autotools_do_install'})
5050048
'b50633e8f154817d7cc3ec94c6405379': frozenset({'create-cracklib-dict',
'autotools_do_install'})
5641024
'd0701fd5c05175aeafc06d8ce34d3532': frozenset({'create-cracklib-dict',
'autotools_do_install'})

#old bad
$ ./i-m-in-a-pickle.py cache/bb_codeparser.dat.old cracklib
13503349
'8ca3daacb394a11f38c851a8d71ed4de': frozenset({'autotools_do_install',
'create-cracklib-dict'})
39348111
'b50633e8f154817d7cc3ec94c6405379': frozenset({'autotools_do_install',
'create-cracklib-dict'})

#fresh
$ ./i-m-in-a-pickle.py build4-codeparser.dat cracklib
2085662
'8ca3daacb394a11f38c851a8d71ed4de': frozenset({'create-cracklib-dict',
'autotools_do_install'}),

b50633e8f154817d7cc3ec94c6405379 is the core2-64 cracklib do_install
$ sed -n '/^do_install/,/^}/p'
tmp-glibc/work/core2-64-oe-linux/cracklib/2.9.5-r0/temp/run.do_install
| head -n -7 | tail -n 2 | tee /dev/stderr | md5sum
    autotools_do_install
    create-cracklib-dict -o
/home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/cracklib/2.9.5-r0/image/usr/share/cracklib/pw_dict
/home/build/openxt-compartments/build/tmp-glibc/work/core2-64-oe-linux/cracklib/2.9.5-r0/image/usr/share/cracklib/cracklib-small
b50633e8f154817d7cc3ec94c6405379  -

But where did d0701fd5c05175aeafc06d8ce34d3532 come from?

When pickling the cache for writing to disk, could a dangling
pointer/index/reference be left such that a given hash's frozenset
entry changes?

An aside - entries include the standard linux utilities & shell
builtins, but those are always available.  If a given shell cache
entry only relies on those utilities, it could collide and still run.
It's only an issue when a shell function or special utility is needed.

Regards,
Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread
