* broken sstate archives
@ 2018-10-30 15:45 André Draszik
  2018-10-30 16:49 ` Richard Purdie
  2018-11-12 12:56 ` André Draszik
  0 siblings, 2 replies; 7+ messages in thread
From: André Draszik @ 2018-10-30 15:45 UTC (permalink / raw)
  To: openembedded-core

Hi,

After updating poky from 028a292001f64ad86c6b960a05ba1f6fd72199de (end of
July) to 3b77e7b7852549dcfbc426d4ce258e6e857c0acd (mid-October), at least
two broken sstate archives have been created:

* the sstate archive for update-rc.d package_write_ipk contains a broken
  main package update-rc.d_0.8-r0_all.ipk
* zlib populate_sysroot has a broken sysroot-destdir/lib/libz.so.1.2.11

For the update-rc.d case:
* tar tzvvf displays a reasonable size for all files inside the sstate
  archive
* tar xzf extracts all files and sets a size on update-rc.d_0.8-r0_all.ipk,
  but its content is all NULs, hence it is broken
* for those who know Midnight Commander, its 'open' displays a size of
  0 bytes for update-rc.d_0.8-r0_all.ipk in the first place
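
The "all NULs" symptom can be checked from a shell after extracting an
archive. This is only a minimal sketch: the helper name is_all_nul is made
up for illustration, and it assumes nothing beyond POSIX wc and tr.

```shell
#!/bin/sh
# is_all_nul FILE  (hypothetical helper name, for illustration only)
# Succeeds only if FILE is non-empty and every byte in it is NUL.
is_all_nul() {
    # Total size in bytes; fails if FILE cannot be read.
    size=$(wc -c < "$1") || return 1
    [ "$size" -gt 0 ] || return 1
    # Number of bytes left once every NUL byte has been stripped.
    rest=$(tr -d '\000' < "$1" | wc -c)
    [ "$rest" -eq 0 ]
}
```

For example, running is_all_nul update-rc.d_0.8-r0_all.ipk && echo broken
on the extracted file would flag the case described above.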


For the broken zlib sstate archive, things are similar. Additionally:
* the zlib ipk packages (and their contents) contained inside
  sstate_zlib_*_package_write_ipk.tgz are actually not broken


The original (first) build resulting from my poky.git update had actually
completed successfully. It is only subsequent builds trying to use the
generated sstate artefacts that now don't work.

I can't say for sure whether or not other sstate artefacts are broken, too.


Any ideas how this could have happened? Have similar issues been seen
before?


Cheers,
Andre'





* Re: broken sstate archives
  2018-10-30 15:45 broken sstate archives André Draszik
@ 2018-10-30 16:49 ` Richard Purdie
  2018-11-01 12:22   ` André Draszik
  2018-11-12 12:56 ` André Draszik
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Purdie @ 2018-10-30 16:49 UTC (permalink / raw)
  To: André Draszik, openembedded-core

On Tue, 2018-10-30 at 15:45 +0000, André Draszik wrote:
> Having updated poky from 028a292001f64ad86c6b960a05ba1f6fd72199de
> (end-of
> July) to 3b77e7b7852549dcfbc426d4ce258e6e857c0acd (mid October), at
> least
> two broken sstate archives have been created:
> 
> * the sstate archive for update-rc.d package_write_ipk contains a
> broken
>   main package update-rc.d_0.8-r0_all.ipk
> * zlib populate_sysroot has a broken sysroot-
> destdir/lib/libz.so.1.2.11
> 
> For the update-rc.d case:
> * tar tzvvf displays a reasonable size for all files inside the
> sstate
>   archive
> * tar xzf extracts all files and sets a size on update-rc.d_0.8-
> r0_all.ipk,
>   but it's all NULs, and hence is broken
> * for those who know midnight commander, it's 'open' displays a size
> of
>   0 bytes for update-rc.d_0.8-r0_all.ipk in the first place
> 
> 
> For the broken zlib sstate archive, things are similar, additionally:
> * the zlib ipk packages (and their contents) contained inside
>   sstate_zlib_*_package_write_ipk.tgz are actually not broken
> 
> 
> The original (first) build resulting from my poky.git update had
> actually
> completed successfully. It is only subsequent builds trying to use
> the
> generated sstate artefacts that now don't work.
> 
> I can't say for sure whether or not other sstate artefacts are
> broken, too.
> 
> 
> Any ideas how this could have happened? Have similar issues been seen
> before?

I've not seen/heard of any reports of that before and it is worrying.
The question is can you reproduce it? As things stand that is a little
bit of a hard one to replicate/debug :(

Which host OS and filesystem was it?

Cheers,

Richard




* Re: broken sstate archives
  2018-10-30 16:49 ` Richard Purdie
@ 2018-11-01 12:22   ` André Draszik
  2018-11-01 13:07     ` richard.purdie
  0 siblings, 1 reply; 7+ messages in thread
From: André Draszik @ 2018-11-01 12:22 UTC (permalink / raw)
  To: Richard Purdie, openembedded-core

On Tue, 2018-10-30 at 16:49 +0000, Richard Purdie wrote:
> On Tue, 2018-10-30 at 15:45 +0000, André Draszik wrote:
> > Having updated poky from 028a292001f64ad86c6b960a05ba1f6fd72199de
> > (end-of
> > July) to 3b77e7b7852549dcfbc426d4ce258e6e857c0acd (mid October), at
> > least
> > two broken sstate archives have been created:
> > 
> > * the sstate archive for update-rc.d package_write_ipk contains a
> > broken
> >   main package update-rc.d_0.8-r0_all.ipk
> > * zlib populate_sysroot has a broken sysroot-
> > destdir/lib/libz.so.1.2.11
> > 
> > For the update-rc.d case:
> > * tar tzvvf displays a reasonable size for all files inside the
> > sstate
> >   archive
> > * tar xzf extracts all files and sets a size on update-rc.d_0.8-
> > r0_all.ipk,
> >   but it's all NULs, and hence is broken
> > * for those who know midnight commander, it's 'open' displays a size
> > of
> >   0 bytes for update-rc.d_0.8-r0_all.ipk in the first place
> > 
> > 
> > For the broken zlib sstate archive, things are similar, additionally:
> > * the zlib ipk packages (and their contents) contained inside
> >   sstate_zlib_*_package_write_ipk.tgz are actually not broken
> > 
> > 
> > The original (first) build resulting from my poky.git update had
> > actually
> > completed successfully. It is only subsequent builds trying to use
> > the
> > generated sstate artefacts that now don't work.
> > 
> > I can't say for sure whether or not other sstate artefacts are
> > broken, too.
> > 
> > 
> > Any ideas how this could have happened? Have similar issues been seen
> > before?
> 
> I've not seen/heard of any reports of that before and it is worrying.
> The question is can you reproduce it? As things stand that is a little
> bit of a hard one to replicate/debug :(
> 
> Which host OS and filesystem was it?

I now have continuous rebuilds of the scenario described above running on
two different physical machines. So far, one of the two (the faster one)
produces broken sstate archives every now and then, in a way similar to the
description above, for varying recipes and/or varying files inside a given
sstate archive.

Unfortunately I don't know which of the two physical machines was involved
in my original description.

The faster machine is running Debian 9.5 with kernel 4.9.110-3+deb9u5, the
other Debian unstable with kernel 4.18.8-1. Builds happen inside a Debian 9
docker container in any case. Both machines use btrfs.


Nevertheless, I'm tempted to rule out hardware issues, because:
* The sstate archives are either just archiving the output of
  do_populate_sysroot (which itself is just hard-linking the output of
  do_install), and that same do_populate_sysroot output is also used during
  compilation of dependent recipes
* Or the sstate is created by archiving the output of do_package_write_ipk,
  where again the same IPKs (hard-linked) are used to build the final image
* The defect seems to be inside the .tar, but an sstate archive is a
  .tar.gz, and no temporary uncompressed .tar is stored in the file-system
  to create the compressed .tar.gz
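
Since the tree being archived stays on disk and is used by later tasks,
this reasoning can be tested by unpacking a freshly written archive and
comparing it with that tree. The following is a minimal sketch under
assumptions, not what sstate.bbclass does: verify_archive is a made-up
name, and it assumes a tar that understands -z, plus diff -r.

```shell
#!/bin/sh
# verify_archive ARCHIVE SRCDIR  (hypothetical helper name)
# Unpacks the gzip-compressed tar ARCHIVE into a scratch directory and
# compares it recursively against SRCDIR.
# Returns 0 only if extraction succeeds and both trees match.
verify_archive() {
    archive=$1
    srcdir=$2
    scratch=$(mktemp -d) || return 1
    if tar -C "$scratch" -xzf "$archive" &&
       diff -r "$srcdir" "$scratch" >/dev/null; then
        rc=0
    else
        rc=1
    fi
    # Always clean up the scratch copy.
    rm -rf "$scratch"
    return $rc
}
```

A mismatch here while the source tree itself is intact would point at the
archive creation step rather than at the files being archived. Note that
diff -r compares content only, not ownership or permissions.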

Please correct me if this reasoning is flawed.


I'll play with a few bits and variations to try to narrow things down, but
given the nature of the problem it'll likely take a while to get more
insights.


Cheers,
Andre'





* Re: broken sstate archives
  2018-11-01 12:22   ` André Draszik
@ 2018-11-01 13:07     ` richard.purdie
  0 siblings, 0 replies; 7+ messages in thread
From: richard.purdie @ 2018-11-01 13:07 UTC (permalink / raw)
  To: André Draszik, openembedded-core

On Thu, 2018-11-01 at 12:22 +0000, André Draszik wrote:
> On Tue, 2018-10-30 at 16:49 +0000, Richard Purdie wrote:
> > I've not seen/heard of any reports of that before and it is
> > worrying.
> > The question is can you reproduce it? As things stand that is a
> > little
> > bit of a hard one to replicate/debug :(
> > 
> > Which host OS and filesystem was it?
> 
> I now have continuous rebuilds of above described scenario on two
> different
> physical machines and am getting broken sstate archives in a similar
> way to
> the description above every now and then for varying recipes and/or
> varying
> files inside a given sstate archive on one of the two machines (the
> faster
> one) so far.
> 
> Unfortunately I don't know which of the two physical machines was
> involved
> in my original description.
> 
> The faster machine is running Debian 9.5 and kernel 4.9.110-3+deb9u5
> the
> other Debian unstable and kernel 4.18.8-1 - builds happen inside a
> Debian 9
> docker container in any case. Both machines use btrfs.
> 
> 
> Nevertheless I'm tempted to rule out hardware issues, though because:
> * The sstate archives are either just archiving the output of
>   do_populate_sysroot (which itself is just hard-linking the output
> of
>   do_install), and that same do_populate_sysroot output is also used
> during
>   compilation of dependent recipes
> * Or the sstate is created by archiving the output of
> do_package_write_ipk,
>   where again the same IPKs (hard-linked) are used to build the final
> image
> * The defect seems to be inside the .tar, but an sstate archive is a
>   .tar.gz, and no temporary uncompressed .tar is stored in the file-
> system
>   to create the compressed .tar.gz
> 
> Please correct me if this reasoning is flawed.

I think the reasoning is reasonable. My prime suspect would probably be
btrfs and if you have the space/capability, I'd be tempted to try
builds on ext4 partitions on those machines...

> I'll play with a few bits and variations to try to narrow things
> down, but given the nature it'll likely take a while to get more
> insights.

Yes, problems like this tend to be "fun" to track down :(

Cheers,

Richard




* Re: broken sstate archives
  2018-10-30 15:45 broken sstate archives André Draszik
  2018-10-30 16:49 ` Richard Purdie
@ 2018-11-12 12:56 ` André Draszik
  2018-11-12 12:56   ` [DONT-MERGE] sstate: add hack to detect sstate archive compression failures André Draszik
  2018-11-12 14:07   ` broken sstate archives richard.purdie
  1 sibling, 2 replies; 7+ messages in thread
From: André Draszik @ 2018-11-12 12:56 UTC (permalink / raw)
  To: openembedded-core

As a follow-up and for future reference: after countless builds, I am
confident enough that a kernel update solved the problem:
- linux (Debian 9) versions 4.9.110-3+deb9u6 and 4.9.110-3+deb9u5 both
  exhibit the problem
- 4.18.6-1~bpo9+1 doesn't

Something must have gone into the btrfs driver to fix the issue. I didn't
really spend the time to try and find the relevant commit.

For reference, I am also sending my hackish patch to poky, which tries to
detect a broken archive early and, if it finds one, simply runs tar again.
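
The detection idea (a file that reports a non-zero size but no allocated
blocks) can also be tried standalone on extracted files. This is a rough
sketch only: has_no_blocks is a made-up name, it assumes GNU stat with the
-c option, and the heuristic may misreport on filesystems that account for
blocks differently.

```shell
#!/bin/sh
# has_no_blocks FILE  (hypothetical helper name; assumes GNU stat)
# Succeeds if FILE reports a non-zero size but zero allocated blocks.
has_no_blocks() {
    # %s = size in bytes, %b = number of allocated blocks
    info=$(stat -c '%s %b' "$1") || return 1
    size=${info% *}
    blocks=${info#* }
    [ "$size" -gt 0 ] && [ "$blocks" -eq 0 ]
}
```

It could be run over an extracted tree with a loop such as
find DIR -type f | while read -r f; do has_no_blocks "$f" && echo "$f"; done
(which assumes file names without newlines).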

Thanks, Richard, for all your input.

Cheers,
Andre'





* [DONT-MERGE] sstate: add hack to detect sstate archive compression failures
  2018-11-12 12:56 ` André Draszik
@ 2018-11-12 12:56   ` André Draszik
  2018-11-12 14:07   ` broken sstate archives richard.purdie
  1 sibling, 0 replies; 7+ messages in thread
From: André Draszik @ 2018-11-12 12:56 UTC (permalink / raw)
  To: openembedded-core

From: André Draszik <andre.draszik@jci.com>

Signed-off-by: André Draszik <andre.draszik@jci.com>
---
 meta/classes/sstate.bbclass | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index efb0096c70..b0ba07cacb 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -731,6 +731,8 @@ sstate_create_package () {
 
 	# Need to handle empty directories
 	if [ "$(ls -A)" ]; then
+	    ctr=0
+	    while [ $ctr -lt 2 ] ; do
 		set +e
 		tar $OPT -f $TFILE *
 		ret=$?
@@ -738,6 +740,33 @@ sstate_create_package () {
 			exit 1
 		fi
 		set -e
+
+		TDIR=`mktemp -d ${SSTATE_PKG}.extracted.XXXXXXXX`
+		tar -C $TDIR -xzf $TFILE
+		export TFILE
+		find $TDIR -type f -exec sh -ceu '
+			for c in "$@" ; do
+				if stat "$c" | grep -v "Size: 0" | grep -q "Blocks: 0" ; then
+					echo File $c in archive $TFILE is broken
+					echo "$TFILE -> $c" >> "$TFILE.broken"
+				fi
+			done
+		' _ '{}' +
+		rm -rf $TDIR
+		if [ ! -s "$TFILE.broken" ] ; then
+			if [ $ctr -ne 0 ] ; then
+				bbwarn "$TFILE created successfully on attempt # $ctr"
+			fi
+			break
+		fi
+		bbwarn "$TFILE is broken, retrying"
+		bbwarn "`cat $TFILE.broken`"
+		rm $TFILE.broken
+		ctr=$(expr $ctr + 1)
+	    done
+	    if [ $ctr -ge 2 ] ; then
+		bbfatal "Could not compress sstate"
+	    fi
 	else
 		tar $OPT --file=$TFILE --files-from=/dev/null
 	fi
-- 
2.19.1




* Re: broken sstate archives
  2018-11-12 12:56 ` André Draszik
  2018-11-12 12:56   ` [DONT-MERGE] sstate: add hack to detect sstate archive compression failures André Draszik
@ 2018-11-12 14:07   ` richard.purdie
  1 sibling, 0 replies; 7+ messages in thread
From: richard.purdie @ 2018-11-12 14:07 UTC (permalink / raw)
  To: André Draszik, openembedded-core

On Mon, 2018-11-12 at 12:56 +0000, André Draszik wrote:
> As a follow-up and as a reference for the future - having done
> countless
> builds I am confident enough that a kernel update solved the problem:
> - linux (Debian 9) versions 4.9.110-3+deb9u6 and 4.9.110-3+deb9u5
> both
>   exhibit the problem
> - 4.18.6-1~bpo9+1 doesn't
> 
> Something must have gone into the btrfs driver to fix the issue. I
> didn't
> really spend the time to try and find the relevant commit.
> 
> For reference, I am also sending my hackish patch to poky to detect
> the error case early, which (tries) to detect the broken archive, and
> simply runs tar again.
> 
> Thanks Richard for all your inputs.

Glad you kind of have an answer; these issues can be challenging to
debug. FWIW it sounds like the right kind of problem/solution to fit
the symptoms...

Cheers,

Richard



