* [PATCH 0/2] Avoid build failures due to setscene errors
@ 2017-08-29 20:00 Peter Kjellerstedt
  2017-08-29 20:00 ` [PATCH 1/2] bitbake: fetch2: Allow Fetch.download() to warn instead of error Peter Kjellerstedt
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Peter Kjellerstedt @ 2017-08-29 20:00 UTC (permalink / raw)
  To: openembedded-core

Occasionally, we see errors on our autobuilders where a setscene task
fails to retrieve a file from our global sstate cache. It typically
looks something like this:

WARNING: zip-3.0-r2 do_populate_sysroot_setscene: Failed to fetch URL
file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz, attempting
MIRRORS if available
ERROR: zip-3.0-r2 do_populate_sysroot_setscene: Fetcher failure:
Unable to find file
file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz anywhere. The
paths that were searched were:
/home/pkj/.openembedded/sstate-cache
ERROR: zip-3.0-r2 do_populate_sysroot_setscene: No suitable staging
package found
WARNING: Setscene task
(meta/recipes-extended/zip/zip_3.0.bb:do_populate_sysroot_setscene)
failed with exit code '1' - real task will be run instead

As the last warning indicates, the build will proceed and the real task
will run and the build will eventually complete. However, due to the two
errors above, bitbake will return with an error code which causes the
autobuilder to treat the build as failed and it proceeds to throw
everything it built away.

Since this is quite pointless and causes unnecessary build resources to
be spent and grief from the developers, the two patches in this change
set turn the errors from setscene tasks into warnings.

//Peter

The following changes since commit bc2e0b2e9b95707d96c840dade12b00e1450ecc3:

  libsdl: Move PACKAGECONFIG options from meta-mingw (2017-08-29 12:23:10 +0100)

are available in the git repository at:

  git://git.yoctoproject.org/poky-contrib pkj/setscene-errors
  http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=pkj/setscene-errors

Peter Kjellerstedt (2):
  bitbake: fetch2: Allow Fetch.download() to warn instead of error
  sstate.bbclass: Do not cause build failures due to setscene errors

 bitbake/lib/bb/fetch2/__init__.py | 20 +++++++++++++++-----
 meta/classes/sstate.bbclass       |  5 +++--
 2 files changed, 18 insertions(+), 7 deletions(-)

--
2.12.0
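
The effect described in the cover letter, where a build that completes
still exits non-zero because an ERROR was logged along the way, can be
sketched in plain Python. This is not BitBake's code: ErrorCounter,
run_build, exit_code and demo are hypothetical names, and the sketch only
assumes that the exit status follows the number of error-level log
records rather than the final outcome of the tasks.

```python
import logging


class ErrorCounter(logging.Handler):
    """Count records logged at ERROR level or above (hypothetical helper)."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.errors = 0

    def emit(self, record):
        self.errors += 1


def exit_code(counter):
    # Any logged error makes the run "fail", even if all work completed.
    return 1 if counter.errors else 0


def run_build(logger, setscene_level):
    # The setscene step fails and reports at the given level...
    logger.log(setscene_level, "No suitable staging package found")
    # ...but the real task runs instead and succeeds.
    logger.info("real task completed")


def demo(setscene_level):
    logger = logging.getLogger("build_sketch.%d" % setscene_level)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    counter = ErrorCounter()
    logger.addHandler(counter)
    try:
        run_build(logger, setscene_level)
    finally:
        logger.removeHandler(counter)
    return exit_code(counter)
```

Calling demo(logging.ERROR) models the current behaviour and
demo(logging.WARNING) models the behaviour the series aims for.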
* [PATCH 1/2] bitbake: fetch2: Allow Fetch.download() to warn instead of error
  2017-08-29 20:00 [PATCH 0/2] Avoid build failures due to setscene errors Peter Kjellerstedt
@ 2017-08-29 20:00 ` Peter Kjellerstedt
  2017-08-29 20:00 ` [PATCH 2/2] sstate.bbclass: Do not cause build failures due to setscene errors Peter Kjellerstedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Peter Kjellerstedt @ 2017-08-29 20:00 UTC (permalink / raw)
  To: openembedded-core

Under some situations it can be allowed for Fetch.download() to fail to
fetch a file without causing bitbake to fail. By adding only_warn=True
as argument to Fetch.download(), it will call logger.warning() instead
of logger.error() and thus not cause build failures.

Signed-off-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
---
 bitbake/lib/bb/fetch2/__init__.py | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py
index 3eb0e4d211..58f65ada84 100644
--- a/bitbake/lib/bb/fetch2/__init__.py
+++ b/bitbake/lib/bb/fetch2/__init__.py
@@ -1608,9 +1608,10 @@ class Fetch(object):

         return local

-    def download(self, urls=None):
+    def download(self, urls=None, only_warn=False):
         """
-        Fetch all urls
+        Fetch all urls. In case only_warn is True, a failure to fetch a url
+        will only result in a warning message, rather than an error message.
         """
         if not urls:
             urls = self.urls
@@ -1688,19 +1689,28 @@ class Fetch(object):

                 if not localpath or ((not os.path.exists(localpath)) and localpath.find("*") == -1):
                     if firsterr:
-                        logger.error(str(firsterr))
+                        if only_warn:
+                            logger.warning(str(firsterr))
+                        else:
+                            logger.error(str(firsterr))
                     raise FetchError("Unable to fetch URL from any source.", u)

                 update_stamp(ud, self.d)

             except IOError as e:
                 if e.errno in [os.errno.ESTALE]:
-                    logger.error("Stale Error Observed %s." % u)
+                    if only_warn:
+                        logger.warning("Stale Error Observed %s." % u)
+                    else:
+                        logger.error("Stale Error Observed %s." % u)
                     raise ChecksumError("Stale Error Detected")

             except BBFetchException as e:
                 if isinstance(e, ChecksumError):
-                    logger.error("Checksum failure fetching %s" % u)
+                    if only_warn:
+                        logger.warning("Checksum failure fetching %s" % u)
+                    else:
+                        logger.error("Checksum failure fetching %s" % u)
                 raise

             finally:
--
2.12.0
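
The only_warn idea from the patch above can be illustrated outside of
BitBake with a minimal sketch. Everything here is an assumption made for
illustration: FetchError is a local stand-in (not bb.fetch2.FetchError),
fetch_one is a hypothetical callable that retrieves a single URL, and
the reporting function is picked once rather than at each call site as
the patch does.

```python
import logging

logger = logging.getLogger("fetch_sketch")


class FetchError(Exception):
    """Stand-in for a fetcher exception (not bb.fetch2.FetchError)."""


def download(urls, fetch_one, only_warn=False):
    """Fetch every URL; report failures as warnings if only_warn is set."""
    # Choose the severity once instead of branching at every log call.
    report = logger.warning if only_warn else logger.error
    for url in urls:
        try:
            fetch_one(url)
        except OSError as exc:
            report("Unable to fetch %s: %s", url, exc)
            # The failure is still propagated; only its log severity changes.
            raise FetchError(url) from exc
```

The caller still sees the exception either way, which matches the intent
in the patch: the setscene task fails and the real task runs, but no
error-level message is recorded.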
* [PATCH 2/2] sstate.bbclass: Do not cause build failures due to setscene errors
  2017-08-29 20:00 [PATCH 0/2] Avoid build failures due to setscene errors Peter Kjellerstedt
  2017-08-29 20:00 ` [PATCH 1/2] bitbake: fetch2: Allow Fetch.download() to warn instead of error Peter Kjellerstedt
@ 2017-08-29 20:00 ` Peter Kjellerstedt
  2017-08-29 20:04 ` ✗ patchtest: failure for Avoid " Patchwork
  2017-08-29 20:38 ` [PATCH 0/2] " Andre McCurdy
  3 siblings, 0 replies; 18+ messages in thread
From: Peter Kjellerstedt @ 2017-08-29 20:00 UTC (permalink / raw)
  To: openembedded-core

If a setscene task fails, the real task will be run instead. However, in
case the failed setscene task happened to log any errors, this will
still cause bitbake to return with an error code, even though
everything actually built ok.

To avoid this, modify setscene to only warn about errors.

Signed-off-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
---
 meta/classes/sstate.bbclass | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index 6af0d388bc..7d76ac141b 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -671,7 +671,7 @@ def pstaging_fetch(sstatefetch, sstatepkg, d):
         localdata.setVar('SRC_URI', srcuri)
         try:
             fetcher = bb.fetch2.Fetch([srcuri], localdata, cache=False)
-            fetcher.download()
+            fetcher.download(only_warn=True)

         except bb.fetch2.BBFetchException:
             break
@@ -680,7 +680,8 @@ def sstate_setscene(d):
     shared_state = sstate_state_fromvars(d)
     accelerate = sstate_installpkg(shared_state, d)
     if not accelerate:
-        bb.fatal("No suitable staging package found")
+        bb.warn("No suitable staging package found")
+        sys.exit(1)

 python sstate_task_prefunc () {
     shared_state = sstate_state_fromvars(d)
--
2.12.0
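
The second hunk above replaces a fatal error with a warning followed by
an explicit non-zero exit. A minimal sketch of that pattern in plain
Python follows. It is an illustration only: install_package is a
hypothetical callable standing in for sstate_installpkg(), and the
standard logging module is used in place of bb.warn().

```python
import logging
import sys

logger = logging.getLogger("sstate_sketch")


def sstate_setscene(install_package):
    """Fail the setscene step without logging at error level.

    install_package is a hypothetical stand-in that returns True when
    the cached package could be installed.
    """
    if not install_package():
        logger.warning("No suitable staging package found")
        # A non-zero exit still signals that the real task must run.
        sys.exit(1)
    return True
```

The point of the design is that the failure signal (the exit code) and
the severity of the message are decoupled.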
* ✗ patchtest: failure for Avoid build failures due to setscene errors
  2017-08-29 20:00 [PATCH 0/2] Avoid build failures due to setscene errors Peter Kjellerstedt
  2017-08-29 20:00 ` [PATCH 1/2] bitbake: fetch2: Allow Fetch.download() to warn instead of error Peter Kjellerstedt
  2017-08-29 20:00 ` [PATCH 2/2] sstate.bbclass: Do not cause build failures due to setscene errors Peter Kjellerstedt
@ 2017-08-29 20:04 ` Patchwork
  2017-08-29 20:25   ` Peter Kjellerstedt
  2017-08-29 20:38 ` [PATCH 0/2] " Andre McCurdy
  3 siblings, 1 reply; 18+ messages in thread
From: Patchwork @ 2017-08-29 20:04 UTC (permalink / raw)
  To: Peter Kjellerstedt; +Cc: openembedded-core

== Series Details ==

Series: Avoid build failures due to setscene errors
Revision: 1
URL   : https://patchwork.openembedded.org/series/8575/
State : failure

== Summary ==


Thank you for submitting this patch series to OpenEmbedded Core. This is
an automated response. Several tests have been executed on the proposed
series by patchtest resulting in the following failures:



* Issue             Series does not apply on top of target branch [test_series_merge_on_head]
  Suggested fix    Rebase your series on top of targeted branch
  Targeted branch  master (currently at 2454019844)



If you believe any of these test results are incorrect, please reply to the
mailing list (openembedded-core@lists.openembedded.org) raising your concerns.
Otherwise we would appreciate you correcting the issues and submitting a new
version of the patchset if applicable. Please ensure you add/increment the
version number when sending the new version (i.e. [PATCH] -> [PATCH v2] ->
[PATCH v3] -> ...).

---
Test framework: http://git.yoctoproject.org/cgit/cgit.cgi/patchtest
Test suite:     http://git.yoctoproject.org/cgit/cgit.cgi/patchtest-oe
* Re: ✗ patchtest: failure for Avoid build failures due to setscene errors
  2017-08-29 20:04 ` ✗ patchtest: failure for Avoid " Patchwork
@ 2017-08-29 20:25   ` Peter Kjellerstedt
  2017-08-29 22:35     ` Philip Balister
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Kjellerstedt @ 2017-08-29 20:25 UTC (permalink / raw)
  To: openembedded-core

> -----Original Message-----
> From: Patchwork [mailto:patchwork@patchwork.openembedded.org]
> Sent: den 29 augusti 2017 22:05
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
> Cc: openembedded-core@lists.openembedded.org
> Subject: ✗ patchtest: failure for Avoid build failures due to setscene
> errors
>
> == Series Details ==
>
> Series: Avoid build failures due to setscene errors
> Revision: 1
> URL   : https://patchwork.openembedded.org/series/8575/
> State : failure
>
> == Summary ==
>
>
> Thank you for submitting this patch series to OpenEmbedded Core. This is
> an automated response. Several tests have been executed on the proposed
> series by patchtest resulting in the following failures:
>
>
>
> * Issue             Series does not apply on top of target branch [test_series_merge_on_head]
>   Suggested fix    Rebase your series on top of targeted branch
>   Targeted branch  master (currently at 2454019844)

Argh, why can't this handle combined bitbake and OE-Core changes, i.e.,
changes for Poky. Oh well, separate patches coming up...

> If you believe any of these test results are incorrect, please reply to the
> mailing list (openembedded-core@lists.openembedded.org) raising your concerns.
> Otherwise we would appreciate you correcting the issues and submitting a new
> version of the patchset if applicable. Please ensure you add/increment the
> version number when sending the new version (i.e. [PATCH] -> [PATCH v2] ->
> [PATCH v3] -> ...).
>
> ---
> Test framework: http://git.yoctoproject.org/cgit/cgit.cgi/patchtest
> Test suite:     http://git.yoctoproject.org/cgit/cgit.cgi/patchtest-oe

//Peter
* Re: ✗ patchtest: failure for Avoid build failures due to setscene errors
  2017-08-29 20:25   ` Peter Kjellerstedt
@ 2017-08-29 22:35     ` Philip Balister
  2017-08-30  7:41       ` Peter Kjellerstedt
  0 siblings, 1 reply; 18+ messages in thread
From: Philip Balister @ 2017-08-29 22:35 UTC (permalink / raw)
  To: Peter Kjellerstedt, openembedded-core

On 08/29/2017 04:25 PM, Peter Kjellerstedt wrote:
>> -----Original Message-----
>> From: Patchwork [mailto:patchwork@patchwork.openembedded.org]
>> Sent: den 29 augusti 2017 22:05
>> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
>> Cc: openembedded-core@lists.openembedded.org
>> Subject: ✗ patchtest: failure for Avoid build failures due to setscene
>> errors
>>
>> == Series Details ==
>>
>> Series: Avoid build failures due to setscene errors
>> Revision: 1
>> URL   : https://patchwork.openembedded.org/series/8575/
>> State : failure
>>
>> == Summary ==
>>
>>
>> Thank you for submitting this patch series to OpenEmbedded Core. This is
>> an automated response. Several tests have been executed on the proposed
>> series by patchtest resulting in the following failures:
>>
>>
>>
>> * Issue             Series does not apply on top of target branch [test_series_merge_on_head]
>>   Suggested fix    Rebase your series on top of targeted branch
>>   Targeted branch  master (currently at 2454019844)
>
> Argh, why can't this handle combined bitbake and OE-Core changes, i.e.,
> changes for Poky. Oh well, separate patches coming up...

Because poky isn't the upstream project.

Philip

>
>> If you believe any of these test results are incorrect, please reply to the
>> mailing list (openembedded-core@lists.openembedded.org) raising your concerns.
>> Otherwise we would appreciate you correcting the issues and submitting a new
>> version of the patchset if applicable. Please ensure you add/increment the
>> version number when sending the new version (i.e. [PATCH] -> [PATCH v2] ->
>> [PATCH v3] -> ...).
>>
>> ---
>> Test framework: http://git.yoctoproject.org/cgit/cgit.cgi/patchtest
>> Test suite:     http://git.yoctoproject.org/cgit/cgit.cgi/patchtest-oe
>
> //Peter
>
* Re: ✗ patchtest: failure for Avoid build failures due to setscene errors
  2017-08-29 22:35     ` Philip Balister
@ 2017-08-30  7:41       ` Peter Kjellerstedt
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Kjellerstedt @ 2017-08-30 7:41 UTC (permalink / raw)
  To: Philip Balister, openembedded-core

> -----Original Message-----
> From: openembedded-core-bounces@lists.openembedded.org
> [mailto:openembedded-core-bounces@lists.openembedded.org] On Behalf Of
> Philip Balister
> Sent: den 30 augusti 2017 00:36
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>;
> openembedded-core@lists.openembedded.org
> Subject: Re: [OE-core] ✗ patchtest: failure for Avoid build failures
> due to setscene errors
>
> On 08/29/2017 04:25 PM, Peter Kjellerstedt wrote:
> >> -----Original Message-----
> >> From: Patchwork [mailto:patchwork@patchwork.openembedded.org]
> >> Sent: den 29 augusti 2017 22:05
> >> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
> >> Cc: openembedded-core@lists.openembedded.org
> >> Subject: ✗ patchtest: failure for Avoid build failures due to setscene
> >> errors
> >>
> >> == Series Details ==
> >>
> >> Series: Avoid build failures due to setscene errors
> >> Revision: 1
> >> URL   : https://patchwork.openembedded.org/series/8575/
> >> State : failure
> >>
> >> == Summary ==
> >>
> >>
> >> Thank you for submitting this patch series to OpenEmbedded Core. This is
> >> an automated response. Several tests have been executed on the proposed
> >> series by patchtest resulting in the following failures:
> >>
> >>
> >>
> >> * Issue             Series does not apply on top of target branch [test_series_merge_on_head]
> >>   Suggested fix    Rebase your series on top of targeted branch
> >>   Targeted branch  master (currently at 2454019844)

Actually, would it be possible to get a better error message that
indicates that one has mixed in patches for other projects that are
part of Poky? When working with Poky as the basis, differentiating
between, e.g., bitbake and OE-Core is not something that comes
naturally. I actually had to think both once and twice before I realized
that one of my patches was actually for bitbake (and just barely stopped
myself in time from sending an irritated mail about why patchtest wasn't
accepting my changes).

> > Argh, why can't this handle combined bitbake and OE-Core changes,
> > i.e., changes for Poky. Oh well, separate patches coming up...
>
> Because poky isn't the upstream project.
>
> Philip

Well, I know that. However, I doubt we are the only ones who use Poky
as the base for our distribution. Thus Poky is our de facto upstream.
So when working on a change that affects BitBake, OE-Core and maybe
even the OE documentation (none of which is uncommon), having to split
the changes into multiple stacks and keep track of them together over
multiple projects is not very encouraging, especially when Poky is
distributed as a unit.

I am not asking for this to change, but it would have been nice to be
able to treat Poky as an upstream and to deliver changes that span the
individual projects as one set of changes against Poky.

//Peter
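
The "better error message" asked for above could, in principle, be
derived from the paths a series touches. The following is a minimal
sketch of that idea and not part of patchtest. The prefix-to-project
mapping is an assumption about how a Poky checkout is laid out, and
projects_touched and describe are hypothetical helper names.

```python
from collections import defaultdict

# Assumed mapping from top-level directories in a Poky checkout to the
# upstream project that owns them; extend or correct as needed.
PROJECT_BY_PREFIX = {
    "bitbake/": "bitbake",
    "meta/": "openembedded-core",
    "scripts/": "openembedded-core",
    "documentation/": "yocto-docs",
    "meta-poky/": "meta-yocto",
    "meta-yocto-bsp/": "meta-yocto",
}


def projects_touched(paths):
    """Group changed file paths by the upstream project they belong to."""
    grouped = defaultdict(list)
    for path in paths:
        for prefix, project in PROJECT_BY_PREFIX.items():
            if path.startswith(prefix):
                grouped[project].append(path)
                break
        else:
            grouped["unknown"].append(path)
    return dict(grouped)


def describe(paths):
    """Return a hint if a series mixes projects, otherwise None."""
    grouped = projects_touched(paths)
    if len(grouped) <= 1:
        return None
    return "Series touches several projects: " + ", ".join(sorted(grouped))
```

Fed with the two files from this series, describe() reports both bitbake
and openembedded-core, which is the situation that made the series fail
to apply.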
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
  2017-08-29 20:00 [PATCH 0/2] Avoid build failures due to setscene errors Peter Kjellerstedt
                   ` (2 preceding siblings ...)
  2017-08-29 20:04 ` ✗ patchtest: failure for Avoid " Patchwork
@ 2017-08-29 20:38 ` Andre McCurdy
  2017-08-29 20:59   ` Peter Kjellerstedt
  3 siblings, 1 reply; 18+ messages in thread
From: Andre McCurdy @ 2017-08-29 20:38 UTC (permalink / raw)
  To: Peter Kjellerstedt; +Cc: OE Core mailing list

On Tue, Aug 29, 2017 at 1:00 PM, Peter Kjellerstedt
<peter.kjellerstedt@axis.com> wrote:
> Occasionally, we see errors on our autobuilders where a setscene task
> fails to retrieve a file from our global sstate cache. It typically
> looks something like this:
>
> WARNING: zip-3.0-r2 do_populate_sysroot_setscene: Failed to fetch URL
> file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz, attempting
> MIRRORS if available
> ERROR: zip-3.0-r2 do_populate_sysroot_setscene: Fetcher failure:
> Unable to find file
> file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz anywhere. The
> paths that were searched were:
> /home/pkj/.openembedded/sstate-cache

To trigger this, do you have SSTATE_MIRRORS pointing to
"/home/pkj/.openembedded/sstate-cache" and SSTATE_DIR pointed
somewhere else? Or are they both pointing to the same local directory?
Or something else?

> ERROR: zip-3.0-r2 do_populate_sysroot_setscene: No suitable staging
> package found
> WARNING: Setscene task
> (meta/recipes-extended/zip/zip_3.0.bb:do_populate_sysroot_setscene)
> failed with exit code '1' - real task will be run instead
>
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
  2017-08-29 20:38 ` [PATCH 0/2] " Andre McCurdy
@ 2017-08-29 20:59   ` Peter Kjellerstedt
  2017-08-29 21:49     ` Richard Purdie
  2017-08-29 22:03     ` Andre McCurdy
  0 siblings, 2 replies; 18+ messages in thread
From: Peter Kjellerstedt @ 2017-08-29 20:59 UTC (permalink / raw)
  To: Andre McCurdy; +Cc: OE Core mailing list

> -----Original Message-----
> From: Andre McCurdy [mailto:armccurdy@gmail.com]
> Sent: den 29 augusti 2017 22:38
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
> Cc: OE Core mailing list <openembedded-core@lists.openembedded.org>
> Subject: Re: [OE-core] [PATCH 0/2] Avoid build failures due to setscene
> errors
>
> On Tue, Aug 29, 2017 at 1:00 PM, Peter Kjellerstedt
> <peter.kjellerstedt@axis.com> wrote:
> > Occasionally, we see errors on our autobuilders where a setscene task
> > fails to retrieve a file from our global sstate cache. It typically
> > looks something like this:
> >
> > WARNING: zip-3.0-r2 do_populate_sysroot_setscene: Failed to fetch URL
> > file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> > downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz, attempting
> > MIRRORS if available
> > ERROR: zip-3.0-r2 do_populate_sysroot_setscene: Fetcher failure:
> > Unable to find file
> > file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> > downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz anywhere. The
> > paths that were searched were:
> > /home/pkj/.openembedded/sstate-cache
>
> To trigger this, do you have SSTATE_MIRRORS pointing to
> "/home/pkj/.openembedded/sstate-cache" and SSTATE_DIR pointed
> somewhere else? Or are they both pointing to the same local directory?
> Or something else?

No, the directory above is actually what is in SSTATE_DIR.
SSTATE_MIRRORS is set to:

SSTATE_MIRRORS ?= "\
    file://.* file:///n/oe/sstate-cache/PATH;downloadfilename=PATH"

where /n/oe is an NFS mount where we share a global sstate cache.

The only way I have figured out to manually simulate the problem is
by modifying the code in sstate_checkhashes() in sstate.bbclass and
commenting out the call to fetcher.checkstatus(). Then as long as
there actually are no sstate files for the task in either the global
or the local sstate cache, I will get the above.

I do not know what triggers it on the autobuilder though. My guess is
that somehow the sstate tgz file disappears between the call to
sstate_checkhashes() and when bitbake actually tries to download the
file.

We do have a daily job that cleans up the global sstate cache and
removes files that have not been accessed in the last ten days, but
it seems unlikely that it should remove a file that just happens to
be required again, and do it at exactly the time when that task is
building.

> > ERROR: zip-3.0-r2 do_populate_sysroot_setscene: No suitable staging
> > package found
> > WARNING: Setscene task
> > (meta/recipes-extended/zip/zip_3.0.bb:do_populate_sysroot_setscene)
> > failed with exit code '1' - real task will be run instead

//Peter
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
  2017-08-29 20:59   ` Peter Kjellerstedt
@ 2017-08-29 21:49     ` Richard Purdie
  2017-08-30  6:44       ` Peter Kjellerstedt
  2017-08-29 22:03     ` Andre McCurdy
  1 sibling, 1 reply; 18+ messages in thread
From: Richard Purdie @ 2017-08-29 21:49 UTC (permalink / raw)
  To: Peter Kjellerstedt, Andre McCurdy; +Cc: OE Core mailing list

On Tue, 2017-08-29 at 20:59 +0000, Peter Kjellerstedt wrote:
> > -----Original Message-----
> > From: Andre McCurdy [mailto:armccurdy@gmail.com]
> > Sent: den 29 augusti 2017 22:38
> > To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
> > Cc: OE Core mailing list <openembedded-core@lists.openembedded.org>
> > Subject: Re: [OE-core] [PATCH 0/2] Avoid build failures due to
> > setscene errors
> >
> > On Tue, Aug 29, 2017 at 1:00 PM, Peter Kjellerstedt
> > <peter.kjellerstedt@axis.com> wrote:
> > >
> > > Occasionally, we see errors on our autobuilders where a setscene
> > > task fails to retrieve a file from our global sstate cache. It
> > > typically looks something like this:
> > >
> > > WARNING: zip-3.0-r2 do_populate_sysroot_setscene: Failed to fetch URL
> > > file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> > > downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz, attempting
> > > MIRRORS if available
> > > ERROR: zip-3.0-r2 do_populate_sysroot_setscene: Fetcher failure:
> > > Unable to find file
> > > file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> > > downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz anywhere.
> > > The paths that were searched were:
> > > /home/pkj/.openembedded/sstate-cache
> > To trigger this, do you have SSTATE_MIRRORS pointing to
> > "/home/pkj/.openembedded/sstate-cache" and SSTATE_DIR pointed
> > somewhere else? Or are they both pointing to the same local
> > directory?
> > Or something else?
> No, the directory above is actually what is in SSTATE_DIR.
> SSTATE_MIRRORS is set to:
>
> SSTATE_MIRRORS ?= "\
>     file://.* file:///n/oe/sstate-cache/PATH;downloadfilename=PATH"
>
> where /n/oe is an NFS mount where we share a global sstate cache.
>
> The only way I have figured out to manually simulate the problem is
> by modifying the code in sstate_checkhashes() in sstate.bbclass and
> commenting out the call to fetcher.checkstatus(). Then as long as
> there actually are no sstate files for the task in either the global
> or the local sstate cache, I will get the above.
>
> I do not know what triggers it on the autobuilder though. My guess
> is that somehow the sstate tgz file disappears between the call to
> sstate_checkhashes() and when bitbake actually tries to download the
> file.
>
> We do have a daily job that cleans up the global sstate cache and
> removes files that have not been accessed in the last ten days, but
> it seems unlikely that it should remove a file that just happens to
> be required again, and do it at exactly the time when that task is
> building.

I have left this code as an error deliberately as this kind of thing
should not happen and if it does, there is really something wrong which
you need to figure out. It means that at one point bitbake thinks the
sstate is present and valid, then later it isn't.

I'm not convinced patching out the errors is the right solution here...

Cheers,

Richard
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
  2017-08-29 21:49     ` Richard Purdie
@ 2017-08-30  6:44       ` Peter Kjellerstedt
  2017-08-30  7:54         ` Martin Jansa
  2017-08-30  8:02         ` Richard Purdie
  0 siblings, 2 replies; 18+ messages in thread
From: Peter Kjellerstedt @ 2017-08-30 6:44 UTC (permalink / raw)
  To: Richard Purdie, Andre McCurdy; +Cc: OE Core mailing list

> -----Original Message-----
> From: openembedded-core-bounces@lists.openembedded.org
> [mailto:openembedded-core-bounces@lists.openembedded.org] On Behalf Of
> Richard Purdie
> Sent: den 29 augusti 2017 23:50
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>; Andre McCurdy
> <armccurdy@gmail.com>
> Cc: OE Core mailing list <openembedded-core@lists.openembedded.org>
> Subject: Re: [OE-core] [PATCH 0/2] Avoid build failures due to setscene
> errors
>
> On Tue, 2017-08-29 at 20:59 +0000, Peter Kjellerstedt wrote:
> > > -----Original Message-----
> > > From: Andre McCurdy [mailto:armccurdy@gmail.com]
> > > Sent: den 29 augusti 2017 22:38
> > > To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
> > > Cc: OE Core mailing list <openembedded-core@lists.openembedded.org>
> > > Subject: Re: [OE-core] [PATCH 0/2] Avoid build failures due to
> > > setscene errors
> > >
> > > On Tue, Aug 29, 2017 at 1:00 PM, Peter Kjellerstedt
> > > <peter.kjellerstedt@axis.com> wrote:
> > > >
> > > > Occasionally, we see errors on our autobuilders where a setscene
> > > > task fails to retrieve a file from our global sstate cache. It
> > > > typically looks something like this:
> > > >
> > > > WARNING: zip-3.0-r2 do_populate_sysroot_setscene: Failed to fetch URL
> > > > file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> > > > downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz, attempting
> > > > MIRRORS if available
> > > > ERROR: zip-3.0-r2 do_populate_sysroot_setscene: Fetcher failure:
> > > > Unable to find file
> > > > file://66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz;\
> > > > downloadfilename=66/sstate:zip:core2-64-poky-linux:3.0:r2:core2-64:3:\
> > > > 66832b8c4e7babe0eac9d9579d1e2b6a_populate_sysroot.tgz anywhere.
> > > > The paths that were searched were:
> > > > /home/pkj/.openembedded/sstate-cache
> > > To trigger this, do you have SSTATE_MIRRORS pointing to
> > > "/home/pkj/.openembedded/sstate-cache" and SSTATE_DIR pointed
> > > somewhere else? Or are they both pointing to the same local
> > > directory?
> > > Or something else?
> > No, the directory above is actually what is in SSTATE_DIR.
> > SSTATE_MIRRORS is set to:
> >
> > SSTATE_MIRRORS ?= "\
> >     file://.* file:///n/oe/sstate-cache/PATH;downloadfilename=PATH"
> >
> > where /n/oe is an NFS mount where we share a global sstate cache.
> >
> > The only way I have figured out to manually simulate the problem is
> > by modifying the code in sstate_checkhashes() in sstate.bbclass and
> > commenting out the call to fetcher.checkstatus(). Then as long as
> > there actually are no sstate files for the task in either the global
> > or the local sstate cache, I will get the above.
> >
> > I do not know what triggers it on the autobuilder though. My guess
> > is that somehow the sstate tgz file disappears between the call to
> > sstate_checkhashes() and when bitbake actually tries to download the
> > file.
> >
> > We do have a daily job that cleans up the global sstate cache and
> > removes files that have not been accessed in the last ten days, but
> > it seems unlikely that it should remove a file that just happens to
> > be required again, and do it at exactly the time when that task is
> > building.
>
> I have left this code as an error deliberately as this kind of thing
> should not happen and if it does, there is really something wrong which
> you need to figure out. It means that at one point bitbake thinks the
> sstate is present and valid, then later it isn't.

True, but since checking whether an sstate file exists and retrieving
it are not a single atomic operation, there are always problems that
can occur. Some may be fixable, some may not. However, using a build
failure to detect this kind of problem is a bit harsh on the
developers, who see their builds complete only to get an error for
something that is not their fault. We have better ways to detect these
kinds of problems, e.g., through log monitoring, without having to
cause unnecessary grief amongst the developers.

> I'm not convinced patching out the errors is the right solution here...

How about I make it conditional by adding an IGNORE_SETSCENE_ERRORS?
That way it can default to "0", but we can set it to "1" to prioritize
the production builds.

> Cheers,
>
> Richard

//Peter
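
The two points in the message above, that the existence check and the
retrieval are separate steps and that the severity of the resulting
message could be made configurable, can be sketched together. This is an
illustration under stated assumptions, not BitBake code: datastore is a
plain dict standing in for the BitBake datastore, IGNORE_SETSCENE_ERRORS
is only the name proposed in this thread, and after_check is a
hypothetical hook that simulates another process (such as a cache
cleanup job) running between the two steps.

```python
import logging
import os
import shutil
import tempfile

logger = logging.getLogger("setscene_sketch")


def fetch_sstate(src, dest, datastore, after_check=None):
    """Check that src exists, then copy it to dest.

    Returns "absent" or "fetched", or the lower-case name of the log
    level used when the file vanished between the check and the copy.
    """
    if not os.path.exists(src):              # step 1: existence check
        return "absent"
    if after_check:
        after_check()                        # e.g. a cleanup job runs here
    try:
        shutil.copyfile(src, dest)           # step 2: retrieval
    except FileNotFoundError:
        # The proposed variable decides how loudly the race is reported.
        ignore = datastore.get("IGNORE_SETSCENE_ERRORS", "0") == "1"
        level = logging.WARNING if ignore else logging.ERROR
        logger.log(level, "%s disappeared after it was checked", src)
        return logging.getLevelName(level).lower()
    return "fetched"


def demo(datastore):
    """Simulate the race by deleting the file right after the check."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sstate.tgz")
        with open(src, "wb") as handle:
            handle.write(b"payload")
        return fetch_sstate(src, os.path.join(tmp, "copy.tgz"), datastore,
                            after_check=lambda: os.remove(src))
```

With an empty datastore the race is reported as an error, and with the
variable set to "1" it is reported as a warning. In both cases the
caller learns that the fetch did not succeed.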
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
From: Martin Jansa @ 2017-08-30  7:54 UTC
To: Peter Kjellerstedt; +Cc: OE Core mailing list

I agree with this patchset, and it would be OK with an
IGNORE_SETSCENE_ERRORS conditional as well.

We also see these errors sometimes. Some are anticipated, when the shared
sstate-cache on the NFS server is being cleaned. Others are unexpected,
when NFS or the network goes down for a minute and, for some builds, that
happens between sstate_checkhashes() and using the sstate.

We normally stop all jenkins builds until the cleanup is complete. (There
is a jenkins job doing the cleanup: it puts jenkins into stop mode, waits
for all current jobs to finish, which can take hours, then performs the
cleanup and cancels the stop mode.) But we cannot stop hundreds of
developers using the same sstate-cache in local builds, especially when
we cannot really know when exactly the job will have a free jenkins to
perform the cleanup.

Luckily, in local builds it doesn't hurt so badly, because the developers
are more likely to ignore the error as long as the image was created. In
jenkins builds, however, when bitbake returns an error we cannot easily
distinguish between "RP is intentionally warning us that something went
wrong with sstate, but everything was built correctly in the end" and
"something failed in the build and we weren't able to recover from that,
maybe even the image wasn't created". So we don't trigger the follow-up
actions like announcing new official builds, parsing release notes or
automated testing.

Yes, we could add more logic to these CI jobs to grep the logs, decide
whether this error was the only one that caused bitbake to return an
error code, and ignore the returned error in that case. But a simple
variable is easier to maintain (even at the cost of forking bitbake and
oe-core) and will work for local builds as well.

Regards,
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
From: Martin Jansa @ 2019-11-29 16:48 UTC
To: Peter Kjellerstedt; +Cc: OE Core mailing list

On Wed, Aug 30, 2017 at 9:54 AM Martin Jansa <martin.jansa@gmail.com> wrote:

> Yes we could add more logic to these CI jobs, to grep the logs to decide
> if this error was the only one which caused the bitbake to return error
> code and ignore the returned error in such case, but simple variable is
> easier to maintain (even for the cost of forking bitbake and oe-core) and
> will work for local builds as well.

I have been using these 2 changes in my forks of oe-core and bitbake
since they were sent to the list. Today, though, I got a bunch of errors
like this from a build which unfortunately wasn't using my forks, plus a
few questions from fellow developers about why these errors aren't
ignored. So I finally found time to improve our CI jobs to deal with this
and ignore the bitbake return code if it reports failure only because of
these setscene fetcher failures.

If someone needs a similar workaround for this bitbake behavior, here is
what I did:
https://github.com/webOS-ports/jenkins-jobs/pull/12

Yes, it's ugly, but it seems to work and is a bit better than forking
oe-core and bitbake just because of this issue.

Regards,
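The log-based check described above could look roughly like the following. This is a minimal sketch and not the code from the linked pull request. It assumes that every error is printed on a line of the form `ERROR: <recipe> <task>: <message>`, as in the log excerpt at the top of the thread, and the helper names `only_setscene_errors` and `effective_exit_code` are hypothetical.

```python
import re

# Assumption: an error belongs to a setscene task when the task name printed
# in "ERROR: <recipe> <task>: <message>" ends in "_setscene".
SETSCENE_ERROR = re.compile(r"^ERROR: \S+ do_\w+_setscene: ")


def only_setscene_errors(log_text: str) -> bool:
    """Return True if the log has ERROR lines and all come from setscene tasks."""
    errors = [line for line in log_text.splitlines() if line.startswith("ERROR:")]
    return bool(errors) and all(SETSCENE_ERROR.match(line) for line in errors)


def effective_exit_code(bitbake_rc: int, log_text: str) -> int:
    """Hypothetical CI helper: treat a failure caused only by setscene errors as success."""
    if bitbake_rc != 0 and only_setscene_errors(log_text):
        return 0
    return bitbake_rc
```

A CI job would pass the captured console log and bitbake's return code to `effective_exit_code()` and use the result to decide whether to trigger follow-up actions. A job that does this should probably still report the ignored errors somewhere, since they may point at a real problem in the sstate infrastructure.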
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
From: Ricardo Ribalda Delgado @ 2020-01-09 12:26 UTC
To: Martin Jansa; +Cc: Peter Kjellerstedt, OE Core mailing list

Hi,

I am also hitting this wall. Is there any reason why the original patches
could not be merged?

-- 
Ricardo Ribalda
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
From: Richard Purdie @ 2017-08-30  8:02 UTC
To: Peter Kjellerstedt, Andre McCurdy; +Cc: OE Core mailing list

On Wed, 2017-08-30 at 06:44 +0000, Peter Kjellerstedt wrote:
> > I have left this code as an error deliberately as this kind of
> > thing should not happen and if it does, there is really something
> > wrong which you need to figure out. It means that at one point
> > bitbake thinks the sstate is present and valid, then later it
> > isn't.
>
> True, but since checking whether an sstate file exists and retrieving
> it are not done as one atomic operation, there are always problems
> that can occur. Some may be fixable, some may not. However, using a
> build failure to detect these kinds of problems is a bit harsh on the
> developers, who see their builds complete only to get an error for
> something that is not their fault. We have better ways to detect
> these kinds of problems, e.g., through log monitoring, without having
> to cause unnecessary grief amongst the developers.

Files are randomly disappearing from your sstate source. So far you've
been lucky and these are not causing corruption, but they could.

Please figure out and fix your sstate infrastructure, not hack the code
to avoid the errors.

I do appreciate it's painful; we did once see this issue on the
autobuilder. There was a real error in the sstate cleanup scripts and we
fixed that, but it took some work to find it.

Also, with changes like this you can end up in a state where sstate can
completely stop working and the only way you'd tell is by increased
build time.

> > I'm not convinced patching out the errors is the right solution
> > here...
>
> How about I make it conditional by adding an IGNORE_SETSCENE_ERRORS?
> That way it can default to "0", but we can set it to "1" to
> prioritize the production builds.

I'm still not convinced, sorry.

[The reason being complexity. I don't like having multiple ways of
doing things if we can help it, particularly when one of them is a
workaround for a problem elsewhere. One of the codepaths in a case like
this is unlikely to get well tested.]

Cheers,

Richard
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
From: Peter Kjellerstedt @ 2017-08-30  9:52 UTC
To: Richard Purdie, Andre McCurdy; +Cc: OE Core mailing list

> -----Original Message-----
> From: Richard Purdie [mailto:richard.purdie@linuxfoundation.org]
> Sent: 30 August 2017 10:03
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>; Andre McCurdy
> <armccurdy@gmail.com>
> Cc: OE Core mailing list <openembedded-core@lists.openembedded.org>
> Subject: Re: [OE-core] [PATCH 0/2] Avoid build failures due to setscene
> errors
>
> On Wed, 2017-08-30 at 06:44 +0000, Peter Kjellerstedt wrote:
> > > I have left this code as an error deliberately as this kind of
> > > thing should not happen and if it does, there is really something
> > > wrong which you need to figure out. It means that at one point
> > > bitbake thinks the sstate is present and valid, then later it
> > > isn't.
> >
> > True, but since checking whether an sstate file exists and retrieving
> > it are not done as one atomic operation, there are always problems
> > that can occur. Some may be fixable, some may not. However, using a
> > build failure to detect these kinds of problems is a bit harsh on the
> > developers, who see their builds complete only to get an error for
> > something that is not their fault. We have better ways to detect
> > these kinds of problems, e.g., through log monitoring, without having
> > to cause unnecessary grief amongst the developers.
>
> Files are randomly disappearing from your sstate source. So far you've
> been lucky and these are not causing corruption, but they could.

Somehow I fail to see how missing sstate cache files can cause
corruption. If they are missing, the real task is run and all is well.
Also, I do not actually know if the files disappear permanently or
temporarily, because at the time when I look at the global sstate cache
the files are there, newly created because the build continued and let
the real task run. My guess though is that the files only temporarily
disappeared due to some network glitch, but currently I cannot verify it.

Regardless of whether my proposed changes are accepted or not, if you
want to keep the default behavior that a failed setscene task will
eventually cause the build to fail, then we should change it to fail
immediately instead. Continuing the build when you know it will fail
makes no sense at all.

> Please figure out and fix your sstate infrastructure, not hack the code
> to avoid the errors.

As Martin Jansa mentioned in another response, the problem may be due to
NFS or general network disturbances, and I see no way to protect
ourselves from them. And apparently we are not alone in seeing these
kinds of transient errors.

> I do appreciate its painful, we did once see this issue on the
> autobuilder. There was a real error in the sstate cleanup scripts and
> we fixed that but it took some work to find it.

Are your sstate cache cleanup scripts available somewhere? Because
obviously it is not trivial to get it right, and since keeping the sstate
cache clean is something that I expect many like to do, having a common
script for this seems like a good thing.

Otherwise I can contribute our script. If nothing else it would probably
be good to have it reviewed by someone who is an expert on the sstate
cache. It currently features:

* configurable retention period (default is 10 days)
* removes related .tgz and .tgz.siginfo files as one
* can remove stale symbolic links (typically wanted for a local sstate
  cache which has links into a global sstate cache where the actual
  files have been cleaned away)
* dry run mode
* quiet mode (only prints a summary stating how much was cleaned up and
  the current size of the sstate cache; very nice for running it as a
  cronjob)

> Also, with changes like this you can end up in a state where sstate can
> completely stop working and the only way you'd tell is by increased
> build time.

As I mentioned, we have monitoring of our builds in place and would
definitely notice if the global sstate cache is not used as expected.

> > > I'm not convinced patching out the errors is the right solution
> > > here...
> >
> > How about I make it conditional by adding an IGNORE_SETSCENE_ERRORS?
> > That way it can default to "0", but we can set it to "1" to
> > prioritize the production builds.
>
> I'm still not convinced, sorry.
>
> [The reason being complexity. I don't like having multiple ways of
> doing things if we can help it, particularly when one of them is a
> workaround for a problem elsewhere. One of the codepaths in a case like
> this is unlikely to get well tested.]

Well, as long as the conditional path is clearly marked as "only enable
this if you know what you are doing", I do not see a problem with that
path receiving less or no testing by you. It should get enough testing by
those of us who rely on it.

The problem for me in these kinds of situations is that we do not want to
make changes to anything inside the Poky repository (which would
effectively fork it), because down that route lies madness. So instead we
rely on making all adaptations in our own layers.

Making changes to recipes is easy as we can use .bbappends in our layers.
Making changes to classes or configuration files works by copying them to
our layers and changing them there, even though I personally hate it
because it causes extra maintenance for me: in preparation for updating
to the next Poky release, I often need to build with a newer version of
Poky than our layers are currently adapted for. However, changes to
anything inside bitbake are nearly impossible. The same goes for changes
to anything in meta/lib/oe. Thus we rely on being able to find a way to
get these kinds of changes integrated upstream.

> Cheers,
>
> Richard

And in case any of the above sounds as if I am trying to force a feature
down your throat that you do not like, then I beg for forgiveness. We
really do appreciate your expertise and dedication to the OE community,
and I hope we can work this out into something that you can accept and
that we can use.

//Peter
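For readers who want a starting point, the cleanup features listed in the message above could be sketched roughly as follows. This is not Peter's script, which is not shown in the thread. It is a minimal illustration under several assumptions: access time (`st_atime`) is a usable measure of last use, every archive ends in `.tgz` with an optional sibling named `<archive>.siginfo`, and the function name `clean_sstate` and its parameters are hypothetical. It needs Python 3.8 or later for `unlink(missing_ok=True)`.

```python
import os
import time
from pathlib import Path


def clean_sstate(cache_dir, retention_days=10, dry_run=False, now=None):
    """Remove sstate archives not accessed within the retention period.

    Returns the paths that were removed (or would be, in a dry run).
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = Path(root) / name
            if path.is_symlink():
                # A link whose target no longer exists is stale; valid
                # links are left alone.
                if not path.exists():
                    removed.append(path)
                continue
            if not name.endswith(".tgz"):
                continue  # .siginfo files are handled with their archive
            if path.stat().st_atime < cutoff:
                removed.append(path)
                siginfo = path.with_name(name + ".siginfo")
                if siginfo.exists():
                    removed.append(siginfo)
    if not dry_run:
        for path in removed:
            # missing_ok tolerates a file vanishing between scan and removal
            path.unlink(missing_ok=True)
    return removed
```

Running it first with `dry_run=True` and inspecting the returned list is a cheap way to check the selection before anything is deleted. Note that the sketch does nothing about a build that checks for a file just before the cleanup removes it, which is the race discussed in this thread.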
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
From: Andre McCurdy @ 2017-08-29 22:03 UTC
To: Peter Kjellerstedt; +Cc: OE Core mailing list

On Tue, Aug 29, 2017 at 1:59 PM, Peter Kjellerstedt
<peter.kjellerstedt@axis.com> wrote:
> We do have a daily job that cleans up the global sstate cache and
> removes files that have not been accessed in the last ten days, but
> it seems unlikely that it should remove a file that just happens to
> be required again, and do it at exactly the time when that task is
> building.

I guess you've already confirmed that accessing the sstate files over
NFS does actually modify atime on the server (and that the filesystem
on the server really does have atime support enabled, e.g. mounted
with strictatime rather than relatime etc)?

If access time isn't being determined reliably and sstate files are
being removed 10 days after being created, then that might make the
race a little more likely to trigger.
* Re: [PATCH 0/2] Avoid build failures due to setscene errors
From: Peter Kjellerstedt @ 2017-08-30  9:55 UTC
To: Andre McCurdy; +Cc: OE Core mailing list

> -----Original Message-----
> From: Andre McCurdy [mailto:armccurdy@gmail.com]
> Sent: 30 August 2017 00:04
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
> Cc: OE Core mailing list <openembedded-core@lists.openembedded.org>
> Subject: Re: [OE-core] [PATCH 0/2] Avoid build failures due to setscene
> errors
>
> On Tue, Aug 29, 2017 at 1:59 PM, Peter Kjellerstedt
> <peter.kjellerstedt@axis.com> wrote:
> > We do have a daily job that cleans up the global sstate cache and
> > removes files that have not been accessed in the last ten days, but
> > it seems unlikely that it should remove a file that just happens to
> > be required again, and do it at exactly the time when that task is
> > building.
>
> I guess you've already confirmed that accessing the sstate files over
> NFS does actually modify atime on the server (and that the filesystem
> on the server really does have atime support enabled, e.g. mounted
> with strictatime rather than relatime etc)?

Well, it is mounted with relatime. However, only updating the access
time once a day should be OK, since we are only concerned with files
that have not been accessed in the last ten days.

> If access time isn't being determined reliably and sstate files are
> being removed 10 days after being created then that might make the
> race a little more likely to trigger.

The thing is that the cleaning script runs at 3 am (and takes about 15
minutes to complete), but we have seen the build problem at times when no
cleaning is taking place. I am currently leaning more towards network
glitches as the source of the problem, but that is hard to verify.

//Peter
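The relatime argument in the message above can be made concrete with a small model. On Linux, a relatime mount updates a file's access time on read only if the recorded atime is not later than its mtime or ctime, or if the recorded atime is at least 24 hours old. The sketch below models only that rule for a local filesystem; it does not model NFS client caching, and the function names are hypothetical.

```python
DAY = 24 * 60 * 60  # seconds


def relatime_updates_atime(atime, mtime, ctime, now):
    """Linux relatime rule: does a read at time `now` update the atime?"""
    return atime <= mtime or atime <= ctime or now - atime >= DAY


def simulate_reads(created, read_times):
    """Return the recorded atime after reading a never-modified file."""
    atime = mtime = ctime = created
    for t in sorted(read_times):
        if relatime_updates_atime(atime, mtime, ctime, t):
            atime = t
    return atime
```

Under this rule the recorded atime trails the most recent read by less than 24 hours, so with a ten-day retention period a file that was read within the last nine days cannot look expired. That matches the reasoning that relatime alone should not explain the missing files.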
end of thread, other threads:[~2020-01-09 12:26 UTC | newest]

Thread overview: 18+ messages
2017-08-29 20:00 [PATCH 0/2] Avoid build failures due to setscene errors Peter Kjellerstedt
2017-08-29 20:00 ` [PATCH 1/2] bitbake: fetch2: Allow Fetch.download() to warn instead of error Peter Kjellerstedt
2017-08-29 20:00 ` [PATCH 2/2] sstate.bbclass: Do not cause build failures due to setscene errors Peter Kjellerstedt
2017-08-29 20:04 ` ✗ patchtest: failure for Avoid " Patchwork
2017-08-29 20:25 ` Peter Kjellerstedt
2017-08-29 22:35 ` Philip Balister
2017-08-30  7:41 ` Peter Kjellerstedt
2017-08-29 20:38 ` [PATCH 0/2] " Andre McCurdy
2017-08-29 20:59 ` Peter Kjellerstedt
2017-08-29 21:49 ` Richard Purdie
2017-08-30  6:44 ` Peter Kjellerstedt
2017-08-30  7:54 ` Martin Jansa
2019-11-29 16:48 ` Martin Jansa
2020-01-09 12:26 ` Ricardo Ribalda Delgado
2017-08-30  8:02 ` Richard Purdie
2017-08-30  9:52 ` Peter Kjellerstedt
2017-08-29 22:03 ` Andre McCurdy
2017-08-30  9:55 ` Peter Kjellerstedt