* pseudo 1.8.1 doesn't work with docker & dumb-init
From: wenzong fan @ 2016-08-31 9:21 UTC
To: 'Patches and discussions about the oe-core layer', seebs, Richard Purdie

Hi Experts,

While trying to build Yocto in a Docker container that uses dumb-init as
its init system, I found that the build always stops at some point and the
container is terminated as well, with the errors below:

  Child process timeout after 2 seconds.
  Child process exit status 4: lock_held

Sometimes there is no obvious error message at all.

After some `git bisect` testing, I believe the issue started with this
commit:

----------------------
9df3cdf42d8c1216682f497f0b166a43ef9f4184 is the first bad commit
commit 9df3cdf42d8c1216682f497f0b166a43ef9f4184
Author: Richard Purdie <richard.purdie@linuxfoundation.org>
Date:   Tue Jul 5 13:18:31 2016 +0100

    pseudo: Upgrade to 1.8.1

    * Drop patches where the changes exist upstream
    * Fetch from git as no tarball is available for 1.8.1
    * Move common code to pseudo.inc
    * Update patchset in git recipe

    (From OE-Core rev: 0c36984d4c501d12fa91cf7371511641585cc256)
-----------------------

Finally I narrowed it down to this pseudo commit:

------------------------
commit 77ee254a6c974aad9bcab2c58c9ee9e0880c9718
Author: Peter Seebach <peter.seebach@windriver.com>
Date:   Tue Mar 1 16:21:15 2016 -0600

    Server launch reworking.

    This is the big overhaul to have the server provide meaningful exit
    status to clients.

    In the process, I discovered that the server was running with signals
    blocked if launched by a client, which is not a good thing, and
    prevented this from working as intended.

    Still looking to see why more than one server spawn seems to happen.
------------------------

I also created a test case for reproducing the issue at:

  https://github.com/WenzongFan/docker-build-yocto

For dumb-init, please refer to:

  https://github.com/Yelp/dumb-init

Could anyone help fix the signal handling in pseudo?

Thanks
Wenzong

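
The pseudo commit message quoted above mentions a server that ran with signals blocked when launched by a client. One general way this can happen on POSIX systems is that the blocked-signal mask is inherited by child processes. The following is a minimal Python sketch of that mechanism only, not pseudo's actual code; the helper name `child_sees_sigterm_blocked` and the choice of SIGTERM are assumptions made for illustration.

```python
import signal
import subprocess
import sys

# Child program: print whether SIGTERM is in the signal mask it inherited.
# Calling pthread_sigmask(SIG_BLOCK, []) changes nothing and returns the
# current mask.
CHILD = (
    "import signal;"
    "mask = signal.pthread_sigmask(signal.SIG_BLOCK, []);"
    "print(signal.SIGTERM in mask)"
)


def child_sees_sigterm_blocked(block_in_parent: bool) -> bool:
    """Spawn a child and report whether SIGTERM is blocked inside it.

    Hypothetical helper for illustration; assumes a POSIX system.
    """
    # Remember the caller's mask so it can be restored afterwards.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
    try:
        if block_in_parent:
            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})
        else:
            signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGTERM})
        out = subprocess.run(
            [sys.executable, "-c", CHILD],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip() == "True"
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    print("blocked in parent   ->", child_sees_sigterm_blocked(True))
    print("unblocked in parent ->", child_sees_sigterm_blocked(False))
```

A launcher that wants its server to receive signals normally would typically reset the mask before or right after spawning it; whether pseudo does it that way is not stated in the thread.
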
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: Joshua Lock @ 2016-08-31 15:11 UTC
To: wenzong fan, 'Patches and discussions about the oe-core layer', seebs, Richard Purdie

On Wed, 2016-08-31 at 17:21 +0800, wenzong fan wrote:
> I also created a testcase for reproducing the issue at:
>
> https://github.com/WenzongFan/docker-build-yocto

Thanks for providing a detailed reproducer. I'm trying to configure a
container behind my proxy here.

> For dumb-init please refer to:
>
> https://github.com/Yelp/dumb-init
>
> Could anyone help to fix the signal handling in pseudo?

It may not actually be pseudo at fault here. I've only skimmed the
dumb-init README but it looks like there might be a strange interaction
between the newly fixed signal handling in pseudo and dumb-init's
signal handling.

Should dumb-init be running in single-child/non-setsid mode so that
signals are only forwarded to the direct child rather than all child
processes in the dumb-init session? Is this a scenario you've tested?

Regards,

Joshua

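
The difference between forwarding a signal to the direct child only and forwarding it to every process in the child's session can be sketched in Python. This assumes that session-wide forwarding amounts to signalling the child's process group and that single-child mode signals only the child's PID; it illustrates the general mechanism and is not dumb-init's code. `grandchild_survives` is a hypothetical helper.

```python
import os
import select
import signal
import subprocess
import sys

# The child starts a grandchild, prints the grandchild's PID, then sleeps.
CHILD = (
    "import subprocess, sys, time;"
    "g = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)']);"
    "print(g.pid, flush=True);"
    "time.sleep(30)"
)


def grandchild_survives(forward_to_group: bool) -> bool:
    """Send SIGTERM to the child (or its whole group); report whether the
    grandchild is still running afterwards. Hypothetical helper, POSIX only."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        start_new_session=True,  # setsid(): child leads a new session and group
    )
    grandchild_pid = int(child.stdout.readline())
    if forward_to_group:
        os.killpg(child.pid, signal.SIGTERM)  # every process in the group
    else:
        os.kill(child.pid, signal.SIGTERM)    # the direct child only
    child.wait()
    # The grandchild inherited the write end of the pipe. The pipe reaches
    # end-of-file only once the grandchild has exited as well.
    readable, _, _ = select.select([child.stdout], [], [], 2.0)
    survived = not readable
    if survived:
        os.kill(grandchild_pid, signal.SIGKILL)  # clean up the leftover process
    child.stdout.close()
    return survived


if __name__ == "__main__":
    print("signal to child only:", grandchild_survives(False))
    print("signal to whole group:", grandchild_survives(True))
```

With group-wide forwarding, a helper process such as a daemon spawned deep inside a build would receive every forwarded signal too, which is the kind of interaction the question above is probing.
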
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: wenzong fan @ 2016-09-02 1:24 UTC
To: Joshua Lock, 'Patches and discussions about the oe-core layer', seebs, Richard Purdie

On 08/31/2016 11:11 PM, Joshua Lock wrote:
> Should dumb-init be running in single-child/non-setsid mode so that
> signals are only forwarded to the direct child rather than all child
> processes in the dumb-init session? Is this a scenario you've tested?

Yes, I tried the options below, but none of them works:

1) Run dumb-init with the -c flag (single-child/non-setsid mode):
   https://github.com/Yelp/dumb-init/issues/51

2) Update dumb-init to the latest version, v1.1.3 (the release notes
   mention fixes for race conditions)

3) Switch to tini, which is an alternative to dumb-init:
   https://github.com/krallin/tini

Thanks
Wenzong

* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: Seebs @ 2016-08-31 15:48 UTC
To: wenzong fan; +Cc: Patches and discussions about the oe-core layer

On 31 Aug 2016, at 4:21, wenzong fan wrote:

> Finally I narrowed it down to pseudo commit:

Yes, that makes sense. I expected that there'd be potential issues, but I
didn't have a reproducer for any. Thanks! I'll see whether it reproduces
for me now. Any specific version of docker I might need?

-s

* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: wenzong fan @ 2016-09-02 1:33 UTC
To: Seebs; +Cc: Patches and discussions about the oe-core layer

On 08/31/2016 11:48 PM, Seebs wrote:
> I'll see whether it reproduces for me now. Any specific version of
> docker I might need?

No, I don't think it's related to any specific docker version.

I tested it on "Docker version 1.7.1, build 786b29d" & "Docker version
1.11.2, build b9f10c9".

BTW, I also tested the docker build w/o dumb-init, and the build works ...

Thanks
Wenzong

* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: Seebs @ 2016-09-02 2:10 UTC
To: Patches and discussions about the oe-core layer

On 1 Sep 2016, at 20:33, wenzong fan wrote:

> No, I don't think it's related to any specific docker version.
>
> I tested it on "Docker version 1.7.1, build 786b29d" & "Docker version
> 1.11.2, build b9f10c9".
>
> BTW, I also tested the docker build w/o dumb-init, and the build works
> ...

Yeah, it's definitely specific in some way to docker.

However, it doesn't appear to be 100% reproducible; I just tried a build
with your reproducer and it completed without problems. (Unless the
problems are more subtle, and don't prevent a build.) So this one's
gonna be really fun to track down.

-s

* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: wenzong fan @ 2016-09-07 6:32 UTC
To: Seebs, Patches and discussions about the oe-core layer

On 09/02/2016 10:10 AM, Seebs wrote:
> However, it doesn't appear to be 100% reproducible; I just tried a build
> with your reproducer and it completed without problems. (Unless the
> problems are more subtle, and don't prevent a build.) So this one's
> gonna be really fun to track down.

Yes, I believe it's not a 100% reproducible issue. Maybe you could run
it with other builds in parallel and try it 3 times or more.

It reproduces with high probability on my work host, a server shared by
several people; I can always get the error within 1 to 3 builds.

Thanks
Wenzong

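
Because the failure is intermittent, repeating the build and recording each outcome gives a rough idea of how often it occurs. A minimal Python sketch, assuming the build can be started as a single command; `run_repeatedly` is a hypothetical helper, and the only strings it looks for are the two error messages reported at the start of this thread. The exit status is recorded as well, since the report says there is sometimes no obvious error message.

```python
import subprocess
import sys

# The two error messages reported at the start of this thread.
MARKERS = (
    "Child process timeout after 2 seconds.",
    "Child process exit status 4: lock_held",
)


def run_repeatedly(cmd, attempts=3):
    """Run cmd `attempts` times.

    Returns one (exit_status, saw_marker) tuple per run. Hypothetical helper
    for illustration only.
    """
    results = []
    for _ in range(attempts):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # scan stdout and stderr together
            text=True,
            errors="replace",          # tolerate undecodable bytes in the output
        )
        saw_marker = any(marker in proc.stdout for marker in MARKERS)
        results.append((proc.returncode, saw_marker))
    return results


if __name__ == "__main__":
    # Stand-in for a real build command: prints one marker and exits with 4.
    fake_build = [
        sys.executable, "-c",
        "print('Child process exit status 4: lock_held'); raise SystemExit(4)",
    ]
    for status, saw in run_repeatedly(fake_build, attempts=3):
        print("exit status:", status, "marker seen:", saw)
```

For a real run, the stand-in command would be replaced by whatever starts the build inside the container, for example a bitbake invocation; that command is not given in the thread and is left to the reader.
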
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: Seebs @ 2016-09-07 6:40 UTC
To: wenzong fan; +Cc: Patches and discussions about the oe-core layer

On 7 Sep 2016, at 1:32, wenzong fan wrote:

> Yes, I believe it's not a 100% reproducible issue. Maybe you could run
> it with other builds in parallel and try it 3 times or more.

I can try, but that might need bigger hardware than I have to hand at
the moment.

-s

* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: Bystricky, Juro @ 2016-09-14 20:46 UTC
To: Seebs, Fan, Wenzong (Wind River)
Cc: Patches and discussions about the oe-core layer

I am pretty sure I glimpsed the messages:

  Child process timeout after 2 seconds.
  Child process exit status 4: lock_held

on several occasions recently, just before my X server was restarted and I
was kicked back to the login prompt. I typically ran several parallel
bitbake builds. Ubuntu 16.04, not using a container. The last message in
the syslog (first error message) was always:

  Fatal IO error 11 (Resource temporarily unavailable) on X server :0

Possibly not related to this problem, nevertheless worth mentioning.

Thanks

Juro

* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: Randy MacLeod @ 2016-09-15 2:24 UTC
To: Bystricky, Juro, Seebs, Fan, Wenzong (Wind River)
Cc: Patches and discussions about the oe-core layer

On 2016-09-14 04:46 PM, Bystricky, Juro wrote:
> I am pretty sure I glimpsed the messages:
> Child process timeout after 2 seconds.
> Child process exit status 4: lock_held
> on several occasions recently, just before my Xserver was restarted and
> I was kicked back to the login prompt.
> I typically ran several parallel bitbake builds. Ubuntu 16.04, not using
> container. The last message in the syslog (first error message) was always:
> Fatal IO error 11 (Resource temporarily unavailable) on X server :0
>
> Possibly not related to this problem, nevertheless worth mentioning.

Yes, it may be. Thanks for reporting it.

Two weeks ago, I was building a qemuarm64 image on my laptop
(i7, 16 GB, SSD running Ubuntu-16.04) and I saw a similarly bizarre
result from running a build, in that chrome and then the X server
were both killed. I wasn't in front of the system when this happened
so I can't say exactly what was going on.

I did collect some of the logs from my IRC client and chrome:

[423679.028437] konversation[23416]: segfault at 7f72d2c33ce0 ip 00007f72eca4e818 sp 00007ffc7f450ae0 error 4 in libQt5Gui.so.5.5.1[7f72ec8da000+527000]
[423679.325315] chrome[28083]: segfault at 968 ip 00007f63f7615643 sp 00007ffd26c25af0 error 4 in libX11.so.6.3.0[7f63f75ed000+135000]

and then from the X server:

Aug 29 16:11:59 laptop org.a11y.atspi.Registry[4763]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Aug 29 16:11:59 laptop org.a11y.atspi.Registry[4763]: after 67649 requests (67649 known processed) with 0 events remaining.
Aug 29 16:11:59 laptop gnome-session[4748]: (diodon:4925): Gdk-WARNING **: diodon: Fatal IO error 11 (Resource temporarily unavailable) on X server :0 .
...
Aug 29 16:11:59 laptop systemd[1]: Started Process Core Dump (PID 28084/UID 0).
Aug 29 16:11:59 laptop gnome-session[4748]: Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Aug 29 16:11:59 laptop kernel: [423679.325315] chrome[28083]: segfault at 968 ip 00007f63f7615643 sp 00007ffd26c25af0 error 4 in libX11.so.6.3.0[7f63f75ed000+135000]

In my case, I had added meta-oe to oe-core and was building:

  MACHINE=qemuarm64 bitbake imagemagick

I reproduced it once in X, then did NOT see it happen when I built
with an X session running but building on the console, i.e. the build
of imagemagick for qemuarm64 succeeded.

I've removed the build logs, it seems, but I'll see if I can reproduce
the failure overnight.

../Randy

--
# Randy MacLeod. SMTS, Linux, Wind River
Direct: 613.963.1350 | 350 Terry Fox Drive, Suite 200, Ottawa, ON, Canada, K2K 2W5

* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
From: Randy MacLeod @ 2016-09-15 19:08 UTC
To: Bystricky, Juro, Seebs, Fan, Wenzong (Wind River)
Cc: Patches and discussions about the oe-core layer

On 2016-09-14 10:24 PM, Randy MacLeod wrote:
> I'll see if I can reproduce
> the failure overnight.

The laptop build worked without error.
I may try again tonight.

--
# Randy MacLeod. SMTS, Linux, Wind River
Direct: 613.963.1350 | 350 Terry Fox Drive, Suite 200, Ottawa, ON, Canada, K2K 2W5