* pseudo 1.8.1 doesn't work with docker & dumb-init
@ 2016-08-31 9:21 wenzong fan
2016-08-31 15:11 ` Joshua Lock
2016-08-31 15:48 ` Seebs
0 siblings, 2 replies; 11+ messages in thread
From: wenzong fan @ 2016-08-31 9:21 UTC (permalink / raw)
To: 'Patches and discussions about the oe-core layer',
seebs, Richard Purdie
Hi Experts,
While trying to build Yocto in a Docker container that uses dumb-init
as its init system, I found that the build always stops at some point and
the container is terminated as well, with the errors below:
Child process timeout after 2 seconds.
Child process exit status 4: lock_held
Sometimes there's not any obvious error message.
After some `git bisect` testing, I believe the issue started with
commit:
----------------------
9df3cdf42d8c1216682f497f0b166a43ef9f4184 is the first bad commit
commit 9df3cdf42d8c1216682f497f0b166a43ef9f4184
Author: Richard Purdie <richard.purdie@linuxfoundation.org>
Date: Tue Jul 5 13:18:31 2016 +0100
pseudo: Upgrade to 1.8.1
* Drop patches where the changes exist upstream
* Fetch from git as no tarball is available for 1.8.1
* Move common code to pseudo.inc
* Update patchset in git recipe
(From OE-Core rev: 0c36984d4c501d12fa91cf7371511641585cc256)
-----------------------
Finally I narrowed it down to pseudo commit:
------------------------
commit 77ee254a6c974aad9bcab2c58c9ee9e0880c9718
Author: Peter Seebach <peter.seebach@windriver.com>
Date: Tue Mar 1 16:21:15 2016 -0600
Server launch reworking.
This is the big overhaul to have the server provide meaningful exit status
to clients.
In the process, I discovered that the server was running with signals blocked
if launched by a client, which is not a good thing, and prevented this from
working as intended.
Still looking to see why more than one server spawn seems to happen.
------------------------
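The "signals blocked if launched by a client" remark in that commit message refers to a general POSIX behaviour: a process's blocked-signal mask is inherited across fork and exec. The following is only a minimal Python sketch of that inheritance, not pseudo's actual code; the helper name child_sees_sigterm_blocked and the choice of SIGTERM are assumptions made for illustration.

```python
import signal
import subprocess
import sys

# Child program: report whether SIGTERM is in the signal mask it inherited.
# Calling pthread_sigmask(SIG_BLOCK, []) changes nothing and returns the current mask.
CHILD = (
    "import signal;"
    "print(signal.SIGTERM in signal.pthread_sigmask(signal.SIG_BLOCK, []))"
)


def child_sees_sigterm_blocked(block_in_parent):
    """Spawn a child and return True if it starts with SIGTERM blocked.

    Hypothetical helper for illustration only; pseudo itself is written in C.
    """
    how = signal.SIG_BLOCK if block_in_parent else signal.SIG_UNBLOCK
    # Adjust the parent's mask, remembering the old one so it can be restored.
    old_mask = signal.pthread_sigmask(how, {signal.SIGTERM})
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD],
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        # Put the parent's mask back exactly as it was.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return result.stdout.strip() == "True"
```

A server started from a client that has signals blocked would therefore start with them blocked too, unless it resets its own mask early on, which is presumably what the commit above changed.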
I also created a testcase for reproducing the issue at:
https://github.com/WenzongFan/docker-build-yocto
For dumb-init please refer to:
https://github.com/Yelp/dumb-init
Could anyone help to fix the signal handling in pseudo?
Thanks
Wenzong
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-08-31 9:21 pseudo 1.8.1 doesn't work with docker & dumb-init wenzong fan
@ 2016-08-31 15:11 ` Joshua Lock
2016-09-02 1:24 ` wenzong fan
2016-08-31 15:48 ` Seebs
1 sibling, 1 reply; 11+ messages in thread
From: Joshua Lock @ 2016-08-31 15:11 UTC (permalink / raw)
To: wenzong fan,
'Patches and discussions about the oe-core layer',
seebs, Richard Purdie
On Wed, 2016-08-31 at 17:21 +0800, wenzong fan wrote:
> Hi Experts,
>
> While I trying to build Yocto in Docker Container which using dumb-
> init
> as init system, I found the build always be stopped at some point
> and
> the container was terminated as well with below errors:
>
> Child process timeout after 2 seconds.
> Child process exit status 4: lock_held
>
> Sometimes there's not any obvious error message.
>
> After some `git bisect` testing, I believe the issue was started
> since
> commit:
>
> ----------------------
> 9df3cdf42d8c1216682f497f0b166a43ef9f4184 is the first bad commit
> commit 9df3cdf42d8c1216682f497f0b166a43ef9f4184
> Author: Richard Purdie <richard.purdie@linuxfoundation.org>
> Date: Tue Jul 5 13:18:31 2016 +0100
>
> pseudo: Upgrade to 1.8.1
>
> * Drop patches where the changes exist upstream
> * Fetch from git as no tarball is available for 1.8.1
> * Move common code to pseudo.inc
> * Update patchset in git recipe
>
> (From OE-Core rev: 0c36984d4c501d12fa91cf7371511641585cc256)
> -----------------------
>
> Finally I narrowed it down to pseudo commit:
>
> ------------------------
> commit 77ee254a6c974aad9bcab2c58c9ee9e0880c9718
> Author: Peter Seebach <peter.seebach@windriver.com>
> Date: Tue Mar 1 16:21:15 2016 -0600
>
> Server launch reworking.
>
> This is the big overhaul to have the server provide meaningful
> exit
> status
> to clients.
>
> In the process, I discovered that the server was running with
> signals blocked
> if launched by a client, which is not a good thing, and
> prevented
> this from
> working as intended.
>
> Still looking to see why more than one server spawn seems to
> happen.
> ------------------------
>
> I also created a testcase for reproducing the issue at:
>
> https://github.com/WenzongFan/docker-build-yocto
Thanks for providing a detailed reproducer. I'm trying to configure a
container behind my proxy here.
>
> For dumb-init please refer to:
>
> https://github.com/Yelp/dumb-init
>
> Could anyone help to fix the signal handling in pseudo?
It may not actually be pseudo at fault here. I've only skimmed the
dumb-init README but it looks like there might be a strange interaction
between the newly fixed signal handling in pseudo and dumb-init's
signal handling.
Should dumb-init be running in single-child/non-setsid mode so that
signals are only forwarded to the direct child rather than all child
processes in the dumb-init session? Is this a scenario you've tested?
Regards,
Joshua
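The difference between the two dumb-init modes mentioned above comes down to the target passed to kill(2): per the dumb-init README, the default (setsid) mode signals the child's whole process group, while single-child mode signals only the direct child. A rough Python sketch of that choice follows; the function names are made up for illustration and this is not dumb-init's code.

```python
import os
import signal
import subprocess
import sys


def forward_signal(child_pid, signum, use_setsid):
    """Forward signum to the child's process group or to the child alone.

    A negative pid tells kill(2) to signal every process in that group.
    """
    target = -child_pid if use_setsid else child_pid
    os.kill(target, signum)


def spawn_and_terminate(use_setsid):
    """Start a sleeping child, forward SIGTERM to it, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        # In setsid mode the child leads its own session and process group,
        # so its pid is also a valid process group id.
        start_new_session=use_setsid,
    )
    forward_signal(child.pid, signal.SIGTERM, use_setsid)
    return child.wait(timeout=10)
```

With a single child both modes look the same. The difference only matters once the child has descendants in the same process group, as a bitbake build with a pseudo server would.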
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-08-31 9:21 pseudo 1.8.1 doesn't work with docker & dumb-init wenzong fan
2016-08-31 15:11 ` Joshua Lock
@ 2016-08-31 15:48 ` Seebs
2016-09-02 1:33 ` wenzong fan
1 sibling, 1 reply; 11+ messages in thread
From: Seebs @ 2016-08-31 15:48 UTC (permalink / raw)
To: wenzong fan; +Cc: Patches and discussions about the oe-core layer
On 31 Aug 2016, at 4:21, wenzong fan wrote:
> Finally I narrowed it down to pseudo commit:
Yes, that makes sense, we expect that there'd be potential issues, but I
didn't have a reproducer for any. Thanks! I'll see whether it reproduces
for me now. Any specific version of docker I might need?
-s
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-08-31 15:11 ` Joshua Lock
@ 2016-09-02 1:24 ` wenzong fan
0 siblings, 0 replies; 11+ messages in thread
From: wenzong fan @ 2016-09-02 1:24 UTC (permalink / raw)
To: Joshua Lock,
'Patches and discussions about the oe-core layer',
seebs, Richard Purdie
On 08/31/2016 11:11 PM, Joshua Lock wrote:
> On Wed, 2016-08-31 at 17:21 +0800, wenzong fan wrote:
>> Hi Experts,
>>
>> While I trying to build Yocto in Docker Container which using dumb-
>> init
>> as init system, I found the build always be stopped at some point
>> and
>> the container was terminated as well with below errors:
>>
>> Child process timeout after 2 seconds.
>> Child process exit status 4: lock_held
>>
>> Sometimes there's not any obvious error message.
>>
>> After some `git bisect` testing, I believe the issue was started
>> since
>> commit:
>>
>> ----------------------
>> 9df3cdf42d8c1216682f497f0b166a43ef9f4184 is the first bad commit
>> commit 9df3cdf42d8c1216682f497f0b166a43ef9f4184
>> Author: Richard Purdie <richard.purdie@linuxfoundation.org>
>> Date: Tue Jul 5 13:18:31 2016 +0100
>>
>> pseudo: Upgrade to 1.8.1
>>
>> * Drop patches where the changes exist upstream
>> * Fetch from git as no tarball is available for 1.8.1
>> * Move common code to pseudo.inc
>> * Update patchset in git recipe
>>
>> (From OE-Core rev: 0c36984d4c501d12fa91cf7371511641585cc256)
>> -----------------------
>>
>> Finally I narrowed it down to pseudo commit:
>>
>> ------------------------
>> commit 77ee254a6c974aad9bcab2c58c9ee9e0880c9718
>> Author: Peter Seebach <peter.seebach@windriver.com>
>> Date: Tue Mar 1 16:21:15 2016 -0600
>>
>> Server launch reworking.
>>
>> This is the big overhaul to have the server provide meaningful
>> exit
>> status
>> to clients.
>>
>> In the process, I discovered that the server was running with
>> signals blocked
>> if launched by a client, which is not a good thing, and
>> prevented
>> this from
>> working as intended.
>>
>> Still looking to see why more than one server spawn seems to
>> happen.
>> ------------------------
>>
>> I also created a testcase for reproducing the issue at:
>>
>> https://github.com/WenzongFan/docker-build-yocto
>
> Thanks for providing a detailed reproducer. I'm trying to configure a
> container behind my proxy here.
>
>>
>> For dumb-init please refer to:
>>
>> https://github.com/Yelp/dumb-init
>>
>> Could anyone help to fix the signal handling in pseudo?
>
> It may not actually be pseudo at fault here. I've only skimmed the
> dumb-init README but it looks like there might be a strange interaction
> between the newly fixed signal handling in pseudo and dumb-init's
> signal handling.
>
> Should dumb-init be running in single-child/non-setsid mode so that
> signals are only forwarded to the direct child rather than all child
> processes in the dumb-init session? Is this a scenario you've tested?
Yes, I tried the options below, but none of them work:
1) Run dumb-init with the -c flag (single-child/non-setsid mode):
https://github.com/Yelp/dumb-init/issues/51
2) Update dumb-init to the latest version, v1.1.3 (the release notes mention
fixes for race conditions)
3) Switch to tini, which is an alternative to dumb-init:
https://github.com/krallin/tini
Thanks
Wenzong
>
> Regards,
>
> Joshua
>
>
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-08-31 15:48 ` Seebs
@ 2016-09-02 1:33 ` wenzong fan
2016-09-02 2:10 ` Seebs
0 siblings, 1 reply; 11+ messages in thread
From: wenzong fan @ 2016-09-02 1:33 UTC (permalink / raw)
To: Seebs; +Cc: Patches and discussions about the oe-core layer
On 08/31/2016 11:48 PM, Seebs wrote:
> On 31 Aug 2016, at 4:21, wenzong fan wrote:
>
>> Finally I narrowed it down to pseudo commit:
>
> Yes, that makes sense, we expect that there'd be potential issues, but I
> didn't have a reproducer for any. Thanks! I'll see whether it reproduces
> for me now. Any specific version of docker I might need?
No, I don't think it's related to any specific docker version.
I tested it on "Docker version 1.7.1, build 786b29d" & "Docker version
1.11.2, build b9f10c9".
BTW, I also tested the docker build w/o dumb-init, and the build works ...
Thanks
Wenzong
>
> -s
>
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-09-02 1:33 ` wenzong fan
@ 2016-09-02 2:10 ` Seebs
2016-09-07 6:32 ` wenzong fan
0 siblings, 1 reply; 11+ messages in thread
From: Seebs @ 2016-09-02 2:10 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer
On 1 Sep 2016, at 20:33, wenzong fan wrote:
> No, I didn't think it's related to any specific docker version.
>
> I tested it on "Docker version 1.7.1, build 786b29d" & "Docker version
> 1.11.2, build b9f10c9".
>
> BTW, I also tested the docker build w/o dumb-init, and the build works
> ...
Yeah, it's definitely specific in some way to docker.
However, it doesn't appear to be 100% reproducible; I just tried a build
with your reproducer and it completed without problems. (Unless the
problems are more subtle, and don't prevent a build.) So this one's
gonna be really fun to track down.
-s
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-09-02 2:10 ` Seebs
@ 2016-09-07 6:32 ` wenzong fan
2016-09-07 6:40 ` Seebs
0 siblings, 1 reply; 11+ messages in thread
From: wenzong fan @ 2016-09-07 6:32 UTC (permalink / raw)
To: Seebs, Patches and discussions about the oe-core layer
On 09/02/2016 10:10 AM, Seebs wrote:
> On 1 Sep 2016, at 20:33, wenzong fan wrote:
>
>> No, I didn't think it's related to any specific docker version.
>>
>> I tested it on "Docker version 1.7.1, build 786b29d" & "Docker version
>> 1.11.2, build b9f10c9".
>>
>> BTW, I also tested the docker build w/o dumb-init, and the build works
>> ...
>
> Yeah, it's definitely specific in some way to docker.
>
> However, it doesn't appear to be 100% reproducible; I just tried a build
> with your reproducer and it completed without problems. (Unless the
> problems are more subtle, and don't prevent a build.) So this one's
> gonna be really fun to track down.
Yes, I believe the issue is not 100% reproducible. Maybe you could run it
with other builds in parallel and try it 3 times or more.
It reproduces with high probability on my work host, which is a server shared
by several people; I always get the error within 1 to 3 builds.
Thanks
Wenzong
>
> -s
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-09-07 6:32 ` wenzong fan
@ 2016-09-07 6:40 ` Seebs
2016-09-14 20:46 ` Bystricky, Juro
0 siblings, 1 reply; 11+ messages in thread
From: Seebs @ 2016-09-07 6:40 UTC (permalink / raw)
To: wenzong fan; +Cc: Patches and discussions about the oe-core layer
On 7 Sep 2016, at 1:32, wenzong fan wrote:
> Yes, I believe it's not a 100 reproducible issue. Maybe you could run
> it with other builds in parallel and try it 3 times or more.
I can try, but that might need bigger hardware than I have to hand at
the moment.
-s
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-09-07 6:40 ` Seebs
@ 2016-09-14 20:46 ` Bystricky, Juro
2016-09-15 2:24 ` Randy MacLeod
0 siblings, 1 reply; 11+ messages in thread
From: Bystricky, Juro @ 2016-09-14 20:46 UTC (permalink / raw)
To: Seebs, Fan, Wenzong (Wind River)
Cc: Patches and discussions about the oe-core layer
I am pretty sure I glimpsed the messages:
Child process timeout after 2 seconds.
Child process exit status 4: lock_held
on several occasions recently, just before my Xserver was restarted and I was kicked back to the login prompt.
I typically ran several parallel bitbake builds. Ubuntu 16.04, not using container. The last message in the syslog (first error message) was always:
Fatal IO error 11 (Resource temporarily unavailable) on X server :0
Possibly not related to this problem, nevertheless worth mentioning.
Thanks
Juro
> -----Original Message-----
> From: openembedded-core-bounces@lists.openembedded.org
> [mailto:openembedded-core-bounces@lists.openembedded.org] On Behalf Of
> Seebs
> Sent: Tuesday, September 6, 2016 11:40 PM
> To: Fan, Wenzong (Wind River) <wenzong.fan@windriver.com>
> Cc: Patches and discussions about the oe-core layer <openembedded-
> core@lists.openembedded.org>
> Subject: Re: [OE-core] pseudo 1.8.1 doesn't work with docker & dumb-init
>
> On 7 Sep 2016, at 1:32, wenzong fan wrote:
>
> > Yes, I believe it's not a 100 reproducible issue. Maybe you could run
> > it with other builds in parallel and try it 3 times or more.
>
> I can try, but that might need bigger hardware than I have to hand at
> the moment.
>
> -s
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-09-14 20:46 ` Bystricky, Juro
@ 2016-09-15 2:24 ` Randy MacLeod
2016-09-15 19:08 ` Randy MacLeod
0 siblings, 1 reply; 11+ messages in thread
From: Randy MacLeod @ 2016-09-15 2:24 UTC (permalink / raw)
To: Bystricky, Juro, Seebs, Fan, Wenzong (Wind River)
Cc: Patches and discussions about the oe-core layer
On 2016-09-14 04:46 PM, Bystricky, Juro wrote:
> I am pretty sure I glimpsed the messages:
> Child process timeout after 2 seconds.
> Child process exit status 4: lock_held
> on several occasions recently, just before my Xserver was restarted and I was kicked back to the login prompt.
> I typically ran several parallel bitbake builds. Ubuntu 16.04, not using container. The last message in the syslog (first error message) was always:
> Fatal IO error 11 (Resource temporarily unavailable) on X server :0
>
> Possibly not related to this problem, nevertheless worth mentioning.
Yes, it may be. Thanks for reporting it.
Two weeks ago, I was building a qemuarm64 image on my laptop
(i7, 16 GB, SSD running Ubuntu-16.04)
and I saw a similarly bizarre result from running a build
in that chrome and then the X server were both killed.
I wasn't in front of the system when this happened so
I can't say exactly what was going on.
I did collect some of the logs from my IRC client and chrome:
[423679.028437] konversation[23416]: segfault at 7f72d2c33ce0 ip
00007f72eca4e818 sp 00007ffc7f450ae0 error 4 in
libQt5Gui.so.5.5.1[7f72ec8da000+527000]
[423679.325315] chrome[28083]: segfault at 968 ip 00007f63f7615643 sp
00007ffd26c25af0 error 4 in libX11.so.6.3.0[7f63f75ed000+135000]
and then from the X server:
Aug 29 16:11:59 laptop org.a11y.atspi.Registry[4763]: XIO: fatal IO
error 11 (Resource temporarily unavailable) on X server ":0"
Aug 29 16:11:59 laptop org.a11y.atspi.Registry[4763]: after 67649
requests (67649 known processed) with 0 events remaining.
Aug 29 16:11:59 laptop gnome-session[4748]: (diodon:4925): Gdk-WARNING
**: diodon: Fatal IO error 11 (Resource temporarily unavailable) on X
server :0
.
...
Aug 29 16:11:59 laptop systemd[1]: Started Process Core Dump (PID
28084/UID 0).
Aug 29 16:11:59 laptop gnome-session[4748]: Failed to connect to Mir:
Failed to connect to server socket: No such file or directory
Aug 29 16:11:59 laptop kernel: [423679.325315] chrome[28083]: segfault
at 968 ip 00007f63f7615643 sp 00007ffd26c25af0 error 4 in
libX11.so.6.3.0[7f63f75ed000+135000]
In my case, I had added meta-oe to oe-core and was building:
MACHINE=qemuarm64 bitbake imagemagick
I reproduced it once in X then did NOT see it happen when
I built with an X session running but building on the console.
i.e. the build of imagemagick for qemuarm64 succeeded.
It seems I've removed the build logs, but I'll see if I can reproduce
the failure overnight.
../Randy
>
> Thanks
>
> Juro
>
>
>> -----Original Message-----
>> From: openembedded-core-bounces@lists.openembedded.org
>> [mailto:openembedded-core-bounces@lists.openembedded.org] On Behalf Of
>> Seebs
>> Sent: Tuesday, September 6, 2016 11:40 PM
>> To: Fan, Wenzong (Wind River) <wenzong.fan@windriver.com>
>> Cc: Patches and discussions about the oe-core layer <openembedded-
>> core@lists.openembedded.org>
>> Subject: Re: [OE-core] pseudo 1.8.1 doesn't work with docker & dumb-init
>>
>> On 7 Sep 2016, at 1:32, wenzong fan wrote:
>>
>>> Yes, I believe it's not a 100 reproducible issue. Maybe you could run
>>> it with other builds in parallel and try it 3 times or more.
>>
>> I can try, but that might need bigger hardware than I have to hand at
>> the moment.
>>
>> -s
--
# Randy MacLeod. SMTS, Linux, Wind River
Direct: 613.963.1350 | 350 Terry Fox Drive, Suite 200, Ottawa, ON,
Canada, K2K 2W5
* Re: pseudo 1.8.1 doesn't work with docker & dumb-init
2016-09-15 2:24 ` Randy MacLeod
@ 2016-09-15 19:08 ` Randy MacLeod
0 siblings, 0 replies; 11+ messages in thread
From: Randy MacLeod @ 2016-09-15 19:08 UTC (permalink / raw)
To: Bystricky, Juro, Seebs, Fan, Wenzong (Wind River)
Cc: Patches and discussions about the oe-core layer
On 2016-09-14 10:24 PM, Randy MacLeod wrote:
> I'll see if I can reproduce
> the failure overnight.
The laptop build worked without error. I may try again tonight.
--
# Randy MacLeod. SMTS, Linux, Wind River
Direct: 613.963.1350 | 350 Terry Fox Drive, Suite 200, Ottawa, ON,
Canada, K2K 2W5
end of thread, other threads:[~2016-09-15 19:08 UTC | newest]
Thread overview: 11+ messages
2016-08-31 9:21 pseudo 1.8.1 doesn't work with docker & dumb-init wenzong fan
2016-08-31 15:11 ` Joshua Lock
2016-09-02 1:24 ` wenzong fan
2016-08-31 15:48 ` Seebs
2016-09-02 1:33 ` wenzong fan
2016-09-02 2:10 ` Seebs
2016-09-07 6:32 ` wenzong fan
2016-09-07 6:40 ` Seebs
2016-09-14 20:46 ` Bystricky, Juro
2016-09-15 2:24 ` Randy MacLeod
2016-09-15 19:08 ` Randy MacLeod