* Changes to job handling cause hangs in wait @ 2020-11-16 10:29 Andrej Shadura 2020-11-17 3:33 ` Herbert Xu 2020-12-01 5:38 ` Herbert Xu 0 siblings, 2 replies; 21+ messages in thread From: Andrej Shadura @ 2020-11-16 10:29 UTC (permalink / raw) To: dash; +Cc: Michael Biebl Hi, I’d like to forward a bug report I received. Michael, who reported it, was able to bisect it to 6c691b3, which was the first of the series of commits changing job handling. I must note that I had troubles debugging it: I was only able to reproduce the issue in an autopkgtest virtual machine, but not on my normal system nor in a fresh Debian unstable Docker container. I’m considering reverting some of the relevant commits, but it’d be best if we could track this down and get it fixed. The full details can be found at: https://bugs.debian.org/974705 Thanks. -- Cheers, Andrej ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-11-16 10:29 Changes to job handling cause hangs in wait Andrej Shadura @ 2020-11-17 3:33 ` Herbert Xu 2020-12-01 5:38 ` Herbert Xu 1 sibling, 0 replies; 21+ messages in thread From: Herbert Xu @ 2020-11-17 3:33 UTC (permalink / raw) To: Andrej Shadura; +Cc: dash, Michael Biebl On Mon, Nov 16, 2020 at 10:29:16AM +0000, Andrej Shadura wrote: > > I must note that I had troubles debugging it: I was only able to > reproduce the issue in an autopkgtest virtual machine, but not on my > normal system nor in a fresh Debian unstable Docker container. Since you can reproduce it, could you please run the offending script under strace -f and send me the result? That should pin-point the problem. Thanks, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-11-16 10:29 Changes to job handling cause hangs in wait Andrej Shadura 2020-11-17 3:33 ` Herbert Xu @ 2020-12-01 5:38 ` Herbert Xu 2020-12-01 5:42 ` Herbert Xu 1 sibling, 1 reply; 21+ messages in thread From: Herbert Xu @ 2020-12-01 5:38 UTC (permalink / raw) To: Andrej Shadura; +Cc: dash, Michael Biebl, 974705 On Mon, Nov 16, 2020 at 10:29:16AM +0000, Andrej Shadura wrote: > > I’d like to forward a bug report I received. Michael, who reported it, > was able to bisect it to 6c691b3, which was the first of the series of > commits changing job handling. > > I must note that I had troubles debugging it: I was only able to > reproduce the issue in an autopkgtest virtual machine, but not on my > normal system nor in a fresh Debian unstable Docker container. FWIW I'm unable to reproduce it with autopkgtest. This is what I get: root@test0:~# autopkgtest --test-name=timedated systemd-246.6/ -B -- lxc -s autopkgtest-sid autopkgtest [16:32:45]: starting date: 2020-12-01 autopkgtest [16:32:45]: version 5.15 autopkgtest [16:32:45]: host test0; command line: /usr/bin/autopkgtest --test-name=timedated systemd-246.6/ -B -- lxc -s autopkgtest-sid autopkgtest [16:33:13]: testbed dpkg architecture: amd64 autopkgtest [16:33:13]: testbed running kernel: Linux 5.9.0-rc1+ #18 SMP PREEMPT Sat Aug 29 23:45:30 AEST 2020 autopkgtest [16:33:13]: @@@@@@@@@@@@@@@@@@@@ unbuilt-tree systemd-246.6/ autopkgtest [16:33:18]: testing package systemd version 246.6-5 autopkgtest [16:33:18]: build not needed autopkgtest [16:33:18]: test timedated: preparing testbed Reading package lists... Done Building dependency tree Reading state information... Done Correcting dependencies...Starting pkgProblemResolver with broken count: 0 Starting 2 pkgProblemResolver with broken count: 0 Done Done Starting pkgProblemResolver with broken count: 0 Starting 2 pkgProblemResolver with broken count: 0 Done The following additional packages will be installed: acl libnss-systemd The following NEW packages will be installed: acl libnss-systemd 0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded. 1 not fully installed or removed. Need to get 255 kB of archives. After this operation, 577 kB of additional disk space will be used. Get:1 http://deb.debian.org/debian sid/main amd64 libnss-systemd amd64 246.6-5 [194 kB] Get:2 http://deb.debian.org/debian sid/main amd64 acl amd64 2.2.53-8 [60.8 kB] Fetched 255 kB in 1s (489 kB/s) debconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package libnss-systemd:amd64. (Reading database ... 12641 files and directories currently installed.) Preparing to unpack .../libnss-systemd_246.6-5_amd64.deb ... Unpacking libnss-systemd:amd64 (246.6-5) ... Selecting previously unselected package acl. Preparing to unpack .../acl_2.2.53-8_amd64.deb ... Unpacking acl (2.2.53-8) ... Setting up libnss-systemd:amd64 (246.6-5) ... First installation detected... Checking NSS setup... Setting up acl (2.2.53-8) ... Setting up autopkgtest-satdep (0) ... Processing triggers for libc-bin (2.31-4) ... (Reading database ... 12667 files and directories currently installed.) Removing autopkgtest-satdep (0) ... autopkgtest [16:33:29]: test timedated: [----------------------- original tz: Etc/UTC timedatectl works change timezone reset timezone to original no adjtime file UTC set in adjtime file non-zero values in adjtime file fourth line adjtime file no final newline in adjtime file only one line in adjtime file only one line in adjtime file, no final newline only two lines in adjtime file only two lines in adjtime file, no final newline unknown value in 3rd line of adjtime file disable NTP enable NTP re-disable NTP autopkgtest [16:33:32]: test timedated: -----------------------] autopkgtest [16:33:32]: test timedated: - - - - - - - - - - results - - - - - - - - - - timedated PASS autopkgtest [16:33:32]: @@@@@@@@@@@@@@@@@@@@ summary timedated PASS root@test0:~# Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 5:38 ` Herbert Xu @ 2020-12-01 5:42 ` Herbert Xu 2020-12-01 6:06 ` Herbert Xu 0 siblings, 1 reply; 21+ messages in thread From: Herbert Xu @ 2020-12-01 5:42 UTC (permalink / raw) To: Andrej Shadura; +Cc: dash, Michael Biebl, 974705 On Tue, Dec 01, 2020 at 04:38:37PM +1100, Herbert Xu wrote: > > FWIW I'm unable to reproduce it with autopkgtest. This is what > I get: > > root@test0:~# autopkgtest --test-name=timedated systemd-246.6/ -B -- lxc -s autopkgtest-sid > autopkgtest [16:32:45]: starting date: 2020-12-01 > autopkgtest [16:32:45]: version 5.15 > autopkgtest [16:32:45]: host test0; command line: /usr/bin/autopkgtest --test-name=timedated systemd-246.6/ -B -- lxc -s autopkgtest-sid > autopkgtest [16:33:13]: testbed dpkg architecture: amd64 > autopkgtest [16:33:13]: testbed running kernel: Linux 5.9.0-rc1+ #18 SMP PREEMPT Sat Aug 29 23:45:30 AEST 2020 > autopkgtest [16:33:13]: @@@@@@@@@@@@@@@@@@@@ unbuilt-tree systemd-246.6/ > autopkgtest [16:33:18]: testing package systemd version 246.6-5 Nevermind, I see that the script has been modified to use bash. I can reproduce the problem now so it's all good. Thanks, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 5:42 ` Herbert Xu @ 2020-12-01 6:06 ` Herbert Xu 2020-12-01 6:56 ` Herbert Xu 2020-12-01 10:14 ` Harald van Dijk 0 siblings, 2 replies; 21+ messages in thread From: Herbert Xu @ 2020-12-01 6:06 UTC (permalink / raw) To: Andrej Shadura; +Cc: dash, Michael Biebl, 974705 On Tue, Dec 01, 2020 at 04:42:03PM +1100, Herbert Xu wrote: > > Nevermind, I see that the script has been modified to use bash. > > I can reproduce the problem now so it's all good. OK the problem is this: sh -c 'sleep 1d& exec $MYSHELL -c "sleep 1& wait"' You can replace MYSHELL with whatever shell you want to use. Essentially dash will now wait for all children, even ones that were created prior to its existence, however, bash only waits for children that it created directly. FWIW ksh exhibits the same behaviour as dash and I think there is nothing wrong with this. So the problem is really in the parent of this shell, which appears to be bash: bash -c set -e; export USER=`id -nu`; . /etc/profile >/dev/null 2>&1 || true; . ~/.profile >/dev/null 2>&1 || true; buildtree="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree"; mkdir -p -m 1777 -- "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export AUTOPKGTEST_ARTIFACTS="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export ADT_ARTIFACTS="$AUTOPKGTEST_ARTIFACTS"; mkdir -p -m 755 "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export AUTOPKGTEST_TMP="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export ADTTMP="$AUTOPKGTEST_TMP"; export DEBIAN_FRONTEND=noninteractive; export LANG=C.UTF-8; export DEB_BUILD_OPTIONS=parallel=2; unset LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION LC_ALL;rm -f /tmp/autopkgtest_script_pid; set -C; echo $$ > /tmp/autopkgtest_script_pid; set +C; trap "rm -f /tmp/autopkgtest_script_pid" EXIT INT QUIT PIPE; cd "$buildtree"; export AUTOPKGTEST_NORMAL_USER=; export ADT_NORMAL_USER=; chmod +x /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated; touch /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr; /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated 2> >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr >&2) > >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout); For some reason this is causing the final two tee's to be created as children of debian/tests/timedated rather than the bash shell. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 6:06 ` Herbert Xu @ 2020-12-01 6:56 ` Herbert Xu 2020-12-01 23:21 ` Michael Biebl 2020-12-03 12:49 ` Michael Biebl 2020-12-01 10:14 ` Harald van Dijk 1 sibling, 2 replies; 21+ messages in thread From: Herbert Xu @ 2020-12-01 6:56 UTC (permalink / raw) To: Andrej Shadura; +Cc: dash, Michael Biebl, 974705 On Tue, Dec 01, 2020 at 05:06:18PM +1100, Herbert Xu wrote: > > For some reason this is causing the final two tee's to be created > as children of debian/tests/timedated rather than the bash shell. An alternative to changing the parent is of course to do wait $MONPID instead of wait I think this makes more sense anyway as otherwise someone could easily introduce a hang if they unwittingly add a backgrounded job to this script. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 6:56 ` Herbert Xu @ 2020-12-01 23:21 ` Michael Biebl 2020-12-01 23:26 ` Herbert Xu 2020-12-03 12:49 ` Michael Biebl 1 sibling, 1 reply; 21+ messages in thread From: Michael Biebl @ 2020-12-01 23:21 UTC (permalink / raw) To: Herbert Xu, Andrej Shadura; +Cc: dash, 974705 Am 01.12.20 um 07:56 schrieb Herbert Xu: > On Tue, Dec 01, 2020 at 05:06:18PM +1100, Herbert Xu wrote: >> >> For some reason this is causing the final two tee's to be created >> as children of debian/tests/timedated rather than the bash shell. > > An alternative to changing the parent is of course to do > > wait $MONPID > > instead of > > wait > > I think this makes more sense anyway as otherwise someone could > easily introduce a hang if they unwittingly add a backgrounded > job to this script. I'll update the timedated test accordingly. Incidentally we already have https://salsa.debian.org/systemd-team/systemd/-/merge_requests/110 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 23:21 ` Michael Biebl @ 2020-12-01 23:26 ` Herbert Xu 2020-12-02 5:31 ` [PATCH] jobs: Only block in waitcmd on first run Herbert Xu 2020-12-03 12:27 ` Changes to job handling cause hangs in wait Michael Biebl 0 siblings, 2 replies; 21+ messages in thread From: Herbert Xu @ 2020-12-01 23:26 UTC (permalink / raw) To: Michael Biebl; +Cc: Andrej Shadura, dash, 974705 On Wed, Dec 02, 2020 at 12:21:52AM +0100, Michael Biebl wrote: > > I'll update the timedated test accordingly. > Incidentally we already have > https://salsa.debian.org/systemd-team/systemd/-/merge_requests/110 Thanks. I will make dash not wait for existing children as this is indeed a change in behaviour which is undesirable. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] jobs: Only block in waitcmd on first run 2020-12-01 23:26 ` Herbert Xu @ 2020-12-02 5:31 ` Herbert Xu 2020-12-03 12:27 ` Changes to job handling cause hangs in wait Michael Biebl 1 sibling, 0 replies; 21+ messages in thread From: Herbert Xu @ 2020-12-02 5:31 UTC (permalink / raw) To: Michael Biebl; +Cc: Andrej Shadura, dash, 974705 This patch ensures that waitcmd never blocks unless there are outstanding jobs. This could otherwise trigger a hang if children were created prior to the shell coming into existence, or if there are backgrounded children of other kinds (e.g., a here- document). Fixes: 6c691b3e5099 ("jobs: Only clear gotsigchld when waiting...") Reported-by: Michael Biebl <biebl@debian.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> diff --git a/src/jobs.c b/src/jobs.c index 3417633..516786f 100644 --- a/src/jobs.c +++ b/src/jobs.c @@ -81,6 +81,7 @@ #define DOWAIT_NONBLOCK 0 #define DOWAIT_BLOCK 1 #define DOWAIT_WAITCMD 2 +#define DOWAIT_WAITCMD_ALL 4 /* array of jobs */ static struct job *jobtab; @@ -615,7 +616,7 @@ waitcmd(int argc, char **argv) jp->waited = 1; jp = jp->prev_job; } - if (!dowait(DOWAIT_WAITCMD, 0)) + if (!dowait(DOWAIT_WAITCMD_ALL, 0)) goto sigout; } } @@ -1138,6 +1139,7 @@ static int dowait(int block, struct job *jp) pid = waitone(block, jp); rpid &= !!pid; + block &= ~DOWAIT_WAITCMD_ALL; if (!pid || (jp && jp->state != JOBRUNNING)) block = DOWAIT_NONBLOCK; } while (pid >= 0); -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 23:26 ` Herbert Xu 2020-12-02 5:31 ` [PATCH] jobs: Only block in waitcmd on first run Herbert Xu @ 2020-12-03 12:27 ` Michael Biebl 2020-12-07 3:55 ` Herbert Xu 1 sibling, 1 reply; 21+ messages in thread From: Michael Biebl @ 2020-12-03 12:27 UTC (permalink / raw) To: Herbert Xu; +Cc: Andrej Shadura, dash, 974705 Am 02.12.20 um 00:26 schrieb Herbert Xu: > On Wed, Dec 02, 2020 at 12:21:52AM +0100, Michael Biebl wrote: >> >> I'll update the timedated test accordingly. >> Incidentally we already have >> https://salsa.debian.org/systemd-team/systemd/-/merge_requests/110 > > Thanks. Hm, so I applied this change and switched backed to dash, but now I get autopkgtest systemd --test-name=timedated -s -- lxc -s autopkgtest-sid ... disable NTP enable NTP Terminated autopkgtest [13:21:42]: test timedated: -----------------------] autopkgtest [13:21:42]: test timedated: - - - - - - - - - - results - - - - - - - - - - timedated FAIL non-zero exit status 143 Running with sh -x ... + kill 825 + wait 825 Terminated Herbert, any idea what going wrong there? (systemd 247.1-2 and dash 0.5.11+git20200708+dd9ef66-2) Michael ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-03 12:27 ` Changes to job handling cause hangs in wait Michael Biebl @ 2020-12-07 3:55 ` Herbert Xu 0 siblings, 0 replies; 21+ messages in thread From: Herbert Xu @ 2020-12-07 3:55 UTC (permalink / raw) To: Michael Biebl; +Cc: Andrej Shadura, dash, 974705 On Thu, Dec 03, 2020 at 01:27:51PM +0100, Michael Biebl wrote: > > Hm, so I applied this change and switched backed to dash, but now I get > > autopkgtest systemd --test-name=timedated -s -- lxc -s autopkgtest-sid > ... > disable NTP > enable NTP > Terminated > autopkgtest [13:21:42]: test timedated: -----------------------] > autopkgtest [13:21:42]: test timedated: - - - - - - - - - - results - - - - > - - - - - - > timedated FAIL non-zero exit status 143 > > > Running with sh -x > ... > + kill 825 > + wait 825 > Terminated > > > Herbert, any idea what going wrong there? > (systemd 247.1-2 and dash 0.5.11+git20200708+dd9ef66-2) Sorry, my bad. wait(1) with no arguments ignores the error status of the child and always return zero. wait(1) specifically on a child obviously returns the error status of that child. Since the child was killed, we need to ignore that, so this works for me: wait $MONPID 2> /dev/null || : Thanks, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 6:56 ` Herbert Xu 2020-12-01 23:21 ` Michael Biebl @ 2020-12-03 12:49 ` Michael Biebl 1 sibling, 0 replies; 21+ messages in thread From: Michael Biebl @ 2020-12-03 12:49 UTC (permalink / raw) To: Herbert Xu, Andrej Shadura; +Cc: dash, 974705 Am 01.12.20 um 07:56 schrieb Herbert Xu: > On Tue, Dec 01, 2020 at 05:06:18PM +1100, Herbert Xu wrote: >> >> For some reason this is causing the final two tee's to be created >> as children of debian/tests/timedated rather than the bash shell. > > An alternative to changing the parent is of course to do > > wait $MONPID > > instead of > > wait > > I think this makes more sense anyway as otherwise someone could > easily introduce a hang if they unwittingly add a backgrounded > job to this script. I applied this change, but now the script terminates with 143 [1] I'm happy for any suggestions to make the script more correct/robust and make it pass successfully. Michael [1] https://ci.debian.net/data/autopkgtest/testing/amd64/s/systemd/8600208/log.gz ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 6:06 ` Herbert Xu 2020-12-01 6:56 ` Herbert Xu @ 2020-12-01 10:14 ` Harald van Dijk 2020-12-01 10:34 ` Herbert Xu 1 sibling, 1 reply; 21+ messages in thread From: Harald van Dijk @ 2020-12-01 10:14 UTC (permalink / raw) To: Herbert Xu, Andrej Shadura; +Cc: dash, Michael Biebl, 974705 On 01/12/2020 06:06, Herbert Xu wrote: > On Tue, Dec 01, 2020 at 04:42:03PM +1100, Herbert Xu wrote: >> >> Nevermind, I see that the script has been modified to use bash. >> >> I can reproduce the problem now so it's all good. > > OK the problem is this: > > sh -c 'sleep 1d& exec $MYSHELL -c "sleep 1& wait"' > > You can replace MYSHELL with whatever shell you want to use. > > Essentially dash will now wait for all children, even ones that > were created prior to its existence, however, bash only waits for > children that it created directly. > > FWIW ksh exhibits the same behaviour as dash and I think there > is nothing wrong with this. POSIX says: "If the wait utility is invoked with no operands, it shall wait until all process IDs known to the invoking shell have terminated and exit with a zero exit status." I would say that child processes that were created before dash was started do not have process IDs known to dash. > So the problem is really in the parent of this shell, which appears > to be bash: > > bash -c set -e; export USER=`id -nu`; . /etc/profile >/dev/null 2>&1 || true; . ~/.profile >/dev/null 2>&1 || true; buildtree="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree"; mkdir -p -m 1777 -- "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export AUTOPKGTEST_ARTIFACTS="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export ADT_ARTIFACTS="$AUTOPKGTEST_ARTIFACTS"; mkdir -p -m 755 "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export AUTOPKGTEST_TMP="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export ADTTMP="$AUTOPKGTEST_TMP"; export DEBIAN_FRONTEND=noninteractive; export LANG=C.UTF-8; export DEB_BUILD_OPTIONS=parallel=2; unset LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION LC_ALL;rm -f /tmp/autopkgtest_script_pid; set -C; echo $$ > /tmp/autopkgtest_script_pid; set +C; trap "rm -f /tmp/autopkgtest_script_pid" EXIT INT QUIT PIPE; cd "$buildtree"; export AUTOPKGTEST_NORMAL_USER=; export ADT_NORMAL_USER=; chmod +x /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated; touch /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr; /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated 2> >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr >&2) > >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout); > > For some reason this is causing the final two tee's to be created > as children of debian/tests/timedated rather than the bash shell. This is because of the same optimisation that dash also has, where it tries to avoid creating a subshell for the last command in a list when it can just exec() without a fork() instead. A minimal example without an explicit exec is bash -c 'dash -c ": & wait" <(sleep 1d)' Cheers, Harald van Dijk ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:14 ` Harald van Dijk @ 2020-12-01 10:34 ` Herbert Xu 2020-12-01 10:50 ` Harald van Dijk 0 siblings, 1 reply; 21+ messages in thread From: Herbert Xu @ 2020-12-01 10:34 UTC (permalink / raw) To: Harald van Dijk; +Cc: Andrej Shadura, dash, Michael Biebl, 974705 On Tue, Dec 01, 2020 at 10:14:04AM +0000, Harald van Dijk wrote: > > POSIX says: > > "If the wait utility is invoked with no operands, it shall wait until > all process IDs known to the invoking shell have terminated and exit with a > zero exit status." > > I would say that child processes that were created before dash was started > do not have process IDs known to dash. Well I disagree and the fact that the original ksh does the same thing is an important fact. > > bash -c set -e; export USER=`id -nu`; . /etc/profile >/dev/null 2>&1 || true; . ~/.profile >/dev/null 2>&1 || true; buildtree="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree"; mkdir -p -m 1777 -- "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export AUTOPKGTEST_ARTIFACTS="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export ADT_ARTIFACTS="$AUTOPKGTEST_ARTIFACTS"; mkdir -p -m 755 "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export AUTOPKGTEST_TMP="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export ADTTMP="$AUTOPKGTEST_TMP"; export DEBIAN_FRONTEND=noninteractive; export LANG=C.UTF-8; export DEB_BUILD_OPTIONS=parallel=2; unset LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION LC_ALL;rm -f /tmp/autopkgtest_script_pid; set -C; echo $$ > /tmp/autopkgtest_script_pid; set +C; trap "rm -f /tmp/autopkgtest_script_pid" EXIT INT Q UIT PIPE; cd "$buildtree"; export AUTOPKGTEST_NORMAL_USER=; export ADT_NORMAL_USER=; chmod +x /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated; touch /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr; /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated 2> >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr >&2) > >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout); > > > > For some reason this is causing the final two tee's to be created > > as children of debian/tests/timedated rather than the bash shell. > > This is because of the same optimisation that dash also has, where it tries > to avoid creating a subshell for the last command in a list when it can just > exec() without a fork() instead. A minimal example without an explicit exec > is > > bash -c 'dash -c ": & wait" <(sleep 1d)' I'm not sure about that because bash itself is still hanging around, if it were really the -c optimisation then bash should not appear in the ps output at all. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:34 ` Herbert Xu @ 2020-12-01 10:50 ` Harald van Dijk 2020-12-01 10:53 ` Herbert Xu 0 siblings, 1 reply; 21+ messages in thread From: Harald van Dijk @ 2020-12-01 10:50 UTC (permalink / raw) To: Herbert Xu; +Cc: Andrej Shadura, dash, Michael Biebl, 974705 On 01/12/2020 10:34, Herbert Xu wrote: > On Tue, Dec 01, 2020 at 10:14:04AM +0000, Harald van Dijk wrote: >> >> POSIX says: >> >> "If the wait utility is invoked with no operands, it shall wait until >> all process IDs known to the invoking shell have terminated and exit with a >> zero exit status." >> >> I would say that child processes that were created before dash was started >> do not have process IDs known to dash. > > Well I disagree Then dash still has a bug regardless: bash -c 'dash -c "wait" <(sleep 1d)' Here, dash does not wait for the child process. The sleep process is either known to dash, in which case dash must wait in both examples, or not known to dash, in which case dash must wait in neither example. > and the fact that the original ksh does the same > thing is an important fact. The original ksh is well-known not to conform to the current POSIX requirements, in some cases because POSIX changed requirements, in some cases because ksh's behaviour was clearly buggy and not worth standardising. >>> bash -c set -e; export USER=`id -nu`; . /etc/profile >/dev/null 2>&1 || true; . ~/.profile >/dev/null 2>&1 || true; buildtree="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree"; mkdir -p -m 1777 -- "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export AUTOPKGTEST_ARTIFACTS="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-artifacts"; export ADT_ARTIFACTS="$AUTOPKGTEST_ARTIFACTS"; mkdir -p -m 755 "/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export AUTOPKGTEST_TMP="/tmp/autopkgtest-lxc.is4n6xxr/downtmp/autopkgtest_tmp"; export ADTTMP="$AUTOPKGTEST_TMP"; export DEBIAN_FRONTEND=noninteractive; export LANG=C.UTF-8; export DEB_BUILD_OPTIONS=parallel=2; unset LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION LC_ALL;rm -f /tmp/autopkgtest_script_pid; set -C; echo $$ > /tmp/autopkgtest_script_pid; set +C; trap "rm -f /tmp/autopkgtest_script_pid" EXIT INT QUIT PIPE; cd "$buildtree"; export AUTOPKGTEST_NORMAL_USER=; export ADT_NORMAL_USER=; chmod +x /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated; touch /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr; /tmp/autopkgtest-lxc.is4n6xxr/downtmp/build.f2G/real-tree/debian/tests/timedated 2> >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stderr >&2) > >(tee -a /tmp/autopkgtest-lxc.is4n6xxr/downtmp/timedated-stdout); >>> >>> For some reason this is causing the final two tee's to be created >>> as children of debian/tests/timedated rather than the bash shell. >> >> This is because of the same optimisation that dash also has, where it tries >> to avoid creating a subshell for the last command in a list when it can just >> exec() without a fork() instead. A minimal example without an explicit exec >> is >> >> bash -c 'dash -c ": & wait" <(sleep 1d)' > > I'm not sure about that because bash itself is still hanging around, > if it were really the -c optimisation then bash should not appear in > the ps output at all. Good point, it's a bit more complicated. Can take a more in-depth look at that example later. Here's a simpler example without bash at all, then, where it is far clearer that the -c optimisation is responsible: dash -c 'sleep 1d & dash -c ": & wait"' This used to exit immediately, leaving the sleep process running in the background without waiting for it. On the dash that's currently provided by Ubuntu, based on 0.5.10.2, it still behaves that way. With current dash from Git, it does not. This is clearly a change in behaviour in dash not in response to any bug (real or not) in bash. Cheers, Harald van Dijk ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:50 ` Harald van Dijk @ 2020-12-01 10:53 ` Herbert Xu 2020-12-01 10:55 ` Harald van Dijk 2020-12-01 13:04 ` Michael Biebl 0 siblings, 2 replies; 21+ messages in thread From: Herbert Xu @ 2020-12-01 10:53 UTC (permalink / raw) To: Harald van Dijk; +Cc: Andrej Shadura, dash, Michael Biebl, 974705 On Tue, Dec 01, 2020 at 10:50:19AM +0000, Harald van Dijk wrote: > > This used to exit immediately, leaving the sleep process running in the > background without waiting for it. On the dash that's currently provided by > Ubuntu, based on 0.5.10.2, it still behaves that way. With current dash from > Git, it does not. This is clearly a change in behaviour in dash not in > response to any bug (real or not) in bash. I'm not suggesting it's a bug in bash. If anything it's a bug in the script. You should never do a naked wait unless you are sure that there are no other children around that you don't know of. In the original bug, the proper solution is to wait on the PID that the script just sent a kill signal to. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:53 ` Herbert Xu @ 2020-12-01 10:55 ` Harald van Dijk 2020-12-01 10:56 ` Herbert Xu 2020-12-01 13:04 ` Michael Biebl 1 sibling, 1 reply; 21+ messages in thread From: Harald van Dijk @ 2020-12-01 10:55 UTC (permalink / raw) To: Herbert Xu; +Cc: Andrej Shadura, dash, Michael Biebl, 974705 On 01/12/2020 10:53, Herbert Xu wrote: > On Tue, Dec 01, 2020 at 10:50:19AM +0000, Harald van Dijk wrote: >> >> This used to exit immediately, leaving the sleep process running in the >> background without waiting for it. On the dash that's currently provided by >> Ubuntu, based on 0.5.10.2, it still behaves that way. With current dash from >> Git, it does not. This is clearly a change in behaviour in dash not in >> response to any bug (real or not) in bash. > > I'm not suggesting it's a bug in bash. You wrote: "So the problem is really in the parent of this shell, which appears to be bash:" Cheers, Harald van Dijk ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:55 ` Harald van Dijk @ 2020-12-01 10:56 ` Herbert Xu 2020-12-01 10:59 ` Harald van Dijk 0 siblings, 1 reply; 21+ messages in thread From: Herbert Xu @ 2020-12-01 10:56 UTC (permalink / raw) To: Harald van Dijk; +Cc: Andrej Shadura, dash, Michael Biebl, 974705 On Tue, Dec 01, 2020 at 10:55:06AM +0000, Harald van Dijk wrote: > > You wrote: "So the problem is really in the parent of this shell, which > appears to be bash:" You should read my follow-up email too that suggested changing the systemd script. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:56 ` Herbert Xu @ 2020-12-01 10:59 ` Harald van Dijk 2020-12-01 11:01 ` Herbert Xu 0 siblings, 1 reply; 21+ messages in thread From: Harald van Dijk @ 2020-12-01 10:59 UTC (permalink / raw) To: Herbert Xu; +Cc: Andrej Shadura, dash, Michael Biebl, 974705 On 01/12/2020 10:56, Herbert Xu wrote: > On Tue, Dec 01, 2020 at 10:55:06AM +0000, Harald van Dijk wrote: >> >> You wrote: "So the problem is really in the parent of this shell, which >> appears to be bash:" > > You should read my follow-up email too that suggested changing > the systemd script. I do not appreciate your false accusation that I did not read your follow-up email. You suggested that as an alternative, without retracting your claim that this is a problem in bash. Cheers, Harald van Dijk ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:59 ` Harald van Dijk @ 2020-12-01 11:01 ` Herbert Xu 0 siblings, 0 replies; 21+ messages in thread From: Herbert Xu @ 2020-12-01 11:01 UTC (permalink / raw) To: Harald van Dijk; +Cc: Andrej Shadura, dash, Michael Biebl, 974705 On Tue, Dec 01, 2020 at 10:59:20AM +0000, Harald van Dijk wrote: > > I do not appreciate your false accusation that I did not read your follow-up > email. You suggested that as an alternative, without retracting your claim > that this is a problem in bash. Well I'm not accusing you of anything, I was just trying to make sure you are aware of my complete message. In any case, I never said that this is a bug in bash, merely that bash is involved in creating a situation where dash started out with children in the background. If anything this could just be an artefact of the bash script rather than bash itself. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Changes to job handling cause hangs in wait 2020-12-01 10:53 ` Herbert Xu 2020-12-01 10:55 ` Harald van Dijk @ 2020-12-01 13:04 ` Michael Biebl 1 sibling, 0 replies; 21+ messages in thread From: Michael Biebl @ 2020-12-01 13:04 UTC (permalink / raw) To: Herbert Xu, Harald van Dijk; +Cc: Andrej Shadura, dash, 974705 Am 01.12.20 um 11:53 schrieb Herbert Xu: > On Tue, Dec 01, 2020 at 10:50:19AM +0000, Harald van Dijk wrote: >> >> This used to exit immediately, leaving the sleep process running in the >> background without waiting for it. On the dash that's currently provided by >> Ubuntu, based on 0.5.10.2, it still behaves that way. With current dash from >> Git, it does not. This is clearly a change in behaviour in dash not in >> response to any bug (real or not) in bash. > > I'm not suggesting it's a bug in bash. If anything it's a bug > in the script. Keep in mind, that the timedated script is not the only affected script, there is at least src:fence-agents as well and there might be other packages affected as well as local scripts which expect /bin/sh to behave POSIX compliant in that regard. If this change in behaviour is kept, I would at least expect a big fat NEWS entry in the Debian dash package which is shown on upgrades and it should probably also update the package description which currently still says that it is a POSIX-compliant shell. Regards, Michael ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2020-12-07 3:56 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-11-16 10:29 Changes to job handling cause hangs in wait Andrej Shadura 2020-11-17 3:33 ` Herbert Xu 2020-12-01 5:38 ` Herbert Xu 2020-12-01 5:42 ` Herbert Xu 2020-12-01 6:06 ` Herbert Xu 2020-12-01 6:56 ` Herbert Xu 2020-12-01 23:21 ` Michael Biebl 2020-12-01 23:26 ` Herbert Xu 2020-12-02 5:31 ` [PATCH] jobs: Only block in waitcmd on first run Herbert Xu 2020-12-03 12:27 ` Changes to job handling cause hangs in wait Michael Biebl 2020-12-07 3:55 ` Herbert Xu 2020-12-03 12:49 ` Michael Biebl 2020-12-01 10:14 ` Harald van Dijk 2020-12-01 10:34 ` Herbert Xu 2020-12-01 10:50 ` Harald van Dijk 2020-12-01 10:53 ` Herbert Xu 2020-12-01 10:55 ` Harald van Dijk 2020-12-01 10:56 ` Herbert Xu 2020-12-01 10:59 ` Harald van Dijk 2020-12-01 11:01 ` Herbert Xu 2020-12-01 13:04 ` Michael Biebl
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).