From: Petr Vorel <pvorel@suse.cz>
To: Cyril Hrubis <chrubis@suse.cz>
Cc: ltp@lists.linux.it
Subject: Re: [LTP] [PATCH] [RFC] lib: shell: Fix timeout process races
Date: Mon, 20 Sep 2021 09:52:57 +0200
Message-ID: <YUg92YO7x1wQO/qD@pevik>
In-Reply-To: <20210917141719.5739-1-chrubis@suse.cz>

Hi Cyril,

Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>

> There were actually several races in the shell library timeout handling.

> This commit hopefully fixes all of them by:

> * Reimplementing the background timer in C
+1

> * Making sure that the timer has started before we leave test setup
+1

> The rewrite of the background timer to C allows us to run all the timeout
> logic in a single process, which simplifies the whole problem greatly
> since previously we had a chain of processes that established signal
> handlers to kill their descendants, which in the end had a few races in
> it.

> The race that caused the problems is, as far as I can tell, in the way
> the shell spawns its children. I haven't checked the shell code, but I
> guess that when the shell runs a process in the background it does
> vfork() + exec(), and signals are ignored during that operation. If the
> SIGTERM arrives at that point it gets lost.

> That means that we created a race window in the shell library each time
> we started a new process. The rewrite to C simplifies the code but we
> still end up with a single place where this can happen and that is when
> we execute the tst_timeout_kill binary. This is now fixed in the shell
> library by waiting until the background process gets to a sleep state,
> which means that the process has been executed and is waiting for the
> timeout.
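To illustrate the "wait until the background process gets to a sleep
state" step, here is a rough sketch of how such a wait could look in
shell. It is not the code from the patch: wait_for_sleep is a
hypothetical helper name, it assumes a Linux /proc filesystem, and it
assumes that sleep accepts fractional seconds (true for GNU coreutils
and busybox, not guaranteed by POSIX). A plain "sleep 5" stands in for
the timer binary.

```shell
#!/bin/sh
# Hypothetical helper (not the actual tst_test.sh code): poll until the
# process with PID $1 is in interruptible sleep ('S'), at most $2 times.
# Assumes Linux /proc.
wait_for_sleep()
{
	pid="$1"
	tries="${2:-100}"

	while [ "$tries" -gt 0 ]; do
		# /proc/PID/stat looks like "pid (comm) state ...".
		# Bail out if the process is already gone.
		stat=$(cat "/proc/$pid/stat" 2>/dev/null) || return 1
		# Strip everything up to the last ") " so that a comm
		# containing spaces or parentheses cannot confuse us.
		state=${stat##*\) }
		state=${state%% *}
		[ "$state" = "S" ] && return 0
		# Fractional sleep is an assumption, see above.
		sleep 0.1
		tries=$((tries - 1))
	done

	return 1
}

# Usage: start a stand-in background timer and wait until it sleeps.
sleep 5 &
timer_pid=$!
if wait_for_sleep "$timer_pid"; then
	echo "timer is sleeping"
fi
kill "$timer_pid"
```

Once the state reads 'S' the exec() has happened and the process sits
in its sleep, so a SIGTERM sent after that point should no longer hit
the window described above.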

> After these fixes I haven't been able to reproduce the hang with:

> cat > debug.sh <<EOF
> #!/bin/sh

> TST_SETUP="setup"
> TST_TESTFUNC="do_test"
> . tst_test.sh

> setup()
> {
>         tst_brk TCONF "quit now!"
> }

> do_test()
> {
>         tst_res TPASS "pass :)"
> }

> tst_run
> EOF

> # while true; do ./debug.sh; done
I verified it's OK on both VMs that were previously affected.

After release I might write a test for tst_timeout_kill.c.
Thanks for fixing it!

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp
