From: Cyril Hrubis <chrubis@suse.cz>
To: Joerg Vehlow <lkml@jv-coder.de>
Cc: ltp@lists.linux.it
Subject: Re: [LTP] [PATCH] [RFC] lib: shell: Fix timeout process races
Date: Mon, 20 Sep 2021 09:36:36 +0200
Message-ID: <YUg6BM39OKgI5Ovl@yuki>
In-Reply-To: <01fb50a1-8edb-77ac-fba4-b70965022b3f@jv-coder.de>

Hi!
> > There were actually several races in the shell library timeout handling.
> >
> > This commit fixes hopefully all of them by:
> >
> > * Reimplementing the background timer in C
> I did that once, but at that point it was kinda rejected ;)
> See https://lists.linux.it/pipermail/ltp/2021-May/022445.html
> and https://lists.linux.it/pipermail/ltp/2021-May/022453.html

I guess we found out the hard way that it's impossible to write
race-free timeouts in shell.

> > +	tst_timeout_kill $sec $pid &
> >   
> >   	_tst_setup_timer_pid=$!
> > +
> > +	while true; do
> > +		local state
> > +
> > +		state=$(cut -d' ' -f3 "/proc/$_tst_setup_timer_pid/stat")
> Hmm maybe we could use the checkpoint api here instead. Wouldn't this be 
> more portable?

Well, there are other tests in LTP that already depend on
/proc/$PID/stat, and I do not remember any bug reports about it being
broken or unavailable. Linux generally does not work without /proc/
being mounted anyway.
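
For illustration only, here is a minimal C sketch of the same state
check that the shell loop does with cut. It is not part of the patch;
the helper name proc_state() is hypothetical. One difference from
cut -d' ' -f3 is that it looks for the last ')' first, because the
comm field can itself contain spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not an LTP API.
 *
 * Returns the one-letter process state (e.g. 'R', 'S', 'Z') read from
 * /proc/<pid>/stat, or 0 if the file cannot be read or parsed.
 */
static char proc_state(pid_t pid)
{
	char path[64], buf[512];
	FILE *f;
	char *p;

	assert(pid > 0);

	snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);

	f = fopen(path, "r");
	if (!f)
		return 0;

	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return 0;
	}
	fclose(f);

	/* The comm field is "(name)" and the name may contain spaces. */
	p = strrchr(buf, ')');
	if (!p || p[1] != ' ' || !p[2])
		return 0;

	/* The state letter follows ") " directly. */
	return p[2];
}
```

A caller would poll proc_state(timer_pid) until it returns 'S', which
is what the shell loop in the patch waits for.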

I wanted to avoid checkpoints because they require shared memory, which
is more complex to set up, and if we add it here it would be set up by
every shell process. Maybe that's not that bad, though. Maybe we should
set up the piece of shared memory as we do in C and use it for the
result counter as well. In any case, that is orthogonal to the fixes
here and could be done later on.
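
To make the shared memory idea a bit more concrete, below is a rough
sketch of a file-backed result counter that several processes could
map. This is only an assumption about how such a thing could look, not
the LTP implementation: struct results, results_map() and
results_add_pass() are made-up names, and the __atomic builtin assumes
GCC or Clang.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical layout of the shared result counters. */
struct results {
	int passed;
	int failed;
};

/*
 * Maps the counters stored in the file at path, creating the file if
 * needed. Every process that maps the same path sees the same
 * counters. Returns NULL on failure.
 */
static struct results *results_map(const char *path)
{
	struct results *res;
	int fd;

	assert(path != NULL);

	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0)
		return NULL;

	/* A new file is zero-filled, an existing one keeps its data. */
	if (ftruncate(fd, sizeof(*res))) {
		close(fd);
		return NULL;
	}

	res = mmap(NULL, sizeof(*res), PROT_READ | PROT_WRITE,
		   MAP_SHARED, fd, 0);
	/* The mapping stays valid after the descriptor is closed. */
	close(fd);

	return res == MAP_FAILED ? NULL : res;
}

/* Atomically counts one passed result. */
static void results_add_pass(struct results *res)
{
	__atomic_add_fetch(&res->passed, 1, __ATOMIC_SEQ_CST);
}
```

The cost mentioned above is visible here: every shell process would
have to go through a helper binary that maps the file before it can
touch the counters.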

> > +
> > +		if [ "$state" = "S" ]; then
> > +			break;
> > +		fi
> > +
> > +		tst_sleep 1ms
> > +	done
> >   }
> >   
> >   tst_require_root()
> > diff --git a/testcases/lib/tst_timeout_kill.c b/testcases/lib/tst_timeout_kill.c
> > new file mode 100644
> > index 000000000..6e97514f1
> > --- /dev/null
> > +++ b/testcases/lib/tst_timeout_kill.c
> > @@ -0,0 +1,93 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Copyright (c) 2021 Cyril Hrubis <chrubis@suse.cz>
> > + */
> > +
> > +#include <stdio.h>
> > +#include <signal.h>
> > +#include <stdlib.h>
> > +#include <unistd.h>
> > +#include <tst_res_flags.h>
> > +#include <tst_ansi_color.h>
> > +
> > +static void print_help(const char *name)
> > +{
> > +	fprintf(stderr, "usage: %s timeout pid\n", name);
> > +}
> > +
> > +static void print_msg(int type, const char *msg)
> > +{
> > +	const char *strtype = "";
> > +
> > +	switch (type) {
> > +	case TBROK:
> > +		strtype = "TBROK";
> > +	break;
> > +	case TINFO:
> > +		strtype = "TINFO";
> > +	break;
> > +	}
> > +
> > +	if (tst_color_enabled(STDERR_FILENO)) {
> > +		fprintf(stderr, "%s%s: %s%s\n", tst_ttype2color(type),
> > +			strtype, ANSI_COLOR_RESET, msg);
> > +	} else {
> > +		fprintf(stderr, "%s: %s\n", strtype, msg);
> > +	}
> > +}
> Shouldn't this be reused from the library instead of being reimplemented?

I wanted to avoid calling the tst_res() functions to keep things simple,
but I guess that there is no reason not to use it.

> > +
> > +int main(int argc, char *argv[])
> > +{
> > +	int timeout, pid, ret, i;
> > +
> > +	if (argc != 3) {
> > +		print_help(argv[0]);
> > +		return 1;
> > +	}
> > +
> > +	timeout = atoi(argv[1]);
> > +	pid = atoi(argv[2]);
> > +
> > +	if (timeout < 0) {
> > +		fprintf(stderr, "Invalid timeout '%s'", argv[1]);
> > +		print_help(argv[0]);
> > +		return 1;
> > +	}
> > +
> > +	if (pid <= 1) {
> > +		fprintf(stderr, "Invalid pid '%s'", argv[2]);
> > +		print_help(argv[0]);
> > +		return 1;
> > +	}
> > +
> > +	if (timeout)
> > +		sleep(timeout);
> > +
> > +	print_msg(TBROK, "Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1");
> > +
> > +	ret = kill(-pid, SIGTERM);
> > +	if (ret) {
> > +		print_msg(TBROK, "kill() failed");
> > +		return 1;
> > +	}
> > +
> > +	usleep(100000);
> > +
> > +	i = 10;
> > +
> > +	while (!kill(-pid, 0) && i-- > 0) {
> This was kill(pid, 0) in the original shell code. I am not sure if it is 
> important

I think it is better this way, since the loop should keep running as
long as at least one process in the process group is alive.
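
As a small sketch of the difference: kill() with a negative pid and
signal 0 probes the whole process group and succeeds as long as any
member still exists, including unreaped zombies. The helper names below
are hypothetical. Unlike the patch, the sketch treats EPERM as "still
alive", which is my own choice for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not an LTP API.
 *
 * Returns 1 if the process group pgid still has at least one member,
 * 0 if the group is gone.
 */
static int pgrp_alive(pid_t pgid)
{
	assert(pgid > 1);

	/* Signal 0 sends nothing and only checks for existence. */
	if (kill(-pgid, 0) == 0)
		return 1;

	/* EPERM means members exist but we may not signal them. */
	return errno == EPERM;
}

/*
 * Polls until the group is gone or tries runs out, sleeping
 * interval_us microseconds (must be below 1000000) between probes.
 * Returns 1 if the group is gone, 0 if it is still alive.
 */
static int pgrp_wait_gone(pid_t pgid, int tries, unsigned int interval_us)
{
	while (pgrp_alive(pgid) && tries-- > 0)
		usleep(interval_us);

	return !pgrp_alive(pgid);
}
```

With kill(pid, 0) the loop would stop as soon as the group leader
exits, even if its children are still running.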

> > +		print_msg(TINFO, "Test is still running...");
> > +		sleep(1);
> > +	}
> > +
> > +	if (!kill(-pid, 0)) {
> Here as well
> > +		print_msg(TBROK, "Test is still running, sending SIGKILL");
> > +		ret = kill(-pid, SIGKILL);
> > +		if (ret) {
> > +			print_msg(TBROK, "kill() failed");
> > +			return 1;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> 
> Joerg

-- 
Cyril Hrubis
chrubis@suse.cz
