* [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S
@ 2019-11-05 11:57 Jan Stancek
2019-11-06 9:59 ` Jan Stancek
0 siblings, 1 reply; 6+ messages in thread
From: Jan Stancek @ 2019-11-05 11:57 UTC (permalink / raw)
To: ltp
There can be a gap between task state update and process being scheduled
off from runqueue. For example futex_wait_queue_me() updates task state
before it queues the futex:
static void futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q *q,
struct hrtimer_sleeper *timeout)
{
/*
* The task state is guaranteed to be set before another task can
* wake it. set_current_state() is implemented using smp_store_mb() and
* queue_me() calls spin_unlock() upon completion, both serializing
* access to the hash list and forcing another memory barrier.
*/
set_current_state(TASK_INTERRUPTIBLE);
queue_me(q, hb);
This can lead to test assuming, that child process is already queued on
futex and try to requeue futexes early (example below is modified to use
2 children: 1 wake, 1 requeue):
futex_cmp_requeue01.c:107: FAIL: futex_cmp_requeue() requeues 0 waiters, expected 1
In addition to tst_process_state_wait() waiting on 'S' state, also wait
for /proc/pid/schedstats' sum_exec_runtime and run_delay to settle:
1) time spent on the cpu
2) time spent waiting on a runqueue
Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
lib/tst_process_state.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 53 insertions(+), 2 deletions(-)
diff --git a/lib/tst_process_state.c b/lib/tst_process_state.c
index 7a7824959d72..a24b49092ef2 100644
--- a/lib/tst_process_state.c
+++ b/lib/tst_process_state.c
@@ -28,6 +28,51 @@
#include "test.h"
#include "tst_process_state.h"
+#define SETTLE_SLEEP_US 20000
+#define MAX_SETTLE_SLEEP_US (5 * SETTLE_SLEEP_US)
+
+/*
+ * When schedstats settle we have more confidence process ('S') has
+ * been scheduled off from runqueue.
+ */
+static void schedstat_settle(const char *file, const int lineno, pid_t pid)
+{
+ char ss_path[128];
+ int sleep = 0;
+
+ snprintf(ss_path, sizeof(ss_path), "/proc/%i/schedstat", pid);
+ if (access(ss_path, F_OK)) {
+ /* CONFIG_SCHED_INFO is likely disabled */
+ usleep(MAX_SETTLE_SLEEP_US);
+ return;
+ }
+
+ for (;;) {
+ unsigned long long sum_exec, run_delay, sum_exec2, run_delay2;
+ int ret = 0;
+
+ ret |= file_scanf(file, lineno, ss_path, "%llu %llu",
+ &sum_exec, &run_delay);
+ usleep(SETTLE_SLEEP_US);
+ sleep += SETTLE_SLEEP_US;
+ ret |= file_scanf(file, lineno, ss_path, "%llu %llu",
+ &sum_exec2, &run_delay2);
+
+ if (ret) {
+ tst_resm(TWARN, "Error parsing %s", ss_path);
+ return;
+ }
+
+ if (sum_exec == sum_exec2 && run_delay == run_delay2)
+ return;
+
+ if (sleep >= MAX_SETTLE_SLEEP_US) {
+ tst_resm(TWARN, "MAX_SETTLE_SLEEP_US reached");
+ return;
+ }
+ }
+}
+
void tst_process_state_wait(const char *file, const int lineno,
void (*cleanup_fn)(void),
pid_t pid, const char state)
@@ -40,8 +85,11 @@ void tst_process_state_wait(const char *file, const int lineno,
safe_file_scanf(file, lineno, cleanup_fn, proc_path,
"%*i %*s %c", &cur_state);
- if (state == cur_state)
+ if (state == cur_state) {
+ if (state == 'S')
+ schedstat_settle(file, lineno, pid);
return;
+ }
usleep(10000);
}
@@ -69,8 +117,11 @@ int tst_process_state_wait2(pid_t pid, const char state)
}
fclose(f);
- if (state == cur_state)
+ if (state == cur_state) {
+ if (state == 'S')
+ schedstat_settle(__FILE__, __LINE__, pid);
return 0;
+ }
usleep(10000);
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S
2019-11-05 11:57 [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S Jan Stancek
@ 2019-11-06 9:59 ` Jan Stancek
2019-11-07 11:39 ` Cyril Hrubis
2019-11-07 12:15 ` Cyril Hrubis
0 siblings, 2 replies; 6+ messages in thread
From: Jan Stancek @ 2019-11-06 9:59 UTC (permalink / raw)
To: ltp
----- Original Message -----
> There can be a gap between task state update and process being scheduled
> off from runqueue. For example futex_wait_queue_me() updates task state
> before it queues the futex:
>
> static void futex_wait_queue_me(struct futex_hash_bucket *hb, struct
> futex_q *q,
> struct hrtimer_sleeper *timeout)
> {
> /*
> * The task state is guaranteed to be set before another task can
> * wake it. set_current_state() is implemented using smp_store_mb() and
> * queue_me() calls spin_unlock() upon completion, both serializing
> * access to the hash list and forcing another memory barrier.
> */
> set_current_state(TASK_INTERRUPTIBLE);
> queue_me(q, hb);
hb->lock is locked at this point, and requeue takes it too, so I'm not
sure what makes it fail. I've seen testcase fail in at least
2 different ways now. Here's the other one:
tst_test.c:1118: INFO: Timeout per run is 0h 05m 00s
futex_cmp_requeue01.c:106: PASS: futex_cmp_requeue() woke up 3 waiters and requeued 7 waiters
futex_cmp_requeue01.c:106: PASS: futex_cmp_requeue() woke up 0 waiters and requeued 10 waiters
futex_cmp_requeue01.c:106: PASS: futex_cmp_requeue() woke up 2 waiters and requeued 6 waiters
futex_cmp_requeue01.c:106: PASS: futex_cmp_requeue() woke up 50 waiters and requeued 50 waiters
futex_cmp_requeue01.c:106: PASS: futex_cmp_requeue() woke up 0 waiters and requeued 70 waiters
futex_cmp_requeue01.c:46: INFO: process 7923 wasn't woken up: ETIMEDOUT (110)
...
futex_cmp_requeue01.c:71: FAIL: futex_cmp_requeue() returned 776, expected 1000
futex_cmp_requeue01.c:85: FAIL: 224 waiters were not woken up normally
futex_cmp_requeue01.c:106: PASS: futex_cmp_requeue() woke up 300 waiters and requeued 500 waiters
^ permalink raw reply [flat|nested] 6+ messages in thread
* [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S
2019-11-06 9:59 ` Jan Stancek
@ 2019-11-07 11:39 ` Cyril Hrubis
2019-11-07 12:03 ` Cyril Hrubis
2019-11-07 12:15 ` Cyril Hrubis
1 sibling, 1 reply; 6+ messages in thread
From: Cyril Hrubis @ 2019-11-07 11:39 UTC (permalink / raw)
To: ltp
Hi!
> hb->lock is locked at this point, and requeue takes it too, so I'm not
> sure what makes it fail. I've seen testcase fail in at least
> 2 different ways now. Here's the other one:
Looking at the test shouldn't we wait for the processes to be sleeping
after the futex_cmp_requeue() syscall?
i.e.
diff --git a/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c b/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
index f5264c2ba..41f5e88a6 100644
--- a/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
+++ b/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
@@ -71,6 +71,9 @@ static void verify_futex_cmp_requeue(unsigned int n)
TST_RET, exp_ret);
}
+ for (i = 0; i < tc->num_waiters; i++)
+ TST_PROCESS_STATE_WAIT(pid[i], 'S');
+
num_requeues = futex_wake(&futexes[1], tc->num_waiters, 0);
num_waits = futex_wake(&futexes[0], tc->num_waiters, 0);
--
Cyril Hrubis
chrubis@suse.cz
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S
2019-11-07 11:39 ` Cyril Hrubis
@ 2019-11-07 12:03 ` Cyril Hrubis
0 siblings, 0 replies; 6+ messages in thread
From: Cyril Hrubis @ 2019-11-07 12:03 UTC (permalink / raw)
To: ltp
Hi!
> > hb->lock is locked at this point, and requeue takes it too, so I'm not
> > sure what makes it fail. I've seen testcase fail in at least
> > 2 different ways now. Here's the other one:
>
> Looking at the test shouldn't we wait for the processes to be sleeping
> after the futex_cmp_requeue() syscall?
>
> i.e.
>
> diff --git a/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c b/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
> index f5264c2ba..41f5e88a6 100644
> --- a/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
> +++ b/testcases/kernel/syscalls/futex/futex_cmp_requeue01.c
> @@ -71,6 +71,9 @@ static void verify_futex_cmp_requeue(unsigned int n)
> TST_RET, exp_ret);
> }
>
> + for (i = 0; i < tc->num_waiters; i++)
> + TST_PROCESS_STATE_WAIT(pid[i], 'S');
> +
> num_requeues = futex_wake(&futexes[1], tc->num_waiters, 0);
> num_waits = futex_wake(&futexes[0], tc->num_waiters, 0);
And this obviously wouldn't work since we are waking up some of them in
a subset of the testcases.
--
Cyril Hrubis
chrubis@suse.cz
^ permalink raw reply [flat|nested] 6+ messages in thread
* [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S
2019-11-06 9:59 ` Jan Stancek
2019-11-07 11:39 ` Cyril Hrubis
@ 2019-11-07 12:15 ` Cyril Hrubis
2019-11-07 12:31 ` Jan Stancek
1 sibling, 1 reply; 6+ messages in thread
From: Cyril Hrubis @ 2019-11-07 12:15 UTC (permalink / raw)
To: ltp
Hi!
> hb->lock is locked at this point, and requeue takes it too, so I'm not
> sure what makes it fail. I've seen testcase fail in at least
> 2 different ways now. Here's the other one:
Here is another theory, some of the processes may be sleeping in a
different place in the kernel, somewhere between the fork() and the
futex(), and hence we think that they have been suspended on the futex
but aren't.
I guess that what we can do is to put a counter in a piece of shared
memory and increment it from each child just before the futex_wait()
call and wait in the parent until the counter reached num_waiters.
--
Cyril Hrubis
chrubis@suse.cz
^ permalink raw reply [flat|nested] 6+ messages in thread
* [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S
2019-11-07 12:15 ` Cyril Hrubis
@ 2019-11-07 12:31 ` Jan Stancek
0 siblings, 0 replies; 6+ messages in thread
From: Jan Stancek @ 2019-11-07 12:31 UTC (permalink / raw)
To: ltp
----- Original Message -----
> Hi!
> > hb->lock is locked at this point, and requeue takes it too, so I'm not
> > sure what makes it fail. I've seen testcase fail in at least
> > 2 different ways now. Here's the other one:
>
> Here is another theory, some of the processes may be sleeping in a
> different place in the kernel, somewhere between the fork() and the
> futex(), and hence we think that they have been suspended on the futex
> but aren't.
>
> I guess that what we can do is to put a counter in a piece of shared
> memory and increment it from each child just before the futex_wait()
> call and wait in the parent until the counter reached num_waiters.
It does look related to spurious wake ups and fact that test doesn't
change futex value. I raised it on lkml, here's important part:
"If there is an actual use case for keeping the uaddr1 value the same across
a requeue then this needs to be described properly and also needs to be
handled differently and consistently in all cases not just for a spurious
wakeup."
https://lore.kernel.org/lkml/alpine.DEB.2.21.1911070009040.1869@nanos.tec.linutronix.de/T/#m5662b71d7e0d14b6d74137c1da81d774e5035f9a
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-11-07 12:31 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-05 11:57 [LTP] [PATCH/RFC] tst_process_state_wait: wait for schedstats to settle when state == S Jan Stancek
2019-11-06 9:59 ` Jan Stancek
2019-11-07 11:39 ` Cyril Hrubis
2019-11-07 12:03 ` Cyril Hrubis
2019-11-07 12:15 ` Cyril Hrubis
2019-11-07 12:31 ` Jan Stancek
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.