* [LTP] [PATCH v2] read_all: give more time to wait children finish read action
@ 2018-04-11 9:47 Li Wang
2018-04-11 11:19 ` Richard Palethorpe
0 siblings, 1 reply; 3+ messages in thread
From: Li Wang @ 2018-04-11 9:47 UTC (permalink / raw)
To: ltp
1. We get the following worker stalled messges in test:
# ./read_all -d /sys -q -r 10
tst_test.c:987: INFO: Timeout per run is 0h 05m 00s
read_all.c:280: BROK: Worker 26075 is stalled
read_all.c:280: WARN: Worker 26075 is stalled
read_all.c:280: WARN: Worker 26079 is stalled
read_all.c:280: WARN: Worker 26087 is stalled
The reason is that some children are still working on the read I/O but
parent trys to stopping them after visit_dir() immediately. Although
the stop_attemps is 65535, it still sometimes fails.
Instead, we use an exponential backoff way to loop the stop operation
in limited seconds.
2. The sched_work() push action in a infinite loop, here also let it
trys in limited time.
Signed-off-by: Li Wang <liwang@redhat.com>
---
testcases/kernel/fs/read_all/read_all.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/testcases/kernel/fs/read_all/read_all.c b/testcases/kernel/fs/read_all/read_all.c
index b7ed540..a9f9707 100644
--- a/testcases/kernel/fs/read_all/read_all.c
+++ b/testcases/kernel/fs/read_all/read_all.c
@@ -57,6 +57,8 @@
#define BUFFER_SIZE 1024
#define MAX_PATH 4096
#define MAX_DISPLAY 40
+#define MICROSECOND 1
+#define SECOND MICROSECOND * 1000000
struct queue {
sem_t sem;
@@ -265,20 +267,21 @@ static void spawn_workers(void)
static void stop_workers(void)
{
const char stop_code[1] = { '\0' };
- int i, stop_attempts;
+ int i, delay = 1;
if (!workers)
return;
for (i = 0; i < worker_count; i++) {
- stop_attempts = 0xffff;
if (workers[i].q) {
while (!queue_push(workers[i].q, stop_code)) {
- if (--stop_attempts < 0) {
+ if (delay < SECOND) {
+ usleep(delay);
+ delay *= 2;
+ } else {
tst_brk(TBROK,
"Worker %d is stalled",
workers[i].pid);
- break;
}
}
}
@@ -295,7 +298,7 @@ static void stop_workers(void)
static void sched_work(const char *path)
{
static int cur;
- int push_attempts = 0, pushed;
+ int push_attempts = 0, pushed, delay = 1;
while (1) {
pushed = queue_push(workers[cur].q, path);
@@ -306,9 +309,14 @@ static void sched_work(const char *path)
if (pushed)
break;
- if (++push_attempts > worker_count) {
- usleep(100);
- push_attempts = 0;
+ if (delay < SECOND) {
+ push_attempts++;
+ usleep(delay);
+ delay *= 2;
+ } else {
+ tst_brk(TBROK,
+ "Attempts %d times but still failed to push %s",
+ push_attempts, path);
}
}
}
--
2.9.3
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [LTP] [PATCH v2] read_all: give more time to wait children finish read action
2018-04-11 9:47 [LTP] [PATCH v2] read_all: give more time to wait children finish read action Li Wang
@ 2018-04-11 11:19 ` Richard Palethorpe
2018-04-12 3:01 ` Li Wang
0 siblings, 1 reply; 3+ messages in thread
From: Richard Palethorpe @ 2018-04-11 11:19 UTC (permalink / raw)
To: ltp
Hello Li,
Li Wang writes:
> 1. We get the following worker stalled messges in test:
> # ./read_all -d /sys -q -r 10
> tst_test.c:987: INFO: Timeout per run is 0h 05m 00s
> read_all.c:280: BROK: Worker 26075 is stalled
> read_all.c:280: WARN: Worker 26075 is stalled
> read_all.c:280: WARN: Worker 26079 is stalled
> read_all.c:280: WARN: Worker 26087 is stalled
>
> The reason is that some children are still working on the read I/O but
> parent trys to stopping them after visit_dir() immediately. Although
> the stop_attemps is 65535, it still sometimes fails.
>
> Instead, we use an exponential backoff way to loop the stop operation
> in limited seconds.
>
> 2. The sched_work() push action in a infinite loop, here also let it
> trys in limited time.
>
> Signed-off-by: Li Wang <liwang@redhat.com>
> ---
> testcases/kernel/fs/read_all/read_all.c | 24 ++++++++++++++++--------
> 1 file changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/testcases/kernel/fs/read_all/read_all.c b/testcases/kernel/fs/read_all/read_all.c
> index b7ed540..a9f9707 100644
> --- a/testcases/kernel/fs/read_all/read_all.c
> +++ b/testcases/kernel/fs/read_all/read_all.c
> @@ -57,6 +57,8 @@
> #define BUFFER_SIZE 1024
> #define MAX_PATH 4096
> #define MAX_DISPLAY 40
> +#define MICROSECOND 1
Not necessary.
> +#define SECOND MICROSECOND * 1000000
>
> struct queue {
> sem_t sem;
> @@ -265,20 +267,21 @@ static void spawn_workers(void)
> static void stop_workers(void)
> {
> const char stop_code[1] = { '\0' };
> - int i, stop_attempts;
> + int i, delay = 1;
>
> if (!workers)
> return;
>
> for (i = 0; i < worker_count; i++) {
> - stop_attempts = 0xffff;
> if (workers[i].q) {
Maybe change this to:
if (!workers[i].q)
continue;
To avoid a level of indentation.
> while (!queue_push(workers[i].q, stop_code)) {
> - if (--stop_attempts < 0) {
> + if (delay < SECOND) {
> + usleep(delay);
> + delay *= 2;
> + } else {
> tst_brk(TBROK,
> "Worker %d is stalled",
> workers[i].pid);
> - break;
> }
> }
> }
> @@ -295,7 +298,7 @@ static void stop_workers(void)
> static void sched_work(const char *path)
> {
> static int cur;
> - int push_attempts = 0, pushed;
> + int push_attempts = 0, pushed, delay = 1;
>
> while (1) {
> pushed = queue_push(workers[cur].q, path);
> @@ -306,9 +309,14 @@ static void sched_work(const char *path)
> if (pushed)
> break;
>
> - if (++push_attempts > worker_count) {
> - usleep(100);
> - push_attempts = 0;
> + if (delay < SECOND) {
> + push_attempts++;
> + usleep(delay);
> + delay *= 2;
> + } else {
> + tst_brk(TBROK,
> + "Attempts %d times but still failed to push %s",
^ Attempted
> + push_attempts, path);
> }
> }
> }
Maybe you could put the "if (delaly < SECOND) ..." into a function?
Otherwise this looks good to me. There are some other things I want to
change on this test, but we can leave those for another patch.
--
Thank you,
Richard.
^ permalink raw reply [flat|nested] 3+ messages in thread
* [LTP] [PATCH v2] read_all: give more time to wait children finish read action
2018-04-11 11:19 ` Richard Palethorpe
@ 2018-04-12 3:01 ` Li Wang
0 siblings, 0 replies; 3+ messages in thread
From: Li Wang @ 2018-04-12 3:01 UTC (permalink / raw)
To: ltp
Richard Palethorpe <rpalethorpe@suse.de> wrote:
>
> > #define MAX_DISPLAY 40
> > +#define MICROSECOND 1
>
> Not necessary.
>
Ok.
> >
> > for (i = 0; i < worker_count; i++) {
> > - stop_attempts = 0xffff;
> > if (workers[i].q) {
>
> Maybe change this to:
> if (!workers[i].q)
> continue;
>
> To avoid a level of indentation.
>
Sounds good.
> + tst_brk(TBROK,
> > + "Attempts %d times but still failed to
> push %s",
> ^ Attempted
>
Good catch.
>
> > + push_attempts, path);
> > }
> > }
> > }
>
> Maybe you could put the "if (delaly < SECOND) ..." into a function?
>
Firstly, I was thinking to take use of my new macros, but
consider that have not been reviewed. So I tend to keep
it as original, then replace them entirely after ltp gets
a generic loop functionality in library.
>
> Otherwise this looks good to me. There are some other things I want to
> change on this test, but we can leave those for another patch.
>
Thanks, Patch v3 is coming...
--
Li Wang
liwang@redhat.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20180412/49b93e3a/attachment-0001.html>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2018-04-12 3:01 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-11 9:47 [LTP] [PATCH v2] read_all: give more time to wait children finish read action Li Wang
2018-04-11 11:19 ` Richard Palethorpe
2018-04-12 3:01 ` Li Wang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.