ltp.lists.linux.it archive mirror
 help / color / mirror / Atom feed
* [LTP] [PATCH] madvise06: Raise the bar for judging failure
@ 2023-02-18  4:09 Li Wang
  2023-02-27 11:33 ` Richard Palethorpe
  0 siblings, 1 reply; 8+ messages in thread
From: Li Wang @ 2023-02-18  4:09 UTC (permalink / raw)
  To: ltp; +Cc: Yongqiang Liu, Paul Bunyan, Eirik Fuller

There is an intermittent failure which we have observed many times whether
on rhel or mainline kernel. But we're unable to stable reproduce it:

    43	madvise06.c:201: TFAIL: less than 102400 Kb were moved to the swap cache
    ...

However it does not look like a kernel issue, because SwapCached change is
not strictly abiding by the principle of MADV_WILLNEED advice. That means it
all depends on the kernel's specific circumstances. The value of the threshold
is debatable at least from my point of view, its use 1/4 is not guaranteed
100% safe.

As MADV_WILLNEED is just advice to the kernel, not a guarantee. The kernel may
choose to ignore the advice, or may prioritize other memory management tasks
over pre-loading the advised pages.

So this patch is aimed at improving the accuracy and clarity of the test results.
Specifically, the use of two separate variables to track the results of different
comparisons will make it easier to understand what the test is doing.

Additionally, the change to report a test result of "TINFO" instead of "TFAIL"
when the swap cache size is less than expected would be intended to indicate
that this is an acceptable outcome.

Finally, the change to the second tst_res call is intended to make the test more
lenient, as it now passes if either no page faults occur or the swap cache size
is larger than expected.

Reported-by: Paul Bunyan <pbunyan@redhat.com>
Signed-off-by: Li Wang <liwang@redhat.com>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Eirik Fuller <efuller@redhat.com>
---
 testcases/kernel/syscalls/madvise/madvise06.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/testcases/kernel/syscalls/madvise/madvise06.c b/testcases/kernel/syscalls/madvise/madvise06.c
index c7967ae6f..5bd428bd9 100644
--- a/testcases/kernel/syscalls/madvise/madvise06.c
+++ b/testcases/kernel/syscalls/madvise/madvise06.c
@@ -164,7 +164,7 @@ static int get_page_fault_num(void)
 
 static void test_advice_willneed(void)
 {
-	int loops = 100, res;
+	int loops = 100, res1, res2;
 	char *target;
 	long swapcached_start, swapcached;
 	int page_fault_num_1, page_fault_num_2;
@@ -197,10 +197,10 @@ static void test_advice_willneed(void)
 	} while (swapcached < swapcached_start + PASS_THRESHOLD_KB && loops > 0);
 
 	meminfo_diag("After madvise");
-	res = swapcached > swapcached_start + PASS_THRESHOLD_KB;
-	tst_res(res ? TPASS : TFAIL,
+	res1 = swapcached > swapcached_start + PASS_THRESHOLD_KB;
+	tst_res(res1 ? TPASS : TINFO,
 		"%s than %ld Kb were moved to the swap cache",
-		res ? "more" : "less", PASS_THRESHOLD_KB);
+		res1 ? "more" : "less", PASS_THRESHOLD_KB);
 
 	loops = 100;
 	SAFE_FILE_LINES_SCANF("/proc/meminfo", "SwapCached: %ld", &swapcached_start);
@@ -225,9 +225,9 @@ static void test_advice_willneed(void)
 			page_fault_num_2);
 	meminfo_diag("After page access");
 
-	res = page_fault_num_2 - page_fault_num_1;
-	tst_res(res == 0 ? TPASS : TFAIL,
-		"%d pages were faulted out of 3 max", res);
+	res2 = page_fault_num_2 - page_fault_num_1;
+	tst_res(((res2 == 0) || res1) ? TPASS : TFAIL,
+		"%d pages were faulted out of 3 max", res2);
 
 	SAFE_MUNMAP(target, CHUNK_SZ);
 }
-- 
2.38.1


-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [LTP] [PATCH] madvise06: Raise the bar for judging failure
  2023-02-18  4:09 [LTP] [PATCH] madvise06: Raise the bar for judging failure Li Wang
@ 2023-02-27 11:33 ` Richard Palethorpe
  2023-02-28  5:45   ` Li Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Palethorpe @ 2023-02-27 11:33 UTC (permalink / raw)
  To: Li Wang; +Cc: Yang Xu, Yongqiang Liu, Paul Bunyan, Eirik Fuller, ltp

Hell Li,

Li Wang <liwang@redhat.com> writes:

> There is an intermittent failure which we have observed many times whether
> on rhel or mainline kernel. But we're unable to stable reproduce it:
>
>     43	madvise06.c:201: TFAIL: less than 102400 Kb were moved to the swap cache
>     ...
>
> However it does not look like a kernel issue, because SwapCached change is
> not strictly abiding by the principle of MADV_WILLNEED advice. That means it
> all depends on the kernel's specific circumstances. The value of the threshold
> is debatable at least from my point of view, its use 1/4 is not guaranteed
> 100% safe.
>
> As MADV_WILLNEED is just advice to the kernel, not a guarantee. The kernel may
> choose to ignore the advice, or may prioritize other memory management tasks
> over pre-loading the advised pages.
>
> So this patch is aimed at improving the accuracy and clarity of the test results.
> Specifically, the use of two separate variables to track the results of different
> comparisons will make it easier to understand what the test is doing.
>
> Additionally, the change to report a test result of "TINFO" instead of "TFAIL"
> when the swap cache size is less than expected would be intended to indicate
> that this is an acceptable outcome.
>
> Finally, the change to the second tst_res call is intended to make the test more
> lenient, as it now passes if either no page faults occur or the swap cache size
> is larger than expected.

Why not skip to making them all TINFO?

It's undefined what action will result from MADV_WILLNEED. If it were
better for performance *not* to read in pages, then it would be valid
for the kernel to ignore it.

Yang Xu added a tag for a perf regression that it could
reproduce. However looking at the kernel commit this was first found by
stress-ng.

commit 66383800df9cbdbf3b0c34d5a51bf35bcdb72fd2
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sat Nov 21 22:17:22 2020 -0800

    mm: fix madvise WILLNEED performance problem

    The calculation of the end page index was incorrect, leading to a
    regression of 70% when running stress-ng.

    With this fix, we instead see a performance improvement of 3%

I found a bug with this test, but it was causing an Oops. It wouldn't
matter if the test printed pass or fail.

So I think we are wasting our time by constantly tweaking this test.

-- 
Thank you,
Richard.

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LTP] [PATCH] madvise06: Raise the bar for judging failure
  2023-02-27 11:33 ` Richard Palethorpe
@ 2023-02-28  5:45   ` Li Wang
  2023-03-02  7:41     ` [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored Li Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Li Wang @ 2023-02-28  5:45 UTC (permalink / raw)
  To: rpalethorpe; +Cc: Yang Xu, Yongqiang Liu, Paul Bunyan, Eirik Fuller, ltp

Hi Richard,

On Mon, Feb 27, 2023 at 8:27 PM Richard Palethorpe <rpalethorpe@suse.de>
wrote:

> Hell Li,
>
> Li Wang <liwang@redhat.com> writes:
>
> > There is an intermittent failure which we have observed many times
> whether
> > on rhel or mainline kernel. But we're unable to stable reproduce it:
> >
> >     43        madvise06.c:201: TFAIL: less than 102400 Kb were moved to
> the swap cache
> >     ...
> >
> > However it does not look like a kernel issue, because SwapCached change
> is
> > not strictly abiding by the principle of MADV_WILLNEED advice. That
> means it
> > all depends on the kernel's specific circumstances. The value of the
> threshold
> > is debatable at least from my point of view, its use 1/4 is not
> guaranteed
> > 100% safe.
> >
> > As MADV_WILLNEED is just advice to the kernel, not a guarantee. The
> kernel may
> > choose to ignore the advice, or may prioritize other memory management
> tasks
> > over pre-loading the advised pages.
> >
> > So this patch is aimed at improving the accuracy and clarity of the test
> results.
> > Specifically, the use of two separate variables to track the results of
> different
> > comparisons will make it easier to understand what the test is doing.
> >
> > Additionally, the change to report a test result of "TINFO" instead of
> "TFAIL"
> > when the swap cache size is less than expected would be intended to
> indicate
> > that this is an acceptable outcome.
> >
> > Finally, the change to the second tst_res call is intended to make the
> test more
> > lenient, as it now passes if either no page faults occur or the swap
> cache size
> > is larger than expected.
>
> Why not skip to making them all TINFO?
>
> It's undefined what action will result from MADV_WILLNEED. If it were
> better for performance *not* to read in pages, then it would be valid
> for the kernel to ignore it.
>

Yes, but I didn't do that because madvise06 test checks free_mem/free_swap
size at the beginning, it garantee the system at least with 2 * CHUNK_SZ
(800MB + 800MB) memory for the test performing, unless there is something
happening parallel otherwise kernel will handle MADV_WILLNEED request
correctly for most scenarios.

And we indeed do not see page-faults failure out of expected
anymore since commit 00e769e63515e51, so I just combined the
two judgments together in this patch. I believe it's enough and also
give a leeway to the kernel.

I hope there could be a lenient test for MADV_WILLNEED.
I will decisively take your suggestion once the failure appears again next
time.



>
> Yang Xu added a tag for a perf regression that it could
> reproduce. However looking at the kernel commit this was first found by
> stress-ng.
>
> commit 66383800df9cbdbf3b0c34d5a51bf35bcdb72fd2
> Author: Matthew Wilcox (Oracle) <willy@infradead.org>
> Date:   Sat Nov 21 22:17:22 2020 -0800
>
>     mm: fix madvise WILLNEED performance problem
>
>     The calculation of the end page index was incorrect, leading to a
>     regression of 70% when running stress-ng.
>
>     With this fix, we instead see a performance improvement of 3%
>
> I found a bug with this test, but it was causing an Oops. It wouldn't
> matter if the test printed pass or fail.
>
> So I think we are wasting our time by constantly tweaking this test.
>
> --
> Thank you,
> Richard.
>
>

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored
  2023-02-28  5:45   ` Li Wang
@ 2023-03-02  7:41     ` Li Wang
  2023-03-03  8:38       ` Petr Vorel
  0 siblings, 1 reply; 8+ messages in thread
From: Li Wang @ 2023-03-02  7:41 UTC (permalink / raw)
  To: ltp, rpalethorpe; +Cc: Yongqiang Liu, Paul Bunyan, Eirik Fuller

There is an intermittent failure which we have observed many times whether
on rhel or mainline kernel. But we're unable to stable reproduce it:

    43	madvise06.c:201: TFAIL: less than 102400 Kb were moved to the swap cache
    ...

However it does not look like a kernel issue, because SwapCached change is
not strictly abiding by the principle of MADV_WILLNEED advice. That means it
all depends on the kernel's specific circumstances. The value of the threshold
is debatable at least from my point of view, its use 1/4 is not guaranteed
100% safe.

As MADV_WILLNEED is just advice to the kernel, not a guarantee. The kernel may
choose to ignore the advice, or may prioritize other memory management tasks
over pre-loading the advised pages.

This change to report a test result of "TINFO" instead of "TFAIL" when the swap
cache size is less than expected would be intended to indicate that this is an
acceptable outcome. Same changes apply to the page_fault counting as well.

Reported-by: Paul Bunyan <pbunyan@redhat.com>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Eirik Fuller <efuller@redhat.com>
Signed-off-by: Li Wang <liwang@redhat.com>
---
 testcases/kernel/syscalls/madvise/madvise06.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/testcases/kernel/syscalls/madvise/madvise06.c b/testcases/kernel/syscalls/madvise/madvise06.c
index c7967ae6f..0e5ce3276 100644
--- a/testcases/kernel/syscalls/madvise/madvise06.c
+++ b/testcases/kernel/syscalls/madvise/madvise06.c
@@ -198,7 +198,7 @@ static void test_advice_willneed(void)
 
 	meminfo_diag("After madvise");
 	res = swapcached > swapcached_start + PASS_THRESHOLD_KB;
-	tst_res(res ? TPASS : TFAIL,
+	tst_res(res ? TPASS : TINFO,
 		"%s than %ld Kb were moved to the swap cache",
 		res ? "more" : "less", PASS_THRESHOLD_KB);
 
@@ -226,7 +226,7 @@ static void test_advice_willneed(void)
 	meminfo_diag("After page access");
 
 	res = page_fault_num_2 - page_fault_num_1;
-	tst_res(res == 0 ? TPASS : TFAIL,
+	tst_res(res == 0 ? TPASS : TINFO,
 		"%d pages were faulted out of 3 max", res);
 
 	SAFE_MUNMAP(target, CHUNK_SZ);
-- 
2.38.1


-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored
  2023-03-02  7:41     ` [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored Li Wang
@ 2023-03-03  8:38       ` Petr Vorel
  2023-03-08  0:18         ` Li Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Petr Vorel @ 2023-03-03  8:38 UTC (permalink / raw)
  To: Li Wang; +Cc: Paul Bunyan, Yongqiang Liu, Eirik Fuller, ltp

Hi Li,

LGTM.
Acked-by: Petr Vorel <pvorel@suse.cz>

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored
  2023-03-03  8:38       ` Petr Vorel
@ 2023-03-08  0:18         ` Li Wang
  2023-03-13 10:15           ` Richard Palethorpe
  0 siblings, 1 reply; 8+ messages in thread
From: Li Wang @ 2023-03-08  0:18 UTC (permalink / raw)
  To: Petr Vorel, Richard Palethorpe
  Cc: Yongqiang Liu, Paul Bunyan, Eirik Fuller, ltp

Hi,

On Fri, Mar 3, 2023 at 3:44 PM Petr Vorel <pvorel@suse.cz> wrote:

> Hi Li,
>
> LGTM.
> Acked-by: Petr Vorel <pvorel@suse.cz>
>

Thank you for your ack, Petr.

I still want Richard gives me a final check.

@Richard, do you agree with pushing it, or other thoughts?


-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored
  2023-03-08  0:18         ` Li Wang
@ 2023-03-13 10:15           ` Richard Palethorpe
  2023-03-13 10:59             ` Li Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Palethorpe @ 2023-03-13 10:15 UTC (permalink / raw)
  To: Li Wang; +Cc: Yongqiang Liu, Paul Bunyan, Eirik Fuller, ltp

Hello,

Li Wang <liwang@redhat.com> writes:

> Hi,
>
> On Fri, Mar 3, 2023 at 3:44 PM Petr Vorel <pvorel@suse.cz> wrote:
>
>> Hi Li,
>>
>> LGTM.
>> Acked-by: Petr Vorel <pvorel@suse.cz>
>>
>
> Thank you for your ack, Petr.
>
> I still want Richard gives me a final check.
>
> @Richard, do you agree with pushing it, or other thoughts?

I think it needs at least one unconditional tst_res(TPASS, ...)
otherwise the library will report TBROK if there is no pass or fail
result.

Possibly just tst_res(TPASS, "Nothing bad happend probably") at the end
or check for kernel taints.

With that:
Reviewed-by: Richard Palethorpe <rpalethorpe@suse.com>

Sorry for slow response.

-- 
Thank you,
Richard.

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored
  2023-03-13 10:15           ` Richard Palethorpe
@ 2023-03-13 10:59             ` Li Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Li Wang @ 2023-03-13 10:59 UTC (permalink / raw)
  To: rpalethorpe; +Cc: Yongqiang Liu, Paul Bunyan, Eirik Fuller, ltp

On Mon, Mar 13, 2023 at 6:20 PM Richard Palethorpe <rpalethorpe@suse.de>
wrote:

> Hello,
>
> Li Wang <liwang@redhat.com> writes:
>
> > Hi,
> >
> > On Fri, Mar 3, 2023 at 3:44 PM Petr Vorel <pvorel@suse.cz> wrote:
> >
> >> Hi Li,
> >>
> >> LGTM.
> >> Acked-by: Petr Vorel <pvorel@suse.cz>
> >>
> >
> > Thank you for your ack, Petr.
> >
> > I still want Richard gives me a final check.
> >
> > @Richard, do you agree with pushing it, or other thoughts?
>
> I think it needs at least one unconditional tst_res(TPASS, ...)
> otherwise the library will report TBROK if there is no pass or fail
> result.
>
> Possibly just tst_res(TPASS, "Nothing bad happend probably") at the end
> or check for kernel taints.
>

Good point!!!

I added the kernel taint check and pushed. Thanks~



>
> With that:
> Reviewed-by: Richard Palethorpe <rpalethorpe@suse.com>
>
> Sorry for slow response.
>
> --
> Thank you,
> Richard.
>
>

-- 
Regards,
Li Wang

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-03-13 11:00 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-18  4:09 [LTP] [PATCH] madvise06: Raise the bar for judging failure Li Wang
2023-02-27 11:33 ` Richard Palethorpe
2023-02-28  5:45   ` Li Wang
2023-03-02  7:41     ` [LTP] [PATCh v2] madvise06: stop throwing failure when MADV_WILLNEED is ignored Li Wang
2023-03-03  8:38       ` Petr Vorel
2023-03-08  0:18         ` Li Wang
2023-03-13 10:15           ` Richard Palethorpe
2023-03-13 10:59             ` Li Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).