All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-17 11:06 ` Li Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-17 11:06 UTC (permalink / raw)
  To: mhocko; +Cc: linux-mm, ltp, Kirill A . Shutemov, Zi Yan

move_pages(2) declears that status code for zero page is supposed to
be -EFAULT. But now it (LTP/move_pages04 test) gets -EPERM, the root
cause is that not goto out_flush after store_status() saves the err
which add_page_for_migration() returns for zero page.

LTP move_pages04:
   TFAIL  :  move_pages04.c:143: status[1] is EPERM, expected EFAULT

Signed-off-by: Li Wang <liwang@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index f65dd69..2b315fc 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			continue;
 
 		err = store_status(status, i, err, 1);
-		if (err)
+		if (!err)
 			goto out_flush;
 
 		err = do_move_pages_to_node(mm, &pagelist, current_node);
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-17 11:06 ` Li Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-17 11:06 UTC (permalink / raw)
  To: ltp

move_pages(2) declears that status code for zero page is supposed to
be -EFAULT. But now it (LTP/move_pages04 test) gets -EPERM, the root
cause is that not goto out_flush after store_status() saves the err
which add_page_for_migration() returns for zero page.

LTP move_pages04:
   TFAIL  :  move_pages04.c:143: status[1] is EPERM, expected EFAULT

Signed-off-by: Li Wang <liwang@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index f65dd69..2b315fc 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			continue;
 
 		err = store_status(status, i, err, 1);
-		if (err)
+		if (!err)
 			goto out_flush;
 
 		err = do_move_pages_to_node(mm, &pagelist, current_node);
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-17 11:06 ` [LTP] " Li Wang
@ 2018-04-17 13:03   ` Michal Hocko
  -1 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-17 13:03 UTC (permalink / raw)
  To: Li Wang; +Cc: linux-mm, ltp, Kirill A . Shutemov, Zi Yan

On Tue 17-04-18 19:06:15, Li Wang wrote:
[...]
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f65dd69..2b315fc 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>  			continue;
>  
>  		err = store_status(status, i, err, 1);
> -		if (err)
> +		if (!err)
>  			goto out_flush;

This change just doesn't make any sense to me. Why should we bail out if
the store_status is successul? I am trying to wrap my head around the
test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
explain that move_pages has some semantic issues and the new
implementation might be not 100% replacement. Anyway I am studying the
test case to come up with a proper fix.

>  
>  		err = do_move_pages_to_node(mm, &pagelist, current_node);
> -- 
> 2.9.5
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-17 13:03   ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-17 13:03 UTC (permalink / raw)
  To: ltp

On Tue 17-04-18 19:06:15, Li Wang wrote:
[...]
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f65dd69..2b315fc 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>  			continue;
>  
>  		err = store_status(status, i, err, 1);
> -		if (err)
> +		if (!err)
>  			goto out_flush;

This change just doesn't make any sense to me. Why should we bail out if
the store_status is successul? I am trying to wrap my head around the
test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
explain that move_pages has some semantic issues and the new
implementation might be not 100% replacement. Anyway I am studying the
test case to come up with a proper fix.

>  
>  		err = do_move_pages_to_node(mm, &pagelist, current_node);
> -- 
> 2.9.5
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-17 13:03   ` [LTP] " Michal Hocko
@ 2018-04-17 14:14     ` Michal Hocko
  -1 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-17 14:14 UTC (permalink / raw)
  To: Li Wang; +Cc: linux-mm, ltp, Kirill A . Shutemov, Zi Yan

On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> On Tue 17-04-18 19:06:15, Li Wang wrote:
> [...]
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index f65dd69..2b315fc 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >  			continue;
> >  
> >  		err = store_status(status, i, err, 1);
> > -		if (err)
> > +		if (!err)
> >  			goto out_flush;
> 
> This change just doesn't make any sense to me. Why should we bail out if
> the store_status is successul? I am trying to wrap my head around the
> test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> explain that move_pages has some semantic issues and the new
> implementation might be not 100% replacement. Anyway I am studying the
> test case to come up with a proper fix.

OK, I get what the test cases does. I've failed to see the subtle
difference between alloc_pages_on_node and numa_alloc_onnode. The later
doesn't faul in anything.

Why are we getting EPERM is quite not yet clear to me.
add_page_for_migration uses FOLL_DUMP which should return EFAULT on
zero pages (no_page_table()).

	err = PTR_ERR(page);
	if (IS_ERR(page))
		goto out;

therefore bails out from add_page_for_migration and store_status should
store that value. There shouldn't be any EPERM on the way.

Let me try to reproduce and see what is going on. Btw. which kernel do
you try this on?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-17 14:14     ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-17 14:14 UTC (permalink / raw)
  To: ltp

On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> On Tue 17-04-18 19:06:15, Li Wang wrote:
> [...]
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index f65dd69..2b315fc 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >  			continue;
> >  
> >  		err = store_status(status, i, err, 1);
> > -		if (err)
> > +		if (!err)
> >  			goto out_flush;
> 
> This change just doesn't make any sense to me. Why should we bail out if
> the store_status is successul? I am trying to wrap my head around the
> test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> explain that move_pages has some semantic issues and the new
> implementation might be not 100% replacement. Anyway I am studying the
> test case to come up with a proper fix.

OK, I get what the test cases does. I've failed to see the subtle
difference between alloc_pages_on_node and numa_alloc_onnode. The later
doesn't faul in anything.

Why are we getting EPERM is quite not yet clear to me.
add_page_for_migration uses FOLL_DUMP which should return EFAULT on
zero pages (no_page_table()).

	err = PTR_ERR(page);
	if (IS_ERR(page))
		goto out;

therefore bails out from add_page_for_migration and store_status should
store that value. There shouldn't be any EPERM on the way.

Let me try to reproduce and see what is going on. Btw. which kernel do
you try this on?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-17 14:14     ` [LTP] " Michal Hocko
@ 2018-04-17 14:28       ` Li Wang
  -1 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-17 14:28 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, ltp, Kirill A . Shutemov, Zi Yan

[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]

On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:

> On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> > On Tue 17-04-18 19:06:15, Li Wang wrote:
> > [...]
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index f65dd69..2b315fc 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
> nodemask_t task_nodes,
> > >                     continue;
> > >
> > >             err = store_status(status, i, err, 1);
> > > -           if (err)
> > > +           if (!err)
> > >                     goto out_flush;
> >
> > This change just doesn't make any sense to me. Why should we bail out if
> > the store_status is successul? I am trying to wrap my head around the
> > test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> > explain that move_pages has some semantic issues and the new
> > implementation might be not 100% replacement. Anyway I am studying the
> > test case to come up with a proper fix.
>
> OK, I get what the test cases does. I've failed to see the subtle
> difference between alloc_pages_on_node and numa_alloc_onnode. The later
> doesn't faul in anything.
>
> Why are we getting EPERM is quite not yet clear to me.
> add_page_for_migration uses FOLL_DUMP which should return EFAULT on
> zero pages (no_page_table()).
>
>         err = PTR_ERR(page);
>         if (IS_ERR(page))
>                 goto out;
>
> therefore bails out from add_page_for_migration and store_status should
> store that value. There shouldn't be any EPERM on the way.
>

Yes, I print the the return value and confirmed the
add_page_for_migration()​
do right things for zero page. and after store_status(...) the status saves
-EFAULT.
So I did the change above.



>
> Let me try to reproduce and see what is going on. Btw. which kernel do
> you try this on?
>

​The latest mainline kernel-4.17-rc1.



-- 
Li Wang
liwang@redhat.com

[-- Attachment #2: Type: text/html, Size: 3396 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-17 14:28       ` Li Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-17 14:28 UTC (permalink / raw)
  To: ltp

On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:

> On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> > On Tue 17-04-18 19:06:15, Li Wang wrote:
> > [...]
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index f65dd69..2b315fc 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
> nodemask_t task_nodes,
> > >                     continue;
> > >
> > >             err = store_status(status, i, err, 1);
> > > -           if (err)
> > > +           if (!err)
> > >                     goto out_flush;
> >
> > This change just doesn't make any sense to me. Why should we bail out if
> > the store_status is successul? I am trying to wrap my head around the
> > test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> > explain that move_pages has some semantic issues and the new
> > implementation might be not 100% replacement. Anyway I am studying the
> > test case to come up with a proper fix.
>
> OK, I get what the test cases does. I've failed to see the subtle
> difference between alloc_pages_on_node and numa_alloc_onnode. The later
> doesn't faul in anything.
>
> Why are we getting EPERM is quite not yet clear to me.
> add_page_for_migration uses FOLL_DUMP which should return EFAULT on
> zero pages (no_page_table()).
>
>         err = PTR_ERR(page);
>         if (IS_ERR(page))
>                 goto out;
>
> therefore bails out from add_page_for_migration and store_status should
> store that value. There shouldn't be any EPERM on the way.
>

Yes, I print the the return value and confirmed the
add_page_for_migration()​
do right things for zero page. and after store_status(...) the status saves
-EFAULT.
So I did the change above.



>
> Let me try to reproduce and see what is going on. Btw. which kernel do
> you try this on?
>

​The latest mainline kernel-4.17-rc1.



-- 
Li Wang
liwang@redhat.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20180417/9d58d972/attachment.html>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-17 14:28       ` [LTP] " Li Wang
@ 2018-04-17 19:00         ` Michal Hocko
  -1 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-17 19:00 UTC (permalink / raw)
  To: Li Wang; +Cc: linux-mm, ltp, Kirill A . Shutemov, Zi Yan

On Tue 17-04-18 22:28:33, Li Wang wrote:
> On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> > > On Tue 17-04-18 19:06:15, Li Wang wrote:
> > > [...]
> > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > index f65dd69..2b315fc 100644
> > > > --- a/mm/migrate.c
> > > > +++ b/mm/migrate.c
> > > > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
> > nodemask_t task_nodes,
> > > >                     continue;
> > > >
> > > >             err = store_status(status, i, err, 1);
> > > > -           if (err)
> > > > +           if (!err)
> > > >                     goto out_flush;
> > >
> > > This change just doesn't make any sense to me. Why should we bail out if
> > > the store_status is successul? I am trying to wrap my head around the
> > > test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> > > explain that move_pages has some semantic issues and the new
> > > implementation might be not 100% replacement. Anyway I am studying the
> > > test case to come up with a proper fix.
> >
> > OK, I get what the test cases does. I've failed to see the subtle
> > difference between alloc_pages_on_node and numa_alloc_onnode. The later
> > doesn't faul in anything.
> >
> > Why are we getting EPERM is quite not yet clear to me.
> > add_page_for_migration uses FOLL_DUMP which should return EFAULT on
> > zero pages (no_page_table()).
> >
> >         err = PTR_ERR(page);
> >         if (IS_ERR(page))
> >                 goto out;
> >
> > therefore bails out from add_page_for_migration and store_status should
> > store that value. There shouldn't be any EPERM on the way.
> >
> 
> Yes, I print the the return value and confirmed the
> add_page_for_migration()a??
> do right things for zero page. and after store_status(...) the status saves
> -EFAULT.
> So I did the change above.

OK, I guess I knnow what is going on. I must be overwriting the status
on the way out by

out_flush:
	/* Make sure we do not overwrite the existing error */
	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
	if (!err1)
		err1 = store_status(status, start, current_node, i - start);

This error handling is rather fragile and I was quite unhappy about it
at the time I was developing it. I have to remember all the details why
I've done it that way but I would bet my hat this is it. More on this
tomorrow.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-17 19:00         ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-17 19:00 UTC (permalink / raw)
  To: ltp

On Tue 17-04-18 22:28:33, Li Wang wrote:
> On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> > > On Tue 17-04-18 19:06:15, Li Wang wrote:
> > > [...]
> > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > index f65dd69..2b315fc 100644
> > > > --- a/mm/migrate.c
> > > > +++ b/mm/migrate.c
> > > > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
> > nodemask_t task_nodes,
> > > >                     continue;
> > > >
> > > >             err = store_status(status, i, err, 1);
> > > > -           if (err)
> > > > +           if (!err)
> > > >                     goto out_flush;
> > >
> > > This change just doesn't make any sense to me. Why should we bail out if
> > > the store_status is successul? I am trying to wrap my head around the
> > > test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> > > explain that move_pages has some semantic issues and the new
> > > implementation might be not 100% replacement. Anyway I am studying the
> > > test case to come up with a proper fix.
> >
> > OK, I get what the test cases does. I've failed to see the subtle
> > difference between alloc_pages_on_node and numa_alloc_onnode. The later
> > doesn't faul in anything.
> >
> > Why are we getting EPERM is quite not yet clear to me.
> > add_page_for_migration uses FOLL_DUMP which should return EFAULT on
> > zero pages (no_page_table()).
> >
> >         err = PTR_ERR(page);
> >         if (IS_ERR(page))
> >                 goto out;
> >
> > therefore bails out from add_page_for_migration and store_status should
> > store that value. There shouldn't be any EPERM on the way.
> >
> 
> Yes, I print the the return value and confirmed the
> add_page_for_migration()​
> do right things for zero page. and after store_status(...) the status saves
> -EFAULT.
> So I did the change above.

OK, I guess I knnow what is going on. I must be overwriting the status
on the way out by

out_flush:
	/* Make sure we do not overwrite the existing error */
	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
	if (!err1)
		err1 = store_status(status, start, current_node, i - start);

This error handling is rather fragile and I was quite unhappy about it
at the time I was developing it. I have to remember all the details why
I've done it that way but I would bet my hat this is it. More on this
tomorrow.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-17 19:00         ` [LTP] " Michal Hocko
@ 2018-04-17 20:09           ` Zi Yan
  -1 siblings, 0 replies; 22+ messages in thread
From: Zi Yan @ 2018-04-17 20:09 UTC (permalink / raw)
  To: Michal Hocko, Li Wang; +Cc: linux-mm, ltp, Kirill A . Shutemov

[-- Attachment #1: Type: text/plain, Size: 3379 bytes --]

On 17 Apr 2018, at 15:00, Michal Hocko wrote:

> On Tue 17-04-18 22:28:33, Li Wang wrote:
>> On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:
>>
>>> On Tue 17-04-18 15:03:00, Michal Hocko wrote:
>>>> On Tue 17-04-18 19:06:15, Li Wang wrote:
>>>> [...]
>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>> index f65dd69..2b315fc 100644
>>>>> --- a/mm/migrate.c
>>>>> +++ b/mm/migrate.c
>>>>> @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
>>> nodemask_t task_nodes,
>>>>>                     continue;
>>>>>
>>>>>             err = store_status(status, i, err, 1);
>>>>> -           if (err)
>>>>> +           if (!err)
>>>>>                     goto out_flush;
>>>>
>>>> This change just doesn't make any sense to me. Why should we bail out if
>>>> the store_status is successul? I am trying to wrap my head around the
>>>> test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
>>>> explain that move_pages has some semantic issues and the new
>>>> implementation might be not 100% replacement. Anyway I am studying the
>>>> test case to come up with a proper fix.
>>>
>>> OK, I get what the test cases does. I've failed to see the subtle
>>> difference between alloc_pages_on_node and numa_alloc_onnode. The later
>>> doesn't faul in anything.
>>>
>>> Why are we getting EPERM is quite not yet clear to me.
>>> add_page_for_migration uses FOLL_DUMP which should return EFAULT on
>>> zero pages (no_page_table()).
>>>
>>>         err = PTR_ERR(page);
>>>         if (IS_ERR(page))
>>>                 goto out;
>>>
>>> therefore bails out from add_page_for_migration and store_status should
>>> store that value. There shouldn't be any EPERM on the way.
>>>
>>
>> Yes, I print the the return value and confirmed the
>> add_page_for_migration()​
>> do right things for zero page. and after store_status(...) the status saves
>> -EFAULT.
>> So I did the change above.
>
> OK, I guess I knnow what is going on. I must be overwriting the status
> on the way out by
>
> out_flush:
> 	/* Make sure we do not overwrite the existing error */
> 	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> 	if (!err1)
> 		err1 = store_status(status, start, current_node, i - start);
>
> This error handling is rather fragile and I was quite unhappy about it
> at the time I was developing it. I have to remember all the details why
> I've done it that way but I would bet my hat this is it. More on this
> tomorrow.

Hi Michal and Li,

The problem is that the variable start is not set properly after store_status(),
like the "start = i;" after the first store_status().

The following patch should fix the problem (it has passed all move_pages test cases from ltp
on my machine):

diff --git a/mm/migrate.c b/mm/migrate.c
index f65dd69e1fd1..32afa4723e7f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                        if (err)
                                goto out;
                }
+               /* Move to next page (i+1), after we have saved page status (until i) */
+               start = i + 1;
                current_node = NUMA_NO_NODE;
        }
 out_flush:

Feel free to check it by yourselves.

--
Best Regards
Yan Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 496 bytes --]

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-17 20:09           ` Zi Yan
  0 siblings, 0 replies; 22+ messages in thread
From: Zi Yan @ 2018-04-17 20:09 UTC (permalink / raw)
  To: ltp

On 17 Apr 2018, at 15:00, Michal Hocko wrote:

> On Tue 17-04-18 22:28:33, Li Wang wrote:
>> On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:
>>
>>> On Tue 17-04-18 15:03:00, Michal Hocko wrote:
>>>> On Tue 17-04-18 19:06:15, Li Wang wrote:
>>>> [...]
>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>> index f65dd69..2b315fc 100644
>>>>> --- a/mm/migrate.c
>>>>> +++ b/mm/migrate.c
>>>>> @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
>>> nodemask_t task_nodes,
>>>>>                     continue;
>>>>>
>>>>>             err = store_status(status, i, err, 1);
>>>>> -           if (err)
>>>>> +           if (!err)
>>>>>                     goto out_flush;
>>>>
>>>> This change just doesn't make any sense to me. Why should we bail out if
>>>> the store_status is successul? I am trying to wrap my head around the
>>>> test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
>>>> explain that move_pages has some semantic issues and the new
>>>> implementation might be not 100% replacement. Anyway I am studying the
>>>> test case to come up with a proper fix.
>>>
>>> OK, I get what the test cases does. I've failed to see the subtle
>>> difference between alloc_pages_on_node and numa_alloc_onnode. The later
>>> doesn't faul in anything.
>>>
>>> Why are we getting EPERM is quite not yet clear to me.
>>> add_page_for_migration uses FOLL_DUMP which should return EFAULT on
>>> zero pages (no_page_table()).
>>>
>>>         err = PTR_ERR(page);
>>>         if (IS_ERR(page))
>>>                 goto out;
>>>
>>> therefore bails out from add_page_for_migration and store_status should
>>> store that value. There shouldn't be any EPERM on the way.
>>>
>>
>> Yes, I print the the return value and confirmed the
>> add_page_for_migration()​
>> do right things for zero page. and after store_status(...) the status saves
>> -EFAULT.
>> So I did the change above.
>
> OK, I guess I knnow what is going on. I must be overwriting the status
> on the way out by
>
> out_flush:
> 	/* Make sure we do not overwrite the existing error */
> 	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> 	if (!err1)
> 		err1 = store_status(status, start, current_node, i - start);
>
> This error handling is rather fragile and I was quite unhappy about it
> at the time I was developing it. I have to remember all the details why
> I've done it that way but I would bet my hat this is it. More on this
> tomorrow.

Hi Michal and Li,

The problem is that the variable start is not set properly after store_status(),
like the "start = i;" after the first store_status().

The following patch should fix the problem (it has passed all move_pages test cases from ltp
on my machine):

diff --git a/mm/migrate.c b/mm/migrate.c
index f65dd69e1fd1..32afa4723e7f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                        if (err)
                                goto out;
                }
+               /* Move to next page (i+1), after we have saved page status (until i) */
+               start = i + 1;
                current_node = NUMA_NO_NODE;
        }
 out_flush:

Feel free to check it by yourselves.

--
Best Regards
Yan Zi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 496 bytes
Desc: OpenPGP digital signature
URL: <http://lists.linux.it/pipermail/ltp/attachments/20180417/ea924d1a/attachment-0001.sig>

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-17 20:09           ` [LTP] " Zi Yan
@ 2018-04-18  9:07             ` Michal Hocko
  -1 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-18  9:07 UTC (permalink / raw)
  To: Zi Yan; +Cc: Li Wang, linux-mm, ltp, Kirill A . Shutemov

On Tue 17-04-18 16:09:33, Zi Yan wrote:
> On 17 Apr 2018, at 15:00, Michal Hocko wrote:
> 
> > On Tue 17-04-18 22:28:33, Li Wang wrote:
> >> On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:
> >>
> >>> On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> >>>> On Tue 17-04-18 19:06:15, Li Wang wrote:
> >>>> [...]
> >>>>> diff --git a/mm/migrate.c b/mm/migrate.c
> >>>>> index f65dd69..2b315fc 100644
> >>>>> --- a/mm/migrate.c
> >>>>> +++ b/mm/migrate.c
> >>>>> @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
> >>> nodemask_t task_nodes,
> >>>>>                     continue;
> >>>>>
> >>>>>             err = store_status(status, i, err, 1);
> >>>>> -           if (err)
> >>>>> +           if (!err)
> >>>>>                     goto out_flush;
> >>>>
> >>>> This change just doesn't make any sense to me. Why should we bail out if
> >>>> the store_status is successul? I am trying to wrap my head around the
> >>>> test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> >>>> explain that move_pages has some semantic issues and the new
> >>>> implementation might be not 100% replacement. Anyway I am studying the
> >>>> test case to come up with a proper fix.
> >>>
> >>> OK, I get what the test cases does. I've failed to see the subtle
> >>> difference between alloc_pages_on_node and numa_alloc_onnode. The later
> >>> doesn't faul in anything.
> >>>
> >>> Why are we getting EPERM is quite not yet clear to me.
> >>> add_page_for_migration uses FOLL_DUMP which should return EFAULT on
> >>> zero pages (no_page_table()).
> >>>
> >>>         err = PTR_ERR(page);
> >>>         if (IS_ERR(page))
> >>>                 goto out;
> >>>
> >>> therefore bails out from add_page_for_migration and store_status should
> >>> store that value. There shouldn't be any EPERM on the way.
> >>>
> >>
> >> Yes, I print the the return value and confirmed the
> >> add_page_for_migration()a??
> >> do right things for zero page. and after store_status(...) the status saves
> >> -EFAULT.
> >> So I did the change above.
> >
> > OK, I guess I knnow what is going on. I must be overwriting the status
> > on the way out by
> >
> > out_flush:
> > 	/* Make sure we do not overwrite the existing error */
> > 	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> > 	if (!err1)
> > 		err1 = store_status(status, start, current_node, i - start);
> >
> > This error handling is rather fragile and I was quite unhappy about it
> > at the time I was developing it. I have to remember all the details why
> > I've done it that way but I would bet my hat this is it. More on this
> > tomorrow.
> 
> Hi Michal and Li,
> 
> The problem is that the variable start is not set properly after store_status(),
> like the "start = i;" after the first store_status().
> 
> The following patch should fix the problem (it has passed all move_pages test cases from ltp
> on my machine):
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f65dd69e1fd1..32afa4723e7f 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>                         if (err)
>                                 goto out;
>                 }
> +               /* Move to next page (i+1), after we have saved page status (until i) */
> +               start = i + 1;
>                 current_node = NUMA_NO_NODE;
>         }
>  out_flush:
> 
> Feel free to check it by yourselves.

Yes, you are right. I never update start if the last page in the range
fails and so we overwrite the whole [start, i] range. I wish the code
wasn't that ugly and subtle but considering how we can fail in different
ways and that we want to batch as much as possible I do not see an easy
way.

Care to send the patch? I would just drop the comment.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-18  9:07             ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-18  9:07 UTC (permalink / raw)
  To: ltp

On Tue 17-04-18 16:09:33, Zi Yan wrote:
> On 17 Apr 2018, at 15:00, Michal Hocko wrote:
> 
> > On Tue 17-04-18 22:28:33, Li Wang wrote:
> >> On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com> wrote:
> >>
> >>> On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> >>>> On Tue 17-04-18 19:06:15, Li Wang wrote:
> >>>> [...]
> >>>>> diff --git a/mm/migrate.c b/mm/migrate.c
> >>>>> index f65dd69..2b315fc 100644
> >>>>> --- a/mm/migrate.c
> >>>>> +++ b/mm/migrate.c
> >>>>> @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
> >>> nodemask_t task_nodes,
> >>>>>                     continue;
> >>>>>
> >>>>>             err = store_status(status, i, err, 1);
> >>>>> -           if (err)
> >>>>> +           if (!err)
> >>>>>                     goto out_flush;
> >>>>
> >>>> This change just doesn't make any sense to me. Why should we bail out if
> >>>> the store_status is successul? I am trying to wrap my head around the
> >>>> test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> >>>> explain that move_pages has some semantic issues and the new
> >>>> implementation might be not 100% replacement. Anyway I am studying the
> >>>> test case to come up with a proper fix.
> >>>
> >>> OK, I get what the test cases does. I've failed to see the subtle
> >>> difference between alloc_pages_on_node and numa_alloc_onnode. The later
> >>> doesn't faul in anything.
> >>>
> >>> Why are we getting EPERM is quite not yet clear to me.
> >>> add_page_for_migration uses FOLL_DUMP which should return EFAULT on
> >>> zero pages (no_page_table()).
> >>>
> >>>         err = PTR_ERR(page);
> >>>         if (IS_ERR(page))
> >>>                 goto out;
> >>>
> >>> therefore bails out from add_page_for_migration and store_status should
> >>> store that value. There shouldn't be any EPERM on the way.
> >>>
> >>
> >> Yes, I print the the return value and confirmed the
> >> add_page_for_migration()​
> >> do right things for zero page. and after store_status(...) the status saves
> >> -EFAULT.
> >> So I did the change above.
> >
> > OK, I guess I knnow what is going on. I must be overwriting the status
> > on the way out by
> >
> > out_flush:
> > 	/* Make sure we do not overwrite the existing error */
> > 	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> > 	if (!err1)
> > 		err1 = store_status(status, start, current_node, i - start);
> >
> > This error handling is rather fragile and I was quite unhappy about it
> > at the time I was developing it. I have to remember all the details why
> > I've done it that way but I would bet my hat this is it. More on this
> > tomorrow.
> 
> Hi Michal and Li,
> 
> The problem is that the variable start is not set properly after store_status(),
> like the "start = i;" after the first store_status().
> 
> The following patch should fix the problem (it has passed all move_pages test cases from ltp
> on my machine):
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f65dd69e1fd1..32afa4723e7f 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>                         if (err)
>                                 goto out;
>                 }
> +               /* Move to next page (i+1), after we have saved page status (until i) */
> +               start = i + 1;
>                 current_node = NUMA_NO_NODE;
>         }
>  out_flush:
> 
> Feel free to check it by yourselves.

Yes, you are right. I never update start if the last page in the range
fails and so we overwrite the whole [start, i] range. I wish the code
wasn't that ugly and subtle but considering how we can fail in different
ways and that we want to batch as much as possible I do not see an easy
way.

Care to send the patch? I would just drop the comment.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-18  9:07             ` [LTP] " Michal Hocko
@ 2018-04-18  9:19               ` Michal Hocko
  -1 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-18  9:19 UTC (permalink / raw)
  To: Zi Yan; +Cc: Li Wang, linux-mm, ltp, Kirill A . Shutemov

On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> On Tue 17-04-18 16:09:33, Zi Yan wrote:
[...]
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index f65dd69e1fd1..32afa4723e7f 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >                         if (err)
> >                                 goto out;
> >                 }
> > +               /* Move to next page (i+1), after we have saved page status (until i) */
> > +               start = i + 1;
> >                 current_node = NUMA_NO_NODE;
> >         }
> >  out_flush:
> > 
> > Feel free to check it by yourselves.
> 
> Yes, you are right. I never update start if the last page in the range
> fails and so we overwrite the whole [start, i] range. I wish the code
> wasn't that ugly and subtle but considering how we can fail in different
> ways and that we want to batch as much as possible I do not see an easy
> way.
> 
> Care to send the patch? I would just drop the comment.

Hmm, thinking about it some more. An alternative would be to check for
list_empty on the page list. It is a bit larger diff but maybe that
would be tiny bit cleaner because there is simply no point to call
do_move_pages_to_node on an empty list in the first place.
---
diff --git a/mm/migrate.c b/mm/migrate.c
index 507cf9ba21bf..46f93b9ba724 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1634,12 +1634,14 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 		current_node = NUMA_NO_NODE;
 	}
 out_flush:
-	/* Make sure we do not overwrite the existing error */
-	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
-	if (!err1)
-		err1 = store_status(status, start, current_node, i - start);
-	if (!err)
-		err = err1;
+	if (!list_empty(&pagelist)) {
+		/* Make sure we do not overwrite the existing error */
+		err1 = do_move_pages_to_node(mm, &pagelist, current_node);
+		if (!err1)
+			err1 = store_status(status, start, current_node, i - start);
+		if (!err)
+			err = err1;
+	}
 out:
 	return err;
 }
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-18  9:19               ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-18  9:19 UTC (permalink / raw)
  To: ltp

On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> On Tue 17-04-18 16:09:33, Zi Yan wrote:
[...]
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index f65dd69e1fd1..32afa4723e7f 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >                         if (err)
> >                                 goto out;
> >                 }
> > +               /* Move to next page (i+1), after we have saved page status (until i) */
> > +               start = i + 1;
> >                 current_node = NUMA_NO_NODE;
> >         }
> >  out_flush:
> > 
> > Feel free to check it by yourselves.
> 
> Yes, you are right. I never update start if the last page in the range
> fails and so we overwrite the whole [start, i] range. I wish the code
> wasn't that ugly and subtle but considering how we can fail in different
> ways and that we want to batch as much as possible I do not see an easy
> way.
> 
> Care to send the patch? I would just drop the comment.

Hmm, thinking about it some more. An alternative would be to check for
list_empty on the page list. It is a bit larger diff but maybe that
would be tiny bit cleaner because there is simply no point to call
do_move_pages_to_node on an empty list in the first place.
---
diff --git a/mm/migrate.c b/mm/migrate.c
index 507cf9ba21bf..46f93b9ba724 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1634,12 +1634,14 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 		current_node = NUMA_NO_NODE;
 	}
 out_flush:
-	/* Make sure we do not overwrite the existing error */
-	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
-	if (!err1)
-		err1 = store_status(status, start, current_node, i - start);
-	if (!err)
-		err = err1;
+	if (!list_empty(&pagelist)) {
+		/* Make sure we do not overwrite the existing error */
+		err1 = do_move_pages_to_node(mm, &pagelist, current_node);
+		if (!err1)
+			err1 = store_status(status, start, current_node, i - start);
+		if (!err)
+			err = err1;
+	}
 out:
 	return err;
 }
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-18  9:19               ` [LTP] " Michal Hocko
@ 2018-04-18 10:39                 ` Li Wang
  -1 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-18 10:39 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Zi Yan, linux-mm, ltp, Kirill A . Shutemov

[-- Attachment #1: Type: text/plain, Size: 1607 bytes --]

On Wed, Apr 18, 2018 at 5:19 PM, Michal Hocko <mhocko@suse.com> wrote:

> On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> > On Tue 17-04-18 16:09:33, Zi Yan wrote:
> [...]
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index f65dd69e1fd1..32afa4723e7f 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm,
> nodemask_t task_nodes,
> > >                         if (err)
> > >                                 goto out;
> > >                 }
> > > +               /* Move to next page (i+1), after we have saved page
> status (until i) */
> > > +               start = i + 1;
> > >                 current_node = NUMA_NO_NODE;
> > >         }
> > >  out_flush:
> > >
> > > Feel free to check it by yourselves.
> >
> > Yes, you are right. I never update start if the last page in the range
> > fails and so we overwrite the whole [start, i] range. I wish the code
> > wasn't that ugly and subtle but considering how we can fail in different
> > ways and that we want to batch as much as possible I do not see an easy
> > way.
> >
> > Care to send the patch? I would just drop the comment.
>
> Hmm, thinking about it some more. An alternative would be to check for
> list_empty on the page list. It is a bit larger diff but maybe that
> would be tiny bit cleaner because there is simply no point to call
> do_move_pages_to_node on an empty list in the first place.
>

​Hi Michal, Zi

I tried your patch separately, both of them works fine to me.


-- 
Li Wang
liwang@redhat.com

[-- Attachment #2: Type: text/html, Size: 3012 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-18 10:39                 ` Li Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-18 10:39 UTC (permalink / raw)
  To: ltp

On Wed, Apr 18, 2018 at 5:19 PM, Michal Hocko <mhocko@suse.com> wrote:

> On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> > On Tue 17-04-18 16:09:33, Zi Yan wrote:
> [...]
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index f65dd69e1fd1..32afa4723e7f 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm,
> nodemask_t task_nodes,
> > >                         if (err)
> > >                                 goto out;
> > >                 }
> > > +               /* Move to next page (i+1), after we have saved page
> status (until i) */
> > > +               start = i + 1;
> > >                 current_node = NUMA_NO_NODE;
> > >         }
> > >  out_flush:
> > >
> > > Feel free to check it by yourselves.
> >
> > Yes, you are right. I never update start if the last page in the range
> > fails and so we overwrite the whole [start, i] range. I wish the code
> > wasn't that ugly and subtle but considering how we can fail in different
> > ways and that we want to batch as much as possible I do not see an easy
> > way.
> >
> > Care to send the patch? I would just drop the comment.
>
> Hmm, thinking about it some more. An alternative would be to check for
> list_empty on the page list. It is a bit larger diff but maybe that
> would be tiny bit cleaner because there is simply no point to call
> do_move_pages_to_node on an empty list in the first place.
>

​Hi Michal, Zi

I tried your patch separately, both of them works fine to me.


-- 
Li Wang
liwang@redhat.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20180418/2cec07f6/attachment.html>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-18 10:39                 ` [LTP] " Li Wang
@ 2018-04-18 11:29                   ` Michal Hocko
  -1 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-18 11:29 UTC (permalink / raw)
  To: Li Wang; +Cc: Zi Yan, linux-mm, ltp, Kirill A . Shutemov

On Wed 18-04-18 18:39:19, Li Wang wrote:
> On Wed, Apr 18, 2018 at 5:19 PM, Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> > > On Tue 17-04-18 16:09:33, Zi Yan wrote:
> > [...]
> > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > index f65dd69e1fd1..32afa4723e7f 100644
> > > > --- a/mm/migrate.c
> > > > +++ b/mm/migrate.c
> > > > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm,
> > nodemask_t task_nodes,
> > > >                         if (err)
> > > >                                 goto out;
> > > >                 }
> > > > +               /* Move to next page (i+1), after we have saved page
> > status (until i) */
> > > > +               start = i + 1;
> > > >                 current_node = NUMA_NO_NODE;
> > > >         }
> > > >  out_flush:
> > > >
> > > > Feel free to check it by yourselves.
> > >
> > > Yes, you are right. I never update start if the last page in the range
> > > fails and so we overwrite the whole [start, i] range. I wish the code
> > > wasn't that ugly and subtle but considering how we can fail in different
> > > ways and that we want to batch as much as possible I do not see an easy
> > > way.
> > >
> > > Care to send the patch? I would just drop the comment.
> >
> > Hmm, thinking about it some more. An alternative would be to check for
> > list_empty on the page list. It is a bit larger diff but maybe that
> > would be tiny bit cleaner because there is simply no point to call
> > do_move_pages_to_node on an empty list in the first place.
> >
> 
> a??Hi Michal, Zi
> 
> I tried your patch separately, both of them works fine to me.

Thanks for retesting! Do you plan to post a patch with the changelog or
should I do it?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-18 11:29                   ` Michal Hocko
  0 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-18 11:29 UTC (permalink / raw)
  To: ltp

On Wed 18-04-18 18:39:19, Li Wang wrote:
> On Wed, Apr 18, 2018 at 5:19 PM, Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> > > On Tue 17-04-18 16:09:33, Zi Yan wrote:
> > [...]
> > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > index f65dd69e1fd1..32afa4723e7f 100644
> > > > --- a/mm/migrate.c
> > > > +++ b/mm/migrate.c
> > > > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct *mm,
> > nodemask_t task_nodes,
> > > >                         if (err)
> > > >                                 goto out;
> > > >                 }
> > > > +               /* Move to next page (i+1), after we have saved page
> > status (until i) */
> > > > +               start = i + 1;
> > > >                 current_node = NUMA_NO_NODE;
> > > >         }
> > > >  out_flush:
> > > >
> > > > Feel free to check it by yourselves.
> > >
> > > Yes, you are right. I never update start if the last page in the range
> > > fails and so we overwrite the whole [start, i] range. I wish the code
> > > wasn't that ugly and subtle but considering how we can fail in different
> > > ways and that we want to batch as much as possible I do not see an easy
> > > way.
> > >
> > > Care to send the patch? I would just drop the comment.
> >
> > Hmm, thinking about it some more. An alternative would be to check for
> > list_empty on the page list. It is a bit larger diff but maybe that
> > would be tiny bit cleaner because there is simply no point to call
> > do_move_pages_to_node on an empty list in the first place.
> >
> 
> ​Hi Michal, Zi
> 
> I tried your patch separately, both of them works fine to me.

Thanks for retesting! Do you plan to post a patch with the changelog or
should I do it?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page
  2018-04-18 11:29                   ` [LTP] " Michal Hocko
@ 2018-04-18 11:46                     ` Li Wang
  -1 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-18 11:46 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Zi Yan, linux-mm, ltp, Kirill A . Shutemov

[-- Attachment #1: Type: text/plain, Size: 1981 bytes --]

On Wed, Apr 18, 2018, 19:29 Michal Hocko <mhocko@suse.com> wrote:

> On Wed 18-04-18 18:39:19, Li Wang wrote:
> > On Wed, Apr 18, 2018 at 5:19 PM, Michal Hocko <mhocko@suse.com> wrote:
> >
> > > On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> > > > On Tue 17-04-18 16:09:33, Zi Yan wrote:
> > > [...]
> > > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > > index f65dd69e1fd1..32afa4723e7f 100644
> > > > > --- a/mm/migrate.c
> > > > > +++ b/mm/migrate.c
> > > > > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct
> *mm,
> > > nodemask_t task_nodes,
> > > > >                         if (err)
> > > > >                                 goto out;
> > > > >                 }
> > > > > +               /* Move to next page (i+1), after we have saved
> page
> > > status (until i) */
> > > > > +               start = i + 1;
> > > > >                 current_node = NUMA_NO_NODE;
> > > > >         }
> > > > >  out_flush:
> > > > >
> > > > > Feel free to check it by yourselves.
> > > >
> > > > Yes, you are right. I never update start if the last page in the
> range
> > > > fails and so we overwrite the whole [start, i] range. I wish the code
> > > > wasn't that ugly and subtle but considering how we can fail in
> different
> > > > ways and that we want to batch as much as possible I do not see an
> easy
> > > > way.
> > > >
> > > > Care to send the patch? I would just drop the comment.
> > >
> > > Hmm, thinking about it some more. An alternative would be to check for
> > > list_empty on the page list. It is a bit larger diff but maybe that
> > > would be tiny bit cleaner because there is simply no point to call
> > > do_move_pages_to_node on an empty list in the first place.
> > >
> >
> > ​Hi Michal, Zi
> >
> > I tried your patch separately, both of them works fine to me.
>
> Thanks for retesting! Do you plan to post a patch with the changelog or
> should I do it?
>

You better.

Li Wang

[-- Attachment #2: Type: text/html, Size: 2927 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [RFC PATCH] mm: correct status code which move_pages() returns for zero page
@ 2018-04-18 11:46                     ` Li Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2018-04-18 11:46 UTC (permalink / raw)
  To: ltp

On Wed, Apr 18, 2018, 19:29 Michal Hocko <mhocko@suse.com> wrote:

> On Wed 18-04-18 18:39:19, Li Wang wrote:
> > On Wed, Apr 18, 2018 at 5:19 PM, Michal Hocko <mhocko@suse.com> wrote:
> >
> > > On Wed 18-04-18 11:07:22, Michal Hocko wrote:
> > > > On Tue 17-04-18 16:09:33, Zi Yan wrote:
> > > [...]
> > > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > > index f65dd69e1fd1..32afa4723e7f 100644
> > > > > --- a/mm/migrate.c
> > > > > +++ b/mm/migrate.c
> > > > > @@ -1619,6 +1619,8 @@ static int do_pages_move(struct mm_struct
> *mm,
> > > nodemask_t task_nodes,
> > > > >                         if (err)
> > > > >                                 goto out;
> > > > >                 }
> > > > > +               /* Move to next page (i+1), after we have saved
> page
> > > status (until i) */
> > > > > +               start = i + 1;
> > > > >                 current_node = NUMA_NO_NODE;
> > > > >         }
> > > > >  out_flush:
> > > > >
> > > > > Feel free to check it by yourselves.
> > > >
> > > > Yes, you are right. I never update start if the last page in the
> range
> > > > fails and so we overwrite the whole [start, i] range. I wish the code
> > > > wasn't that ugly and subtle but considering how we can fail in
> different
> > > > ways and that we want to batch as much as possible I do not see an
> easy
> > > > way.
> > > >
> > > > Care to send the patch? I would just drop the comment.
> > >
> > > Hmm, thinking about it some more. An alternative would be to check for
> > > list_empty on the page list. It is a bit larger diff but maybe that
> > > would be tiny bit cleaner because there is simply no point to call
> > > do_move_pages_to_node on an empty list in the first place.
> > >
> >
> > ​Hi Michal, Zi
> >
> > I tried your patch separately, both of them works fine to me.
>
> Thanks for retesting! Do you plan to post a patch with the changelog or
> should I do it?
>

You better.

Li Wang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20180418/43a89c60/attachment.html>

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-04-18 11:46 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-17 11:06 [RFC PATCH] mm: correct status code which move_pages() returns for zero page Li Wang
2018-04-17 11:06 ` [LTP] " Li Wang
2018-04-17 13:03 ` Michal Hocko
2018-04-17 13:03   ` [LTP] " Michal Hocko
2018-04-17 14:14   ` Michal Hocko
2018-04-17 14:14     ` [LTP] " Michal Hocko
2018-04-17 14:28     ` Li Wang
2018-04-17 14:28       ` [LTP] " Li Wang
2018-04-17 19:00       ` Michal Hocko
2018-04-17 19:00         ` [LTP] " Michal Hocko
2018-04-17 20:09         ` Zi Yan
2018-04-17 20:09           ` [LTP] " Zi Yan
2018-04-18  9:07           ` Michal Hocko
2018-04-18  9:07             ` [LTP] " Michal Hocko
2018-04-18  9:19             ` Michal Hocko
2018-04-18  9:19               ` [LTP] " Michal Hocko
2018-04-18 10:39               ` Li Wang
2018-04-18 10:39                 ` [LTP] " Li Wang
2018-04-18 11:29                 ` Michal Hocko
2018-04-18 11:29                   ` [LTP] " Michal Hocko
2018-04-18 11:46                   ` Li Wang
2018-04-18 11:46                     ` [LTP] " Li Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.