linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
@ 2020-01-17  7:45 Wei Yang
  2020-01-17 22:27 ` Wei Yang
  0 siblings, 1 reply; 10+ messages in thread
From: Wei Yang @ 2020-01-17  7:45 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, linux-kernel, Wei Yang

If we get here after successfully adding page to list, err would be
the number of pages in the list.

Current code has two problems:

  * on success, 0 is not returned
  * on error, the real error code is not returned

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 557da996b936..c3ef70de5876 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1677,7 +1677,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
 	if (!err1)
 		err1 = store_status(status, start, current_node, i - start);
-	if (!err)
+	if (err >= 0)
 		err = err1;
 out:
 	return err;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-17  7:45 [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero Wei Yang
@ 2020-01-17 22:27 ` Wei Yang
  2020-01-17 23:30   ` Yang Shi
  0 siblings, 1 reply; 10+ messages in thread
From: Wei Yang @ 2020-01-17 22:27 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, linux-mm, linux-kernel

On Fri, Jan 17, 2020 at 03:45:34PM +0800, Wei Yang wrote:
>If we get here after successfully adding page to list, err would be
>the number of pages in the list.
>
>Current code has two problems:
>
>  * on success, 0 is not returned
>  * on error, the real error code is not returned
>

Well, this breaks the user interface. User would receive 1 even the migration
succeed.

The change is introduced by e0153fc2c760 ("mm: move_pages: return valid node
id in status if the page is already on the target node").

>Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>---
> mm/migrate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/mm/migrate.c b/mm/migrate.c
>index 557da996b936..c3ef70de5876 100644
>--- a/mm/migrate.c
>+++ b/mm/migrate.c
>@@ -1677,7 +1677,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> 	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> 	if (!err1)
> 		err1 = store_status(status, start, current_node, i - start);
>-	if (!err)
>+	if (err >= 0)
> 		err = err1;
> out:
> 	return err;
>-- 
>2.17.1

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-17 22:27 ` Wei Yang
@ 2020-01-17 23:30   ` Yang Shi
  2020-01-17 23:48     ` Wei Yang
  2020-01-18  5:32     ` Yang Shi
  0 siblings, 2 replies; 10+ messages in thread
From: Yang Shi @ 2020-01-17 23:30 UTC (permalink / raw)
  To: Wei Yang; +Cc: Andrew Morton, Linux MM, Linux Kernel Mailing List

On Fri, Jan 17, 2020 at 2:27 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
>
> On Fri, Jan 17, 2020 at 03:45:34PM +0800, Wei Yang wrote:
> >If we get here after successfully adding page to list, err would be
> >the number of pages in the list.
> >
> >Current code has two problems:
> >
> >  * on success, 0 is not returned
> >  * on error, the real error code is not returned
> >
>
> Well, this breaks the user interface. User would receive 1 even the migration
> succeed.
>
> The change is introduced by e0153fc2c760 ("mm: move_pages: return valid node
> id in status if the page is already on the target node").

Yes, it may return a value which is > 0. But, it seems do_pages_move()
could return > 0 value even before this commit.

For example, if I read the code correctly, it would do:

If we already have some pages on the queue then
add_page_for_migration() return error, then do_move_pages_to_node() is
called, but it may return > 0 value (the number of pages that were
*not* migrated by migrate_pages()), then the code flow would just jump
to "out" and return the value. And, it may happen to be 1.

I'm not sure if it breaks the user interface since the behavior has
been existed for years, and it looks nobody complains about it. Maybe
glibc helps hide it or people just care if it is 0 and the status.

>
> >Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >---
> > mm/migrate.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/mm/migrate.c b/mm/migrate.c
> >index 557da996b936..c3ef70de5876 100644
> >--- a/mm/migrate.c
> >+++ b/mm/migrate.c
> >@@ -1677,7 +1677,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >       err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> >       if (!err1)
> >               err1 = store_status(status, start, current_node, i - start);
> >-      if (!err)
> >+      if (err >= 0)
> >               err = err1;
> > out:
> >       return err;
> >--
> >2.17.1
>
> --
> Wei Yang
> Help you, Help me
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-17 23:30   ` Yang Shi
@ 2020-01-17 23:48     ` Wei Yang
  2020-01-18  1:38       ` Mike Kravetz
  2020-01-18  4:56       ` Yang Shi
  2020-01-18  5:32     ` Yang Shi
  1 sibling, 2 replies; 10+ messages in thread
From: Wei Yang @ 2020-01-17 23:48 UTC (permalink / raw)
  To: Yang Shi; +Cc: Wei Yang, Andrew Morton, Linux MM, Linux Kernel Mailing List

On Fri, Jan 17, 2020 at 03:30:18PM -0800, Yang Shi wrote:
>On Fri, Jan 17, 2020 at 2:27 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
>>
>> On Fri, Jan 17, 2020 at 03:45:34PM +0800, Wei Yang wrote:
>> >If we get here after successfully adding page to list, err would be
>> >the number of pages in the list.
>> >
>> >Current code has two problems:
>> >
>> >  * on success, 0 is not returned
>> >  * on error, the real error code is not returned
>> >
>>
>> Well, this breaks the user interface. User would receive 1 even the migration
>> succeed.
>>
>> The change is introduced by e0153fc2c760 ("mm: move_pages: return valid node
>> id in status if the page is already on the target node").
>
>Yes, it may return a value which is > 0. But, it seems do_pages_move()
>could return > 0 value even before this commit.
>
>For example, if I read the code correctly, it would do:
>
>If we already have some pages on the queue then
>add_page_for_migration() return error, then do_move_pages_to_node() is
>called, but it may return > 0 value (the number of pages that were
>*not* migrated by migrate_pages()), then the code flow would just jump
>to "out" and return the value. And, it may happen to be 1.
>

This is another point I think current code is not working well. And actually,
the behavior is not well defined or our kernel is broken for a while.

When you look at the man page, it says:

    RETURN VALUE
           On success move_pages() returns zero.  On error, it returns -1, and sets errno to indicate the error

So per my understanding, the design is to return -1 on error instead of the
pages not managed to move.

For the user interface, if original code check 0 for success, your change
breaks it. Because your code would return 1 instead of 0. Suppose most user
just read the man page for programming instead of reading the kernel source
code. I believe we need to fix it.

Not sure how to include some user interface related developer to look into
this issue. Hope this thread may catch their eyes.

>I'm not sure if it breaks the user interface since the behavior has
>been existed for years, and it looks nobody complains about it. Maybe
>glibc helps hide it or people just care if it is 0 and the status.
>
>>
>> >Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >---
>> > mm/migrate.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> >diff --git a/mm/migrate.c b/mm/migrate.c
>> >index 557da996b936..c3ef70de5876 100644
>> >--- a/mm/migrate.c
>> >+++ b/mm/migrate.c
>> >@@ -1677,7 +1677,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>> >       err1 = do_move_pages_to_node(mm, &pagelist, current_node);
>> >       if (!err1)
>> >               err1 = store_status(status, start, current_node, i - start);
>> >-      if (!err)
>> >+      if (err >= 0)
>> >               err = err1;
>> > out:
>> >       return err;
>> >--
>> >2.17.1
>>
>> --
>> Wei Yang
>> Help you, Help me
>>

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-17 23:48     ` Wei Yang
@ 2020-01-18  1:38       ` Mike Kravetz
  2020-01-19  2:17         ` Wei Yang
  2020-01-18  4:56       ` Yang Shi
  1 sibling, 1 reply; 10+ messages in thread
From: Mike Kravetz @ 2020-01-18  1:38 UTC (permalink / raw)
  To: Wei Yang, Yang Shi; +Cc: Andrew Morton, Linux MM, Linux Kernel Mailing List

On 1/17/20 3:48 PM, Wei Yang wrote:
> This is another point I think current code is not working well. And actually,
> the behavior is not well defined or our kernel is broken for a while.
> 
> When you look at the man page, it says:
> 
>     RETURN VALUE
>            On success move_pages() returns zero.  On error, it returns -1, and sets errno to indicate the error
> 

Is this from your migrate_pages(2) man page?

The latest version of the migrate_pages(2) man page in the git repo has this
for RETURN VALUE.

RETURN VALUE
       On  success  migrate_pages() returns the number of pages that could not
       be moved (i.e., a return of zero means that all pages were successfully
       moved).  On error, it returns -1, and sets errno to indicate the error.

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-17 23:48     ` Wei Yang
  2020-01-18  1:38       ` Mike Kravetz
@ 2020-01-18  4:56       ` Yang Shi
  2020-01-19  2:41         ` Wei Yang
  1 sibling, 1 reply; 10+ messages in thread
From: Yang Shi @ 2020-01-18  4:56 UTC (permalink / raw)
  To: Wei Yang; +Cc: Andrew Morton, Linux MM, Linux Kernel Mailing List

On Fri, Jan 17, 2020 at 3:48 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
>
> On Fri, Jan 17, 2020 at 03:30:18PM -0800, Yang Shi wrote:
> >On Fri, Jan 17, 2020 at 2:27 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
> >>
> >> On Fri, Jan 17, 2020 at 03:45:34PM +0800, Wei Yang wrote:
> >> >If we get here after successfully adding page to list, err would be
> >> >the number of pages in the list.
> >> >
> >> >Current code has two problems:
> >> >
> >> >  * on success, 0 is not returned
> >> >  * on error, the real error code is not returned
> >> >
> >>
> >> Well, this breaks the user interface. User would receive 1 even the migration
> >> succeed.
> >>
> >> The change is introduced by e0153fc2c760 ("mm: move_pages: return valid node
> >> id in status if the page is already on the target node").
> >
> >Yes, it may return a value which is > 0. But, it seems do_pages_move()
> >could return > 0 value even before this commit.
> >
> >For example, if I read the code correctly, it would do:
> >
> >If we already have some pages on the queue then
> >add_page_for_migration() return error, then do_move_pages_to_node() is
> >called, but it may return > 0 value (the number of pages that were
> >*not* migrated by migrate_pages()), then the code flow would just jump
> >to "out" and return the value. And, it may happen to be 1.
> >
>
> This is another point I think current code is not working well. And actually,
> the behavior is not well defined or our kernel is broken for a while.

Yes, we already spotted a few mismatches, inconsistencies and edge
cases in these NUMA APIs.

>
> When you look at the man page, it says:
>
>     RETURN VALUE
>            On success move_pages() returns zero.  On error, it returns -1, and sets errno to indicate the error
>
> So per my understanding, the design is to return -1 on error instead of the
> pages not managed to move.

So do I.

>
> For the user interface, if original code check 0 for success, your change
> breaks it. Because your code would return 1 instead of 0. Suppose most user
> just read the man page for programming instead of reading the kernel source
> code. I believe we need to fix it.

Yes, I definitely agree we need fix it. But the commit log looks
confusing, particularly "on error, the real error code is not
returned". If the error is returned by add_page_for_migration() then
it will not be returned to userspace instead of reporting via status.
Do you mean this?

>
> Not sure how to include some user interface related developer to look into
> this issue. Hope this thread may catch their eyes.
>
> >I'm not sure if it breaks the user interface since the behavior has
> >been existed for years, and it looks nobody complains about it. Maybe
> >glibc helps hide it or people just care if it is 0 and the status.
> >
> >>
> >> >Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> >---
> >> > mm/migrate.c | 2 +-
> >> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> >diff --git a/mm/migrate.c b/mm/migrate.c
> >> >index 557da996b936..c3ef70de5876 100644
> >> >--- a/mm/migrate.c
> >> >+++ b/mm/migrate.c
> >> >@@ -1677,7 +1677,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >> >       err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> >> >       if (!err1)
> >> >               err1 = store_status(status, start, current_node, i - start);
> >> >-      if (!err)
> >> >+      if (err >= 0)
> >> >               err = err1;
> >> > out:
> >> >       return err;
> >> >--
> >> >2.17.1
> >>
> >> --
> >> Wei Yang
> >> Help you, Help me
> >>
>
> --
> Wei Yang
> Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-17 23:30   ` Yang Shi
  2020-01-17 23:48     ` Wei Yang
@ 2020-01-18  5:32     ` Yang Shi
  1 sibling, 0 replies; 10+ messages in thread
From: Yang Shi @ 2020-01-18  5:32 UTC (permalink / raw)
  To: Wei Yang; +Cc: Andrew Morton, Linux MM, Linux Kernel Mailing List

On Fri, Jan 17, 2020 at 3:30 PM Yang Shi <shy828301@gmail.com> wrote:
>
> On Fri, Jan 17, 2020 at 2:27 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
> >
> > On Fri, Jan 17, 2020 at 03:45:34PM +0800, Wei Yang wrote:
> > >If we get here after successfully adding page to list, err would be
> > >the number of pages in the list.
> > >
> > >Current code has two problems:
> > >
> > >  * on success, 0 is not returned
> > >  * on error, the real error code is not returned
> > >
> >
> > Well, this breaks the user interface. User would receive 1 even the migration
> > succeed.
> >
> > The change is introduced by e0153fc2c760 ("mm: move_pages: return valid node
> > id in status if the page is already on the target node").
>
> Yes, it may return a value which is > 0. But, it seems do_pages_move()
> could return > 0 value even before this commit.
>
> For example, if I read the code correctly, it would do:
>
> If we already have some pages on the queue then
> add_page_for_migration() return error, then do_move_pages_to_node() is
> called, but it may return > 0 value (the number of pages that were
> *not* migrated by migrate_pages()), then the code flow would just jump
> to "out" and return the value. And, it may happen to be 1.

Just figured out this is another regression introduced by a49bd4d71637
("mm, numa: rework do_pages_move"). Already posted a patch to fix it.

>
> I'm not sure if it breaks the user interface since the behavior has
> been existed for years, and it looks nobody complains about it. Maybe
> glibc helps hide it or people just care if it is 0 and the status.
>
> >
> > >Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> > >---
> > > mm/migrate.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > >diff --git a/mm/migrate.c b/mm/migrate.c
> > >index 557da996b936..c3ef70de5876 100644
> > >--- a/mm/migrate.c
> > >+++ b/mm/migrate.c
> > >@@ -1677,7 +1677,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> > >       err1 = do_move_pages_to_node(mm, &pagelist, current_node);
> > >       if (!err1)
> > >               err1 = store_status(status, start, current_node, i - start);
> > >-      if (!err)
> > >+      if (err >= 0)
> > >               err = err1;
> > > out:
> > >       return err;
> > >--
> > >2.17.1
> >
> > --
> > Wei Yang
> > Help you, Help me
> >

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-18  1:38       ` Mike Kravetz
@ 2020-01-19  2:17         ` Wei Yang
  0 siblings, 0 replies; 10+ messages in thread
From: Wei Yang @ 2020-01-19  2:17 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Wei Yang, Yang Shi, Andrew Morton, Linux MM, Linux Kernel Mailing List

On Fri, Jan 17, 2020 at 05:38:10PM -0800, Mike Kravetz wrote:
>On 1/17/20 3:48 PM, Wei Yang wrote:
>> This is another point I think current code is not working well. And actually,
>> the behavior is not well defined or our kernel is broken for a while.
>> 
>> When you look at the man page, it says:
>> 
>>     RETURN VALUE
>>            On success move_pages() returns zero.  On error, it returns -1, and sets errno to indicate the error
>> 
>
>Is this from your migrate_pages(2) man page?
>

It is from my move_pages(2) man page.

>The latest version of the migrate_pages(2) man page in the git repo has this
>for RETURN VALUE.
>
>RETURN VALUE
>       On  success  migrate_pages() returns the number of pages that could not
>       be moved (i.e., a return of zero means that all pages were successfully
>       moved).  On error, it returns -1, and sets errno to indicate the error.
>
>-- 
>Mike Kravetz

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-18  4:56       ` Yang Shi
@ 2020-01-19  2:41         ` Wei Yang
  2020-01-19  5:54           ` Yang Shi
  0 siblings, 1 reply; 10+ messages in thread
From: Wei Yang @ 2020-01-19  2:41 UTC (permalink / raw)
  To: Yang Shi; +Cc: Wei Yang, Andrew Morton, Linux MM, Linux Kernel Mailing List

On Fri, Jan 17, 2020 at 08:56:27PM -0800, Yang Shi wrote:
>On Fri, Jan 17, 2020 at 3:48 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
>>
>> On Fri, Jan 17, 2020 at 03:30:18PM -0800, Yang Shi wrote:
>> >On Fri, Jan 17, 2020 at 2:27 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
>> >>
>> >> On Fri, Jan 17, 2020 at 03:45:34PM +0800, Wei Yang wrote:
>> >> >If we get here after successfully adding page to list, err would be
>> >> >the number of pages in the list.
>> >> >
>> >> >Current code has two problems:
>> >> >
>> >> >  * on success, 0 is not returned
>> >> >  * on error, the real error code is not returned
>> >> >
>> >>
>> >> Well, this breaks the user interface. User would receive 1 even the migration
>> >> succeed.
>> >>
>> >> The change is introduced by e0153fc2c760 ("mm: move_pages: return valid node
>> >> id in status if the page is already on the target node").
>> >
>> >Yes, it may return a value which is > 0. But, it seems do_pages_move()
>> >could return > 0 value even before this commit.
>> >
>> >For example, if I read the code correctly, it would do:
>> >
>> >If we already have some pages on the queue then
>> >add_page_for_migration() return error, then do_move_pages_to_node() is
>> >called, but it may return > 0 value (the number of pages that were
>> >*not* migrated by migrate_pages()), then the code flow would just jump
>> >to "out" and return the value. And, it may happen to be 1.
>> >
>>
>> This is another point I think current code is not working well. And actually,
>> the behavior is not well defined or our kernel is broken for a while.
>
>Yes, we already spotted a few mismatches, inconsistencies and edge
>cases in these NUMA APIs.
>
>>
>> When you look at the man page, it says:
>>
>>     RETURN VALUE
>>            On success move_pages() returns zero.  On error, it returns -1, and sets errno to indicate the error
>>
>> So per my understanding, the design is to return -1 on error instead of the
>> pages not managed to move.
>
>So do I.
>
>>
>> For the user interface, if original code check 0 for success, your change
>> breaks it. Because your code would return 1 instead of 0. Suppose most user
>> just read the man page for programming instead of reading the kernel source
>> code. I believe we need to fix it.
>
>Yes, I definitely agree we need fix it. But the commit log looks
>confusing, particularly "on error, the real error code is not
>returned". If the error is returned by add_page_for_migration() then
>it will not be returned to userspace instead of reporting via status.
>Do you mean this?
>

Sorry for the confusion.

Here I mean, if add_page_for_migratioin() return 1, and the following err1
from do_move_pages_to_node() is set, the err1 is not returned.

The reason is err is not 0 at this point.

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero
  2020-01-19  2:41         ` Wei Yang
@ 2020-01-19  5:54           ` Yang Shi
  0 siblings, 0 replies; 10+ messages in thread
From: Yang Shi @ 2020-01-19  5:54 UTC (permalink / raw)
  To: Wei Yang; +Cc: Andrew Morton, Linux MM, Linux Kernel Mailing List

On Sat, Jan 18, 2020 at 6:41 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
>
> On Fri, Jan 17, 2020 at 08:56:27PM -0800, Yang Shi wrote:
> >On Fri, Jan 17, 2020 at 3:48 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
> >>
> >> On Fri, Jan 17, 2020 at 03:30:18PM -0800, Yang Shi wrote:
> >> >On Fri, Jan 17, 2020 at 2:27 PM Wei Yang <richardw.yang@linux.intel.com> wrote:
> >> >>
> >> >> On Fri, Jan 17, 2020 at 03:45:34PM +0800, Wei Yang wrote:
> >> >> >If we get here after successfully adding page to list, err would be
> >> >> >the number of pages in the list.
> >> >> >
> >> >> >Current code has two problems:
> >> >> >
> >> >> >  * on success, 0 is not returned
> >> >> >  * on error, the real error code is not returned
> >> >> >
> >> >>
> >> >> Well, this breaks the user interface. User would receive 1 even the migration
> >> >> succeed.
> >> >>
> >> >> The change is introduced by e0153fc2c760 ("mm: move_pages: return valid node
> >> >> id in status if the page is already on the target node").
> >> >
> >> >Yes, it may return a value which is > 0. But, it seems do_pages_move()
> >> >could return > 0 value even before this commit.
> >> >
> >> >For example, if I read the code correctly, it would do:
> >> >
> >> >If we already have some pages on the queue then
> >> >add_page_for_migration() return error, then do_move_pages_to_node() is
> >> >called, but it may return > 0 value (the number of pages that were
> >> >*not* migrated by migrate_pages()), then the code flow would just jump
> >> >to "out" and return the value. And, it may happen to be 1.
> >> >
> >>
> >> This is another point I think current code is not working well. And actually,
> >> the behavior is not well defined or our kernel is broken for a while.
> >
> >Yes, we already spotted a few mismatches, inconsistencies and edge
> >cases in these NUMA APIs.
> >
> >>
> >> When you look at the man page, it says:
> >>
> >>     RETURN VALUE
> >>            On success move_pages() returns zero.  On error, it returns -1, and sets errno to indicate the error
> >>
> >> So per my understanding, the design is to return -1 on error instead of the
> >> pages not managed to move.
> >
> >So do I.
> >
> >>
> >> For the user interface, if original code check 0 for success, your change
> >> breaks it. Because your code would return 1 instead of 0. Suppose most user
> >> just read the man page for programming instead of reading the kernel source
> >> code. I believe we need to fix it.
> >
> >Yes, I definitely agree we need fix it. But the commit log looks
> >confusing, particularly "on error, the real error code is not
> >returned". If the error is returned by add_page_for_migration() then
> >it will not be returned to userspace instead of reporting via status.
> >Do you mean this?
> >
>
> Sorry for the confusion.
>
> Here I mean, if add_page_for_migratioin() return 1, and the following err1
> from do_move_pages_to_node() is set, the err1 is not returned.
>
> The reason is err is not 0 at this point.

Yes, I see your point. Please elaborate this in the commit log.

>
> --
> Wei Yang
> Help you, Help me

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-01-19  5:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-17  7:45 [PATCH] mm/migrate.c: also overwrite error when it is bigger than zero Wei Yang
2020-01-17 22:27 ` Wei Yang
2020-01-17 23:30   ` Yang Shi
2020-01-17 23:48     ` Wei Yang
2020-01-18  1:38       ` Mike Kravetz
2020-01-19  2:17         ` Wei Yang
2020-01-18  4:56       ` Yang Shi
2020-01-19  2:41         ` Wei Yang
2020-01-19  5:54           ` Yang Shi
2020-01-18  5:32     ` Yang Shi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).