All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Xu <peterx@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
Subject: Re: [PATCH 1/2] mm/mempolicy: Allow lookup_node() to handle fatal signal
Date: Tue, 14 Apr 2020 13:04:29 +0200	[thread overview]
Message-ID: <20200414110429.GF4629@dhcp22.suse.cz> (raw)
In-Reply-To: <CAHk-=whwRqkwdaJQf4g0-Evd6RmXR3dkkKyfnPjbnkeia=b1ug@mail.gmail.com>

On Thu 09-04-20 09:42:20, Linus Torvalds wrote:
> On Thu, Apr 9, 2020 at 12:03 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > This patch however doesn't go all the way to revert it because 0 return
> > value is impossible.
> 
> I'm not convinced it's impossible.

__get_user_pages is documented as
 * -- If nr_pages is 0, returns 0.
 * -- If nr_pages is >0, but no pages were pinned, returns -errno.
 * -- If nr_pages is >0, and some pages were pinned, returns the number of
 *    pages pinned. Again, this may be less than nr_pages.

but let me double check the actual code... There seem to be only one
exception the above rule AFAICS. faultin_page returning EBUSY will
be overriden to either 0 for the first page or return the number of
already pinned pages. So nr_pages > 0 && ret = 0 is indeed possible
from __get_user_pages :/ That will be the case only for VM_FAULT_RETRY,
thoug.

Now __get_user_pages_locked behaves differently. It keeps retrying the
fault until it succeeds unless FOLL_NOWAIT is specified. Then it would
return 0. Why we need to return 0 is not really clear to me but it
seem to be a long term behavior. I believe we need to document it.

> And if it is, then the current code is harmless.

Yes from the above it seems that the check is indeed harmless becasue
this path doesn't use FOLL_NOWAIT and so it will never see 0 return.
I find a reference to EINTR confusing so I would still love to change
that.

> Now, I do agree that we probably should go through and clarify the
> whole range of different get_user_pages() cases of returning zero (or
> not doing so), but right now it's so confusing that I'd prefer to keep
> that (possibly unnecessary) belt-and-suspenders check for zero in
> there.
>
> If/when somebody actually does a real audit and the result is "these
> functions cannot return zero" and it's documented, then we can remove
> those checks.

Would you mind this patch instead?

commit bc6c0fa7c7fb5eb54963dca65ae4a62ba04c9efa
Author: Michal Hocko <mhocko@suse.com>
Date:   Thu Apr 9 08:26:57 2020 +0200

    mm, mempolicy: fix up gup usage in lookup_node
    
    ba841078cd05 ("mm/mempolicy: Allow lookup_node() to handle fatal signal") has
    added a special casing for 0 return value because that was a possible
    gup return value when interrupted by fatal signal. This has been fixed
    by ae46d2aa6a7f ("mm/gup: Let __get_user_pages_locked() return -EINTR
    for fatal signal") in the mean time so ba841078cd05 can be reverted.
    
    This patch however doesn't go all the way to revert it because the check
    for 0 is wrong and confusing here. Firstly it is inherently unsafe to
    access the page when get_user_pages_locked returns 0 (aka no page
    returned).
    Fortunatelly this will not happen because get_user_pages_locked will not
    return 0 when nr_pages > 0 unless FOLL_NOWAIT is specified which is not
    the case here. Document this potential error code in gup code while we
    are at it.
    
    Signed-off-by: Michal Hocko <mhocko@suse.com>

diff --git a/mm/gup.c b/mm/gup.c
index 50681f0286de..a8575b880baf 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -980,6 +980,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
  * -- If nr_pages is >0, but no pages were pinned, returns -errno.
  * -- If nr_pages is >0, and some pages were pinned, returns the number of
  *    pages pinned. Again, this may be less than nr_pages.
+ * -- 0 return value is possible when the fault would need to be retried.
  *
  * The caller is responsible for releasing returned @pages, via put_page().
  *
@@ -1247,6 +1248,10 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 }
 EXPORT_SYMBOL_GPL(fixup_user_fault);
 
+/*
+ * Please note that this function, unlike __get_user_pages will not
+ * return 0 for nr_pages > 0 without FOLL_NOWAIT
+ */
 static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 						struct mm_struct *mm,
 						unsigned long start,
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 48ba9729062e..1965e2681877 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -927,10 +927,7 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
 
 	int locked = 1;
 	err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
-	if (err == 0) {
-		/* E.g. GUP interrupted by fatal signal */
-		err = -EFAULT;
-	} else if (err > 0) {
+	if (err > 0) {
 		err = page_to_nid(p);
 		put_page(p);
 	}
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2020-04-14 11:04 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-08  1:40 [PATCH 0/2] mm: Two small fixes for recent syzbot reports Peter Xu
2020-04-08  1:40 ` [PATCH 1/2] mm/mempolicy: Allow lookup_node() to handle fatal signal Peter Xu
2020-04-08 10:21   ` Michal Hocko
2020-04-08 14:20     ` Peter Xu
2020-04-08 14:30       ` Michal Hocko
2020-04-08 15:24         ` Peter Xu
2020-04-08 15:26           ` Michal Hocko
2020-04-09  7:02   ` Michal Hocko
2020-04-09 12:52     ` Peter Xu
2020-04-09 13:00       ` Peter Xu
2020-04-09 13:53       ` Michal Hocko
2020-04-09 16:42     ` Linus Torvalds
2020-04-09 16:42       ` Linus Torvalds
2020-04-14 11:04       ` Michal Hocko [this message]
2020-04-14 13:49         ` Peter Xu
2020-04-14 14:18           ` Michal Hocko
2020-04-20 12:47         ` Michal Hocko
2020-04-20 17:31           ` Linus Torvalds
2020-04-20 17:31             ` Linus Torvalds
2020-04-21  7:09             ` Michal Hocko
2020-04-08  1:40 ` [PATCH 2/2] mm/gup: Mark lock taken only after a successful retake Peter Xu
2020-04-09  0:47 ` [PATCH 0/2] mm: Two small fixes for recent syzbot reports Andrew Morton
2020-04-09 11:49   ` Matthew Wilcox
2020-04-09 13:00     ` Dmitry Vyukov
2020-04-09 13:00       ` Dmitry Vyukov
2020-04-09 18:16       ` Andrew Morton
2020-04-09 18:53         ` Linus Torvalds
2020-04-09 18:53           ` Linus Torvalds
2020-04-09 19:12           ` Andrew Morton
2020-04-09 19:46             ` Linus Torvalds
2020-04-09 19:46               ` Linus Torvalds
2020-04-09 19:56               ` Matthew Wilcox
2020-04-09 19:58                 ` Linus Torvalds
2020-04-09 19:58                   ` Linus Torvalds
2020-04-09 20:27                   ` Eric Biggers
2020-04-09 20:34                     ` Linus Torvalds
2020-04-09 20:34                       ` Linus Torvalds
2020-04-09 23:34                       ` Stephen Rothwell
2020-04-10  1:11                       ` Theodore Y. Ts'o
2020-04-09 12:55   ` Dmitry Vyukov
2020-04-09 12:55     ` Dmitry Vyukov
2020-04-09 16:32     ` Linus Torvalds
2020-04-09 16:32       ` Linus Torvalds
2020-04-09 16:58       ` Qian Cai
2020-04-09 17:05         ` Linus Torvalds
2020-04-09 17:05           ` Linus Torvalds
2020-04-09 17:58           ` Qian Cai
2020-04-09 18:06             ` Linus Torvalds
2020-04-09 18:06               ` Linus Torvalds
2020-04-09 21:14               ` Qian Cai
2020-04-10 13:12                 ` Tetsuo Handa
2020-04-10 14:26                   ` Qian Cai
2020-04-10 17:26                     ` Andrew Morton
2020-04-10 19:46                       ` Qian Cai
2020-04-09 23:29       ` Stephen Rothwell
2020-04-13 22:06         ` Qian Cai
2020-04-13 23:05           ` Jens Axboe
2020-04-14 11:12           ` Dmitry Vyukov
2020-04-14 11:12             ` Dmitry Vyukov
2020-04-14 11:59             ` Qian Cai
2020-04-14 12:05               ` Dmitry Vyukov
2020-04-14 12:05                 ` Dmitry Vyukov
2020-04-14 19:28             ` Dan Rue
2020-04-15 11:09               ` Dmitry Vyukov
2020-04-15 11:09                 ` Dmitry Vyukov
2020-04-15 16:23                 ` Dan Rue
2020-04-16  0:34             ` Stephen Rothwell
2020-05-11 15:29               ` Dmitry Vyukov
2020-05-11 15:29                 ` Dmitry Vyukov
2020-04-14  4:07         ` Hillf Danton
2020-04-14  4:31           ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200414110429.GF4629@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterx@redhat.com \
    --cc=syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.