From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org,
	linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: Linux 5.1-rc5
Date: Tue, 16 Apr 2019 14:06:58 +0200
Message-ID: <20190416140658.2cb73a3f@mschwideX1>
In-Reply-To: <20190416110906.6c773aff@mschwideX1>

On Tue, 16 Apr 2019 11:09:06 +0200
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Mon, 15 Apr 2019 09:17:10 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig <hch@infradead.org> wrote:  
> > >
> > > Can we please have the page refcount overflow fixes out on the list
> > > for review, even if it is after the fact?    
> > 
> > They were actually on a list for review long before the fact, but it
> > was the security mailing list. The issue actually got discussed back
> > in January along with early versions of the patches, but then we
> > dropped the ball because it just wasn't on anybody's radar and it got
> > resurrected late March. Willy wrote a rather bigger patch-series, and
> > review of that is what then resulted in those commits. So they may
> > look recent, but that's just because the original patches got
> > seriously edited down and rewritten.  
> 
> First time I hear about this, thanks for the heads up.
>  
> > That said, powerpc and s390 should at least look at maybe adding a
> > check for the page ref in their gup paths too. Powerpc has the special
> > gup_hugepte() case, and s390 has its own version of gup entirely. I
> > was actually hoping the s390 guys would look at using the generic gup
> > code.  
> 
> We did look at converting the s390 gup code to CONFIG_HAVE_GENERIC_GUP,
> there are some details that need careful consideration. The top one
> is access_ok(), for s390 we always return true. The generic gup code
> relies on the fact that a page table walk with a specific address is
> doable if access_ok() returned true, the s390 specific check is slightly
> different:
> 
>         if ((end <= start) || (end > mm->context.asce_limit))
>                 return 0;
> 
> The obvious approach would be to modify access_ok() to check against
> the asce_limit. I will try and see if anything breaks, e.g. the automatic
> page table upgrade.

I tested the waters in regard to access_ok() and the generic gup code.
The good news is that mm/gup.c with CONFIG_HAVE_GENERIC_GUP=y seems to
work just fine if the access_ok() issue is taken care of. But...

Bloat-o-meter with a non-empty access_ok() that checks against
current->mm->context.asce_limit:

add/remove: 8/2 grow/shrink: 611/11 up/down: 61352/-1914 (59438)

with CONFIG_HAVE_GENERIC_GUP on top of that

add/remove: 10/2 grow/shrink: 612/12 up/down: 63568/-3280 (60288)

This is not nice. Would a patch like the following be acceptable?
--
Subject: [PATCH] mm: introduce mm_pgd_walk_ok

Add the architecture-overridable function mm_pgd_walk_ok() to check
whether a block of memory is inside the limits of the page table
hierarchy of a given mm struct.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 include/asm-generic/pgtable.h | 4 ++++
 mm/gup.c                      | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index fa782fba51ee..7d2a8a58f1c1 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1186,4 +1186,8 @@ static inline bool arch_has_pfn_modify_check(void)
 #define mm_pmd_folded(mm)	__is_defined(__PAGETABLE_PMD_FOLDED)
 #endif
 
+#ifndef mm_pgd_walk_ok
+#define mm_pgd_walk_ok(mm, addr, size)	access_ok(addr, size)
+#endif
+
 #endif /* _ASM_GENERIC_PGTABLE_H */
diff --git a/mm/gup.c b/mm/gup.c
index 91819b8ad9cc..b3eb3f45d237 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1990,7 +1990,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	len = (unsigned long) nr_pages << PAGE_SHIFT;
 	end = start + len;
 
-	if (unlikely(!access_ok((void __user *)start, len)))
+	if (unlikely(!mm_pgd_walk_ok(current->mm, (void __user *)start, len)))
 		return 0;
 
 	/*
@@ -2044,7 +2044,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	if (nr_pages <= 0)
 		return 0;
 
-	if (unlikely(!access_ok((void __user *)start, len)))
+	if (unlikely(!mm_pgd_walk_ok(current->mm, (void __user *)start, len)))
 		return -EFAULT;
 
 	if (gup_fast_permitted(start, nr_pages)) {
-- 
2.16.4

With an empty access_ok() but a "real" mm_pgd_walk_ok() the results are
much more reasonable:

add/remove: 2/0 grow/shrink: 2/1 up/down: 2186/-1382 (804)
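
For illustration only, an architecture override of mm_pgd_walk_ok() could
look roughly like the userspace model below. This is a sketch under stated
assumptions, not the s390 code behind the numbers above: struct mm_model,
access_ok_model(), arch_pgd_walk_ok() and gup_fast_check() are hypothetical
names, and only the range test itself is taken from the s390 gup check
quoted earlier in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mm context; only asce_limit is modeled. */
struct mm_model {
	unsigned long asce_limit;	/* upper bound of the page table hierarchy */
};

/* Stand-in for an access_ok() that always returns true, as on s390. */
static inline bool access_ok_model(unsigned long addr, unsigned long size)
{
	(void)addr;
	(void)size;
	return true;
}

/*
 * Hypothetical arch override: the range must not wrap around and must
 * end at or below asce_limit. This mirrors the quoted s390 check
 * "(end <= start) || (end > mm->context.asce_limit)", negated.
 */
static inline bool arch_pgd_walk_ok(const struct mm_model *mm,
				    unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;

	assert(mm != NULL);
	return end > addr && end <= mm->asce_limit;
}
#define mm_pgd_walk_ok(mm, addr, size)	arch_pgd_walk_ok(mm, addr, size)

/* Generic fallback as in the patch; unused here because the macro exists. */
#ifndef mm_pgd_walk_ok
#define mm_pgd_walk_ok(mm, addr, size)	access_ok_model(addr, size)
#endif

/* Model of the early exit in get_user_pages_fast(); -14 stands for -EFAULT. */
static int gup_fast_check(const struct mm_model *mm, unsigned long start,
			  unsigned long len)
{
	if (!mm_pgd_walk_ok(mm, start, len))
		return -14;
	return 0;
}
```

The point of the split is that access_ok() can stay empty for every other
caller, while only the two gup entry points pay for the limit check.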

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

