From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org,
	linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: Linux 5.1-rc5
Date: Wed, 17 Apr 2019 10:02:44 +0200
Message-ID: <20190417100244.42e29736@mschwideX1>
In-Reply-To: <20190417094637.51ad4c67@mschwideX1>

On Wed, 17 Apr 2019 09:46:37 +0200
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Tue, 16 Apr 2019 09:49:46 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > On Tue, Apr 16, 2019 at 9:16 AM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:  
> > >
> > > We actually already *have* this function.
> > >
> > > It's called "gup_fast_permitted()" and it's used by x86-64 to verify
> > > the proper address range. Exactly like s390 needs..
> > >
> > > Could you please use that instead?    
> > 
> > IOW, something like the attached.
> > 
> > Obviously untested. And maybe 'current' isn't declared in
> > <asm/pgtable.h>, in which case you'd need to modify it to instead make
> > the inline function be "s390_gup_fast_permitted()" that takes a
> > pointer to the mm, and do something like
> > 
> >   #define gup_fast_permitted(start, pages) \
> >          s390_gup_fast_permitted(current->mm, start, pages)
> > 
> > instead.
> > 
> > But I think you get the idea..  
> 
> Nice, I did not realize that gup_fast_permitted is a platform
> override-able function. So that part is doable in arch/s390. But I
> spoke too soon, I got my first crash and realized that the common gup code
> is not usable as it is. The reason is e.g. this sequence:
> 
> 	pgdp = pgd_offset(current->mm, addr);
> 	pgd_t pgd = READ_ONCE(*pgdp);
> 	/* some checking on pgd */
> 	gup_p4d_range(pgd, addr, next, write, pages, nr);
> 
> 	p4dp = p4d_offset(&pgd, addr);
> 	p4d_t p4d = READ_ONCE(*p4dp);
> 	/* some checking on p4d */
> 	gup_pud_range(p4d, addr, next, write, pages, nr);
> 
> 	pudp = pud_offset(&p4d, addr);
> 	pud_t pud = READ_ONCE(*pudp);
> 	/* some checking on pud */
> 	gup_pmd_range(pud, addr, next, write, pages, nr);
> 
> Each step along the way will read the page table entry and pass the
> table entry to the next function. This clashes with the page table
> folding on s390. The s390 gup code looks more like this:
> 
> 	pgdp = pgd_offset(current->mm, addr);
> 	pgd_t pgd = READ_ONCE(*pgdp);
> 	/* some checking on pgd */
> 	gup_p4d_range(pgdp, pgd, addr, next, write, pages, nr);
> 
> 	p4dp = p4d_offset(pgdp, addr);
> 	p4d_t p4d = READ_ONCE(*p4dp);
> 	/* some checking on p4d */
> 	gup_pud_range(p4dp, p4d, addr, next, write, pages, nr);
> 
> 	pudp = pud_offset(p4dp, addr);
> 	pud_t pud = READ_ONCE(*pudp);
> 	/* some checking on pud */
> 	gup_pmd_range(pudp, pud, addr, next, write, pages, nr);
> 
> There are magic dereferences in the s390 versions of p4d_offset,
> pud_offset and pmd_offset functions. To make this work the pointer
> passed to these functions may not be the local copy of the already
> dereferenced table entry. I'll cook up a patch for the common code.

Grumpf, that does *not* work. For gup each table entry must be read only
once, and the s390 versions of p4d_offset, pud_offset and pmd_offset would
read it a second time through the pointer. Now I remember why I open-coded
p4d_offset, pud_offset and pmd_offset in arch/s390/mm/gup.c: to avoid
reading the table entries twice. It will be hard to use the common gup code
after all.
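
To make the double read concrete, here is a minimal user-space sketch. It
is not kernel code. It assumes a toy two-level table whose upper level is
folded away, and every toy_* name, the entry encoding and the index shifts
are made up for illustration. One offset helper dereferences the pointer
itself (the "magic dereference"), the other gets the already-read value
passed in, which is roughly what open-coding the helper buys you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy user-space model, not kernel code. All toy_* names are hypothetical. */
#define TOY_ENTRIES    4u
#define TOY_TYPE_MASK  ((uintptr_t)0x3)
#define TOY_TYPE_UPPER ((uintptr_t)0x1)	/* entry links to a lower table */
#define TOY_TYPE_LOWER ((uintptr_t)0x2)	/* entry is itself a lower-level entry */

typedef struct { uintptr_t val; } toy_entry_t;

static unsigned int toy_reads;	/* number of simulated READ_ONCE() calls */

/* Stand-in for READ_ONCE(): a single volatile load, counted. */
static toy_entry_t toy_read_once(const toy_entry_t *p)
{
	toy_entry_t e;

	e.val = *(const volatile uintptr_t *)&p->val;
	toy_reads++;
	return e;
}

static size_t toy_upper_index(unsigned long addr)
{
	return (addr >> 4) & (TOY_ENTRIES - 1);
}

static size_t toy_lower_index(unsigned long addr)
{
	return (addr >> 2) & (TOY_ENTRIES - 1);
}

/* Offset helper with a hidden dereference: it reads *upperp again. */
static toy_entry_t *toy_lower_offset(toy_entry_t *upperp, unsigned long addr)
{
	toy_entry_t *lower = upperp;	/* folded: stay in the same table */
	toy_entry_t upper = toy_read_once(upperp);

	if ((upper.val & TOY_TYPE_MASK) == TOY_TYPE_UPPER)
		lower = (toy_entry_t *)(upper.val & ~TOY_TYPE_MASK);
	return lower + toy_lower_index(addr);
}

/* Open-coded variant: decides from the value the caller already read. */
static toy_entry_t *toy_lower_offset_val(toy_entry_t *upperp, toy_entry_t upper,
					 unsigned long addr)
{
	toy_entry_t *lower = upperp;

	if ((upper.val & TOY_TYPE_MASK) == TOY_TYPE_UPPER)
		lower = (toy_entry_t *)(upper.val & ~TOY_TYPE_MASK);
	return lower + toy_lower_index(addr);
}

/* Walk using the helper with the hidden dereference. */
static uintptr_t toy_walk_reread(toy_entry_t *top, unsigned long addr)
{
	toy_entry_t *upperp = top + toy_upper_index(addr);
	toy_entry_t upper = toy_read_once(upperp);

	(void)upper;			/* some checking on upper */
	return toy_read_once(toy_lower_offset(upperp, addr)).val;
}

/* Walk that passes both the pointer and the value down. */
static uintptr_t toy_walk_once(toy_entry_t *top, unsigned long addr)
{
	toy_entry_t *upperp = top + toy_upper_index(addr);
	toy_entry_t upper = toy_read_once(upperp);

	/* some checking on upper */
	return toy_read_once(toy_lower_offset_val(upperp, upper, addr)).val;
}

/* Walk a folded table once and report how many loads that took. */
static unsigned int toy_count_reads(int reread)
{
	toy_entry_t top[TOY_ENTRIES];
	unsigned long addr = 2ul << 2;	/* upper index 0, lower index 2 */
	uintptr_t got;
	size_t i;

	for (i = 0; i < TOY_ENTRIES; i++)
		top[i].val = ((uintptr_t)(i + 1) << 4) | TOY_TYPE_LOWER;
	toy_reads = 0;
	got = reread ? toy_walk_reread(top, addr) : toy_walk_once(top, addr);
	assert(got == top[2].val);	/* both walks find the same entry */
	return toy_reads;
}
```

On the folded table toy_count_reads(0) needs two loads and
toy_count_reads(1) needs three. The extra load is the second read of the
upper entry inside the offset helper, which is the one gup cannot afford.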

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.