From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org,
	linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: Linux 5.1-rc5
Date: Wed, 17 Apr 2019 10:02:44 +0200
Message-ID: <20190417100244.42e29736@mschwideX1>
In-Reply-To: <20190417094637.51ad4c67@mschwideX1>

On Wed, 17 Apr 2019 09:46:37 +0200
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Tue, 16 Apr 2019 09:49:46 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > On Tue, Apr 16, 2019 at 9:16 AM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:  
> > >
> > > We actually already *have* this function.
> > >
> > > It's called "gup_fast_permitted()" and it's used by x86-64 to verify
> > > the proper address range. Exactly like s390 needs..
> > >
> > > Could you please use that instead?    
> > 
> > IOW, something like the attached.
> > 
> > Obviously untested. And maybe 'current' isn't declared in
> > <asm/pgtable.h>, in which case you'd need to modify it to instead make
> > the inline function be "s390_gup_fast_permitted()" that takes a
> > pointer to the mm, and do something like
> > 
> >   #define gup_fast_permitted(start, pages) \
> >          s390_gup_fast_permitted(current->mm, start, pages)
> > 
> > instead.
> > 
> > But I think you get the idea..  
> 
> Nice, I did not realize that gup_fast_permitted is a platform
> overridable function. So that part is doable in arch/s390. But I
> spoke too soon, I got my first crash and realized that the common gup code
> is not usable as it is. The reason is, e.g., this sequence:
> 
> 	pgdp = pgd_offset(current->mm, addr);
> 	pgd_t pgd = READ_ONCE(*pgdp);
> 	/* some checking on pgd */
> 	gup_p4d_range(pgd, addr, next, write, pages, nr);
> 
> 	p4dp = p4d_offset(&pgd, addr);
> 	p4d_t p4d = READ_ONCE(*p4dp);
> 	/* some checking on p4d */
> 	gup_pud_range(p4d, addr, next, write, pages, nr);
> 
> 	pudp = pud_offset(&p4d, addr);
> 	pud_t pud = READ_ONCE(*pudp);
> 	/* some checking on pud */
> 	gup_pmd_range(pud, addr, next, write, pages, nr);
> 
> Each step along the way will read the page table entry and pass the
> table entry to the next function. This clashes with the page table
> folding on s390. The s390 gup code looks more like this:
> 
> 	pgdp = pgd_offset(current->mm, addr);
> 	/* some checking on pgd */
> 	pgd_t pgd = READ_ONCE(*pgdp);
> 	gup_p4d_range(pgdp, pgd, addr, next, write, pages, &nr);
> 
> 	p4dp = p4d_offset(pgdp, addr);
> 	p4d_t p4d = READ_ONCE(*p4dp);
> 	/* some checking on p4d */
> 	gup_pud_range(p4dp, p4d, addr, next, write, pages, nr);
> 
> 	pudp = pud_offset(p4dp, addr);
> 	pud_t pud = READ_ONCE(*pudp);
> 	/* some checking on pud */
> 	gup_pmd_range(pudp, pud, addr, next, write, pages, nr);
> 
> There are magic dereferences in the s390 versions of the p4d_offset,
> pud_offset and pmd_offset functions. To make this work, the pointer
> passed to these functions must not point to the local copy of the already
> dereferenced table entry. I'll cook up a patch for the common code.

Grumpf, that does *not* work. For gup, the table entries may be read only
once. Now I remember why I open-coded p4d_offset, pud_offset and pmd_offset
in arch/s390/mm/gup.c: to avoid reading the table entries twice.
It will be hard to use the common gup code after all.
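
To make the clash concrete, here is a minimal user-space sketch of the
three call styles discussed above. It is a toy model under stated
assumptions, not kernel code: every toy_* name is hypothetical, a folded
level is assumed to be handled by reinterpreting the upper-level pointer
as the lower-level table, and toy_read_once() only counts loads as a
stand-in for READ_ONCE().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of a dynamically folded table level. */
#define TOY_TYPE_MASK 0x3UL
#define TOY_TYPE_REAL 0x3UL	/* entry points to a real lower-level table */

typedef struct { uintptr_t val; } toy_entry_t;

static int toy_reads;		/* number of loads done via toy_read_once() */

/* Stand-in for READ_ONCE(): one counted load through a volatile pointer. */
static toy_entry_t toy_read_once(const toy_entry_t *ep)
{
	toy_entry_t e;

	toy_reads++;
	e.val = ((const volatile toy_entry_t *)ep)->val;
	return e;
}

/* Lower-level slot address from the entry's address AND its value. */
static uintptr_t toy_slot(const toy_entry_t *ep, toy_entry_t e, unsigned idx)
{
	uintptr_t base = (uintptr_t)ep;		/* folded: stay in this table */

	if ((e.val & TOY_TYPE_MASK) == TOY_TYPE_REAL)
		base = e.val & ~TOY_TYPE_MASK;	/* real level: follow the entry */
	return base + idx * sizeof(toy_entry_t);
}

/* Offset helper with a hidden dereference of the pointer it is given. */
static uintptr_t toy_offset(const toy_entry_t *ep, unsigned idx)
{
	return toy_slot(ep, toy_read_once(ep), idx);
}

struct toy_result {
	int copy_ok;			/* (a) slot found via &local_copy? */
	int ptr_ok, ptr_reads;		/* (b) real pointer, helper re-reads */
	int both_ok, both_reads;	/* (c) real pointer plus read value */
};

struct toy_result toy_demo(void)
{
	static toy_entry_t table[4];	/* zeroed: type != REAL, so folded */
	uintptr_t want = (uintptr_t)&table[2];
	struct toy_result r;
	toy_entry_t copy, e;

	/* (a) read once, then hand the address of the local copy down */
	copy = toy_read_once(&table[0]);
	r.copy_ok = toy_offset(&copy, 2) == want;

	/* (b) hand the real pointer down; the helper loads the entry again */
	toy_reads = 0;
	(void)toy_read_once(&table[0]);	/* the caller's own checked read */
	r.ptr_ok = toy_offset(&table[0], 2) == want;
	r.ptr_reads = toy_reads;

	/* (c) hand down both the real pointer and the value already read */
	toy_reads = 0;
	e = toy_read_once(&table[0]);
	r.both_ok = toy_slot(&table[0], e, 2) == want;
	r.both_reads = toy_reads;

	return r;
}
```

Calling toy_demo() from a small main() should show that (a) lands on the
wrong slot, (b) finds the slot but loads the entry twice, and (c) finds it
with a single load. That is roughly why the s390 sequence quoted above
passes both the pointer and the value down the walk.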

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
