linux-kernel.vger.kernel.org archive mirror
From: Will Deacon <will@kernel.org>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Hugh Dickins <hughd@google.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Yang Shi <shy828301@gmail.com>,
	Wang Yugui <wangyugui@e16-tech.com>,
	Matthew Wilcox <willy@infradead.org>,
	Alistair Popple <apopple@nvidia.com>,
	Ralph Campbell <rcampbell@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Peter Xu <peterx@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 03/11] mm: page_vma_mapped_walk(): use pmd_read_atomic()
Date: Tue, 15 Jun 2021 10:46:39 +0100
Message-ID: <20210615094639.GC19878@willie-the-truck>
In-Reply-To: <20210611194249.GS1096940@ziepe.ca>

On Fri, Jun 11, 2021 at 04:42:49PM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 11, 2021 at 12:05:42PM -0700, Hugh Dickins wrote:
> > > diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
> > > index e896ebef8c24cb..0bf1fdec928e71 100644
> > > +++ b/arch/x86/include/asm/pgtable-3level.h
> > > @@ -75,7 +75,7 @@ static inline void native_set_pte(pte_t *ptep, pte_t pte)
> > >  static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
> > >  {
> > >  	pmdval_t ret;
> > > -	u32 *tmp = (u32 *)pmdp;
> > > +	u32 *tmp = READ_ONCE((u32 *)pmdp);
> > >  
> > >  	ret = (pmdval_t) (*tmp);
> > >  	if (ret) {
> > > @@ -84,7 +84,7 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
> > >  		 * or we can end up with a partial pmd.
> > >  		 */
> > >  		smp_rmb();
> > > -		ret |= ((pmdval_t)*(tmp + 1)) << 32;
> > > +		ret |= READ_ONCE((pmdval_t)*(tmp + 1)) << 32;
> > >  	}
> > 
> > Maybe that.  Or maybe now (since Will's changes) it can just do
> > one READ_ONCE() of the whole, then adjust its local copy.
> 
> I think the smp_rmb() is critical here to ensure a PTE table pointer
> is coherent, READ_ONCE is not a substitute, unless I am
> misunderstanding what Will's changes are???

Yes, I agree that the barrier is needed here for x86 PAE. I would really
have liked to enforce native-sized access in READ_ONCE(), but unfortunately
there is plenty of code out there which is resilient to a 64-bit access
being split into two separate 32-bit accesses and so I wasn't able to go
that far.
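
To make the ordering requirement concrete, here is a minimal userspace
sketch of a split 32-bit read that keeps the barrier between the two
halves. It is not the kernel's implementation: the union layout,
pmd_read_atomic_sketch(), and the fence standing in for smp_rmb() are
all assumptions for illustration, and the layout assumes a little-endian
target. Note that READ_ONCE() is applied to each 32-bit lvalue, since
the macro takes the address of its argument.

```c
#include <stdint.h>

typedef uint64_t pmdval_t;

/* Hypothetical layout for this sketch: low/high halves alias the value. */
typedef union {
	pmdval_t pmd;
	struct {
		uint32_t pmd_low;	/* little-endian: low half first */
		uint32_t pmd_high;
	};
} pmd_t;

/* Simplified stand-in for the kernel's READ_ONCE(); needs an lvalue. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Userspace stand-in for smp_rmb(); an assumption, not the kernel macro. */
#define smp_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)

static inline pmd_t pmd_read_atomic_sketch(pmd_t *pmdp)
{
	pmd_t ret;

	/* Read the low half first; zero means there is nothing to follow. */
	ret.pmd = (pmdval_t)READ_ONCE(pmdp->pmd_low);
	if (ret.pmd) {
		/* Order the high-half read after the low-half read. */
		smp_rmb();
		ret.pmd |= (pmdval_t)READ_ONCE(pmdp->pmd_high) << 32;
	}
	return ret;
}
```

When the low half reads as zero, the high half is never loaded and the
function returns zero, which is the partial-pmd case the barrier comment
in the quoted diff is guarding against.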

That being said, pmd_read_atomic() probably _should_ be using READ_ONCE()
because using it inconsistently can give rise to broken codegen, e.g. if
you do:

	pmdval_t x, y, z;

	x = *pmdp;			// Plain load: sees Invalid
	y = READ_ONCE(*pmdp);		// Fresh load: sees Valid
	if (pmd_valid(y))
		z = *pmdp;		// Plain load may reuse x: Invalid again!

Then the compiler can allocate the same register for x and z, reusing
the first plain load instead of reloading, but it will issue an
additional load for y. If a concurrent update to the pmd transitions it
from Invalid -> Valid between the loads of x and y, then it will look as
though things went back in time, because z will be stale. We actually
hit this on arm64 in practice [1].
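
One way to avoid the mixed-access pattern is to take a single
READ_ONCE() snapshot and work only on the local copy. The following is
a userspace sketch under stated assumptions: PMD_VALID_SKETCH, this
pmd_valid() definition, and pmd_snapshot_if_valid() are hypothetical
names, and the single-load behaviour assumes a 64-bit target (on a
32-bit target the 64-bit volatile access may still be split).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pmdval_t;

/* Simplified stand-in for the kernel's READ_ONCE(); needs an lvalue. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical valid bit, chosen only for this sketch. */
#define PMD_VALID_SKETCH ((pmdval_t)1)

static inline bool pmd_valid(pmdval_t v)
{
	return (v & PMD_VALID_SKETCH) != 0;
}

/*
 * Load the entry exactly once and make every later decision on the
 * local copy, so the check and the use cannot observe different values.
 * Returns the snapshot if it is valid, otherwise 0.
 */
static inline pmdval_t pmd_snapshot_if_valid(pmdval_t *pmdp)
{
	pmdval_t pmd = READ_ONCE(*pmdp);

	return pmd_valid(pmd) ? pmd : 0;
}
```

Because there is no second dereference of pmdp, the compiler has no
earlier plain load to reuse, and the value that was checked is the value
that is returned.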

Will

[1] https://lore.kernel.org/lkml/20171003114244.430374928@linuxfoundation.org/

