All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Longpeng (Mike)" <longpeng2@huawei.com>,
	akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
	linux-kernel@vger.kernel.org, arei.gonglei@huawei.com,
	weidong.huang@huawei.com, weifuqiang@huawei.com,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	Matthew Wilcox <willy@infradead.org>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset()
Date: Mon, 23 Mar 2020 15:07:06 -0300	[thread overview]
Message-ID: <20200323180706.GC20941@ziepe.ca> (raw)
In-Reply-To: <69055395-e7e5-a8e2-7f3e-f61607149318@oracle.com>

On Mon, Mar 23, 2020 at 10:27:48AM -0700, Mike Kravetz wrote:

> >  	pgd = pgd_offset(mm, addr);
> > -	if (!pgd_present(*pgd))
> > +	if (!pgd_present(READ_ONCE(*pgd)))
> >  		return NULL;
> >  	p4d = p4d_offset(pgd, addr);
> > -	if (!p4d_present(*p4d))
> > +	if (!p4d_present(READ_ONCE(*p4d)))
> >  		return NULL;
> >  
> >       pud = pud_offset(p4d, addr);
> 
> One would argue that pgd and p4d can not change from present to !present
> during the execution of this code.  To me, that seems like the issue which
> would cause an issue.  Of course, I could be missing something.

This I am not sure of, I think it must be true under the read side of
the mmap_sem, but probably not guarenteed under RCU..

In any case, it doesn't matter, the fact that *p4d can change at all
is problematic. Unwinding the above inlines we get:

  p4d = p4d_offset(pgd, addr)
  if (!p4d_present(*p4d))
      return NULL;
  pud = (pud_t *)p4d_page_vaddr(*p4d) + pud_index(address);

According to our memory model the compiler/CPU is free to execute this
as:

  p4d = p4d_offset(pgd, addr)
  p4d_for_vaddr = *p4d;
  if (!p4d_present(*p4d))
      return NULL;
  pud = (pud_t *)p4d_page_vaddr(p4d_for_vaddr) + pud_index(address);

In the case where p4 goes from !present -> present (ie
handle_mm_fault()):

p4d_for_vaddr == p4d_none, and p4d_present(*p4d) == true, meaning the
p4d_page_vaddr() will crash.

Basically the problem here is not just missing READ_ONCE, but that the
p4d is read multiple times at all. It should be written like gup_fast
does, to guarantee a single CPU read of the unstable data:

  p4d = READ_ONCE(*p4d_offset(pgdp, addr));
  if (!p4d_present(p4))
      return NULL;
  pud = pud_offset(&p4d, addr);

At least this is what I've been able to figure out :\

> > Also, the remark about pmd_offset() seems accurate. The
> > get_user_fast_pages() pattern seems like the correct one to emulate:
> > 
> >   pud = READ_ONCE(*pudp);
> >   if (pud_none(pud)) 
> >      ..
> >   if (!pud_'is a pmd pointer')
> >      ..
> >   pmdp = pmd_offset(&pud, address);
> >   pmd = READ_ONCE(*pmd);
> >   [...]
> > 
> > Passing &pud in avoids another de-reference of the pudp. Honestly all
> > these APIs that take in page table pointers and internally
> > de-reference them seem very hard to use correctly when the page table
> > access isn't fully locked against write.

And the same protocol for the PUD, etc.

> > It looks like at least the p4d read from the pgd is also unlocked here
> > as handle_mm_fault() writes to it??
> 
> Yes, there is no locking required to call huge_pte_offset().

None? Not RCU or read mmap_sem?

Jason

  reply	other threads:[~2020-03-23 18:07 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-22  3:33 [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset() Longpeng(Mike)
2020-03-21 23:38 ` Mike Kravetz
2020-03-23  2:03   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23  2:03     ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23  2:54     ` Mike Kravetz
2020-03-23  2:54       ` Mike Kravetz
2020-03-23  3:43       ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23  3:43         ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23 14:40       ` Sean Christopherson
2020-03-23 14:40         ` Sean Christopherson
2020-03-23 16:44         ` Jason Gunthorpe
2020-03-23 16:44           ` Jason Gunthorpe
2020-03-23 16:09   ` Jason Gunthorpe
2020-03-23 17:27     ` Mike Kravetz
2020-03-23 18:07       ` Jason Gunthorpe [this message]
2020-03-23 20:35         ` Mike Kravetz
2020-03-23 22:52           ` Jason Gunthorpe
2020-03-24  2:37             ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-24  2:37               ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-24 11:55               ` Jason Gunthorpe
2020-03-24 11:55                 ` Jason Gunthorpe
2020-03-24 15:25                 ` Mike Kravetz
2020-03-24 15:25                   ` Mike Kravetz
2020-03-24 15:55                   ` Jason Gunthorpe
2020-03-24 15:55                     ` Jason Gunthorpe
2020-03-24 16:19                     ` Mike Kravetz
2020-03-24 16:19                       ` Mike Kravetz
2020-03-24 17:59                       ` Jason Gunthorpe
2020-03-24 17:59                         ` Jason Gunthorpe
2020-03-24 19:47                         ` Mike Kravetz
2020-03-24 19:47                           ` Mike Kravetz
2020-02-22  5:23 Qian Cai
2020-02-22  6:33 ` Longpeng (Mike)
2020-02-22 11:50   ` Qian Cai
2020-02-22 17:02   ` Matthew Wilcox
2020-02-23  1:24     ` Longpeng (Mike)
2020-02-27 21:41       ` Mike Kravetz
2020-03-21 22:46         ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200323180706.GC20941@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=akpm@linux-foundation.org \
    --cc=arei.gonglei@huawei.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longpeng2@huawei.com \
    --cc=mike.kravetz@oracle.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=weidong.huang@huawei.com \
    --cc=weifuqiang@huawei.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.