linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Longpeng (Mike)" <longpeng2@huawei.com>,
	akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
	linux-kernel@vger.kernel.org, arei.gonglei@huawei.com,
	weidong.huang@huawei.com, weifuqiang@huawei.com,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	Matthew Wilcox <willy@infradead.org>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset()
Date: Mon, 23 Mar 2020 13:09:55 -0300	[thread overview]
Message-ID: <20200323160955.GY20941@ziepe.ca> (raw)
In-Reply-To: <51a25d55-de49-4c0a-c994-bf1a8cfc8638@oracle.com>

On Sat, Mar 21, 2020 at 04:38:19PM -0700, Mike Kravetz wrote:

> Andrew dropped this patch from his tree which caused me to go back and
> look at the status of this patch/issue.
> 
> It is pretty obvious that code in the current huge_pte_offset routine
> is racy.  I checked out the assembly code produced by my compiler and
> verified that the line,
> 
> 	if (pud_huge(*pud) || !pud_present(*pud))
> 
> does actually dereference *pud twice.  So, the value could change between
> those two dereferences.   Longpeng (Mike) could easlily recreate the issue
> if he put a delay between the two dereferences.  I believe the only
> reservations/concerns about the patch below was the use of READ_ONCE().
> Is that correct?

I'm looking at a similar situation in pagewalk.c right now with PUD,
and it is very confusing to see that locks are being held, memory
accessed without READ_ONCE, but actually it has concurrent writes.

I think it is valuable to annotate with READ_ONCE when the author
knows this is an unlocked data access, regardless of what the compiler
does.

pagewalk probably has the same racy bug you show here, I'm going to
send a very similar looking patch to pagewalk hopefully soon.

Also, the remark about pmd_offset() seems accurate. The
get_user_fast_pages() pattern seems like the correct one to emulate:

  pud = READ_ONCE(*pudp);
  if (pud_none(pud)) 
     ..
  if (!pud_'is a pmd pointer')
     ..
  pmdp = pmd_offset(&pud, address);
  pmd = READ_ONCE(*pmd);
  [...]

Passing &pud in avoids another de-reference of the pudp. Honestly all
these APIs that take in page table pointers and internally
de-reference them seem very hard to use correctly when the page table
access isn't fully locked against write.

This also relies on 'some kind of locking' to prevent the pmdp from
becoming freed concurrently while this is running.

.. also this only works if READ_ONCE() is atomic, ie the pud can't be
64 bit on a 32 bit platform. At least pmd has this problem, I haven't
figured out if pud does??

> Are there any objections to the patch if the READ_ONCE() calls are removed?

I think if we know there is no concurrent data access then it makes
sense to keep the READ_ONCE.

It looks like at least the p4d read from the pgd is also unlocked here
as handle_mm_fault() writes to it??

Jason


  parent reply	other threads:[~2020-03-23 16:09 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-22  3:33 [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset() Longpeng(Mike)
2020-03-21 23:38 ` Mike Kravetz
2020-03-23  2:03   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23  2:54     ` Mike Kravetz
2020-03-23  3:43       ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-23 14:40       ` Sean Christopherson
2020-03-23 16:44         ` Jason Gunthorpe
2020-03-23 16:09   ` Jason Gunthorpe [this message]
2020-03-23 17:27     ` Mike Kravetz
2020-03-23 18:07       ` Jason Gunthorpe
2020-03-23 20:35         ` Mike Kravetz
2020-03-23 22:52           ` Jason Gunthorpe
2020-03-24  2:37             ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-24 11:55               ` Jason Gunthorpe
2020-03-24 15:25                 ` Mike Kravetz
2020-03-24 15:55                   ` Jason Gunthorpe
2020-03-24 16:19                     ` Mike Kravetz
2020-03-24 17:59                       ` Jason Gunthorpe
2020-03-24 19:47                         ` Mike Kravetz
2020-02-22  5:23 Qian Cai
2020-02-22  6:33 ` Longpeng (Mike)
2020-02-22 11:50   ` Qian Cai
2020-02-22 17:02   ` Matthew Wilcox
2020-02-23  1:24     ` Longpeng (Mike)
2020-02-27 21:41       ` Mike Kravetz
2020-03-21 22:46         ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200323160955.GY20941@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=akpm@linux-foundation.org \
    --cc=arei.gonglei@huawei.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longpeng2@huawei.com \
    --cc=mike.kravetz@oracle.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=weidong.huang@huawei.com \
    --cc=weifuqiang@huawei.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).