From: David Hildenbrand <david@redhat.com>
To: Jann Horn <jannh@google.com>
Cc: security@kernel.org, Andrew Morton <akpm@linux-foundation.org>,
	Yang Shi <shy828301@gmail.com>, Peter Xu <peterx@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 1/3] mm/khugepaged: Take the right locks for page table retraction
Date: Mon, 28 Nov 2022 18:34:40 +0100
Message-ID: <a830d5fb-a2e7-6fdd-0426-8b884b4141bc@redhat.com>
In-Reply-To: <CAG48ez11R7LMxCM0QrqHT16ugqsECswCosNkQV62QsLZaLaeYQ@mail.gmail.com>

On 28.11.22 18:28, Jann Horn wrote:
> On Mon, Nov 28, 2022 at 2:53 PM David Hildenbrand <david@redhat.com> wrote:
>> On 25.11.22 22:37, Jann Horn wrote:
>>> pagetable walks on address ranges mapped by VMAs can be done under the mmap
>>> lock, the lock of an anon_vma attached to the VMA, or the lock of the VMA's
>>> address_space. Only one of these needs to be held, and it does not need to
>>> be held in exclusive mode.
>>>
>>> Under those circumstances, the rules for concurrent access to page table
>>> entries are:
>>>
>>>    - Terminal page table entries (entries that don't point to another page
>>>      table) can be arbitrarily changed under the page table lock, with the
>>>      exception that they always need to be consistent for
>>>      hardware page table walks and lockless_pages_from_mm().
>>>      This includes that they can be changed into non-terminal entries.
>>>    - Non-terminal page table entries (which point to another page table)
>>>      can not be modified; readers are allowed to READ_ONCE() an entry, verify
>>>      that it is non-terminal, and then assume that its value will stay as-is.
>>>
>>> Retracting a page table involves modifying a non-terminal entry, so
>>> page-table-level locks are insufficient to protect against concurrent
>>> page table traversal; it requires taking all the higher-level locks under
>>> which it is possible to start a page walk in the relevant range in
>>> exclusive mode.
>>>
>>> The collapse_huge_page() path for anonymous THP already follows this rule,
>>> but the shmem/file THP path was getting it wrong, making it possible for
>>> concurrent rmap-based operations to cause corruption.
>>
>> This sounds sane and correct to me. No expert on file-THP, though.
>>
>> For anon-THP it's the mmap lock and the rmap locks. I assume the only
>> difference for file-THP is that the rmap lock is actually the mapping
>> lock. Looking at rmap_walk_file(), that seems to be the case.
> 
> Yeah. You can also have private file VMAs that are associated with
> both a mapping and a set of anon_vmas, and in that case you would need
> to lock the mmap, the mapping, and the anon_vma root; but the file THP
> code in khugepaged instead just bails on file VMAs with an anon_vma.

Right, that's my understanding as well.

> 
>> I wish at least PTE table removal could be done easier ... I already
>> experimented some time ago with some ideas (e.g., lock in PMD table
>> memmap) but it's all far from trivial and space in the memmap is rare.
> 
> Because you want it to be faster? Is that for the THP usecase or something else?

It's for page table reclaim and page table migration, where you might only
have limited context and wouldn't want to take all these expensive locks in
write mode (IOW, you wouldn't want to care about them at all).

Feel free to add my

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb

