linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Zach O'Keefe <zokeefe@google.com>,
	linux-kernel@vger.kernel.org, Yang Shi <shy828301@gmail.com>
Subject: Re: [PATCH] mm/khugepaged: Fix ->anon_vma race
Date: Mon, 16 Jan 2023 14:07:41 +0100	[thread overview]
Message-ID: <5a7fdfa7-5b25-0ed4-2479-661d387b397b@redhat.com> (raw)
In-Reply-To: <20230116123403.fiyv22esqgh7bzp3@box.shutemov.name>

On 16.01.23 13:34, Kirill A. Shutemov wrote:
> On Mon, Jan 16, 2023 at 01:06:59PM +0100, Jann Horn wrote:
>> On Sun, Jan 15, 2023 at 8:07 PM Kirill A. Shutemov <kirill@shutemov.name> wrote:
>>> On Fri, Jan 13, 2023 at 08:28:59PM +0100, Jann Horn wrote:
>>>> No, that lockdep assert has to be there. Page table traversal is
>>>> allowed under any one of the mmap lock, the anon_vma lock (if the VMA
>>>> is associated with an anon_vma), and the mapping lock (if the VMA is
>>>> associated with a mapping); and so to be able to remove page tables,
>>>> we must hold all three of them.
>>>
>>> Okay, that's fair. I agree with the patch now. Maybe adjust the commit
>>> message a bit?
>>
>> Just to make sure we're on the same page: Are you suggesting that I
>> add this text?
>> "Page table traversal is allowed under any one of the mmap lock, the
>> anon_vma lock (if the VMA is associated with an anon_vma), and the
>> mapping lock (if the VMA is associated with a mapping); and so to be
>> able to remove page tables, we must hold all three of them."
>> Or something else?
> 
> Looks good to me.
> 
>>> Anyway:
>>>
>>> Acked-by: Kirill A. Shutemov <kirill.shutemov@intel.linux.com>
>>
>> Thanks!
>>
>>> BTW, I've noticied that you recently added tlb_remove_table_sync_one().
>>> I'm not sure why it is needed. Why IPI in pmdp_collapse_flush() in not
>>> good enough to serialize against GUP fast?
>>
>> If that sent an IPI, it would be good enough; but
>> pmdp_collapse_flush() is not guaranteed to send an IPI.
>> It does a TLB flush, but on some architectures (including arm64 and
>> also virtualized x86), a remote TLB flush can be done without an IPI.
>> For example, arm64 has some fancy hardware support for remote TLB
>> invalidation without IPIs ("broadcast TLB invalidation"), and
>> virtualized x86 has (depending on the hypervisor) things like TLB
>> shootdown hypercalls (under Hyper-V, see hyperv_flush_tlb_multi) or
>> TLB shootdown signalling for preempted CPUs through shared memory
>> (under KVM, see kvm_flush_tlb_multi).
> 
> I think such architectures must provide proper pmdp_collapse_flush()
> with the required serialization. Power and S390 already do that.
> 

The plan is to eventually move away from (ab)using IPI to synchronize with
GUP-fast. Moving further into that direction a is wrong.

The flush was added as a quick fix for all architectures by Jann, until
we can do better.

Even for ppc64, see:

commit bedf03416913d88c796288f9dca109a53608c745
Author: Yang Shi <shy828301@gmail.com>
Date:   Wed Sep 7 11:01:44 2022 -0700

     powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush
     
     The IPI broadcast is used to serialize against fast-GUP, but fast-GUP will
     move to use RCU instead of disabling local interrupts in fast-GUP.  Using
     an IPI is the old-styled way of serializing against fast-GUP although it
     still works as expected now.
     
     And fast-GUP now fixed the potential race with THP collapse by checking
     whether PMD is changed or not.  So IPI broadcast in radix pmd collapse
     flush is not necessary anymore.  But it is still needed for hash TLB.


-- 
Thanks,

David / dhildenb



  parent reply	other threads:[~2023-01-16 13:07 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-11 13:33 [PATCH] mm/khugepaged: Fix ->anon_vma race Jann Horn
2023-01-12  1:06 ` Yang Shi
2023-01-13 19:36   ` Jann Horn
2023-01-12  8:56 ` Kirill A. Shutemov
2023-01-12 18:12   ` Yang Shi
2023-01-13  0:10     ` Kirill A. Shutemov
2023-01-13  3:22       ` Yang Shi
2023-01-13 19:28   ` Jann Horn
2023-01-15 19:06     ` Kirill A. Shutemov
2023-01-16 12:06       ` Jann Horn
2023-01-16 12:34         ` Kirill A. Shutemov
2023-01-16 12:54           ` Jann Horn
2023-01-16 13:07           ` David Hildenbrand [this message]
2023-01-16 13:47             ` Kirill A. Shutemov
2023-01-23 11:07               ` David Hildenbrand
2023-01-24  0:51                 ` Kirill A. Shutemov
2023-01-24 10:19                   ` David Hildenbrand
2023-01-17 18:57       ` Yang Shi
2023-01-17 19:12 ` Jann Horn
2023-01-17 22:55   ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5a7fdfa7-5b25-0ed4-2479-661d387b397b@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=jannh@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shy828301@gmail.com \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).