From: "Koenig, Christian" <Christian.Koenig@amd.com>
To: "Yang, Philip" <Philip.Yang@amd.com>, Jason Gunthorpe <jgg@mellanox.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Ralph Campbell <rcampbell@nvidia.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	John Hubbard <jhubbard@nvidia.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Jerome Glisse <jglisse@redhat.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	Ben Skeggs <bskeggs@redhat.com>
Subject: Re: [PATCH hmm 00/15] Consolidate the mmu notifier interval_tree and locking
Date: Thu, 17 Oct 2019 16:47:20 +0000
Message-ID: <d6bcbd2a-2519-8945-eaf5-4f4e738c7fa9@amd.com>
In-Reply-To: <2046e0b4-ba05-0683-5804-e9bbf903658d@amd.com>

Sending once more as text.

On 17.10.19 at 18:26, Yang, Philip wrote:
> On 2019-10-17 4:54 a.m., Christian König wrote:
>> On 16.10.19 at 18:04, Jason Gunthorpe wrote:
>>> On Wed, Oct 16, 2019 at 10:58:02AM +0200, Christian König wrote:
>>>> On 15.10.19 at 20:12, Jason Gunthorpe wrote:
>>>>> From: Jason Gunthorpe <jgg@mellanox.com>
>>>>>
>>>>> Eight of the drivers that use mmu_notifier (i915_gem, radeon_mn,
>>>>> umem_odp, hfi1, scif_dma, vhost, gntdev, hmm) follow a common pattern:
>>>>> they only use invalidate_range_start/end and immediately check the
>>>>> invalidating range against some driver data structure to tell if the
>>>>> driver is interested. Half of them use an interval_tree, the others
>>>>> use simple linearly searched lists.
>>>>>
>>>>> Of the ones I checked, most seem to have various kinds of races,
>>>>> bugs and poor implementation. This is a result of the complexity in how
>>>>> the notifier interacts with get_user_pages(). It is extremely difficult
>>>>> to use it correctly.
>>>>>
>>>>> Consolidate all of this code together into the core mmu_notifier and
>>>>> provide a locking scheme similar to hmm_mirror that allows the user to
>>>>> safely use get_user_pages() and reliably know if the page list still
>>>>> matches the mm.
>>>> That sounds really good, but could you outline for a moment how that is
>>>> achieved?
>>> It uses the same basic scheme as hmm and rdma odp, outlined in the
>>> revisions to hmm.rst later on.
>>>
>>> Basically,
>>>
>>> again:
>>>    seq = mmu_range_read_begin(&mrn);
>>>
>>>    // This is a speculative region
>>>    .. get_user_pages()/hmm_range_fault() ..
>> How do we enforce that this get_user_pages()/hmm_range_fault() doesn't
>> see outdated page table information?
>>
>> In other words, how is the following race prevented:
>>
>> CPU A                        CPU B
>> invalidate_range_start()
>>                              mmu_range_read_begin()
>>                              get_user_pages()/hmm_range_fault()
>> Updating the ptes
>> invalidate_range_end()
>>
>>
>> I mean get_user_pages() tries to circumvent this issue by grabbing a
>> reference to the pages in question, but that isn't sufficient for the
>> SVM use case.
>>
>> That's the reason why we had this horrible solution with an r/w lock and
>> a linked list of BOs in an interval tree.
>>
>> Regards,
>> Christian.
> get_user_pages()/hmm_range_fault() and invalidate_range_start() are both
> called while holding mm->mmap_sem, so they are always serialized.

Not even remotely.

For calling get_user_pages()/hmm_range_fault() you only need to hold the 
mmap_sem in read mode.

And IIRC invalidate_range_start() is sometimes called without holding 
the mmap_sem at all.

So again how are they serialized?

Regards,
Christian.

>
> Philip
>>>    // Result cannot be dereferenced
>>>
>>>    take_lock(driver->update);
>>>    if (mmu_range_read_retry(&mrn, seq)) {
>>>       // collision! The results are not correct
>>>       unlock(driver->update);
>>>       goto again;
>>>    }
>>>
>>>    // no collision, and now under lock. Now we can de-reference the pages/etc
>>>    // program HW
>>>    // Now the invalidate callback is responsible to synchronize against changes
>>>    unlock(driver->update);
>>>
>>> Basically, anything that was using hmm_mirror correctly transitions
>>> over fairly trivially, just with the modification to store a sequence
>>> number to close that race described in the hmm commit.
>>>
>>> For something like AMD gpu I expect it to transition to use dma_fence
>>> from the notifier for coherency right before it unlocks driver->update.
>>>
>>> Jason
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
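
For readers following the thread: the begin/retry scheme quoted above can be
replayed against the CPU A / CPU B interleaving from the question. What
follows is only a single-threaded toy sketch in C under stated assumptions.
Every toy_* name is hypothetical and none of them is a kernel API. The toy
uses a seqcount-style counter that is odd while an invalidation is in flight,
which is one common way to build such a check and not necessarily how the
patch series does it (a real implementation might, for example, sleep in the
begin step instead of handing back an odd value).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_mm {
	unsigned long seq;	/* odd while an invalidation is in flight */
	int cpu_page;		/* page the "CPU page table" points at */
};

struct toy_dev {
	int dev_page;		/* page programmed into the "device", -1 if none */
};

static void toy_invalidate_start(struct toy_mm *mm, struct toy_dev *dev)
{
	mm->seq++;		/* now odd: lookups must not be trusted */
	dev->dev_page = -1;	/* driver callback drops the device mapping */
}

static void toy_invalidate_end(struct toy_mm *mm)
{
	mm->seq++;		/* even again, but different from before */
}

static unsigned long toy_read_begin(const struct toy_mm *mm)
{
	return mm->seq;
}

static bool toy_read_retry(const struct toy_mm *mm, unsigned long seq)
{
	/* Collision if we started mid-invalidation or one ran since. */
	return (seq & 1) || mm->seq != seq;
}

/*
 * Replays the CPU A / CPU B interleaving from the mail on one thread.
 * Returns the number of attempts CPU B needed; *dev_page_out receives
 * what ends up programmed into the device.
 */
static int toy_replay_race(int *dev_page_out)
{
	struct toy_mm mm = { .seq = 0, .cpu_page = 1 };
	struct toy_dev dev = { .dev_page = -1 };
	int attempts = 0;

	/* CPU A: invalidate_range_start() */
	toy_invalidate_start(&mm, &dev);

	for (;;) {
		unsigned long seq;
		int looked_up;

		attempts++;
		/* CPU B: speculative region, no driver lock held */
		seq = toy_read_begin(&mm);
		looked_up = mm.cpu_page;	/* stands in for get_user_pages() */

		if (attempts == 1) {
			/* CPU A: update the ptes, then invalidate_range_end() */
			mm.cpu_page = 2;
			toy_invalidate_end(&mm);
		}

		/* CPU B: take_lock(driver->update) would go here */
		if (toy_read_retry(&mm, seq))
			continue;	/* collision: throw the result away */

		dev.dev_page = looked_up;	/* safe to program the device */
		break;
	}

	*dev_page_out = dev.dev_page;
	return attempts;
}
```

Calling toy_replay_race() shows the first lookup being discarded because it
began while the invalidation was still in flight; only the second attempt,
which sees the updated page, is programmed into the toy device. The toy says
nothing about which locks the real callers hold, which is the open question
in this mail.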


