* [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review
@ 2019-06-14 0:44 Jason Gunthorpe
  2019-06-14 0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
  ` (11 more replies)
  0 siblings, 12 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

This patch series arose out of discussions with Jerome when looking at the
ODP changes, particularly informed by use-after-free races we have already
found and fixed in the ODP code (thanks to syzkaller) working with mmu
notifiers, and the discussion with Ralph on how to resolve the lifetime
model.

Overall this brings in a simplified locking scheme and an easy-to-explain
lifetime model:

 If a hmm_range is valid, then the hmm is valid; if a hmm is valid, then
 the mm memory is still allocated.

 If the mm needs to still be alive (ie to lock the mmap_sem, find a vma,
 etc) then the mmget must be obtained via mmget_not_zero().

The unlocked reads of 'hmm->dead' are also eliminated in favour of using
standard mmget() locking to prevent the mm from being released. Many of
the debugging checks of !range->hmm and !hmm->mm are dropped in favour of
poison - which is much clearer as to the lifetime intent.

The trailing patches are just some random cleanups I noticed when
reviewing this code.

I would like to run some testing with the ODP patch, but haven't yet.
Otherwise I think this is reviewed enough, and if there is nothing more to
say I hope to apply it next week. I plan to continue to work on the idea
with CH to move more of this mirror code into mmu notifiers and other
places, but this will take some time and research.

Thanks to everyone who took time to look at this!
Jason Gunthorpe (12):
  mm/hmm: fix use after free with struct hmm in the mmu notifiers
  mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
  mm/hmm: Hold a mmgrab from hmm to mm
  mm/hmm: Simplify hmm_get_or_create and make it reliable
  mm/hmm: Remove duplicate condition test before wait_event_timeout
  mm/hmm: Hold on to the mmget for the lifetime of the range
  mm/hmm: Use lockdep instead of comments
  mm/hmm: Remove racy protection against double-unregistration
  mm/hmm: Poison hmm_range during unregister
  mm/hmm: Do not use list*_rcu() for hmm->ranges
  mm/hmm: Remove confusing comment and logic from hmm_release
  mm/hmm: Fix error flows in hmm_invalidate_range_start

 drivers/gpu/drm/nouveau/nouveau_svm.c |   2 +-
 include/linux/hmm.h                   |  52 +----
 kernel/fork.c                         |   1 -
 mm/hmm.c                              | 286 ++++++++++++--------------
 4 files changed, 140 insertions(+), 201 deletions(-)

-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 45+ messages in thread
* [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers
  2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
@ 2019-06-14 0:44 ` Jason Gunthorpe
  [not found]   ` <20190614004450.20252-2-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14 0:44 ` [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
  ` (10 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

mmu_notifier_unregister_no_release() is not a fence and the mmu_notifier
system will continue to reference hmm->mn until the srcu grace period
expires, resulting in use-after-free races like this:

         CPU0                                     CPU1
                                               __mmu_notifier_invalidate_range_start()
                                                 srcu_read_lock
                                                 hlist_for_each ()
                                                   // mn == hmm->mn
hmm_mirror_unregister()
  hmm_put()
    hmm_free()
      mmu_notifier_unregister_no_release()
         hlist_del_init_rcu(hmm->mn->list)
                                                   mn->ops->invalidate_range_start(mn, range);
                                                     mm_get_hmm()
      mm->hmm = NULL;
      kfree(hmm)
                                                     mutex_lock(&hmm->lock);

Use SRCU to kfree the hmm memory so that the notifiers can rely on hmm
existing. Get the now-safe hmm struct through container_of and directly
check kref_get_unless_zero to lock it against free.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2: - Spell 'free' properly (Jerome/Ralph) v3: - Have only one clearer comment about kref_get_unless_zero (John) --- include/linux/hmm.h | 1 + mm/hmm.c | 23 +++++++++++++++++------ 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 7007123842ba76..cb01cf1fa3c08b 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -93,6 +93,7 @@ struct hmm { struct mmu_notifier mmu_notifier; struct rw_semaphore mirrors_sem; wait_queue_head_t wq; + struct rcu_head rcu; long notifiers; bool dead; }; diff --git a/mm/hmm.c b/mm/hmm.c index 826816ab237799..f6956d78e3cb25 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -104,6 +104,11 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm) return NULL; } +static void hmm_free_rcu(struct rcu_head *rcu) +{ + kfree(container_of(rcu, struct hmm, rcu)); +} + static void hmm_free(struct kref *kref) { struct hmm *hmm = container_of(kref, struct hmm, kref); @@ -116,7 +121,7 @@ static void hmm_free(struct kref *kref) mm->hmm = NULL; spin_unlock(&mm->page_table_lock); - kfree(hmm); + mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu); } static inline void hmm_put(struct hmm *hmm) @@ -144,10 +149,14 @@ void hmm_mm_destroy(struct mm_struct *mm) static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) { - struct hmm *hmm = mm_get_hmm(mm); + struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier); struct hmm_mirror *mirror; struct hmm_range *range; + /* Bail out if hmm is in the process of being freed */ + if (!kref_get_unless_zero(&hmm->kref)) + return; + /* Report this HMM as dying. 
*/ hmm->dead = true; @@ -185,13 +194,14 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) static int hmm_invalidate_range_start(struct mmu_notifier *mn, const struct mmu_notifier_range *nrange) { - struct hmm *hmm = mm_get_hmm(nrange->mm); + struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier); struct hmm_mirror *mirror; struct hmm_update update; struct hmm_range *range; int ret = 0; - VM_BUG_ON(!hmm); + if (!kref_get_unless_zero(&hmm->kref)) + return 0; update.start = nrange->start; update.end = nrange->end; @@ -236,9 +246,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn, static void hmm_invalidate_range_end(struct mmu_notifier *mn, const struct mmu_notifier_range *nrange) { - struct hmm *hmm = mm_get_hmm(nrange->mm); + struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier); - VM_BUG_ON(!hmm); + if (!kref_get_unless_zero(&hmm->kref)) + return; mutex_lock(&hmm->lock); hmm->notifiers--; -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers [not found] ` <20190614004450.20252-2-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 13:56 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 13:56 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
  2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
  2019-06-14 0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
@ 2019-06-14 0:44 ` Jason Gunthorpe
  [not found]   ` <20190614004450.20252-3-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14 0:44 ` [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
  ` (9 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

Ralph observes that hmm_range_register() can only be called by a driver
while a mirror is registered. Make this clear in the API by passing in the
mirror structure as a parameter.

This also simplifies understanding the lifetime model for struct hmm, as
the hmm pointer must be valid as part of a registered mirror so all we
need in hmm_range_register() is a simple kref_get.
Suggested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2 - Include the oneline patch to nouveau_svm.c --- drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- include/linux/hmm.h | 7 ++++--- mm/hmm.c | 13 ++++--------- 3 files changed, 9 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 93ed43c413f0bb..8c92374afcf227 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -649,7 +649,7 @@ nouveau_svm_fault(struct nvif_notify *notify) range.values = nouveau_svm_pfn_values; range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT; again: - ret = hmm_vma_fault(&range, true); + ret = hmm_vma_fault(&svmm->mirror, &range, true); if (ret == 0) { mutex_lock(&svmm->mutex); if (!hmm_vma_range_done(&range)) { diff --git a/include/linux/hmm.h b/include/linux/hmm.h index cb01cf1fa3c08b..1fba6979adf460 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -496,7 +496,7 @@ static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror) * Please see Documentation/vm/hmm.rst for how to use the range API. */ int hmm_range_register(struct hmm_range *range, - struct mm_struct *mm, + struct hmm_mirror *mirror, unsigned long start, unsigned long end, unsigned page_shift); @@ -532,7 +532,8 @@ static inline bool hmm_vma_range_done(struct hmm_range *range) } /* This is a temporary helper to avoid merge conflict between trees. 
*/ -static inline int hmm_vma_fault(struct hmm_range *range, bool block) +static inline int hmm_vma_fault(struct hmm_mirror *mirror, + struct hmm_range *range, bool block) { long ret; @@ -545,7 +546,7 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block) range->default_flags = 0; range->pfn_flags_mask = -1UL; - ret = hmm_range_register(range, range->vma->vm_mm, + ret = hmm_range_register(range, mirror, range->start, range->end, PAGE_SHIFT); if (ret) diff --git a/mm/hmm.c b/mm/hmm.c index f6956d78e3cb25..22a97ada108b4e 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -914,13 +914,13 @@ static void hmm_pfns_clear(struct hmm_range *range, * Track updates to the CPU page table see include/linux/hmm.h */ int hmm_range_register(struct hmm_range *range, - struct mm_struct *mm, + struct hmm_mirror *mirror, unsigned long start, unsigned long end, unsigned page_shift) { unsigned long mask = ((1UL << page_shift) - 1UL); - struct hmm *hmm; + struct hmm *hmm = mirror->hmm; range->valid = false; range->hmm = NULL; @@ -934,20 +934,15 @@ int hmm_range_register(struct hmm_range *range, range->start = start; range->end = end; - hmm = hmm_get_or_create(mm); - if (!hmm) - return -EFAULT; - /* Check if hmm_mm_destroy() was call. */ - if (hmm->mm == NULL || hmm->dead) { - hmm_put(hmm); + if (hmm->mm == NULL || hmm->dead) return -EFAULT; - } /* Initialize range to track CPU page table updates. */ mutex_lock(&hmm->lock); range->hmm = hmm; + kref_get(&hmm->kref); list_add_rcu(&range->list, &hmm->ranges); /* -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register [not found] ` <20190614004450.20252-3-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 13:59 ` Christoph Hellwig 2019-06-18 13:05 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 13:59 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs On Thu, Jun 13, 2019 at 09:44:40PM -0300, Jason Gunthorpe wrote: > From: Jason Gunthorpe <jgg@mellanox.com> > > Ralph observes that hmm_range_register() can only be called by a driver > while a mirror is registered. Make this clear in the API by passing in the > mirror structure as a parameter. > > This also simplifies understanding the lifetime model for struct hmm, as > the hmm pointer must be valid as part of a registered mirror so all we > need in hmm_register_range() is a simple kref_get. 
Looks good, at least as an intermediate step:

Reviewed-by: Christoph Hellwig <hch@lst.de>

> index f6956d78e3cb25..22a97ada108b4e 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -914,13 +914,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
>   * Track updates to the CPU page table see include/linux/hmm.h
>   */
>  int hmm_range_register(struct hmm_range *range,
> -		       struct mm_struct *mm,
> +		       struct hmm_mirror *mirror,
>  		       unsigned long start,
>  		       unsigned long end,
>  		       unsigned page_shift)
>  {
>  	unsigned long mask = ((1UL << page_shift) - 1UL);
> -	struct hmm *hmm;
> +	struct hmm *hmm = mirror->hmm;
> 
>  	range->valid = false;
>  	range->hmm = NULL;
> @@ -934,20 +934,15 @@ int hmm_range_register(struct hmm_range *range,
>  	range->start = start;
>  	range->end = end;

But while you're at it: the calling conventions of hmm_range_register
are still rather odd, as the start, end and page_shift arguments are
only used to fill out fields in the range structure passed in. Might
be worth cleaning up as well if we change the calling convention.

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register 2019-06-15 13:59 ` Christoph Hellwig @ 2019-06-18 13:05 ` Jason Gunthorpe [not found] ` <20190618130544.GC6961-uk2M96/98Pc@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 13:05 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ira Weiny, Ben Skeggs On Sat, Jun 15, 2019 at 06:59:06AM -0700, Christoph Hellwig wrote: > On Thu, Jun 13, 2019 at 09:44:40PM -0300, Jason Gunthorpe wrote: > > From: Jason Gunthorpe <jgg@mellanox.com> > > > > Ralph observes that hmm_range_register() can only be called by a driver > > while a mirror is registered. Make this clear in the API by passing in the > > mirror structure as a parameter. > > > > This also simplifies understanding the lifetime model for struct hmm, as > > the hmm pointer must be valid as part of a registered mirror so all we > > need in hmm_register_range() is a simple kref_get. 
> Looks good, at least as an intermediate step:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> > index f6956d78e3cb25..22a97ada108b4e 100644
> > +++ b/mm/hmm.c
> > @@ -914,13 +914,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
> >   * Track updates to the CPU page table see include/linux/hmm.h
> >   */
> >  int hmm_range_register(struct hmm_range *range,
> > -		       struct mm_struct *mm,
> > +		       struct hmm_mirror *mirror,
> >  		       unsigned long start,
> >  		       unsigned long end,
> >  		       unsigned page_shift)
> >  {
> >  	unsigned long mask = ((1UL << page_shift) - 1UL);
> > -	struct hmm *hmm;
> > +	struct hmm *hmm = mirror->hmm;
> > 
> >  	range->valid = false;
> >  	range->hmm = NULL;
> > @@ -934,20 +934,15 @@ int hmm_range_register(struct hmm_range *range,
> >  	range->start = start;
> >  	range->end = end;
> 
> But while you're at it: the calling conventions of hmm_range_register
> are still rather odd, as the start, end and page_shift arguments are
> only used to fill out fields in the range structure passed in. Might
> be worth cleaning up as well if we change the calling convention.

I'm thinking to tackle that as part of the mmu notifier invlock idea.
Once the range loses the lock then we don't really need to register it at
all.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register [not found] ` <20190618130544.GC6961-uk2M96/98Pc@public.gmane.org> @ 2019-06-19 8:14 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread From: Christoph Hellwig @ 2019-06-19 8:14 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs On Tue, Jun 18, 2019 at 10:05:44AM -0300, Jason Gunthorpe wrote: > I'm thinking to tackle that as part of the mmu notififer invlock > idea.. Once the range looses the lock then we don't really need to > register it at all. Ok. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm
  2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
  2019-06-14 0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
  2019-06-14 0:44 ` [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
@ 2019-06-14 0:44 ` Jason Gunthorpe
  [not found]   ` <20190614004450.20252-4-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14 0:44 ` [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
  ` (8 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

So long as a struct hmm pointer exists, so should the struct mm it is
linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it
once the hmm refcount goes to zero.

Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL
mm->hmm delete the hmm_mm_destroy().
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2: - Fix error unwind paths in hmm_get_or_create (Jerome/Jason) --- include/linux/hmm.h | 3 --- kernel/fork.c | 1 - mm/hmm.c | 22 ++++------------------ 3 files changed, 4 insertions(+), 22 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 1fba6979adf460..1d97b6d62c5bcf 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -577,14 +577,11 @@ static inline int hmm_vma_fault(struct hmm_mirror *mirror, } /* Below are for HMM internal use only! Not to be used by device driver! */ -void hmm_mm_destroy(struct mm_struct *mm); - static inline void hmm_mm_init(struct mm_struct *mm) { mm->hmm = NULL; } #else /* IS_ENABLED(CONFIG_HMM_MIRROR) */ -static inline void hmm_mm_destroy(struct mm_struct *mm) {} static inline void hmm_mm_init(struct mm_struct *mm) {} #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */ diff --git a/kernel/fork.c b/kernel/fork.c index 75675b9bf6dfd3..c704c3cedee78d 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -673,7 +673,6 @@ void __mmdrop(struct mm_struct *mm) WARN_ON_ONCE(mm == current->active_mm); mm_free_pgd(mm); destroy_context(mm); - hmm_mm_destroy(mm); mmu_notifier_mm_destroy(mm); check_mm(mm); put_user_ns(mm->user_ns); diff --git a/mm/hmm.c b/mm/hmm.c index 22a97ada108b4e..080b17a2e87e2d 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -20,6 +20,7 @@ #include <linux/swapops.h> #include <linux/hugetlb.h> #include <linux/memremap.h> +#include <linux/sched/mm.h> #include <linux/jump_label.h> #include <linux/dma-mapping.h> #include <linux/mmu_notifier.h> @@ -73,6 +74,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm) hmm->notifiers = 0; hmm->dead = false; hmm->mm = mm; + mmgrab(hmm->mm); spin_lock(&mm->page_table_lock); if 
(!mm->hmm) @@ -100,6 +102,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm) mm->hmm = NULL; spin_unlock(&mm->page_table_lock); error: + mmdrop(hmm->mm); kfree(hmm); return NULL; } @@ -121,6 +124,7 @@ static void hmm_free(struct kref *kref) mm->hmm = NULL; spin_unlock(&mm->page_table_lock); + mmdrop(hmm->mm); mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu); } @@ -129,24 +133,6 @@ static inline void hmm_put(struct hmm *hmm) kref_put(&hmm->kref, hmm_free); } -void hmm_mm_destroy(struct mm_struct *mm) -{ - struct hmm *hmm; - - spin_lock(&mm->page_table_lock); - hmm = mm_get_hmm(mm); - mm->hmm = NULL; - if (hmm) { - hmm->mm = NULL; - hmm->dead = true; - spin_unlock(&mm->page_table_lock); - hmm_put(hmm); - return; - } - - spin_unlock(&mm->page_table_lock); -} - static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) { struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier); -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm [not found] ` <20190614004450.20252-4-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 13:59 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 13:59 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (2 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-5-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe ` (7 subsequent siblings) 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> As coded this function can false-fail in various racy situations. Make it reliable and simpler by running under the write side of the mmap_sem and avoiding the false-failing compare/exchange pattern. Due to the mmap_sem this no longer has to avoid racing with a 2nd parallel hmm_get_or_create(). Unfortunately this still has to use the page_table_lock as the non-sleeping lock protecting mm->hmm, since the contexts where we free the hmm are incompatible with mmap_sem. 
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2: - Fix error unwind of mmgrab (Jerome) - Use hmm local instead of 2nd container_of (Jerome) v3: - Can't use mmap_sem in the SRCU callback, keep using the page_table_lock (Philip) --- mm/hmm.c | 84 ++++++++++++++++++++++++-------------------------------- 1 file changed, 36 insertions(+), 48 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index 080b17a2e87e2d..4c64d4c32f4825 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -31,16 +31,6 @@ #if IS_ENABLED(CONFIG_HMM_MIRROR) static const struct mmu_notifier_ops hmm_mmu_notifier_ops; -static inline struct hmm *mm_get_hmm(struct mm_struct *mm) -{ - struct hmm *hmm = READ_ONCE(mm->hmm); - - if (hmm && kref_get_unless_zero(&hmm->kref)) - return hmm; - - return NULL; -} - /** * hmm_get_or_create - register HMM against an mm (HMM internal) * @@ -55,11 +45,19 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm) */ static struct hmm *hmm_get_or_create(struct mm_struct *mm) { - struct hmm *hmm = mm_get_hmm(mm); - bool cleanup = false; + struct hmm *hmm; - if (hmm) - return hmm; + lockdep_assert_held_exclusive(&mm->mmap_sem); + + /* Abuse the page_table_lock to also protect mm->hmm. 
*/ + spin_lock(&mm->page_table_lock); + if (mm->hmm) { + if (kref_get_unless_zero(&mm->hmm->kref)) { + spin_unlock(&mm->page_table_lock); + return mm->hmm; + } + } + spin_unlock(&mm->page_table_lock); hmm = kmalloc(sizeof(*hmm), GFP_KERNEL); if (!hmm) @@ -74,57 +72,47 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm) hmm->notifiers = 0; hmm->dead = false; hmm->mm = mm; - mmgrab(hmm->mm); - spin_lock(&mm->page_table_lock); - if (!mm->hmm) - mm->hmm = hmm; - else - cleanup = true; - spin_unlock(&mm->page_table_lock); + hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops; + if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) { + kfree(hmm); + return NULL; + } - if (cleanup) - goto error; + mmgrab(hmm->mm); /* - * We should only get here if hold the mmap_sem in write mode ie on - * registration of first mirror through hmm_mirror_register() + * We hold the exclusive mmap_sem here so we know that mm->hmm is + * still NULL or 0 kref, and is safe to update. */ - hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops; - if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) - goto error_mm; - - return hmm; - -error_mm: spin_lock(&mm->page_table_lock); - if (mm->hmm == hmm) - mm->hmm = NULL; + mm->hmm = hmm; spin_unlock(&mm->page_table_lock); -error: - mmdrop(hmm->mm); - kfree(hmm); - return NULL; + return hmm; } static void hmm_free_rcu(struct rcu_head *rcu) { - kfree(container_of(rcu, struct hmm, rcu)); + struct hmm *hmm = container_of(rcu, struct hmm, rcu); + + /* + * The mm->hmm pointer is kept valid while notifier ops can be running + * so they don't have to deal with a NULL mm->hmm value + */ + spin_lock(&hmm->mm->page_table_lock); + if (hmm->mm->hmm == hmm) + hmm->mm->hmm = NULL; + spin_unlock(&hmm->mm->page_table_lock); + mmdrop(hmm->mm); + + kfree(hmm); } static void hmm_free(struct kref *kref) { struct hmm *hmm = container_of(kref, struct hmm, kref); - struct mm_struct *mm = hmm->mm; - - mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm); - 
spin_lock(&mm->page_table_lock); - if (mm->hmm == hmm) - mm->hmm = NULL; - spin_unlock(&mm->page_table_lock); - - mmdrop(hmm->mm); + mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm); mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu); } -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable
  [not found]   ` <20190614004450.20252-5-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:12 ` Christoph Hellwig
  2019-06-18 0:36  ` Jason Gunthorpe
  2019-06-18 18:55 ` Jason Gunthorpe
  0 siblings, 2 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

> +	spin_lock(&mm->page_table_lock);
> +	if (mm->hmm) {
> +		if (kref_get_unless_zero(&mm->hmm->kref)) {
> +			spin_unlock(&mm->page_table_lock);
> +			return mm->hmm;
> +		}
> +	}
> +	spin_unlock(&mm->page_table_lock);

This could become:

	spin_lock(&mm->page_table_lock);
	hmm = mm->hmm;
	if (hmm && kref_get_unless_zero(&hmm->kref))
		goto out_unlock;
	spin_unlock(&mm->page_table_lock);

as the last two lines of the function already drop the page_table_lock
and then return hmm. Or drop the "hmm = mm->hmm" assignment above and
return mm->hmm as that should be always identical to hmm at the end
to save another line.

> +	/*
> +	 * The mm->hmm pointer is kept valid while notifier ops can be running
> +	 * so they don't have to deal with a NULL mm->hmm value
> +	 */

The comment confuses me. How does the page_table_lock relate to
possibly running notifiers, as I can't find that we take
page_table_lock? Or is it just about the fact that we only clear
mm->hmm in the free callback, and not in hmm_free?

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable 2019-06-15 14:12 ` Christoph Hellwig @ 2019-06-18 0:36 ` Jason Gunthorpe 2019-06-18 18:55 ` Jason Gunthorpe 1 sibling, 0 replies; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 0:36 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ira Weiny, Ben Skeggs On Sat, Jun 15, 2019 at 07:12:11AM -0700, Christoph Hellwig wrote: > > + spin_lock(&mm->page_table_lock); > > + if (mm->hmm) { > > + if (kref_get_unless_zero(&mm->hmm->kref)) { > > + spin_unlock(&mm->page_table_lock); > > + return mm->hmm; > > + } > > + } > > + spin_unlock(&mm->page_table_lock); > > This could become: > > spin_lock(&mm->page_table_lock); > hmm = mm->hmm > if (hmm && kref_get_unless_zero(&hmm->kref)) > goto out_unlock; > spin_unlock(&mm->page_table_lock); > > as the last two lines of the function already drop the page_table_lock > and then return hmm. Or drop the "hmm = mm->hmm" asignment above and > return mm->hmm as that should be always identical to hmm at the end > to save another line. Yeah, I can fuss it some more. > > + /* > > + * The mm->hmm pointer is kept valid while notifier ops can be running > > + * so they don't have to deal with a NULL mm->hmm value > > + */ > > The comment confuses me. How does the page_table_lock relate to > possibly running notifiers, as I can't find that we take > page_table_lock? Or is it just about the fact that we only clear > mm->hmm in the free callback, and not in hmm_free? It was late when I wrote this fixup, the comment is faulty, and there is no reason to delay this until the SRCU cleanup at this point in the series. The ops all get their struct hmm from container_of, the only thing that refers to mm->hmm is hmm_get_or_create(). 
I'll revise it tomorrow: the comment will go away, and the =NULL will move to the release callback. Jason _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable 2019-06-15 14:12 ` Christoph Hellwig 2019-06-18 0:36 ` Jason Gunthorpe @ 2019-06-18 18:55 ` Jason Gunthorpe 1 sibling, 0 replies; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 18:55 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ira Weiny, Ben Skeggs On Sat, Jun 15, 2019 at 07:12:11AM -0700, Christoph Hellwig wrote: > > + spin_lock(&mm->page_table_lock); > > + if (mm->hmm) { > > + if (kref_get_unless_zero(&mm->hmm->kref)) { > > + spin_unlock(&mm->page_table_lock); > > + return mm->hmm; > > + } > > + } > > + spin_unlock(&mm->page_table_lock); > > This could become: > > spin_lock(&mm->page_table_lock); > hmm = mm->hmm > if (hmm && kref_get_unless_zero(&hmm->kref)) > goto out_unlock; > spin_unlock(&mm->page_table_lock); > > as the last two lines of the function already drop the page_table_lock > and then return hmm. Or drop the "hmm = mm->hmm" asignment above and > return mm->hmm as that should be always identical to hmm at the end > to save another line. > > > + /* > > + * The mm->hmm pointer is kept valid while notifier ops can be running > > + * so they don't have to deal with a NULL mm->hmm value > > + */ > > The comment confuses me. How does the page_table_lock relate to > possibly running notifiers, as I can't find that we take > page_table_lock? Or is it just about the fact that we only clear > mm->hmm in the free callback, and not in hmm_free? Revised with: From bdc02a1d502db08457823e6b2b983861a3574a76 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe <jgg@mellanox.com> Date: Thu, 23 May 2019 10:24:13 -0300 Subject: [PATCH] mm/hmm: Simplify hmm_get_or_create and make it reliable As coded this function can false-fail in various racy situations. 
Make it reliable and simpler by running under the write side of the mmap_sem and avoiding the false-failing compare/exchange pattern. Due to the mmap_sem this no longer has to avoid racing with a 2nd parallel hmm_get_or_create(). Unfortunately this still has to use the page_table_lock as the non-sleeping lock protecting mm->hmm, since the contexts where we free the hmm are incompatible with mmap_sem. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2: - Fix error unwind of mmgrab (Jerome) - Use hmm local instead of 2nd container_of (Jerome) v3: - Can't use mmap_sem in the SRCU callback, keep using the page_table_lock (Philip) v4: - Put the mm->hmm = NULL in the kref release, reduce LOC in hmm_get_or_create() (Christoph) --- mm/hmm.c | 77 ++++++++++++++++++++++---------------------------------- 1 file changed, 30 insertions(+), 47 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index 080b17a2e87e2d..0423f4ca3a7e09 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -31,16 +31,6 @@ #if IS_ENABLED(CONFIG_HMM_MIRROR) static const struct mmu_notifier_ops hmm_mmu_notifier_ops; -static inline struct hmm *mm_get_hmm(struct mm_struct *mm) -{ - struct hmm *hmm = READ_ONCE(mm->hmm); - - if (hmm && kref_get_unless_zero(&hmm->kref)) - return hmm; - - return NULL; -} - /** * hmm_get_or_create - register HMM against an mm (HMM internal) * @@ -55,11 +45,16 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm) */ static struct hmm *hmm_get_or_create(struct mm_struct *mm) { - struct hmm *hmm = mm_get_hmm(mm); - bool cleanup = false; + struct hmm *hmm; + + lockdep_assert_held_exclusive(&mm->mmap_sem); - if (hmm) - return hmm; + /* Abuse the page_table_lock to also protect mm->hmm. 
*/ + spin_lock(&mm->page_table_lock); + hmm = mm->hmm; + if (mm->hmm && kref_get_unless_zero(&mm->hmm->kref)) + goto out_unlock; + spin_unlock(&mm->page_table_lock); hmm = kmalloc(sizeof(*hmm), GFP_KERNEL); if (!hmm) @@ -74,57 +69,45 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm) hmm->notifiers = 0; hmm->dead = false; hmm->mm = mm; - mmgrab(hmm->mm); - spin_lock(&mm->page_table_lock); - if (!mm->hmm) - mm->hmm = hmm; - else - cleanup = true; - spin_unlock(&mm->page_table_lock); + hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops; + if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) { + kfree(hmm); + return NULL; + } - if (cleanup) - goto error; + mmgrab(hmm->mm); /* - * We should only get here if hold the mmap_sem in write mode ie on - * registration of first mirror through hmm_mirror_register() + * We hold the exclusive mmap_sem here so we know that mm->hmm is + * still NULL or 0 kref, and is safe to update. */ - hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops; - if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) - goto error_mm; - - return hmm; - -error_mm: spin_lock(&mm->page_table_lock); - if (mm->hmm == hmm) - mm->hmm = NULL; + mm->hmm = hmm; + +out_unlock: spin_unlock(&mm->page_table_lock); -error: - mmdrop(hmm->mm); - kfree(hmm); - return NULL; + return hmm; } static void hmm_free_rcu(struct rcu_head *rcu) { - kfree(container_of(rcu, struct hmm, rcu)); + struct hmm *hmm = container_of(rcu, struct hmm, rcu); + + mmdrop(hmm->mm); + kfree(hmm); } static void hmm_free(struct kref *kref) { struct hmm *hmm = container_of(kref, struct hmm, kref); - struct mm_struct *mm = hmm->mm; - mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm); + spin_lock(&hmm->mm->page_table_lock); + if (hmm->mm->hmm == hmm) + hmm->mm->hmm = NULL; + spin_unlock(&hmm->mm->page_table_lock); - spin_lock(&mm->page_table_lock); - if (mm->hmm == hmm) - mm->hmm = NULL; - spin_unlock(&mm->page_table_lock); - - mmdrop(hmm->mm); + 
mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm); mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu); } -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (3 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-6-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe ` (6 subsequent siblings) 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> The wait_event_timeout macro already tests the condition as its first action, so there is no reason to open code another version of this, all that does is skip the might_sleep() debugging in common cases, which is not helpful. Further, based on prior patches, we can now simplify the required condition test: - If range is valid memory then so is range->hmm - If hmm_release() has run then range->valid is set to false at the same time as dead, so no reason to check both. - A valid hmm has a valid hmm->mm. Allowing the return value of wait_event_timeout() (along with its internal barriers) to compute the result of the function. 
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v3 - Simplify the wait_event_timeout to not check valid --- include/linux/hmm.h | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 1d97b6d62c5bcf..26e7c477490c4e 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -209,17 +209,8 @@ static inline unsigned long hmm_range_page_size(const struct hmm_range *range) static inline bool hmm_range_wait_until_valid(struct hmm_range *range, unsigned long timeout) { - /* Check if mm is dead ? */ - if (range->hmm == NULL || range->hmm->dead || range->hmm->mm == NULL) { - range->valid = false; - return false; - } - if (range->valid) - return true; - wait_event_timeout(range->hmm->wq, range->valid || range->hmm->dead, - msecs_to_jiffies(timeout)); - /* Return current valid status just in case we get lucky */ - return range->valid; + return wait_event_timeout(range->hmm->wq, range->valid, + msecs_to_jiffies(timeout)) != 0; } /* -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
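The contract the patch above relies on — wait_event_timeout() tests the condition before sleeping, and its return value is 0 only on timeout with the condition still false — can be illustrated with a userspace model (a sketch only; the real macro also issues the barriers mentioned in the commit message and sleeps on a waitqueue rather than polling):

```c
#include <stdbool.h>
#include <assert.h>

/* Model of wait_event_timeout(): the condition is checked first, so
 * open-coding an up-front check (as the removed code did) is redundant.
 * Returns 0 on timeout with the condition still false, otherwise a
 * nonzero remaining budget (at least 1, even if the budget ran out) --
 * which is why hmm_range_wait_until_valid() can reduce to "!= 0". */
static long wait_event_timeout_model(bool (*cond)(void *), void *arg,
                                     long timeout)
{
    if (cond(arg))                  /* tested before any sleeping */
        return timeout ? timeout : 1;
    while (timeout > 0) {
        timeout--;                  /* stand-in for one sleep tick */
        if (cond(arg))
            return timeout ? timeout : 1;
    }
    return 0;
}

static bool always_true(void *arg)  { (void)arg; return true; }
static bool always_false(void *arg) { (void)arg; return false; }
```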
[parent not found: <20190614004450.20252-6-jgg-uk2M96/98Pc@public.gmane.org>]
* Re: [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout [not found] ` <20190614004450.20252-6-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:12 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 14:12 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (4 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-7-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments Jason Gunthorpe ` (5 subsequent siblings) 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe, dri-devel, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> Range functions like hmm_range_snapshot() and hmm_range_fault() call find_vma, which requires holding the mmget() and the mmap_sem for the mm. Make this simpler for the callers by holding the mmget() inside the range for the lifetime of the range. Other functions that accept a range should only be called if the range is registered. This has the side effect of directly preventing hmm_release() from happening while a range is registered. That means hmm->dead cannot become true during the lifetime of the range, so remove dead and hmm_mirror_mm_is_alive() entirely. 
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2: - Use Jerome's idea of just holding the mmget() for the range lifetime, rework the patch to use that as as simplification to remove dead in one step --- include/linux/hmm.h | 26 -------------------------- mm/hmm.c | 28 ++++++++++------------------ 2 files changed, 10 insertions(+), 44 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 26e7c477490c4e..bf013e96525771 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -82,7 +82,6 @@ * @mirrors_sem: read/write semaphore protecting the mirrors list * @wq: wait queue for user waiting on a range invalidation * @notifiers: count of active mmu notifiers - * @dead: is the mm dead ? */ struct hmm { struct mm_struct *mm; @@ -95,7 +94,6 @@ struct hmm { wait_queue_head_t wq; struct rcu_head rcu; long notifiers; - bool dead; }; /* @@ -459,30 +457,6 @@ struct hmm_mirror { int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm); void hmm_mirror_unregister(struct hmm_mirror *mirror); -/* - * hmm_mirror_mm_is_alive() - test if mm is still alive - * @mirror: the HMM mm mirror for which we want to lock the mmap_sem - * Return: false if the mm is dead, true otherwise - * - * This is an optimization, it will not always accurately return false if the - * mm is dead; i.e., there can be false negatives (process is being killed but - * HMM is not yet informed of that). It is only intended to be used to optimize - * out cases where the driver is about to do something time consuming and it - * would be better to skip it if the mm is dead. 
- */ -static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror) -{ - struct mm_struct *mm; - - if (!mirror || !mirror->hmm) - return false; - mm = READ_ONCE(mirror->hmm->mm); - if (mirror->hmm->dead || !mm) - return false; - - return true; -} - /* * Please see Documentation/vm/hmm.rst for how to use the range API. */ diff --git a/mm/hmm.c b/mm/hmm.c index 4c64d4c32f4825..58712d74edd585 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -70,7 +70,6 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm) mutex_init(&hmm->lock); kref_init(&hmm->kref); hmm->notifiers = 0; - hmm->dead = false; hmm->mm = mm; hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops; @@ -125,20 +124,17 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) { struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier); struct hmm_mirror *mirror; - struct hmm_range *range; /* Bail out if hmm is in the process of being freed */ if (!kref_get_unless_zero(&hmm->kref)) return; - /* Report this HMM as dying. */ - hmm->dead = true; - - /* Wake-up everyone waiting on any range. */ mutex_lock(&hmm->lock); - list_for_each_entry(range, &hmm->ranges, list) - range->valid = false; - wake_up_all(&hmm->wq); + /* + * Since hmm_range_register() holds the mmget() lock hmm_release() is + * prevented as long as a range exists. + */ + WARN_ON(!list_empty(&hmm->ranges)); mutex_unlock(&hmm->lock); down_write(&hmm->mirrors_sem); @@ -908,8 +904,8 @@ int hmm_range_register(struct hmm_range *range, range->start = start; range->end = end; - /* Check if hmm_mm_destroy() was call. */ - if (hmm->mm == NULL || hmm->dead) + /* Prevent hmm_release() from running while the range is valid */ + if (!mmget_not_zero(hmm->mm)) return -EFAULT; /* Initialize range to track CPU page table updates. 
*/ @@ -952,6 +948,7 @@ void hmm_range_unregister(struct hmm_range *range) /* Drop reference taken by hmm_range_register() */ range->valid = false; + mmput(hmm->mm); hmm_put(hmm); range->hmm = NULL; } @@ -979,10 +976,7 @@ long hmm_range_snapshot(struct hmm_range *range) struct vm_area_struct *vma; struct mm_walk mm_walk; - /* Check if hmm_mm_destroy() was call. */ - if (hmm->mm == NULL || hmm->dead) - return -EFAULT; - + lockdep_assert_held(&hmm->mm->mmap_sem); do { /* If range is no longer valid force retry. */ if (!range->valid) @@ -1077,9 +1071,7 @@ long hmm_range_fault(struct hmm_range *range, bool block) struct mm_walk mm_walk; int ret; - /* Check if hmm_mm_destroy() was call. */ - if (hmm->mm == NULL || hmm->dead) - return -EFAULT; + lockdep_assert_held(&hmm->mm->mmap_sem); do { /* If range is no longer valid force retry. */ -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
[parent not found: <20190614004450.20252-7-jgg-uk2M96/98Pc@public.gmane.org>]
* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range [not found] ` <20190614004450.20252-7-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:14 ` Christoph Hellwig 2019-06-18 15:11 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 14:14 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs > mutex_lock(&hmm->lock); > - list_for_each_entry(range, &hmm->ranges, list) > - range->valid = false; > - wake_up_all(&hmm->wq); > + /* > + * Since hmm_range_register() holds the mmget() lock hmm_release() is > + * prevented as long as a range exists. > + */ > + WARN_ON(!list_empty(&hmm->ranges)); > mutex_unlock(&hmm->lock); This can just use list_empty_careful and avoid the lock entirely. Otherwise looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range 2019-06-15 14:14 ` Christoph Hellwig @ 2019-06-18 15:11 ` Jason Gunthorpe [not found] ` <20190618151100.GI6961-uk2M96/98Pc@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 15:11 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs On Sat, Jun 15, 2019 at 07:14:35AM -0700, Christoph Hellwig wrote: > > mutex_lock(&hmm->lock); > > - list_for_each_entry(range, &hmm->ranges, list) > > - range->valid = false; > > - wake_up_all(&hmm->wq); > > + /* > > + * Since hmm_range_register() holds the mmget() lock hmm_release() is > > + * prevented as long as a range exists. > > + */ > > + WARN_ON(!list_empty(&hmm->ranges)); > > mutex_unlock(&hmm->lock); > > This can just use list_empty_careful and avoid the lock entirely. Sure, it is just a debugging helper and the mmput should serialize things enough to be reliable. I had to move the RCU patch ahead of this. Thanks diff --git a/mm/hmm.c b/mm/hmm.c index a9ace28984ea42..1eddda45cefae7 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -124,13 +124,11 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) if (!kref_get_unless_zero(&hmm->kref)) return; - mutex_lock(&hmm->lock); /* * Since hmm_range_register() holds the mmget() lock hmm_release() is 
*/ - WARN_ON(!list_empty(&hmm->ranges)); - mutex_unlock(&hmm->lock); + WARN_ON(!list_empty_careful(&hmm->ranges)); down_write(&hmm->mirrors_sem); mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror, @@ -938,7 +936,7 @@ void hmm_range_unregister(struct hmm_range *range) return; mutex_lock(&hmm->lock); - list_del(&range->list); + list_del_init(&range->list); mutex_unlock(&hmm->lock); /* Drop reference taken by hmm_range_register() */ _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
[parent not found: <20190618151100.GI6961-uk2M96/98Pc@public.gmane.org>]
* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range [not found] ` <20190618151100.GI6961-uk2M96/98Pc@public.gmane.org> @ 2019-06-19 8:18 ` Christoph Hellwig 2019-06-19 11:34 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-19 8:18 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs > mutex_lock(&hmm->lock); > - list_del(&range->list); > + list_del_init(&range->list); > mutex_unlock(&hmm->lock); I don't see the point why this is a list_del_init - that just reinitializes range->list, but doesn't change anything for the list head it was removed from. (and if the list_del_init was intended, a later patch in your branch reverts it to plain list_del..) _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range 2019-06-19 8:18 ` Christoph Hellwig @ 2019-06-19 11:34 ` Jason Gunthorpe [not found] ` <20190619113452.GB9360-uk2M96/98Pc@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-19 11:34 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs On Wed, Jun 19, 2019 at 01:18:58AM -0700, Christoph Hellwig wrote: > > mutex_lock(&hmm->lock); > > - list_del(&range->list); > > + list_del_init(&range->list); > > mutex_unlock(&hmm->lock); > > I don't see the point why this is a list_del_init - that just > reinitializes range->list, but doesn't change anything for the list > head it was removed from. (and if the list_del_init was intended, > a later patch in your branch reverts it to plain list_del..) Just following the instructions: /** * list_empty_careful - tests whether a list is empty and not being modified * @head: the list to test * * Description: * tests whether a list is empty _and_ checks that no other CPU might be * in the process of modifying either member (next or prev) * * NOTE: using list_empty_careful() without synchronization * can only be safe if the only activity that can happen * to the list entry is list_del_init(). Eg. it cannot be used * if another CPU could re-list_add() it. */ Agree it doesn't seem obvious why this is relevant when checking the list head.. Maybe the comment is a bit misleading? Jason _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 45+ messages in thread
[parent not found: <20190619113452.GB9360-uk2M96/98Pc@public.gmane.org>]
* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range [not found] ` <20190619113452.GB9360-uk2M96/98Pc@public.gmane.org> @ 2019-06-19 11:54 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread From: Christoph Hellwig @ 2019-06-19 11:54 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Wed, Jun 19, 2019 at 08:34:52AM -0300, Jason Gunthorpe wrote: > /** > * list_empty_careful - tests whether a list is empty and not being modified > * @head: the list to test > * > * Description: > * tests whether a list is empty _and_ checks that no other CPU might be > * in the process of modifying either member (next or prev) > * > * NOTE: using list_empty_careful() without synchronization > * can only be safe if the only activity that can happen > * to the list entry is list_del_init(). Eg. it cannot be used > * if another CPU could re-list_add() it. > */ > > Agree it doesn't seem obvious why this is relevant when checking the > list head.. > > Maybe the comment is a bit misleading? From looking at the commit log in the history tree list_empty_careful was initially added by Linus, and then mingo added that comment later. I don't see how list_del_init would change anything here, so I suspect list_del_init was just used as a short hand for list_del or list_del_init. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
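For reference, the semantics this exchange turns on can be modeled outside the kernel (the `_model` helpers below are hypothetical names mirroring include/linux/list.h): list_del_init() leaves the removed entry pointing back at itself, and list_empty_careful() only reports empty when both link pointers of the head agree, so a half-updated head is never mistaken for an empty one:

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h->prev = h; }

static void list_add_model(struct list_head *item, struct list_head *head)
{
    item->next = head->next;
    item->prev = head;
    head->next->prev = item;
    head->next = item;
}

/* list_del_init(): unlink the entry and re-point it at itself, so the
 * entry (not the head it came from) later reads as an empty list. */
static void list_del_init_model(struct list_head *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    init_list_head(item);
}

/* list_empty_careful(): empty only if next and prev both point at the
 * head, i.e. no concurrent writer has updated just one of the two. */
static int list_empty_careful_model(const struct list_head *head)
{
    struct list_head *next = head->next;
    return (next == head) && (next == head->prev);
}
```

As the thread notes, the list_del_init()-only caveat in the kernel comment concerns the entry being deleted, which is why it reads oddly when the check is on the list head.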
* [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (5 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-8-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe ` (4 subsequent siblings) 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, Souptick Joarder, linux-mm, Jason Gunthorpe, dri-devel, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> So we can check locking at runtime. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2 - Fix missing & in lockdeps (Jason) --- mm/hmm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index 58712d74edd585..c0f622f86223c2 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -253,11 +253,11 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops = { * * To start mirroring a process address space, the device driver must register * an HMM mirror struct. - * - * THE mm->mmap_sem MUST BE HELD IN WRITE MODE ! 
*/ int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm) { + lockdep_assert_held_exclusive(&mm->mmap_sem); + /* Sanity check */ if (!mm || !mirror || !mirror->ops) return -EINVAL; -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
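The idea of replacing a "THE mm->mmap_sem MUST BE HELD" comment with a runtime check can be sketched in userspace with a toy ownership tracker (hypothetical code; real lockdep is far more general, tracking lock classes and acquisition ordering rather than a single owner field):

```c
#include <pthread.h>
#include <stdbool.h>
#include <assert.h>

/* Toy stand-in for lockdep_assert_held_exclusive(): record which thread
 * write-holds the "semaphore" so callees can verify the locking contract
 * at runtime instead of trusting a comment. */
struct tracked_rwsem {
    bool write_held;
    pthread_t writer;
};

static void down_write_model(struct tracked_rwsem *sem)
{
    /* a real implementation would block here until the lock is free */
    sem->writer = pthread_self();
    sem->write_held = true;
}

static void up_write_model(struct tracked_rwsem *sem)
{
    sem->write_held = false;
}

/* true iff the calling thread currently write-holds sem */
static bool held_exclusive_model(const struct tracked_rwsem *sem)
{
    return sem->write_held && pthread_equal(sem->writer, pthread_self());
}
```

The kernel version additionally compiles away when lockdep is disabled, so the documentation value costs nothing in production builds.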
[parent not found: <20190614004450.20252-8-jgg-uk2M96/98Pc@public.gmane.org>]
* Re: [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments [not found] ` <20190614004450.20252-8-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:14 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 14:14 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Souptick Joarder, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (6 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-9-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe ` (3 subsequent siblings) 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe, dri-devel, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> No other register/unregister kernel API attempts to provide this kind of protection as it is inherently racy, so just drop it. Callers should provide their own protection, it appears nouveau already does, but just in case drop a debugging POISON. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- mm/hmm.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index c0f622f86223c2..e3e0a811a3a774 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -283,18 +283,13 @@ EXPORT_SYMBOL(hmm_mirror_register); */ void hmm_mirror_unregister(struct hmm_mirror *mirror) { - struct hmm *hmm = READ_ONCE(mirror->hmm); - - if (hmm == NULL) - return; + struct hmm *hmm = mirror->hmm; down_write(&hmm->mirrors_sem); list_del_init(&mirror->list); - /* To protect us against double unregister ... 
*/ - mirror->hmm = NULL; up_write(&hmm->mirrors_sem); - hmm_put(hmm); + memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm)); } EXPORT_SYMBOL(hmm_mirror_unregister); -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration [not found] ` <20190614004450.20252-9-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:16 ` Christoph Hellwig 2019-06-18 13:13 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 14:16 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Thu, Jun 13, 2019 at 09:44:46PM -0300, Jason Gunthorpe wrote: > From: Jason Gunthorpe <jgg@mellanox.com> > > No other register/unregister kernel API attempts to provide this kind of > protection as it is inherently racy, so just drop it. > > Callers should provide their own protection, it appears nouveau already > does, but just in case drop a debugging POISON. I don't even think we even need to bother with the POISON, normal list debugging will already catch a double unregistration anyway. Otherwise looks fine: Reviewed-by: Christoph Hellwig <hch@lst.de> _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration 2019-06-15 14:16 ` Christoph Hellwig @ 2019-06-18 13:13 ` Jason Gunthorpe [not found] ` <20190618131324.GF6961-uk2M96/98Pc@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 13:13 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs On Sat, Jun 15, 2019 at 07:16:12AM -0700, Christoph Hellwig wrote: > On Thu, Jun 13, 2019 at 09:44:46PM -0300, Jason Gunthorpe wrote: > > From: Jason Gunthorpe <jgg@mellanox.com> > > > > No other register/unregister kernel API attempts to provide this kind of > > protection as it is inherently racy, so just drop it. > > > > Callers should provide their own protection, it appears nouveau already > > does, but just in case drop a debugging POISON. > > I don't even think we even need to bother with the POISON, normal list > debugging will already catch a double unregistration anyway. mirror->hmm isn't a list so list debugging won't help. My concern when I wrote this was that one of the in flight patches I can't see might be depending on this double-unregister-is-safe behavior, so I wanted them to crash reliably. It is a really overly conservative thing to do.. Thanks, Jason _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration [not found] ` <20190618131324.GF6961-uk2M96/98Pc@public.gmane.org> @ 2019-06-18 13:27 ` Christoph Hellwig 2019-06-18 18:57 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-18 13:27 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Tue, Jun 18, 2019 at 10:13:24AM -0300, Jason Gunthorpe wrote: > > I don't even think we even need to bother with the POISON, normal list > > debugging will already catch a double unregistration anyway. > > mirror->hmm isn't a list so list debugging won't help. > > My concern when I wrote this was that one of the in flight patches I > can't see might be depending on this double-unregister-is-safe > behavior, so I wanted them to crash reliably. > > It is a really overly conservative thing to do.. mirror->list is a list, and if we do a list_del on it during the second unregistration it will trip up on the list poisoning. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration 2019-06-18 13:27 ` Christoph Hellwig @ 2019-06-18 18:57 ` Jason Gunthorpe [not found] ` <20190618185757.GP6961-uk2M96/98Pc@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 18:57 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs On Tue, Jun 18, 2019 at 06:27:22AM -0700, Christoph Hellwig wrote: > On Tue, Jun 18, 2019 at 10:13:24AM -0300, Jason Gunthorpe wrote: > > > I don't even think we even need to bother with the POISON, normal list > > > debugging will already catch a double unregistration anyway. > > > > mirror->hmm isn't a list so list debugging won't help. > > > > My concern when I wrote this was that one of the in flight patches I > > can't see might be depending on this double-unregister-is-safe > > behavior, so I wanted them to crash reliably. > > > > It is a really overly conservative thing to do.. > > mirror->list is a list, and if we do a list_del on it during the > second unregistration it will trip up on the list poisoning. With the previous loose coupling of the mirror and the range some code might race to try to create a range without a mirror, which will now reliably crash with the poison. It isn't so much the double unregister that worries me, but racing unregister with range functions. Jason _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration [not found] ` <20190618185757.GP6961-uk2M96/98Pc@public.gmane.org> @ 2019-06-19 8:19 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread From: Christoph Hellwig @ 2019-06-19 8:19 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Tue, Jun 18, 2019 at 03:57:57PM -0300, Jason Gunthorpe wrote: > With the previous loose coupling of the mirror and the range some code > might rance to try to create a range without a mirror, which will now > reliably crash with the poison. > > It isn't so much the double unregister that worries me, but racing > unregister with range functions. Oh well. It was just a nitpick for the highly unusual code patterns in the two unregister routines, probably not worth fighting over even if I still don't see the point. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (7 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-10-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe ` (2 subsequent siblings) 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, Souptick Joarder, linux-mm, Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> Trying to misuse a range outside its lifetime is a kernel bug. Use poison bytes to help detect this condition. Double unregister will reliably crash. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- v2 - Keep range start/end valid after unregistration (Jerome) v3 - Revise some comments (John) - Remove start/end WARN_ON (Souptick) --- mm/hmm.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index e3e0a811a3a774..e214668cba3474 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -933,19 +933,21 @@ void hmm_range_unregister(struct hmm_range *range) { struct hmm *hmm = range->hmm; - /* Sanity check this really should not happen. 
*/ - if (hmm == NULL || range->end <= range->start) - return; - mutex_lock(&hmm->lock); list_del_rcu(&range->list); mutex_unlock(&hmm->lock); /* Drop reference taken by hmm_range_register() */ - range->valid = false; mmput(hmm->mm); hmm_put(hmm); - range->hmm = NULL; + + /* + * The range is now invalid and the ref on the hmm is dropped, so + * poison the pointer. Leave other fields in place, for the caller's + * use. + */ + range->valid = false; + memset(&range->hmm, POISON_INUSE, sizeof(range->hmm)); } EXPORT_SYMBOL(hmm_range_unregister); -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister [not found] ` <20190614004450.20252-10-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:17 ` Christoph Hellwig 2019-06-18 18:04 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 14:17 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Souptick Joarder, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs > - /* Sanity check this really should not happen. */ > - if (hmm == NULL || range->end <= range->start) > - return; > - > mutex_lock(&hmm->lock); > list_del_rcu(&range->list); > mutex_unlock(&hmm->lock); > > /* Drop reference taken by hmm_range_register() */ > - range->valid = false; > mmput(hmm->mm); > hmm_put(hmm); > - range->hmm = NULL; > + > + /* > + * The range is now invalid and the ref on the hmm is dropped, so > + * poison the pointer. Leave other fields in place, for the caller's > + * use. > + */ > + range->valid = false; > + memset(&range->hmm, POISON_INUSE, sizeof(range->hmm)); Formatting seems to be messed up. But again I don't see the value in the poisoning, just let normal linked list debugging do its work. The other cleanups looks fine to me. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister 2019-06-15 14:17 ` Christoph Hellwig @ 2019-06-18 18:04 ` Jason Gunthorpe 0 siblings, 0 replies; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 18:04 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, Souptick Joarder, linux-mm, Jerome Glisse, amd-gfx, Ira Weiny, Ben Skeggs On Sat, Jun 15, 2019 at 07:17:26AM -0700, Christoph Hellwig wrote: > > - /* Sanity check this really should not happen. */ > > - if (hmm == NULL || range->end <= range->start) > > - return; > > - > > mutex_lock(&hmm->lock); > > list_del_rcu(&range->list); > > mutex_unlock(&hmm->lock); > > > > /* Drop reference taken by hmm_range_register() */ > > - range->valid = false; > > mmput(hmm->mm); > > hmm_put(hmm); > > - range->hmm = NULL; > > + > > + /* > > + * The range is now invalid and the ref on the hmm is dropped, so > > + * poison the pointer. Leave other fields in place, for the caller's > > + * use. > > + */ > > + range->valid = false; > > + memset(&range->hmm, POISON_INUSE, sizeof(range->hmm)); > > Formatting seems to be messed up. But again I don't see the value > in the poisoning, just let normal linked list debugging do its work. > The other cleanups looks fine to me. tabs vs spaces, I fixed it. This one is more murky than the other - it is to prevent the caller from using any of the range APIs after the range is unregistered, but we could also safely use NULL here, I think. Jason _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (8 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-11-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe 2019-06-14 0:44 ` [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, Ira Weiny, amd-gfx, Souptick Joarder, linux-mm, Jason Gunthorpe, dri-devel, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> This list is always read and written while holding hmm->lock so there is no need for the confusing _rcu annotations. 
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Ira Weiny <iweiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- mm/hmm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index e214668cba3474..26af511cbdd075 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -908,7 +908,7 @@ int hmm_range_register(struct hmm_range *range, range->hmm = hmm; kref_get(&hmm->kref); - list_add_rcu(&range->list, &hmm->ranges); + list_add(&range->list, &hmm->ranges); /* * If there are any concurrent notifiers we have to wait for them for @@ -934,7 +934,7 @@ void hmm_range_unregister(struct hmm_range *range) struct hmm *hmm = range->hmm; mutex_lock(&hmm->lock); - list_del_rcu(&range->list); + list_del(&range->list); mutex_unlock(&hmm->lock); /* Drop reference taken by hmm_range_register() */ -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges [not found] ` <20190614004450.20252-11-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:18 ` Christoph Hellwig 2019-06-18 0:38 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 14:18 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, Ira Weiny, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Souptick Joarder, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Thu, Jun 13, 2019 at 09:44:48PM -0300, Jason Gunthorpe wrote: > range->hmm = hmm; > kref_get(&hmm->kref); > - list_add_rcu(&range->list, &hmm->ranges); > + list_add(&range->list, &hmm->ranges); > > /* > * If there are any concurrent notifiers we have to wait for them for > @@ -934,7 +934,7 @@ void hmm_range_unregister(struct hmm_range *range) > struct hmm *hmm = range->hmm; > > mutex_lock(&hmm->lock); > - list_del_rcu(&range->list); > + list_del(&range->list); > mutex_unlock(&hmm->lock); Looks fine: Signed-off-by: Christoph Hellwig <hch@lst.de> Btw, is there any reason new ranges are added to the front and not the tail of the list? _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges 2019-06-15 14:18 ` Christoph Hellwig @ 2019-06-18 0:38 ` Jason Gunthorpe 0 siblings, 0 replies; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 0:38 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, Ira Weiny, dri-devel, Souptick Joarder, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs On Sat, Jun 15, 2019 at 07:18:26AM -0700, Christoph Hellwig wrote: > On Thu, Jun 13, 2019 at 09:44:48PM -0300, Jason Gunthorpe wrote: > > range->hmm = hmm; > > kref_get(&hmm->kref); > > - list_add_rcu(&range->list, &hmm->ranges); > > + list_add(&range->list, &hmm->ranges); > > > > /* > > * If there are any concurrent notifiers we have to wait for them for > > @@ -934,7 +934,7 @@ void hmm_range_unregister(struct hmm_range *range) > > struct hmm *hmm = range->hmm; > > > > mutex_lock(&hmm->lock); > > - list_del_rcu(&range->list); > > + list_del(&range->list); > > mutex_unlock(&hmm->lock); > > Looks fine: > > Signed-off-by: Christoph Hellwig <hch@lst.de> > > Btw, is there any reason new ranges are added to the front and not the > tail of the list? Couldn't find one. I think order on this list doesn't matter. Jason _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (9 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-12-jgg-uk2M96/98Pc@public.gmane.org> 2019-06-14 0:44 ` [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe, dri-devel, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> hmm_release() is called exactly once per hmm. ops->release() cannot accidentally trigger any action that would recurse back onto hmm->mirrors_sem. This fixes a use after-free race of the form: CPU0 CPU1 hmm_release() up_write(&hmm->mirrors_sem); hmm_mirror_unregister(mirror) down_write(&hmm->mirrors_sem); up_write(&hmm->mirrors_sem); kfree(mirror) mirror->ops->release(mirror) The only user we have today for ops->release is an empty function, so this is unambiguously safe. As a consequence of plugging this race drivers are not allowed to register/unregister mirrors from within a release op. 
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Tested-by: Philip Yang <Philip.Yang@amd.com> --- mm/hmm.c | 28 +++++++++------------------- 1 file changed, 9 insertions(+), 19 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index 26af511cbdd075..c0d43302fd6b2f 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -137,26 +137,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) WARN_ON(!list_empty(&hmm->ranges)); mutex_unlock(&hmm->lock); - down_write(&hmm->mirrors_sem); - mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror, - list); - while (mirror) { - list_del_init(&mirror->list); - if (mirror->ops->release) { - /* - * Drop mirrors_sem so the release callback can wait - * on any pending work that might itself trigger a - * mmu_notifier callback and thus would deadlock with - * us. - */ - up_write(&hmm->mirrors_sem); + down_read(&hmm->mirrors_sem); + list_for_each_entry(mirror, &hmm->mirrors, list) { + /* + * Note: The driver is not allowed to trigger + * hmm_mirror_unregister() from this thread. + */ + if (mirror->ops->release) mirror->ops->release(mirror); - down_write(&hmm->mirrors_sem); - } - mirror = list_first_entry_or_null(&hmm->mirrors, - struct hmm_mirror, list); } - up_write(&hmm->mirrors_sem); + up_read(&hmm->mirrors_sem); hmm_put(hmm); } @@ -286,7 +276,7 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror) struct hmm *hmm = mirror->hmm; down_write(&hmm->mirrors_sem); - list_del_init(&mirror->list); + list_del(&mirror->list); up_write(&hmm->mirrors_sem); hmm_put(hmm); memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm)); -- 2.21.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release [not found] ` <20190614004450.20252-12-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:21 ` Christoph Hellwig 2019-06-18 0:45 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-15 14:21 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Thu, Jun 13, 2019 at 09:44:49PM -0300, Jason Gunthorpe wrote: > From: Jason Gunthorpe <jgg@mellanox.com> > > hmm_release() is called exactly once per hmm. ops->release() cannot > accidentally trigger any action that would recurse back onto > hmm->mirrors_sem. In linux-next amdgpu actually calls hmm_mirror_unregister from its release function. That whole release function looks rather sketchy, but we probably need to sort that out first. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release 2019-06-15 14:21 ` Christoph Hellwig @ 2019-06-18 0:45 ` Jason Gunthorpe [not found] ` <20190618004509.GE30762-uk2M96/98Pc@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-18 0:45 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma, John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs On Sat, Jun 15, 2019 at 07:21:06AM -0700, Christoph Hellwig wrote: > On Thu, Jun 13, 2019 at 09:44:49PM -0300, Jason Gunthorpe wrote: > > From: Jason Gunthorpe <jgg@mellanox.com> > > > > hmm_release() is called exactly once per hmm. ops->release() cannot > > accidentally trigger any action that would recurse back onto > > hmm->mirrors_sem. > > In linux-next amdgpu actually calls hmm_mirror_unregister from its > release function. That whole release function looks rather sketchy, > but we probably need to sort that out first. Does it? I see this: static void amdgpu_hmm_mirror_release(struct hmm_mirror *mirror) { struct amdgpu_mn *amn = container_of(mirror, struct amdgpu_mn, mirror); INIT_WORK(&amn->work, amdgpu_mn_destroy); schedule_work(&amn->work); } static struct hmm_mirror_ops amdgpu_hmm_mirror_ops[] = { [AMDGPU_MN_TYPE_GFX] = { .sync_cpu_device_pagetables = amdgpu_mn_sync_pagetables_gfx, .release = amdgpu_hmm_mirror_release }, [AMDGPU_MN_TYPE_HSA] = { .sync_cpu_device_pagetables = amdgpu_mn_sync_pagetables_hsa, .release = amdgpu_hmm_mirror_release }, }; Am I looking at the wrong thing? Looks like it calls it through a work queue should should be OK.. Though very strange that amdgpu only destroys the mirror via release, that cannot be right. Jason _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release [not found] ` <20190618004509.GE30762-uk2M96/98Pc@public.gmane.org> @ 2019-06-18 5:37 ` Christoph Hellwig [not found] ` <20190618053733.GA25048-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-18 5:37 UTC (permalink / raw) To: Jason Gunthorpe Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Mon, Jun 17, 2019 at 09:45:09PM -0300, Jason Gunthorpe wrote: > Am I looking at the wrong thing? Looks like it calls it through a work > queue should should be OK.. Yes, it calls it through a work queue. I guess that is fine because it needs to take the lock again. > Though very strange that amdgpu only destroys the mirror via release, > that cannot be right. As said the whole things looks rather odd to me. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release [not found] ` <20190618053733.GA25048-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> @ 2019-06-19 0:53 ` Kuehling, Felix [not found] ` <be4f8573-6284-04a6-7862-23bb357bfe3c-5C7GfCeVMHo@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Kuehling, Felix @ 2019-06-19 0:53 UTC (permalink / raw) To: Christoph Hellwig, Jason Gunthorpe Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On 2019-06-18 1:37, Christoph Hellwig wrote: > On Mon, Jun 17, 2019 at 09:45:09PM -0300, Jason Gunthorpe wrote: >> Am I looking at the wrong thing? Looks like it calls it through a work >> queue should should be OK.. > Yes, it calls it through a work queue. I guess that is fine because > it needs to take the lock again. > >> Though very strange that amdgpu only destroys the mirror via release, >> that cannot be right. > As said the whole things looks rather odd to me. This code is derived from our old MMU notifier code. Before HMM we used to register a single MMU notifier per mm_struct and look up virtual address ranges that had been registered for mirroring via driver API calls. The idea was to reuse a single MMU notifier for the life time of the process. It would remain registered until we got a notifier_release. hmm_mirror took the place of that when we converted the code to HMM. I suppose we could destroy the mirror earlier, when we have no more registered virtual address ranges, and create a new one if needed later. Regards, Felix _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release [not found] ` <be4f8573-6284-04a6-7862-23bb357bfe3c-5C7GfCeVMHo@public.gmane.org> @ 2019-06-19 8:07 ` Christoph Hellwig 2019-06-19 11:56 ` Jason Gunthorpe 0 siblings, 1 reply; 45+ messages in thread From: Christoph Hellwig @ 2019-06-19 8:07 UTC (permalink / raw) To: Kuehling, Felix Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, Jason Gunthorpe, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs On Wed, Jun 19, 2019 at 12:53:55AM +0000, Kuehling, Felix wrote: > This code is derived from our old MMU notifier code. Before HMM we used > to register a single MMU notifier per mm_struct and look up virtual > address ranges that had been registered for mirroring via driver API > calls. The idea was to reuse a single MMU notifier for the life time of > the process. It would remain registered until we got a notifier_release. > > hmm_mirror took the place of that when we converted the code to HMM. > > I suppose we could destroy the mirror earlier, when we have no more > registered virtual address ranges, and create a new one if needed later. I didn't write the code, but if you look at hmm_mirror it already is a multiplexer over the mmu notifier, and the intent clearly seems that you register one per range that you want to mirror, and not multiplex it once again. In other words - I think each amdgpu_mn_node should probably have its own hmm_mirror. And while the amdgpu_mn_node objects are currently stored in an interval tree it seems like they are only linearly iterated anyway, so a list actually seems pretty suitable. If not we need to improve the core data structures instead of working around them. 
* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release 2019-06-19 8:07 ` Christoph Hellwig @ 2019-06-19 11:56 ` Jason Gunthorpe [not found] ` <20190619115632.GC9360-uk2M96/98Pc@public.gmane.org> 0 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-19 11:56 UTC (permalink / raw) To: Christoph Hellwig Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell, linux-rdma, John Hubbard, Kuehling, Felix, dri-devel, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs On Wed, Jun 19, 2019 at 01:07:05AM -0700, Christoph Hellwig wrote: > On Wed, Jun 19, 2019 at 12:53:55AM +0000, Kuehling, Felix wrote: > > This code is derived from our old MMU notifier code. Before HMM we used > > to register a single MMU notifier per mm_struct and look up virtual > > address ranges that had been registered for mirroring via driver API > > calls. The idea was to reuse a single MMU notifier for the life time of > > the process. It would remain registered until we got a notifier_release. > > > > hmm_mirror took the place of that when we converted the code to HMM. > > > > I suppose we could destroy the mirror earlier, when we have no more > > registered virtual address ranges, and create a new one if needed later. > > I didn't write the code, but if you look at hmm_mirror it already is > a multiplexer over the mmu notifier, and the intent clearly seems that > you register one per range that you want to mirror, and not multiplex > it once again. In other words - I think each amdgpu_mn_node should > probably have its own hmm_mirror. And while the amdgpu_mn_node objects > are currently stored in an interval tree it seems like they are only > linearly iterated anyway, so a list actually seems pretty suitable. If > not we need to improve the core data structures instead of working > around them. 
This looks a lot like the ODP code (amdgpu_mn_node == ib_umem_odp)

The interval tree is to quickly find the driver object(s) that have the virtual pages during invalidation:

static int amdgpu_mn_sync_pagetables_gfx(struct hmm_mirror *mirror,
			const struct hmm_update *update)
{
	it = interval_tree_iter_first(&amn->objects, start, end);
	while (it) {
		[..]
		amdgpu_mn_invalidate_node(node, start, end);

And following the ODP model there should be a single hmm_mirror per-mm (user can fork and stuff, this is something I want to have core code help with).

The hmm_mirror can either exist so long as objects exist, or it can exist until the chardev is closed - but never longer than the chardev's lifetime.

Maybe we should be considering providing a mmu notifier & interval tree & lock abstraction since ODP & AMD are very similar here..

Jason
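[Editor's sketch] The dispatch pattern described above — an invalidation callback walking every driver object whose virtual range overlaps the invalidated span — can be sketched in self-contained userspace C. The names (`struct mn_node`, `mn_invalidate`) are hypothetical, a plain linked list stands in for the kernel's interval tree (matching Christoph's observation that a list would suffice for linear iteration), and half-open `[start, end)` ranges are assumed:

```c
#include <assert.h>
#include <stddef.h>

/* One mirrored virtual address range, as a driver object might track it. */
struct mn_node {
	unsigned long start, end;	/* [start, end) of the mirrored range */
	int invalidated;		/* set when the CPU mapping changed */
	struct mn_node *next;
};

/*
 * Walk all objects and mark those overlapping [start, end) invalid,
 * the way the quoted sync_pagetables callback dispatches to
 * amdgpu_mn_invalidate_node().  Returns the number of objects hit.
 */
static int mn_invalidate(struct mn_node *head,
			 unsigned long start, unsigned long end)
{
	int hits = 0;

	for (struct mn_node *n = head; n; n = n->next) {
		/* skip disjoint ranges: the usual overlap test */
		if (end <= n->start || start >= n->end)
			continue;
		n->invalidated = 1;
		hits++;
	}
	return hits;
}
```

An interval tree makes the same lookup O(log n + k) instead of O(n), which only matters once a process mirrors many ranges.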
* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release [not found] ` <20190619115632.GC9360-uk2M96/98Pc@public.gmane.org> @ 2019-06-19 12:03 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-19 12:03 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Kuehling, Felix, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Wed, Jun 19, 2019 at 08:56:32AM -0300, Jason Gunthorpe wrote:
> This looks a lot like the ODP code (amdgpu_mn_node == ib_umem_odp)
>
> The interval tree is to quickly find the driver object(s) that have
> the virtual pages during invalidation:
>
> static int amdgpu_mn_sync_pagetables_gfx(struct hmm_mirror *mirror,
> 			const struct hmm_update *update)
> {
> 	it = interval_tree_iter_first(&amn->objects, start, end);
> 	while (it) {
> 		[..]
> 		amdgpu_mn_invalidate_node(node, start, end);
>
> And following the ODP model there should be a single hmm_mirror per-mm
> (user can fork and stuff, this is something I want to have core code
> help with).

That makes the hmm_mirror object pretty silly, though, as the scope is then exactly the same as the mmu_notifier itself.

> The hmm_mirror can either exist so long as objects exist, or it can
> exist until the chardev is closed - but never longer than the
> chardev's lifetime.
>
> Maybe we should be considering providing a mmu notifier & interval
> tree & lock abstraction since ODP & AMD are very similar here..

It definitely sounds like a good idea to move this kind of object management into common code. Nouveau actually seems like the odd one out here by not having a list of objects below the mirror, but then again an interval tree with a single entry wouldn't really hurt it either.
* [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start 2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe ` (10 preceding siblings ...) 2019-06-14 0:44 ` [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe @ 2019-06-14 0:44 ` Jason Gunthorpe [not found] ` <20190614004450.20252-13-jgg-uk2M96/98Pc@public.gmane.org> 11 siblings, 1 reply; 45+ messages in thread From: Jason Gunthorpe @ 2019-06-14 0:44 UTC (permalink / raw) To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe, dri-devel, Ben Skeggs From: Jason Gunthorpe <jgg@mellanox.com> If the trylock on the hmm->mirrors_sem fails the function will return without decrementing the notifiers that were previously incremented. Since the caller will not call invalidate_range_end() on EAGAIN this will result in notifiers becoming permanently incremented and deadlock. If the sync_cpu_device_pagetables() required blocking the function will not return EAGAIN even though the device continues to touch the pages. This is a violation of the mmu notifier contract. Switch, and rename, the ranges_lock to a spin lock so we can reliably obtain it without blocking during error unwind. The error unwind is necessary since the notifiers count must be held incremented across the call to sync_cpu_device_pagetables() as we cannot allow the range to become marked valid by a parallel invalidate_start/end() pair while doing sync_cpu_device_pagetables(). 
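[Editor's sketch] The unwind rule the commit message describes — the notifier count incremented in range_start() must be dropped on every failure path, because the caller never invokes range_end() after -EAGAIN — can be reduced to a self-contained userspace sketch. All names (`struct fake_hmm`, `range_start`, `sync_device`) are hypothetical stand-ins for the hmm code, and `sync_fails` is a test knob, not part of any real API:

```c
#include <assert.h>
#include <errno.h>

struct fake_hmm {
	int notifiers;		/* stands in for hmm->notifiers */
};

static int sync_fails;		/* test knob: make the device sync "fail" */

/* Stand-in for mirror->ops->sync_cpu_device_pagetables(). */
static int sync_device(void)
{
	return sync_fails ? -EAGAIN : 0;
}

/* Models hmm_invalidate_range_start(): ranges are invalid from the
 * increment onward; any error return must undo the increment itself. */
static int range_start(struct fake_hmm *hmm)
{
	int ret;

	hmm->notifiers++;
	ret = sync_device();
	if (ret) {
		/* error unwind: the caller will not call range_end()
		 * on -EAGAIN, so drop the count before returning */
		hmm->notifiers--;
		return ret;
	}
	return 0;
}

/* Models hmm_invalidate_range_end(): only called after a successful start. */
static void range_end(struct fake_hmm *hmm)
{
	hmm->notifiers--;
}
```

Without the decrement in the failure branch, every failed trylock would leave the count permanently elevated, so no range could ever become valid again — the deadlock the patch fixes.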
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
 include/linux/hmm.h |  2 +-
 mm/hmm.c            | 77 +++++++++++++++++++++++++++------------------
 2 files changed, 48 insertions(+), 31 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index bf013e96525771..0fa8ea34ccef6d 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -86,7 +86,7 @@ struct hmm {
 	struct mm_struct *mm;
 	struct kref kref;
-	struct mutex lock;
+	spinlock_t ranges_lock;
 	struct list_head ranges;
 	struct list_head mirrors;
 	struct mmu_notifier mmu_notifier;
diff --git a/mm/hmm.c b/mm/hmm.c
index c0d43302fd6b2f..1172a4f0206963 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -67,7 +67,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	init_rwsem(&hmm->mirrors_sem);
 	hmm->mmu_notifier.ops = NULL;
 	INIT_LIST_HEAD(&hmm->ranges);
-	mutex_init(&hmm->lock);
+	spin_lock_init(&hmm->ranges_lock);
 	kref_init(&hmm->kref);
 	hmm->notifiers = 0;
 	hmm->mm = mm;
@@ -124,18 +124,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
+	unsigned long flags;
 
 	/* Bail out if hmm is in the process of being freed */
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	mutex_lock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	/*
 	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
 	 * prevented as long as a range exists.
 	 */
 	WARN_ON(!list_empty(&hmm->ranges));
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	down_read(&hmm->mirrors_sem);
 	list_for_each_entry(mirror, &hmm->mirrors, list) {
@@ -151,6 +152,23 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	hmm_put(hmm);
 }
 
+static void notifiers_decrement(struct hmm *hmm)
+{
+	lockdep_assert_held(&hmm->ranges_lock);
+
+	hmm->notifiers--;
+	if (!hmm->notifiers) {
+		struct hmm_range *range;
+
+		list_for_each_entry(range, &hmm->ranges, list) {
+			if (range->valid)
+				continue;
+			range->valid = true;
+		}
+		wake_up_all(&hmm->wq);
+	}
+}
+
 static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *nrange)
 {
@@ -158,6 +176,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	struct hmm_mirror *mirror;
 	struct hmm_update update;
 	struct hmm_range *range;
+	unsigned long flags;
 	int ret = 0;
 
 	if (!kref_get_unless_zero(&hmm->kref))
@@ -168,12 +187,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	update.event = HMM_UPDATE_INVALIDATE;
 	update.blockable = mmu_notifier_range_blockable(nrange);
 
-	if (mmu_notifier_range_blockable(nrange))
-		mutex_lock(&hmm->lock);
-	else if (!mutex_trylock(&hmm->lock)) {
-		ret = -EAGAIN;
-		goto out;
-	}
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	hmm->notifiers++;
 	list_for_each_entry(range, &hmm->ranges, list) {
 		if (update.end < range->start || update.start >= range->end)
@@ -181,7 +195,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 		range->valid = false;
 	}
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	if (mmu_notifier_range_blockable(nrange))
 		down_read(&hmm->mirrors_sem);
@@ -189,16 +203,26 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 		ret = -EAGAIN;
 		goto out;
 	}
+
 	list_for_each_entry(mirror, &hmm->mirrors, list) {
-		int ret;
+		int rc;
 
-		ret = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
-		if (!update.blockable && ret == -EAGAIN)
+		rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
+		if (rc) {
+			if (WARN_ON(update.blockable || rc != -EAGAIN))
+				continue;
+			ret = -EAGAIN;
 			break;
+		}
 	}
 	up_read(&hmm->mirrors_sem);
 
 out:
+	if (ret) {
+		spin_lock_irqsave(&hmm->ranges_lock, flags);
+		notifiers_decrement(hmm);
+		spin_unlock_irqrestore(&hmm->ranges_lock, flags);
+	}
 	hmm_put(hmm);
 	return ret;
 }
@@ -207,23 +231,14 @@ static void hmm_invalidate_range_end(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *nrange)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
+	unsigned long flags;
 
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	mutex_lock(&hmm->lock);
-	hmm->notifiers--;
-	if (!hmm->notifiers) {
-		struct hmm_range *range;
-
-		list_for_each_entry(range, &hmm->ranges, list) {
-			if (range->valid)
-				continue;
-			range->valid = true;
-		}
-		wake_up_all(&hmm->wq);
-	}
-	mutex_unlock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
+	notifiers_decrement(hmm);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	hmm_put(hmm);
 }
@@ -876,6 +891,7 @@ int hmm_range_register(struct hmm_range *range,
 {
 	unsigned long mask = ((1UL << page_shift) - 1UL);
 	struct hmm *hmm = mirror->hmm;
+	unsigned long flags;
 
 	range->valid = false;
 	range->hmm = NULL;
@@ -894,7 +910,7 @@ int hmm_range_register(struct hmm_range *range,
 		return -EFAULT;
 
 	/* Initialize range to track CPU page table updates. */
-	mutex_lock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 
 	range->hmm = hmm;
 	kref_get(&hmm->kref);
@@ -906,7 +922,7 @@ int hmm_range_register(struct hmm_range *range,
 	 */
 	if (!hmm->notifiers)
 		range->valid = true;
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	return 0;
 }
@@ -922,10 +938,11 @@ EXPORT_SYMBOL(hmm_range_register);
 void hmm_range_unregister(struct hmm_range *range)
 {
 	struct hmm *hmm = range->hmm;
+	unsigned long flags;
 
-	mutex_lock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	list_del(&range->list);
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	/* Drop reference taken by hmm_range_register() */
 	mmput(hmm->mm);
-- 
2.21.0
* Re: [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start [not found] ` <20190614004450.20252-13-jgg-uk2M96/98Pc@public.gmane.org> @ 2019-06-15 14:25 ` Christoph Hellwig 0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:25 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Felix.Kuehling-5C7GfCeVMHo, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Thu, Jun 13, 2019 at 09:44:50PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> If the trylock on the hmm->mirrors_sem fails the function will return
> without decrementing the notifiers that were previously incremented. Since
> the caller will not call invalidate_range_end() on EAGAIN this will result
> in notifiers becoming permanently incremented and deadlock.
>
> If the sync_cpu_device_pagetables() required blocking the function will
> not return EAGAIN even though the device continues to touch the
> pages. This is a violation of the mmu notifier contract.
>
> Switch, and rename, the ranges_lock to a spin lock so we can reliably
> obtain it without blocking during error unwind.
>
> The error unwind is necessary since the notifiers count must be held
> incremented across the call to sync_cpu_device_pagetables() as we cannot
> allow the range to become marked valid by a parallel
> invalidate_start/end() pair while doing sync_cpu_device_pagetables().
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> Tested-by: Philip Yang <Philip.Yang@amd.com>
> ---
>  include/linux/hmm.h |  2 +-
>  mm/hmm.c            | 77 +++++++++++++++++++++++++++------------------
>  2 files changed, 48 insertions(+), 31 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index bf013e96525771..0fa8ea34ccef6d 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -86,7 +86,7 @@ struct hmm {
>  	struct mm_struct *mm;
>  	struct kref kref;
> -	struct mutex lock;
> +	spinlock_t ranges_lock;
>  	struct list_head ranges;
>  	struct list_head mirrors;
>  	struct mmu_notifier mmu_notifier;
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c0d43302fd6b2f..1172a4f0206963 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -67,7 +67,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
>  	init_rwsem(&hmm->mirrors_sem);
>  	hmm->mmu_notifier.ops = NULL;
>  	INIT_LIST_HEAD(&hmm->ranges);
> -	mutex_init(&hmm->lock);
> +	spin_lock_init(&hmm->ranges_lock);
>  	kref_init(&hmm->kref);
>  	hmm->notifiers = 0;
>  	hmm->mm = mm;
> @@ -124,18 +124,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>  {
>  	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
>  	struct hmm_mirror *mirror;
> +	unsigned long flags;
>
>  	/* Bail out if hmm is in the process of being freed */
>  	if (!kref_get_unless_zero(&hmm->kref))
>  		return;
>
> -	mutex_lock(&hmm->lock);
> +	spin_lock_irqsave(&hmm->ranges_lock, flags);
>  	/*
>  	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
>  	 * prevented as long as a range exists.
>  	 */
>  	WARN_ON(!list_empty(&hmm->ranges));
> -	mutex_unlock(&hmm->lock);
> +	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
>
>  	down_read(&hmm->mirrors_sem);
>  	list_for_each_entry(mirror, &hmm->mirrors, list) {
> @@ -151,6 +152,23 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>  	hmm_put(hmm);
>  }
>
> +static void notifiers_decrement(struct hmm *hmm)
> +{
> +	lockdep_assert_held(&hmm->ranges_lock);
> +
> +	hmm->notifiers--;
> +	if (!hmm->notifiers) {

Nitpick, when doing dec and test or inc and test ops I find it much easier to read if they are merged into one line, i.e.

	if (!--hmm->notifiers) {

Otherwise this looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>
end of thread, other threads:[~2019-06-19 12:03 UTC | newest]

Thread overview: 45+ messages
2019-06-14 0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
2019-06-14 0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
[not found] ` <20190614004450.20252-2-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 13:56 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
[not found] ` <20190614004450.20252-3-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 13:59 ` Christoph Hellwig
2019-06-18 13:05 ` Jason Gunthorpe
[not found] ` <20190618130544.GC6961-uk2M96/98Pc@public.gmane.org>
2019-06-19 8:14 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
[not found] ` <20190614004450.20252-4-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 13:59 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
[not found] ` <20190614004450.20252-5-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:12 ` Christoph Hellwig
2019-06-18 0:36 ` Jason Gunthorpe
2019-06-18 18:55 ` Jason Gunthorpe
2019-06-14 0:44 ` [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe
[not found] ` <20190614004450.20252-6-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:12 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe
[not found] ` <20190614004450.20252-7-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:14 ` Christoph Hellwig
2019-06-18 15:11 ` Jason Gunthorpe
[not found] ` <20190618151100.GI6961-uk2M96/98Pc@public.gmane.org>
2019-06-19 8:18 ` Christoph Hellwig
2019-06-19 11:34 ` Jason Gunthorpe
[not found] ` <20190619113452.GB9360-uk2M96/98Pc@public.gmane.org>
2019-06-19 11:54 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
[not found] ` <20190614004450.20252-8-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:14 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe
[not found] ` <20190614004450.20252-9-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:16 ` Christoph Hellwig
2019-06-18 13:13 ` Jason Gunthorpe
[not found] ` <20190618131324.GF6961-uk2M96/98Pc@public.gmane.org>
2019-06-18 13:27 ` Christoph Hellwig
2019-06-18 18:57 ` Jason Gunthorpe
[not found] ` <20190618185757.GP6961-uk2M96/98Pc@public.gmane.org>
2019-06-19 8:19 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
[not found] ` <20190614004450.20252-10-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:17 ` Christoph Hellwig
2019-06-18 18:04 ` Jason Gunthorpe
2019-06-14 0:44 ` [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
[not found] ` <20190614004450.20252-11-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:18 ` Christoph Hellwig
2019-06-18 0:38 ` Jason Gunthorpe
2019-06-14 0:44 ` [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
[not found] ` <20190614004450.20252-12-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:21 ` Christoph Hellwig
2019-06-18 0:45 ` Jason Gunthorpe
[not found] ` <20190618004509.GE30762-uk2M96/98Pc@public.gmane.org>
2019-06-18 5:37 ` Christoph Hellwig
[not found] ` <20190618053733.GA25048-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2019-06-19 0:53 ` Kuehling, Felix
[not found] ` <be4f8573-6284-04a6-7862-23bb357bfe3c-5C7GfCeVMHo@public.gmane.org>
2019-06-19 8:07 ` Christoph Hellwig
2019-06-19 11:56 ` Jason Gunthorpe
[not found] ` <20190619115632.GC9360-uk2M96/98Pc@public.gmane.org>
2019-06-19 12:03 ` Christoph Hellwig
2019-06-14 0:44 ` [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe
[not found] ` <20190614004450.20252-13-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:25 ` Christoph Hellwig