linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
@ 2019-06-08  0:14 Ralph Campbell
  2019-06-08  9:10 ` Christoph Hellwig
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Ralph Campbell @ 2019-06-08  0:14 UTC (permalink / raw)
  To: Jerome Glisse, John Hubbard, Felix.Kuehling, Jason Gunthorpe
  Cc: linux-rdma, linux-mm, Andrea Arcangeli, dri-devel, amd-gfx,
	Ralph Campbell

HMM defines its own struct hmm_update which is passed to the
sync_cpu_device_pagetables() callback function. This is
sufficient when the only action is to invalidate. However,
a device may want to know the reason for the invalidation and
be able to see the new permissions on a range, update device access
rights or range statistics. Since sync_cpu_device_pagetables()
can be called from try_to_unmap(), the mmap_sem may not be held
and find_vma() is not safe to be called.
Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
to allow the full invalidation information to be used.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
---

I'm sending this out now since we are updating many of the HMM APIs
and I think it will be useful.


 drivers/gpu/drm/nouveau/nouveau_svm.c |  4 ++--
 include/linux/hmm.h                   | 27 ++-------------------------
 mm/hmm.c                              | 13 ++++---------
 3 files changed, 8 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 8c92374afcf2..c34b98fafe2f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -252,13 +252,13 @@ nouveau_svmm_invalidate(struct nouveau_svmm *svmm, u64 start, u64 limit)
 
 static int
 nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
-					const struct hmm_update *update)
+					const struct mmu_notifier_range *update)
 {
 	struct nouveau_svmm *svmm = container_of(mirror, typeof(*svmm), mirror);
 	unsigned long start = update->start;
 	unsigned long limit = update->end;
 
-	if (!update->blockable)
+	if (!mmu_notifier_range_blockable(update))
 		return -EAGAIN;
 
 	SVMM_DBG(svmm, "invalidate %016lx-%016lx", start, limit);
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 0fa8ea34ccef..07a2d38fde34 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -377,29 +377,6 @@ static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
 
 struct hmm_mirror;
 
-/*
- * enum hmm_update_event - type of update
- * @HMM_UPDATE_INVALIDATE: invalidate range (no indication as to why)
- */
-enum hmm_update_event {
-	HMM_UPDATE_INVALIDATE,
-};
-
-/*
- * struct hmm_update - HMM update information for callback
- *
- * @start: virtual start address of the range to update
- * @end: virtual end address of the range to update
- * @event: event triggering the update (what is happening)
- * @blockable: can the callback block/sleep ?
- */
-struct hmm_update {
-	unsigned long start;
-	unsigned long end;
-	enum hmm_update_event event;
-	bool blockable;
-};
-
 /*
  * struct hmm_mirror_ops - HMM mirror device operations callback
  *
@@ -420,7 +397,7 @@ struct hmm_mirror_ops {
 	/* sync_cpu_device_pagetables() - synchronize page tables
 	 *
 	 * @mirror: pointer to struct hmm_mirror
-	 * @update: update information (see struct hmm_update)
+	 * @update: update information (see struct mmu_notifier_range)
 	 * Return: -EAGAIN if update.blockable false and callback need to
 	 *          block, 0 otherwise.
 	 *
@@ -434,7 +411,7 @@ struct hmm_mirror_ops {
 	 * synchronous call.
 	 */
 	int (*sync_cpu_device_pagetables)(struct hmm_mirror *mirror,
-					  const struct hmm_update *update);
+				const struct mmu_notifier_range *update);
 };
 
 /*
diff --git a/mm/hmm.c b/mm/hmm.c
index 9aad3550f2bb..b49a43712554 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -164,7 +164,6 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
-	struct hmm_update update;
 	struct hmm_range *range;
 	unsigned long flags;
 	int ret = 0;
@@ -173,15 +172,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	if (!kref_get_unless_zero(&hmm->kref))
 		return 0;
 
-	update.start = nrange->start;
-	update.end = nrange->end;
-	update.event = HMM_UPDATE_INVALIDATE;
-	update.blockable = mmu_notifier_range_blockable(nrange);
-
 	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	hmm->notifiers++;
 	list_for_each_entry(range, &hmm->ranges, list) {
-		if (update.end < range->start || update.start >= range->end)
+		if (nrange->end < range->start || nrange->start >= range->end)
 			continue;
 
 		range->valid = false;
@@ -198,9 +192,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	list_for_each_entry(mirror, &hmm->mirrors, list) {
 		int rc;
 
-		rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
+		rc = mirror->ops->sync_cpu_device_pagetables(mirror, nrange);
 		if (rc) {
-			if (WARN_ON(update.blockable || rc != -EAGAIN))
+			if (WARN_ON(mmu_notifier_range_blockable(nrange) ||
+				    rc != -EAGAIN))
 				continue;
 			ret = -EAGAIN;
 			break;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-06-08  0:14 [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables Ralph Campbell
@ 2019-06-08  9:10 ` Christoph Hellwig
  2019-06-08 11:41   ` Jason Gunthorpe
  2019-06-08 11:50 ` Jason Gunthorpe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2019-06-08  9:10 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Jerome Glisse, John Hubbard, Felix.Kuehling, Jason Gunthorpe,
	linux-rdma, linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> 
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.

This is the right thing to do.  But the really right thing is to just
kill the hmm_mirror API entirely and move to mmu_notifiers.  At least
for noveau this already is way simpler, although right now it defeats
Jasons patch to avoid allocating the struct hmm in the fault path.
But as said before that can be avoided by just killing struct hmm,
which for many reasons is the right thing to do anyway.

I've got a series here, which is a bit broken (epecially the last
patch can't work as-is), but should explain where I'm trying to head:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hmm-mirror-simplification


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-06-08  9:10 ` Christoph Hellwig
@ 2019-06-08 11:41   ` Jason Gunthorpe
  2019-06-10  0:16     ` John Hubbard
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Gunthorpe @ 2019-06-08 11:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ralph Campbell, Jerome Glisse, John Hubbard, Felix.Kuehling,
	linux-rdma, linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On Sat, Jun 08, 2019 at 02:10:08AM -0700, Christoph Hellwig wrote:
> On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> > HMM defines its own struct hmm_update which is passed to the
> > sync_cpu_device_pagetables() callback function. This is
> > sufficient when the only action is to invalidate. However,
> > a device may want to know the reason for the invalidation and
> > be able to see the new permissions on a range, update device access
> > rights or range statistics. Since sync_cpu_device_pagetables()
> > can be called from try_to_unmap(), the mmap_sem may not be held
> > and find_vma() is not safe to be called.
> > Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> > to allow the full invalidation information to be used.
> > 
> > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> > 
> > I'm sending this out now since we are updating many of the HMM APIs
> > and I think it will be useful.
> 
> This is the right thing to do.  But the really right thing is to just
> kill the hmm_mirror API entirely and move to mmu_notifiers.  At least
> for noveau this already is way simpler, although right now it defeats
> Jasons patch to avoid allocating the struct hmm in the fault path.
> But as said before that can be avoided by just killing struct hmm,
> which for many reasons is the right thing to do anyway.
> 
> I've got a series here, which is a bit broken (epecially the last
> patch can't work as-is), but should explain where I'm trying to head:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hmm-mirror-simplification

At least the current hmm approach does rely on the collision retry
locking scheme in struct hmm/struct hmm_range for the pagefault side
to work right.

So, before we can apply patch one in this series we need to fix
hmm_vma_fault() and all its varients. Otherwise the driver will be
broken.

I'm hoping to first define what this locking should be (see other
emails to Ralph) then, ideally, see if we can extend mmu notifiers to
get it directly withouth hmm stuff.

Then we apply your patch one and the hmm ops wrapper dies.

Jason


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-06-08  0:14 [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables Ralph Campbell
  2019-06-08  9:10 ` Christoph Hellwig
@ 2019-06-08 11:50 ` Jason Gunthorpe
  2019-06-09 19:46 ` Ira Weiny
  2019-07-02 19:53 ` Jason Gunthorpe
  3 siblings, 0 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2019-06-08 11:50 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Jerome Glisse, John Hubbard, Felix.Kuehling, linux-rdma,
	linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> 
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.

I agree with CH that struct hmm_update seems particularly pointless
and we really should just use mmu_notifier_range directly.

We need to find out from the DRM folks if we can merge this kind of
stuff through hmm.git and then resolve any conflicts that might arise
in DRM tree or in nouveau tree?

But I would like to see this patch go in this cycle, thanks

Jason


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-06-08  0:14 [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables Ralph Campbell
  2019-06-08  9:10 ` Christoph Hellwig
  2019-06-08 11:50 ` Jason Gunthorpe
@ 2019-06-09 19:46 ` Ira Weiny
  2019-07-02 19:53 ` Jason Gunthorpe
  3 siblings, 0 replies; 13+ messages in thread
From: Ira Weiny @ 2019-06-09 19:46 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Jerome Glisse, John Hubbard, Felix.Kuehling, Jason Gunthorpe,
	linux-rdma, linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>

I don't disagree with Christoph or Jason but since I've been trying to sort out
where hmm does and does not fit any chance to remove a custom structure is a
good simplification IMO.  So...

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> ---
> 
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.
> 
> 
>  drivers/gpu/drm/nouveau/nouveau_svm.c |  4 ++--
>  include/linux/hmm.h                   | 27 ++-------------------------
>  mm/hmm.c                              | 13 ++++---------
>  3 files changed, 8 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 8c92374afcf2..c34b98fafe2f 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -252,13 +252,13 @@ nouveau_svmm_invalidate(struct nouveau_svmm *svmm, u64 start, u64 limit)
>  
>  static int
>  nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
> -					const struct hmm_update *update)
> +					const struct mmu_notifier_range *update)
>  {
>  	struct nouveau_svmm *svmm = container_of(mirror, typeof(*svmm), mirror);
>  	unsigned long start = update->start;
>  	unsigned long limit = update->end;
>  
> -	if (!update->blockable)
> +	if (!mmu_notifier_range_blockable(update))
>  		return -EAGAIN;
>  
>  	SVMM_DBG(svmm, "invalidate %016lx-%016lx", start, limit);
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 0fa8ea34ccef..07a2d38fde34 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -377,29 +377,6 @@ static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
>  
>  struct hmm_mirror;
>  
> -/*
> - * enum hmm_update_event - type of update
> - * @HMM_UPDATE_INVALIDATE: invalidate range (no indication as to why)
> - */
> -enum hmm_update_event {
> -	HMM_UPDATE_INVALIDATE,
> -};
> -
> -/*
> - * struct hmm_update - HMM update information for callback
> - *
> - * @start: virtual start address of the range to update
> - * @end: virtual end address of the range to update
> - * @event: event triggering the update (what is happening)
> - * @blockable: can the callback block/sleep ?
> - */
> -struct hmm_update {
> -	unsigned long start;
> -	unsigned long end;
> -	enum hmm_update_event event;
> -	bool blockable;
> -};
> -
>  /*
>   * struct hmm_mirror_ops - HMM mirror device operations callback
>   *
> @@ -420,7 +397,7 @@ struct hmm_mirror_ops {
>  	/* sync_cpu_device_pagetables() - synchronize page tables
>  	 *
>  	 * @mirror: pointer to struct hmm_mirror
> -	 * @update: update information (see struct hmm_update)
> +	 * @update: update information (see struct mmu_notifier_range)
>  	 * Return: -EAGAIN if update.blockable false and callback need to
>  	 *          block, 0 otherwise.
>  	 *
> @@ -434,7 +411,7 @@ struct hmm_mirror_ops {
>  	 * synchronous call.
>  	 */
>  	int (*sync_cpu_device_pagetables)(struct hmm_mirror *mirror,
> -					  const struct hmm_update *update);
> +				const struct mmu_notifier_range *update);
>  };
>  
>  /*
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 9aad3550f2bb..b49a43712554 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -164,7 +164,6 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
>  {
>  	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
>  	struct hmm_mirror *mirror;
> -	struct hmm_update update;
>  	struct hmm_range *range;
>  	unsigned long flags;
>  	int ret = 0;
> @@ -173,15 +172,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
>  	if (!kref_get_unless_zero(&hmm->kref))
>  		return 0;
>  
> -	update.start = nrange->start;
> -	update.end = nrange->end;
> -	update.event = HMM_UPDATE_INVALIDATE;
> -	update.blockable = mmu_notifier_range_blockable(nrange);
> -
>  	spin_lock_irqsave(&hmm->ranges_lock, flags);
>  	hmm->notifiers++;
>  	list_for_each_entry(range, &hmm->ranges, list) {
> -		if (update.end < range->start || update.start >= range->end)
> +		if (nrange->end < range->start || nrange->start >= range->end)
>  			continue;
>  
>  		range->valid = false;
> @@ -198,9 +192,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
>  	list_for_each_entry(mirror, &hmm->mirrors, list) {
>  		int rc;
>  
> -		rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
> +		rc = mirror->ops->sync_cpu_device_pagetables(mirror, nrange);
>  		if (rc) {
> -			if (WARN_ON(update.blockable || rc != -EAGAIN))
> +			if (WARN_ON(mmu_notifier_range_blockable(nrange) ||
> +				    rc != -EAGAIN))
>  				continue;
>  			ret = -EAGAIN;
>  			break;
> -- 
> 2.20.1
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-06-08 11:41   ` Jason Gunthorpe
@ 2019-06-10  0:16     ` John Hubbard
  0 siblings, 0 replies; 13+ messages in thread
From: John Hubbard @ 2019-06-10  0:16 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Ralph Campbell, Jerome Glisse, Felix.Kuehling, linux-rdma,
	linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On 6/8/19 4:41 AM, Jason Gunthorpe wrote:
> On Sat, Jun 08, 2019 at 02:10:08AM -0700, Christoph Hellwig wrote:
>> On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
>>> HMM defines its own struct hmm_update which is passed to the
>>> sync_cpu_device_pagetables() callback function. This is
>>> sufficient when the only action is to invalidate. However,
>>> a device may want to know the reason for the invalidation and
>>> be able to see the new permissions on a range, update device access
>>> rights or range statistics. Since sync_cpu_device_pagetables()
>>> can be called from try_to_unmap(), the mmap_sem may not be held
>>> and find_vma() is not safe to be called.
>>> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
>>> to allow the full invalidation information to be used.
>>>
>>> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
>>>
>>> I'm sending this out now since we are updating many of the HMM APIs
>>> and I think it will be useful.
>>
>> This is the right thing to do.  But the really right thing is to just
>> kill the hmm_mirror API entirely and move to mmu_notifiers.  At least
>> for noveau this already is way simpler, although right now it defeats
>> Jasons patch to avoid allocating the struct hmm in the fault path.
>> But as said before that can be avoided by just killing struct hmm,
>> which for many reasons is the right thing to do anyway.
>>
>> I've got a series here, which is a bit broken (epecially the last
>> patch can't work as-is), but should explain where I'm trying to head:
>>
>> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hmm-mirror-simplification
> 
> At least the current hmm approach does rely on the collision retry
> locking scheme in struct hmm/struct hmm_range for the pagefault side
> to work right.
> 
> So, before we can apply patch one in this series we need to fix
> hmm_vma_fault() and all its varients. Otherwise the driver will be
> broken.
> 
> I'm hoping to first define what this locking should be (see other
> emails to Ralph) then, ideally, see if we can extend mmu notifiers to
> get it directly withouth hmm stuff.
> 
> Then we apply your patch one and the hmm ops wrapper dies.
> 

This all makes sense, and thanks for all this work to simplify and clarify
HMM. It's going to make it a lot easier to work with, when the dust settles.

thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-06-08  0:14 [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables Ralph Campbell
                   ` (2 preceding siblings ...)
  2019-06-09 19:46 ` Ira Weiny
@ 2019-07-02 19:53 ` Jason Gunthorpe
  2019-07-02 20:11   ` Ralph Campbell
  2019-07-02 22:49   ` Christoph Hellwig
  3 siblings, 2 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2019-07-02 19:53 UTC (permalink / raw)
  To: Ralph Campbell, Christoph Hellwig
  Cc: Jerome Glisse, John Hubbard, Felix.Kuehling, linux-rdma,
	linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> 
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.

This make so much sense, I'd like to apply this in hmm.git, is there
any objection?

Jason


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-07-02 19:53 ` Jason Gunthorpe
@ 2019-07-02 20:11   ` Ralph Campbell
  2019-07-02 22:49   ` Christoph Hellwig
  1 sibling, 0 replies; 13+ messages in thread
From: Ralph Campbell @ 2019-07-02 20:11 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Jerome Glisse, John Hubbard, Felix.Kuehling, linux-rdma,
	linux-mm, Andrea Arcangeli, dri-devel, amd-gfx


On 7/2/19 12:53 PM, Jason Gunthorpe wrote:
> On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
>> HMM defines its own struct hmm_update which is passed to the
>> sync_cpu_device_pagetables() callback function. This is
>> sufficient when the only action is to invalidate. However,
>> a device may want to know the reason for the invalidation and
>> be able to see the new permissions on a range, update device access
>> rights or range statistics. Since sync_cpu_device_pagetables()
>> can be called from try_to_unmap(), the mmap_sem may not be held
>> and find_vma() is not safe to be called.
>> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
>> to allow the full invalidation information to be used.
>>
>> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
>> ---
>>
>> I'm sending this out now since we are updating many of the HMM APIs
>> and I think it will be useful.
> 
> This make so much sense, I'd like to apply this in hmm.git, is there
> any objection?
> 
> Jason
> 
Not from me. :-)

Thanks!


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-07-02 19:53 ` Jason Gunthorpe
  2019-07-02 20:11   ` Ralph Campbell
@ 2019-07-02 22:49   ` Christoph Hellwig
  2019-07-02 22:59     ` Jason Gunthorpe
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2019-07-02 22:49 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Ralph Campbell, Christoph Hellwig, Jerome Glisse, John Hubbard,
	Felix.Kuehling, linux-rdma, linux-mm, Andrea Arcangeli,
	dri-devel, amd-gfx

On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> > I'm sending this out now since we are updating many of the HMM APIs
> > and I think it will be useful.
> 
> This make so much sense, I'd like to apply this in hmm.git, is there
> any objection?

As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
a better idea to wait a bit and apply it first thing for next merge
window?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-07-02 22:49   ` Christoph Hellwig
@ 2019-07-02 22:59     ` Jason Gunthorpe
  2019-07-03  0:03       ` Christoph Hellwig
  2019-07-03  2:27       ` Kuehling, Felix
  0 siblings, 2 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2019-07-02 22:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ralph Campbell, Jerome Glisse, John Hubbard, Felix.Kuehling,
	linux-rdma, linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> > > I'm sending this out now since we are updating many of the HMM APIs
> > > and I think it will be useful.
> > 
> > This make so much sense, I'd like to apply this in hmm.git, is there
> > any objection?
> 
> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> a better idea to wait a bit and apply it first thing for next merge
> window?

My thinking is that AMD GPU already has a monster conflict from this:

 int hmm_range_register(struct hmm_range *range,
-                      struct mm_struct *mm,
+                      struct hmm_mirror *mirror,
                       unsigned long start,
                       unsigned long end,
                       unsigned page_shift);

So, depending on how that is resolved we might want to do both API
changes at once.

Or we may have to revert the above change at this late date.

Waiting for AMDGPU team to discuss what process they want to use.

Jason


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-07-02 22:59     ` Jason Gunthorpe
@ 2019-07-03  0:03       ` Christoph Hellwig
  2019-07-03  2:27       ` Kuehling, Felix
  1 sibling, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2019-07-03  0:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Ralph Campbell, Jerome Glisse, John Hubbard,
	Felix.Kuehling, linux-rdma, linux-mm, Andrea Arcangeli,
	dri-devel, amd-gfx

On Tue, Jul 02, 2019 at 10:59:16PM +0000, Jason Gunthorpe wrote:
> > As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> > a better idea to wait a bit and apply it first thing for next merge
> > window?
> 
> My thinking is that AMD GPU already has a monster conflict from this:
> 
>  int hmm_range_register(struct hmm_range *range,
> -                      struct mm_struct *mm,
> +                      struct hmm_mirror *mirror,
>                        unsigned long start,
>                        unsigned long end,
>                        unsigned page_shift);

Well, that seems like a relatively easy to fix conflict, at least as
long as you have the mirror easily available.  The notifier change
on the other hand basically requires rewriting about two dozen lines
of code entirely.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-07-02 22:59     ` Jason Gunthorpe
  2019-07-03  0:03       ` Christoph Hellwig
@ 2019-07-03  2:27       ` Kuehling, Felix
  2019-07-03 15:08         ` Jason Gunthorpe
  1 sibling, 1 reply; 13+ messages in thread
From: Kuehling, Felix @ 2019-07-03  2:27 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig, Deucher, Alexander, David Airlie
  Cc: Ralph Campbell, Jerome Glisse, John Hubbard, linux-rdma,
	linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On 2019-07-02 6:59 p.m., Jason Gunthorpe wrote:
> On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
>> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
>>>> I'm sending this out now since we are updating many of the HMM APIs
>>>> and I think it will be useful.
>>> This make so much sense, I'd like to apply this in hmm.git, is there
>>> any objection?
>> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
>> a better idea to wait a bit and apply it first thing for next merge
>> window?
> My thinking is that AMD GPU already has a monster conflict from this:
>
>   int hmm_range_register(struct hmm_range *range,
> -                      struct mm_struct *mm,
> +                      struct hmm_mirror *mirror,
>                         unsigned long start,
>                         unsigned long end,
>                         unsigned page_shift);
>
> So, depending on how that is resolved we might want to do both API
> changes at once.

I just sent out a fix for the hmm_mirror API change.


>
> Or we may have to revert the above change at this late date.
>
> Waiting for AMDGPU team to discuss what process they want to use.

Yeah, I'm wondering what the process is myself. With HMM and driver 
development happening on different branches these kinds of API changes 
are painful. There seems to be a built-in assumption in the current 
process, that code flows mostly in one direction amd-staging-drm-next -> 
drm-next -> linux-next -> linux. That assumption is broken with HMM code 
evolving rapidly in both amdgpu and mm.

If we want to continue developing HMM driver changes in 
amd-staging-drm-next, we'll need to synchronize with hmm.git more 
frequently, both ways. I believe part of the problem is, that there is a 
fairly long lead-time from getting changes from amd-staging-drm-next 
into linux-next, as they are held for one release cycle in drm-next. 
Pushing HMM-related changes through drm-fixes may offer a kind of 
shortcut. Philip and my latest fixup is just bypassing drm-next 
completely and going straight into linux-next, though.

Regards,
   Felix


>
> Jason

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
  2019-07-03  2:27       ` Kuehling, Felix
@ 2019-07-03 15:08         ` Jason Gunthorpe
  0 siblings, 0 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2019-07-03 15:08 UTC (permalink / raw)
  To: Kuehling, Felix
  Cc: Christoph Hellwig, Deucher, Alexander, David Airlie,
	Ralph Campbell, Jerome Glisse, John Hubbard, linux-rdma,
	linux-mm, Andrea Arcangeli, dri-devel, amd-gfx

On Wed, Jul 03, 2019 at 02:27:22AM +0000, Kuehling, Felix wrote:
> On 2019-07-02 6:59 p.m., Jason Gunthorpe wrote:
> > On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
> >> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> >>>> I'm sending this out now since we are updating many of the HMM APIs
> >>>> and I think it will be useful.
> >>> This make so much sense, I'd like to apply this in hmm.git, is there
> >>> any objection?
> >> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> >> a better idea to wait a bit and apply it first thing for next merge
> >> window?
> > My thinking is that AMD GPU already has a monster conflict from this:
> >
> >   int hmm_range_register(struct hmm_range *range,
> > -                      struct mm_struct *mm,
> > +                      struct hmm_mirror *mirror,
> >                         unsigned long start,
> >                         unsigned long end,
> >                         unsigned page_shift);
> >
> > So, depending on how that is resolved we might want to do both API
> > changes at once.
> 
> I just sent out a fix for the hmm_mirror API change.

I think if you follow my suggestion to apply a prep patch to AMD GPU
to make the conflict resolution simple, we should defer this patch
until next kernel for the reasons CH gave.

> > Or we may have to revert the above change at this late date.
> >
> > Waiting for AMDGPU team to discuss what process they want to use.
> 
> Yeah, I'm wondering what the process is myself. With HMM and driver 
> development happening on different branches these kinds of API changes 
> are painful. There seems to be a built-in assumption in the current 
> process, that code flows mostly in one direction amd-staging-drm-next -> 
> drm-next -> linux-next -> linux. That assumption is broken with HMM code 
> evolving rapidly in both amdgpu and mm.

It looks to me like AMD GPU uses a pull request model. So a goal as a
tree runner should be to work with the other trees (ie hmm.git, etc)
to minimize conflicts between the PR you will send and the PR other
trees will send.

Do not focus on linux-next, that is just an 'early warning system'
that conflicts are on the horizon, we knew about this one :) (well,
mostly, I was surprised how big it was, my bad)

So we must stay in co-ordination with patches in-flight on the list
and make the right decision, depending on the situation. Communication
here is key :)

We have lots of strategies available to deal with these situations.

> If we want to continue developing HMM driver changes in
> amd-staging-drm-next, we'll need to synchronize with hmm.git more 
> frequently, both ways.

It can't really go both ways. hmm.git has to be only the hmm topic,
otherwise it doesn't really work.

> I believe part of the problem is, that there is a fairly long
> lead-time from getting changes from amd-staging-drm-next into
> linux-next, as they are held for one release cycle in drm-next.
> Pushing HMM-related changes through drm-fixes may offer a kind of
> shortcut. Philip and my latest fixup is just bypassing drm-next
> completely and going straight into linux-next, though.

I'm not so familiar with the DRM work flow to give you advice on this.

Jason


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-07-03 15:08 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-08  0:14 [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables Ralph Campbell
2019-06-08  9:10 ` Christoph Hellwig
2019-06-08 11:41   ` Jason Gunthorpe
2019-06-10  0:16     ` John Hubbard
2019-06-08 11:50 ` Jason Gunthorpe
2019-06-09 19:46 ` Ira Weiny
2019-07-02 19:53 ` Jason Gunthorpe
2019-07-02 20:11   ` Ralph Campbell
2019-07-02 22:49   ` Christoph Hellwig
2019-07-02 22:59     ` Jason Gunthorpe
2019-07-03  0:03       ` Christoph Hellwig
2019-07-03  2:27       ` Kuehling, Felix
2019-07-03 15:08         ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).