dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses
@ 2023-05-10 14:19 Thomas Hellström
  2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Thomas Hellström @ 2023-05-10 14:19 UTC (permalink / raw)
  To: dri-devel, intel-xe; +Cc: Thomas Hellström

Each VM has two rebind lists, one protected by the VM resv, the other
one protected essentially by the VM notifier.list_lock. This series
intends to fix two points of illegal access.

Patch 1 fixes an access of VM rebind_lists' link without the VM resv held.
Patch 2 fixes an issue where the VMA may not be removed from the
VM notifier.rebind_list on VMA destruction.

Thomas Hellström (2):
  drm/xe: Fix unlocked access of the vma::rebind_link
  drm/xe: Properly remove the vma from the vm::notifer::rebind_list when
    destroyed

 drivers/gpu/drm/xe/xe_vm.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link
  2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
@ 2023-05-10 14:19 ` Thomas Hellström
  2023-05-11 14:57   ` [Intel-xe] " Matthew Brost
  2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
  2023-05-10 20:36 ` [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Souza, Jose
  2 siblings, 1 reply; 8+ messages in thread
From: Thomas Hellström @ 2023-05-10 14:19 UTC (permalink / raw)
  To: dri-devel, intel-xe; +Cc: Thomas Hellström

the vma::rebind_link is protected by the vm's resv, but we was
modifying it without. Fix this by use the vma::userptr_link instead
for the tmp_evict list. The vma::userptr_link is protected by the
vm lock.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 0a4becdf4675..5f93d78c2e58 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -764,8 +764,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
 		if (err < 0)
 			goto out_err;
 
-		list_del_init(&vma->userptr_link);
-		list_move_tail(&vma->rebind_link, &tmp_evict);
+		list_move_tail(&vma->userptr_link, &tmp_evict);
 	}
 
 	/* Take lock and move to rebind_list for rebinding. */
@@ -773,16 +772,17 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
 	if (err)
 		goto out_err;
 
-	list_splice_tail(&tmp_evict, &vm->rebind_list);
+	list_for_each_entry_safe(vma, next, &tmp_evict, userptr_link) {
+		list_del_init(&vma->userptr_link);
+		list_move_tail(&vma->rebind_link, &vm->rebind_list);
+	}
+
 	dma_resv_unlock(&vm->resv);
 
 	return 0;
 
 out_err:
-	list_for_each_entry_safe(vma, next, &tmp_evict, rebind_link) {
-		list_del_init(&vma->rebind_link);
-		list_add_tail(&vma->userptr_link, &vm->userptr.repin_list);
-	}
+	list_splice_tail(&tmp_evict, &vm->userptr.repin_list);
 
 	return err;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
  2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
  2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
@ 2023-05-10 14:19 ` Thomas Hellström
  2023-05-11 14:54   ` [Intel-xe] " Matthew Brost
  2023-05-10 20:36 ` [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Souza, Jose
  2 siblings, 1 reply; 8+ messages in thread
From: Thomas Hellström @ 2023-05-10 14:19 UTC (permalink / raw)
  To: dri-devel, intel-xe; +Cc: Thomas Hellström

If a vma was destroyed with the bo evicted, it might happen that we forget
to remove it from the notifer::rebind_list. Fix to make sure that really
happens.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 5f93d78c2e58..f54b3b7566c9 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
 	} else {
 		xe_bo_assert_held(vma->bo);
 		list_del(&vma->bo_link);
+		/*
+		 * TODO: We can do an advisory check for list link empty here,
+		 * if this lock becomes too costly. Nobody can re-add to the
+		 * bo to the vm::notifier::rebind_list at this point since we
+		 * have the bo lock.
+		 */
+		 spin_lock(&vm->notifier.list_lock);
+		 list_del(&vma->notifier.rebind_link);
+		 spin_unlock(&vm->notifier.list_lock);
 		if (!vma->bo->vm)
 			vm_remove_extobj(vma);
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses
  2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
  2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
  2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
@ 2023-05-10 20:36 ` Souza, Jose
  2 siblings, 0 replies; 8+ messages in thread
From: Souza, Jose @ 2023-05-10 20:36 UTC (permalink / raw)
  To: dri-devel, intel-xe, thomas.hellstrom

On Wed, 2023-05-10 at 16:19 +0200, Thomas Hellström wrote:
> Each VM has two rebind lists, one protected by the VM resv, the other
> one protected essentially by the VM notifier.list_lock. This series
> intends to fix two points of illegal access.
> 
> Patch 1 fixes an access of VM rebind_lists' link without the VM resv held.
> Patch 2 fixes an issue where the VMA may not be removed from the
> VM notifier.rebind_list on VMA destruction.

Tested-by: José Roberto de Souza <jose.souza@intel.com>

This improved overall stability and fixed the 'list_add corruption' errors, thank you.

> 
> Thomas Hellström (2):
>   drm/xe: Fix unlocked access of the vma::rebind_link
>   drm/xe: Properly remove the vma from the vm::notifer::rebind_list when
>     destroyed
> 
>  drivers/gpu/drm/xe/xe_vm.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-xe] [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
  2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
@ 2023-05-11 14:54   ` Matthew Brost
  2023-05-11 15:38     ` Thomas Hellström
  0 siblings, 1 reply; 8+ messages in thread
From: Matthew Brost @ 2023-05-11 14:54 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-xe, dri-devel

On Wed, May 10, 2023 at 04:19:32PM +0200, Thomas Hellström wrote:
> If a vma was destroyed with the bo evicted, it might happen that we forget
> to remove it from the notifer::rebind_list. Fix to make sure that really
> happens.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 5f93d78c2e58..f54b3b7566c9 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
>  	} else {
>  		xe_bo_assert_held(vma->bo);
>  		list_del(&vma->bo_link);
> +		/*
> +		 * TODO: We can do an advisory check for list link empty here,
> +		 * if this lock becomes too costly. Nobody can re-add to the
> +		 * bo to the vm::notifier::rebind_list at this point since we
> +		 * have the bo lock.
> +		 */

IMO grab isn't a big deal, not sure this is worth such a lengthly comment.

> +		 spin_lock(&vm->notifier.list_lock);
> +		 list_del(&vma->notifier.rebind_link);

Can you safe call list_del on an empty list? I thought that call blows
up hence we have a bunch of if (!list_empty()) checks before calling
list_del all over the driver.

Matt

> +		 spin_unlock(&vm->notifier.list_lock);
>  		if (!vma->bo->vm)
>  			vm_remove_extobj(vma);
>  	}
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-xe] [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link
  2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
@ 2023-05-11 14:57   ` Matthew Brost
  0 siblings, 0 replies; 8+ messages in thread
From: Matthew Brost @ 2023-05-11 14:57 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-xe, dri-devel

On Wed, May 10, 2023 at 04:19:31PM +0200, Thomas Hellström wrote:
> the vma::rebind_link is protected by the vm's resv, but we was
> modifying it without. Fix this by use the vma::userptr_link instead
> for the tmp_evict list. The vma::userptr_link is protected by the
> vm lock.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_vm.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 0a4becdf4675..5f93d78c2e58 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -764,8 +764,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>  		if (err < 0)
>  			goto out_err;
>  
> -		list_del_init(&vma->userptr_link);
> -		list_move_tail(&vma->rebind_link, &tmp_evict);
> +		list_move_tail(&vma->userptr_link, &tmp_evict);
>  	}
>  
>  	/* Take lock and move to rebind_list for rebinding. */
> @@ -773,16 +772,17 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>  	if (err)
>  		goto out_err;
>  
> -	list_splice_tail(&tmp_evict, &vm->rebind_list);
> +	list_for_each_entry_safe(vma, next, &tmp_evict, userptr_link) {
> +		list_del_init(&vma->userptr_link);
> +		list_move_tail(&vma->rebind_link, &vm->rebind_list);
> +	}
> +
>  	dma_resv_unlock(&vm->resv);
>  
>  	return 0;
>  
>  out_err:
> -	list_for_each_entry_safe(vma, next, &tmp_evict, rebind_link) {
> -		list_del_init(&vma->rebind_link);
> -		list_add_tail(&vma->userptr_link, &vm->userptr.repin_list);
> -	}
> +	list_splice_tail(&tmp_evict, &vm->userptr.repin_list);
>  
>  	return err;
>  }
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-xe] [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
  2023-05-11 14:54   ` [Intel-xe] " Matthew Brost
@ 2023-05-11 15:38     ` Thomas Hellström
  2023-05-12  1:31       ` Matthew Brost
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Hellström @ 2023-05-11 15:38 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-xe, dri-devel


On 5/11/23 16:54, Matthew Brost wrote:
> On Wed, May 10, 2023 at 04:19:32PM +0200, Thomas Hellström wrote:
>> If a vma was destroyed with the bo evicted, it might happen that we forget
>> to remove it from the notifer::rebind_list. Fix to make sure that really
>> happens.
>>
>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> index 5f93d78c2e58..f54b3b7566c9 100644
>> --- a/drivers/gpu/drm/xe/xe_vm.c
>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>> @@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
>>   	} else {
>>   		xe_bo_assert_held(vma->bo);
>>   		list_del(&vma->bo_link);
>> +		/*
>> +		 * TODO: We can do an advisory check for list link empty here,
>> +		 * if this lock becomes too costly. Nobody can re-add to the
>> +		 * bo to the vm::notifier::rebind_list at this point since we
>> +		 * have the bo lock.
>> +		 */
> IMO grab isn't a big deal, not sure this is worth such a lengthly comment.

Ok, I'll remove it.


>
>> +		 spin_lock(&vm->notifier.list_lock);
>> +		 list_del(&vma->notifier.rebind_link);
> Can you safe call list_del on an empty list? I thought that call blows
> up hence we have a bunch of if (!list_empty()) checks before calling
> list_del all over the driver.

Good question. Looking at the implementation it definitely looks 
possible, and I have LIST_DEBUG turned on when testing, so I assume it 
would have blown up otherwise.

/Thomas


>
> Matt
>
>> +		 spin_unlock(&vm->notifier.list_lock);
>>   		if (!vma->bo->vm)
>>   			vm_remove_extobj(vma);
>>   	}
>> -- 
>> 2.39.2
>>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Intel-xe] [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
  2023-05-11 15:38     ` Thomas Hellström
@ 2023-05-12  1:31       ` Matthew Brost
  0 siblings, 0 replies; 8+ messages in thread
From: Matthew Brost @ 2023-05-12  1:31 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-xe, dri-devel

On Thu, May 11, 2023 at 05:38:11PM +0200, Thomas Hellström wrote:
> 
> On 5/11/23 16:54, Matthew Brost wrote:
> > On Wed, May 10, 2023 at 04:19:32PM +0200, Thomas Hellström wrote:
> > > If a vma was destroyed with the bo evicted, it might happen that we forget
> > > to remove it from the notifer::rebind_list. Fix to make sure that really
> > > happens.
> > > 
> > > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
> > >   1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index 5f93d78c2e58..f54b3b7566c9 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
> > >   	} else {
> > >   		xe_bo_assert_held(vma->bo);
> > >   		list_del(&vma->bo_link);
> > > +		/*
> > > +		 * TODO: We can do an advisory check for list link empty here,
> > > +		 * if this lock becomes too costly. Nobody can re-add to the
> > > +		 * bo to the vm::notifier::rebind_list at this point since we
> > > +		 * have the bo lock.
> > > +		 */
> > IMO grab isn't a big deal, not sure this is worth such a lengthly comment.
> 
> Ok, I'll remove it.
> 
> 
> > 
> > > +		 spin_lock(&vm->notifier.list_lock);
> > > +		 list_del(&vma->notifier.rebind_link);
> > Can you safe call list_del on an empty list? I thought that call blows
> > up hence we have a bunch of if (!list_empty()) checks before calling
> > list_del all over the driver.
> 
> Good question. Looking at the implementation it definitely looks possible,
> and I have LIST_DEBUG turned on when testing, so I assume it would have
> blown up otherwise.
> 

It looks like 2 deletes will blow up but delete on a empty list is fine.

With that:
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> /Thomas
> 
> 
> > 
> > Matt
> > 
> > > +		 spin_unlock(&vm->notifier.list_lock);
> > >   		if (!vma->bo->vm)
> > >   			vm_remove_extobj(vma);
> > >   	}
> > > -- 
> > > 2.39.2
> > > 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-05-12  1:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
2023-05-11 14:57   ` [Intel-xe] " Matthew Brost
2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
2023-05-11 14:54   ` [Intel-xe] " Matthew Brost
2023-05-11 15:38     ` Thomas Hellström
2023-05-12  1:31       ` Matthew Brost
2023-05-10 20:36 ` [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Souza, Jose

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).