* [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses
@ 2023-05-10 14:19 Thomas Hellström
2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Thomas Hellström @ 2023-05-10 14:19 UTC (permalink / raw)
To: dri-devel, intel-xe; +Cc: Thomas Hellström
Each VM has two rebind lists, one protected by the VM resv, the other
one protected essentially by the VM notifier.list_lock. This series
intends to fix two points of illegal access.
Patch 1 fixes an access of VM rebind_lists' link without the VM resv held.
Patch 2 fixes an issue where the VMA may not be removed from the
VM notifier.rebind_list on VMA destruction.
Thomas Hellström (2):
drm/xe: Fix unlocked access of the vma::rebind_link
drm/xe: Properly remove the vma from the vm::notifer::rebind_list when
destroyed
drivers/gpu/drm/xe/xe_vm.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link
2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
@ 2023-05-10 14:19 ` Thomas Hellström
2023-05-11 14:57 ` [Intel-xe] " Matthew Brost
2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
2023-05-10 20:36 ` [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Souza, Jose
2 siblings, 1 reply; 8+ messages in thread
From: Thomas Hellström @ 2023-05-10 14:19 UTC (permalink / raw)
To: dri-devel, intel-xe; +Cc: Thomas Hellström
the vma::rebind_link is protected by the vm's resv, but we was
modifying it without. Fix this by use the vma::userptr_link instead
for the tmp_evict list. The vma::userptr_link is protected by the
vm lock.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
drivers/gpu/drm/xe/xe_vm.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 0a4becdf4675..5f93d78c2e58 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -764,8 +764,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
if (err < 0)
goto out_err;
- list_del_init(&vma->userptr_link);
- list_move_tail(&vma->rebind_link, &tmp_evict);
+ list_move_tail(&vma->userptr_link, &tmp_evict);
}
/* Take lock and move to rebind_list for rebinding. */
@@ -773,16 +772,17 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
if (err)
goto out_err;
- list_splice_tail(&tmp_evict, &vm->rebind_list);
+ list_for_each_entry_safe(vma, next, &tmp_evict, userptr_link) {
+ list_del_init(&vma->userptr_link);
+ list_move_tail(&vma->rebind_link, &vm->rebind_list);
+ }
+
dma_resv_unlock(&vm->resv);
return 0;
out_err:
- list_for_each_entry_safe(vma, next, &tmp_evict, rebind_link) {
- list_del_init(&vma->rebind_link);
- list_add_tail(&vma->userptr_link, &vm->userptr.repin_list);
- }
+ list_splice_tail(&tmp_evict, &vm->userptr.repin_list);
return err;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
@ 2023-05-10 14:19 ` Thomas Hellström
2023-05-11 14:54 ` [Intel-xe] " Matthew Brost
2023-05-10 20:36 ` [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Souza, Jose
2 siblings, 1 reply; 8+ messages in thread
From: Thomas Hellström @ 2023-05-10 14:19 UTC (permalink / raw)
To: dri-devel, intel-xe; +Cc: Thomas Hellström
If a vma was destroyed with the bo evicted, it might happen that we forget
to remove it from the notifer::rebind_list. Fix to make sure that really
happens.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 5f93d78c2e58..f54b3b7566c9 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
} else {
xe_bo_assert_held(vma->bo);
list_del(&vma->bo_link);
+ /*
+ * TODO: We can do an advisory check for list link empty here,
+ * if this lock becomes too costly. Nobody can re-add to the
+ * bo to the vm::notifier::rebind_list at this point since we
+ * have the bo lock.
+ */
+ spin_lock(&vm->notifier.list_lock);
+ list_del(&vma->notifier.rebind_link);
+ spin_unlock(&vm->notifier.list_lock);
if (!vma->bo->vm)
vm_remove_extobj(vma);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses
2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
@ 2023-05-10 20:36 ` Souza, Jose
2 siblings, 0 replies; 8+ messages in thread
From: Souza, Jose @ 2023-05-10 20:36 UTC (permalink / raw)
To: dri-devel, intel-xe, thomas.hellstrom
On Wed, 2023-05-10 at 16:19 +0200, Thomas Hellström wrote:
> Each VM has two rebind lists, one protected by the VM resv, the other
> one protected essentially by the VM notifier.list_lock. This series
> intends to fix two points of illegal access.
>
> Patch 1 fixes an access of VM rebind_lists' link without the VM resv held.
> Patch 2 fixes an issue where the VMA may not be removed from the
> VM notifier.rebind_list on VMA destruction.
Tested-by: José Roberto de Souza <jose.souza@intel.com>
This improved overall stability and fixed the 'list_add corruption' errors, thank you.
>
> Thomas Hellström (2):
> drm/xe: Fix unlocked access of the vma::rebind_link
> drm/xe: Properly remove the vma from the vm::notifer::rebind_list when
> destroyed
>
> drivers/gpu/drm/xe/xe_vm.c | 23 ++++++++++++++++-------
> 1 file changed, 16 insertions(+), 7 deletions(-)
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Intel-xe] [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
@ 2023-05-11 14:54 ` Matthew Brost
2023-05-11 15:38 ` Thomas Hellström
0 siblings, 1 reply; 8+ messages in thread
From: Matthew Brost @ 2023-05-11 14:54 UTC (permalink / raw)
To: Thomas Hellström; +Cc: intel-xe, dri-devel
On Wed, May 10, 2023 at 04:19:32PM +0200, Thomas Hellström wrote:
> If a vma was destroyed with the bo evicted, it might happen that we forget
> to remove it from the notifer::rebind_list. Fix to make sure that really
> happens.
>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
> drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 5f93d78c2e58..f54b3b7566c9 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
> } else {
> xe_bo_assert_held(vma->bo);
> list_del(&vma->bo_link);
> + /*
> + * TODO: We can do an advisory check for list link empty here,
> + * if this lock becomes too costly. Nobody can re-add to the
> + * bo to the vm::notifier::rebind_list at this point since we
> + * have the bo lock.
> + */
IMO grab isn't a big deal, not sure this is worth such a lengthly comment.
> + spin_lock(&vm->notifier.list_lock);
> + list_del(&vma->notifier.rebind_link);
Can you safe call list_del on an empty list? I thought that call blows
up hence we have a bunch of if (!list_empty()) checks before calling
list_del all over the driver.
Matt
> + spin_unlock(&vm->notifier.list_lock);
> if (!vma->bo->vm)
> vm_remove_extobj(vma);
> }
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Intel-xe] [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link
2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
@ 2023-05-11 14:57 ` Matthew Brost
0 siblings, 0 replies; 8+ messages in thread
From: Matthew Brost @ 2023-05-11 14:57 UTC (permalink / raw)
To: Thomas Hellström; +Cc: intel-xe, dri-devel
On Wed, May 10, 2023 at 04:19:31PM +0200, Thomas Hellström wrote:
> the vma::rebind_link is protected by the vm's resv, but we was
> modifying it without. Fix this by use the vma::userptr_link instead
> for the tmp_evict list. The vma::userptr_link is protected by the
> vm lock.
>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_vm.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 0a4becdf4675..5f93d78c2e58 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -764,8 +764,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
> if (err < 0)
> goto out_err;
>
> - list_del_init(&vma->userptr_link);
> - list_move_tail(&vma->rebind_link, &tmp_evict);
> + list_move_tail(&vma->userptr_link, &tmp_evict);
> }
>
> /* Take lock and move to rebind_list for rebinding. */
> @@ -773,16 +772,17 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
> if (err)
> goto out_err;
>
> - list_splice_tail(&tmp_evict, &vm->rebind_list);
> + list_for_each_entry_safe(vma, next, &tmp_evict, userptr_link) {
> + list_del_init(&vma->userptr_link);
> + list_move_tail(&vma->rebind_link, &vm->rebind_list);
> + }
> +
> dma_resv_unlock(&vm->resv);
>
> return 0;
>
> out_err:
> - list_for_each_entry_safe(vma, next, &tmp_evict, rebind_link) {
> - list_del_init(&vma->rebind_link);
> - list_add_tail(&vma->userptr_link, &vm->userptr.repin_list);
> - }
> + list_splice_tail(&tmp_evict, &vm->userptr.repin_list);
>
> return err;
> }
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Intel-xe] [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
2023-05-11 14:54 ` [Intel-xe] " Matthew Brost
@ 2023-05-11 15:38 ` Thomas Hellström
2023-05-12 1:31 ` Matthew Brost
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Hellström @ 2023-05-11 15:38 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel
On 5/11/23 16:54, Matthew Brost wrote:
> On Wed, May 10, 2023 at 04:19:32PM +0200, Thomas Hellström wrote:
>> If a vma was destroyed with the bo evicted, it might happen that we forget
>> to remove it from the notifer::rebind_list. Fix to make sure that really
>> happens.
>>
>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> index 5f93d78c2e58..f54b3b7566c9 100644
>> --- a/drivers/gpu/drm/xe/xe_vm.c
>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>> @@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
>> } else {
>> xe_bo_assert_held(vma->bo);
>> list_del(&vma->bo_link);
>> + /*
>> + * TODO: We can do an advisory check for list link empty here,
>> + * if this lock becomes too costly. Nobody can re-add to the
>> + * bo to the vm::notifier::rebind_list at this point since we
>> + * have the bo lock.
>> + */
> IMO grab isn't a big deal, not sure this is worth such a lengthly comment.
Ok, I'll remove it.
>
>> + spin_lock(&vm->notifier.list_lock);
>> + list_del(&vma->notifier.rebind_link);
> Can you safe call list_del on an empty list? I thought that call blows
> up hence we have a bunch of if (!list_empty()) checks before calling
> list_del all over the driver.
Good question. Looking at the implementation it definitely looks
possible, and I have LIST_DEBUG turned on when testing, so I assume it
would have blown up otherwise.
/Thomas
>
> Matt
>
>> + spin_unlock(&vm->notifier.list_lock);
>> if (!vma->bo->vm)
>> vm_remove_extobj(vma);
>> }
>> --
>> 2.39.2
>>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Intel-xe] [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed
2023-05-11 15:38 ` Thomas Hellström
@ 2023-05-12 1:31 ` Matthew Brost
0 siblings, 0 replies; 8+ messages in thread
From: Matthew Brost @ 2023-05-12 1:31 UTC (permalink / raw)
To: Thomas Hellström; +Cc: intel-xe, dri-devel
On Thu, May 11, 2023 at 05:38:11PM +0200, Thomas Hellström wrote:
>
> On 5/11/23 16:54, Matthew Brost wrote:
> > On Wed, May 10, 2023 at 04:19:32PM +0200, Thomas Hellström wrote:
> > > If a vma was destroyed with the bo evicted, it might happen that we forget
> > > to remove it from the notifer::rebind_list. Fix to make sure that really
> > > happens.
> > >
> > > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_vm.c | 9 +++++++++
> > > 1 file changed, 9 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index 5f93d78c2e58..f54b3b7566c9 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -978,6 +978,15 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
> > > } else {
> > > xe_bo_assert_held(vma->bo);
> > > list_del(&vma->bo_link);
> > > + /*
> > > + * TODO: We can do an advisory check for list link empty here,
> > > + * if this lock becomes too costly. Nobody can re-add to the
> > > + * bo to the vm::notifier::rebind_list at this point since we
> > > + * have the bo lock.
> > > + */
> > IMO grab isn't a big deal, not sure this is worth such a lengthly comment.
>
> Ok, I'll remove it.
>
>
> >
> > > + spin_lock(&vm->notifier.list_lock);
> > > + list_del(&vma->notifier.rebind_link);
> > Can you safe call list_del on an empty list? I thought that call blows
> > up hence we have a bunch of if (!list_empty()) checks before calling
> > list_del all over the driver.
>
> Good question. Looking at the implementation it definitely looks possible,
> and I have LIST_DEBUG turned on when testing, so I assume it would have
> blown up otherwise.
>
It looks like 2 deletes will blow up but delete on a empty list is fine.
With that:
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> /Thomas
>
>
> >
> > Matt
> >
> > > + spin_unlock(&vm->notifier.list_lock);
> > > if (!vma->bo->vm)
> > > vm_remove_extobj(vma);
> > > }
> > > --
> > > 2.39.2
> > >
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-05-12 1:31 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-10 14:19 [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Thomas Hellström
2023-05-10 14:19 ` [PATCH 1/2] drm/xe: Fix unlocked access of the vma::rebind_link Thomas Hellström
2023-05-11 14:57 ` [Intel-xe] " Matthew Brost
2023-05-10 14:19 ` [PATCH 2/2] drm/xe: Properly remove the vma from the vm::notifer::rebind_list when destroyed Thomas Hellström
2023-05-11 14:54 ` [Intel-xe] " Matthew Brost
2023-05-11 15:38 ` Thomas Hellström
2023-05-12 1:31 ` Matthew Brost
2023-05-10 20:36 ` [Intel-xe] [PATCH 0/2] drm/xe: Fix unprotected rebind_list accesses Souza, Jose
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).