All of lore.kernel.org
* [PATCH] drm/scheduler: set current_entity to next when remove from rq
@ 2022-10-25  6:18 ` brolerliew
  0 siblings, 0 replies; 15+ messages in thread
From: brolerliew @ 2022-10-25  6:18 UTC (permalink / raw)
  Cc: brolerliew, Andrey Grodzovsky, David Airlie, Daniel Vetter,
	dri-devel, linux-kernel

When an entity moves from one rq to another, current_entity will be set to NULL
if it is the moving entity. This makes entities close to the rq head get
selected more frequently, especially when load balancing between
multiple drm_gpu_schedulers.

Set current_entity to the next entity when removing it from the rq.

Signed-off-by: brolerliew <brolerliew@gmail.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 2fab218d7082..00b22cc50f08 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
 	spin_lock(&rq->lock);
 
 	atomic_dec(rq->sched->score);
-	list_del_init(&entity->list);
 
 	if (rq->current_entity == entity)
-		rq->current_entity = NULL;
+		rq->current_entity = list_next_entry(entity, list);
+
+	list_del_init(&entity->list);
 
 	if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
 		drm_sched_rq_remove_fifo_locked(entity);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-25  6:18 ` brolerliew
@ 2022-10-25 13:35 ` Alex Deucher
  2022-10-25 17:50   ` Luben Tuikov
  -1 siblings, 1 reply; 15+ messages in thread
From: Alex Deucher @ 2022-10-25 13:35 UTC (permalink / raw)
  To: brolerliew, Tuikov, Luben; +Cc: Andrey Grodzovsky, linux-kernel, dri-devel

+ Luben

On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>
> When entity move from one rq to another, current_entity will be set to NULL
> if it is the moving entity. This make entities close to rq head got
> selected more frequently, especially when doing load balance between
> multiple drm_gpu_scheduler.
>
> Make current_entity to next when removing from rq.
>
> Signed-off-by: brolerliew <brolerliew@gmail.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 2fab218d7082..00b22cc50f08 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>         spin_lock(&rq->lock);
>
>         atomic_dec(rq->sched->score);
> -       list_del_init(&entity->list);
>
>         if (rq->current_entity == entity)
> -               rq->current_entity = NULL;
> +               rq->current_entity = list_next_entry(entity, list);
> +
> +       list_del_init(&entity->list);
>
>         if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>                 drm_sched_rq_remove_fifo_locked(entity);
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-25 13:35 ` Alex Deucher
@ 2022-10-25 17:50   ` Luben Tuikov
  2022-10-27  7:01     ` Luben Tuikov
  0 siblings, 1 reply; 15+ messages in thread
From: Luben Tuikov @ 2022-10-25 17:50 UTC (permalink / raw)
  To: Alex Deucher, brolerliew; +Cc: Andrey Grodzovsky, linux-kernel, dri-devel

Looking...

Regards,
Luben

On 2022-10-25 09:35, Alex Deucher wrote:
> + Luben
> 
> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>>
>> When entity move from one rq to another, current_entity will be set to NULL
>> if it is the moving entity. This make entities close to rq head got
>> selected more frequently, especially when doing load balance between
>> multiple drm_gpu_scheduler.
>>
>> Make current_entity to next when removing from rq.
>>
>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>> ---
>>  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 2fab218d7082..00b22cc50f08 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>         spin_lock(&rq->lock);
>>
>>         atomic_dec(rq->sched->score);
>> -       list_del_init(&entity->list);
>>
>>         if (rq->current_entity == entity)
>> -               rq->current_entity = NULL;
>> +               rq->current_entity = list_next_entry(entity, list);
>> +
>> +       list_del_init(&entity->list);
>>
>>         if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>                 drm_sched_rq_remove_fifo_locked(entity);
>> --
>> 2.34.1
>>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-25 17:50   ` Luben Tuikov
@ 2022-10-27  7:01     ` Luben Tuikov
  2022-10-27  8:07         ` Luben Tuikov
  0 siblings, 1 reply; 15+ messages in thread
From: Luben Tuikov @ 2022-10-27  7:01 UTC (permalink / raw)
  To: Alex Deucher, brolerliew; +Cc: linux-kernel, dri-devel

On 2022-10-25 13:50, Luben Tuikov wrote:
> Looking...
> 
> Regards,
> Luben
> 
> On 2022-10-25 09:35, Alex Deucher wrote:
>> + Luben
>>
>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>>>
>>> When entity move from one rq to another, current_entity will be set to NULL
>>> if it is the moving entity. This make entities close to rq head got
>>> selected more frequently, especially when doing load balance between
>>> multiple drm_gpu_scheduler.
>>>
>>> Make current_entity to next when removing from rq.
>>>
>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>>> ---
>>>  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 2fab218d7082..00b22cc50f08 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>>         spin_lock(&rq->lock);
>>>
>>>         atomic_dec(rq->sched->score);
>>> -       list_del_init(&entity->list);
>>>
>>>         if (rq->current_entity == entity)
>>> -               rq->current_entity = NULL;
>>> +               rq->current_entity = list_next_entry(entity, list);
>>> +
>>> +       list_del_init(&entity->list);
>>>
>>>         if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>>                 drm_sched_rq_remove_fifo_locked(entity);
>>> --
>>> 2.34.1
>>>
> 

Looks good. I'll pick it up into some other changes I've in tow, and repost
along with my changes, as they're somewhat related.

Regards,
Luben


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-27  7:01     ` Luben Tuikov
@ 2022-10-27  8:07         ` Luben Tuikov
  0 siblings, 0 replies; 15+ messages in thread
From: Luben Tuikov @ 2022-10-27  8:07 UTC (permalink / raw)
  To: brolerliew, Koenig, Christian; +Cc: Alex Deucher, linux-kernel, dri-devel

On 2022-10-27 03:01, Luben Tuikov wrote:
> On 2022-10-25 13:50, Luben Tuikov wrote:
>> Looking...
>>
>> Regards,
>> Luben
>>
>> On 2022-10-25 09:35, Alex Deucher wrote:
>>> + Luben
>>>
>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>>>>
>>>> When entity move from one rq to another, current_entity will be set to NULL
>>>> if it is the moving entity. This make entities close to rq head got
>>>> selected more frequently, especially when doing load balance between
>>>> multiple drm_gpu_scheduler.
>>>>
>>>> Make current_entity to next when removing from rq.
>>>>
>>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>>>> ---
>>>>  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 2fab218d7082..00b22cc50f08 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>>>         spin_lock(&rq->lock);
>>>>
>>>>         atomic_dec(rq->sched->score);
>>>> -       list_del_init(&entity->list);
>>>>
>>>>         if (rq->current_entity == entity)
>>>> -               rq->current_entity = NULL;
>>>> +               rq->current_entity = list_next_entry(entity, list);
>>>> +
>>>> +       list_del_init(&entity->list);
>>>>
>>>>         if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>>>                 drm_sched_rq_remove_fifo_locked(entity);
>>>> --
>>>> 2.34.1
>>>>
>>
> 
> Looks good. I'll pick it up into some other changes I've in tow, and repost
> along with my changes, as they're somewhat related.

Actually, the more I look at it, the more I think that we do want to set
rq->current_entity to NULL in that function, in order to pick the next best entity
(or scheduler for that matter), the next time around. See sched_entity.c,
and drm_sched_rq_select_entity() where we start evaluating from the _next_
entity.

So, it is best to leave it as it is, setting it to NULL, for now.

Regards,
Luben


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
@ 2022-10-27  8:07         ` Luben Tuikov
  0 siblings, 0 replies; 15+ messages in thread
From: Luben Tuikov @ 2022-10-27  8:07 UTC (permalink / raw)
  To: brolerliew, Koenig, Christian; +Cc: linux-kernel, dri-devel

On 2022-10-27 03:01, Luben Tuikov wrote:
> On 2022-10-25 13:50, Luben Tuikov wrote:
>> Looking...
>>
>> Regards,
>> Luben
>>
>> On 2022-10-25 09:35, Alex Deucher wrote:
>>> + Luben
>>>
>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>>>>
>>>> When entity move from one rq to another, current_entity will be set to NULL
>>>> if it is the moving entity. This make entities close to rq head got
>>>> selected more frequently, especially when doing load balance between
>>>> multiple drm_gpu_scheduler.
>>>>
>>>> Make current_entity to next when removing from rq.
>>>>
>>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>>>> ---
>>>>  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 2fab218d7082..00b22cc50f08 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>>>         spin_lock(&rq->lock);
>>>>
>>>>         atomic_dec(rq->sched->score);
>>>> -       list_del_init(&entity->list);
>>>>
>>>>         if (rq->current_entity == entity)
>>>> -               rq->current_entity = NULL;
>>>> +               rq->current_entity = list_next_entry(entity, list);
>>>> +
>>>> +       list_del_init(&entity->list);
>>>>
>>>>         if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>>>                 drm_sched_rq_remove_fifo_locked(entity);
>>>> --
>>>> 2.34.1
>>>>
>>
> 
> Looks good. I'll pick it up into some other changes I've in tow, and repost
> along with my changes, as they're somewhat related.

Actually, the more I look at it, the more I think that we do want to set
rq->current_entity to NULL in that function, in order to pick the next best entity
(or scheduler for that matter), the next time around. See sched_entity.c,
and drm_sched_rq_select_entity() where we start evaluating from the _next_
entity.

So, it is best to leave it to set it to NULL, for now.

Regards,
Luben


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-27  8:07         ` Luben Tuikov
@ 2022-10-27  8:19           ` Christian König
  -1 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2022-10-27  8:19 UTC (permalink / raw)
  To: Luben Tuikov, brolerliew; +Cc: Alex Deucher, linux-kernel, dri-devel

Am 27.10.22 um 10:07 schrieb Luben Tuikov:
> On 2022-10-27 03:01, Luben Tuikov wrote:
>> On 2022-10-25 13:50, Luben Tuikov wrote:
>>> Looking...
>>>
>>> Regards,
>>> Luben
>>>
>>> On 2022-10-25 09:35, Alex Deucher wrote:
>>>> + Luben
>>>>
>>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>>>>> When entity move from one rq to another, current_entity will be set to NULL
>>>>> if it is the moving entity. This make entities close to rq head got
>>>>> selected more frequently, especially when doing load balance between
>>>>> multiple drm_gpu_scheduler.
>>>>>
>>>>> Make current_entity to next when removing from rq.
>>>>>
>>>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>>>>> ---
>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 2fab218d7082..00b22cc50f08 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>>>>          spin_lock(&rq->lock);
>>>>>
>>>>>          atomic_dec(rq->sched->score);
>>>>> -       list_del_init(&entity->list);
>>>>>
>>>>>          if (rq->current_entity == entity)
>>>>> -               rq->current_entity = NULL;
>>>>> +               rq->current_entity = list_next_entry(entity, list);
>>>>> +
>>>>> +       list_del_init(&entity->list);
>>>>>
>>>>>          if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>>>>                  drm_sched_rq_remove_fifo_locked(entity);
>>>>> --
>>>>> 2.34.1
>>>>>
>> Looks good. I'll pick it up into some other changes I've in tow, and repost
>> along with my changes, as they're somewhat related.
> Actually, the more I look at it, the more I think that we do want to set
> rq->current_entity to NULL in that function, in order to pick the next best entity
> (or scheduler for that matter), the next time around. See sched_entity.c,
> and drm_sched_rq_select_entity() where we start evaluating from the _next_
> entity.
>
> So, it is best to leave it to set it to NULL, for now.

Apart from that, this patch could cause a crash when the entity is
the last one in the list.

In this case current_entity would be set to an incorrect upcast
of the head of the list.

Regards,
Christian.

>
> Regards,
> Luben
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-27  8:19           ` Christian König
@ 2022-10-27  8:24             ` Luben Tuikov
  -1 siblings, 0 replies; 15+ messages in thread
From: Luben Tuikov @ 2022-10-27  8:24 UTC (permalink / raw)
  To: Christian König, brolerliew; +Cc: Alex Deucher, linux-kernel, dri-devel

On 2022-10-27 04:19, Christian König wrote:
> Am 27.10.22 um 10:07 schrieb Luben Tuikov:
>> On 2022-10-27 03:01, Luben Tuikov wrote:
>>> On 2022-10-25 13:50, Luben Tuikov wrote:
>>>> Looking...
>>>>
>>>> Regards,
>>>> Luben
>>>>
>>>> On 2022-10-25 09:35, Alex Deucher wrote:
>>>>> + Luben
>>>>>
>>>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>>>>>> When entity move from one rq to another, current_entity will be set to NULL
>>>>>> if it is the moving entity. This make entities close to rq head got
>>>>>> selected more frequently, especially when doing load balance between
>>>>>> multiple drm_gpu_scheduler.
>>>>>>
>>>>>> Make current_entity to next when removing from rq.
>>>>>>
>>>>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>>>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index 2fab218d7082..00b22cc50f08 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>>>>>          spin_lock(&rq->lock);
>>>>>>
>>>>>>          atomic_dec(rq->sched->score);
>>>>>> -       list_del_init(&entity->list);
>>>>>>
>>>>>>          if (rq->current_entity == entity)
>>>>>> -               rq->current_entity = NULL;
>>>>>> +               rq->current_entity = list_next_entry(entity, list);
>>>>>> +
>>>>>> +       list_del_init(&entity->list);
>>>>>>
>>>>>>          if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>>>>>                  drm_sched_rq_remove_fifo_locked(entity);
>>>>>> --
>>>>>> 2.34.1
>>>>>>
>>> Looks good. I'll pick it up into some other changes I've in tow, and repost
>>> along with my changes, as they're somewhat related.
>> Actually, the more I look at it, the more I think that we do want to set
>> rq->current_entity to NULL in that function, in order to pick the next best entity
>> (or scheduler for that matter), the next time around. See sched_entity.c,
>> and drm_sched_rq_select_entity() where we start evaluating from the _next_
>> entity.
>>
>> So, it is best to leave it to set it to NULL, for now.
> 
> Apart from that this patch here could cause a crash when the entity is 
> the last one in the list.
> 
> In this case current current_entity would be set to an incorrect upcast 
> of the head of the list.

Absolutely. I saw that, but in rejecting the patch, I didn't feel the need to mention it.

Thanks for looking into this.

Regards,
Luben


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-27  8:24             ` Luben Tuikov
  (?)
@ 2022-10-27  9:00             ` broler Liew
  2022-10-27  9:08               ` Christian König
  -1 siblings, 1 reply; 15+ messages in thread
From: broler Liew @ 2022-10-27  9:00 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Christian König, dri-devel, linux-kernel

Thank you all for figuring out that it may crash when the entity is the
last one in the list.   Absolutely, I made a mistake about that.
But I still cannot understand why we need to restart the selection from the
list head when the current entity is removed from the rq.
In drm_sched_rq_select_entity, starting from the head may cause the first
entity to be selected more often than the others, which breaks the fairness
the scheduler wants to achieve.
Maybe the previous entry is the better choice when current_entity == entity?

Luben Tuikov <luben.tuikov@amd.com> 于2022年10月27日周四 16:24写道:

> On 2022-10-27 04:19, Christian König wrote:
> > Am 27.10.22 um 10:07 schrieb Luben Tuikov:
> >> On 2022-10-27 03:01, Luben Tuikov wrote:
> >>> On 2022-10-25 13:50, Luben Tuikov wrote:
> >>>> Looking...
> >>>>
> >>>> Regards,
> >>>> Luben
> >>>>
> >>>> On 2022-10-25 09:35, Alex Deucher wrote:
> >>>>> + Luben
> >>>>>
> >>>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com>
> wrote:
> >>>>>> When entity move from one rq to another, current_entity will be set
> to NULL
> >>>>>> if it is the moving entity. This make entities close to rq head got
> >>>>>> selected more frequently, especially when doing load balance between
> >>>>>> multiple drm_gpu_scheduler.
> >>>>>>
> >>>>>> Make current_entity to next when removing from rq.
> >>>>>>
> >>>>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
> >>>>>> ---
> >>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
> >>>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>> index 2fab218d7082..00b22cc50f08 100644
> >>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct
> drm_sched_rq *rq,
> >>>>>>          spin_lock(&rq->lock);
> >>>>>>
> >>>>>>          atomic_dec(rq->sched->score);
> >>>>>> -       list_del_init(&entity->list);
> >>>>>>
> >>>>>>          if (rq->current_entity == entity)
> >>>>>> -               rq->current_entity = NULL;
> >>>>>> +               rq->current_entity = list_next_entry(entity, list);
> >>>>>> +
> >>>>>> +       list_del_init(&entity->list);
> >>>>>>
> >>>>>>          if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> >>>>>>                  drm_sched_rq_remove_fifo_locked(entity);
> >>>>>> --
> >>>>>> 2.34.1
> >>>>>>
> >>> Looks good. I'll pick it up into some other changes I've in tow, and
> repost
> >>> along with my changes, as they're somewhat related.
> >> Actually, the more I look at it, the more I think that we do want to set
> >> rq->current_entity to NULL in that function, in order to pick the next
> best entity
> >> (or scheduler for that matter), the next time around. See
> sched_entity.c,
> >> and drm_sched_rq_select_entity() where we start evaluating from the
> _next_
> >> entity.
> >>
> >> So, it is best to leave it to set it to NULL, for now.
> >
> > Apart from that this patch here could cause a crash when the entity is
> > the last one in the list.
> >
> > In this case current current_entity would be set to an incorrect upcast
> > of the head of the list.
>
> Absolutely. I saw that, but in rejecting the patch, I didn't feel the need
> to mention it.
>
> Thanks for looking into this.
>
> Regards,
> Luben
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-27  9:00             ` broler Liew
@ 2022-10-27  9:08               ` Christian König
  2022-10-27 15:46                   ` Luben Tuikov
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2022-10-27  9:08 UTC (permalink / raw)
  To: broler Liew, Luben Tuikov; +Cc: linux-kernel, dri-devel

Am 27.10.22 um 11:00 schrieb broler Liew:
> It's very nice of you-all to finger it out that it may crash when it 
> is the last entity in the list.   Absolutely I made a mistake about that.
> But I still cannot understand why we need to restart the selection 
> from the list head when the current entity is removed from rq.
> In drm_sched_rq_select_entity, starting from head may cause the first 
> entity to be selected more often than others, which breaks the equal 
> rule the scheduler wants to achieve.
> Maybe the previous one is the better choice when current_entity == entity?

That's a good argument, but we want to get rid of the round-robin
algorithm anyway and switch over to the FIFO.

So this is code which is no longer used by default, and improving it
doesn't make much sense.

Regards,
Christian.

>
> Luben Tuikov <luben.tuikov@amd.com> wrote on Thu, 27 Oct 2022 at 16:24:
>
>     On 2022-10-27 04:19, Christian König wrote:
>     > On 27.10.22 at 10:07, Luben Tuikov wrote:
>     >> On 2022-10-27 03:01, Luben Tuikov wrote:
>     >>> On 2022-10-25 13:50, Luben Tuikov wrote:
>     >>>> Looking...
>     >>>>
>     >>>> Regards,
>     >>>> Luben
>     >>>>
>     >>>> On 2022-10-25 09:35, Alex Deucher wrote:
>     >>>>> + Luben
>     >>>>>
>     >>>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>     >>>>>> When entity move from one rq to another, current_entity will be set to NULL
>     >>>>>> if it is the moving entity. This make entities close to rq head got
>     >>>>>> selected more frequently, especially when doing load balance between
>     >>>>>> multiple drm_gpu_scheduler.
>     >>>>>>
>     >>>>>> Make current_entity to next when removing from rq.
>     >>>>>>
>     >>>>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>     >>>>>> ---
>     >>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>     >>>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>     >>>>>>
>     >>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>     >>>>>> index 2fab218d7082..00b22cc50f08 100644
>     >>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>     >>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>     >>>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>     >>>>>>          spin_lock(&rq->lock);
>     >>>>>>
>     >>>>>>          atomic_dec(rq->sched->score);
>     >>>>>> -       list_del_init(&entity->list);
>     >>>>>>
>     >>>>>>          if (rq->current_entity == entity)
>     >>>>>> -               rq->current_entity = NULL;
>     >>>>>> +               rq->current_entity = list_next_entry(entity, list);
>     >>>>>> +
>     >>>>>> +       list_del_init(&entity->list);
>     >>>>>>
>     >>>>>>          if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>     >>>>>>                  drm_sched_rq_remove_fifo_locked(entity);
>     >>>>>> --
>     >>>>>> 2.34.1
>     >>>>>>
>     >>> Looks good. I'll pick it up into some other changes I've in tow, and repost
>     >>> along with my changes, as they're somewhat related.
>     >> Actually, the more I look at it, the more I think that we do want to set
>     >> rq->current_entity to NULL in that function, in order to pick the next best entity
>     >> (or scheduler for that matter), the next time around. See sched_entity.c,
>     >> and drm_sched_rq_select_entity() where we start evaluating from the _next_
>     >> entity.
>     >>
>     >> So, it is best to leave it to set it to NULL, for now.
>     >
>     > Apart from that this patch here could cause a crash when the entity is
>     > the last one in the list.
>     >
>     > In this case current_entity would be set to an incorrect upcast
>     > of the head of the list.
>
>     Absolutely. I saw that, but in rejecting the patch, I didn't feel
>     the need to mention it.
>
>     Thanks for looking into this.
>
>     Regards,
>     Luben
>

[-- Attachment #2: Type: text/html, Size: 7230 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/scheduler: set current_entity to next when remove from rq
  2022-10-27  9:08               ` Christian König
@ 2022-10-27 15:46                   ` Luben Tuikov
  0 siblings, 0 replies; 15+ messages in thread
From: Luben Tuikov @ 2022-10-27 15:46 UTC (permalink / raw)
  To: Christian König, broler Liew; +Cc: linux-kernel, dri-devel

So, I started fixing this, including the bug of taking the next element as an entity when it could actually be the list_head (a la your patch, with that fixed), and then went down the rabbit hole of also fixing drm_sched_rq_select_entity(). But the problem is that at that point we don't know whether we should start from the _next_ entity (as is currently the case) or from the current entity (a la list_for_each_entry_from()), as would be the case with this patch (if it were fixed for the list_head bug).

Meanwhile, elsewhere in the GPU scheduler (sched_entity.c), we do want to start from rq->current_entity->next, and picking "next" in drm_sched_rq_remove_entity() would then skip an entity, or favor an entity twice. This is why the function is called drm_sched_rq_remove_entity() and not drm_sched_rq_next_entity_or_null().

So all this work seemed moot: we have already switched to FIFO-based scheduling in drm-misc-next, and it's been working alright, so I didn't see a point in developing the round-robin path further.

Regards,
Luben


On 2022-10-27 05:08, Christian König wrote:
> On 27.10.22 at 11:00, broler Liew wrote:
>> It's very nice of you all to figure out that it may crash when it is the last entity in the list. Absolutely I made a mistake about that.
>> But I still cannot understand why we need to restart the selection from the list head when the current entity is removed from rq.
>> In drm_sched_rq_select_entity, starting from head may cause the first entity to be selected more often than others, which breaks the equal rule the scheduler wants to achieve.
>> Maybe the previous one is the better choice when current_entity == entity?
> 
> That's a good argument, but we want to get rid of the round robin algorithm anyway and switch over to the fifo.
> 
> So this is some code which is already not used by default any more and improving it doesn't make much sense.
> 
> Regards,
> Christian.
> 
>>
>> Luben Tuikov <luben.tuikov@amd.com> wrote on Thu, 27 Oct 2022 at 16:24:
>>
>>     On 2022-10-27 04:19, Christian König wrote:
>>     > On 27.10.22 at 10:07, Luben Tuikov wrote:
>>     >> On 2022-10-27 03:01, Luben Tuikov wrote:
>>     >>> On 2022-10-25 13:50, Luben Tuikov wrote:
>>     >>>> Looking...
>>     >>>>
>>     >>>> Regards,
>>     >>>> Luben
>>     >>>>
>>     >>>> On 2022-10-25 09:35, Alex Deucher wrote:
>>     >>>>> + Luben
>>     >>>>>
>>     >>>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote:
>>     >>>>>> When entity move from one rq to another, current_entity will be set to NULL
>>     >>>>>> if it is the moving entity. This make entities close to rq head got
>>     >>>>>> selected more frequently, especially when doing load balance between
>>     >>>>>> multiple drm_gpu_scheduler.
>>     >>>>>>
>>     >>>>>> Make current_entity to next when removing from rq.
>>     >>>>>>
>>     >>>>>> Signed-off-by: brolerliew <brolerliew@gmail.com>
>>     >>>>>> ---
>>     >>>>>>   drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>>     >>>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>     >>>>>>
>>     >>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>     >>>>>> index 2fab218d7082..00b22cc50f08 100644
>>     >>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>     >>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>     >>>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>     >>>>>>          spin_lock(&rq->lock);
>>     >>>>>>
>>     >>>>>>          atomic_dec(rq->sched->score);
>>     >>>>>> -       list_del_init(&entity->list);
>>     >>>>>>
>>     >>>>>>          if (rq->current_entity == entity)
>>     >>>>>> -               rq->current_entity = NULL;
>>     >>>>>> +               rq->current_entity = list_next_entry(entity, list);
>>     >>>>>> +
>>     >>>>>> +       list_del_init(&entity->list);
>>     >>>>>>
>>     >>>>>>          if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>     >>>>>>                  drm_sched_rq_remove_fifo_locked(entity);
>>     >>>>>> --
>>     >>>>>> 2.34.1
>>     >>>>>>
>>     >>> Looks good. I'll pick it up into some other changes I've in tow, and repost
>>     >>> along with my changes, as they're somewhat related.
>>     >> Actually, the more I look at it, the more I think that we do want to set
>>     >> rq->current_entity to NULL in that function, in order to pick the next best entity
>>     >> (or scheduler for that matter), the next time around. See sched_entity.c,
>>     >> and drm_sched_rq_select_entity() where we start evaluating from the _next_
>>     >> entity.
>>     >>
>>     >> So, it is best to leave it to set it to NULL, for now.
>>     >
>>     > Apart from that this patch here could cause a crash when the entity is
>>     > the last one in the list.
>>     >
>>     > In this case current_entity would be set to an incorrect upcast
>>     > of the head of the list.
>>
>>     Absolutely. I saw that, but in rejecting the patch, I didn't feel the need to mention it.
>>
>>     Thanks for looking into this.
>>
>>     Regards,
>>     Luben
>>
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread


end of thread, other threads:[~2022-10-27 15:46 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-25  6:18 [PATCH] drm/scheduler: set current_entity to next when remove from rq brolerliew
2022-10-25  6:18 ` brolerliew
2022-10-25 13:35 ` Alex Deucher
2022-10-25 17:50   ` Luben Tuikov
2022-10-27  7:01     ` Luben Tuikov
2022-10-27  8:07       ` Luben Tuikov
2022-10-27  8:07         ` Luben Tuikov
2022-10-27  8:19         ` Christian König
2022-10-27  8:19           ` Christian König
2022-10-27  8:24           ` Luben Tuikov
2022-10-27  8:24             ` Luben Tuikov
2022-10-27  9:00             ` broler Liew
2022-10-27  9:08               ` Christian König
2022-10-27 15:46                 ` Luben Tuikov
2022-10-27 15:46                   ` Luben Tuikov
