amd-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/scheduler: fix race condition in load balancer
@ 2020-01-14 15:43 Nirmoy Das
  2020-01-14 16:01 ` Christian König
  0 siblings, 1 reply; 10+ messages in thread
From: Nirmoy Das @ 2020-01-14 15:43 UTC (permalink / raw)
  To: amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, christian.koenig,
	pierre-eric.pelloux-prayer

Jobs submitted in an entity should execute in the order those jobs
are submitted. We make sure that by checking entity->job_queue in
drm_sched_entity_select_rq() so that we don't loadbalance jobs within
an entity.

But because we update entity->job_queue later in drm_sched_entity_push_job(),
there remains a open window when it is possibe that entity->rq might get
updated by drm_sched_entity_select_rq() which should not be allowed.

Changes in this part also improves job distribution.
Below are test results after running amdgpu_test from mesa drm

Before this patch:

sched_name     num of many times it got scheduled
=========      ==================================
sdma0          314
sdma1          32
comp_1.0.0     56
comp_1.1.0     0
comp_1.1.1     0
comp_1.2.0     0
comp_1.2.1     0
comp_1.3.0     0
comp_1.3.1     0

After this patch:

sched_name     num of many times it got scheduled
=========      ==================================
 sdma1          243
 sdma0          164
 comp_1.0.1     14
 comp_1.1.0     11
 comp_1.1.1     10
 comp_1.2.0     15
 comp_1.2.1     14
 comp_1.3.0     10
 comp_1.3.1     10

Fixes: 35e160e781a048 (drm/scheduler: change entities rq even earlier)

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reported-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/scheduler/sched_entity.c | 9 +++++++--
 drivers/gpu/drm/scheduler/sched_main.c   | 1 +
 include/drm/gpu_scheduler.h              | 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 2e3a058fc239..8414e084b6ac 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -67,6 +67,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 	entity->priority = priority;
 	entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
 	entity->last_scheduled = NULL;
+	entity->loadbalance_on = true;
 
 	if(num_sched_list)
 		entity->rq = &sched_list[0]->sched_rq[entity->priority];
@@ -447,6 +448,9 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 	entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
 
 	spsc_queue_pop(&entity->job_queue);
+	if (!spsc_queue_count(&entity->job_queue))
+		entity->loadbalance_on = true;
+
 	return sched_job;
 }
 
@@ -463,7 +467,8 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
 	struct dma_fence *fence;
 	struct drm_sched_rq *rq;
 
-	if (spsc_queue_count(&entity->job_queue) || entity->num_sched_list <= 1)
+	atomic_inc(&entity->rq->sched->num_jobs);
+	if ((entity->num_sched_list <= 1) || !entity->loadbalance_on)
 		return;
 
 	fence = READ_ONCE(entity->last_scheduled);
@@ -477,6 +482,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
 		entity->rq = rq;
 	}
 
+	entity->loadbalance_on = false;
 	spin_unlock(&entity->rq_lock);
 }
 
@@ -498,7 +504,6 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
 	bool first;
 
 	trace_drm_sched_job(sched_job, entity);
-	atomic_inc(&entity->rq->sched->num_jobs);
 	WRITE_ONCE(entity->last_user, current->group_leader);
 	first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
 
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3fad5876a13f..00fdc350134e 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -562,6 +562,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		return -ENOENT;
 
 	sched = entity->rq->sched;
+	atomic_inc(&entity->rq->sched->num_jobs);
 
 	job->sched = sched;
 	job->entity = entity;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 96a1a1b7526e..a5190869d323 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -97,6 +97,7 @@ struct drm_sched_entity {
 	struct dma_fence                *last_scheduled;
 	struct task_struct		*last_user;
 	bool 				stopped;
+	bool				loadbalance_on;
 	struct completion		entity_idle;
 };
 
-- 
2.24.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-14 15:43 [PATCH] drm/scheduler: fix race condition in load balancer Nirmoy Das
@ 2020-01-14 16:01 ` Christian König
  2020-01-14 16:13   ` Nirmoy
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Christian König @ 2020-01-14 16:01 UTC (permalink / raw)
  To: Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, christian.koenig,
	pierre-eric.pelloux-prayer

Am 14.01.20 um 16:43 schrieb Nirmoy Das:
> Jobs submitted in an entity should execute in the order those jobs
> are submitted. We make sure that by checking entity->job_queue in
> drm_sched_entity_select_rq() so that we don't loadbalance jobs within
> an entity.
>
> But because we update entity->job_queue later in drm_sched_entity_push_job(),
> there remains a open window when it is possibe that entity->rq might get
> updated by drm_sched_entity_select_rq() which should not be allowed.

NAK, concurrent calls to 
drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in the 
first place or otherwise we mess up the fence sequence order and risk 
memory corruption.

>
> Changes in this part also improves job distribution.
> Below are test results after running amdgpu_test from mesa drm
>
> Before this patch:
>
> sched_name     num of many times it got scheduled
> =========      ==================================
> sdma0          314
> sdma1          32
> comp_1.0.0     56
> comp_1.1.0     0
> comp_1.1.1     0
> comp_1.2.0     0
> comp_1.2.1     0
> comp_1.3.0     0
> comp_1.3.1     0
>
> After this patch:
>
> sched_name     num of many times it got scheduled
> =========      ==================================
>   sdma1          243
>   sdma0          164
>   comp_1.0.1     14
>   comp_1.1.0     11
>   comp_1.1.1     10
>   comp_1.2.0     15
>   comp_1.2.1     14
>   comp_1.3.0     10
>   comp_1.3.1     10

Well that is still rather nice to have, why does that happen?

Christian.

>
> Fixes: 35e160e781a048 (drm/scheduler: change entities rq even earlier)
>
> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
> Reported-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_entity.c | 9 +++++++--
>   drivers/gpu/drm/scheduler/sched_main.c   | 1 +
>   include/drm/gpu_scheduler.h              | 1 +
>   3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 2e3a058fc239..8414e084b6ac 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -67,6 +67,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>   	entity->priority = priority;
>   	entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
>   	entity->last_scheduled = NULL;
> +	entity->loadbalance_on = true;
>   
>   	if(num_sched_list)
>   		entity->rq = &sched_list[0]->sched_rq[entity->priority];
> @@ -447,6 +448,9 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>   	entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
>   
>   	spsc_queue_pop(&entity->job_queue);
> +	if (!spsc_queue_count(&entity->job_queue))
> +		entity->loadbalance_on = true;
> +
>   	return sched_job;
>   }
>   
> @@ -463,7 +467,8 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>   	struct dma_fence *fence;
>   	struct drm_sched_rq *rq;
>   
> -	if (spsc_queue_count(&entity->job_queue) || entity->num_sched_list <= 1)
> +	atomic_inc(&entity->rq->sched->num_jobs);
> +	if ((entity->num_sched_list <= 1) || !entity->loadbalance_on)
>   		return;
>   
>   	fence = READ_ONCE(entity->last_scheduled);
> @@ -477,6 +482,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>   		entity->rq = rq;
>   	}
>   
> +	entity->loadbalance_on = false;
>   	spin_unlock(&entity->rq_lock);
>   }
>   
> @@ -498,7 +504,6 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
>   	bool first;
>   
>   	trace_drm_sched_job(sched_job, entity);
> -	atomic_inc(&entity->rq->sched->num_jobs);
>   	WRITE_ONCE(entity->last_user, current->group_leader);
>   	first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>   
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 3fad5876a13f..00fdc350134e 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -562,6 +562,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   		return -ENOENT;
>   
>   	sched = entity->rq->sched;
> +	atomic_inc(&entity->rq->sched->num_jobs);
>   
>   	job->sched = sched;
>   	job->entity = entity;
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 96a1a1b7526e..a5190869d323 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -97,6 +97,7 @@ struct drm_sched_entity {
>   	struct dma_fence                *last_scheduled;
>   	struct task_struct		*last_user;
>   	bool 				stopped;
> +	bool				loadbalance_on;
>   	struct completion		entity_idle;
>   };
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-14 16:01 ` Christian König
@ 2020-01-14 16:13   ` Nirmoy
  2020-01-14 16:20     ` Christian König
  2020-01-14 16:20   ` Nirmoy
  2020-01-15 11:04   ` Nirmoy
  2 siblings, 1 reply; 10+ messages in thread
From: Nirmoy @ 2020-01-14 16:13 UTC (permalink / raw)
  To: christian.koenig, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer


On 1/14/20 5:01 PM, Christian König wrote:
> Am 14.01.20 um 16:43 schrieb Nirmoy Das:
>> Jobs submitted in an entity should execute in the order those jobs
>> are submitted. We make sure that by checking entity->job_queue in
>> drm_sched_entity_select_rq() so that we don't loadbalance jobs within
>> an entity.
>>
>> But because we update entity->job_queue later in 
>> drm_sched_entity_push_job(),
>> there remains a open window when it is possibe that entity->rq might get
>> updated by drm_sched_entity_select_rq() which should not be allowed.
>
> NAK, concurrent calls to 
> drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in 
> the first place or otherwise we mess up the fence sequence order and 
> risk memory corruption.

>
>>
>> Changes in this part also improves job distribution.
>> Below are test results after running amdgpu_test from mesa drm
>>
>> Before this patch:
>>
>> sched_name     num of many times it got scheduled
>> =========      ==================================
>> sdma0          314
>> sdma1          32
>> comp_1.0.0     56
>> comp_1.1.0     0
>> comp_1.1.1     0
>> comp_1.2.0     0
>> comp_1.2.1     0
>> comp_1.3.0     0
>> comp_1.3.1     0
>>
>> After this patch:
>>
>> sched_name     num of many times it got scheduled
>> =========      ==================================
>>   sdma1          243
>>   sdma0          164
>>   comp_1.0.1     14
>>   comp_1.1.0     11
>>   comp_1.1.1     10
>>   comp_1.2.0     15
>>   comp_1.2.1     14
>>   comp_1.3.0     10
>>   comp_1.3.1     10
>
> Well that is still rather nice to have, why does that happen?

I think it is because we are updating num_jobs immediately after 
selecting a new rq. Previously  we do that way after

drm_sched_job_init() in drm_sched_entity_push_job(). The problem is if I 
just do

@@ -562,6 +562,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
           return -ENOENT;
         sched = entity->rq->sched;
+    atomic_inc(&entity->rq->sched->num_jobs);

  @@ -498,7 +504,6 @@ void drm_sched_entity_push_job(struct 
drm_sched_job *sched_job,
       bool first;
         trace_drm_sched_job(sched_job, entity);
-    atomic_inc(&entity->rq->sched->num_jobs);


num_jobs gets negative somewhere down the line somewhere. I am guessing  
it's hitting the race condition as I explained in the commit message


Regards,

Nirmoy

>
> Christian.
>
>>
>> Fixes: 35e160e781a048 (drm/scheduler: change entities rq even earlier)
>>
>> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
>> Reported-by: Pierre-Eric Pelloux-Prayer 
>> <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_entity.c | 9 +++++++--
>>   drivers/gpu/drm/scheduler/sched_main.c   | 1 +
>>   include/drm/gpu_scheduler.h              | 1 +
>>   3 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>> b/drivers/gpu/drm/scheduler/sched_entity.c
>> index 2e3a058fc239..8414e084b6ac 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -67,6 +67,7 @@ int drm_sched_entity_init(struct drm_sched_entity 
>> *entity,
>>       entity->priority = priority;
>>       entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
>>       entity->last_scheduled = NULL;
>> +    entity->loadbalance_on = true;
>>         if(num_sched_list)
>>           entity->rq = &sched_list[0]->sched_rq[entity->priority];
>> @@ -447,6 +448,9 @@ struct drm_sched_job 
>> *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>>       entity->last_scheduled = 
>> dma_fence_get(&sched_job->s_fence->finished);
>>         spsc_queue_pop(&entity->job_queue);
>> +    if (!spsc_queue_count(&entity->job_queue))
>> +        entity->loadbalance_on = true;
>> +
>>       return sched_job;
>>   }
>>   @@ -463,7 +467,8 @@ void drm_sched_entity_select_rq(struct 
>> drm_sched_entity *entity)
>>       struct dma_fence *fence;
>>       struct drm_sched_rq *rq;
>>   -    if (spsc_queue_count(&entity->job_queue) || 
>> entity->num_sched_list <= 1)
>> +    atomic_inc(&entity->rq->sched->num_jobs);
>> +    if ((entity->num_sched_list <= 1) || !entity->loadbalance_on)
>>           return;
>>         fence = READ_ONCE(entity->last_scheduled);
>> @@ -477,6 +482,7 @@ void drm_sched_entity_select_rq(struct 
>> drm_sched_entity *entity)
>>           entity->rq = rq;
>>       }
>>   +    entity->loadbalance_on = false;
>>       spin_unlock(&entity->rq_lock);
>>   }
>>   @@ -498,7 +504,6 @@ void drm_sched_entity_push_job(struct 
>> drm_sched_job *sched_job,
>>       bool first;
>>         trace_drm_sched_job(sched_job, entity);
>> -    atomic_inc(&entity->rq->sched->num_jobs);
>>       WRITE_ONCE(entity->last_user, current->group_leader);
>>       first = spsc_queue_push(&entity->job_queue, 
>> &sched_job->queue_node);
>>   diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 3fad5876a13f..00fdc350134e 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -562,6 +562,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>           return -ENOENT;
>>         sched = entity->rq->sched;
>> +    atomic_inc(&entity->rq->sched->num_jobs);
>>         job->sched = sched;
>>       job->entity = entity;
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 96a1a1b7526e..a5190869d323 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -97,6 +97,7 @@ struct drm_sched_entity {
>>       struct dma_fence                *last_scheduled;
>>       struct task_struct        *last_user;
>>       bool                 stopped;
>> +    bool                loadbalance_on;
>>       struct completion        entity_idle;
>>   };
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-14 16:13   ` Nirmoy
@ 2020-01-14 16:20     ` Christian König
  0 siblings, 0 replies; 10+ messages in thread
From: Christian König @ 2020-01-14 16:20 UTC (permalink / raw)
  To: Nirmoy, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer

Am 14.01.20 um 17:13 schrieb Nirmoy:
>
> On 1/14/20 5:01 PM, Christian König wrote:
>> Am 14.01.20 um 16:43 schrieb Nirmoy Das:
>>> Jobs submitted in an entity should execute in the order those jobs
>>> are submitted. We make sure that by checking entity->job_queue in
>>> drm_sched_entity_select_rq() so that we don't loadbalance jobs within
>>> an entity.
>>>
>>> But because we update entity->job_queue later in 
>>> drm_sched_entity_push_job(),
>>> there remains a open window when it is possibe that entity->rq might 
>>> get
>>> updated by drm_sched_entity_select_rq() which should not be allowed.
>>
>> NAK, concurrent calls to 
>> drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in 
>> the first place or otherwise we mess up the fence sequence order and 
>> risk memory corruption.
>
>>
>>>
>>> Changes in this part also improves job distribution.
>>> Below are test results after running amdgpu_test from mesa drm
>>>
>>> Before this patch:
>>>
>>> sched_name     num of many times it got scheduled
>>> =========      ==================================
>>> sdma0          314
>>> sdma1          32
>>> comp_1.0.0     56
>>> comp_1.1.0     0
>>> comp_1.1.1     0
>>> comp_1.2.0     0
>>> comp_1.2.1     0
>>> comp_1.3.0     0
>>> comp_1.3.1     0
>>>
>>> After this patch:
>>>
>>> sched_name     num of many times it got scheduled
>>> =========      ==================================
>>>   sdma1          243
>>>   sdma0          164
>>>   comp_1.0.1     14
>>>   comp_1.1.0     11
>>>   comp_1.1.1     10
>>>   comp_1.2.0     15
>>>   comp_1.2.1     14
>>>   comp_1.3.0     10
>>>   comp_1.3.1     10
>>
>> Well that is still rather nice to have, why does that happen?
>
> I think it is because we are updating num_jobs immediately after 
> selecting a new rq. Previously  we do that way after
>
> drm_sched_job_init() in drm_sched_entity_push_job(). The problem is if 
> I just do
>
> @@ -562,6 +562,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>           return -ENOENT;
>         sched = entity->rq->sched;
> +    atomic_inc(&entity->rq->sched->num_jobs);
>
>  @@ -498,7 +504,6 @@ void drm_sched_entity_push_job(struct 
> drm_sched_job *sched_job,
>       bool first;
>         trace_drm_sched_job(sched_job, entity);
> -    atomic_inc(&entity->rq->sched->num_jobs);
>
>
> num_jobs gets negative somewhere down the line somewhere. I am 
> guessing  it's hitting the race condition as I explained in the commit 
> message

The race condition you explain in the commit message should be 
impossible to hit or we have much much larger problems than just an 
incorrect job count.

Incrementing num_jobs so early is not possible either cause the job 
might not get pushed to the entity because of an error.

Christian.

>
>
> Regards,
>
> Nirmoy
>
>>
>> Christian.
>>
>>>
>>> Fixes: 35e160e781a048 (drm/scheduler: change entities rq even earlier)
>>>
>>> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
>>> Reported-by: Pierre-Eric Pelloux-Prayer 
>>> <pierre-eric.pelloux-prayer@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_entity.c | 9 +++++++--
>>>   drivers/gpu/drm/scheduler/sched_main.c   | 1 +
>>>   include/drm/gpu_scheduler.h              | 1 +
>>>   3 files changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 2e3a058fc239..8414e084b6ac 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -67,6 +67,7 @@ int drm_sched_entity_init(struct drm_sched_entity 
>>> *entity,
>>>       entity->priority = priority;
>>>       entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
>>>       entity->last_scheduled = NULL;
>>> +    entity->loadbalance_on = true;
>>>         if(num_sched_list)
>>>           entity->rq = &sched_list[0]->sched_rq[entity->priority];
>>> @@ -447,6 +448,9 @@ struct drm_sched_job 
>>> *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>>>       entity->last_scheduled = 
>>> dma_fence_get(&sched_job->s_fence->finished);
>>>         spsc_queue_pop(&entity->job_queue);
>>> +    if (!spsc_queue_count(&entity->job_queue))
>>> +        entity->loadbalance_on = true;
>>> +
>>>       return sched_job;
>>>   }
>>>   @@ -463,7 +467,8 @@ void drm_sched_entity_select_rq(struct 
>>> drm_sched_entity *entity)
>>>       struct dma_fence *fence;
>>>       struct drm_sched_rq *rq;
>>>   -    if (spsc_queue_count(&entity->job_queue) || 
>>> entity->num_sched_list <= 1)
>>> +    atomic_inc(&entity->rq->sched->num_jobs);
>>> +    if ((entity->num_sched_list <= 1) || !entity->loadbalance_on)
>>>           return;
>>>         fence = READ_ONCE(entity->last_scheduled);
>>> @@ -477,6 +482,7 @@ void drm_sched_entity_select_rq(struct 
>>> drm_sched_entity *entity)
>>>           entity->rq = rq;
>>>       }
>>>   +    entity->loadbalance_on = false;
>>>       spin_unlock(&entity->rq_lock);
>>>   }
>>>   @@ -498,7 +504,6 @@ void drm_sched_entity_push_job(struct 
>>> drm_sched_job *sched_job,
>>>       bool first;
>>>         trace_drm_sched_job(sched_job, entity);
>>> -    atomic_inc(&entity->rq->sched->num_jobs);
>>>       WRITE_ONCE(entity->last_user, current->group_leader);
>>>       first = spsc_queue_push(&entity->job_queue, 
>>> &sched_job->queue_node);
>>>   diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 3fad5876a13f..00fdc350134e 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -562,6 +562,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>           return -ENOENT;
>>>         sched = entity->rq->sched;
>>> +    atomic_inc(&entity->rq->sched->num_jobs);
>>>         job->sched = sched;
>>>       job->entity = entity;
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 96a1a1b7526e..a5190869d323 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -97,6 +97,7 @@ struct drm_sched_entity {
>>>       struct dma_fence                *last_scheduled;
>>>       struct task_struct        *last_user;
>>>       bool                 stopped;
>>> +    bool                loadbalance_on;
>>>       struct completion        entity_idle;
>>>   };
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-14 16:01 ` Christian König
  2020-01-14 16:13   ` Nirmoy
@ 2020-01-14 16:20   ` Nirmoy
  2020-01-14 16:23     ` Christian König
  2020-01-15 11:04   ` Nirmoy
  2 siblings, 1 reply; 10+ messages in thread
From: Nirmoy @ 2020-01-14 16:20 UTC (permalink / raw)
  To: christian.koenig, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer


On 1/14/20 5:01 PM, Christian König wrote:
> Am 14.01.20 um 16:43 schrieb Nirmoy Das:
>> Jobs submitted in an entity should execute in the order those jobs
>> are submitted. We make sure that by checking entity->job_queue in
>> drm_sched_entity_select_rq() so that we don't loadbalance jobs within
>> an entity.
>>
>> But because we update entity->job_queue later in 
>> drm_sched_entity_push_job(),
>> there remains a open window when it is possibe that entity->rq might get
>> updated by drm_sched_entity_select_rq() which should not be allowed.
>
> NAK, concurrent calls to 
> drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in 
> the first place or otherwise we mess up the fence sequence order and 
> risk memory corruption.
if I am not missing something, I don't see any lock securing 
drm_sched_job_init()/drm_sched_entity_push_job() calls in 
amdgpu_cs_submit().


Regards,

Nirmoy

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-14 16:20   ` Nirmoy
@ 2020-01-14 16:23     ` Christian König
  2020-01-14 16:27       ` Nirmoy
  0 siblings, 1 reply; 10+ messages in thread
From: Christian König @ 2020-01-14 16:23 UTC (permalink / raw)
  To: Nirmoy, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer

Am 14.01.20 um 17:20 schrieb Nirmoy:
>
> On 1/14/20 5:01 PM, Christian König wrote:
>> Am 14.01.20 um 16:43 schrieb Nirmoy Das:
>>> Jobs submitted in an entity should execute in the order those jobs
>>> are submitted. We make sure that by checking entity->job_queue in
>>> drm_sched_entity_select_rq() so that we don't loadbalance jobs within
>>> an entity.
>>>
>>> But because we update entity->job_queue later in 
>>> drm_sched_entity_push_job(),
>>> there remains a open window when it is possibe that entity->rq might 
>>> get
>>> updated by drm_sched_entity_select_rq() which should not be allowed.
>>
>> NAK, concurrent calls to 
>> drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in 
>> the first place or otherwise we mess up the fence sequence order and 
>> risk memory corruption.
> if I am not missing something, I don't see any lock securing 
> drm_sched_job_init()/drm_sched_entity_push_job() calls in 
> amdgpu_cs_submit().

See one step up in the call chain, function amdgpu_cs_ioctl().

This is locking the page tables, which also makes access to the context 
and entities mutual exclusive:
>         r = amdgpu_cs_parser_bos(&parser, data);
...
>         r = amdgpu_cs_submit(&parser, cs);
>
> out:

And here the page tables are unlocked again:
>         amdgpu_cs_parser_fini(&parser, r, reserved_buffers);

Regards,
Christian.

>
>
> Regards,
>
> Nirmoy
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-14 16:23     ` Christian König
@ 2020-01-14 16:27       ` Nirmoy
  0 siblings, 0 replies; 10+ messages in thread
From: Nirmoy @ 2020-01-14 16:27 UTC (permalink / raw)
  To: Christian König, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer


On 1/14/20 5:23 PM, Christian König wrote:
> Am 14.01.20 um 17:20 schrieb Nirmoy:
>>
>> On 1/14/20 5:01 PM, Christian König wrote:
>>> Am 14.01.20 um 16:43 schrieb Nirmoy Das:
>>>> Jobs submitted in an entity should execute in the order those jobs
>>>> are submitted. We make sure that by checking entity->job_queue in
>>>> drm_sched_entity_select_rq() so that we don't loadbalance jobs within
>>>> an entity.
>>>>
>>>> But because we update entity->job_queue later in 
>>>> drm_sched_entity_push_job(),
>>>> there remains a open window when it is possibe that entity->rq 
>>>> might get
>>>> updated by drm_sched_entity_select_rq() which should not be allowed.
>>>
>>> NAK, concurrent calls to 
>>> drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in 
>>> the first place or otherwise we mess up the fence sequence order and 
>>> risk memory corruption.
>> if I am not missing something, I don't see any lock securing 
>> drm_sched_job_init()/drm_sched_entity_push_job() calls in 
>> amdgpu_cs_submit().
>
> See one step up in the call chain, function amdgpu_cs_ioctl().
>
> This is locking the page tables, which also makes access to the 
> context and entities mutual exclusive:
>>         r = amdgpu_cs_parser_bos(&parser, data);
> ...
>>         r = amdgpu_cs_submit(&parser, cs);
>>
>> out:
>
> And here the page tables are unlocked again:
>>         amdgpu_cs_parser_fini(&parser, r, reserved_buffers);

Okay. Then something else is going on. Let me dig more.


Thanks,

Nirmoy

>
> Regards,
> Christian.
>
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-14 16:01 ` Christian König
  2020-01-14 16:13   ` Nirmoy
  2020-01-14 16:20   ` Nirmoy
@ 2020-01-15 11:04   ` Nirmoy
  2020-01-15 12:52     ` Christian König
  2 siblings, 1 reply; 10+ messages in thread
From: Nirmoy @ 2020-01-15 11:04 UTC (permalink / raw)
  To: christian.koenig, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer

Hi Christian,

On 1/14/20 5:01 PM, Christian König wrote:
>
>> Before this patch:
>>
>> sched_name     num of many times it got scheduled
>> =========      ==================================
>> sdma0          314
>> sdma1          32
>> comp_1.0.0     56
>> comp_1.1.0     0
>> comp_1.1.1     0
>> comp_1.2.0     0
>> comp_1.2.1     0
>> comp_1.3.0     0
>> comp_1.3.1     0
>>
>> After this patch:
>>
>> sched_name     num of many times it got scheduled
>> =========      ==================================
>>   sdma1          243
>>   sdma0          164
>>   comp_1.0.1     14
>>   comp_1.1.0     11
>>   comp_1.1.1     10
>>   comp_1.2.0     15
>>   comp_1.2.1     14
>>   comp_1.3.0     10
>>   comp_1.3.1     10
>
> Well that is still rather nice to have, why does that happen?

I think I know why it happens. At init all entity's rq gets assigned to 
sched_list[0]. I put some prints to check what we compare in 
drm_sched_entity_get_free_sched.

It turns out most of the time it compares zero values(num_jobs(0) < 
min_jobs(0)) so most of the time 1st rq(sdma0, comp_1.0.0) was picked by 
drm_sched_entity_get_free_sched.


This patch was not correct , had an extra atomic_inc(num_jobs) in 
drm_sched_job_init. This probably added bit of randomness I think, which 
helped in better job distribution.

I've updated my previous RFC patch which uses time consumed by each 
sched for load balance with a twist of ignoring previously scheduled 
sched/rq. Let me know what do you think.


Regards,

Nirmoy

>
> Christian.
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-15 11:04   ` Nirmoy
@ 2020-01-15 12:52     ` Christian König
  2020-01-15 13:24       ` Nirmoy
  0 siblings, 1 reply; 10+ messages in thread
From: Christian König @ 2020-01-15 12:52 UTC (permalink / raw)
  To: Nirmoy, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer

Hi Nirmoy,

Am 15.01.20 um 12:04 schrieb Nirmoy:
> Hi Christian,
>
> On 1/14/20 5:01 PM, Christian König wrote:
>>
>>> Before this patch:
>>>
>>> sched_name     num of many times it got scheduled
>>> =========      ==================================
>>> sdma0          314
>>> sdma1          32
>>> comp_1.0.0     56
>>> comp_1.1.0     0
>>> comp_1.1.1     0
>>> comp_1.2.0     0
>>> comp_1.2.1     0
>>> comp_1.3.0     0
>>> comp_1.3.1     0
>>>
>>> After this patch:
>>>
>>> sched_name     num of many times it got scheduled
>>> =========      ==================================
>>>   sdma1          243
>>>   sdma0          164
>>>   comp_1.0.1     14
>>>   comp_1.1.0     11
>>>   comp_1.1.1     10
>>>   comp_1.2.0     15
>>>   comp_1.2.1     14
>>>   comp_1.3.0     10
>>>   comp_1.3.1     10
>>
>> Well that is still rather nice to have, why does that happen?
>
> I think I know why it happens. At init all entity's rq gets assigned 
> to sched_list[0]. I put some prints to check what we compare in 
> drm_sched_entity_get_free_sched.
>
> It turns out most of the time it compares zero values(num_jobs(0) < 
> min_jobs(0)) so most of the time 1st rq(sdma0, comp_1.0.0) was picked 
> by drm_sched_entity_get_free_sched.

Well that is expected because the unit tests always does 
submission,wait,submission,wait,submission,wait.... So the number of 
jobs in the scheduler becomes zero in between.

> This patch was not correct , had an extra atomic_inc(num_jobs) in 
> drm_sched_job_init. This probably added bit of randomness I think, 
> which helped in better job distribution.

Mhm, that might not be a bad idea after all. We could rename num_jobs 
into something like like score and do a +1 in drm_sched_rq_add_entity() 
and a -1 in drm_sched_rq_remove_entity().

That should have pretty much the effect we want to have.

> I've updated my previous RFC patch which uses time consumed by each 
> sched for load balance with a twist of ignoring previously scheduled 
> sched/rq. Let me know what do you think.

I didn't had time yet to wrap my head around that in detail, but at 
least of hand Luben is right that the locking looks really awkward.

And I would rather like to avoid a larger change like this for a nice to 
have for testing feature.

Regards,
Christian.

>
>
> Regards,
>
> Nirmoy
>
>>
>> Christian.
>>
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] drm/scheduler: fix race condition in load balancer
  2020-01-15 12:52     ` Christian König
@ 2020-01-15 13:24       ` Nirmoy
  0 siblings, 0 replies; 10+ messages in thread
From: Nirmoy @ 2020-01-15 13:24 UTC (permalink / raw)
  To: Christian König, Nirmoy Das, amd-gfx
  Cc: alexander.deucher, kenny.ho, nirmoy.das, pierre-eric.pelloux-prayer


>> I think I know why it happens. At init all entity's rq gets assigned 
>> to sched_list[0]. I put some prints to check what we compare in 
>> drm_sched_entity_get_free_sched.
>>
>> It turns out most of the time it compares zero values(num_jobs(0) < 
>> min_jobs(0)) so most of the time 1st rq(sdma0, comp_1.0.0) was picked 
>> by drm_sched_entity_get_free_sched.
>
> Well that is expected because the unit tests always does 
> submission,wait,submission,wait,submission,wait.... So the number of 
> jobs in the scheduler becomes zero in between.
Even with multiple parallel instances of amdgpu_test, I haven't seen any 
improvement in the load balance.
>
>> This patch was not correct , had an extra atomic_inc(num_jobs) in 
>> drm_sched_job_init. This probably added bit of randomness I think, 
>> which helped in better job distribution.
>
> Mhm, that might not be a bad idea after all. We could rename num_jobs 
> into something like like score and do a +1 in 
> drm_sched_rq_add_entity() and a -1 in drm_sched_rq_remove_entity().
>
> That should have pretty much the effect we want to have.
That's sounds good as well. I will create a patch.
>
>> I've updated my previous RFC patch which uses time consumed by each 
>> sched for load balance with a twist of ignoring previously scheduled 
>> sched/rq. Let me know what do you think.
>
> I didn't had time yet to wrap my head around that in detail, but at 
> least of hand Luben is right that the locking looks really awkward.
I was unable to find a better way to do the locking part. My mail client 
might've missed Luben's review, can't find it :/


Regards,

Nirmoy

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-01-15 13:23 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-14 15:43 [PATCH] drm/scheduler: fix race condition in load balancer Nirmoy Das
2020-01-14 16:01 ` Christian König
2020-01-14 16:13   ` Nirmoy
2020-01-14 16:20     ` Christian König
2020-01-14 16:20   ` Nirmoy
2020-01-14 16:23     ` Christian König
2020-01-14 16:27       ` Nirmoy
2020-01-15 11:04   ` Nirmoy
2020-01-15 12:52     ` Christian König
2020-01-15 13:24       ` Nirmoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).