* [PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
From: Jay Cornwall @ 2017-07-14  1:21 UTC (permalink / raw)
  To: amd-gfx@lists.freedesktop.org; +Cc: Jay Cornwall

The number of compute queues available to the KFD was erroneously
calculated as 64. Only the first MEC can execute compute queues and
it has 32 queue slots.

This caused the oversubscription limit to be calculated incorrectly,
leading to a missing chained runlist command at the end of an
oversubscribed runlist.

v2: Remove unused num_mec field to avoid duplicate logic
v3: Separate num_mec removal into separate patches

Change-Id: I9e7bba2cc1928b624e3eeb1edb06fdb602e5294f
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 7060daf..aa4006a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -140,7 +140,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 		/* According to linux/bitmap.h we shouldn't use bitmap_clear if
 		 * nbits is not compile time constant
 		 */
-		last_valid_bit = adev->gfx.mec.num_mec
+		last_valid_bit = 1 /* only first MEC can have compute queues */
 				* adev->gfx.mec.num_pipe_per_mec
 				* adev->gfx.mec.num_queue_per_pipe;
 		for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)
-- 
2.7.4
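
The arithmetic behind the commit message above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the driver code: `usable_queue_slots` and `is_oversubscribed` are hypothetical helpers, `MAX_QUEUES` stands in for KGD_MAX_QUEUES with an assumed value, the bitmap is simplified to start with every slot set, and the oversubscription test is reduced to a plain comparison of active queues against usable slots.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QUEUES 128 /* stand-in for KGD_MAX_QUEUES; the value is an assumption */

/*
 * Hypothetical helper loosely mirroring the patched loop: mark every slot
 * usable, clear every bit from last_valid_bit upward, then count what is left.
 * The real driver starts from a hardware-derived bitmap; this sketch does not.
 */
static unsigned int usable_queue_slots(unsigned int num_mec,
				       unsigned int num_pipe_per_mec,
				       unsigned int num_queue_per_pipe)
{
	bool bitmap[MAX_QUEUES];
	unsigned int last_valid_bit =
		num_mec * num_pipe_per_mec * num_queue_per_pipe;
	unsigned int i, count = 0;

	assert(last_valid_bit <= MAX_QUEUES);

	for (i = 0; i < MAX_QUEUES; ++i)
		bitmap[i] = true;
	for (i = last_valid_bit; i < MAX_QUEUES; ++i)
		bitmap[i] = false;
	for (i = 0; i < MAX_QUEUES; ++i)
		count += bitmap[i];

	return count;
}

/*
 * Hypothetical helper: treat a runlist as oversubscribed once the number of
 * active compute queues exceeds the number of usable slots.
 */
static bool is_oversubscribed(unsigned int active_queues,
			      unsigned int usable_slots)
{
	return active_queues > usable_slots;
}
```

Assuming 2 MECs with 4 pipes each and 8 queues per pipe (values picked to reproduce the figures in the commit message), `usable_queue_slots(2, 4, 8)` gives 64, while substituting 1 for the MEC count gives 32. A runlist with, say, 40 active queues would then be flagged by `is_oversubscribed` only with the corrected count, which is consistent with the missing chained runlist command described above.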

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 2/4] drm/amdkfd: Remove unused references to shared_resources.num_mec
From: Jay Cornwall @ 2017-07-14  1:21 UTC (permalink / raw)
  To: amd-gfx@lists.freedesktop.org; +Cc: Jay Cornwall

Dead code.

Change-Id: Ic0bb1bcca87e96bc5e8fa9894727b0de152e8818
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c               | 4 ----
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 7 -------
 2 files changed, 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 1cf00d4..95f9396 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -494,10 +494,6 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	} else
 		kfd->max_proc_per_quantum = hws_max_conc_proc;
 
-	/* We only use the first MEC */
-	if (kfd->shared_resources.num_mec > 1)
-		kfd->shared_resources.num_mec = 1;
-
 	/* calculate max size of mqds needed for queues */
 	size = max_num_of_queues_per_device *
 			kfd->device_info->mqd_size_aligned;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 7607989..306144f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -82,13 +82,6 @@ static bool is_pipe_enabled(struct device_queue_manager *dqm, int mec, int pipe)
 	return false;
 }
 
-unsigned int get_mec_num(struct device_queue_manager *dqm)
-{
-	BUG_ON(!dqm || !dqm->dev);
-
-	return dqm->dev->shared_resources.num_mec;
-}
-
 unsigned int get_queues_num(struct device_queue_manager *dqm)
 {
 	BUG_ON(!dqm || !dqm->dev);
-- 
2.7.4


* [PATCH 3/4] drm/radeon: Remove initialization of shared_resources.num_mec
From: Jay Cornwall @ 2017-07-14  1:21 UTC (permalink / raw)
  To: amd-gfx@lists.freedesktop.org; +Cc: Jay Cornwall

Dead code.

Change-Id: I2383e0b541ed55288570b6a0ec8a0d49cdd4df89
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 719ea51..8f8c7c1 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -251,7 +251,6 @@ void radeon_kfd_device_init(struct radeon_device *rdev)
 	if (rdev->kfd) {
 		struct kgd2kfd_shared_resources gpu_resources = {
 			.compute_vmid_bitmap = 0xFF00,
-			.num_mec = 1,
 			.num_pipe_per_mec = 4,
 			.num_queue_per_pipe = 8,
 			.gpuvm_size = (uint64_t)radeon_vm_size << 30
-- 
2.7.4


* [PATCH 4/4] drm/amdgpu: Remove unused field kgd2kfd_shared_resources.num_mec
From: Jay Cornwall @ 2017-07-14  1:21 UTC (permalink / raw)
  To: amd-gfx@lists.freedesktop.org; +Cc: Jay Cornwall

Dead code.

Change-Id: I9575aa73b5741b80dc340f953cc773385c92b2be
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c      | 1 -
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 3 ---
 2 files changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index aa4006a..8c710f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -116,7 +116,6 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 	if (adev->kfd) {
 		struct kgd2kfd_shared_resources gpu_resources = {
 			.compute_vmid_bitmap = global_compute_vmid_bitmap,
-			.num_mec = adev->gfx.mec.num_mec,
 			.num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
 			.num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
 			.gpuvm_size = (uint64_t)amdgpu_vm_size << 30
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index a4d2fee..10794b3 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -107,9 +107,6 @@ struct kgd2kfd_shared_resources {
 	/* Bit n == 1 means VMID n is available for KFD. */
 	unsigned int compute_vmid_bitmap;
 
-	/* number of mec available from the hardware */
-	uint32_t num_mec;
-
 	/* number of pipes per mec */
 	uint32_t num_pipe_per_mec;
 
-- 
2.7.4


* Re: [PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
From: Alex Deucher @ 2017-07-14 16:24 UTC (permalink / raw)
  To: Jay Cornwall; +Cc: amd-gfx list

On Thu, Jul 13, 2017 at 9:21 PM, Jay Cornwall <Jay.Cornwall@amd.com> wrote:
> The number of compute queues available to the KFD was erroneously
> calculated as 64. Only the first MEC can execute compute queues and
> it has 32 queue slots.
>
> This caused the oversubscription limit to be calculated incorrectly,
> leading to a missing chained runlist command at the end of an
> oversubscribed runlist.
>
> v2: Remove unused num_mec field to avoid duplicate logic
> v3: Separate num_mec removal into separate patches
>
> Change-Id: I9e7bba2cc1928b624e3eeb1edb06fdb602e5294f
> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>

Series is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>


* Re: [PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
From: Oded Gabbay @ 2017-07-17 15:52 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Jay Cornwall, amd-gfx list

On Fri, Jul 14, 2017 at 7:24 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Thu, Jul 13, 2017 at 9:21 PM, Jay Cornwall <Jay.Cornwall@amd.com> wrote:
>> The number of compute queues available to the KFD was erroneously
>> calculated as 64. Only the first MEC can execute compute queues and
>> it has 32 queue slots.
>>
>> This caused the oversubscription limit to be calculated incorrectly,
>> leading to a missing chained runlist command at the end of an
>> oversubscribed runlist.
>>
>> v2: Remove unused num_mec field to avoid duplicate logic
>> v3: Separate num_mec removal into separate patches
>>
>> Change-Id: I9e7bba2cc1928b624e3eeb1edb06fdb602e5294f
>> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
>
> Series is:
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
Hi Jay,
Thanks for the patches, I applied them to amdkfd-fixes (after rebasing
them over 4.13-rc1)

Oded


* Re: [PATCH 2/4] drm/amdkfd: Remove unused references to shared_resources.num_mec
From: Oded Gabbay @ 2017-07-17 15:53 UTC (permalink / raw)
  To: Jay Cornwall; +Cc: amd-gfx list

On Fri, Jul 14, 2017 at 4:21 AM, Jay Cornwall <Jay.Cornwall@amd.com> wrote:
> Dead code.
>
> Change-Id: Ic0bb1bcca87e96bc5e8fa9894727b0de152e8818
> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c               | 4 ----
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 7 -------
>  2 files changed, 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 1cf00d4..95f9396 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -494,10 +494,6 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>         } else
>                 kfd->max_proc_per_quantum = hws_max_conc_proc;
>
> -       /* We only use the first MEC */
> -       if (kfd->shared_resources.num_mec > 1)
> -               kfd->shared_resources.num_mec = 1;
> -
>         /* calculate max size of mqds needed for queues */
>         size = max_num_of_queues_per_device *
>                         kfd->device_info->mqd_size_aligned;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 7607989..306144f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -82,13 +82,6 @@ static bool is_pipe_enabled(struct device_queue_manager *dqm, int mec, int pipe)
>         return false;
>  }
>
> -unsigned int get_mec_num(struct device_queue_manager *dqm)
> -{
> -       BUG_ON(!dqm || !dqm->dev);
> -
> -       return dqm->dev->shared_resources.num_mec;
> -}
> -

FYI, I also removed the declaration of get_mec_num() from the header file.
Oded

>  unsigned int get_queues_num(struct device_queue_manager *dqm)
>  {
>         BUG_ON(!dqm || !dqm->dev);
> --
> 2.7.4

* Re: [PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
From: Felix Kuehling @ 2017-07-18  4:41 UTC (permalink / raw)
  To: amd-gfx@lists.freedesktop.org, Deucher, Alexander

Hi Alex,

This patch series went into amd-kfd-staging. I'd like to also push it
into amd-staging-4.11 as I'm just working to minimize any unnecessary
differences between the branches before the big KFD history rework.

I rebased it, resolved some conflicts, and removed the declaration of
get_mec_num from kfd_device_queue_manager.h. Do you want me to push that
rebased patch series?

Thanks,
  Felix



* RE: [PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
From: Deucher, Alexander @ 2017-07-18 12:46 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx@lists.freedesktop.org

> -----Original Message-----
> From: Kuehling, Felix
> Sent: Tuesday, July 18, 2017 12:41 AM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> Subject: Re: [PATCH v3 1/4] drm/amdgpu: Fix KFD oversubscription by
> tracking queues correctly
> 
> Hi Alex,
> 
> This patch series went into amd-kfd-staging. I'd like to also push it
> into amd-staging-4.11 as I'm just working to minimize any unnecessary
> differences between the branches before the big KFD history rework.
> 
> I rebased it, resolved some conflicts, and removed the declaration of
> get_mec_num from kfd_device_queue_manager.h. Do you want me to push that
> rebased patch series?

Sure.  Sounds good.

Alex

