dri-devel.lists.freedesktop.org archive mirror
* [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
@ 2017-11-16 21:36 Jan Vesely
       [not found] ` <20171116213631.3987-1-jan.vesely-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Vesely @ 2017-11-16 21:36 UTC (permalink / raw)
  To: amd-gfx, dri-devel

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
index f1d48281e322..b3bee39661ab 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
@@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue *kq, struct kfd_dev *dev,
 			enum kfd_queue_type type, unsigned int queue_size)
 {
 	int retval;
+	unsigned int size = ALIGN(queue_size, PAGE_SIZE);
 
-	retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem);
+	retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem);
 	if (retval != 0)
 		return false;
 
 	kq->eop_gpu_addr = kq->eop_mem->gpu_addr;
 	kq->eop_kernel_addr = kq->eop_mem->cpu_ptr;
 
-	memset(kq->eop_kernel_addr, 0, PAGE_SIZE);
+	memset(kq->eop_kernel_addr, 0, size);
 
 	return true;
 }
-- 
2.13.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found] ` <20171116213631.3987-1-jan.vesely-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
@ 2017-11-19  8:19   ` Oded Gabbay
       [not found]     ` <CAFCwf10MTrBcGn1kejNvn9AcHDsCiW85HkC6bSmYfY=1shGJxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Oded Gabbay @ 2017-11-19  8:19 UTC (permalink / raw)
  To: Jan Vesely; +Cc: Maling list - DRI developers, amd-gfx list

Thanks!
Applied to -next tree
Oded
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found]     ` <CAFCwf10MTrBcGn1kejNvn9AcHDsCiW85HkC6bSmYfY=1shGJxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-11-20 19:22       ` Felix Kuehling
       [not found]         ` <21e77adc-4fbe-a3e9-0a02-5d84eb201561-5C7GfCeVMHo@public.gmane.org>
  2017-11-29 21:43         ` Jan Vesely
  0 siblings, 2 replies; 11+ messages in thread
From: Felix Kuehling @ 2017-11-20 19:22 UTC (permalink / raw)
  To: Oded Gabbay, Jan Vesely; +Cc: amd-gfx list, Maling list - DRI developers

I think this patch is not correct. The EOP-mem is not associated with
the queue size. The EOP buffer is a separate buffer used by the firmware
to handle command completion. As I understand it, this allows more
concurrency, while still making it look like all commands in the queue
are completing in order.

Regards,
  Felix




* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found]         ` <21e77adc-4fbe-a3e9-0a02-5d84eb201561-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-21 11:44           ` Oded Gabbay
       [not found]             ` <CAFCwf12ZeO=6F-EQ_MFgSEugxMSAFNe+OkGT_gZBMp34VzFAcA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Oded Gabbay @ 2017-11-21 11:44 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list, Jan Vesely, Maling list - DRI developers

Thanks, Felix, for catching that. For some reason I remembered that the EOP
buffer should be the same size as the queue.
Can we then remove the queue size parameter from that function?


* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found]             ` <CAFCwf12ZeO=6F-EQ_MFgSEugxMSAFNe+OkGT_gZBMp34VzFAcA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-11-21 16:30               ` Felix Kuehling
  0 siblings, 0 replies; 11+ messages in thread
From: Felix Kuehling @ 2017-11-21 16:30 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list, Jan Vesely, Maling list - DRI developers


On 2017-11-21 06:44 AM, Oded Gabbay wrote:
> Thanks, Felix, for catching that. For some reason I remembered that the EOP
> buffer should be the same size as the queue.

The EOP queue size is hard-coded for kernel queues: initialize() in
kfd_kernel_queue.c sets prop.eop_ring_buffer_size = PAGE_SIZE. I'm not
too familiar with the HW/FW details, but I see this comment in
kfd_mqd_manager_vi.c:

        /*
         * HW does not clamp this field correctly. Maximum EOP queue size
         * is constrained by per-SE EOP done signal count, which is 8-bit.
         * Limit is 0xFF EOP entries (= 0x7F8 dwords). CP will not submit
         * more than (EOP entry count - 1) so a queue size of 0x800 dwords
         * is safe, giving a maximum field value of 0xA.
         */

With that limit, the maximum possible EOP queue size would be two pages,
regardless of the queue size.
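
The numbers in that comment can be sanity-checked with a small standalone
userspace sketch (not driver code). It assumes 4 KiB pages and 4-byte
dwords, and it assumes the size field encodes log2(size in dwords) - 1,
which is consistent with the 0x800 dwords -> 0xA pair in the comment but is
not confirmed anywhere in this thread. All SKETCH_/sketch names are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch, not taken from the driver. */
#define SKETCH_PAGE_SIZE   4096u	/* 4 KiB pages */
#define SKETCH_DWORD_BYTES 4u

/* Values quoted in the kfd_mqd_manager_vi.c comment. */
#define EOP_MAX_ENTRIES 0xFFu	/* 8-bit per-SE EOP done signal count */
#define EOP_MAX_DWORDS  0x7F8u	/* dwords occupied by 0xFF entries */
#define EOP_SAFE_DWORDS 0x800u	/* rounded-up size the comment calls safe */

/* Dwords per EOP entry implied by the comment: 0x7F8 / 0xFF. */
static inline uint32_t eop_dwords_per_entry(void)
{
	return EOP_MAX_DWORDS / EOP_MAX_ENTRIES;
}

/* Integer log2 of a power of two. */
static inline uint32_t log2_pow2(uint32_t v)
{
	uint32_t n = 0;

	assert(v != 0 && (v & (v - 1)) == 0);
	while (v > 1) {
		v >>= 1;
		n++;
	}
	return n;
}

/* Assumed encoding: field = log2(size in dwords) - 1. */
static inline uint32_t eop_size_field(uint32_t size_dwords)
{
	return log2_pow2(size_dwords) - 1;
}

/* Size in bytes of the largest EOP buffer the comment calls safe. */
static inline uint32_t eop_safe_bytes(void)
{
	return EOP_SAFE_DWORDS * SKETCH_DWORD_BYTES;
}
```

Under these assumptions an entry is 8 dwords, 0x800 dwords is 8192 bytes
(two 4 KiB pages), and a single-page EOP buffer would use a field value
one below the 0xA maximum.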

> Can we then remove the queue size parameter from that function?

Not the way the code is currently organized. Currently struct
kernel_queue_ops is shared for ASIC-independent and ASIC-specific queue
ops. The ASIC-independent initialize function in kfd_kernel_queue.c
still needs this parameter.

That said, the kernel_queue stuff could be cleaned up a bit in general.
IMO the hardware-independent functions don't really need to be called
through function pointers. The ASIC-specific function pointers don't
need to be in the kernel_queue structure, they could be in kfd_dev.
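
One possible shape for that cleanup, purely as an illustration: the
hardware-independent init is an ordinary function call, and only the
ASIC-specific hook goes through a pointer that hangs off the device. Every
name here (sketch_dev, sketch_asic_ops, sketch_queue, and so on) is
hypothetical and does not exist in amdkfd; the fixed 4096-byte EOP size
stands in for PAGE_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kernel_queue. */
struct sketch_queue {
	unsigned int size_bytes;	/* ring size requested by the caller */
	unsigned int eop_bytes;		/* EOP buffer size chosen by the ASIC code */
};

/* ASIC-specific hooks only; note that no queue size is passed in. */
struct sketch_asic_ops {
	bool (*init_eop)(struct sketch_queue *q);
};

/* Hypothetical stand-in for struct kfd_dev: the ops live per device. */
struct sketch_dev {
	const struct sketch_asic_ops *asic_ops;
};

/* VI-like hook: the EOP size is fixed and independent of the queue size. */
static bool sketch_init_eop_vi(struct sketch_queue *q)
{
	q->eop_bytes = 4096;	/* assumed PAGE_SIZE */
	return true;
}

static const struct sketch_asic_ops sketch_vi_ops = {
	.init_eop = sketch_init_eop_vi,
};

/* Hardware-independent part: called directly, no function pointer. */
static inline bool sketch_queue_init(struct sketch_dev *dev,
				     struct sketch_queue *q,
				     unsigned int queue_size)
{
	assert(dev != NULL && dev->asic_ops != NULL && q != NULL);
	q->size_bytes = queue_size;
	return dev->asic_ops->init_eop(q);
}
```

With this split, only the generic init sees the queue size, so the
ASIC-specific hook no longer needs the parameter.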

Regards,
  Felix



* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
  2017-11-20 19:22       ` Felix Kuehling
       [not found]         ` <21e77adc-4fbe-a3e9-0a02-5d84eb201561-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-29 21:43         ` Jan Vesely
       [not found]           ` <1511991803.2978.67.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Vesely @ 2017-11-29 21:43 UTC (permalink / raw)
  To: Felix Kuehling, Oded Gabbay; +Cc: amd-gfx list, Maling list - DRI developers


On Mon, 2017-11-20 at 14:22 -0500, Felix Kuehling wrote:
> I think this patch is not correct. The EOP-mem is not associated with
> the queue size. The EOP buffer is a separate buffer used by the firmware
> to handle command completion. As I understand it, this allows more
> concurrency, while still making it look like all commands in the queue
> are completing in order.

Thanks for the explanation. I was looking for the source of a CP hang
(rptr stops advancing), but bumping the EOP size actually made things
worse. Is there a way to find out whether a queue got disabled, and for
what reason? (I'm running a ROCK-1.6.x based kernel.)

thanks,
Jan



* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found]           ` <1511991803.2978.67.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
@ 2017-11-29 21:58             ` Felix Kuehling
  2017-11-30 23:51               ` Jan Vesely
  0 siblings, 1 reply; 11+ messages in thread
From: Felix Kuehling @ 2017-11-29 21:58 UTC (permalink / raw)
  To: Jan Vesely, Oded Gabbay; +Cc: amd-gfx list, Maling list - DRI developers

You can see the state of the queues in debugfs:
/sys/kernel/debug/kfd/... You can look at MQDs and HQDs.

If your application isn't stopping queues deliberately, queues get
disabled by evictions, usually temporarily. You'll see kernel messages
when that happens.

A VM fault will result in queues of the offending process getting
disabled permanently. Again, you'll see messages about that in the
kernel log.

The RPTR can also stop advancing if you have an infinite loop in a
shader program, or just a shader that takes a very long time to execute.
Or maybe if you have some dependencies (barriers) in your AQL packets
that never get satisfied.

The function you changed only affects the HIQ, the queue that KFD uses
to control the HWS. It does not affect user mode queues. If your problem
is with a user mode queue, your change should have no effect at all.

Regards,
  Felix




* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
  2017-11-29 21:58             ` Felix Kuehling
@ 2017-11-30 23:51               ` Jan Vesely
       [not found]                 ` <1512085889.3631.50.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Vesely @ 2017-11-30 23:51 UTC (permalink / raw)
  To: Felix Kuehling, Oded Gabbay; +Cc: amd-gfx list, Maling list - DRI developers


On Wed, 2017-11-29 at 16:58 -0500, Felix Kuehling wrote:
> You can see the state of the queues in debugfs:
> /sys/kernel/debug/kfd/... You can look at MQDs and HQDs.

Thanks. How do I decode the information?
The rptr always stops at pos 60, which looks like this in mqds:

 DIQ on device 45a2
    00000000: c0310800 00004000 00000000 00000000 00000000 00000000 00000000 00000000
    00000020: 00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000000
    00000040: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffffffff
    00000060: ffffffff 00000000 ffffffff ffffffff 00000000 00000000 00000000 00000000

If I understood correctly, that's the queue dump, so those ffffffff values
look wrong.

> 
> If your application isn't stopping queues deliberately, queues get
> disabled by evictions, usually temporarily. You'll see kernel messages
> when that happens.
> 
> A VM fault will result in queues of the offending process getting
> disabled permanently. Again, you'll see messages about that in the
> kernel log.
> 
> The RPTR can also stop advancing if you have an infinite loop in a
> shader program, or just a shader that takes a very long time to execute.
> Or maybe if you have some dependencies (barriers) in your AQL packets
> that never get satisfied.
> 
> The function you changed only affects the HIQ, the queue that KFD uses
> to control the HWS. It does not affect user mode queues. If your problem
> is with a user mode queue, your change should have no effect at all.

It's not a userspace queue that stops. I'm using the kernel dbgdev to issue
wave_resume commands (the waves are halted after executing
s_sendmsg_halt).
I bumped KFD_KERNEL_QUEUE_SIZE to 16KB to make sure all 320 resume
commands fit (otherwise I get spurious ENOMEM errors when the queue is full
but still advancing).
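
The "full but still advancing" case can be pictured with a generic ring
buffer sketch. This is not the amdkfd code; struct sketch_ring and both
helpers are hypothetical, and the one-slot-empty convention is an
assumption of the sketch. A submission fails with -ENOMEM only because
the consumer has not caught up yet, and succeeds once rptr moves.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical ring state; all indices are in dwords. */
struct sketch_ring {
	uint32_t size_dwords;	/* total ring size */
	uint32_t rptr;		/* consumer index, advanced by the hardware */
	uint32_t wptr;		/* producer index, advanced by the driver */
};

/* Free dwords, keeping one slot empty so that full != empty. */
static inline uint32_t sketch_ring_free(const struct sketch_ring *r)
{
	return (r->rptr + r->size_dwords - 1 - r->wptr) % r->size_dwords;
}

/* Reserve room for one packet; -ENOMEM if it does not fit right now. */
static inline int sketch_ring_reserve(struct sketch_ring *r,
				      uint32_t packet_dwords)
{
	assert(r->size_dwords != 0);
	if (packet_dwords > sketch_ring_free(r))
		return -ENOMEM;
	r->wptr = (r->wptr + packet_dwords) % r->size_dwords;
	return 0;
}
```

A caller that retries after the consumer advances would not see the error,
which is why a larger ring hides it when many packets are queued at once.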

thanks,
Jan



* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found]                 ` <1512085889.3631.50.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
@ 2017-12-01 17:10                   ` Felix Kuehling
  2017-12-01 17:15                   ` Felix Kuehling
  2017-12-01 19:37                   ` Felix Kuehling
  2 siblings, 0 replies; 11+ messages in thread
From: Felix Kuehling @ 2017-12-01 17:10 UTC (permalink / raw)
  To: Jan Vesely, Oded Gabbay; +Cc: amd-gfx list, Maling list - DRI developers

DIQ is the debug interface queue. Are you running a GPU debugger?
Otherwise I would not expect to even see a DIQ.

Are you not seeing any compute queues in mqds? If there are no compute
queues in mqds, that means your queue has been destroyed. That would
explain why the read pointer is not advancing.

Regards,
  Felix




* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found]                 ` <1512085889.3631.50.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
  2017-12-01 17:10                   ` Felix Kuehling
@ 2017-12-01 17:15                   ` Felix Kuehling
  2017-12-01 19:37                   ` Felix Kuehling
  2 siblings, 0 replies; 11+ messages in thread
From: Felix Kuehling @ 2017-12-01 17:15 UTC (permalink / raw)
  To: Jan Vesely, Oded Gabbay; +Cc: amd-gfx list, Mailing list - DRI developers

To answer your questions about decoding MQDs, take a look at struct
vi_mqd in drivers/gpu/drm/amd/include/vi_structs.h. What you're looking
at is a binary dump of that structure, one per queue.

The information in the MQD may not always be up to date, because the MQD
represents an unmapped queue. It mostly gets updated when queues are
unmapped. So you would need to correlate the MQD of the queue you're
interested in with an HQD to see the current HW state.
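
As a rough illustration of reading such a dump, the sketch below parses
lines of the form "OFFSET: d0 d1 ... d7" and returns the dword stored at a
given byte offset, which can then be compared by hand against the field
order of struct vi_mqd. The helper names parse_dump_line() and
dump_dword_at() are hypothetical, and the line layout is assumed from the
dump quoted further down in this message; it is not taken from the KFD
sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DWORDS_PER_LINE 8

/* Parse one line of the form "OFFSET: d0 d1 ... d7" (all hex).
 * Stores the byte offset in *base and up to 8 dwords in out[].
 * Returns the number of dwords parsed, or -1 if the line does not
 * start with a hex offset followed by ':' (e.g. a title line). */
int parse_dump_line(const char *line, uint32_t *base,
		    uint32_t out[DWORDS_PER_LINE])
{
	char *end;
	int n = 0;

	*base = (uint32_t)strtoul(line, &end, 16);
	if (end == line || *end != ':')
		return -1;
	end++;

	while (n < DWORDS_PER_LINE) {
		char *next;
		unsigned long v = strtoul(end, &next, 16);

		if (next == end)
			break;		/* no more hex words on this line */
		out[n++] = (uint32_t)v;
		end = next;
	}
	return n;
}

/* Find the dword at byte offset 'off' in an array of dump lines.
 * Returns 0 and fills *val on success, -1 if no line covers 'off'. */
int dump_dword_at(const char **lines, size_t nlines, uint32_t off,
		  uint32_t *val)
{
	for (size_t i = 0; i < nlines; i++) {
		uint32_t base, dw[DWORDS_PER_LINE];
		int n = parse_dump_line(lines[i], &base, dw);

		if (n <= 0 || off < base || off % 4 != 0)
			continue;
		if ((off - base) / 4 < (uint32_t)n) {
			*val = dw[(off - base) / 4];
			return 0;
		}
	}
	return -1;
}
```

Fed the four data lines of the DIQ dump quoted below, dump_dword_at()
returns 0xc0310800 for offset 0x0 and 0xffffffff for offset 0x5c. Offsets
beyond 0x7c are reported as not covered, because only the first 0x80 bytes
of the MQD were pasted.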

Regards,
  Felix


On 2017-11-30 06:51 PM, Jan Vesely wrote:
> On Wed, 2017-11-29 at 16:58 -0500, Felix Kuehling wrote:
>> You can see the state of the queues in debugfs:
>> /sys/kernel/debug/kfd/... You can look at MQDs and HQDs.
> thanks. how do I decode the information?
> The rptr always stops at pos 60 which looks like this in mqds:
>
>  DIQ on device 45a2
>     00000000: c0310800 00004000 00000000 00000000 00000000 00000000 00000000 00000000
>     00000020: 00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000000
>     00000040: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffffffff
>     00000060: ffffffff 00000000 ffffffff ffffffff 00000000 00000000 00000000 00000000
>
> If I understood correctly that's the queue dump, so those fffffs look
> wrong
>
>> If your application isn't stopping queues deliberately, queues get
>> disabled by evictions, usually temporarily. You'll see kernel messages
>> when that happens.
>>
>> A VM fault will result in queues of the offending process getting
>> disabled permanently. Again, you'll see messages about that in the
>> kernel log.
>>
>> The RPTR can also stop advancing if you have an infinite loop in a
>> shader program, or just a shader that takes a very long time to execute.
>> Or maybe if you have some dependencies (barriers) in your AQL packets
>> that never get satisfied.
>>
>> The function you changed only affects the HIQ, the queue that KFD uses
>> to control the HWS. It does not affect user mode queues. If your problem
>> is with a user mode queue, your change should have no effect at all.
> It's not a userspace queue that stops. I'm using kernel dbgdev to issue
> wave_resume commands. (waves are halted after executing
> s_sendmsg_halt).
> I bumped KFD_KERNEL_QUEUE_SIZE to 16KB to make sure all 320 resume
> commands fit (otherwise I get spurious ENOMEM when the queue is full but
> still advancing).
>
> thanks,
> Jan
>
>> Regards,
>>   Felix
>>
>>


* Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
       [not found]                 ` <1512085889.3631.50.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
  2017-12-01 17:10                   ` Felix Kuehling
  2017-12-01 17:15                   ` Felix Kuehling
@ 2017-12-01 19:37                   ` Felix Kuehling
  2 siblings, 0 replies; 11+ messages in thread
From: Felix Kuehling @ 2017-12-01 19:37 UTC (permalink / raw)
  To: Jan Vesely, Oded Gabbay; +Cc: amd-gfx list, Mailing list - DRI developers


On 2017-11-30 06:51 PM, Jan Vesely wrote:
>
> It's not a userspace queue that stops. I'm using kernel dbgdev to issue
> wave_resume commands. (waves are halted after executing
> s_sendmsg_halt).
> I bumped KFD_KERNEL_QUEUE_SIZE to 16KB to make sure all 320 resume
> commands fit (otherwise I get spurious ENOMEM when the queue is full but
> still advancing).

Sorry, didn't see this part of your message before.

To see the actual state of the DIQ in the hardware, you should look at
the HQD. You can find the matching HQD by looking at the queue base
address (cp_hqd_pq_base) which is at offset 0x220 in the MQD and offset
0xc934 in the register space (HQD).
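
The matching step can be sketched as below, assuming the MQD and each HQD
have already been read from debugfs into dword arrays. Only the two
offsets (0x220 in the MQD, 0xc934 in the register space) come from this
message; struct hqd_dump, hqd_read_reg() and find_hqd_for_mqd() are
hypothetical names used for illustration, and the raw dword values are
compared as-is without interpreting the address encoding.

```c
#include <stddef.h>
#include <stdint.h>

#define MQD_PQ_BASE_OFFSET	0x220u	/* cp_hqd_pq_base byte offset in the MQD */
#define HQD_PQ_BASE_REG		0xc934u	/* cp_hqd_pq_base register offset (HQD) */

/* Hypothetical in-memory copy of one HQD register dump. */
struct hqd_dump {
	uint32_t first_reg;	/* register offset of regs[0], in bytes */
	size_t num_regs;	/* number of consecutive dword registers */
	const uint32_t *regs;	/* register values */
};

/* Read register 'reg' from a dump. Returns 0 on success, -1 if the
 * register is outside the dumped range or not dword aligned. */
int hqd_read_reg(const struct hqd_dump *h, uint32_t reg, uint32_t *val)
{
	size_t idx;

	if (reg < h->first_reg || (reg - h->first_reg) % 4 != 0)
		return -1;
	idx = (reg - h->first_reg) / 4;
	if (idx >= h->num_regs)
		return -1;
	*val = h->regs[idx];
	return 0;
}

/* Return the index of the HQD whose cp_hqd_pq_base register equals the
 * value stored in the MQD, or -1 if none matches (queue not mapped, or
 * the MQD dump is too short to contain offset 0x220). */
int find_hqd_for_mqd(const uint32_t *mqd, size_t mqd_dwords,
		     const struct hqd_dump *hqds, size_t num_hqds)
{
	size_t idx = MQD_PQ_BASE_OFFSET / 4;

	if (idx >= mqd_dwords)
		return -1;
	for (size_t i = 0; i < num_hqds; i++) {
		uint32_t v;

		if (hqd_read_reg(&hqds[i], HQD_PQ_BASE_REG, &v) == 0 &&
		    v == mqd[idx])
			return (int)i;
	}
	return -1;
}
```

A result of -1 does not by itself indicate a hang. It may simply mean the
queue is not currently mapped to a hardware slot.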

I've previously debugged some obscure CP hangs involving the DIQ and wave
control commands that required help from the firmware team. The fix was
to remove synchronization with release_mem packets that could hang in
combination with wave control. It turned out the synchronization wasn't
really needed anyway. But it had some implications for how memory was
managed. I had to add code to allocate the IB on the queue (using a NOP
command), so I wouldn't have to free it explicitly (which would require
synchronization). I think that code is still not 100% correct. When the
queue is nearly full, an IB may get overwritten. I'd have to restructure
the code to allocate the IB after the commands that submit the IB, so
that the IB can't get overwritten until after the IB execution is finished.
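
Both the ENOMEM on a full but still draining queue and the overwrite
hazard come down to ring-buffer accounting: space counts as free as soon
as the read pointer has passed it. The following is a minimal, generic
model of that accounting, not the KFD implementation. struct ring,
ring_free_dwords() and ring_reserve() are hypothetical names, and the
16KB queue size is assumed to be in bytes (4096 dwords).

```c
#include <errno.h>
#include <stdint.h>

/* Generic ring model: all positions are in dwords, 0 <= ptr < size_dw. */
struct ring {
	uint32_t size_dw;	/* total ring size in dwords */
	uint32_t rptr;		/* next dword the consumer will read */
	uint32_t wptr;		/* next dword the producer will write */
};

/* Free dwords available to the producer. One dword is kept unused so
 * that rptr == wptr always means "empty" rather than "full". */
uint32_t ring_free_dwords(const struct ring *r)
{
	return (r->rptr + r->size_dw - r->wptr - 1) % r->size_dw;
}

/* Reserve 'ndw' dwords. Returns 0 and the start position on success,
 * or -ENOMEM if the ring currently lacks space, which can happen even
 * while the consumer is still making progress. */
int ring_reserve(struct ring *r, uint32_t ndw, uint32_t *start)
{
	if (ndw > ring_free_dwords(r))
		return -ENOMEM;
	*start = r->wptr;
	r->wptr = (r->wptr + ndw) % r->size_dw;
	return 0;
}
```

In this model a 4096-dword ring that already holds 3840 dwords rejects a
320-dword reservation with -ENOMEM, and the same request succeeds once
rptr has advanced by 100 dwords. The same rule explains the hazard
described above: anything placed in the ring ahead of the packet that
references it becomes reusable as soon as rptr moves past it.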

Regards,
  Felix



end of thread, other threads:[~2017-12-01 19:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-16 21:36 [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation Jan Vesely
     [not found] ` <20171116213631.3987-1-jan.vesely-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
2017-11-19  8:19   ` Oded Gabbay
     [not found]     ` <CAFCwf10MTrBcGn1kejNvn9AcHDsCiW85HkC6bSmYfY=1shGJxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-11-20 19:22       ` Felix Kuehling
     [not found]         ` <21e77adc-4fbe-a3e9-0a02-5d84eb201561-5C7GfCeVMHo@public.gmane.org>
2017-11-21 11:44           ` Oded Gabbay
     [not found]             ` <CAFCwf12ZeO=6F-EQ_MFgSEugxMSAFNe+OkGT_gZBMp34VzFAcA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-11-21 16:30               ` Felix Kuehling
2017-11-29 21:43         ` Jan Vesely
     [not found]           ` <1511991803.2978.67.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
2017-11-29 21:58             ` Felix Kuehling
2017-11-30 23:51               ` Jan Vesely
     [not found]                 ` <1512085889.3631.50.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
2017-12-01 17:10                   ` Felix Kuehling
2017-12-01 17:15                   ` Felix Kuehling
2017-12-01 19:37                   ` Felix Kuehling
