From: "Christian König" <christian.koenig@amd.com>
To: "Tao, Yintian" <Yintian.Tao@amd.com>,
	"Liu, Monk" <Monk.Liu@amd.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>
Cc: "amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
Date: Wed, 22 Apr 2020 14:33:08 +0200	[thread overview]
Message-ID: <02f3f1a7-5bae-9aaa-47ca-74affe2b247b@amd.com> (raw)
In-Reply-To: <MN2PR12MB3039E72087ED59AFAF5FC319E5D20@MN2PR12MB3039.namprd12.prod.outlook.com>

Am 22.04.20 um 14:20 schrieb Tao, Yintian:
> Hi  Christian
>
>
> Please see inline comments.
> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig@amd.com>
> Sent: April 22, 2020 19:57
> To: Tao, Yintian <Yintian.Tao@amd.com>; Liu, Monk <Monk.Liu@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>
> Am 22.04.20 um 13:49 schrieb Tao, Yintian:
>> Hi  Christian
>>
>>
>> Can you help answer the questions below? Thanks in advance.
>> -----Original Message-----
>> From: Koenig, Christian <Christian.Koenig@amd.com>
>> Sent: April 22, 2020 19:03
>> To: Tao, Yintian <Yintian.Tao@amd.com>; Liu, Monk <Monk.Liu@amd.com>;
>> Kuehling, Felix <Felix.Kuehling@amd.com>
>> Cc: amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>>
>> Am 22.04.20 um 11:29 schrieb Yintian Tao:
>>> With the current KIQ register access method, there is a race
>>> condition when multiple clients use the KIQ to read registers
>>> at the same time, for example:
>>> 1. client-A starts to read REG-0 through the KIQ
>>> 2. client-A polls seqno-0
>>> 3. client-B starts to read REG-1 through the KIQ
>>> 4. client-B polls seqno-1
>>> 5. the KIQ completes both read operations
>>> 6. client-A reads the register value from the wb buffer and
>>>       gets the REG-1 value
>>>
>>> In addition, if multiple clients frequently write registers through
>>> the KIQ, the KIQ ring buffer may be overwritten.
>>>
>>> Therefore, allocate a fixed number of wb slots for rreg use and limit
>>> the number of submissions based on the KIQ ring_size in order to
>>> prevent the overwrite problem.
>>>
>>> v2: directly use amdgpu_device_wb_get() for each read instead
>>>        of reserving a fixed number of slots.
>>>        if there is not enough KIQ ring buffer or rreg slot space,
>>>        directly print an error log and return instead of busy waiting
>> I would split that into three patches. One for each problem we have here:
>>
>> 1. Fix kgd_hiq_mqd_load() and maybe other occasions to use spin_lock_irqsave().
>> [yttao]: Do you mean that we need to use spin_lock_irqsave for the functions just like kgd_hiq_mqd_load()?
> Yes, I strongly think so.
>
> See, when you have one spin lock you either always need to lock it with irqs disabled or never.
>
> In other words we always need to either use spin_lock() or spin_lock_irqsave(), but never mix them with the same lock.
>
> The only exception to this rule is when you take multiple locks, e.g.
> you can do:
>
> spin_lock_irqsave(&a, flags);
> spin_lock(&b);
> spin_lock(&c);
> ....
> spin_unlock(&c);
> spin_unlock(&b);
> spin_unlock_irqrestore(&a, flags);
>
> Here you don't need to use spin_lock_irqsave() for b and c. But we rarely have that case in the code.
> [yttao]: thanks, I got it. I will submit another patch for it.
>
>> 2. Prevent the overrun of the KIQ. Please drop the approach with the
>> atomic here. Instead just add a amdgpu_fence_wait_polling() into
>> amdgpu_fence_emit_polling() as I discussed with Monk.
>> [yttao]: Sorry, I don't fully understand your idea for amdgpu_fence_wait_polling(). Can you give more details about it? Thanks in advance.
>>
>> "That is actually only a problem because the KIQ uses polling waits.
>>
>> See amdgpu_fence_emit() waits for the oldest possible fence to be signaled before emitting a new one.
>>
>> I suggest that we do the same in amdgpu_fence_emit_polling(). A one liner like the following should be enough:
>>
>> amdgpu_fence_wait_polling(ring, seq - ring->fence_drv.num_fences_mask, timeout);"
>> [yttao]: there is no usage of num_fences_mask in KIQ fence polling; num_fences_mask is only effective in the dma_fence architecture.
>> 		If I understand correctly, do you want the prototype code below? If the prototype code is wrong, can you help give one sample? Thanks in advance.
>>
>> int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s) {
>>           uint32_t seq;
>>
>>           if (!s)
>>                   return -EINVAL;
>> +		amdgpu_fence_wait_polling(ring, seq, timeout);
>>           seq = ++ring->fence_drv.sync_seq;
> Your understanding sounds more or less correct. The code should look something like this:
>
> seq = ++ring->fence_drv.sync_seq;
> amdgpu_fence_wait_polling(ring, seq -
> number_of_allowed_submissions_to_the_kiq, timeout);
> [yttao]: do we need to wait directly at the start, just like below? Otherwise, amdgpu_ring_emit_wreg() may overwrite the KIQ ring buffer.

There should always be room for at least one more submission.

As long as we always submit a fence after checking for free room, this
should be fine.

Regards,
Christian.

> +		amdgpu_fence_wait_polling(ring, seq - number_of_allowed_submissions_to_the_kiq, timeout);
> 		spin_lock_irqsave(&kiq->ring_lock, flags);
>          amdgpu_ring_alloc(ring, 32);
>          amdgpu_ring_emit_wreg(ring, reg, v);
>          amdgpu_fence_emit_polling(ring, &seq); /* wait */
>          amdgpu_ring_commit(ring);
>          spin_unlock_irqrestore(&kiq->ring_lock, flags);
>
> I just used num_fences_mask as number_of_allowed_submissions_to_the_kiq
> because it is probably a good value to start with.
>
> But you could give that as parameter as well if you think that makes more sense.
>
>>           amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>>                                  seq, 0);
>>
>>           *s = seq;
>>
>>           return 0;
>> }
>>
>>
>>
>>
>> 3. Use amdgpu_device_wb_get() each time we need to submit a read.
>> [yttao]: yes, I will do it.
> Thanks,
> Christian.
>
>> Regards,
>> Christian.
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
