Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint

From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Felix Kuehling <felix.kuehling@amd.com>,
	"Kim, Jonathan" <Jonathan.Kim@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Yang, Philip" <Philip.Yang@amd.com>
Subject: Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint
Date: Thu, 25 Feb 2021 14:53:32 +0100	[thread overview]
Message-ID: <8650a46f-f40d-d627-42a5-d9d760400086@gmail.com> (raw)
In-Reply-To: <92ddf2b3-3ee7-d481-2794-b57daed40ef9@amd.com>

Am 25.02.21 um 04:15 schrieb Felix Kuehling:
> On 2021-02-24 10:54 a.m., Kim, Jonathan wrote:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>>> -----Original Message-----
>>> From: Koenig, Christian <Christian.Koenig@amd.com>
>>> Sent: Wednesday, February 24, 2021 4:17 AM
>>> To: Kim, Jonathan <Jonathan.Kim@amd.com>; amd-
>>> gfx@lists.freedesktop.org
>>> Cc: Yang, Philip <Philip.Yang@amd.com>; Kuehling, Felix
>>> <Felix.Kuehling@amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: add ih call to process until 
>>> checkpoint
>>>
>>> Am 23.02.21 um 22:10 schrieb Jonathan Kim:
>>>> Add IH function to allow caller to process ring entries until the
>>>> checkpoint write pointer.
>>> This needs a better description of what this will be used for.
>> Felix or Philip could elaborate better for HMM needs.
>> Debugging tools requires this but it's in experimental mode at the 
>> moment so probably not the best place to describe here.
>
> On the HMM side we're planning to use this to drain pending page fault 
> interrupts before we unmap memory. That should address phantom VM 
> faults after memory is unmapped.

Thought so. I suggest to use a wait_event() here which on the waiter 
side checks ih->lock and add a wake_up_all() at the end of 
amdgpu_ih_process. I won't touch rptr or wptr at all for this.

Regards,
Christian.

>
> Regards,
>   Felix
>
>
>>
>>>> Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
>>>> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 46
>>> +++++++++++++++++++++++++-
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h |  2 ++
>>>>    2 files changed, 47 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
>>>> index dc852af4f3b7..cae50af9559d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
>>>> @@ -22,7 +22,7 @@
>>>>     */
>>>>
>>>>    #include <linux/dma-mapping.h>
>>>> -
>>>> +#include <linux/processor.h>
>>>>    #include "amdgpu.h"
>>>>    #include "amdgpu_ih.h"
>>>>
>>>> @@ -160,6 +160,50 @@ void amdgpu_ih_ring_write(struct
>>> amdgpu_ih_ring *ih, const uint32_t *iv,
>>>>    }
>>>>    }
>>>>
>>>> +/**
>>>> + * amdgpu_ih_wait_on_checkpoint_process - wait to process IVs up to
>>>> +checkpoint
>>>> + *
>>>> + * @adev: amdgpu_device pointer
>>>> + * @ih: ih ring to process
>>>> + *
>>>> + * Used to ensure ring has processed IVs up to the checkpoint write
>>> pointer.
>>>> + */
>>>> +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device
>>> *adev,
>>>> +struct amdgpu_ih_ring *ih)
>>>> +{
>>>> +u32 prev_rptr, cur_rptr, checkpoint_wptr;
>>>> +
>>>> +if (!ih->enabled || adev->shutdown)
>>>> +return -ENODEV;
>>>> +
>>>> +cur_rptr = READ_ONCE(ih->rptr);
>>>> +/* Order read of current rptr with checktpoint wptr. */
>>>> +mb();
>>>> +checkpoint_wptr = amdgpu_ih_get_wptr(adev, ih);
>>>> +
>>>> +/* allow rptr to wrap around  */
>>>> +if (cur_rptr > checkpoint_wptr) {
>>>> +spin_begin();
>>>> +do {
>>>> +spin_cpu_relax();
>>>> +prev_rptr = cur_rptr;
>>>> +cur_rptr = READ_ONCE(ih->rptr);
>>>> +} while (cur_rptr >= prev_rptr);
>>>> +spin_end();
>>> That's a certain NAK since it busy waits for IH processing. We need 
>>> some
>>> event to trigger here.
>> The function is meant to be just a waiter up to the checkpoint.
>> There's a need to guarantee that "stale" interrupts have been 
>> processed on check before doing other stuff after call.
>> The description could be improved to clarify that.
>>
>> Would busy waiting only on a locked ring help?  I assume an unlocked 
>> ring means nothing to process so no need to wait and we can exit 
>> early.  Or is it better to just to process the entries up to the 
>> checkpoint (maybe adjust amdgpu_ih_process for this need like adding 
>> a bool arg to skip restart or something)?
>>
>> Thanks,
>>
>> Jon
>>
>>>> +}
>>>> +
>>>> +/* wait for rptr to catch up to or pass checkpoint. */
>>>> +spin_begin();
>>>> +do {
>>>> +spin_cpu_relax();
>>>> +prev_rptr = cur_rptr;
>>>> +cur_rptr = READ_ONCE(ih->rptr);
>>>> +} while (cur_rptr >= prev_rptr && cur_rptr < checkpoint_wptr);
>>> Same of course here.
>>>
>>> Christian.
>>>
>>>> +spin_end();
>>>> +
>>>> +return 0;
>>>> +}
>>>> +
>>>>    /**
>>>>     * amdgpu_ih_process - interrupt handler
>>>>     *
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
>>>> index 6ed4a85fc7c3..6817f0a812d2 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
>>>> @@ -87,6 +87,8 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev,
>>> struct amdgpu_ih_ring *ih,
>>>>    void amdgpu_ih_ring_fini(struct amdgpu_device *adev, struct
>>> amdgpu_ih_ring *ih);
>>>>    void amdgpu_ih_ring_write(struct amdgpu_ih_ring *ih, const 
>>>> uint32_t *iv,
>>>>      unsigned int num_dw);
>>>> +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device
>>> *adev,
>>>> +struct amdgpu_ih_ring *ih);
>>>>    int amdgpu_ih_process(struct amdgpu_device *adev, struct
>>> amdgpu_ih_ring *ih);
>>>>    void amdgpu_ih_decode_iv_helper(struct amdgpu_device *adev,
>>>>    struct amdgpu_ih_ring *ih,
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx