From: "Christian König" <christian.koenig@amd.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Alex Sierra <alex.sierra@amd.com>,
	"Yang, Philip" <philip.yang@amd.com>,
	Felix Kuehling <felix.kuehling@amd.com>,
	amd-gfx list <amd-gfx@lists.freedesktop.org>,
	Jerome Glisse <jglisse@redhat.com>,
	dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: HMM fence (was Re: [PATCH 00/35] Add HMM-based SVM memory manager to KFD)
Date: Thu, 14 Jan 2021 16:08:21 +0100
Message-ID: <55d283fc-10e1-d3de-0c2c-88e16c3af9c0@amd.com>
In-Reply-To: <CAKMK7uEgHpzGBKE5vTEpfvqgoK2DrQW4KGbvXMsAF_n85opbmg@mail.gmail.com>

Am 14.01.21 um 15:23 schrieb Daniel Vetter:
> On Thu, Jan 14, 2021 at 3:13 PM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 14.01.21 um 14:57 schrieb Daniel Vetter:
>>> On Thu, Jan 14, 2021 at 2:37 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> Am 14.01.21 um 12:52 schrieb Daniel Vetter:
>>>>> [SNIP]
>>>>>>> I had a new idea; I wanted to think more about it but have not yet,
>>>>>>> anyway here it is. Adding a new callback to dma fence which asks the
>>>>>>> question: can it deadlock? Any time a GPU driver has a pending page
>>>>>>> fault (i.e. something calling into the mm) it answers yes, otherwise
>>>>>>> no. The GPU shrinker would ask the question before waiting on any
>>>>>>> dma-fence and back off if it gets a yes. The shrinker can still try many
>>>>>>> dma-buf objects for which it does not get a yes on the associated fence.
>>>>>>>
>>>>>>> This does not solve the mmu notifier case, for this you would just
>>>>>>> invalidate the gem userptr object (with a flag but not releasing the
>>>>>>> page refcount) but you would not wait for the GPU (ie no dma fence
>>>>>>> wait in that code path anymore). The userptr API never really made
>>>>>>> the contract that it will always be in sync with the mm view of the
>>>>>>> world, so if different pages get remapped to the same virtual address
>>>>>>> while GPU is still working with the old pages it should not be an
>>>>>>> issue (it would not be in our usage of userptr for compositor and
>>>>>>> what not).
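
The callback idea quoted above can be illustrated with a small user-space sketch. This is not kernel code: struct sketch_fence, the may_deadlock callback, sketch_fault_pending() and sketch_shrink() are all hypothetical names made up for the illustration, and the blocking wait is replaced by simply marking the fence as signaled. It only shows the control flow of a shrinker that backs off from fences that answer "yes" and keeps going with the other objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence; NOT the kernel's struct dma_fence. */
struct sketch_fence {
	bool signaled;
	/* Hypothetical callback: returns true if waiting on this fence could
	 * deadlock, e.g. because the driver has a page fault pending in the mm. */
	bool (*may_deadlock)(const struct sketch_fence *f);
};

/* Hypothetical buffer object guarded by at most one fence. */
struct sketch_bo {
	struct sketch_fence *fence;
	bool evicted;
};

/* Example callback for a driver that currently has a pending page fault. */
static bool sketch_fault_pending(const struct sketch_fence *f)
{
	(void)f;
	return true;
}

static bool fence_may_deadlock(const struct sketch_fence *f)
{
	/* A fence without the callback is treated as safe to wait on. */
	return f->may_deadlock && f->may_deadlock(f);
}

/* Walk the objects, skip those whose fence says "yes", evict the rest.
 * Returns the number of objects evicted. */
static size_t sketch_shrink(struct sketch_bo *bos, size_t n)
{
	size_t freed = 0;

	assert(bos != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct sketch_fence *f = bos[i].fence;

		if (f && !f->signaled) {
			if (fence_may_deadlock(f))
				continue;	/* back off, try the next object */
			f->signaled = true;	/* stands in for a blocking wait */
		}
		bos[i].evicted = true;
		freed++;
	}
	return freed;
}
```

With one object behind a plain fence and one behind a fence whose callback is sketch_fault_pending, sketch_shrink() evicts only the first. As the quoted text notes, this covers the shrinker path only and does not address the MMU notifier case.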
>>>>>> The current working idea in my mind goes into a similar direction.
>>>>>>
>>>>>> But instead of a callback I'm adding a complete new class of HMM fences.
>>>>>>
>>>>>> Waiting in the MMU notifier, scheduler, TTM etc. is only allowed for
>>>>>> dma_fences, and HMM fences are ignored in container objects.
>>>>>>
>>>>>> When you handle an implicit or explicit synchronization request from
>>>>>> userspace you need to block for HMM fences to complete before taking any
>>>>>> resource locks.
>>>>> Isn't that what I call gang scheduling? I.e. you either run in HMM
>>>>> mode, or in legacy fencing mode (whether implicit or explicit doesn't
>>>>> really matter I think). By forcing that split we avoid the problem,
>>>>> but it means occasionally full stalls on mixed workloads.
>>>>>
>>>>> But that's not what Jerome wants (afaiui at least), I think his idea
>>>>> is to track the reverse dependencies of all the fences floating
>>>>> around, and then skip evicting an object if you have to wait for any
>>>>> fence that is problematic for the current calling context. And I don't
>>>>> think that's very feasible in practice.
>>>>>
>>>>> So what kind of hmm fences do you have in mind here?
>>>> It's a bit more relaxed than your gang schedule.
>>>>
>>>> See, the requirements are as follows:
>>>>
>>>> 1. dma_fences never depend on hmm_fences.
>>>> 2. hmm_fences can never preempt dma_fences.
>>>> 3. dma_fences must be able to preempt hmm_fences or we always reserve
>>>> enough hardware resources (CUs) to guarantee forward progress of dma_fences.
>>>>
>>>> Critical sections are MMU notifiers, page faults, GPU schedulers and
>>>> dma_reservation object locks.
>>>>
>>>> 4. It is valid to wait for dma_fences in critical sections.
>>>> 5. It is not valid to wait for hmm_fences in critical sections.
>>>>
>>>> Fence creation either happens during command submission or by adding
>>>> something like a barrier or signal command to your userspace queue.
>>>>
>>>> 6. If we have an hmm_fence as implicit or explicit dependency for
>>>> creating a dma_fence we must wait for that before taking any locks or
>>>> reserving resources.
>>>> 7. If we have a dma_fence as implicit or explicit dependency for
>>>> creating an hmm_fence we can wait later on. So busy waiting or special
>>>> WAIT hardware commands are valid.
>>>>
>>>> This prevents hard cuts, e.g. we can mix hmm_fences and dma_fences at the
>>>> same time on the hardware.
>>>>
>>>> In other words we can have a high priority gfx queue running jobs based
>>>> on dma_fences and a low priority compute queue running jobs based on
>>>> hmm_fences.
>>>>
>>>> Only when we switch from hmm_fence to dma_fence we need to block the
>>>> submission until all the necessary resources (both memory as well as
>>>> CUs) are available.
>>>>
>>>> This is somewhat an extension to your gang submit idea.
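
Rules 4 to 7 in the list quoted above can be written down as two small decision functions. The following is a user-space sketch, not a proposed kernel API: the enums and the functions wait_allowed(), sketch_check_wait() and classify_dependency() are hypothetical names introduced only for illustration, and "critical section" lumps together MMU notifiers, page faults, GPU schedulers and held dma_reservation locks as in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>

/* The two fence classes discussed in the thread (names are illustrative). */
enum fence_class { FENCE_DMA, FENCE_HMM };

/* Where a wait would happen. */
enum wait_ctx {
	CTX_UNLOCKED,	/* e.g. submission path before any locks are taken */
	CTX_CRITICAL,	/* MMU notifier, page fault, scheduler, resv lock held */
};

/* Rules 4 and 5: inside a critical section only dma_fences may be waited on. */
static bool wait_allowed(enum fence_class cls, enum wait_ctx ctx)
{
	if (ctx == CTX_CRITICAL)
		return cls == FENCE_DMA;
	return true;
}

/* Debug-style check a wait site could use, in the spirit of an annotation. */
static void sketch_check_wait(enum fence_class cls, enum wait_ctx ctx)
{
	assert(wait_allowed(cls, ctx));
}

/* What to do with a dependency when creating a new fence. */
enum dep_action {
	DEP_BLOCK_BEFORE_LOCKS,	/* rule 6: wait before locks/resources */
	DEP_WAIT_LATER,		/* rule 7: busy wait or hardware WAIT command */
	DEP_SAME_CLASS,		/* not covered by rules 6 and 7 */
};

static enum dep_action classify_dependency(enum fence_class dep,
					   enum fence_class created)
{
	if (dep == FENCE_HMM && created == FENCE_DMA)
		return DEP_BLOCK_BEFORE_LOCKS;
	if (dep == FENCE_DMA && created == FENCE_HMM)
		return DEP_WAIT_LATER;
	return DEP_SAME_CLASS;
}
```

Because an hmm_fence dependency is fully waited for before the dma_fence is created (rule 6), the resulting dma_fence itself never depends on an hmm_fence, which is rule 1. Rules 2 and 3 concern hardware preemption and resource reservation and are not modeled here.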
>>> Either I'm missing something, or this is just exactly what we
>>> documented already with userspace fences in general, and how you can't
>>> have a dma_fence depend upon a userspace (or hmm_fence).
>>>
>>> My gang scheduling idea is really just an alternative for what you
>>> have listed as item 3 above. Instead of requiring preempt or requiring
>>> guaranteed forward progress of some other sorts we flush out any
>>> pending dma_fence request. But _only_ those which would get stalled by
>>> the job we're running, so high-priority sdma requests we need in the
>>> kernel to shuffle buffers around are still all ok. This would be
>>> needed if your hw can't preempt, and you also have shared engines
>>> between compute and gfx, so reserving CUs won't solve the problem
>>> either.
>>>
>>> What I don't mean with my gang scheduling is a completely exclusive
>>> mode between hmm_fence and dma_fence, since that would prevent us from
>>> using copy engines and dma_fence in the kernel to shuffle memory
>>> around for hmm jobs. And that would suck, even on compute-only
>>> workloads. Maybe I should rename "gang scheduling" to "engine flush"
>>> or something like that.
>> Yeah, "engine flush" makes it much clearer.
>>
>> What I wanted to emphasize is that we have to mix dma_fences and
>> hmm_fences running at the same time on the same hardware fighting over
>> the same resources.
>>
>> E.g. even on the newest hardware multimedia engines can't handle page
>> faults, so video decoding/encoding will still produce dma_fences.
> Well we also have to mix them so the kernel can shovel data around
> using copy engines. Plus we have to mix it at the overall subsystem
> level because I'm not sure SoC-class gpus will ever get here,
> definitely aren't yet there for sure.
>
>>> I think the basics of userspace or hmm_fence or whatever we'll call it
>>> we've documented already here:
>>>
>>> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences
>> This talks about the restrictions we have for dma_fences and why
>> infinite fences (even as hmm_fence) will never work.
>>
>> But it doesn't talk about how to handle implicit or explicit
>> dependencies with something like hmm_fences.
>>
>> In other words my proposal above allows hmm_fences to show up in
>> dma_reservation objects and to be used together with all the explicit
>> synchronization we still have, with only a medium amount of work :)
> Oh. I don't think we should put any hmm_fence or other infinite fence
> into a dma_resv object. At least not into the current dma_resv object,
> because then we have that infinite fences problem everywhere, and very
> hard to audit.

Yes, exactly. That's why we need these rules on how to mix them, or rather not mix them.

> What we could do is add new hmm_fence only slots for implicit sync,

Yeah, we would keep them separate from the dma_fence objects.

> but I think consensus is that implicit sync is bad, never do it again.
> Last time around (for timeline syncobj) we've also pushed the waiting
> on cross-over to userspace, and I think that's the right option, so we
> need userspace to understand the hmm fence anyway. At that point we
> might as well bite the bullet and do another round of wayland/dri
> protocols.

As you said I don't see this happening in the next 5 years either.

So I think we have to somehow solve this in the kernel or we will go in 
circles all the time.

> So from that pov I think the kernel should at most deal with an
> hmm_fence for cross-process communication and maybe some standard wait
> primitives (for userspace to use, not for the kernel).
>
> The only use case this would forbid is using page faults for legacy
> implicit/explicit dma_fence synced workloads, and I think that's
> perfectly ok to not allow. Especially since the motivation here for
> all this is compute, and compute doesn't pass around dma_fences
> anyway.

As Alex said we will rather soon see this for gfx as well and we most 
likely will see combinations of old dma_fence based integrated graphics 
with new dedicated GPUs.

So I don't think we can say we reduce the problem to compute and don't 
support anything else.

Regards,
Christian.

>
>>> I think the only thing missing is clarifying a bit what you have under
>>> item 3, i.e. how do we make sure there's no accidental hidden
>>> dependency between hmm_fence and dma_fence. Maybe a subsection about
>>> gpu page fault handling?
>> The real improvement is item 6. The problem with it is that it requires
>> auditing all occasions when we create dma_fences so that we don't
>> accidentally depend on an HMM fence.
> We have that rule already, it's the "dma_fence must not depend upon an
> infinite fence anywhere" rule we documented last summer. So that
> doesn't feel new.
> -Daniel
>
>> Regards,
>> Christian.
>>
>>> Or are we still talking past each other a bit here?
>>> -Daniel
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> -Daniel
>>>>>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

