From: Felix Kuehling <felix.kuehling@amd.com>
To: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/2] drm/ttm: Don't evict SG BOs
Date: Wed, 28 Apr 2021 13:02:31 -0400
Message-ID: <55742179-98d9-d68a-30b7-331885fd91e0@amd.com>
In-Reply-To: <6946e644-0a16-30fe-e987-861bec610762@gmail.com>

On 2021-04-28 at 12:58 p.m., Christian König wrote:
> On 28.04.21 at 18:49, Felix Kuehling wrote:
>> On 2021-04-28 at 12:33 p.m., Christian König wrote:
>>> On 28.04.21 at 17:19, Felix Kuehling wrote:
>>> [SNIP]
>>>>>> Failing that, I'd probably have to abandon userptr BOs altogether
>>>>>> and switch system memory mappings over to using the new SVM API
>>>>>> on systems where it is available.
>>>>> Well, as long as that provides the necessary functionality through
>>>>> HMM it would be an option.
>>>> Just another way of circumventing "It should limit the amount of
>>>> system memory the GPU can access at the same time," a premise I
>>>> disagree with in the case of userptrs and HMM. Both use pageable,
>>>> unpinned memory. Both can cause the GPU to be preempted in case of
>>>> MMU interval notifiers.
>>> Well, that's the key point. GFX userptrs and DMA-buf imports can't
>>> be preempted.
>> But they don't need to be. They don't use any resources on the
>> importing GPU or system memory, so why do we limit them?
>
> Yeah, but at least user pointers effectively pin their backing store
> as long as the GPU operation is running.
>
>> With dynamic attachment, the exported BOs can be evicted and that
>> affects the imports as well. I don't see why the import needs to be
>> evicted as if there was some resource limitation on the importing
>> GPU.
>
> It prevents multiple DMA-buf imports from being active at the same
> time.
>
> See the following example: GTT space is 1 GiB and we have two DMA-buf
> imports of 600 MiB each.
>
> When userspace wants to submit work using both at the same time, we
> return -ENOSPC (or -ENOMEM, not 100% sure).
>
> When one is in use and a submission is made with the other, we block
> until that submission is completed.
>
> This way there is never more than 1 GiB of memory in use or "pinned"
> by the GPU using it.

Is this reasonable for imports of VRAM in a multi-GPU system? E.g. you
allocate 600 MB on GPU A and 600 MB on GPU B. You export both and
import them on the other GPU because you want both GPUs to access each
other's memory. This is a common use case for KFD, and something we
want to implement for upstreamable PCIe P2P support.

With your limitation, I will never be able to validate both BOs and
run KFD user mode queues in the above scenario.

Regards,
  Felix

>
>>> So they basically lock the backing memory until the last submission
>>> is completed, and that is causing problems if it happens for too
>>> much memory at the same time.
>>>
>>> What we could do is to figure out in the eviction_valuable callback
>>> whether the BO is preemptible or not.
>> Then we should also not count them in mgr->available. Otherwise not
>> evicting these BOs can block other GTT allocations. Again, maybe it's
>> easier to use a different domain for preemptible BOs.
>
> Good point. That would also be valuable when we get user queues at
> some point.
>
> Regards,
> Christian.
>
>>
>> Regards,
>>   Felix
>>
>>
>>> Regards,
>>> Christian.
>>>
>>>> Statically limiting the amount of pageable memory accessible to
>>>> GTT is redundant and overly limiting.
>>>>
>>>> Regards,
>>>>   Felix
>>>>
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> Regards,
>>>>>>   Felix
>>>>>>
>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>>>>>> ---
>>>>>>>>   drivers/gpu/drm/ttm/ttm_bo.c | 4 ++++
>>>>>>>>   1 file changed, 4 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> index de1ec838cf8b..0b953654fdbf 100644
>>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> @@ -655,6 +655,10 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>>>>>>>>   		list_for_each_entry(bo, &man->lru[i], lru) {
>>>>>>>>   			bool busy;
>>>>>>>>   
>>>>>>>> +			/* Don't evict SG BOs */
>>>>>>>> +			if (bo->ttm && bo->ttm->sg)
>>>>>>>> +				continue;
>>>>>>>> +
>>>>>>>>   			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
>>>>>>>>   							    &busy)) {
>>>>>>>>   				if (busy && !busy_bo && ticket !=

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel