From: Daniel Vetter <daniel@ffwll.ch>
To: "Christian König" <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@linux.ie>,
	Roland Scheidegger <sroland@vmware.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Huang Rui <ray.huang@amd.com>,
	VMware Graphics <linux-graphics-maintainer@vmware.com>
Subject: Re: [PATCH 1/2] drm/ttm: rework ttm_tt page limit v2
Date: Thu, 17 Dec 2020 16:45:55 +0100
Message-ID: <CAKMK7uGgoeF8LmFBwWh5mW1k4xWjuUh3hdSFpVH1NBM7K0=edA@mail.gmail.com>
In-Reply-To: <0e50223b-d851-fdaa-d25e-5402d14444af@amd.com>

On Thu, Dec 17, 2020 at 4:36 PM Christian König
<christian.koenig@amd.com> wrote:
> Am 17.12.20 um 16:26 schrieb Daniel Vetter:
> > On Thu, Dec 17, 2020 at 4:10 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 17.12.20 um 15:36 schrieb Daniel Vetter:
> >>> On Thu, Dec 17, 2020 at 2:46 PM Christian König
> >>> <ckoenig.leichtzumerken@gmail.com> wrote:
> >>>> Am 16.12.20 um 16:09 schrieb Daniel Vetter:
> >>>>> On Wed, Dec 16, 2020 at 03:04:26PM +0100, Christian König wrote:
> >>>>> [SNIP]
> >>>>>> +
> >>>>>> +/* As long as pages are available make sure to release at least one */
> >>>>>> +static unsigned long ttm_tt_shrinker_scan(struct shrinker *shrink,
> >>>>>> +                                      struct shrink_control *sc)
> >>>>>> +{
> >>>>>> +    struct ttm_operation_ctx ctx = {
> >>>>>> +            .no_wait_gpu = true
> >>>>> Iirc there's an eventual shrinker limit where it gets desperate. I think
> >>>>> once we hit that, we should allow gpu waits. But it's not passed to
> >>>>> shrinkers for reasons, so maybe we should have a second round that tries
> >>>>> to more actively shrink objects if we fell substantially short of what
> >>>>> reclaim expected us to do?
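
Rough sketch of the two-round idea above, purely illustrative:
ttm_tt_shrink_one() here is a made-up stand-in for whatever helper ends
up doing the actual swapout, only the ttm_operation_ctx and
shrink_control parts are real.

static unsigned long ttm_tt_shrinker_scan(struct shrinker *shrink,
					  struct shrink_control *sc)
{
	struct ttm_operation_ctx ctx = { .no_wait_gpu = true };
	unsigned long freed;

	/* First round: only reclaim what we can free without waiting
	 * for the GPU. */
	freed = ttm_tt_shrink_one(&ctx, sc->nr_to_scan);
	if (freed >= sc->nr_to_scan / 2)
		return freed;

	/* Desperate second round: we fell well short of what reclaim
	 * asked for, so waiting on fences is the lesser evil compared
	 * to heading towards OOM. */
	ctx.no_wait_gpu = false;
	freed += ttm_tt_shrink_one(&ctx, sc->nr_to_scan - freed);

	return freed;
}
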
> >>>> I think we should try to avoid waiting for the GPU in the shrinker callback.
> >>>>
> >>>> When we get HMM we will have cases where the shrinker is called from
> >>>> there and we can't wait for the GPU then without causing deadlocks.
> >>> Uh, that doesn't work. Also, the current rules are that you are allowed
> >>> to call dma_fence_wait from shrinker callbacks, so that ship has sailed
> >>> already. This is because shrinkers are a less restrictive context than
> >>> mmu notifier invalidation, and we wait in there too.
> >>>
> >>> So if you can't wait in shrinkers, you also can't wait in mmu
> >>> notifiers (and also not in HMM, which is the same thing). Why do you
> >>> need this?
> >> The core concept of HMM is that pages are faulted in on demand and it is
> >> perfectly valid for one of those pages to be on disk.
> >>
> >> So when a page fault happens we might need to be able to allocate memory
> >> and fetch something from disk to handle that.
> >>
> >> When this memory allocation then in turn waits for the GPU which is
> >> running the HMM process, we are pretty much busted.
> > Yeah, you can't do that. That's the entire infinite fences discussion.
>
> Yes, exactly.
>
> > For HMM to work, we need to stop using dma_fence for userspace sync,
>
> I was considering separating that into a dma_fence and a hmm_fence.
> Or something like this.

The trouble is that dma_fence in all its forms is uapi. And on gpus
without page fault support dma_fence_wait is still required in
allocation contexts. So creating a new kernel structure doesn't really
solve anything, I think; it needs entirely new uapi, completely
decoupled from memory management. The last time we did new uapi was
probably modifiers, and that's still not rolled out years later.

> > and you can only use the amdkfd style preempt fences. And preempting
> > while the pagefault is pending is, I thought, something we require.
>
> Yeah, problem is that most hardware can't do that :)
>
> Getting page faults to work is hard enough; preempting while waiting for
> a fault to return is not something which was anticipated :)

Hm, last summer in a thread you said you'd blocked that because it
doesn't work. I agreed; page fault without preempt is rather tough to
make work.

> > Iow, the HMM page fault handler must not be a dma-fence critical
> > section, i.e. it's not allowed to hold up any dma_fence, ever.
>
> What do you mean with that?

dma_fence_begin_signalling()/dma_fence_end_signalling() annotations
essentially, i.e. cross-release dependencies. Or the other way round: if
you want to be able to allocate memory, you have to guarantee that you're
never holding up a dma_fence.
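
Concretely, a minimal sketch of what the annotation looks like; the
driver job struct and function here are made up, only the dma_fence
calls are the real interfaces:

static void my_driver_job_done(struct my_driver_job *job)
{
	bool cookie;

	/*
	 * Everything between begin/end is a fence signalling critical
	 * section: lockdep will complain if we allocate memory with
	 * reclaim allowed, take a lock that's also held around
	 * dma_fence_wait(), and so on.
	 */
	cookie = dma_fence_begin_signalling();

	/* no GFP_KERNEL allocations, no waiting on other fences here */
	dma_fence_signal(job->done_fence);

	dma_fence_end_signalling(cookie);
}
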
-Daniel

> > One consequence of this is that you can use HMM for compute, but until
> > we've revamped all the linux winsys layers, not for gl/vk. Or at least
> > I'm not seeing how.
> >
> > Also like I said, dma_fence_wait is already allowed in mmu notifiers,
> > so we've already locked down these semantics even more. Due to the
> > nesting of gfp allocation contexts, allowing dma_fence_wait in mmu
> > notifiers (i.e. __GFP_ALLOW_RECLAIM or whatever the flag is exactly)
> > implies it's allowed in shrinkers. And only if you forbid it from
> > all allocation contexts (which makes all buffer object managed gpu
> > memory essentially pinned, exactly what you're trying to lift here) do
> > you get what you want.
> >
> > The other option is to make HMM and dma-buf completely disjoint worlds
> > with no overlap, and use gang scheduling on the gpu (to guarantee that
> > there's never any dma_fence in pending state while an HMM task might
> > cause a fault).
> >
> >> [SNIP]
> >>> So where do you want to recurse here?
> >> I wasn't aware that without __GFP_FS shrinkers are not called.
> > Maybe double check, but that's at least my understanding. GFP flags
> > are flags, but in reality it's a strictly nesting hierarchy:
> > GFP_KERNEL > GFP_NOFS > GFP_NOIO > GFP_RECLAIM > GFP_ATOMIC (ok atomic
> > is special, since it's allowed to dip into emergency reserve).
>
> I'll read up on that over the holidays.
>
> Christian.



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch