linux-rdma.vger.kernel.org archive mirror
From: Daniel Vetter <daniel@ffwll.ch>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Thomas Hellström (Intel)" <thomas_os@shipmail.org>,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	"Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
	"moderated list:DMA BUFFER SHARING FRAMEWORK"
	<linaro-mm-sig@lists.linaro.org>,
	"Thomas Hellstrom" <thomas.hellstrom@intel.com>,
	"Daniel Vetter" <daniel.vetter@intel.com>,
	"open list:DMA BUFFER SHARING FRAMEWORK"
	<linux-media@vger.kernel.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Mika Kuoppala" <mika.kuoppala@intel.com>
Subject: Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations
Date: Wed, 17 Jun 2020 08:48:50 +0200
Message-ID: <CAKMK7uE7DKUo9Z+yCpY+mW5gmKet8ugbF3yZNyHGqsJ=e-g_hA@mail.gmail.com>
In-Reply-To: <20200616120719.GL20149@phenom.ffwll.local>

On Tue, Jun 16, 2020 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> Hi Jason,
>
> Somehow this got stuck somewhere in the mail queues, only popped up just
> now ...
>
> On Thu, Jun 11, 2020 at 11:15:15AM -0300, Jason Gunthorpe wrote:
> > On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> > > > I still have my doubts about allowing fence waiting from within shrinkers.
> > > > IMO ideally they should use a trywait approach, in order to allow memory
> > > > allocation during command submission for drivers that
> > > > publish fences before command submission. (Since early reservation object
> > > > release requires that).
> > >
> > > Yeah it is a bit annoying, e.g. for drm/scheduler I think we'll end up
> > > with a mempool to make sure it can handle its allocations.
> > >
> > > > But since drivers are already waiting from within shrinkers and I take your
> > > > word for HMM requiring this,
> > >
> > > Yeah the big trouble is HMM and mmu notifiers. That's the really awkward
> > > one, the shrinker one is a lot less established.
> >
> > I really question if HW that needs something like DMA fence should
> > even be using mmu notifiers - the best use is HW that can fence the
> > DMA directly without having to get involved with some command stream
> > processing.
> >
> > Or at the very least it should not be a generic DMA fence but a
> > narrowed completion tied only into the same GPU driver's command
> > completion processing which should be able to progress without
> > blocking.
>
> The problem with gpus is that these completions leak across the board like
> mad. Both internally within memory managers (made a lot worse with p2p
> direct access to vram), and through uapi.
>
> Many gpus still have a very hard time preempting, so doing an overall
> switch in drivers/gpu to a memory management model where that is required
> is not a very realistic option.  And minimally you need either preempt
> (still takes a while, but a lot faster generally than waiting for work to
> complete) or hw faults (just a bunch of tlb flushes plus virtual indexed
> caches, so just the caveat of that for a gpu, which has lots and big tlbs
> and caches). So preventing the completion leaks within the kernel is I
> think unrealistic, except if we just say "well sorry, run on windows,
> mkay" for many gpu workloads. Or more realistic "well sorry, run on the
> nvidia blob with nvidia hw".
>
> The userspace side we can somewhat isolate, at least for pure compute
> workloads. But the thing is drivers/gpu is a continuum from tiny socs
> (where dma_fence is a very nice model) to huge compute stuff (where it's
> maybe not the nicest, but hey hw sucks so still needed). Doing a full-on
> break in uapi somewhere in there is at least a bit awkward, e.g. some of
> the media codec code on intel runs all the way from the smallest intel soc
> to the big transcode servers.
>
> So the current status quo is "total mess, every driver defines their own
> rules". All I'm trying to do is some common rules here, to make this mess
> slightly more manageable and overall reviewable and testable.
>
> I have no illusions that this is fundamentally pretty horrible, and the
> leftover wiggle room for writing a memory manager is barely more than a
> hairline. Just not seeing how other options are better.

So bad news is that gpus are horrible, but I think if you don't have
to review gpu drivers it's substantially better. If you do have hw
with full device page fault support, then there's no need to ever
install a dma_fence. Punching out device ptes and flushing caches is
all that's needed. That is also the plan we have, for the workloads
and devices where that's possible.

Now my understanding for rdma is that if you don't have hw page fault
support, then the only other option is to more or less permanently pin
the memory. So again, dma_fences are completely useless, since it's
entirely up to userspace when a given piece of registered memory isn't
needed anymore, and the entire problem boils down to how much do we
allow random userspace to just pin (system or device) memory. Or at
least I don't really see any other solution.

On the other end we have simpler devices like video input/output.
Those always need pinned memory, but through hw design it's limited in
how much you can pin (generally max resolution times a limited set of
buffers to cycle through). Just including that memory pinning
allowance as part of device access makes sense.
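
As a rough illustration of why that pinning allowance is naturally
bounded, here is a minimal sketch in plain C. The struct, its field
names and the example numbers are all hypothetical, not taken from any
real driver; it only spells out "max resolution times a limited set of
buffers".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hardware limits of a video input/output device. */
struct video_limits {
	uint32_t max_width;       /* pixels */
	uint32_t max_height;      /* pixels */
	uint32_t bytes_per_pixel; /* largest supported pixel format */
	uint32_t max_buffers;     /* buffers the hw can cycle through */
};

/*
 * Upper bound on pinned memory: one frame at maximum resolution,
 * times the number of buffers the device can cycle through.
 * Computed in 64 bits so the product cannot wrap for sane limits.
 */
static uint64_t pin_budget_bytes(const struct video_limits *l)
{
	assert(l != 0);
	return (uint64_t)l->max_width * l->max_height *
	       l->bytes_per_pixel * l->max_buffers;
}
```

With made-up limits of 1920x1080, 4 bytes per pixel and 4 buffers this
gives 33177600 bytes, i.e. a bit under 32 MiB per device.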

It's only gpus (I think) which are in this awkward in-between spot
where dynamic memory management really is much wanted, but the hw
kinda sucks. Aside, about 10+ years ago we had a similar problem with
gpu hw, but for security: Many gpus didn't have any kind of page
tables to isolate different clients from one another. drivers/gpu
fixed this by parsing & validating what userspace submitted to make sure
it's only ever accessing its own buffers. Most gpus have become
reasonable nowadays and do have proper per-process pagetables (gpu
process, not the pasid stuff), but even today there's still some of
the old model left in some of the smallest SoCs.
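
To make the parse-and-validate idea a bit more concrete, here is a
minimal sketch in plain C. It is not how any particular driver does it:
the types, names and the flat address-range model are assumptions made
up for illustration, and real command streams need per-opcode parsing
on top of this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer owned by one client, as a range of gpu addresses. */
struct client_buf {
	uint64_t start;
	uint64_t size;
};

/* Hypothetical parsed command: the gpu will access [addr, addr + len). */
struct gpu_cmd {
	uint64_t addr;
	uint64_t len;
};

/* True if [addr, addr + len) lies fully inside b; written to avoid overflow. */
static bool range_in_buf(const struct client_buf *b, uint64_t addr, uint64_t len)
{
	uint64_t off;

	if (addr < b->start)
		return false;
	off = addr - b->start;
	return off <= b->size && len <= b->size - off;
}

/* Reject the whole submission if any command leaves the client's buffers. */
static bool validate_submission(const struct gpu_cmd *cmds, size_t ncmds,
				const struct client_buf *bufs, size_t nbufs)
{
	assert(cmds != NULL || ncmds == 0);
	for (size_t i = 0; i < ncmds; i++) {
		bool ok = false;

		for (size_t j = 0; j < nbufs && !ok; j++)
			ok = range_in_buf(&bufs[j], cmds[i].addr, cmds[i].len);
		if (!ok)
			return false;
	}
	return true;
}
```

The point of the sketch is only that without page tables the kernel has
to check every access up front, since the hw will not stop a stray one.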

tl;dr of all this: gpus kinda suck sometimes, but that's also not news :-/

Cheers, Daniel

> > The intent of notifiers was never to endlessly block while vast
> > amounts of SW does work.
> >
> > Going around and switching everything in a GPU to GFP_ATOMIC seems
> > like a bad idea.
>
> It's not everyone, or at least not everywhere, it's some fairly limited
> cases. Also, even if we drop the mmu_notifier on the floor, then we're
> stuck with shrinkers and GFP_NOFS. Still need a mempool of some sort to
> guarantee you get out of a bind, so not much better.
>
> At least that's my current understanding of where we are across all
> drivers.
>
> > > I've pinged a bunch of armsoc gpu driver people and asked them how much this
> > > hurts, so that we have a clear answer. On x86 I don't think we have much
> > > of a choice on this, with userptr in amd and i915 and hmm work in nouveau
> > > (but nouveau I think doesn't use dma_fence in there).
> >
> > Right, nor will RDMA ODP.
>
> Hm, what's the context here? I thought RDMA side you really don't want
> dma_fence in mmu_notifiers, so not clear to me what you're agreeing on
> here.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
