From: Jason Gunthorpe <jgg@ziepe.ca>
To: Felix Kuehling <felix.kuehling@amd.com>
Cc: linux-rdma <linux-rdma@vger.kernel.org>,
"Thomas Hellström (Intel)" <thomas_os@shipmail.org>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
LKML <linux-kernel@vger.kernel.org>,
"DRI Development" <dri-devel@lists.freedesktop.org>,
"moderated list:DMA BUFFER SHARING FRAMEWORK"
<linaro-mm-sig@lists.linaro.org>,
"Jerome Glisse" <jglisse@redhat.com>,
"Thomas Hellstrom" <thomas.hellstrom@intel.com>,
"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Daniel Vetter" <daniel.vetter@intel.com>,
"open list:DMA BUFFER SHARING FRAMEWORK"
<linux-media@vger.kernel.org>,
"Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
"Christian König" <christian.koenig@amd.com>,
"Mika Kuoppala" <mika.kuoppala@intel.com>
Subject: Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations
Date: Fri, 19 Jun 2020 16:55:38 -0300
Message-ID: <20200619195538.GT6578@ziepe.ca>
In-Reply-To: <56008d64-772d-5757-6136-f20591ef71d2@amd.com>
On Fri, Jun 19, 2020 at 03:48:49PM -0400, Felix Kuehling wrote:
> Am 2020-06-19 um 2:18 p.m. schrieb Jason Gunthorpe:
> > On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
> >> On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> >>> On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
> >>>
> >>>> The madness is only that device B's mmu notifier might need to wait
> >>>> for fence_B so that the dma operation finishes. Which in turn has to
> >>>> wait for device A to finish first.
> >>> So it sounds like, fundamentally, you've got this graph of operations across
> >>> an unknown set of drivers and the kernel cannot insert itself in
> >>> dma_fence hand offs to re-validate any of the buffers involved?
> >>> Buffers which by definition cannot be touched by the hardware yet.
> >>>
> >>> That really is a pretty horrible place to end up..
> >>>
> >>> Pinning really is the right answer for this kind of workflow. I think
> >>> converting pinning to notifiers should not be done unless notifier
> >>> invalidation is relatively bounded.
> >>>
> >>> I know people like notifiers because they give a bit nicer performance
> >>> in some happy cases, but this cripples all the bad cases..
> >>>
> >>> If pinning doesn't work for some reason maybe we should address that?
> >> Note that the dma fence is only true for user ptr buffer which predate
> >> any HMM work and thus were using mmu notifier already. You need the
> >> mmu notifier there because of fork and other corner cases.
> > I wonder if we should try to fix the fork case more directly - RDMA
> > has this same problem and added MADV_DONTFORK a long time ago as a
> > hacky way to deal with it.
> >
> > Some crazy page pin that resolved COW in a way that always kept the
> > physical memory with the mm that initiated the pin?
> >
> > (isn't this broken for O_DIRECT as well anyhow?)
> >
> > How does mmu_notifiers help the fork case anyhow? Block fork from
> > progressing?
>
> How much the mmu_notifier blocks fork progress depends on how quickly we
> can preempt GPU jobs accessing affected memory. If we don't have
> fine-grained preemption capability (graphics), the best we can do is
> wait for the GPU jobs to complete. We can also delay submission of new
> GPU jobs to the same memory until the MMU notifier is done. Future jobs
> would use the new page addresses.
>
> With fine-grained preemption (ROCm compute), we can preempt GPU work on
> the affected address space to minimize the delay seen by fork.
>
> With recoverable device page faults, we can invalidate GPU page table
> entries, so device access to the affected pages stops immediately.
>
> In all cases, the end result is that the device page table gets updated
> with the address of the copied pages before the GPU accesses the COW
> memory again. Without the MMU notifier, we'd end up with the GPU
> corrupting memory of the other process.

The model here in fork has been wrong for a long time, and I do wonder
how O_DIRECT manages to not be broken too. I guess the time windows
there are too small to get unlucky.

If you have a write pin on a page then it should not be COW'd into the
fork'd process; it should be copied instead, with the originating page
remaining with the original mm.

I wonder if there is some easy way to achieve that. If that is the
main reason to use notifiers then it would be a better solution.

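For reference, the MADV_DONTFORK workaround mentioned earlier in the
thread looks roughly like this from userspace. This is only a sketch of
the existing hack, not of the pin semantics proposed above: it keeps the
page with the original mm by hiding the whole range from the child
rather than copying it. The helper name dontfork_demo() is made up for
illustration, and the check assumes Linux with glibc.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for mincore() and MADV_DONTFORK on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not from the thread).
 * Returns 0 if the child did not inherit the MADV_DONTFORK range and the
 * parent kept its data, 1 if either check failed, -1 on setup failure.
 */
int dontfork_demo(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	assert(pagesz > 0);
	size_t len = (size_t)pagesz;

	unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	memset(buf, 0xab, len);	/* fault the page in, like a buffer handed to a device */

	/* Ask the kernel not to duplicate this range into children at fork(). */
	if (madvise(buf, len, MADV_DONTFORK) != 0) {
		munmap(buf, len);
		return -1;
	}

	pid_t pid = fork();
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}
	if (pid == 0) {
		/* Child: the range should be unmapped, so mincore() fails with ENOMEM. */
		unsigned char vec;
		int rc = mincore(buf, len, &vec);
		_exit((rc == -1 && errno == ENOMEM) ? 0 : 1);
	}

	int status;
	if (waitpid(pid, &status, 0) < 0) {
		munmap(buf, len);
		return -1;
	}
	/* Parent: still owns the original page and its contents. */
	int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 && buf[0] == 0xab;
	munmap(buf, len);
	return ok ? 0 : 1;
}
```

The obvious downside is that the child cannot touch the range at all,
which is why it only counts as a hacky fix for the fork case.
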
Jason