From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Michal Hocko" <mhocko@kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	"Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"David Rientjes" <rientjes@google.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Masahiro Yamada" <yamada.masahiro@socionext.com>,
	"Wei Wang" <wvw@google.com>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Jann Horn" <jannh@google.com>, "Feng Tang" <feng.tang@intel.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Daniel Vetter" <daniel.vetter@intel.com>
Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end()
Date: Thu, 15 Aug 2019 18:25:16 +0200
Message-ID: <CAKMK7uG33FFCGJrDV4-FHT2FWi+Z5SnQ7hoyBQd4hignzm1C-A@mail.gmail.com>
In-Reply-To: <20190815151028.GJ21596@ziepe.ca>

On Thu, Aug 15, 2019 at 5:10 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Thu, Aug 15, 2019 at 04:43:38PM +0200, Daniel Vetter wrote:
>
> > You have to wait for the gpu to finish current processing in
> > invalidate_range_start. Otherwise there's no point to any of this
> > really. So the wait_event/dma_fence_wait are unavoidable really.
>
> I don't envy your task :|
>
> But, what you describe sure sounds like a 'registration cache' model,
> not the 'shadow pte' model of coherency.
>
> The key difference is that a registration cache is allowed to become
> incoherent with the VMAs because it holds page pins. It is a
> programming bug in userspace to change VA mappings via mmap/munmap/etc
> while the device is working on that VA, but it does not harm system
> integrity because of the page pin.
>
> The cache ensures that each initiated operation sees a DMA setup that
> matches the current VA map when the operation is initiated and allows
> expensive device DMA setups to be re-used.
>
> A 'shadow pte' model (ie hmm) *really* needs device support to
> directly block DMA access - ie trigger 'device page fault'. ie the
> invalidate_start should inform the device to enter a fault mode and
> that is it.  If the device can't do that, then the driver probably
> shouldn't pursue this level of coherency. The driver would quickly get
> into the messy locking problems like dma_fence_wait from a notifier.
>
> It is important to identify what model you are going for as defining a
> 'registration cache' coherence expectation allows the driver to skip
> blocking in invalidate_range_start. All it does is invalidate the
> cache so that future operations pick up the new VA mapping.
>
> Intel's HFI RDMA driver uses this model extensively, and I think it is
> well proven, within some limitations of course.
>
> At least, 'registration cache' is the only use model I know of where
> it is acceptable to skip invalidate_range_end.
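
[Illustrative sketch, not part of the original thread.] The
'registration cache' behaviour quoted above can be modelled in plain
user-space C. Every name below (reg_cache, reg_entry, reg_cache_get,
reg_cache_put, reg_cache_invalidate) is hypothetical and is not a
kernel or driver API; pin_count merely stands in for the page pins an
in-flight operation would hold. The point it tries to show is that
invalidation only marks entries stale and never waits for in-flight
work, while the next operation picks up a fresh entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct reg_entry {
	unsigned long va_start;	/* start of the registered VA range */
	unsigned long va_end;	/* end of the range, exclusive */
	int pin_count;		/* stands in for page pins held by in-flight ops */
	bool valid;		/* false once the VA mapping may have changed */
};

#define CACHE_SLOTS 8

struct reg_cache {
	struct reg_entry slots[CACHE_SLOTS];
	size_t used;
};

/*
 * Start an operation: reuse a valid entry for [start, end) if one exists,
 * otherwise build a new one (the "expensive DMA setup"). Returns NULL when
 * the table is full; this sketch never reclaims slots.
 */
struct reg_entry *reg_cache_get(struct reg_cache *c,
				unsigned long start, unsigned long end)
{
	for (size_t i = 0; i < c->used; i++) {
		struct reg_entry *e = &c->slots[i];

		if (e->valid && e->va_start == start && e->va_end == end) {
			e->pin_count++;
			return e;
		}
	}
	if (c->used == CACHE_SLOTS)
		return NULL;

	struct reg_entry *e = &c->slots[c->used++];
	*e = (struct reg_entry){
		.va_start = start, .va_end = end, .pin_count = 1, .valid = true,
	};
	return e;
}

/* Finish an operation: drop the pin it took in reg_cache_get(). */
void reg_cache_put(struct reg_entry *e)
{
	assert(e->pin_count > 0);
	e->pin_count--;
}

/*
 * Models what invalidate_range_start would do under this coherence
 * expectation: mark overlapping entries stale and return immediately.
 * It does not wait for pin_count to reach zero. Returns the number of
 * entries invalidated.
 */
size_t reg_cache_invalidate(struct reg_cache *c,
			    unsigned long start, unsigned long end)
{
	size_t n = 0;

	for (size_t i = 0; i < c->used; i++) {
		struct reg_entry *e = &c->slots[i];

		if (e->valid && e->va_start < end && start < e->va_end) {
			e->valid = false;
			n++;
		}
	}
	return n;
}
```

In this model an operation that started before the invalidation keeps
using its stale but still pinned entry until it calls reg_cache_put(),
whereas a later reg_cache_get() for the same range misses and creates a
new entry. A 'shadow pte' design could not be modelled this way, since
it would have to stop device access before the invalidation returns.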

I'm not really well versed in the details of our userptr, but both
amdgpu and i915 wait for the gpu to complete from
invalidate_range_start. Jerome has at least looked a lot at the amdgpu
one, so maybe he can explain what exactly it is we're doing ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
