From: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	"Kim, Dongwon" <dongwon.kim@intel.com>,
	David Hildenbrand <david@redhat.com>,
	"Chang, Junxiao" <junxiao.chang@intel.com>,
	Hugh Dickins <hughd@google.com>, Peter Xu <peterx@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"Mike Kravetz" <mike.kravetz@oracle.com>
Subject: RE: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)
Date: Tue, 25 Jul 2023 22:44:09 +0000
Message-ID: <CH3PR11MB7177FA18562FCED8A3171007F803A@CH3PR11MB7177.namprd11.prod.outlook.com>
In-Reply-To: <ZL/B6yvO1bIkFRcF@nvidia.com>

Hi Jason,

> > >
> > > > > I'm not at all familiar with the udmabuf use case but that sounds
> > > > > brittle and effectively makes this notifier udmabuf specific right?
> > > > Oh, Qemu uses the udmabuf driver to provide Host Graphics components
> > > > (such as Spice, Gstreamer, UI, etc) zero-copy access to Guest created
> > > > buffers. In other words, from a core mm standpoint, udmabuf just
> > > > collects a bunch of pages (associated with buffers) scattered inside
> > > > the memfd (Guest ram backed by shmem or hugetlbfs) and wraps
> > > > them in a dmabuf fd. And, since we provide zero-copy access, we
> > > > use DMA fences to ensure that the components on the Host and
> > > > Guest do not access the buffer simultaneously.
> > >
> > > So why do you need to track updates proactively like this?
> > As David noted in the earlier series, if Qemu punches a hole in its memfd
> > that goes through pages that are registered against a udmabuf fd, then
> > udmabuf needs to update its list with new pages when the hole gets
> > filled after (guest) writes. Otherwise, we'd run into the coherency
> > problem (between udmabuf and memfd) as demonstrated in the selftest
> > (patch #3 in this series).
> 
> Holes created in a VMA are tracked by invalidation; you haven't
> explained why this needs to also see the change.
Oh, the invalidation part is ok and does not need any changes. My concern
(and the reason for this new notifier patch) is only about the lack of a
notification when a PTE is updated because of a fault (new page). In other
words, if something like change_pte() were called after handle_pte_fault()
or hugetlb_fault(), this patch would not be needed.
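
For context, this is the existing ->change_pte op I am alluding to. As far
as I can tell, it only fires via set_pte_at_notify() from page-replacement
paths (e.g. wp_page_copy(), KSM), so a subscriber never hears about PTEs
installed by ordinary faults:

	/* include/linux/mmu_notifier.h: only invoked through
	 * set_pte_at_notify(), i.e. COW/KSM page replacement, not
	 * after handle_pte_fault() or hugetlb_fault().
	 */
	void (*change_pte)(struct mmu_notifier *subscription,
			   struct mm_struct *mm,
			   unsigned long address,
			   pte_t pte);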

> 
> BTW it is very jarring to hear you talk about files when working with
> mmu notifiers. MMU notifiers do not track hole punches or memfds; they
> track VMAs and PTEs. Punching a hole in an mmapped memfd will
> invalidate the covering PTEs.
I figured describing the problem in terms of memfds or hole punches would
provide more context; but, ok, I'll refrain from mentioning memfds or holes
and limit the discussion of this patch to VMAs and PTEs. 

> 
> > > Trigger a move when the backing memory changes and re-acquire it with
> > AFAICS, without this patch or adding new change_pte calls, there is no way
> > to get notified when a new page is mapped into the backing memory of a
> > memfd (backed by shmem or hugetlbfs), which happens after a hole punch
> > followed by writes.
> 
> Yes, we have never wanted to do this because it is racy.
> 
> If you still need the memory mapped then you re-call hmm_range_fault
> and re-obtain it. hmm_range_fault will resolve all the races and you
> get new pages.
IIUC, for my udmabuf use-case, it looks like calling hmm_range_fault()
immediately after an invalidate (range notification) would preemptively fault
in new pages before a write. The problem with that is that if a read occurs
on those new pages, the data is incorrect, as a write may not have happened
yet. Ideally, what I am looking for is to get new pages at the time of, or
after, a write; until then, it is ok to keep using the old pages, given my
use-case.
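
For completeness, here is my understanding of the re-acquire pattern you
are describing, following the documented hmm_range_fault() usage (the
notifier is a mmu_interval_notifier we would have registered over the
range; driver locking and error handling trimmed):

	unsigned long pfns[NPAGES];	/* one entry per page in the range */
	struct hmm_range range = {
		.notifier	= &notifier,	/* our mmu_interval_notifier */
		.start		= start,
		.end		= end,
		.hmm_pfns	= pfns,
		.default_flags	= HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE,
	};
	int ret;

again:
	range.notifier_seq = mmu_interval_read_begin(range.notifier);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret == -EBUSY)
		goto again;
	/* take the driver lock that serializes against invalidation, then: */
	if (mmu_interval_read_retry(range.notifier, range.notifier_seq))
		goto again;
	/* pfns[] now describes the current pages backing [start, end) */

The notifier_seq/read_retry dance is what resolves the races, as I
understand it, but it still faults pages in eagerly rather than at write
time.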

> 
> > We can definitely get notified when a hole is punched via the
> > invalidate notifiers, but as I described earlier this is not very helpful
> > for the udmabuf use-case.
> 
> I still don't understand why, or what makes udmabuf so special
> compared to all the other places tracking VMA changes and using
> hmm_range_fault.
I think the difference comes down to whether we (the udmabuf driver) grab
the new pages reactively, after being notified about a PTE update caused by
a fault triggered by a write, or obtain them proactively by triggering the
fault ourselves (since hmm_range_fault() appears to call handle_mm_fault())
before a potential write.
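
To make that contrast concrete, a rough sketch of the reactive flavor I
have in mind (names and fields are illustrative, not the actual RFC code;
refcounting and locking elided):

	static void udmabuf_update_page(struct mmu_notifier *mn,
					struct mm_struct *mm,
					unsigned long address,
					pte_t pte)
	{
		/* Hypothetical: assumes udmabuf kept the notifier and
		 * the base address of the mapped range.
		 */
		struct udmabuf *ubuf = container_of(mn, struct udmabuf,
						    notifier);
		pgoff_t idx = (address - ubuf->vma_start) >> PAGE_SHIFT;

		/* Swap in the page the write fault just installed. */
		ubuf->pages[idx] = pte_page(pte);
	}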

Thanks,
Vivek

> 
> Jason



Thread overview: 64+ messages
2023-07-18  8:28 [RFC v1 0/3] udmabuf: Replace pages when there is FALLOC_FL_PUNCH_HOLE in memfd Vivek Kasireddy
2023-07-18  8:28 ` [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages) Vivek Kasireddy
2023-07-18 15:36   ` Jason Gunthorpe
2023-07-19  0:05     ` Kasireddy, Vivek
2023-07-19  0:24       ` Jason Gunthorpe
2023-07-19  6:19         ` Kasireddy, Vivek
2023-07-19  2:08   ` Alistair Popple
2023-07-20  7:43     ` Kasireddy, Vivek
2023-07-20  9:00       ` Alistair Popple
2023-07-24  7:54         ` Kasireddy, Vivek
2023-07-24 13:35           ` Jason Gunthorpe
2023-07-24 20:32             ` Kasireddy, Vivek
2023-07-25  4:30               ` Hugh Dickins
2023-07-25 22:24                 ` Kasireddy, Vivek
2023-07-27 21:43                   ` Peter Xu
2023-07-29  0:08                     ` Kasireddy, Vivek
2023-07-31 17:05                       ` Peter Xu
2023-08-01  7:11                         ` Kasireddy, Vivek
2023-08-01 21:57                           ` Peter Xu
2023-08-03  8:08                             ` Kasireddy, Vivek
2023-08-03 13:02                               ` Peter Xu
2023-07-25 12:36               ` Jason Gunthorpe
2023-07-25 22:44                 ` Kasireddy, Vivek [this message]
2023-07-25 22:53                   ` Jason Gunthorpe
2023-07-27  7:34                     ` Kasireddy, Vivek
2023-07-27 11:58                       ` Jason Gunthorpe
2023-07-29  0:46                         ` Kasireddy, Vivek
2023-07-30 23:09                           ` Jason Gunthorpe
2023-08-01  5:32                             ` Kasireddy, Vivek
2023-08-01 12:19                               ` Jason Gunthorpe
2023-08-01 12:22                                 ` David Hildenbrand
2023-08-01 12:23                                   ` Jason Gunthorpe
2023-08-01 12:26                                     ` David Hildenbrand
2023-08-01 12:26                                       ` Jason Gunthorpe
2023-08-01 12:28                                         ` David Hildenbrand
2023-08-01 17:53                                           ` Kasireddy, Vivek
2023-08-01 18:19                                             ` Jason Gunthorpe
2023-08-03  7:35                                               ` Kasireddy, Vivek
2023-08-03 12:14                                                 ` Jason Gunthorpe
2023-08-03 12:32                                                   ` David Hildenbrand
2023-08-04  0:14                                                     ` Alistair Popple
2023-08-04  6:39                                                       ` Kasireddy, Vivek
2023-08-04  7:23                                                         ` David Hildenbrand
2023-08-04 21:53                                                           ` Kasireddy, Vivek
2023-08-04 12:49                                                         ` Jason Gunthorpe
2023-08-08  7:37                                                           ` Kasireddy, Vivek
2023-08-08 12:42                                                             ` Jason Gunthorpe
2023-08-16  6:43                                                               ` Kasireddy, Vivek
2023-08-21  9:02                                                                 ` Alistair Popple
2023-08-22  6:14                                                                   ` Kasireddy, Vivek
2023-08-22  8:15                                                                     ` Alistair Popple
2023-08-24  6:48                                                                       ` Kasireddy, Vivek
2023-08-28  4:38                                                                         ` Kasireddy, Vivek
2023-08-30 16:02                                                                           ` Jason Gunthorpe
2023-07-25  3:38             ` Alistair Popple
2023-07-24 13:36           ` Alistair Popple
2023-07-24 13:37             ` Jason Gunthorpe
2023-07-24 20:42             ` Kasireddy, Vivek
2023-07-25  3:14               ` Alistair Popple
2023-07-18  8:28 ` [RFC v1 2/3] udmabuf: Replace pages when there is FALLOC_FL_PUNCH_HOLE in memfd Vivek Kasireddy
2023-08-02 12:40   ` Daniel Vetter
2023-08-03  8:24     ` Kasireddy, Vivek
2023-08-03  8:32       ` Daniel Vetter
2023-07-18  8:28 ` [RFC v1 3/3] selftests/dma-buf/udmabuf: Add tests for huge pages and FALLOC_FL_PUNCH_HOLE Vivek Kasireddy
