From: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
To: Alistair Popple <apopple@nvidia.com>,
David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
"Kim, Dongwon" <dongwon.kim@intel.com>,
"Chang, Junxiao" <junxiao.chang@intel.com>,
"dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>,
Hugh Dickins <hughd@google.com>, Peter Xu <peterx@redhat.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Gerd Hoffmann <kraxel@redhat.com>,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: RE: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)
Date: Fri, 4 Aug 2023 06:39:22 +0000 [thread overview]
Message-ID: <IA0PR11MB7185974FA204015EA3B74066F809A@IA0PR11MB7185.namprd11.prod.outlook.com> (raw)
In-Reply-To: <87cz0364kx.fsf@nvdebian.thelocal>
Hi Alistair, David, Jason,
> >>>>>> Right, the "the zero pages are changed into writable pages" in your
> >>>>>> above comment just might not apply, because there won't be any
> page
> >>>>>> replacement (hopefully :) ).
> >>>>
> >>>>> If the page replacement does not happen when there are new writes
> to the
> >>>>> area where the hole previously existed, then would we still get an
> >>>> invalidate
> >>>>> when this happens? Is there any other way to get notified when the
> zeroed
> >>>>> page is written to if the invalidate does not get triggered?
> >>>>
> >>>> What David is saying is that memfd does not use the zero page
> >>>> optimization for hole punches. Any access to the memory, including
> >>>> read-only access through hmm_range_fault() will allocate unique
> >>>> pages. Since there is no zero page and no zero-page replacement there
> >>>> is no issue with invalidations.
> >>
> >>> It looks like even with hmm_range_fault(), the invalidate does not get
> >>> triggered when the hole is refilled with new pages because of writes.
> >>> This is probably because hmm_range_fault() does not fault in any pages
> >>> that get invalidated later when writes occur.
> >> hmm_range_fault() returns the current content of the VMAs, or it
> >> faults. If it returns pages then it came from one of these two places.
> >> If your VMA is incoherent with what you are doing then you have
> >> bigger
> >> problems, or maybe you found a bug.
>
> Note it will only fault in pages if HMM_PFN_REQ_FAULT is specified. You
> are setting that however you aren't setting HMM_PFN_REQ_WRITE which is
> what would trigger a fault to bring in the new pages. Does setting that
> fix the issue you are seeing?
No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue.
Although, I do not have THP enabled (or built-in), shmem does not evict
the pages after hole punch as noted in the comment in shmem_fallocate():
if ((u64)unmap_end > (u64)unmap_start)
unmap_mapping_range(mapping, unmap_start,
1 + unmap_end - unmap_start, 0);
shmem_truncate_range(inode, offset, offset + len - 1);
/* No need to unmap again: hole-punching leaves COWed pages */
As a result, the pfn is still valid and the pte is pte_present() and pte_write().
This is the reason why adding in HMM_PFN_REQ_WRITE does not help;
because, it fails the below condition in hmm_pte_need_fault():
if ((pfn_req_flags & HMM_PFN_REQ_WRITE) &&
!(cpu_flags & HMM_PFN_WRITE))
return HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT;
If I force it to read-fault or write-fault (by hacking hmm_pte_need_fault()),
it gets indefinitely stuck in the do while loop in hmm_range_fault().
AFAIU, unless there is a way to fault-in zero pages (or any scratch pages)
after hole punch that get invalidated because of writes, I do not see how
using hmm_range_fault() can help with my use-case.
Thanks,
Vivek
>
> >>> The above log messages are seen immediately after the hole is punched.
> As
> >>> you can see, hmm_range_fault() returns the pfns of old pages and not
> zero
> >>> pages. And, I see the below messages (with patch #2 in this series
> applied)
> >>> as the hole is refilled after writes:
> >> I don't know what you are doing, but it is something wrong or you've
> >> found a bug in the memfds.
> >
> >
> > Maybe THP is involved? I recently had to dig that out for an internal
> > discussion:
> >
> > "Currently when truncating shmem file, if the range is partial of THP
> > (start or end is in the middle of THP), the pages actually will just get
> > cleared rather than being freed unless the range cover the whole THP.
> > Even though all the subpages are truncated (randomly or sequentially),
> > the THP may still be kept in page cache. This might be fine for some
> > usecases which prefer preserving THP."
> >
> > My recollection is that this behavior was never changed.
> >
> > https://lore.kernel.org/all/1575420174-19171-1-git-send-email-
> yang.shi@linux.alibaba.com/
next prev parent reply other threads:[~2023-08-04 6:39 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-18 8:28 [RFC v1 0/3] udmabuf: Replace pages when there is FALLOC_FL_PUNCH_HOLE in memfd Vivek Kasireddy
2023-07-18 8:28 ` [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages) Vivek Kasireddy
2023-07-18 15:36 ` Jason Gunthorpe
2023-07-19 0:05 ` Kasireddy, Vivek
2023-07-19 0:24 ` Jason Gunthorpe
2023-07-19 6:19 ` Kasireddy, Vivek
2023-07-19 2:08 ` Alistair Popple
2023-07-20 7:43 ` Kasireddy, Vivek
2023-07-20 9:00 ` Alistair Popple
2023-07-24 7:54 ` Kasireddy, Vivek
2023-07-24 13:35 ` Jason Gunthorpe
2023-07-24 20:32 ` Kasireddy, Vivek
2023-07-25 4:30 ` Hugh Dickins
2023-07-25 22:24 ` Kasireddy, Vivek
2023-07-27 21:43 ` Peter Xu
2023-07-29 0:08 ` Kasireddy, Vivek
2023-07-31 17:05 ` Peter Xu
2023-08-01 7:11 ` Kasireddy, Vivek
2023-08-01 21:57 ` Peter Xu
2023-08-03 8:08 ` Kasireddy, Vivek
2023-08-03 13:02 ` Peter Xu
2023-07-25 12:36 ` Jason Gunthorpe
2023-07-25 22:44 ` Kasireddy, Vivek
2023-07-25 22:53 ` Jason Gunthorpe
2023-07-27 7:34 ` Kasireddy, Vivek
2023-07-27 11:58 ` Jason Gunthorpe
2023-07-29 0:46 ` Kasireddy, Vivek
2023-07-30 23:09 ` Jason Gunthorpe
2023-08-01 5:32 ` Kasireddy, Vivek
2023-08-01 12:19 ` Jason Gunthorpe
2023-08-01 12:22 ` David Hildenbrand
2023-08-01 12:23 ` Jason Gunthorpe
2023-08-01 12:26 ` David Hildenbrand
2023-08-01 12:26 ` Jason Gunthorpe
2023-08-01 12:28 ` David Hildenbrand
2023-08-01 17:53 ` Kasireddy, Vivek
2023-08-01 18:19 ` Jason Gunthorpe
2023-08-03 7:35 ` Kasireddy, Vivek
2023-08-03 12:14 ` Jason Gunthorpe
2023-08-03 12:32 ` David Hildenbrand
2023-08-04 0:14 ` Alistair Popple
2023-08-04 6:39 ` Kasireddy, Vivek [this message]
2023-08-04 7:23 ` David Hildenbrand
2023-08-04 21:53 ` Kasireddy, Vivek
2023-08-04 12:49 ` Jason Gunthorpe
2023-08-08 7:37 ` Kasireddy, Vivek
2023-08-08 12:42 ` Jason Gunthorpe
2023-08-16 6:43 ` Kasireddy, Vivek
2023-08-21 9:02 ` Alistair Popple
2023-08-22 6:14 ` Kasireddy, Vivek
2023-08-22 8:15 ` Alistair Popple
2023-08-24 6:48 ` Kasireddy, Vivek
2023-08-28 4:38 ` Kasireddy, Vivek
2023-08-30 16:02 ` Jason Gunthorpe
2023-07-25 3:38 ` Alistair Popple
2023-07-24 13:36 ` Alistair Popple
2023-07-24 13:37 ` Jason Gunthorpe
2023-07-24 20:42 ` Kasireddy, Vivek
2023-07-25 3:14 ` Alistair Popple
2023-07-18 8:28 ` [RFC v1 2/3] udmabuf: Replace pages when there is FALLOC_FL_PUNCH_HOLE in memfd Vivek Kasireddy
2023-08-02 12:40 ` Daniel Vetter
2023-08-03 8:24 ` Kasireddy, Vivek
2023-08-03 8:32 ` Daniel Vetter
2023-07-18 8:28 ` [RFC v1 3/3] selftests/dma-buf/udmabuf: Add tests for huge pages and FALLOC_FL_PUNCH_HOLE Vivek Kasireddy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=IA0PR11MB7185974FA204015EA3B74066F809A@IA0PR11MB7185.namprd11.prod.outlook.com \
--to=vivek.kasireddy@intel.com \
--cc=apopple@nvidia.com \
--cc=david@redhat.com \
--cc=dongwon.kim@intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=hughd@google.com \
--cc=jgg@nvidia.com \
--cc=junxiao.chang@intel.com \
--cc=kraxel@redhat.com \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
--cc=peterx@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).