From: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>,
"Kim, Dongwon" <dongwon.kim@intel.com>,
"Chang, Junxiao" <junxiao.chang@intel.com>,
"dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>,
Alistair Popple <apopple@nvidia.com>,
Hugh Dickins <hughd@google.com>, Peter Xu <peterx@redhat.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"Gerd Hoffmann" <kraxel@redhat.com>,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: RE: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)
Date: Thu, 3 Aug 2023 07:35:51 +0000 [thread overview]
Message-ID: <IA0PR11MB7185304345516521FA3005C2F808A@IA0PR11MB7185.namprd11.prod.outlook.com> (raw)
In-Reply-To: <ZMlMoRIkPoO0gG3B@nvidia.com>
Hi Jason,
> > > Right, the "the zero pages are changed into writable pages" in your
> > > above comment just might not apply, because there won't be any page
> > > replacement (hopefully :) ).
>
> > If the page replacement does not happen when there are new writes to the
> > area where the hole previously existed, then would we still get an
> invalidate
> > when this happens? Is there any other way to get notified when the zeroed
> > page is written to if the invalidate does not get triggered?
>
> What David is saying is that memfd does not use the zero page
> optimization for hole punches. Any access to the memory, including
> read-only access through hmm_range_fault() will allocate unique
> pages. Since there is no zero page and no zero-page replacement there
> is no issue with invalidations.
It looks like even with hmm_range_fault(), the invalidate does not get
triggered when the hole is refilled with new pages because of writes.
This is probably because hmm_range_fault() does not fault in any pages
that would get invalidated later when the writes occur. I am not sure
if there is a way to request that it fill the hole with zero pages.
Here is what I have in the invalidate callback (added on top of this
series):
static bool invalidate_udmabuf(struct mmu_interval_notifier *mn,
			       const struct mmu_notifier_range *range_mn,
			       unsigned long cur_seq)
{
	struct udmabuf_vma_range *range =
		container_of(mn, struct udmabuf_vma_range, range_mn);
	struct udmabuf *ubuf = range->ubuf;
	struct hmm_range hrange = {0};
	unsigned long *pfns, num_pages, timeout;
	int i, ret;

	printk("invalidate; start = %lu, end = %lu\n",
	       range->start, range->end);

	hrange.notifier = mn;
	hrange.default_flags = HMM_PFN_REQ_FAULT;
	hrange.start = max(range_mn->start, range->start);
	hrange.end = min(range_mn->end, range->end);
	num_pages = (hrange.end - hrange.start) >> PAGE_SHIFT;

	pfns = kmalloc_array(num_pages, sizeof(*pfns), GFP_KERNEL);
	if (!pfns)
		return true;

	printk("invalidate; num pages = %lu\n", num_pages);

	hrange.hmm_pfns = pfns;
	timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
	do {
		hrange.notifier_seq = mmu_interval_read_begin(mn);

		mmap_read_lock(ubuf->vmm_mm);
		ret = hmm_range_fault(&hrange);
		mmap_read_unlock(ubuf->vmm_mm);
		if (ret) {
			if (ret == -EBUSY && !time_after(jiffies, timeout))
				continue;
			break;
		}

		if (mmu_interval_read_retry(mn, hrange.notifier_seq))
			continue;
	} while (ret);

	if (!ret) {
		for (i = 0; i < num_pages; i++) {
			printk("hmm returned page = %p; pfn = %lu\n",
			       hmm_pfn_to_page(pfns[i]),
			       pfns[i] & ~HMM_PFN_FLAGS);
		}
	}

	return true;
}

static const struct mmu_interval_notifier_ops udmabuf_invalidate_ops = {
	.invalidate = invalidate_udmabuf,
};
Here are the log messages I see when I run the udmabuf (shmem-based) selftest:
[ 132.662863] invalidate; start = 140737347612672, end = 140737347629056
[ 132.672953] invalidate; num pages = 4
[ 132.676690] hmm returned page = 000000000483755d; pfn = 2595360
[ 132.682676] hmm returned page = 00000000d5a87cc6; pfn = 2588133
[ 132.688651] hmm returned page = 00000000f9eb8d20; pfn = 2673429
[ 132.694629] hmm returned page = 000000005b44da27; pfn = 2588481
[ 132.700605] invalidate; start = 140737348661248, end = 140737348677632
[ 132.710672] invalidate; num pages = 4
[ 132.714412] hmm returned page = 0000000002867206; pfn = 2680737
[ 132.720394] hmm returned page = 00000000778a48f0; pfn = 2680738
[ 132.726366] hmm returned page = 00000000d8adf162; pfn = 2680739
[ 132.732350] hmm returned page = 00000000671769ff; pfn = 2680740
The above log messages are seen immediately after the hole is punched. As
you can see, hmm_range_fault() returns the pfns of the old pages and not
zero pages. And I see the below messages (with patch #2 in this series
applied) as the hole is refilled after writes:
[ 160.279227] udpate mapping; old page = 000000000483755d; pfn = 2595360
[ 160.285809] update mapping; new page = 00000000080e9595; pfn = 2680991
[ 160.292402] udpate mapping; old page = 00000000d5a87cc6; pfn = 2588133
[ 160.298979] update mapping; new page = 000000000483755d; pfn = 2595360
[ 160.305574] udpate mapping; old page = 00000000f9eb8d20; pfn = 2673429
[ 160.312154] update mapping; new page = 00000000d5a87cc6; pfn = 2588133
[ 160.318744] udpate mapping; old page = 000000005b44da27; pfn = 2588481
[ 160.325320] update mapping; new page = 00000000f9eb8d20; pfn = 2673429
[ 160.333022] udpate mapping; old page = 0000000002867206; pfn = 2680737
[ 160.339603] update mapping; new page = 000000003e2e9628; pfn = 2674703
[ 160.346201] udpate mapping; old page = 00000000778a48f0; pfn = 2680738
[ 160.352789] update mapping; new page = 0000000002867206; pfn = 2680737
[ 160.359394] udpate mapping; old page = 00000000d8adf162; pfn = 2680739
[ 160.365966] update mapping; new page = 00000000778a48f0; pfn = 2680738
[ 160.372552] udpate mapping; old page = 00000000671769ff; pfn = 2680740
[ 160.379131] update mapping; new page = 00000000d8adf162; pfn = 2680739
FYI, I ran this experiment with the kernel (6.5.0 RC1) from drm-tip.
Thanks,
Vivek
>
> Jason