From: Jason Gunthorpe <jgg@nvidia.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Linux NVDIMM <nvdimm@lists.linux.dev>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Alex Sierra <alex.sierra@amd.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	Linux MM <linux-mm@kvack.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>
Subject: Re: can we finally kill off CONFIG_FS_DAX_LIMITED
Date: Mon, 18 Oct 2021 20:30:45 -0300
Message-ID: <20211018233045.GQ2744544@nvidia.com>
In-Reply-To: <5ca908e3-b4ad-dfef-d75f-75073d4165f7@oracle.com>

On Fri, Oct 15, 2021 at 01:22:41AM +0100, Joao Martins wrote:

> dev_pagemap_mapping_shift() does a lookup to figure out
> which order the page table entry represents. is_zone_device_page()
> is already used to gate usage of dev_pagemap_mapping_shift(). I think
> this might be an artifact of the same issue as 3) in which PMDs/PUDs
> are represented with base pages and hence you can't do what the rest
> of the world does with:

This code looks broken as written.

vma_address() relies on certain properties that, I think, only DAX
(maybe even only FSDAX?) sets on its ZONE_DEVICE pages, and
dev_pagemap_mapping_shift() does not handle the -EFAULT return. It
will crash if a memory failure hits any other kind of ZONE_DEVICE
area.
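
To illustrate the missing check, here is a rough userspace sketch, not
kernel code. Every name prefixed with model_ is hypothetical and exists
only for this example; the only thing taken from the discussion above is
that the address lookup can return -EFAULT and the shift lookup has to
notice that before using the address.

```c
#include <errno.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define MODEL_PAGE_SHIFT 12UL

struct model_vma {
	unsigned long vm_start;	/* first mapped address */
	unsigned long vm_end;	/* one past the last mapped address */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/*
 * Model of an address lookup that can fail: returns the user address
 * that maps page offset pgoff, or -EFAULT cast to unsigned long when
 * the page falls outside the VMA.
 */
static unsigned long model_vma_address(const struct model_vma *vma,
				       unsigned long pgoff)
{
	unsigned long addr;

	if (pgoff < vma->vm_pgoff)
		return (unsigned long)-EFAULT;
	addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
	if (addr >= vma->vm_end)
		return (unsigned long)-EFAULT;
	return addr;
}

/*
 * Returns the mapping shift, or 0 when the address lookup failed, so
 * the caller never walks page tables with a bogus address.
 */
static unsigned long model_mapping_shift(const struct model_vma *vma,
					 unsigned long pgoff)
{
	unsigned long addr = model_vma_address(vma, pgoff);

	if (addr == (unsigned long)-EFAULT)
		return 0;
	/* A real implementation would walk the page tables here. */
	return MODEL_PAGE_SHIFT;
}
```

Returning 0 for "no usable mapping" is an assumption of this sketch; the
point is only that the error value must be tested before it is treated
as an address.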

I'm not sure the comment is correct anyhow:

		/*
		 * Unmap the largest mapping to avoid breaking up
		 * device-dax mappings which are constant size. The
		 * actual size of the mapping being torn down is
		 * communicated in siginfo, see kill_proc()
		 */
		unmap_mapping_range(page->mapping, start, size, 0);

Because for non-PageAnon pages unmap_mapping_range() does one of
zap_huge_pud(), __split_huge_pmd(), or zap_huge_pmd().

Despite its name, __split_huge_pmd() does not actually split; it will
call __split_huge_pmd_locked():

	} else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
		goto out;
	__split_huge_pmd_locked(vma, pmd, range.start, freeze);

Which does:
	if (!vma_is_anonymous(vma)) {
		old_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);

Which is a zap, not a split.

So I wonder if there is a reason to use anything other than 4k here
for DAX?
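
As a rough illustration of what the size choice changes, the userspace
sketch below computes the byte range that would be handed to an unmap
call for a given page offset and size shift. The names prefixed with
model_ are hypothetical, and the align-down expression is an assumption
about how a caller would derive the range, not a quote from the kernel.

```c
/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define MODEL_PAGE_SHIFT 12UL

struct model_range {
	unsigned long start;	/* byte offset where the unmap begins */
	unsigned long size;	/* number of bytes unmapped */
};

/*
 * Align the failing page's byte offset down to the mapping size and
 * return the resulting range. size_shift is log2 of the mapping size.
 */
static struct model_range model_unmap_range(unsigned long pgoff,
					    unsigned long size_shift)
{
	struct model_range r;

	r.size = 1UL << size_shift;
	r.start = (pgoff << MODEL_PAGE_SHIFT) & ~(r.size - 1);
	return r;
}
```

For page offset 0x205, a shift of 12 gives the range 0x205000 + 0x1000,
while a shift of 21 gives 0x200000 + 0x200000. If the huge entry is
zapped as a whole either way, the smaller range still takes out the
mapping that covers the failing page.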

> 	tk->size_shift = page_shift(compound_head(p));
> 
> ... as page_shift() would just return PAGE_SHIFT (as compound_order() is 0).

And what would be so wrong with memory failure doing this as a 4k
page?
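
The arithmetic behind the quoted remark can be sketched in userspace as
follows. The model_ names are hypothetical; the sketch assumes a 4k base
page (shift of 12) and that the page shift is the base shift plus the
compound order, which is why an order of 0 yields plain 4k.

```c
/* Hypothetical stand-ins for illustration; assumes a 4k base page. */
#define MODEL_PAGE_SHIFT 12U

/* Shift for a page of the given compound order (0 for a base page). */
static unsigned int model_page_shift(unsigned int compound_order)
{
	return MODEL_PAGE_SHIFT + compound_order;
}

/* Size in bytes of a page of the given compound order. */
static unsigned long model_page_size(unsigned int compound_order)
{
	return 1UL << model_page_shift(compound_order);
}
```

With order 0 the size is 4096 bytes; an order of 9 would give 2 MiB,
which is what a PMD-sized compound page would report under the same
assumption.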

Jason
