From: Joao Martins <joao.m.martins@oracle.com>
To: Jason Gunthorpe <jgg@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Linux NVDIMM <nvdimm@lists.linux.dev>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Alex Sierra <alex.sierra@amd.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	Linux MM <linux-mm@kvack.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>
Subject: Re: can we finally kill off CONFIG_FS_DAX_LIMITED
Date: Fri, 15 Oct 2021 01:22:41 +0100
Message-ID: <5ca908e3-b4ad-dfef-d75f-75073d4165f7@oracle.com>
In-Reply-To: <20211014230439.GA3592864@nvidia.com>

On 10/15/21 00:04, Jason Gunthorpe wrote:
> 2) Denying FOLL_LONGTERM
>    Once GUP has grabbed the page we can call is_zone_device_page() on
>    the struct page. If true we can check page->pgmap and read some
>    DENY_FOLL_LONGTERM flag from there
> 
I had proposed something similar to that:

https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10fef0@oracle.com/

Albeit I was using pgmap->type and relying on the get_dev_pagemap() ref,
as opposed to checking after grabbing the page. I can resurrect that with
some adjustments: use pgmap flags to check a DENY_LONGTERM flag (set on
fsdax[*]) and move the check to after try_grab_page(). That is, provided
the other alternative with a special page bit isn't an option anymore.

[*] which raises the question of whether fsdax is the *only* one that needs the flag?
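
To make the ordering concrete, here is a minimal userspace sketch of the
check done after the grab. Everything in it is a stand-in and not the
kernel's definitions: PGMAP_DENY_LONGTERM is a hypothetical flag name, the
struct page / struct dev_pagemap layouts are mock types, and
gup_grab_checked() is an illustrative helper, not an existing function. It
only models the idea: take the reference first, then look at the pgmap
flag, and drop the reference again if long-term pinning is denied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of this matches the kernel's real layouts. */
#define FOLL_LONGTERM        (1u << 0)  /* stand-in for the GUP flag */
#define PGMAP_DENY_LONGTERM  (1u << 0)  /* hypothetical pgmap flag */

struct dev_pagemap {
	unsigned int flags;
};

struct page {
	bool zone_device;           /* models is_zone_device_page() */
	struct dev_pagemap *pgmap;  /* only meaningful for ZONE_DEVICE pages */
	int refcount;
};

static bool is_zone_device_page(const struct page *page)
{
	return page->zone_device;
}

/* Mock grab: always succeeds and takes one reference. */
static bool try_grab_page(struct page *page)
{
	page->refcount++;
	return true;
}

static void put_page(struct page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Hypothetical helper: grab the page, then deny FOLL_LONGTERM if the
 * page's pgmap carries the deny flag. Returns 0 on success, -1 if denied.
 */
static int gup_grab_checked(struct page *page, unsigned int gup_flags)
{
	if (!try_grab_page(page))
		return -1;

	if ((gup_flags & FOLL_LONGTERM) &&
	    is_zone_device_page(page) &&
	    (page->pgmap->flags & PGMAP_DENY_LONGTERM)) {
		put_page(page);  /* undo the grab before failing */
		return -1;
	}
	return 0;
}
```

In this model a page whose pgmap has the flag set is still usable for a
short-term grab; only the FOLL_LONGTERM case is refused, and no reference
is leaked on the refusal path.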

> 3) Different refcounts for pud/pmd pages
> 
>    Ideally DAX cases would not do this (ie Joao is fixing device-dax)
>    but in the interim we can just loop over the PUD/PMD in all
>    cases. Looping is safe for THP AFAIK. I described how this can work
>    here:
> 
>    https://lore.kernel.org/all/20211013174140.GJ2744544@nvidia.com/
> 
> After that there are only two remaining uses:
> 
> 4) The pud/pmd_devmap() in vm_normal_page() should just go
>    away. ZONE_DEVICE memory with struct pages SHOULD be a normal
>    page. This also means dropping pte_special too.
> 
> 5) dev_pagemap_mapping_shift() - I don't know what this does
>    but why not use the is_zone_device_page() approach from 2?
> 
dev_pagemap_mapping_shift() does a page table lookup to figure out
which order the page table entry represents. is_zone_device_page()
is already used to gate usage of dev_pagemap_mapping_shift(). I think
this might be an artifact of the same issue as 3), in which PMDs/PUDs
are represented with base pages and hence you can't do what the rest
of the world does with:

	tk->size_shift = page_shift(compound_head(p));

... as page_shift() would just return PAGE_SHIFT (as compound_order() is 0).
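
A small userspace model of that mismatch, as a sketch only: the mock
struct page, the map_level enum and mapping_shift() are illustrative names
I am making up here, and the shift values (12/21/30) assume x86-64 with
4K pages. It shows why a PMD mapping backed by base pages has to get its
size from the page table level rather than from the page.

```c
#include <assert.h>

/* Assumed x86-64 values with 4K base pages; other arches differ. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21
#define PUD_SHIFT  30

/* Mock page: only carries the compound order of its head page. */
struct page {
	unsigned int order;  /* 0 for a base (non-compound) page */
};

static unsigned int compound_order(const struct page *head)
{
	return head->order;
}

/* Same arithmetic as the kernel helper: base shift plus compound order. */
static unsigned int page_shift(const struct page *head)
{
	return PAGE_SHIFT + compound_order(head);
}

/* Hypothetical: the level at which the page table walk found the entry. */
enum map_level { MAP_PTE, MAP_PMD, MAP_PUD };

/* Mapping size derived from the page table level, not from the page. */
static unsigned int mapping_shift(enum map_level level)
{
	switch (level) {
	case MAP_PTE:
		return PAGE_SHIFT;
	case MAP_PMD:
		return PMD_SHIFT;
	case MAP_PUD:
		return PUD_SHIFT;
	}
	assert(0 && "unknown mapping level");
	return PAGE_SHIFT;
}
```

With a compound head of order 9 the two agree (12 + 9 == PMD_SHIFT), but
for a PMD mapping made of order-0 pages page_shift() stays at PAGE_SHIFT
while the level-based lookup reports PMD_SHIFT.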

