From: Joao Martins <joao.m.martins@oracle.com>
To: Jason Gunthorpe <jgg@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Linux NVDIMM <nvdimm@lists.linux.dev>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Alex Sierra <alex.sierra@amd.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	Linux MM <linux-mm@kvack.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>
Subject: Re: can we finally kill off CONFIG_FS_DAX_LIMITED
Date: Fri, 15 Oct 2021 01:22:41 +0100
Message-ID: <5ca908e3-b4ad-dfef-d75f-75073d4165f7@oracle.com>
In-Reply-To: <20211014230439.GA3592864@nvidia.com>

On 10/15/21 00:04, Jason Gunthorpe wrote:
> 2) Denying FOLL_LONGTERM
>    Once GUP has grabbed the page we can call is_zone_device_page() on
>    the struct page. If true we can check page->pgmap and read some
>    DENY_FOLL_LONGTERM flag from there
> 
I had proposed something similar to that:

https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10fef0@oracle.com/

Albeit I was using pgmap->type and was relying on the get_dev_pagemap() ref
as opposed to checking after grabbing the page. I can resurrect that with some
adjustments to use pgmap flags to check a DENY_LONGTERM flag (and set it
on fsdax[*]) and move the check to after try_grab_page(). That is provided
the other alternative with a special page bit isn't an option anymore.

[*] which raises the question of whether fsdax is the *only* one that needs the flag?
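
To make the shape of that check concrete, here is a rough userspace sketch
(not kernel code). The struct layouts, the PGMAP_DENY_LONGTERM flag name, the
mock FOLL_LONGTERM value and the deny_longterm_pin() helper are all
hypothetical stand-ins; only is_zone_device_page() and the idea of reading a
flag from page->pgmap come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real kernel definitions. */
#define PGMAP_DENY_LONGTERM (1u << 0)	/* assumed flag name, would be set by fsdax */
#define FOLL_LONGTERM       (1u << 1)	/* mock value, not the kernel's */

struct dev_pagemap {
	unsigned int flags;
};

struct page {
	bool zone_device;		/* models what is_zone_device_page() tests */
	struct dev_pagemap *pgmap;	/* only meaningful for ZONE_DEVICE pages */
};

static bool is_zone_device_page(const struct page *page)
{
	return page->zone_device;
}

/*
 * Intended to run once the page reference is already held, i.e. after
 * try_grab_page() in the real GUP path. Returns true when a long-term
 * pin has to be refused.
 */
static bool deny_longterm_pin(const struct page *page, unsigned int gup_flags)
{
	if (!(gup_flags & FOLL_LONGTERM))
		return false;
	if (!is_zone_device_page(page))
		return false;
	return page->pgmap && (page->pgmap->flags & PGMAP_DENY_LONGTERM);
}
```

With this, a caller would drop the reference and fail the pin whenever
deny_longterm_pin() returns true; ZONE_DEVICE pages whose pgmap does not set
the flag are unaffected.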

> 3) Different refcounts for pud/pmd pages
> 
>    Ideally DAX cases would not do this (ie Joao is fixing device-dax)
>    but in the interim we can just loop over the PUD/PMD in all
>    cases. Looping is safe for THP AFAIK. I described how this can work
>    here:
> 
>    https://lore.kernel.org/all/20211013174140.GJ2744544@nvidia.com/
> 
> After that there are only two remaining uses:
> 
> 4) The pud/pmd_devmap() in vm_normal_page() should just go
>    away. ZONE_DEVICE memory with struct pages SHOULD be a normal
>    page. This also means dropping pte_special too.
> 
> 5) dev_pagemap_mapping_shift() - I don't know what this does
>    but why not use the is_zone_device_page() approach from 2?
> 
dev_pagemap_mapping_shift() does a lookup to figure out
which order the page table entry represents. is_zone_device_page()
is already used to gate usage of dev_pagemap_mapping_shift(). I think
this might be an artifact of the same issue as 3) in which PMDs/PUDs
are represented with base pages and hence you can't do what the rest
of the world does with:

	tk->size_shift = page_shift(compound_head(p));

... as page_shift() would just return PAGE_SHIFT (as compound_order() is 0).
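
A small userspace model of why that fails, purely as an illustration: the
struct page below is a mock with a single 'order' field, and PAGE_SHIFT = 12
with a PMD order of 9 assumes 4K base pages and 2M PMDs. Only the relationship
page_shift() = PAGE_SHIFT + compound_order() mirrors the kernel helper.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4K base pages */
#define PMD_ORDER  9	/* assumption: 2M PMD / 4K base page */

/* Mock page: 'order' stands in for what compound_order() would report. */
struct page {
	unsigned int order;	/* 0 for a base page */
};

static unsigned int compound_order(const struct page *page)
{
	return page->order;
}

/* Same relationship as the kernel helper: base shift plus compound order. */
static unsigned int page_shift(const struct page *page)
{
	return PAGE_SHIFT + compound_order(page);
}
```

For a THP head page (order 9) this gives 21, which matches the PMD mapping
size. For a PMD-mapped DAX range made of base pages (order 0) it gives 12,
so the shift has to come from the page table lookup instead.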
