linux-kernel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@linux.intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	axboe@kernel.dk, riel@redhat.com, linux-nvdimm@ml01.01.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	linux-raid@vger.kernel.org, mgorman@suse.de, hch@infradead.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 0/7] evacuate struct page from the block layer
Date: Thu, 19 Mar 2015 09:43:13 -0400
Message-ID: <20150319134313.GF4003@linux.intel.com>
In-Reply-To: <20150318132650.3336261c58829f49a9af8675@linux-foundation.org>

On Wed, Mar 18, 2015 at 01:26:50PM -0700, Andrew Morton wrote:
> On Mon, 16 Mar 2015 16:25:25 -0400 Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Avoid the impending disaster of requiring struct page coverage for what
> > is expected to be ever increasing capacities of persistent memory.  In
> > conversations with Rik van Riel, Mel Gorman, and Jens Axboe at the
> > recently concluded Linux Storage Summit it became clear that struct page
> > is not required in many places, it was simply convenient to re-use.
> > 
> > Introduce helpers and infrastructure to remove struct page usage where
> > it is not necessary.  One use case for these changes is to implement a
> > write-back-cache in persistent memory for software-RAID.  Another use
> > case for the scatterlist changes is RDMA to a pfn-range.
> 
> Those use-cases sound very thin.  If that's all we have then I'd say
> "find another way of implementing those things without creating
> pageframes for persistent memory".
> 
> IOW, please tell us much much much more about the value of this change.

Dan missed "Support O_DIRECT to a mapped DAX file".  More generally, if we
want to be able to do any kind of I/O directly to persistent memory,
and I think we do, we need to do one of:

1. Construct struct pages for persistent memory
1a. Permanently
1b. While the pages are under I/O
2. Teach the I/O layers to deal in PFNs instead of struct pages
3. Replace struct page with some other structure that can represent both
   DRAM and PMEM
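
To make the difference between options 2 and 3 concrete, here is a rough
user-space sketch. It is not taken from the patch series: the struct
layouts, the names pfn_vec, mem_extent, pfn_vec_phys() and
extent_nr_pfn_vecs(), and the 4 KiB page size are all assumptions made
for illustration, and the extent layout is only one possible shape for
option 3.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Option 2 (hypothetical layout): an I/O vector entry that names memory
 * by page frame number, so no struct page has to exist for it.
 */
struct pfn_vec {
	uint64_t pfn;		/* page frame number */
	uint32_t len;		/* length in bytes */
	uint32_t offset;	/* byte offset within the frame */
};

/*
 * Option 3 (hypothetical layout): one entry describes a physically
 * contiguous range of any size, whether it is DRAM or PMEM.
 */
struct mem_extent {
	uint64_t phys;		/* starting physical address */
	uint64_t len;		/* length in bytes */
};

/* Physical address of the first byte described by a per-page entry. */
static uint64_t pfn_vec_phys(const struct pfn_vec *v)
{
	return (v->pfn << SKETCH_PAGE_SHIFT) + v->offset;
}

/* Number of per-page entries needed to cover what one extent covers. */
static uint64_t extent_nr_pfn_vecs(const struct mem_extent *e)
{
	uint64_t first, last;

	if (e->len == 0)
		return 0;
	first = e->phys >> SKETCH_PAGE_SHIFT;
	last = (e->phys + e->len - 1) >> SKETCH_PAGE_SHIFT;
	return last - first + 1;
}
```

Under these assumptions a single extent describing a contiguous 1 GiB
range stands in for 262144 per-page entries.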

I'm personally a fan of #3, and I was looking at the scatterlist as
my preferred data structure.  I now believe the scatterlist as it is
currently defined isn't sufficient, so we probably end up needing a new
data structure.  I think Dan's preferred method of replacing struct
pages with PFNs is actually less intrusive, but doesn't give us as
much advantage (an entirely new data structure would let us move to an
extent based system at the same time, instead of sticking with an array
of pages).  Clearly Boaz prefers 1a, which works well enough for the
8GB NV-DIMMs, but not well enough for the 400GB NV-DIMMs.
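
A back-of-the-envelope calculation shows why 1a scales poorly. This is
only a sketch: it assumes 4 KiB pages and a 64-byte struct page (typical
for x86-64, but configuration dependent), it reads "8GB" and "400GB" as
GiB, and struct_page_overhead() is a hypothetical helper, not a kernel
function.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL	/* assumption: 4 KiB pages */
#define SKETCH_STRUCT_PAGE_SIZE	64ULL	/* assumption: 64-byte struct page */

/*
 * Bytes of struct page metadata needed to cover 'capacity' bytes of
 * memory if every page frame gets a permanent struct page (option 1a).
 */
static uint64_t struct_page_overhead(uint64_t capacity)
{
	uint64_t nr_pages = capacity / SKETCH_PAGE_SIZE;

	return nr_pages * SKETCH_STRUCT_PAGE_SIZE;
}
```

With these numbers an 8 GiB device needs 128 MiB of struct pages, while
a 400 GiB device needs 6.25 GiB, which is 1/64 of the capacity in both
cases.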

What's your preference?  I guess option 0 is "force all I/O to go
through the page cache and then get copied", but that feels like a nasty
performance hit.

