From: Matthew Wilcox <willy@linux.intel.com>
To: Boaz Harrosh <openosd@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	linux-kernel@vger.kernel.org, axboe@kernel.dk, hch@infradead.org,
	Al Viro <viro@ZenIV.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@osdl.org>,
	linux-arch@vger.kernel.org, riel@redhat.com,
	linux-nvdimm@lists.01.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	linux-raid@vger.kernel.org, mgorman@suse.de,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 0/7] evacuate struct page from the block layer
Date: Wed, 18 Mar 2015 09:06:41 -0400
Message-ID: <20150318130641.GD4003@linux.intel.com>
In-Reply-To: <550957B9.5050803@gmail.com>

On Wed, Mar 18, 2015 at 12:47:21PM +0200, Boaz Harrosh wrote:
> God! Look at this endless list of files, and it is only the very beginning.
> It does not even work, and it touches only 10% of what will need to be touched
> for this to work, and very marginally at that. There will always be
> "another subsystem" that will not work. For example, NUMA: how will you do
> NUMA-aware pmem? And this is just a simple example. (I say NUMA
> because our tests show a huge drop in performance if you do not do
> NUMA-aware allocation.)

You're very entertaining, but please, tone down your emails and stick
to facts.  The BIOS presents the persistent memory as one table entry
per NUMA node, so you get one block device per NUMA node.  There's no
mixing of memory from different NUMA nodes within a single filesystem,
unless you have a filesystem that uses multiple block devices.
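
Concretely, the shape of it is something like the sketch below.  This is
illustrative only: pmem_fw_entry and register_pmem_blkdev() are made-up
names standing in for the real firmware-table parsing, not actual kernel
interfaces.

	/*
	 * One firmware table entry per NUMA node, one block device per
	 * entry: each /dev/pmemN is node-local by construction, so no
	 * allocator ever has to make a NUMA placement decision for it.
	 */
	struct pmem_fw_entry {
		unsigned long long	base;	/* physical start of range */
		unsigned long long	size;	/* length in bytes */
		int			nid;	/* NUMA node of the range */
	};

	/* Hypothetical: create /dev/pmemN backed by [base, base + size). */
	int register_pmem_blkdev(unsigned long long base,
				 unsigned long long size, int nid);

	static int pmem_probe_table(const struct pmem_fw_entry *tbl, int nr)
	{
		int i, err;

		for (i = 0; i < nr; i++) {
			err = register_pmem_blkdev(tbl[i].base, tbl[i].size,
						   tbl[i].nid);
			if (err)
				return err;
		}
		return 0;
	}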

> I'm not the one afraid of hard work, if it were for a good cause, but for what?
> Really, for what? The block layer, and RDMA, and networking, and splice, and
> whatever the heck anyone wants to imagine doing with pmem, already work
> perfectly stably, right now!

The overhead.  Allocating a 64-byte struct page for every 4k page in a
400GB DIMM (the current capacity available from one NV-DIMM vendor)
occupies 6.25GB, a full 1/64 of the device's capacity.  That's an
unacceptable amount of overhead.
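
For reference, the arithmetic behind that figure, assuming the usual
64-byte struct page on x86-64 (since 64/4096 == 1/64, the overhead is
simply capacity divided by 64, whichever flavour of "GB" you prefer):

	#include <stdio.h>

	int main(void)
	{
		double capacity_gb = 400.0;	/* one 400GB NV-DIMM */
		double page_bytes  = 4096.0;	/* 4k page */
		double struct_page = 64.0;	/* sizeof(struct page), x86-64 */

		/* overhead = capacity * (64 / 4096) = capacity / 64 */
		printf("struct page overhead: %.2f GB (%.4f%% of capacity)\n",
		       capacity_gb * struct_page / page_bytes,
		       100.0 * struct_page / page_bytes);
		return 0;
	}

which prints 6.25 GB, about 1.56% of the device.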


Thread overview: 55+ messages
2015-03-16 20:25 [RFC PATCH 0/7] evacuate struct page from the block layer Dan Williams
2015-03-16 20:25 ` Dan Williams
2015-03-16 20:25 ` [RFC PATCH 1/7] block: add helpers for accessing a bio_vec page Dan Williams
2015-03-16 20:25   ` Dan Williams
2015-03-16 20:25 ` [RFC PATCH 2/7] block: convert bio_vec.bv_page to bv_pfn Dan Williams
2015-03-16 20:25   ` Dan Williams
2015-03-16 23:05   ` Al Viro
2015-03-17 13:02     ` Matthew Wilcox
2015-03-17 15:53       ` Dan Williams
2015-03-16 20:25 ` [RFC PATCH 3/7] dma-mapping: allow archs to optionally specify a ->map_pfn() operation Dan Williams
2015-03-16 20:25   ` Dan Williams
2015-03-18 11:21   ` [Linux-nvdimm] " Boaz Harrosh
2015-03-18 11:21     ` Boaz Harrosh
2015-03-16 20:25 ` [RFC PATCH 4/7] scatterlist: use sg_phys() Dan Williams
2015-03-16 20:25   ` Dan Williams
2015-03-16 20:25 ` [RFC PATCH 5/7] scatterlist: support "page-less" (__pfn_t only) entries Dan Williams
2015-03-16 20:25   ` Dan Williams
2015-03-16 20:25 ` [RFC PATCH 6/7] x86: support dma_map_pfn() Dan Williams
2015-03-16 20:25   ` Dan Williams
2015-03-16 20:26 ` [RFC PATCH 7/7] block: base support for pfn i/o Dan Williams
2015-03-16 20:26   ` Dan Williams
2015-03-18 10:47 ` [RFC PATCH 0/7] evacuate struct page from the block layer Boaz Harrosh
2015-03-18 10:47   ` Boaz Harrosh
2015-03-18 13:06   ` Matthew Wilcox [this message]
2015-03-18 13:06     ` Matthew Wilcox
2015-03-18 14:38     ` [Linux-nvdimm] " Boaz Harrosh
2015-03-18 14:38       ` Boaz Harrosh
2015-03-20 15:56       ` Rik van Riel
2015-03-22 11:53         ` Boaz Harrosh
2015-03-18 15:35   ` Dan Williams
2015-03-18 15:35     ` Dan Williams
2015-03-18 20:26 ` Andrew Morton
2015-03-19 13:43   ` Matthew Wilcox
2015-03-19 15:54     ` [Linux-nvdimm] " Boaz Harrosh
2015-03-19 19:59       ` Andrew Morton
2015-03-19 20:59         ` Dan Williams
2015-03-22 17:22           ` Boaz Harrosh
2015-03-20 17:32         ` Wols Lists
2015-03-22 10:30         ` Boaz Harrosh
2015-03-19 18:17     ` Christoph Hellwig
2015-03-19 19:31       ` Matthew Wilcox
2015-03-22 16:46       ` Boaz Harrosh
2015-03-20 16:21     ` Rik van Riel
2015-03-20 20:31       ` Matthew Wilcox
2015-03-20 21:08         ` Rik van Riel
2015-03-22 17:06           ` Boaz Harrosh
2015-03-22 17:22             ` Dan Williams
2015-03-22 17:39               ` Boaz Harrosh
2015-03-20 21:17         ` Wols Lists
2015-03-22 16:24         ` Boaz Harrosh
2015-03-22 15:51       ` Boaz Harrosh
2015-03-23 15:19         ` Rik van Riel
2015-03-23 19:30           ` Christoph Hellwig
2015-03-24  9:41           ` Boaz Harrosh
2015-03-24 16:57             ` Rik van Riel
