linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Balbir Singh <bsingharora@gmail.com>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, thomas.lendacky@amd.com,
	mhocko@suse.com, linux-nvdimm@lists.01.org, tiwai@suse.de,
	ying.huang@intel.com, linux-mm@kvack.org, jglisse@redhat.com,
	bp@suse.de, baiyaowei@cmss.chinamobile.com, zwisler@kernel.org,
	bhelgaas@google.com, fengguang.wu@intel.com,
	akpm@linux-foundation.org
Subject: Re: [PATCH 0/5] [v4] Allow persistent memory to be used like normal RAM
Date: Mon, 28 Jan 2019 22:09:58 +1100	[thread overview]
Message-ID: <20190128110958.GH26056@350D> (raw)
In-Reply-To: <20190124231441.37A4A305@viggo.jf.intel.com>

On Thu, Jan 24, 2019 at 03:14:41PM -0800, Dave Hansen wrote:
> v3 spurred a bunch of really good discussion.  Thanks to everybody
> that made comments and suggestions!
> 
> I would still love some Acks on this from the folks on cc, even if it
> is on just the patch touching your area.
> 
> Note: these are based on commit d2f33c19644 in:
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git libnvdimm-pending
> 
> Changes since v3:
>  * Move HMM-related resource warning instead of removing it
>  * Use __request_resource() directly instead of devm.
>  * Create a separate DAX_PMEM Kconfig option, complete with help text
>  * Update patch descriptions and cover letter to give a better
>    overview of use-cases and hardware where this might be useful.
> 
> Changes since v2:
>  * Updates to dev_dax_kmem_probe() in patch 5:
>    * Reject probes for devices with bad NUMA nodes.  Keeps slow
>      memory from being added to node 0.
>    * Use raw request_mem_region()
>    * Add comments about permanent reservation
>    * use dev_*() instead of printk's
>  * Add references to nvdimm documentation in descriptions
>  * Remove unneeded GPL export
>  * Add Kconfig prompt and help text
> 
> Changes since v1:
>  * Now based on git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git
>  * Use binding/unbinding from "dax bus" code
>  * Move over to a "dax bus" driver from being an nvdimm driver
> 
> --
> 
> Persistent memory is cool.  But, currently, you have to rewrite
> your applications to use it.  Wouldn't it be cool if you could
> just have it show up in your system like normal RAM and get to
> it like a slow blob of memory?  Well... have I got the patch
> series for you!
> 
> == Background / Use Cases ==
> 
> Persistent Memory (aka Non-Volatile DIMMs / NVDIMMS) themselves
> are described in detail in Documentation/nvdimm/nvdimm.txt.
> However, this documentation focuses on actually using them as
> storage.  This set is focused on using NVDIMMs as DRAM replacement.
> 
> This is intended for Intel-style NVDIMMs (aka. Intel Optane DC
> persistent memory) NVDIMMs.  These DIMMs are physically persistent,
> more akin to flash than traditional RAM.  They are also expected to
> be more cost-effective than using RAM, which is why folks want this
> set in the first place.

What variant of NVDIMM's F/P or both?

> 
> This set is not intended for RAM-based NVDIMMs.  Those are not
> cost-effective vs. plain RAM, and this using them here would simply
> be a waste.
> 

Sounds like NVDIMM (P)

> But, why would you bother with this approach?  Intel itself [1]
> has announced a hardware feature that does something very similar:
> "Memory Mode" which turns DRAM into a cache in front of persistent
> memory, which is then as a whole used as normal "RAM"?
> 
> Here are a few reasons:
> 1. The capacity of memory mode is the size of your persistent
>    memory that you dedicate.  DRAM capacity is "lost" because it
>    is used for cache.  With this, you get PMEM+DRAM capacity for
>    memory.
> 2. DRAM acts as a cache with memory mode, and caches can lead to
>    unpredictable latencies.  Since memory mode is all-or-nothing
>    (either all your DRAM is used as cache or none is), your entire
>    memory space is exposed to these unpredictable latencies.  This
>    solution lets you guarantee DRAM latencies if you need them.
> 3. The new "tier" of memory is exposed to software.  That means
>    that you can build tiered applications or infrastructure.  A
>    cloud provider could sell cheaper VMs that use more PMEM and
>    more expensive ones that use DRAM.  That's impossible with
>    memory mode.
> 
> Don't take this as criticism of memory mode.  Memory mode is
> awesome, and doesn't strictly require *any* software changes (we
> have software changes proposed for optimizing it though).  It has
> tons of other advantages over *this* approach.  Basically, we
> believe that the approach in these patches is complementary to
> memory mode and that both can live side-by-side in harmony.
> 
> == Patch Set Overview ==
> 
> This series adds a new "driver" to which pmem devices can be
> attached.  Once attached, the memory "owned" by the device is
> hot-added to the kernel and managed like any other memory.  On
> systems with an HMAT (a new ACPI table), each socket (roughly)
> will have a separate NUMA node for its persistent memory so
> this newly-added memory can be selected by its unique NUMA
> node.


NUMA is distance based topology, does HMAT solve these problems?
How do we prevent fallback nodes of normal nodes being pmem nodes?
On an unexpected crash/failure is there a scrubbing mechanism
or do we rely on the allocator to do the right thing prior to
reallocating any memory. Will frequent zero'ing hurt NVDIMM/pmem's
life times?

Balbir Singh.

  parent reply	other threads:[~2019-01-28 11:10 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-24 23:14 [PATCH 0/5] [v4] Allow persistent memory to be used like normal RAM Dave Hansen
2019-01-24 23:14 ` [PATCH 1/5] mm/resource: return real error codes from walk failures Dave Hansen
2019-01-25 21:02   ` Bjorn Helgaas
2019-01-25 21:09     ` Dave Hansen
2019-01-25 21:19       ` Bjorn Helgaas
2019-01-29  1:18       ` Michael Ellerman
2019-01-24 23:14 ` [PATCH 2/5] mm/resource: move HMM pr_debug() deeper into resource code Dave Hansen
2019-01-25 19:07   ` Jerome Glisse
2019-01-25 21:18   ` Bjorn Helgaas
2019-01-25 21:24     ` Dave Hansen
2019-01-29  1:34       ` Michael Ellerman
2019-01-24 23:14 ` [PATCH 3/5] mm/memory-hotplug: allow memory resources to be children Dave Hansen
2019-01-24 23:14 ` [PATCH 4/5] dax/kmem: let walk_system_ram_range() search child resources Dave Hansen
2019-01-24 23:14 ` [PATCH 5/5] dax: "Hotplug" persistent memory for use like normal RAM Dave Hansen
2019-01-25  6:13   ` Jane Chu
2019-01-25  6:27     ` Dan Williams
2019-01-25  8:20       ` Du, Fan
2019-01-25 17:18         ` Dan Williams
2019-01-25 18:20           ` Verma, Vishal L
2019-01-25 19:10             ` Jane Chu
2019-01-25 19:15               ` Dan Williams
2019-01-25 23:30                 ` Jane Chu
2019-01-28  9:25                 ` Michal Hocko
2019-01-28 16:34                   ` Dan Williams
     [not found]   ` <c4c6aca8-6ee8-be10-65ae-4cbe0aa03bfb@inria.fr>
2019-02-11 16:22     ` Dave Hansen
2019-02-12 19:59       ` Brice Goglin
2019-02-13  0:30         ` Dan Williams
2019-02-13  8:12           ` Brice Goglin
2019-02-13  8:24             ` Dan Williams
2019-02-13  8:43               ` Brice Goglin
2019-02-13 13:06                 ` Brice Goglin
2019-02-13 16:19                   ` Dan Williams
2019-01-25 19:08 ` [PATCH 0/5] [v4] Allow persistent memory to be used " Jerome Glisse
2019-01-28 11:09 ` Balbir Singh [this message]
2019-01-28 16:50   ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190128110958.GH26056@350D \
    --to=bsingharora@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=baiyaowei@cmss.chinamobile.com \
    --cc=bhelgaas@google.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=fengguang.wu@intel.com \
    --cc=jglisse@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=mhocko@suse.com \
    --cc=thomas.lendacky@amd.com \
    --cc=tiwai@suse.de \
    --cc=ying.huang@intel.com \
    --cc=zwisler@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).