nvdimm.lists.linux.dev archive mirror
* [PATCH 0/9] Allow persistent memory to be used like normal RAM
@ 2018-10-22 20:13 Dave Hansen
  2018-10-22 20:13 ` [PATCH 1/9] mm/resource: return real error codes from walk failures Dave Hansen
                   ` (13 more replies)
  0 siblings, 14 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm

Persistent memory is cool.  But, currently, you have to rewrite
your applications to use it.  Wouldn't it be cool if you could
just have it show up in your system like normal RAM and get to
it like a slow blob of memory?  Well... have I got the patch
series for you!

This series adds a new "driver" to which pmem devices can be
attached.  Once attached, the memory "owned" by the device is
hot-added to the kernel and managed like any other memory.  On
systems with an HMAT (a new ACPI table), each socket (roughly)
will have a separate NUMA node for its persistent memory so
this newly-added memory can be selected by its unique NUMA
node.
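
For example, once the new node is online (say it shows up as
node 1; the node number is system-dependent), the pmem-backed
memory can be targeted with the usual NUMA tooling.  A sketch,
with a made-up workload name:

	# bind all allocations to the (slow) pmem node:
	numactl --membind=1 ./your_workload
	# or just prefer it, falling back to DRAM when it fills:
	numactl --preferred=1 ./your_workload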

This is highly RFC, and I really want the feedback from the
nvdimm/pmem folks about whether this is a viable long-term
perversion of their code and device mode.  It's insufficiently
documented and probably not bisectable either.

Todo:
1. The device re-binding hacks are ham-fisted at best.  We
   need a better way of doing this, especially so the kmem
   driver does not get in the way of normal pmem devices.
2. When the device has no proper node, we default it to
   NUMA node 0.  Is that OK?
3. We muck with the 'struct resource' code quite a bit. It
   definitely needs a once-over from folks more familiar
   with it than I.
4. Is there a better way to do this than starting with a
   copy of pmem.c?

Here's how I set up a system to test this thing:

1. Boot qemu with lots of memory: "-m 4096", for instance
2. Reserve 512MB of physical memory.  Reserving a spot at 2GB
   physical seems to work: memmap=512M!0x0000000080000000
   This will end up looking like a pmem device at boot.
3. When booted, convert fsdax device to "device dax":
	ndctl create-namespace -fe namespace0.0 -m dax
4. In the background, the kmem driver will probably bind to the
   new device.
5. Now, online the new memory sections.  Perhaps:

grep ^MemTotal /proc/meminfo
for f in `grep -vl online /sys/devices/system/memory/*/state`; do
	echo $f: `cat $f`
	echo online > $f
	grep ^MemTotal /proc/meminfo
done
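
If everything worked, MemTotal should grow by roughly the
namespace size and a new memory-only node should appear.  A
quick sanity check might look like this (node numbers and
sizes are system-dependent):

	cat /sys/devices/system/node/online       # expect an extra node
	numactl -H                                # new node with ~512M
	grep -A1 'Persistent Memory' /proc/iomem  # shows a System RAM child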

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

* [PATCH 1/9] mm/resource: return real error codes from walk failures
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-22 20:13 ` [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX Dave Hansen
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


walk_system_ram_range() can return an error code either because *it*
failed, or because the 'func' that it calls returned an error.  The
memory hotplug code does the following:

        ret = walk_system_ram_range(..., func);
        if (ret)
		return ret;

and 'ret' makes it out to userspace, eventually.  The problem is,
walk_system_ram_range() failures that result from *it* failing (as
opposed to 'func') return -1.  That leads to a very odd -EPERM (-1)
return code out to userspace.

Make walk_system_ram_range() return -EINVAL for internal failures to
keep userspace less confused.

This return code is compatible with all the callers that I audited.
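
For illustration, this is roughly how the odd error code could
look from userspace before this patch (a hypothetical example;
the exact path depends on which caller does the walk):

	# onlining a section can end up in walk_system_ram_range();
	# an internal walk failure surfaced as -1 == -EPERM:
	$ echo online > /sys/devices/system/memory/memory42/state
	-bash: echo: write error: Operation not permitted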

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>


---

 b/kernel/resource.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1 kernel/resource.c
--- a/kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1	2018-10-22 13:12:21.000930395 -0700
+++ b/kernel/resource.c	2018-10-22 13:12:21.003930395 -0700
@@ -375,7 +375,7 @@ static int __walk_iomem_res_desc(resourc
 				 int (*func)(struct resource *, void *))
 {
 	struct resource res;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
@@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long
 	unsigned long flags;
 	struct resource res;
 	unsigned long pfn, end_pfn;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	start = (u64) start_pfn << PAGE_SHIFT;
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
_

* [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
  2018-10-22 20:13 ` [PATCH 1/9] mm/resource: return real error codes from walk failures Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-23  1:56   ` Randy Dunlap
  2018-10-22 20:13 ` [PATCH 3/9] dax: add more kmem device infrastructure Dave Hansen
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


Add the actual driver which will own the DAX range.  This
allows very nice party with the other possible "owners" of
a DAX region: device DAX and filesystem DAX.  It also greatly
simplifies the process of handing off control of the memory
between the different owners since it's just a matter of
unbinding and rebinding the device to different drivers.

I tried to do this all internally to the kernel and the
locking and "self-destruction" of the old device context was
a nightmare.  Having userspace drive it is a wonderful
simplification.
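
Concretely, the handoff is the usual sysfs unbind/bind dance.
Something like this (device and sysfs paths are illustrative):

	# release the device from its current DAX driver...
	echo dax0.0 > /sys/bus/nd/drivers/dax_pmem/unbind
	# ...and hand it to the kmem driver, which hot-adds the memory:
	echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/bind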

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/drivers/dax/kmem.c |  152 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff -puN /dev/null drivers/dax/kmem.c
--- /dev/null	2018-09-18 12:39:53.059362935 -0700
+++ b/drivers/dax/kmem.c	2018-10-22 13:12:21.502930393 -0700
@@ -0,0 +1,152 @@
+// this is just a copy of drivers/dax/pmem.c with
+// s/dax_pmem/dax_kmem/ for now.
+//
+// need real license
+/*
+ * Copyright(c) 2016-2018 Intel Corporation. All rights reserved.
+ */
+#include <linux/percpu-refcount.h>
+#include <linux/memremap.h>
+#include <linux/module.h>
+#include <linux/pfn_t.h>
+#include "../nvdimm/pfn.h"
+#include "../nvdimm/nd.h"
+#include "device-dax.h"
+
+struct dax_kmem {
+	struct device *dev;
+	struct percpu_ref ref;
+	struct dev_pagemap pgmap;
+	struct completion cmp;
+};
+
+static struct dax_kmem *to_dax_kmem(struct percpu_ref *ref)
+{
+	return container_of(ref, struct dax_kmem, ref);
+}
+
+static void dax_kmem_percpu_release(struct percpu_ref *ref)
+{
+	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	complete(&dax_kmem->cmp);
+}
+
+static void dax_kmem_percpu_exit(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	wait_for_completion(&dax_kmem->cmp);
+	percpu_ref_exit(ref);
+}
+
+static void dax_kmem_percpu_kill(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	percpu_ref_kill(ref);
+}
+
+static int dax_kmem_probe(struct device *dev)
+{
+	void *addr;
+	struct resource res;
+	int rc, id, region_id;
+	struct nd_pfn_sb *pfn_sb;
+	struct dev_dax *dev_dax;
+	struct dax_kmem *dax_kmem;
+	struct nd_namespace_io *nsio;
+	struct dax_region *dax_region;
+	struct nd_namespace_common *ndns;
+	struct nd_dax *nd_dax = to_nd_dax(dev);
+	struct nd_pfn *nd_pfn = &nd_dax->nd_pfn;
+
+	ndns = nvdimm_namespace_common_probe(dev);
+	if (IS_ERR(ndns))
+		return PTR_ERR(ndns);
+	nsio = to_nd_namespace_io(&ndns->dev);
+
+	dax_kmem = devm_kzalloc(dev, sizeof(*dax_kmem), GFP_KERNEL);
+	if (!dax_kmem)
+		return -ENOMEM;
+
+	/* parse the 'pfn' info block via ->rw_bytes */
+	rc = devm_nsio_enable(dev, nsio);
+	if (rc)
+		return rc;
+	rc = nvdimm_setup_pfn(nd_pfn, &dax_kmem->pgmap);
+	if (rc)
+		return rc;
+	devm_nsio_disable(dev, nsio);
+
+	pfn_sb = nd_pfn->pfn_sb;
+
+	if (!devm_request_mem_region(dev, nsio->res.start,
+				resource_size(&nsio->res),
+				dev_name(&ndns->dev))) {
+		dev_warn(dev, "could not reserve region %pR\n", &nsio->res);
+		return -EBUSY;
+	}
+
+	dax_kmem->dev = dev;
+	init_completion(&dax_kmem->cmp);
+	rc = percpu_ref_init(&dax_kmem->ref, dax_kmem_percpu_release, 0,
+			GFP_KERNEL);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action_or_reset(dev, dax_kmem_percpu_exit,
+							&dax_kmem->ref);
+	if (rc)
+		return rc;
+
+	dax_kmem->pgmap.ref = &dax_kmem->ref;
+	addr = devm_memremap_pages(dev, &dax_kmem->pgmap);
+	if (IS_ERR(addr))
+		return PTR_ERR(addr);
+
+	rc = devm_add_action_or_reset(dev, dax_kmem_percpu_kill,
+							&dax_kmem->ref);
+	if (rc)
+		return rc;
+
+	/* adjust the dax_region resource to the start of data */
+	memcpy(&res, &dax_kmem->pgmap.res, sizeof(res));
+	res.start += le64_to_cpu(pfn_sb->dataoff);
+
+	rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id);
+	if (rc != 2)
+		return -EINVAL;
+
+	dax_region = alloc_dax_region(dev, region_id, &res,
+			le32_to_cpu(pfn_sb->align), addr, PFN_DEV|PFN_MAP);
+	if (!dax_region)
+		return -ENOMEM;
+
+	/* TODO: support for subdividing a dax region... */
+	dev_dax = devm_create_dev_dax(dax_region, id, &res, 1);
+
+	/* child dev_dax instances now own the lifetime of the dax_region */
+	dax_region_put(dax_region);
+
+	return PTR_ERR_OR_ZERO(dev_dax);
+}
+
+static struct nd_device_driver dax_kmem_driver = {
+	.probe = dax_kmem_probe,
+	.drv = {
+		.name = "dax_kmem",
+	},
+	.type = ND_DRIVER_DAX_PMEM,
+};
+
+module_nd_driver(dax_kmem_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
_

* [PATCH 3/9] dax: add more kmem device infrastructure
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
  2018-10-22 20:13 ` [PATCH 1/9] mm/resource: return real error codes from walk failures Dave Hansen
  2018-10-22 20:13 ` [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-22 20:13 ` [PATCH 4/9] dax/kmem: allow PMEM devices to bind to KMEM driver Dave Hansen
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


The previous patch is a simple copy of the pmem driver.  This
makes it easy to keep the pmem and kmem code in sync while
this is in development.

This actually adds some necessary infrastructure for the new
driver to compile.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/drivers/dax/kmem.c         |   10 +++++-----
 b/include/uapi/linux/ndctl.h |    2 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff -puN drivers/dax/kmem.c~dax-kmem-try-again-2018-2-header drivers/dax/kmem.c
--- a/drivers/dax/kmem.c~dax-kmem-try-again-2018-2-header	2018-10-22 13:12:22.000930392 -0700
+++ b/drivers/dax/kmem.c	2018-10-22 13:12:22.005930392 -0700
@@ -27,7 +27,7 @@ static struct dax_kmem *to_dax_kmem(stru
 
 static void dax_kmem_percpu_release(struct percpu_ref *ref)
 {
-	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
 
 	dev_dbg(dax_kmem->dev, "trace\n");
 	complete(&dax_kmem->cmp);
@@ -36,7 +36,7 @@ static void dax_kmem_percpu_release(stru
 static void dax_kmem_percpu_exit(void *data)
 {
 	struct percpu_ref *ref = data;
-	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
 
 	dev_dbg(dax_kmem->dev, "trace\n");
 	wait_for_completion(&dax_kmem->cmp);
@@ -46,7 +46,7 @@ static void dax_kmem_percpu_exit(void *d
 static void dax_kmem_percpu_kill(void *data)
 {
 	struct percpu_ref *ref = data;
-	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
 
 	dev_dbg(dax_kmem->dev, "trace\n");
 	percpu_ref_kill(ref);
@@ -142,11 +142,11 @@ static struct nd_device_driver dax_kmem_
 	.drv = {
 		.name = "dax_kmem",
 	},
-	.type = ND_DRIVER_DAX_PMEM,
+	.type = ND_DRIVER_DAX_KMEM,
 };
 
 module_nd_driver(dax_kmem_driver);
 
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Intel Corporation");
-MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
+MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_KMEM);
diff -puN include/uapi/linux/ndctl.h~dax-kmem-try-again-2018-2-header include/uapi/linux/ndctl.h
--- a/include/uapi/linux/ndctl.h~dax-kmem-try-again-2018-2-header	2018-10-22 13:12:22.002930392 -0700
+++ b/include/uapi/linux/ndctl.h	2018-10-22 13:12:22.005930392 -0700
@@ -197,6 +197,7 @@ static inline const char *nvdimm_cmd_nam
 #define ND_DEVICE_NAMESPACE_PMEM 5  /* PMEM namespace (may alias with BLK) */
 #define ND_DEVICE_NAMESPACE_BLK 6   /* BLK namespace (may alias with PMEM) */
 #define ND_DEVICE_DAX_PMEM 7        /* Device DAX interface to pmem */
+#define ND_DEVICE_DAX_KMEM 8        /* Normal kernel-managed system memory */
 
 enum nd_driver_flags {
 	ND_DRIVER_DIMM            = 1 << ND_DEVICE_DIMM,
@@ -206,6 +207,7 @@ enum nd_driver_flags {
 	ND_DRIVER_NAMESPACE_PMEM  = 1 << ND_DEVICE_NAMESPACE_PMEM,
 	ND_DRIVER_NAMESPACE_BLK   = 1 << ND_DEVICE_NAMESPACE_BLK,
 	ND_DRIVER_DAX_PMEM	  = 1 << ND_DEVICE_DAX_PMEM,
+	ND_DRIVER_DAX_KMEM	  = 1 << ND_DEVICE_DAX_KMEM,
 };
 
 enum {
_

* [PATCH 4/9] dax/kmem: allow PMEM devices to bind to KMEM driver
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (2 preceding siblings ...)
  2018-10-22 20:13 ` [PATCH 3/9] dax: add more kmem device infrastructure Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-22 20:13 ` [PATCH 5/9] dax/kmem: add more nd dax kmem infrastructure Dave Hansen
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


Currently, a persistent memory device's mode must be coordinated
with the driver to which it needs to bind.  To change it from the
fsdax to the device-dax driver, you first change the mode of the
device itself.

Instead of adding a new device mode, allow the PMEM mode to also
bind to the KMEM driver.

As I write this, I'm realizing that it might have just been
better to add a new device mode, rather than hijacking the PMEM
mode.  If this is the case, please speak up, NVDIMM folks.  :)
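
With the override in place, a quick way to see which driver
currently owns a device is its driver symlink (device name
is illustrative):

	readlink /sys/bus/nd/devices/dax0.0/driver
	# -> .../drivers/dax_pmem, or dax_kmem after rebinding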

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/drivers/nvdimm/bus.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff -puN drivers/nvdimm/bus.c~dax-kmem-try-again-2018-3-bus-match-override drivers/nvdimm/bus.c
--- a/drivers/nvdimm/bus.c~dax-kmem-try-again-2018-3-bus-match-override	2018-10-22 13:12:22.522930391 -0700
+++ b/drivers/nvdimm/bus.c	2018-10-22 13:12:22.525930391 -0700
@@ -464,11 +464,24 @@ static struct nd_device_driver nd_bus_dr
 static int nvdimm_bus_match(struct device *dev, struct device_driver *drv)
 {
 	struct nd_device_driver *nd_drv = to_nd_device_driver(drv);
+	bool match;
 
 	if (is_nvdimm_bus(dev) && nd_drv == &nd_bus_driver)
 		return true;
 
-	return !!test_bit(to_nd_device_type(dev), &nd_drv->type);
+	match = !!test_bit(to_nd_device_type(dev), &nd_drv->type);
+
+	/*
+	 * We allow PMEM devices to be bound to the KMEM driver.
+	 * Force a match if we detect a PMEM device type but
+	 * a KMEM device driver.
+	 */
+	if (!match &&
+	    (to_nd_device_type(dev) == ND_DEVICE_DAX_PMEM) &&
+	    (nd_drv->type == ND_DRIVER_DAX_KMEM))
+		match = true;
+
+	return match;
 }
 
 static ASYNC_DOMAIN_EXCLUSIVE(nd_async_domain);
_

* [PATCH 5/9] dax/kmem: add more nd dax kmem infrastructure
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (3 preceding siblings ...)
  2018-10-22 20:13 ` [PATCH 4/9] dax/kmem: allow PMEM devices to bind to KMEM driver Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-22 20:13 ` [PATCH 6/9] mm/memory-hotplug: allow memory resources to be children Dave Hansen
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


Each DAX mode has a set of wrappers and helpers.  Add them
for the kmem mode.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/drivers/nvdimm/bus.c      |    2 ++
 b/drivers/nvdimm/dax_devs.c |   35 +++++++++++++++++++++++++++++++++++
 b/drivers/nvdimm/nd.h       |    6 ++++++
 3 files changed, 43 insertions(+)

diff -puN drivers/nvdimm/bus.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem drivers/nvdimm/bus.c
--- a/drivers/nvdimm/bus.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem	2018-10-22 13:12:23.024930389 -0700
+++ b/drivers/nvdimm/bus.c	2018-10-22 13:12:23.031930389 -0700
@@ -46,6 +46,8 @@ static int to_nd_device_type(struct devi
 		return ND_DEVICE_REGION_BLK;
 	else if (is_nd_dax(dev))
 		return ND_DEVICE_DAX_PMEM;
+	else if (is_nd_dax_kmem(dev))
+		return ND_DEVICE_DAX_KMEM;
 	else if (is_nd_region(dev->parent))
 		return nd_region_to_nstype(to_nd_region(dev->parent));
 
diff -puN drivers/nvdimm/dax_devs.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem drivers/nvdimm/dax_devs.c
--- a/drivers/nvdimm/dax_devs.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem	2018-10-22 13:12:23.026930389 -0700
+++ b/drivers/nvdimm/dax_devs.c	2018-10-22 13:12:23.031930389 -0700
@@ -51,6 +51,41 @@ struct nd_dax *to_nd_dax(struct device *
 }
 EXPORT_SYMBOL(to_nd_dax);
 
+/* nd_dax_kmem */
+static void nd_dax_kmem_release(struct device *dev)
+{
+	struct nd_region *nd_region = to_nd_region(dev->parent);
+	struct nd_dax_kmem *nd_dax_kmem = to_nd_dax_kmem(dev);
+	struct nd_pfn *nd_pfn = &nd_dax_kmem->nd_pfn;
+
+	dev_dbg(dev, "trace\n");
+	nd_detach_ndns(dev, &nd_pfn->ndns);
+	ida_simple_remove(&nd_region->dax_ida, nd_pfn->id);
+	kfree(nd_pfn->uuid);
+	kfree(nd_dax_kmem);
+}
+
+static struct device_type nd_dax_kmem_device_type = {
+	.name = "nd_dax_kmem",
+	.release = nd_dax_kmem_release,
+};
+
+bool is_nd_dax_kmem(struct device *dev)
+{
+	return dev ? dev->type == &nd_dax_kmem_device_type : false;
+}
+EXPORT_SYMBOL(is_nd_dax_kmem);
+
+struct nd_dax_kmem *to_nd_dax_kmem(struct device *dev)
+{
+	struct nd_dax_kmem *nd_dax_kmem = container_of(dev, struct nd_dax_kmem, nd_pfn.dev);
+
+	WARN_ON(!is_nd_dax_kmem(dev));
+	return nd_dax_kmem;
+}
+EXPORT_SYMBOL(to_nd_dax_kmem);
+/* end nd_dax_kmem */
+
 static const struct attribute_group *nd_dax_attribute_groups[] = {
 	&nd_pfn_attribute_group,
 	&nd_device_attribute_group,
diff -puN drivers/nvdimm/nd.h~dax-kmem-try-again-2018-4-bus-dev-type-kmem drivers/nvdimm/nd.h
--- a/drivers/nvdimm/nd.h~dax-kmem-try-again-2018-4-bus-dev-type-kmem	2018-10-22 13:12:23.027930389 -0700
+++ b/drivers/nvdimm/nd.h	2018-10-22 13:12:23.031930389 -0700
@@ -215,6 +215,10 @@ struct nd_dax {
 	struct nd_pfn nd_pfn;
 };
 
+struct nd_dax_kmem {
+	struct nd_pfn nd_pfn;
+};
+
 enum nd_async_mode {
 	ND_SYNC,
 	ND_ASYNC,
@@ -318,9 +322,11 @@ static inline int nd_pfn_validate(struct
 #endif
 
 struct nd_dax *to_nd_dax(struct device *dev);
+struct nd_dax_kmem *to_nd_dax_kmem(struct device *dev);
 #if IS_ENABLED(CONFIG_NVDIMM_DAX)
 int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns);
 bool is_nd_dax(struct device *dev);
+bool is_nd_dax_kmem(struct device *dev);
 struct device *nd_dax_create(struct nd_region *nd_region);
 #else
 static inline int nd_dax_probe(struct device *dev,
_

* [PATCH 6/9] mm/memory-hotplug: allow memory resources to be children
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (4 preceding siblings ...)
  2018-10-22 20:13 ` [PATCH 5/9] dax/kmem: add more nd dax kmem infrastructure Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-22 20:13 ` [PATCH 7/9] dax/kmem: actually perform memory hotplug Dave Hansen
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


The kernel/resource.c code is used to manage the physical address
space.  We can view the current resource configuration in
/proc/iomem.  An example of this is at the bottom of this
description.

The nvdimm subsystem "owns" the physical address resources which
map to persistent memory and has resources inserted for them as
"Persistent Memory".  We want to use this persistent memory, but
as volatile memory, just like RAM.  The best way to do this is
to leave the existing resource in place, but add a "System RAM"
resource underneath it. This clearly communicates the ownership
relationship of this memory.

The request_resource_conflict() API only deals with the
top-level resources.  Replace it with __request_region() which
will search for !IORESOURCE_BUSY areas lower in the resource
tree than the top level.

We also rework the old error message a bit since we do not get
the conflicting entry back: only an indication that we *had* a
conflict.

We *could* also simply truncate the existing top-level
"Persistent Memory" resource and take over the released address
space.  But, this means that if we ever decide to hot-unplug the
"RAM" and give it back, we need to recreate the original setup,
which may mean going back to the BIOS tables.

This should have no real effect on the existing collision
detection because the areas that truly conflict should be marked
IORESOURCE_BUSY.

00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000f0000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-9fffffff : System RAM
  01000000-01e071d0 : Kernel code
  01e071d1-027dfdff : Kernel data
  02dc6000-0305dfff : Kernel bss
a0000000-afffffff : Persistent Memory (legacy)
  a0000000-a7ffffff : System RAM
b0000000-bffdffff : System RAM
bffe0000-bfffffff : Reserved
c0000000-febfffff : PCI Bus 0000:00

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/mm/memory_hotplug.c |   31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff -puN mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child	2018-10-22 13:12:23.570930388 -0700
+++ b/mm/memory_hotplug.c	2018-10-22 13:12:23.573930388 -0700
@@ -99,24 +99,21 @@ void mem_hotplug_done(void)
 /* add this memory to iomem resource */
 static struct resource *register_memory_resource(u64 start, u64 size)
 {
-	struct resource *res, *conflict;
-	res = kzalloc(sizeof(struct resource), GFP_KERNEL);
-	if (!res)
-		return ERR_PTR(-ENOMEM);
+	struct resource *res;
+	unsigned long flags =  IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	char resource_name[] = "System RAM";
 
-	res->name = "System RAM";
-	res->start = start;
-	res->end = start + size - 1;
-	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	conflict =  request_resource_conflict(&iomem_resource, res);
-	if (conflict) {
-		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
-			pr_debug("Device unaddressable memory block "
-				 "memory hotplug at %#010llx !\n",
-				 (unsigned long long)start);
-		}
-		pr_debug("System RAM resource %pR cannot be added\n", res);
-		kfree(res);
+	/*
+	 * Request ownership of the new memory range.  This might be
+	 * a child of an existing resource that was present but
+	 * not marked as busy.
+	 */
+	res = __request_region(&iomem_resource, start, size,
+			       resource_name, flags);
+
+	if (!res) {
+		pr_debug("Unable to reserve System RAM region: %016llx->%016llx\n",
+				start, start + size);
 		return ERR_PTR(-EEXIST);
 	}
 	return res;
_

* [PATCH 7/9] dax/kmem: actually perform memory hotplug
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (5 preceding siblings ...)
  2018-10-22 20:13 ` [PATCH 6/9] mm/memory-hotplug: allow memory resources to be children Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-22 20:13 ` [PATCH 8/9] dax/kmem: let walk_system_ram_range() search child resources Dave Hansen
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


This is the meat of this whole series.  When the "kmem" device's
probe function is called and we know we have a good persistent
memory device, hotplug the memory back into the main kernel.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/drivers/dax/kmem.c |   28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff -puN drivers/dax/kmem.c~dax-kmem-hotplug drivers/dax/kmem.c
--- a/drivers/dax/kmem.c~dax-kmem-hotplug	2018-10-22 13:12:24.069930387 -0700
+++ b/drivers/dax/kmem.c	2018-10-22 13:12:24.072930387 -0700
@@ -55,10 +55,12 @@ static void dax_kmem_percpu_kill(void *d
 static int dax_kmem_probe(struct device *dev)
 {
 	void *addr;
+	int numa_node;
 	struct resource res;
 	int rc, id, region_id;
 	struct nd_pfn_sb *pfn_sb;
 	struct dev_dax *dev_dax;
+	struct resource *new_res;
 	struct dax_kmem *dax_kmem;
 	struct nd_namespace_io *nsio;
 	struct dax_region *dax_region;
@@ -86,13 +88,30 @@ static int dax_kmem_probe(struct device
 
 	pfn_sb = nd_pfn->pfn_sb;
 
-	if (!devm_request_mem_region(dev, nsio->res.start,
-				resource_size(&nsio->res),
-				dev_name(&ndns->dev))) {
+	new_res = devm_request_mem_region(dev, nsio->res.start,
+					  resource_size(&nsio->res),
+					  "System RAM (pmem)");
+	if (!new_res) {
 		dev_warn(dev, "could not reserve region %pR\n", &nsio->res);
 		return -EBUSY;
 	}
 
+	/*
+	 * Set flags appropriate for System RAM.  Leave ..._BUSY clear
+	 * so that add_memory() can add a child resource.
+	 */
+	new_res->flags = IORESOURCE_SYSTEM_RAM;
+
+	numa_node = dev_to_node(dev);
+	if (numa_node < 0) {
+		pr_warn_once("bad numa_node: %d, forcing to 0\n", numa_node);
+		numa_node = 0;
+	}
+
+	rc = add_memory(numa_node, nsio->res.start, resource_size(&nsio->res));
+	if (rc)
+		return rc;
+
 	dax_kmem->dev = dev;
 	init_completion(&dax_kmem->cmp);
 	rc = percpu_ref_init(&dax_kmem->ref, dax_kmem_percpu_release, 0,
@@ -106,6 +125,9 @@ static int dax_kmem_probe(struct device
 		return rc;
 
 	dax_kmem->pgmap.ref = &dax_kmem->ref;
+
+	dax_kmem->pgmap.res.name = "name_kmem_override2";
+
 	addr = devm_memremap_pages(dev, &dax_kmem->pgmap);
 	if (IS_ERR(addr))
 		return PTR_ERR(addr);
_

* [PATCH 8/9] dax/kmem: let walk_system_ram_range() search child resources
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (6 preceding siblings ...)
  2018-10-22 20:13 ` [PATCH 7/9] dax/kmem: actually perform memory hotplug Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-22 20:13 ` [PATCH 9/9] dax/kmem: actually enable the code in Makefile Dave Hansen
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


In the process of onlining memory, we use walk_system_ram_range()
to find the actual RAM areas inside of the area being onlined.

However, it currently only finds memory resources which are
"top-level" iomem_resources.  Children are not currently
searched which causes it to skip System RAM in areas like this
(in the format of /proc/iomem):

a0000000-bfffffff : Persistent Memory (legacy)
  a0000000-afffffff : System RAM

Changing the true->false here allows children to be searched
as well.  We need this because we add a new "System RAM"
resource underneath the "persistent memory" resource when
we use persistent memory in a volatile mode.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/kernel/resource.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff -puN kernel/resource.c~mm-walk_system_ram_range-search-child-resources kernel/resource.c
--- a/kernel/resource.c~mm-walk_system_ram_range-search-child-resources	2018-10-22 13:12:24.565930386 -0700
+++ b/kernel/resource.c	2018-10-22 13:12:24.572930386 -0700
@@ -445,6 +445,9 @@ int walk_mem_res(u64 start, u64 end, voi
  * This function calls the @func callback against all memory ranges of type
  * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY.
  * It is to be used only for System RAM.
+ *
+ * This will find System RAM ranges that are children of top-level resources
+ * in addition to top-level System RAM resources.
  */
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 			  void *arg, int (*func)(unsigned long, unsigned long, void *))
@@ -460,7 +463,7 @@ int walk_system_ram_range(unsigned long
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
-				    true, &res)) {
+				    false, &res)) {
 		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		end_pfn = (res.end + 1) >> PAGE_SHIFT;
 		if (end_pfn > pfn)
_

* [PATCH 9/9] dax/kmem: actually enable the code in Makefile
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (7 preceding siblings ...)
  2018-10-22 20:13 ` [PATCH 8/9] dax/kmem: let walk_system_ram_range() search child resources Dave Hansen
@ 2018-10-22 20:13 ` Dave Hansen
  2018-10-23  1:05 ` [PATCH 0/9] Allow persistent memory to be used like normal RAM Dan Williams
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-22 20:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, Dave Hansen, ying.huang,
	linux-mm, zwisler, fengguang.wu, akpm


Most of the new code was dead up to this point.  Now that
all the pieces are in place, enable it.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>

---

 b/drivers/dax/Makefile |    2 ++
 1 file changed, 2 insertions(+)

diff -puN drivers/dax/Makefile~dax-kmem-makefile drivers/dax/Makefile
--- a/drivers/dax/Makefile~dax-kmem-makefile	2018-10-22 13:12:25.068930384 -0700
+++ b/drivers/dax/Makefile	2018-10-22 13:12:25.071930384 -0700
@@ -2,7 +2,9 @@
 obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
+obj-$(CONFIG_DEV_DAX_PMEM) += dax_kmem.o
 
 dax-y := super.o
 dax_pmem-y := pmem.o
+dax_kmem-y := kmem.o
 device_dax-y := device.o
_

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (8 preceding siblings ...)
  2018-10-22 20:13 ` [PATCH 9/9] dax/kmem: actually enable the code in Makefile Dave Hansen
@ 2018-10-23  1:05 ` Dan Williams
  2018-10-23  1:11   ` Dan Williams
  2018-10-23 18:12   ` Elliott, Robert (Persistent Memory)
  2018-10-26  5:42 ` Xishi Qiu
                   ` (3 subsequent siblings)
  13 siblings, 2 replies; 26+ messages in thread
From: Dan Williams @ 2018-10-23  1:05 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Linux Kernel Mailing List, Dave Jiang, zwisler, Vishal L Verma,
	Tom Lendacky, Andrew Morton, Michal Hocko, linux-nvdimm,
	Linux MM, Huang, Ying, Fengguang Wu

On Mon, Oct 22, 2018 at 1:18 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
>
> Persistent memory is cool.  But, currently, you have to rewrite
> your applications to use it.  Wouldn't it be cool if you could
> just have it show up in your system like normal RAM and get to
> it like a slow blob of memory?  Well... have I got the patch
> series for you!
>
> This series adds a new "driver" to which pmem devices can be
> attached.  Once attached, the memory "owned" by the device is
> hot-added to the kernel and managed like any other memory.  On
> systems with an HMAT (a new ACPI table), each socket (roughly)
> will have a separate NUMA node for its persistent memory so
> this newly-added memory can be selected by its unique NUMA
> node.
>
> This is highly RFC, and I really want the feedback from the
> nvdimm/pmem folks about whether this is a viable long-term
> perversion of their code and device mode.  It's insufficiently
> documented and probably not bisectable either.
>
> Todo:
> 1. The device re-binding hacks are ham-fisted at best.  We
>    need a better way of doing this, especially so the kmem
>    driver does not get in the way of normal pmem devices.
> 2. When the device has no proper node, we default it to
>    NUMA node 0.  Is that OK?
> 3. We muck with the 'struct resource' code quite a bit. It
>    definitely needs a once-over from folks more familiar
>    with it than I.
> 4. Is there a better way to do this than starting with a
>    copy of pmem.c?

So I don't think we want to do patch 2, 3, or 5. Just jump to patch 7
and remove all the devm_memremap_pages() infrastructure and dax_region
infrastructure.

The driver should be a dead simple turn around to call add_memory()
for the passed in range. The hard part is, as you say, arranging for
the kmem driver to not stand in the way of typical range / device
claims by the dax_pmem device.

To me this looks like teaching the nvdimm-bus and this dax_kmem driver
to require explicit matching based on 'id'. The attachment scheme
would look like this:

modprobe dax_kmem
echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/new_id
echo dax0.0 > /sys/bus/nd/drivers/dax_pmem/unbind
echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/bind

At step 1 the dax_kmem driver will match no devices and stay out of
the way of dax_pmem. It learns about devices it cares about by being
explicitly told about them. Then unbind from the typical dax_pmem
driver and attach to dax_kmem to perform the one-way hotplug.

I expect udev can automate this by setting up a rule to watch for
device-dax instances by UUID and call a script to do the detach /
reattach dance.
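
To make that concrete, here is an untested sketch of such a rule
plus helper script (file names, the match, and the device name
are all made up; a real rule would key off the namespace UUID):

	# /etc/udev/rules.d/99-dax-kmem.rules
	ACTION=="add", SUBSYSTEM=="dax", KERNEL=="dax0.0", \
	    RUN+="/usr/local/sbin/dax-to-kmem dax0.0"

	#!/bin/sh
	# /usr/local/sbin/dax-to-kmem: the detach / reattach dance
	dev="$1"
	echo "$dev" > /sys/bus/nd/drivers/dax_kmem/new_id
	echo "$dev" > /sys/bus/nd/drivers/dax_pmem/unbind
	echo "$dev" > /sys/bus/nd/drivers/dax_kmem/bind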


* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-23  1:05 ` [PATCH 0/9] Allow persistent memory to be used like normal RAM Dan Williams
@ 2018-10-23  1:11   ` Dan Williams
  2018-10-26  8:03     ` Xishi Qiu
  2018-10-27  4:45     ` Dan Williams
  2018-10-23 18:12   ` Elliott, Robert (Persistent Memory)
  1 sibling, 2 replies; 26+ messages in thread
From: Dan Williams @ 2018-10-23  1:11 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Tom Lendacky, Michal Hocko, linux-nvdimm, Huang, Ying,
	Linux Kernel Mailing List, Linux MM, zwisler, Andrew Morton,
	Fengguang Wu

On Mon, Oct 22, 2018 at 6:05 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Mon, Oct 22, 2018 at 1:18 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
> >
> > Persistent memory is cool.  But, currently, you have to rewrite
> > your applications to use it.  Wouldn't it be cool if you could
> > just have it show up in your system like normal RAM and get to
> > it like a slow blob of memory?  Well... have I got the patch
> > series for you!
> >
> > This series adds a new "driver" to which pmem devices can be
> > attached.  Once attached, the memory "owned" by the device is
> > hot-added to the kernel and managed like any other memory.  On
> > systems with an HMAT (a new ACPI table), each socket (roughly)
> > will have a separate NUMA node for its persistent memory so
> > this newly-added memory can be selected by its unique NUMA
> > node.
> >
> > This is highly RFC, and I really want the feedback from the
> > nvdimm/pmem folks about whether this is a viable long-term
> > perversion of their code and device mode.  It's insufficiently
> > documented and probably not bisectable either.
> >
> > Todo:
> > 1. The device re-binding hacks are ham-fisted at best.  We
> >    need a better way of doing this, especially so the kmem
> >    driver does not get in the way of normal pmem devices.
> > 2. When the device has no proper node, we default it to
> >    NUMA node 0.  Is that OK?
> > 3. We muck with the 'struct resource' code quite a bit. It
> >    definitely needs a once-over from folks more familiar
> >    with it than I.
> > 4. Is there a better way to do this than starting with a
> >    copy of pmem.c?
>
> So I don't think we want to do patch 2, 3, or 5. Just jump to patch 7
> and remove all the devm_memremap_pages() infrastructure and dax_region
> infrastructure.
>
> The driver should be a dead simple turn around to call add_memory()
> for the passed in range. The hard part is, as you say, arranging for
> the kmem driver to not stand in the way of typical range / device
> claims by the dax_pmem device.
>
> To me this looks like teaching the nvdimm-bus and this dax_kmem driver
> to require explicit matching based on 'id'. The attachment scheme
> would look like this:
>
> modprobe dax_kmem
> echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/new_id
> echo dax0.0 > /sys/bus/nd/drivers/dax_pmem/unbind
> echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/bind
>
> At step 1 the dax_kmem driver will match no devices and stay out of
> the way of dax_pmem. It learns about devices it cares about by being
> explicitly told about them. Then unbind from the typical dax_pmem
> driver and attach to dax_kmem to perform the one-way hotplug.
>
> I expect udev can automate this by setting up a rule to watch for
> device-dax instances by UUID and call a script to do the detach /
> reattach dance.

The next question is how to support this for ranges that don't
originate from the pmem sub-system. I expect we want dax_kmem to
register a generic platform device representing the range and have a
generic platform driver that turns around and does the add_memory().

* Re: [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX
  2018-10-22 20:13 ` [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX Dave Hansen
@ 2018-10-23  1:56   ` Randy Dunlap
  0 siblings, 0 replies; 26+ messages in thread
From: Randy Dunlap @ 2018-10-23  1:56 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, ying.huang, linux-mm,
	zwisler, fengguang.wu, akpm

On 10/22/18 1:13 PM, Dave Hansen wrote:
> Add the actual driver which will own the DAX range.  This
> allows very nice party with the other possible "owners" of

Good to see a nice party sometimes.  :)

> a DAX region: device DAX and filesystem DAX.  It also greatly
> simplifies the process of handing off control of the memory
> between the different owners since it's just a matter of
> unbinding and rebinding the device to different drivers.


-- 
~Randy

* RE: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-23  1:05 ` [PATCH 0/9] Allow persistent memory to be used like normal RAM Dan Williams
  2018-10-23  1:11   ` Dan Williams
@ 2018-10-23 18:12   ` Elliott, Robert (Persistent Memory)
  2018-10-23 18:16     ` Dave Hansen
  1 sibling, 1 reply; 26+ messages in thread
From: Elliott, Robert (Persistent Memory) @ 2018-10-23 18:12 UTC (permalink / raw)
  To: 'Dan Williams', Dave Hansen
  Cc: Tom Lendacky, Hocko, Michal, linux-nvdimm, zwisler,
	Linux Kernel Mailing List, Linux MM, Huang, Ying, Andrew Morton,
	Fengguang Wu



> -----Original Message-----
> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On Behalf Of Dan Williams
> Sent: Monday, October 22, 2018 8:05 PM
> Subject: Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
> 
> On Mon, Oct 22, 2018 at 1:18 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
...
> This series adds a new "driver" to which pmem devices can be
> attached.  Once attached, the memory "owned" by the device is
> hot-added to the kernel and managed like any other memory.  On

Would this memory be considered volatile (with the driver initializing
it to zeros), or persistent (contents are presented unchanged,
applications may guarantee persistence by using cache flush
instructions, fence instructions, and writing to flush hint addresses
per the persistent memory programming model)?

> > 1. The device re-binding hacks are ham-fisted at best.  We
> >    need a better way of doing this, especially so the kmem
> >    driver does not get in the way of normal pmem devices.
...
> To me this looks like teaching the nvdimm-bus and this dax_kmem driver
> to require explicit matching based on 'id'. The attachment scheme
> would look like this:
> 
> modprobe dax_kmem
> echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/new_id
> echo dax0.0 > /sys/bus/nd/drivers/dax_pmem/unbind
> echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/bind
> 
> At step1 the dax_kmem drivers will match no devices and stays out of
> the way of dax_pmem. It learns about devices it cares about by being
> explicitly told about them. Then unbind from the typical dax_pmem
> driver and attach to dax_kmem to perform the one way hotplug.
> 
> I expect udev can automate this by setting up a rule to watch for
> device-dax instances by UUID and call a script to do the detach /
> reattach dance.

Where would that rule be stored? Storing it on another device
is problematic. If that rule is lost, it could confuse other
drivers trying to grab device DAX devices for use as persistent
memory.

A new namespace mode would record the intended usage in the
device itself, eliminating dependencies. It could join the
other modes like:

	ndctl create-namespace -m raw
		create /dev/pmem4 block device
	ndctl create-namespace -m sector
		create /dev/pmem4s block device
	ndctl create-namespace -m fsdax
		create /dev/pmem4 block device
	ndctl create-namespace -m devdax
		create /dev/dax4.3 character device
		for use as persistent memory
	ndctl create-namespace -m mem
		create /dev/mem4.3 character device
		for use as volatile memory

---
Robert Elliott, HPE Persistent Memory




* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-23 18:12   ` Elliott, Robert (Persistent Memory)
@ 2018-10-23 18:16     ` Dave Hansen
  2018-10-23 18:58       ` Dan Williams
  0 siblings, 1 reply; 26+ messages in thread
From: Dave Hansen @ 2018-10-23 18:16 UTC (permalink / raw)
  To: Elliott, Robert (Persistent Memory), 'Dan Williams', Dave Hansen
  Cc: Tom Lendacky, Hocko, Michal, linux-nvdimm, zwisler,
	Linux Kernel Mailing List, Linux MM, Huang, Ying, Andrew Morton,
	Fengguang Wu

>> This series adds a new "driver" to which pmem devices can be
>> attached.  Once attached, the memory "owned" by the device is
>> hot-added to the kernel and managed like any other memory.  On
> 
> Would this memory be considered volatile (with the driver initializing
> it to zeros), or persistent (contents are presented unchanged,
> applications may guarantee persistence by using cache flush
> instructions, fence instructions, and writing to flush hint addresses
> per the persistent memory programming model)?

Volatile.

>> I expect udev can automate this by setting up a rule to watch for
>> device-dax instances by UUID and call a script to do the detach /
>> reattach dance.
> 
> Where would that rule be stored? Storing it on another device
> is problematic. If that rule is lost, it could confuse other
> drivers trying to grab device DAX devices for use as persistent
> memory.

Well, we do lots of things like stable device naming from udev scripts.
 We depend on them not being lost.  At least this "fails safe" so we'll
default to persistence instead of defaulting to "eat your data".


* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-23 18:16     ` Dave Hansen
@ 2018-10-23 18:58       ` Dan Williams
  0 siblings, 0 replies; 26+ messages in thread
From: Dan Williams @ 2018-10-23 18:58 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Tom Lendacky, Michal Hocko, linux-nvdimm, Dave Hansen,
	Linux Kernel Mailing List, Linux MM, Huang, Ying, zwisler,
	Andrew Morton, Fengguang Wu

On Tue, Oct 23, 2018 at 11:17 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> >> This series adds a new "driver" to which pmem devices can be
> >> attached.  Once attached, the memory "owned" by the device is
> >> hot-added to the kernel and managed like any other memory.  On
> >
> > Would this memory be considered volatile (with the driver initializing
> > it to zeros), or persistent (contents are presented unchanged,
> > applications may guarantee persistence by using cache flush
> > instructions, fence instructions, and writing to flush hint addresses
> > per the persistent memory programming model)?
>
> Volatile.
>
> >> I expect udev can automate this by setting up a rule to watch for
> >> device-dax instances by UUID and call a script to do the detach /
> >> reattach dance.
> >
> > Where would that rule be stored? Storing it on another device
> > is problematic. If that rule is lost, it could confuse other
> > drivers trying to grab device DAX devices for use as persistent
> > memory.
>
> Well, we do lots of things like stable device naming from udev scripts.
>  We depend on them not being lost.  At least this "fails safe" so we'll
> default to persistence instead of defaulting to "eat your data".
>

Right, and at least for the persistent memory to volatile conversion
case we will have the UUID to positively identify the DAX device. So
it will indeed "fail safe" and just become a dax_pmem device again if
the configuration is lost. We'll likely need to create/use a "by-path"
scheme for non-pmem use cases.

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (9 preceding siblings ...)
  2018-10-23  1:05 ` [PATCH 0/9] Allow persistent memory to be used like normal RAM Dan Williams
@ 2018-10-26  5:42 ` Xishi Qiu
  2018-10-26  9:03   ` Fengguang Wu
  2018-10-27 11:00 ` Fengguang Wu
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Xishi Qiu @ 2018-10-26  5:42 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel
  Cc: thomas.lendacky, mhocko, Xishi Qiu, linux-nvdimm, ying.huang,
	linux-mm, zy107165, zwisler, fengguang.wu, akpm

Hi Dave,

This patchset hot-adds a pmem device and uses it like normal
DRAM.  I have some questions here, and I think my production
environment may also be concerned.

1) How do we set the AEP (Apache Pass) usage percentage for one
process (or a vma)?
e.g. there are two VMs from two customers who pay different
amounts for their VMs.  If we allocate and convert AEP/DRAM
globally, the high-load VM may get 100% DRAM and the low-load
VM may get 100% AEP, which is unfair.  The "low load" is only
low relative to the other VM; from its own point of view, the
load may actually be high.

2) I find that page idle tracking only checks the accessed bit,
_PAGE_BIT_ACCESSED.  As we know, AEP read performance is much
higher than write performance, so I think we should also check
the dirty bit, _PAGE_BIT_DIRTY.  Testing and clearing the dirty
bit is safe for anon pages, but unsafe for file pages, e.g. we
should call clear_page_dirty_for_io() first, right?

3) I think we should manage the AEP memory separately instead
of together with the DRAM.  Managing them together may require
fewer code changes, but it will cause problems for high-priority
DRAM allocations: if there is no DRAM left, we have to convert
(steal) DRAM from someone else, which takes much time.
How about creating a new zone, e.g. ZONE_AEP, and using madvise
to set a new flag VM_AEP, which would let the vma allocate AEP
memory at page fault time, then using vma_rss_stat (like
mm_rss_stat) to control the AEP usage percentage for a vma?

4) I am interested in the conversion mechanism between AEP
and DRAM.  I think NUMA balancing will cause page faults, which
is unacceptable for some apps since it causes performance jitter.
And kswapd is not precise enough.  So a daemon kernel thread
(like khugepaged) may be a good solution: add the processes that
use AEP to a list, then scan the VM_AEP-marked vmas, get the
access state, and do the conversion.

Thanks,
Xishi Qiu

On 2018/10/23 04:13, Dave Hansen wrote:
> Persistent memory is cool.  But, currently, you have to rewrite
> your applications to use it.  [...]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-23  1:11   ` Dan Williams
@ 2018-10-26  8:03     ` Xishi Qiu
  2018-10-26 13:58       ` Dave Hansen
  2018-10-27  4:45     ` Dan Williams
  1 sibling, 1 reply; 26+ messages in thread
From: Xishi Qiu @ 2018-10-26  8:03 UTC (permalink / raw)
  To: Dan Williams, Dave Hansen
  Cc: Linux Kernel Mailing List, Dave Jiang, zwisler, Vishal L Verma,
	Tom Lendacky, Andrew Morton, Michal Hocko, linux-nvdimm,
	Linux MM, Huang, Ying, Fengguang Wu, Xishi Qiu, zy107165

Hi Dan,

How about letting the BIOS report a new type for kmem in the e820 table?
e.g.
#define E820_PMEM	7
#define E820_KMEM	8

Then pmem and kmem would be separate, and we could easily hot-add kmem
to the memory subsystem without disturbing the existing code (e.g. pmem,
nvdimm, dax...).

I don't know whether Intel will change some hardware features in the
future for pmem that is used as volatile memory. Perhaps something
faster than pmem, cheaper, but volatile, with no need to care about
atomicity, consistency, or the L2/L3 cache...

Another question: why call it kmem? What does the "k" stand for?

Thanks,
Xishi Qiu

On 2018/10/23 09:11, Dan Williams wrote:
> On Mon, Oct 22, 2018 at 6:05 PM Dan Williams <dan.j.williams@intel.com> wrote:
>>
>> On Mon, Oct 22, 2018 at 1:18 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
>>>
>>> Persistent memory is cool.  But, currently, you have to rewrite
>>> your applications to use it.  [...]
>>
>> So I don't think we want to do patch 2, 3, or 5. Just jump to patch 7
>> and remove all the devm_memremap_pages() infrastructure and dax_region
>> infrastructure.
>>
>> The driver should be a dead simple turn around to call add_memory()
>> for the passed in range. The hard part is, as you say, arranging for
>> the kmem driver to not stand in the way of typical range / device
>> claims by the dax_pmem device.
>>
>> To me this looks like teaching the nvdimm-bus and this dax_kmem driver
>> to require explicit matching based on 'id'. The attachment scheme
>> would look like this:
>>
>> modprobe dax_kmem
>> echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/new_id
>> echo dax0.0 > /sys/bus/nd/drivers/dax_pmem/unbind
>> echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/bind
>>
>> At step1 the dax_kmem drivers will match no devices and stays out of
>> the way of dax_pmem. It learns about devices it cares about by being
>> explicitly told about them. Then unbind from the typical dax_pmem
>> driver and attach to dax_kmem to perform the one way hotplug.
>>
>> I expect udev can automate this by setting up a rule to watch for
>> device-dax instances by UUID and call a script to do the detach /
>> reattach dance.
> 
> The next question is how to support this for ranges that don't
> originate from the pmem sub-system. I expect we want dax_kmem to
> register a generic platform device representing the range and have a
>> generic platform driver that turns around and does the add_memory().
> 
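
Putting this sequence together with the onlining loop from the cover
letter, the whole hand-off might look like the sketch below -- rough
and untested, and it assumes the device name (dax0.0) and the sysfs
paths behave exactly as described above:

#!/bin/sh
dev=dax0.0

# Teach dax_kmem about the device, then move it off dax_pmem:
modprobe dax_kmem
echo $dev > /sys/bus/nd/drivers/dax_kmem/new_id
echo $dev > /sys/bus/nd/drivers/dax_pmem/unbind
echo $dev > /sys/bus/nd/drivers/dax_kmem/bind

# Online whatever memory sections showed up (the cover letter's loop):
for f in `grep -vl online /sys/devices/system/memory/*/state`; do
	echo online > $f
done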

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-26  5:42 ` Xishi Qiu
@ 2018-10-26  9:03   ` Fengguang Wu
  0 siblings, 0 replies; 26+ messages in thread
From: Fengguang Wu @ 2018-10-26  9:03 UTC (permalink / raw)
  To: Xishi Qiu
  Cc: thomas.lendacky, mhocko, Xishi Qiu, linux-nvdimm, Dave Hansen,
	ying.huang, linux-kernel, linux-mm, zy107165, zwisler, akpm

Hi Xishi,

I can help answer the migration and policy-related questions.

On Fri, Oct 26, 2018 at 01:42:43PM +0800, Xishi Qiu wrote:
>Hi Dave,
>
>This patchset hot-adds a pmem device and uses it like normal DRAM. I
>have some questions, and I think my production systems may be
>affected as well.
>
>1) How do we set the AEP (Apache Pass) usage percentage for one
>process (or a VMA)?
>e.g. there are two VMs from two customers who pay different amounts
>for their VMs. If we allocate and convert AEP/DRAM globally, the
>high-load VM may get 100% DRAM and the low-load VM may get 100% AEP,
>which is unfair. "Low load" is only relative to the other VM; in
>absolute terms that VM's load may actually be high.

Per-VM, per-process, and per-VMA policies are all possible. They can
be implemented in a user space migration daemon. We can dig into the
details when the user space code is released.
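
For example, such a daemon could lean on existing tooling like
migratepages(8) from numactl. A rough sketch, where the node numbers
and pids are hypothetical (DRAM on node 0, hot-added pmem on node 2):

low_pid=1234    # low-priority workload
high_pid=5678   # high-priority workload
while true; do
	migratepages $low_pid  0 2   # demote: DRAM node -> pmem node
	migratepages $high_pid 2 0   # promote: pmem node -> DRAM node
	sleep 10
done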

>2) I find that page idle tracking only checks the accessed bit,
>_PAGE_BIT_ACCESSED. As we know, AEP read performance is much higher
>than write performance, so I think we should also check the dirty
>bit, _PAGE_BIT_DIRTY.

Yeah, the dirty bit could be considered later. The initial version
will only check the accessed bit.

>Testing and clearing the dirty bit is safe for anonymous pages, but
>unsafe for file pages; e.g. we should call clear_page_dirty_for_io
>first, right?

We'll only migrate anonymous pages in the initial version.

>3) I think we should manage the AEP memory separately instead
>of together with the DRAM.

I guess the intention of this patchset is to use different
NUMA nodes for AEP and DRAM.

>Managing them together may mean fewer code changes, but it causes
>problems for high-priority DRAM allocations: if no DRAM is free, we
>have to convert (steal) DRAM from someone else, which takes a long
>time.
>How about creating a new zone, e.g. ZONE_AEP, and a new madvise
>flag, VM_AEP, which would let the VMA allocate AEP memory at page
>fault time? Then a per-VMA counter like vma_rss_stat (like
>mm_rss_stat) could control the AEP usage percentage for a VMA.
>
>4) I am interested in the conversion mechanism between AEP
>and DRAM. I think NUMA balancing causes page faults, which is
>unacceptable for some apps because it causes performance jitter.

NUMA balancing could be taught to be enabled per task. I'm not sure
such a knob exists yet, but it looks easy to implement such a
policy.
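
At least the global switch already exists; a per-task version would
be new. E.g.:

# Global automatic NUMA balancing knob (present in current kernels):
cat /proc/sys/kernel/numa_balancing
echo 0 > /proc/sys/kernel/numa_balancing   # disable on jitter-sensitive systems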

>And kswapd is not precise enough. So a kernel daemon thread
>(like khugepaged) may be a good solution: add the processes using
>AEP to a list, then scan the VM_AEP-marked VMAs, read the access
>state, and do the conversion.

If that's a desirable policy, our user space migration daemon could
possibly do that, too.

Thanks,
Fengguang

>On 2018/10/23 04:13, Dave Hansen wrote:
>> Persistent memory is cool.  But, currently, you have to rewrite
>> your applications to use it.  [...]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-26  8:03     ` Xishi Qiu
@ 2018-10-26 13:58       ` Dave Hansen
  0 siblings, 0 replies; 26+ messages in thread
From: Dave Hansen @ 2018-10-26 13:58 UTC (permalink / raw)
  To: Xishi Qiu, Dan Williams, Dave Hansen
  Cc: Tom Lendacky, Michal Hocko, Xishi Qiu, linux-nvdimm, Huang, Ying,
	Linux Kernel Mailing List, Linux MM, zy107165, zwisler,
	Andrew Morton, Fengguang Wu

On 10/26/18 1:03 AM, Xishi Qiu wrote:
> How about letting the BIOS report a new type for kmem in the e820 table?
> e.g.
> #define E820_PMEM	7
> #define E820_KMEM	8

It would be best if the BIOS just did this all for us.  But, what you're
describing would take years to get from concept to showing up in
someone's hands.  I'd rather not wait.

Plus, doing it the way I suggested gives the OS the most control.  The
BIOS isn't in the critical path to do the right thing.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-23  1:11   ` Dan Williams
  2018-10-26  8:03     ` Xishi Qiu
@ 2018-10-27  4:45     ` Dan Williams
  1 sibling, 0 replies; 26+ messages in thread
From: Dan Williams @ 2018-10-27  4:45 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Tom Lendacky, Michal Hocko, linux-nvdimm, Huang, Ying,
	Linux Kernel Mailing List, Linux MM, zwisler, Andrew Morton,
	Fengguang Wu

On Mon, Oct 22, 2018 at 6:11 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Mon, Oct 22, 2018 at 6:05 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > On Mon, Oct 22, 2018 at 1:18 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
> > >
> > > Persistent memory is cool.  But, currently, you have to rewrite
> > > your applications to use it.  [...]
> >
> > So I don't think we want to do patch 2, 3, or 5. Just jump to patch 7
> > and remove all the devm_memremap_pages() infrastructure and dax_region
> > infrastructure.
> >
> > The driver should be a dead simple turn around to call add_memory()
> > for the passed in range. The hard part is, as you say, arranging for
> > the kmem driver to not stand in the way of typical range / device
> > claims by the dax_pmem device.
> >
> > To me this looks like teaching the nvdimm-bus and this dax_kmem driver
> > to require explicit matching based on 'id'. The attachment scheme
> > would look like this:
> >
> > modprobe dax_kmem
> > echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/new_id
> > echo dax0.0 > /sys/bus/nd/drivers/dax_pmem/unbind
> > echo dax0.0 > /sys/bus/nd/drivers/dax_kmem/bind
> >
> > At step1 the dax_kmem drivers will match no devices and stays out of
> > the way of dax_pmem. It learns about devices it cares about by being
> > explicitly told about them. Then unbind from the typical dax_pmem
> > driver and attach to dax_kmem to perform the one way hotplug.
> >
> > I expect udev can automate this by setting up a rule to watch for
> > device-dax instances by UUID and call a script to do the detach /
> > reattach dance.
>
> The next question is how to support this for ranges that don't
> originate from the pmem sub-system. I expect we want dax_kmem to
> register a generic platform device representing the range and have a
> generic platform driver that turns around and does the add_memory().

I forgot I have some old patches that do something along these lines
and make device-dax its own bus. I'll dust those off so we can
discern what's left.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (10 preceding siblings ...)
  2018-10-26  5:42 ` Xishi Qiu
@ 2018-10-27 11:00 ` Fengguang Wu
  2018-10-31  5:11 ` Yang Shi
  2018-12-03  9:22 ` Brice Goglin
  13 siblings, 0 replies; 26+ messages in thread
From: Fengguang Wu @ 2018-10-27 11:00 UTC (permalink / raw)
  To: Dave Hansen
  Cc: thomas.lendacky, mhocko, linux-nvdimm, ying.huang, linux-kernel,
	linux-mm, zwisler, akpm

Hi Dave,

What's the base tree for this patchset? I tried 4.19, linux-next and
Dan's libnvdimm-for-next branch, but none applies cleanly.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (11 preceding siblings ...)
  2018-10-27 11:00 ` Fengguang Wu
@ 2018-10-31  5:11 ` Yang Shi
  2018-12-03  9:22 ` Brice Goglin
  13 siblings, 0 replies; 26+ messages in thread
From: Yang Shi @ 2018-10-31  5:11 UTC (permalink / raw)
  To: dave.hansen
  Cc: thomas.lendacky, Michal Hocko, linux-nvdimm, Huang Ying,
	Linux Kernel Mailing List, Linux MM, zwisler, fengguang.wu,
	Andrew Morton

On Mon, Oct 22, 2018 at 1:18 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
>
> Persistent memory is cool.  But, currently, you have to rewrite
> your applications to use it.  Wouldn't it be cool if you could
> just have it show up in your system like normal RAM and get to
> it like a slow blob of memory?  Well... have I got the patch
> series for you!
>
> This series adds a new "driver" to which pmem devices can be
> attached.  Once attached, the memory "owned" by the device is
> hot-added to the kernel and managed like any other memory.  On
> systems with an HMAT (a new ACPI table), each socket (roughly)
> will have a separate NUMA node for its persistent memory so
> this newly-added memory can be selected by its unique NUMA
> node.

Could you please elaborate on this? I suppose you mean the pmem will
show up as a separate NUMA node, right?

I would like to try the patches on real hardware; are there any
prerequisites?

Thanks,
Yang

> [...]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
                   ` (12 preceding siblings ...)
  2018-10-31  5:11 ` Yang Shi
@ 2018-12-03  9:22 ` Brice Goglin
  2018-12-03 16:56   ` Dave Hansen
  13 siblings, 1 reply; 26+ messages in thread
From: Brice Goglin @ 2018-12-03  9:22 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel
  Cc: dan.j.williams, dave.jiang, zwisler, vishal.l.verma,
	thomas.lendacky, akpm, mhocko, linux-nvdimm, linux-mm,
	ying.huang, fengguang.wu

Le 22/10/2018 à 22:13, Dave Hansen a écrit :
> Persistent memory is cool.  But, currently, you have to rewrite
> your applications to use it.  Wouldn't it be cool if you could
> just have it show up in your system like normal RAM and get to
> it like a slow blob of memory?  Well... have I got the patch
> series for you!
>
> This series adds a new "driver" to which pmem devices can be
> attached.  Once attached, the memory "owned" by the device is
> hot-added to the kernel and managed like any other memory.  On
> systems with an HMAT (a new ACPI table), each socket (roughly)
> will have a separate NUMA node for its persistent memory so
> this newly-added memory can be selected by its unique NUMA
> node.


Hello Dave

What happens on systems without an HMAT? Does this new memory get merged
into existing NUMA nodes?

Also, do you plan to have a way for applications to find out which NUMA
nodes are "real DRAM" while others are "pmem-backed"? (something like a
new attribute in /sys/devices/system/node/nodeX/) Or should we use HMAT
performance attributes for this?

Brice

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-12-03  9:22 ` Brice Goglin
@ 2018-12-03 16:56   ` Dave Hansen
  2018-12-03 17:16     ` Dan Williams
  0 siblings, 1 reply; 26+ messages in thread
From: Dave Hansen @ 2018-12-03 16:56 UTC (permalink / raw)
  To: Brice Goglin, Dave Hansen, linux-kernel
  Cc: thomas.lendacky, mhocko, linux-nvdimm, ying.huang, linux-mm,
	zwisler, fengguang.wu, akpm

On 12/3/18 1:22 AM, Brice Goglin wrote:
> What happens on systems without an HMAT? Does this new memory get merged
> into existing NUMA nodes?

It gets merged into the persistent memory device's node, as told by the
firmware.  Intel's persistent memory should always be in its own node,
separate from DRAM.
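
That also means an unmodified application can already be steered onto
(or kept off of) the pmem with standard NUMA tooling. For instance,
assuming the pmem came up as node 2 (a hypothetical node number):

numactl --membind=2 ./big_cold_data_app        # allocate only from pmem
numactl --membind=0 ./latency_sensitive_app    # keep it on DRAM (node 0)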

> Also, do you plan to have a way for applications to find out which NUMA
> nodes are "real DRAM" while others are "pmem-backed"? (something like a
> new attribute in /sys/devices/system/node/nodeX/) Or should we use HMAT
> performance attributes for this?

The best way is to use the sysfs-generic interfaces to the HMAT that
Keith Busch is pushing.  In the end, we really think folks will only
care about the memory's performance properties rather than whether it's
*actually* persistent memory or not.
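
Assuming that interface lands roughly as posted, discovery would be a
matter of reading the per-node performance attributes it exports
(node number hypothetical):

# Bandwidth/latency of node2's memory from its best-performing initiator:
cat /sys/devices/system/node/node2/access0/initiators/read_bandwidth
cat /sys/devices/system/node/node2/access0/initiators/read_latency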

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] Allow persistent memory to be used like normal RAM
  2018-12-03 16:56   ` Dave Hansen
@ 2018-12-03 17:16     ` Dan Williams
  0 siblings, 0 replies; 26+ messages in thread
From: Dan Williams @ 2018-12-03 17:16 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Tom Lendacky, Michal Hocko, linux-nvdimm, Brice Goglin,
	Dave Hansen, Huang, Ying,
	Linux Kernel Mailing List, Linux MM, zwisler, Andrew Morton,
	Fengguang Wu

On Mon, Dec 3, 2018 at 8:56 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 12/3/18 1:22 AM, Brice Goglin wrote:
> > Le 22/10/2018 à 22:13, Dave Hansen a écrit :
> > What happens on systems without an HMAT? Does this new memory get merged
> > into existing NUMA nodes?
>
> It gets merged into the persistent memory device's node, as told by the
> firmware.  Intel's persistent memory should always be in its own node,
> separate from DRAM.
>
> > Also, do you plan to have a way for applications to find out which NUMA
> > nodes are "real DRAM" while others are "pmem-backed"? (something like a
> > new attribute in /sys/devices/system/node/nodeX/) Or should we use HMAT
> > performance attributes for this?
>
> The best way is to use the sysfs-generic interfaces to the HMAT that
> Keith Busch is pushing.  In the end, we really think folks will only
> care about the memory's performance properties rather than whether it's
> *actually* persistent memory or not.

It's also important to point out that "persistent memory" by itself is
an ambiguous memory type. It's anything from new media with distinct
performance characteristics to battery-backed DRAM. I.e. the
performance of "persistent memory" may be indistinguishable from "real
DRAM".

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-12-03 17:16 UTC | newest]

Thread overview: 26+ messages
2018-10-22 20:13 [PATCH 0/9] Allow persistent memory to be used like normal RAM Dave Hansen
2018-10-22 20:13 ` [PATCH 1/9] mm/resource: return real error codes from walk failures Dave Hansen
2018-10-22 20:13 ` [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX Dave Hansen
2018-10-23  1:56   ` Randy Dunlap
2018-10-22 20:13 ` [PATCH 3/9] dax: add more kmem device infrastructure Dave Hansen
2018-10-22 20:13 ` [PATCH 4/9] dax/kmem: allow PMEM devices to bind to KMEM driver Dave Hansen
2018-10-22 20:13 ` [PATCH 5/9] dax/kmem: add more nd dax kmem infrastructure Dave Hansen
2018-10-22 20:13 ` [PATCH 6/9] mm/memory-hotplug: allow memory resources to be children Dave Hansen
2018-10-22 20:13 ` [PATCH 7/9] dax/kmem: actually perform memory hotplug Dave Hansen
2018-10-22 20:13 ` [PATCH 8/9] dax/kmem: let walk_system_ram_range() search child resources Dave Hansen
2018-10-22 20:13 ` [PATCH 9/9] dax/kmem: actually enable the code in Makefile Dave Hansen
2018-10-23  1:05 ` [PATCH 0/9] Allow persistent memory to be used like normal RAM Dan Williams
2018-10-23  1:11   ` Dan Williams
2018-10-26  8:03     ` Xishi Qiu
2018-10-26 13:58       ` Dave Hansen
2018-10-27  4:45     ` Dan Williams
2018-10-23 18:12   ` Elliott, Robert (Persistent Memory)
2018-10-23 18:16     ` Dave Hansen
2018-10-23 18:58       ` Dan Williams
2018-10-26  5:42 ` Xishi Qiu
2018-10-26  9:03   ` Fengguang Wu
2018-10-27 11:00 ` Fengguang Wu
2018-10-31  5:11 ` Yang Shi
2018-12-03  9:22 ` Brice Goglin
2018-12-03 16:56   ` Dave Hansen
2018-12-03 17:16     ` Dan Williams
