Linux-NVDIMM Archive on lore.kernel.org
 help / color / Atom feed
* [ndctl PATCH v2 00/10] fixes and movability for system-ram mode
@ 2019-10-20  3:23 Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 01/10] libdaxctl: refactor path construction in op_for_one_memblock() Vishal Verma
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

v2:
- Change MEM_FIND_ZONE to MEM_GET_ZONE (Dan)
- Remove the verbose race warning from runtime and just point to the man
  page (Dan)
- Remove the short option for '--no-movable' (Dan)

This patchset improves the user experience around memory onlining,
specifically the state it is onlined in - movable vs non-movable. It
also adds an option to make the memory non-movable when onlining.

Patches 1-3 perform some preparatory and clean up steps.
Patches 4-6 add a way to determine and display the 'movable' vs
'non-movable' state of memory.
Patches 7-8 attempt to detect a race with memory onlining, and add a
Documentation clarification
Patches 9-10 Add the new --no-movable option to commands that may
online memory.

Vishal Verma (10):
  libdaxctl: refactor path construction in op_for_one_memblock()
  libdaxctl: refactor memblock_is_online() checks
  daxctl/device.c: fix json output omission for reconfigure-device
  libdaxctl: add an API to determine if memory is movable
  libdaxctl: allow memblock_in_dev() to return an error
  daxctl: show a 'movable' attribute in device listings
  daxctl: detect races when onlining memory blocks
  Documentation: clarify memory movablity for reconfigure-device
  libdaxctl: add an API to online memory in a non-movable state
  daxctl: add --no-movable option for onlining memory

 Documentation/daxctl/daxctl-online-memory.txt |   2 +
 .../daxctl/daxctl-reconfigure-device.txt      |  24 +-
 Documentation/daxctl/movable-options.txt      |   9 +
 daxctl/device.c                               |  45 ++-
 daxctl/lib/libdaxctl-private.h                |  26 ++
 daxctl/lib/libdaxctl.c                        | 277 +++++++++++++-----
 daxctl/lib/libdaxctl.sym                      |   6 +
 daxctl/libdaxctl.h                            |   2 +
 util/json.c                                   |  14 +-
 9 files changed, 320 insertions(+), 85 deletions(-)
 create mode 100644 Documentation/daxctl/movable-options.txt

-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 01/10] libdaxctl: refactor path construction in op_for_one_memblock()
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 02/10] libdaxctl: refactor memblock_is_online() checks Vishal Verma
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

In preparation for memblock operations to check additional sysfs
attributes in the memoryXXX block, 'path' can't be prematurely set
to the memoryXXX/state file.

Push down path construction into each memory op helper, so that each
helper gets an opportunity to reconstruct and act upon multiple paths.

Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl.c | 64 +++++++++++++++++++++++++++++-------------
 1 file changed, 44 insertions(+), 20 deletions(-)

diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 4dfc524..a828644 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -1047,13 +1047,24 @@ DAXCTL_EXPORT unsigned long daxctl_memory_get_block_size(struct daxctl_memory *m
 	return mem->block_size;
 }
 
-static int online_one_memblock(struct daxctl_dev *dev, char *path)
+static int online_one_memblock(struct daxctl_memory *mem, char *memblock)
 {
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
 	const char *mode = "online_movable";
+	int len = mem->buf_len, rc;
 	char buf[SYSFS_ATTR_SIZE];
-	int rc;
+	char *path = mem->mem_buf;
+	const char *node_path;
+
+	node_path = daxctl_memory_get_node_path(mem);
+	if (!node_path)
+		return -ENXIO;
+
+	rc = snprintf(path, len, "%s/%s/state", node_path, memblock);
+	if (rc < 0)
+		return -ENOMEM;
 
 	rc = sysfs_read_attr(ctx, path, buf);
 	if (rc) {
@@ -1089,13 +1100,24 @@ static int online_one_memblock(struct daxctl_dev *dev, char *path)
 	return rc;
 }
 
-static int offline_one_memblock(struct daxctl_dev *dev, char *path)
+static int offline_one_memblock(struct daxctl_memory *mem, char *memblock)
 {
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
 	const char *mode = "offline";
+	int len = mem->buf_len, rc;
 	char buf[SYSFS_ATTR_SIZE];
-	int rc;
+	char *path = mem->mem_buf;
+	const char *node_path;
+
+	node_path = daxctl_memory_get_node_path(mem);
+	if (!node_path)
+		return -ENXIO;
+
+	rc = snprintf(path, len, "%s/%s/state", node_path, memblock);
+	if (rc < 0)
+		return -ENOMEM;
 
 	rc = sysfs_read_attr(ctx, path, buf);
 	if (rc) {
@@ -1121,12 +1143,23 @@ static int offline_one_memblock(struct daxctl_dev *dev, char *path)
 	return rc;
 }
 
-static int memblock_is_online(struct daxctl_dev *dev, char *path)
+static int memblock_is_online(struct daxctl_memory *mem, char *memblock)
 {
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	int len = mem->buf_len, rc;
 	char buf[SYSFS_ATTR_SIZE];
-	int rc;
+	char *path = mem->mem_buf;
+	const char *node_path;
+
+	node_path = daxctl_memory_get_node_path(mem);
+	if (!node_path)
+		return -ENXIO;
+
+	rc = snprintf(path, len, "%s/%s/state", node_path, memblock);
+	if (rc < 0)
+		return -ENOMEM;
 
 	rc = sysfs_read_attr(ctx, path, buf);
 	if (rc) {
@@ -1193,7 +1226,7 @@ static bool memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
 	return false;
 }
 
-static int op_for_one_memblock(struct daxctl_memory *mem, char *path,
+static int op_for_one_memblock(struct daxctl_memory *mem, char *memblock,
 		enum memory_op op)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
@@ -1203,11 +1236,11 @@ static int op_for_one_memblock(struct daxctl_memory *mem, char *path,
 
 	switch (op) {
 	case MEM_SET_ONLINE:
-		return online_one_memblock(dev, path);
+		return online_one_memblock(mem, memblock);
 	case MEM_SET_OFFLINE:
-		return offline_one_memblock(dev, path);
+		return offline_one_memblock(mem, memblock);
 	case MEM_IS_ONLINE:
-		rc = memblock_is_online(dev, path);
+		rc = memblock_is_online(mem, memblock);
 		if (rc < 0)
 			return rc;
 		/*
@@ -1245,19 +1278,10 @@ static int daxctl_memory_op(struct daxctl_memory *mem, enum memory_op op)
 
 	errno = 0;
 	while ((de = readdir(node_dir)) != NULL) {
-		char *path = mem->mem_buf;
-		int len = mem->buf_len;
-
 		if (strncmp(de->d_name, "memory", 6) == 0) {
 			if (!memblock_in_dev(mem, de->d_name))
 				continue;
-			rc = snprintf(path, len, "%s/%s/state",
-				node_path, de->d_name);
-			if (rc < 0) {
-				rc = -ENOMEM;
-				goto out_dir;
-			}
-			rc = op_for_one_memblock(mem, path, op);
+			rc = op_for_one_memblock(mem, de->d_name, op);
 			if (rc < 0)
 				goto out_dir;
 			if (rc == 0)
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 02/10] libdaxctl: refactor memblock_is_online() checks
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 01/10] libdaxctl: refactor path construction in op_for_one_memblock() Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 03/10] daxctl/device.c: fix json output omission for reconfigure-device Vishal Verma
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

The {online,offline}_one_memblock() helpers both open-coded the check
for whether a block is online. There is already a function to perform
this check - memblock_is_online(). Consolidate the checking using this
helper everywhere it is applicable.

Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl.c | 90 +++++++++++++++++-------------------------
 1 file changed, 37 insertions(+), 53 deletions(-)

diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index a828644..6243857 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -1047,12 +1047,11 @@ DAXCTL_EXPORT unsigned long daxctl_memory_get_block_size(struct daxctl_memory *m
 	return mem->block_size;
 }
 
-static int online_one_memblock(struct daxctl_memory *mem, char *memblock)
+static int memblock_is_online(struct daxctl_memory *mem, char *memblock)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
-	const char *mode = "online_movable";
 	int len = mem->buf_len, rc;
 	char buf[SYSFS_ATTR_SIZE];
 	char *path = mem->mem_buf;
@@ -1073,41 +1072,20 @@ static int online_one_memblock(struct daxctl_memory *mem, char *memblock)
 		return rc;
 	}
 
-	/*
-	 * if already online, possibly due to kernel config or a udev rule,
-	 * there is nothing to do and we can skip over the memblock
-	 */
 	if (strncmp(buf, "online", 6) == 0)
 		return 1;
 
-	rc = sysfs_write_attr_quiet(ctx, path, mode);
-	if (rc) {
-		/*
-		 * While we performed an already-online check above, there
-		 * is still a TOCTOU hole where someone (such as a udev rule)
-		 * may have raced to online the memory. In such a case,
-		 * the sysfs store will fail, however we can check for this
-		 * by simply reading the state again. If it changed to the
-		 * desired state, then we don't have to error out.
-		 */
-		if (sysfs_read_attr(ctx, path, buf) == 0) {
-			if (strncmp(buf, "online", 6) == 0)
-				return 1;
-		}
-		err(ctx, "%s: Failed to online %s: %s\n",
-			devname, path, strerror(-rc));
-	}
-	return rc;
+	/* offline */
+	return 0;
 }
 
-static int offline_one_memblock(struct daxctl_memory *mem, char *memblock)
+static int online_one_memblock(struct daxctl_memory *mem, char *memblock)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
-	const char *mode = "offline";
+	const char *mode = "online_movable";
 	int len = mem->buf_len, rc;
-	char buf[SYSFS_ATTR_SIZE];
 	char *path = mem->mem_buf;
 	const char *node_path;
 
@@ -1119,37 +1097,39 @@ static int offline_one_memblock(struct daxctl_memory *mem, char *memblock)
 	if (rc < 0)
 		return -ENOMEM;
 
-	rc = sysfs_read_attr(ctx, path, buf);
-	if (rc) {
-		err(ctx, "%s: Failed to read %s: %s\n",
-			devname, path, strerror(-rc));
+	/*
+	 * if already online, possibly due to kernel config or a udev rule,
+	 * there is nothing to do and we can skip over the memblock
+	 */
+	rc = memblock_is_online(mem, memblock);
+	if (rc)
 		return rc;
-	}
-
-	/* if already offline, there is nothing to do */
-	if (strncmp(buf, "offline", 7) == 0)
-		return 1;
 
 	rc = sysfs_write_attr_quiet(ctx, path, mode);
 	if (rc) {
-		/* Close the TOCTOU hole like in online_one_memblock() above */
-		if (sysfs_read_attr(ctx, path, buf) == 0) {
-			if (strncmp(buf, "offline", 7) == 0)
-				return 1;
-		}
-		err(ctx, "%s: Failed to offline %s: %s\n",
+		/*
+		 * While we performed an already-online check above, there
+		 * is still a TOCTOU hole where someone (such as a udev rule)
+		 * may have raced to online the memory. In such a case,
+		 * the sysfs store will fail, however we can check for this
+		 * by simply reading the state again. If it changed to the
+		 * desired state, then we don't have to error out.
+		 */
+		if (memblock_is_online(mem, memblock))
+			return 1;
+		err(ctx, "%s: Failed to online %s: %s\n",
 			devname, path, strerror(-rc));
 	}
 	return rc;
 }
 
-static int memblock_is_online(struct daxctl_memory *mem, char *memblock)
+static int offline_one_memblock(struct daxctl_memory *mem, char *memblock)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *mode = "offline";
 	int len = mem->buf_len, rc;
-	char buf[SYSFS_ATTR_SIZE];
 	char *path = mem->mem_buf;
 	const char *node_path;
 
@@ -1161,18 +1141,22 @@ static int memblock_is_online(struct daxctl_memory *mem, char *memblock)
 	if (rc < 0)
 		return -ENOMEM;
 
-	rc = sysfs_read_attr(ctx, path, buf);
-	if (rc) {
-		err(ctx, "%s: Failed to read %s: %s\n",
-			devname, path, strerror(-rc));
+	/* if already offline, there is nothing to do */
+	rc = memblock_is_online(mem, memblock);
+	if (rc < 0)
 		return rc;
-	}
-
-	if (strncmp(buf, "online", 6) == 0)
+	if (!rc)
 		return 1;
 
-	/* offline */
-	return 0;
+	rc = sysfs_write_attr_quiet(ctx, path, mode);
+	if (rc) {
+		/* Close the TOCTOU hole like in online_one_memblock() above */
+		if (!memblock_is_online(mem, memblock))
+			return 1;
+		err(ctx, "%s: Failed to offline %s: %s\n",
+			devname, path, strerror(-rc));
+	}
+	return rc;
 }
 
 static bool memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 03/10] daxctl/device.c: fix json output omission for reconfigure-device
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 01/10] libdaxctl: refactor path construction in op_for_one_memblock() Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 02/10] libdaxctl: refactor memblock_is_online() checks Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 04/10] libdaxctl: add an API to determine if memory is movable Vishal Verma
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen, Michal Biesek

The reconfig_mode_{devdax,system_ram}() function can have a positive
return status from memory online/offline operations, indicating skipped
memory blocks. Don't omit printing the device listing json in these
cases; the reconfiguration has succeeded, and its results should be
printed.

Link: https://github.com/pmem/ndctl/issues/115
Reported-by: Michal Biesek <michal.biesek@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/daxctl/device.c b/daxctl/device.c
index 4887ccf..920efc6 100644
--- a/daxctl/device.c
+++ b/daxctl/device.c
@@ -398,7 +398,7 @@ static int do_reconfig(struct daxctl_dev *dev, enum dev_mode mode,
 		rc = -EINVAL;
 	}
 
-	if (rc)
+	if (rc < 0)
 		return rc;
 
 	*jdevs = json_object_new_array();
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 04/10] libdaxctl: add an API to determine if memory is movable
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
                   ` (2 preceding siblings ...)
  2019-10-20  3:23 ` [ndctl PATCH v2 03/10] daxctl/device.c: fix json output omission for reconfigure-device Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 05/10] libdaxctl: allow memblock_in_dev() to return an error Vishal Verma
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

By default, daxctl always attempts to online new memory sections as
'movable' so that routine kernel allocations aren't serviced from this
memory, and the memory is later removable via hot-unplug.

System configuration, or other agents (such as udev rules) may race
'daxctl' to online memory, and this may result in the memory not being
'movable'. Add an interface to query the movability of a memory object
associated with a dax device.

This is in preparation to both display a 'movable' attribute in device
listings, as well as optionally allowing memory to be onlined as
non-movable.

Cc: Dan Williams <dan.j.williams@intel.com>
Reported-by: Ben Olson <ben.olson@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl-private.h | 20 +++++++++
 daxctl/lib/libdaxctl.c         | 77 ++++++++++++++++++++++++++++++++--
 daxctl/lib/libdaxctl.sym       |  5 +++
 daxctl/libdaxctl.h             |  1 +
 4 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/daxctl/lib/libdaxctl-private.h b/daxctl/lib/libdaxctl-private.h
index 7ba3c46..ebaecb8 100644
--- a/daxctl/lib/libdaxctl-private.h
+++ b/daxctl/lib/libdaxctl-private.h
@@ -44,6 +44,25 @@ enum memory_op {
 	MEM_SET_ONLINE,
 	MEM_IS_ONLINE,
 	MEM_COUNT,
+	MEM_GET_ZONE,
+};
+
+/* OR-able flags, 1, 2, 4, 8 etc */
+enum memory_op_status {
+	MEM_ST_OK = 0,
+	MEM_ST_ZONE_INCONSISTENT = 1,
+};
+
+enum memory_zones {
+	MEM_ZONE_UNKNOWN = 1,
+	MEM_ZONE_MOVABLE,
+	MEM_ZONE_NORMAL,
+};
+
+static const char *zone_strings[] = {
+	[MEM_ZONE_UNKNOWN] = "mixed",
+	[MEM_ZONE_NORMAL] = "Normal",
+	[MEM_ZONE_MOVABLE] = "Movable",
 };
 
 /**
@@ -86,6 +105,7 @@ struct daxctl_memory {
 	size_t buf_len;
 	char *node_path;
 	unsigned long block_size;
+	enum memory_zones zone;
 };
 
 
diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 6243857..03f38f2 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -1159,6 +1159,58 @@ static int offline_one_memblock(struct daxctl_memory *mem, char *memblock)
 	return rc;
 }
 
+static int memblock_find_zone(struct daxctl_memory *mem, char *memblock,
+		int *status)
+{
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	enum memory_zones cur_zone;
+	int len = mem->buf_len, rc;
+	char buf[SYSFS_ATTR_SIZE];
+	char *path = mem->mem_buf;
+	const char *node_path;
+
+	rc = memblock_is_online(mem, memblock);
+	if (rc < 0)
+		return rc;
+	if (rc == 0)
+		return -ENXIO;
+
+	node_path = daxctl_memory_get_node_path(mem);
+	if (!node_path)
+		return -ENXIO;
+
+	rc = snprintf(path, len, "%s/%s/valid_zones", node_path, memblock);
+	if (rc < 0)
+		return -ENOMEM;
+
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc) {
+		err(ctx, "%s: Failed to read %s: %s\n",
+			devname, path, strerror(-rc));
+		return rc;
+	}
+
+	if (strcmp(buf, zone_strings[MEM_ZONE_MOVABLE]) == 0)
+		cur_zone = MEM_ZONE_MOVABLE;
+	else if (strcmp(buf, zone_strings[MEM_ZONE_NORMAL]) == 0)
+		cur_zone = MEM_ZONE_NORMAL;
+	else
+		cur_zone = MEM_ZONE_UNKNOWN;
+
+	if (mem->zone) {
+		if (mem->zone == cur_zone)
+			return 0;
+		else
+			*status |= MEM_ST_ZONE_INCONSISTENT;
+	} else {
+		mem->zone = cur_zone;
+	}
+
+	return 0;
+}
+
 static bool memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
 {
 	const char *mem_base = "/sys/devices/system/memory/";
@@ -1211,7 +1263,7 @@ static bool memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
 }
 
 static int op_for_one_memblock(struct daxctl_memory *mem, char *memblock,
-		enum memory_op op)
+		enum memory_op op, int *status)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
@@ -1234,6 +1286,8 @@ static int op_for_one_memblock(struct daxctl_memory *mem, char *memblock,
 		return !rc;
 	case MEM_COUNT:
 		return 0;
+	case MEM_GET_ZONE:
+		return memblock_find_zone(mem, memblock, status);
 	}
 
 	err(ctx, "%s: BUG: unknown op: %d\n", devname, op);
@@ -1245,8 +1299,8 @@ static int daxctl_memory_op(struct daxctl_memory *mem, enum memory_op op)
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	int rc, count = 0, status_flags = 0;
 	const char *node_path;
-	int rc, count = 0;
 	struct dirent *de;
 	DIR *node_dir;
 
@@ -1265,7 +1319,8 @@ static int daxctl_memory_op(struct daxctl_memory *mem, enum memory_op op)
 		if (strncmp(de->d_name, "memory", 6) == 0) {
 			if (!memblock_in_dev(mem, de->d_name))
 				continue;
-			rc = op_for_one_memblock(mem, de->d_name, op);
+			rc = op_for_one_memblock(mem, de->d_name, op,
+					&status_flags);
 			if (rc < 0)
 				goto out_dir;
 			if (rc == 0)
@@ -1273,6 +1328,10 @@ static int daxctl_memory_op(struct daxctl_memory *mem, enum memory_op op)
 		}
 		errno = 0;
 	}
+
+	if (status_flags & MEM_ST_ZONE_INCONSISTENT)
+		mem->zone = MEM_ZONE_UNKNOWN;
+
 	if (errno) {
 		rc = -errno;
 		goto out_dir;
@@ -1303,3 +1362,15 @@ DAXCTL_EXPORT int daxctl_memory_num_sections(struct daxctl_memory *mem)
 {
 	return daxctl_memory_op(mem, MEM_COUNT);
 }
+
+DAXCTL_EXPORT int daxctl_memory_is_movable(struct daxctl_memory *mem)
+{
+	int rc;
+
+	/* Start a fresh zone scan, clear any previous info */
+	mem->zone = 0;
+	rc = daxctl_memory_op(mem, MEM_GET_ZONE);
+	if (rc < 0)
+		return rc;
+	return (mem->zone == MEM_ZONE_MOVABLE) ? 1 : 0;
+}
diff --git a/daxctl/lib/libdaxctl.sym b/daxctl/lib/libdaxctl.sym
index bc18604..041d9a5 100644
--- a/daxctl/lib/libdaxctl.sym
+++ b/daxctl/lib/libdaxctl.sym
@@ -69,3 +69,8 @@ global:
 	daxctl_memory_is_online;
 	daxctl_memory_num_sections;
 } LIBDAXCTL_5;
+
+LIBDAXCTL_7 {
+global:
+	daxctl_memory_is_movable;
+} LIBDAXCTL_6;
diff --git a/daxctl/libdaxctl.h b/daxctl/libdaxctl.h
index fb6c3b1..8d5f8b7 100644
--- a/daxctl/libdaxctl.h
+++ b/daxctl/libdaxctl.h
@@ -84,6 +84,7 @@ int daxctl_memory_online(struct daxctl_memory *mem);
 int daxctl_memory_offline(struct daxctl_memory *mem);
 int daxctl_memory_is_online(struct daxctl_memory *mem);
 int daxctl_memory_num_sections(struct daxctl_memory *mem);
+int daxctl_memory_is_movable(struct daxctl_memory *mem);
 
 #define daxctl_dev_foreach(region, dev) \
         for (dev = daxctl_dev_get_first(region); \
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 05/10] libdaxctl: allow memblock_in_dev() to return an error
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
                   ` (3 preceding siblings ...)
  2019-10-20  3:23 ` [ndctl PATCH v2 04/10] libdaxctl: add an API to determine if memory is movable Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 06/10] daxctl: show a 'movable' attribute in device listings Vishal Verma
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

With the MEM_FIND_ZONE operation, and the expectation that it will be
called from 'daxctl list' listings, it is possible that memblock_in_dev()
gets called without sufficient privileges. If this happens, currently,
the above simply returns a 'false'. This was acceptable when the only
operations were onlining/offlining (as there would be an actual failure
later). However, it is not acceptable in the MEM_FIND_ZONE case, as it
could yeild a different answer based on the privilege level.

Change memblock_in_dev() to return an 'int' instead of a 'bool' so that
error cases can be distinguished from actual address range test.

Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 03f38f2..65a09c8 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -1211,40 +1211,42 @@ static int memblock_find_zone(struct daxctl_memory *mem, char *memblock,
 	return 0;
 }
 
-static bool memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
+static int memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
 {
 	const char *mem_base = "/sys/devices/system/memory/";
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	unsigned long long memblock_res, dev_start, dev_end;
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	int rc, path_len = mem->buf_len;
 	unsigned long memblock_size;
-	int path_len = mem->buf_len;
 	char buf[SYSFS_ATTR_SIZE];
 	unsigned long phys_index;
 	char *path = mem->mem_buf;
 
 	if (snprintf(path, path_len, "%s/%s/phys_index",
 			mem_base, memblock) < 0)
-		return false;
+		return -ENXIO;
 
-	if (sysfs_read_attr(ctx, path, buf) == 0) {
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc == 0) {
 		phys_index = strtoul(buf, NULL, 16);
 		if (phys_index == 0 || phys_index == ULONG_MAX) {
+			rc = -errno;
 			err(ctx, "%s: %s: Unable to determine phys_index: %s\n",
-				devname, memblock, strerror(errno));
-			return false;
+				devname, memblock, strerror(-rc));
+			return rc;
 		}
 	} else {
 		err(ctx, "%s: %s: Unable to determine phys_index: %s\n",
-			devname, memblock, strerror(errno));
-		return false;
+			devname, memblock, strerror(-rc));
+		return rc;
 	}
 
 	dev_start = daxctl_dev_get_resource(dev);
 	if (!dev_start) {
 		err(ctx, "%s: Unable to determine resource\n", devname);
-		return false;
+		return -EACCES;
 	}
 	dev_end = dev_start + daxctl_dev_get_size(dev);
 
@@ -1252,14 +1254,14 @@ static bool memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
 	if (!memblock_size) {
 		err(ctx, "%s: Unable to determine memory block size\n",
 			devname);
-		return false;
+		return -ENXIO;
 	}
 	memblock_res = phys_index * memblock_size;
 
 	if (memblock_res >= dev_start && memblock_res <= dev_end)
-		return true;
+		return 1;
 
-	return false;
+	return 0;
 }
 
 static int op_for_one_memblock(struct daxctl_memory *mem, char *memblock,
@@ -1317,8 +1319,12 @@ static int daxctl_memory_op(struct daxctl_memory *mem, enum memory_op op)
 	errno = 0;
 	while ((de = readdir(node_dir)) != NULL) {
 		if (strncmp(de->d_name, "memory", 6) == 0) {
-			if (!memblock_in_dev(mem, de->d_name))
+			rc = memblock_in_dev(mem, de->d_name);
+			if (rc < 0)
+				goto out_dir;
+			if (rc == 0) /* memblock not in dev */
 				continue;
+			/* memblock is in dev, perform op */
 			rc = op_for_one_memblock(mem, de->d_name, op,
 					&status_flags);
 			if (rc < 0)
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 06/10] daxctl: show a 'movable' attribute in device listings
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
                   ` (4 preceding siblings ...)
  2019-10-20  3:23 ` [ndctl PATCH v2 05/10] libdaxctl: allow memblock_in_dev() to return an error Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 07/10] daxctl: detect races when onlining memory blocks Vishal Verma
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

For dax devices in 'system-ram' mode, display a 'movable' attribute in
device listings. This helps a user easily determine if memory was not
onlined in the expected way for any reason.

Link: https://github.com/pmem/ndctl/issues/110
Reported-by: Ben Olson <ben.olson@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 util/json.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/util/json.c b/util/json.c
index 5e6a32a..497c52b 100644
--- a/util/json.c
+++ b/util/json.c
@@ -278,7 +278,7 @@ struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev,
 	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct json_object *jdev, *jobj;
-	int node;
+	int node, movable;
 
 	jdev = json_object_new_object();
 	if (!devname || !jdev)
@@ -306,6 +306,18 @@ struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev,
 	if (jobj)
 		json_object_object_add(jdev, "mode", jobj);
 
+	if (mem) {
+		movable = daxctl_memory_is_movable(mem);
+		if (movable == 1)
+			jobj = json_object_new_boolean(true);
+		else if (movable == 0)
+			jobj = json_object_new_boolean(false);
+		else
+			jobj = NULL;
+		if (jobj)
+			json_object_object_add(jdev, "movable", jobj);
+	}
+
 	if (!daxctl_dev_is_enabled(dev)) {
 		jobj = json_object_new_string("disabled");
 		if (jobj)
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 07/10] daxctl: detect races when onlining memory blocks
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
                   ` (5 preceding siblings ...)
  2019-10-20  3:23 ` [ndctl PATCH v2 06/10] daxctl: show a 'movable' attribute in device listings Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 08/10] Documentation: clarify memory movablity for reconfigure-device Vishal Verma
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

Sometimes, system configuration can result in new memory blocks getting
onlined automatically. Often, these auto-onlining mechanisms don't
provide a choice or configurability in the matter of which zone is used
for these new blocks, and they just end up in ZONE_NORMAL.

Usually, for hot-plugged memory, ZONE_NORMAL is undesirable because:
 - An application might want total control over this memory
 - ZONE_NORMAL precludes hot-removal of this memory
 - Having kernel data structures in this memory, especially performance
   sensitive ones, such as page tables, may be undesirable.

Thus report if a race condition is encountered while onlining memory,
and provide the user options to remedy it.

Clarify the default zone expectations, and the race detection behavior
in the daxctl-reconfigure-device man page, and move the relevant section
under the 'Description' header, instead of hidden away under the
'--no-online' option.

Cc: Ben Olson <ben.olson@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/device.c        |  9 +++++++
 daxctl/lib/libdaxctl.c | 59 ++++++++++++++++++++++++++++--------------
 2 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/daxctl/device.c b/daxctl/device.c
index 920efc6..28698bf 100644
--- a/daxctl/device.c
+++ b/daxctl/device.c
@@ -174,6 +174,15 @@ static int dev_online_memory(struct daxctl_dev *dev)
 			devname, strerror(-num_on));
 		return num_on;
 	}
+	if (num_on)
+		fprintf(stderr,
+		    "%s:\n  WARNING: detected a race while onlining memory\n"
+		    "  Some memory may not be in the expected zone. It is\n"
+		    "  recommended to disable any other onlining mechanisms,\n"
+		    "  and retry. If onlining is to be left to other agents,\n"
+		    "  use the --no-online option to suppress this warning\n",
+		    devname);
+
 	if (num_on == num_sections) {
 		fprintf(stderr, "%s: all memory sections (%d) already online\n",
 			devname, num_on);
diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 65a09c8..49986ca 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -1079,10 +1079,10 @@ static int memblock_is_online(struct daxctl_memory *mem, char *memblock)
 	return 0;
 }
 
-static int online_one_memblock(struct daxctl_memory *mem, char *memblock)
+static int online_one_memblock(struct daxctl_memory *mem, char *memblock,
+		int *status)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
-	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
 	const char *mode = "online_movable";
 	int len = mem->buf_len, rc;
@@ -1097,10 +1097,6 @@ static int online_one_memblock(struct daxctl_memory *mem, char *memblock)
 	if (rc < 0)
 		return -ENOMEM;
 
-	/*
-	 * if already online, possibly due to kernel config or a udev rule,
-	 * there is nothing to do and we can skip over the memblock
-	 */
 	rc = memblock_is_online(mem, memblock);
 	if (rc)
 		return rc;
@@ -1108,18 +1104,14 @@ static int online_one_memblock(struct daxctl_memory *mem, char *memblock)
 	rc = sysfs_write_attr_quiet(ctx, path, mode);
 	if (rc) {
 		/*
-		 * While we performed an already-online check above, there
-		 * is still a TOCTOU hole where someone (such as a udev rule)
-		 * may have raced to online the memory. In such a case,
-		 * the sysfs store will fail, however we can check for this
-		 * by simply reading the state again. If it changed to the
-		 * desired state, then we don't have to error out.
+		 * If the block got onlined, potentially by some other agent,
+		 * do nothing for now. There will be a full scan for zone
+		 * correctness later.
 		 */
-		if (memblock_is_online(mem, memblock))
-			return 1;
-		err(ctx, "%s: Failed to online %s: %s\n",
-			devname, path, strerror(-rc));
+		if (memblock_is_online(mem, memblock) == 1)
+			return 0;
 	}
+
 	return rc;
 }
 
@@ -1150,7 +1142,7 @@ static int offline_one_memblock(struct daxctl_memory *mem, char *memblock)
 
 	rc = sysfs_write_attr_quiet(ctx, path, mode);
 	if (rc) {
-		/* Close the TOCTOU hole like in online_one_memblock() above */
+		/* check if something raced us to offline (unlikely) */
 		if (!memblock_is_online(mem, memblock))
 			return 1;
 		err(ctx, "%s: Failed to offline %s: %s\n",
@@ -1274,7 +1266,7 @@ static int op_for_one_memblock(struct daxctl_memory *mem, char *memblock,
 
 	switch (op) {
 	case MEM_SET_ONLINE:
-		return online_one_memblock(mem, memblock);
+		return online_one_memblock(mem, memblock, status);
 	case MEM_SET_OFFLINE:
 		return offline_one_memblock(mem, memblock);
 	case MEM_IS_ONLINE:
@@ -1349,9 +1341,38 @@ out_dir:
 	return rc;
 }
 
+/*
+ * daxctl_memory_online() will online to ZONE_MOVABLE by default
+ */
 DAXCTL_EXPORT int daxctl_memory_online(struct daxctl_memory *mem)
 {
-	return daxctl_memory_op(mem, MEM_SET_ONLINE);
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	int rc;
+
+	rc = daxctl_memory_op(mem, MEM_SET_ONLINE);
+	if (rc)
+		return rc;
+
+	/*
+	 * Detect any potential races when blocks were being brought online by
+	 * checking the zone in which the memory blocks are at this point. If
+	 * any of the blocks are not in ZONE_MOVABLE, emit a warning.
+	 */
+	mem->zone = 0;
+	rc = daxctl_memory_op(mem, MEM_FIND_ZONE);
+	if (rc)
+		return rc;
+	if (mem->zone != MEM_ZONE_MOVABLE) {
+		err(ctx,
+		    "%s:\n  WARNING: detected a race while onlining memory\n"
+		    "  See 'man daxctl-reconfigure-device' for more details\n",
+		    devname);
+		return -EBUSY;
+	}
+
+	return rc;
 }
 
 DAXCTL_EXPORT int daxctl_memory_offline(struct daxctl_memory *mem)
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 08/10] Documentation: clarify memory movablity for reconfigure-device
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
                   ` (6 preceding siblings ...)
  2019-10-20  3:23 ` [ndctl PATCH v2 07/10] daxctl: detect races when onlining memory blocks Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 09/10] libdaxctl: add an API to online memory in a non-movable state Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 10/10] daxctl: add --no-movable option for onlining memory Vishal Verma
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

Move the note about potential races in memory onlining into the
'Description' section to make it more prominent. Reword the note to talk
about the sources of such races, and talk about the new 'race detection'
semantics in daxctl.

Link: https://github.com/pmem/ndctl/issues/110
Reported-by: Ben Olson <ben.olson@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 .../daxctl/daxctl-reconfigure-device.txt      | 24 +++++++++++--------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/Documentation/daxctl/daxctl-reconfigure-device.txt b/Documentation/daxctl/daxctl-reconfigure-device.txt
index 196d692..4663529 100644
--- a/Documentation/daxctl/daxctl-reconfigure-device.txt
+++ b/Documentation/daxctl/daxctl-reconfigure-device.txt
@@ -97,6 +97,20 @@ error reconfiguring devices: Operation not supported
 reconfigured 0 devices
 ----
 
+'daxctl-reconfigure-device' nominally expects that it will online new memory
+blocks as 'movable', so that kernel data doesn't make it into this memory.
+However, there are other potential agents that may be configured to
+automatically online new hot-plugged memory as it appears. Most notably,
+these are the '/sys/devices/system/memory/auto_online_blocks' configuration,
+or system udev rules. If such an agent races to online memory sections, daxctl
+checks if the blocks were onlined as 'movable' memory. If this was not the
+case, and the memory blocks are found to be in a different zone, then a
+warning is displayed. If it is desired that a different agent control the
+onlining of memory blocks, and the associated memory zone, then it is
+recommended to use the --no-online option described below. This will abridge
+the device reconfiguration operation to just hotplugging the memory, and
+refrain from then onlining it.
+
 OPTIONS
 -------
 -r::
@@ -121,16 +135,6 @@ OPTIONS
 	brought online automatically and immediately with the 'online_movable'
 	policy. Use this option to disable the automatic onlining behavior.
 
-	NOTE: While this option prevents daxctl from automatically onlining
-	the memory sections, there may be other agents, notably system udev
-	rules, that online new memory sections as they appear. Coordinating
-	with such rules is out of scope of this utility, and the system
-	administrator is expected to remove them if they are undesirable.
-	If such an agent races to online memory sections, daxctl is prepared
-	to lose the race, and not fail the onlining operation as it only
-	cares that the memory section was onlined, not that it was the one
-	to do so.
-
 -f::
 --force::
 	When converting from "system-ram" mode to "devdax", it is expected
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 09/10] libdaxctl: add an API to online memory in a non-movable state
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
                   ` (7 preceding siblings ...)
  2019-10-20  3:23 ` [ndctl PATCH v2 08/10] Documentation: clarify memory movablity for reconfigure-device Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  2019-10-20  3:23 ` [ndctl PATCH v2 10/10] daxctl: add --no-movable option for onlining memory Vishal Verma
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

Some users may want additional control over the state in which memory
gets onlined - specifically, controlling the 'movable' aspect of the
resulting memory blocks.

Until now, libdaxctl only brought up memory in the 'movable' state. Add
a new interface to online memory in the non-movable state.

Link: https://github.com/pmem/ndctl/issues/110
Cc: Ben Olson <ben.olson@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl-private.h |  6 +++++
 daxctl/lib/libdaxctl.c         | 47 ++++++++++++++++++++++++++++------
 daxctl/lib/libdaxctl.sym       |  1 +
 daxctl/libdaxctl.h             |  1 +
 4 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/daxctl/lib/libdaxctl-private.h b/daxctl/lib/libdaxctl-private.h
index ebaecb8..9f9c70d 100644
--- a/daxctl/lib/libdaxctl-private.h
+++ b/daxctl/lib/libdaxctl-private.h
@@ -42,6 +42,7 @@ static const char *dax_modules[] = {
 enum memory_op {
 	MEM_SET_OFFLINE,
 	MEM_SET_ONLINE,
+	MEM_SET_ONLINE_NO_MOVABLE,
 	MEM_IS_ONLINE,
 	MEM_COUNT,
 	MEM_GET_ZONE,
@@ -65,6 +66,11 @@ static const char *zone_strings[] = {
 	[MEM_ZONE_MOVABLE] = "Movable",
 };
 
+static const char *state_strings[] = {
+	[MEM_ZONE_NORMAL] = "online",
+	[MEM_ZONE_MOVABLE] = "online_movable",
+};
+
 /**
  * struct daxctl_region - container for dax_devices
  */
diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 49986ca..ee4a069 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -1080,11 +1080,10 @@ static int memblock_is_online(struct daxctl_memory *mem, char *memblock)
 }
 
 static int online_one_memblock(struct daxctl_memory *mem, char *memblock,
-		int *status)
+		enum memory_zones zone, int *status)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
-	const char *mode = "online_movable";
 	int len = mem->buf_len, rc;
 	char *path = mem->mem_buf;
 	const char *node_path;
@@ -1101,7 +1100,14 @@ static int online_one_memblock(struct daxctl_memory *mem, char *memblock,
 	if (rc)
 		return rc;
 
-	rc = sysfs_write_attr_quiet(ctx, path, mode);
+	switch (zone) {
+	case MEM_ZONE_MOVABLE:
+	case MEM_ZONE_NORMAL:
+		rc = sysfs_write_attr_quiet(ctx, path, state_strings[zone]);
+		break;
+	default:
+		rc = -EINVAL;
+	}
 	if (rc) {
 		/*
 		 * If the block got onlined, potentially by some other agent,
@@ -1266,7 +1272,11 @@ static int op_for_one_memblock(struct daxctl_memory *mem, char *memblock,
 
 	switch (op) {
 	case MEM_SET_ONLINE:
-		return online_one_memblock(mem, memblock, status);
+		return online_one_memblock(mem, memblock, MEM_ZONE_MOVABLE,
+				status);
+	case MEM_SET_ONLINE_NO_MOVABLE:
+		return online_one_memblock(mem, memblock, MEM_ZONE_NORMAL,
+				status);
 	case MEM_SET_OFFLINE:
 		return offline_one_memblock(mem, memblock);
 	case MEM_IS_ONLINE:
@@ -1344,14 +1354,25 @@ out_dir:
 /*
  * daxctl_memory_online() will online to ZONE_MOVABLE by default
  */
-DAXCTL_EXPORT int daxctl_memory_online(struct daxctl_memory *mem)
+static int daxctl_memory_online_with_zone(struct daxctl_memory *mem,
+		enum memory_zones zone)
 {
 	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
 	int rc;
 
-	rc = daxctl_memory_op(mem, MEM_SET_ONLINE);
+	switch (zone) {
+	case MEM_ZONE_MOVABLE:
+		rc = daxctl_memory_op(mem, MEM_SET_ONLINE);
+		break;
+	case MEM_ZONE_NORMAL:
+		rc = daxctl_memory_op(mem, MEM_SET_ONLINE_NO_MOVABLE);
+		break;
+	default:
+		err(ctx, "%s: BUG: invalid zone for onlining\n", devname);
+		rc = -EINVAL;
+	}
 	if (rc)
 		return rc;
 
@@ -1361,10 +1382,10 @@ DAXCTL_EXPORT int daxctl_memory_online(struct daxctl_memory *mem)
 	 * any of the blocks are not in ZONE_MOVABLE, emit a warning.
 	 */
 	mem->zone = 0;
-	rc = daxctl_memory_op(mem, MEM_FIND_ZONE);
+	rc = daxctl_memory_op(mem, MEM_GET_ZONE);
 	if (rc)
 		return rc;
-	if (mem->zone != MEM_ZONE_MOVABLE) {
+	if (mem->zone != zone) {
 		err(ctx,
 		    "%s:\n  WARNING: detected a race while onlining memory\n"
 		    "  See 'man daxctl-reconfigure-device' for more details\n",
@@ -1375,6 +1396,16 @@ DAXCTL_EXPORT int daxctl_memory_online(struct daxctl_memory *mem)
 	return rc;
 }
 
+DAXCTL_EXPORT int daxctl_memory_online(struct daxctl_memory *mem)
+{
+	return daxctl_memory_online_with_zone(mem, MEM_ZONE_MOVABLE);
+}
+
+DAXCTL_EXPORT int daxctl_memory_online_no_movable(struct daxctl_memory *mem)
+{
+	return daxctl_memory_online_with_zone(mem, MEM_ZONE_NORMAL);
+}
+
 DAXCTL_EXPORT int daxctl_memory_offline(struct daxctl_memory *mem)
 {
 	return daxctl_memory_op(mem, MEM_SET_OFFLINE);
diff --git a/daxctl/lib/libdaxctl.sym b/daxctl/lib/libdaxctl.sym
index 041d9a5..87d0236 100644
--- a/daxctl/lib/libdaxctl.sym
+++ b/daxctl/lib/libdaxctl.sym
@@ -73,4 +73,5 @@ global:
 LIBDAXCTL_7 {
 global:
 	daxctl_memory_is_movable;
+	daxctl_memory_online_no_movable;
 } LIBDAXCTL_6;
diff --git a/daxctl/libdaxctl.h b/daxctl/libdaxctl.h
index 8d5f8b7..6c545e1 100644
--- a/daxctl/libdaxctl.h
+++ b/daxctl/libdaxctl.h
@@ -85,6 +85,7 @@ int daxctl_memory_offline(struct daxctl_memory *mem);
 int daxctl_memory_is_online(struct daxctl_memory *mem);
 int daxctl_memory_num_sections(struct daxctl_memory *mem);
 int daxctl_memory_is_movable(struct daxctl_memory *mem);
+int daxctl_memory_online_no_movable(struct daxctl_memory *mem);
 
 #define daxctl_dev_foreach(region, dev) \
         for (dev = daxctl_dev_get_first(region); \
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [ndctl PATCH v2 10/10] daxctl: add --no-movable option for onlining memory
  2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
                   ` (8 preceding siblings ...)
  2019-10-20  3:23 ` [ndctl PATCH v2 09/10] libdaxctl: add an API to online memory in a non-movable state Vishal Verma
@ 2019-10-20  3:23 ` Vishal Verma
  9 siblings, 0 replies; 11+ messages in thread
From: Vishal Verma @ 2019-10-20  3:23 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Ben Olson, Dave Hansen

Add a new '--no-movable' option to daxctl commands that may online
memory - i.e. daxctl-reconfigure-device and daxctl-online-memory.

Users may wish additional control over the state of the newly added
memory. Retain the daxctl default for onlining memory as 'movable', but
allow it to be overridden using the above option.

Link: https://github.com/pmem/ndctl/issues/110
Cc: Ben Olson <ben.olson@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 Documentation/daxctl/daxctl-online-memory.txt |  2 ++
 .../daxctl/daxctl-reconfigure-device.txt      |  2 ++
 Documentation/daxctl/movable-options.txt      |  9 +++++
 daxctl/device.c                               | 34 ++++++++++++++++---
 4 files changed, 43 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/daxctl/movable-options.txt

diff --git a/Documentation/daxctl/daxctl-online-memory.txt b/Documentation/daxctl/daxctl-online-memory.txt
index 5ac1cbf..08b45cc 100644
--- a/Documentation/daxctl/daxctl-online-memory.txt
+++ b/Documentation/daxctl/daxctl-online-memory.txt
@@ -62,6 +62,8 @@ OPTIONS
 	more /dev/daxX.Y devices, where X is the region id and Y is the device
 	instance id.
 
+include::movable-options.txt[]
+
 -u::
 --human::
 	By default the command will output machine-friendly raw-integer
diff --git a/Documentation/daxctl/daxctl-reconfigure-device.txt b/Documentation/daxctl/daxctl-reconfigure-device.txt
index 4663529..cb28fed 100644
--- a/Documentation/daxctl/daxctl-reconfigure-device.txt
+++ b/Documentation/daxctl/daxctl-reconfigure-device.txt
@@ -135,6 +135,8 @@ OPTIONS
 	brought online automatically and immediately with the 'online_movable'
 	policy. Use this option to disable the automatic onlining behavior.
 
+include::movable-options.txt[]
+
 -f::
 --force::
 	When converting from "system-ram" mode to "devdax", it is expected
diff --git a/Documentation/daxctl/movable-options.txt b/Documentation/daxctl/movable-options.txt
new file mode 100644
index 0000000..cecd401
--- /dev/null
+++ b/Documentation/daxctl/movable-options.txt
@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
+
+--no-movable::
+	'--movable' is the default. This can be overridden to online new
+	memory such that is is not 'movable'. This allows any allocation
+	to potentially be served from this memory. This may preclude subsequent
+	removal. With the '--movable' behavior (which is default), kernel
+	allocations will not consider this memory, and it will be reserved
+	for application use.
diff --git a/daxctl/device.c b/daxctl/device.c
index 28698bf..72e506e 100644
--- a/daxctl/device.c
+++ b/daxctl/device.c
@@ -21,6 +21,7 @@ static struct {
 	const char *mode;
 	int region_id;
 	bool no_online;
+	bool no_movable;
 	bool force;
 	bool human;
 	bool verbose;
@@ -37,6 +38,12 @@ enum dev_mode {
 static enum dev_mode reconfig_mode = DAXCTL_DEV_MODE_UNKNOWN;
 static unsigned long flags;
 
+enum memory_zone {
+	MEM_ZONE_MOVABLE,
+	MEM_ZONE_NORMAL,
+};
+static enum memory_zone mem_zone = MEM_ZONE_MOVABLE;
+
 enum device_action {
 	ACTION_RECONFIG,
 	ACTION_ONLINE,
@@ -55,13 +62,24 @@ OPT_BOOLEAN('N', "no-online", &param.no_online, \
 OPT_BOOLEAN('f', "force", &param.force, \
 		"attempt to offline memory sections before reconfiguration")
 
+#define ZONE_OPTIONS() \
+OPT_BOOLEAN('\0', "no-movable", &param.no_movable, \
+		"online memory in ZONE_NORMAL")
+
 static const struct option reconfig_options[] = {
 	BASE_OPTIONS(),
 	RECONFIG_OPTIONS(),
+	ZONE_OPTIONS(),
+	OPT_END(),
+};
+
+static const struct option online_options[] = {
+	BASE_OPTIONS(),
+	ZONE_OPTIONS(),
 	OPT_END(),
 };
 
-static const struct option memory_options[] = {
+static const struct option offline_options[] = {
 	BASE_OPTIONS(),
 	OPT_END(),
 };
@@ -126,6 +144,8 @@ static const char *parse_device_options(int argc, const char **argv,
 		}
 		if (strcmp(param.mode, "system-ram") == 0) {
 			reconfig_mode = DAXCTL_DEV_MODE_RAM;
+			if (param.no_movable)
+				mem_zone = MEM_ZONE_NORMAL;
 		} else if (strcmp(param.mode, "devdax") == 0) {
 			reconfig_mode = DAXCTL_DEV_MODE_DEVDAX;
 			if (param.no_online) {
@@ -136,6 +156,9 @@ static const char *parse_device_options(int argc, const char **argv,
 		}
 		break;
 	case ACTION_ONLINE:
+		if (param.no_movable)
+			mem_zone = MEM_ZONE_NORMAL;
+		/* fall through */
 	case ACTION_OFFLINE:
 		/* nothing special */
 		break;
@@ -194,7 +217,10 @@ static int dev_online_memory(struct daxctl_dev *dev)
 			num_on == 1 ? "" : "s");
 
 	/* online the remaining sections */
-	rc = daxctl_memory_online(mem);
+	if (param.no_movable)
+		rc = daxctl_memory_online_no_movable(mem);
+	else
+		rc = daxctl_memory_online(mem);
 	if (rc < 0) {
 		fprintf(stderr, "%s: failed to online memory: %s\n",
 			devname, strerror(-rc));
@@ -521,7 +547,7 @@ int cmd_online_memory(int argc, const char **argv, struct daxctl_ctx *ctx)
 {
 	char *usage = "daxctl online-memory <device> [<options>]";
 	const char *device = parse_device_options(argc, argv, ACTION_ONLINE,
-			memory_options, usage, ctx);
+			online_options, usage, ctx);
 	int processed, rc;
 
 	rc = do_xaction_device(device, ACTION_ONLINE, ctx, &processed);
@@ -538,7 +564,7 @@ int cmd_offline_memory(int argc, const char **argv, struct daxctl_ctx *ctx)
 {
 	char *usage = "daxctl offline-memory <device> [<options>]";
 	const char *device = parse_device_options(argc, argv, ACTION_OFFLINE,
-			memory_options, usage, ctx);
+			offline_options, usage, ctx);
 	int processed, rc;
 
 	rc = do_xaction_device(device, ACTION_OFFLINE, ctx, &processed);
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-20  3:23 [ndctl PATCH v2 00/10] fixes and movability for system-ram mode Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 01/10] libdaxctl: refactor path construction in op_for_one_memblock() Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 02/10] libdaxctl: refactor memblock_is_online() checks Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 03/10] daxctl/device.c: fix json output omission for reconfigure-device Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 04/10] libdaxctl: add an API to determine if memory is movable Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 05/10] libdaxctl: allow memblock_in_dev() to return an error Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 06/10] daxctl: show a 'movable' attribute in device listings Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 07/10] daxctl: detect races when onlining memory blocks Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 08/10] Documentation: clarify memory movablity for reconfigure-device Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 09/10] libdaxctl: add an API to online memory in a non-movable state Vishal Verma
2019-10-20  3:23 ` [ndctl PATCH v2 10/10] daxctl: add --no-movable option for onlining memory Vishal Verma

Linux-NVDIMM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvdimm/0 linux-nvdimm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvdimm linux-nvdimm/ https://lore.kernel.org/linux-nvdimm \
		linux-nvdimm@lists.01.org
	public-inbox-index linux-nvdimm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.01.lists.linux-nvdimm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git