nvdimm.lists.linux.dev archive mirror
* [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command
@ 2019-08-01  0:29 Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 01/13] libdaxctl: add interfaces to get ctx and check device state Vishal Verma
                   ` (12 more replies)
  0 siblings, 13 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Changes in v9:
- Move the device model checking into the library. This way, daxctl-list
  can correctly determine 'state' which only applies to the dax-bus
  model.

Changes in v8:
- rename the --attempt-offline option to --force (Dan)
- clarify the messages when device is already in the requested state (Dan)
- s/unable/failed/ in device.c error messages (Dan)
- daxctl_memory_{on,off}line() instead of daxctl_memory_set_{on,off}line (Dan)
- Add an interface to get a count of the memory sections associated with a
  device (Dan)
- As a result, refactor the readdir loop into a common memory_op function that
  can set the state, get the online state, and get a count of all blocks.
- Update the onlining/offlining routines used in both the reconfigure-device
  and {on,off}line-memory commands to use the new daxctl_memory_num_sections()
  interface to validate the number of sections for which we changed the state.
- Add some small clarifications in the daxctl-reconfigure-device man page (Dan)
- In device.c add a verify_dax_bus_model() helper to check for the dax-bus
  subsystem (Dan).

Changes in v7:
- Fix a couple of checkpatch type errors in the new lines added in v6 (Dan).
- Get rid of daxctl_dev_get_mode. daxctl_dev_get_memory is sufficient to
  both check the mode and allocate the memory related structures on its
  first call. (Dan)
- Due to the above, daxctl_dev_mode is now private to libdaxctl, and not
  part of the API exported through libdaxctl.h
- Add a large enough buffer at init time to construct dynamic paths, and avoid
  asprintf() type allocations for memory blocks at runtime (Dan).

Changes in v6:
- For memory block online/offline operations, the kernel responds with
  an EINVAL for both 'real' errors, and if the memory was already in the
  requested state. Since there is a TOCTOU hole between checking the
  state and storing it, just perform a second check if the store results
  in an error. If the check shows the state to be the same as the one
  we're attempting, it means that another agent (usually udev) won the
  race, but we don't care so long as the state change happened, so don't
  report an error. (Fan Du)
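
The v6 re-check can be sketched roughly as below. This is only an
illustration of the idea, not the code in the series: 'struct fake_block'
and fake_store() are hypothetical in-memory stand-ins for a memory block's
sysfs 'state' attribute, and the state strings are simplified.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one memory block's sysfs 'state' file. */
struct fake_block {
	char state[16];	/* what a read of 'state' would return */
	int store_rc;	/* what a write to 'state' returns: 0 or -errno */
};

/* Simulated store: only updates the state when the "kernel" accepts it. */
static int fake_store(struct fake_block *b, const char *want)
{
	if (b->store_rc == 0)
		snprintf(b->state, sizeof(b->state), "%s", want);
	return b->store_rc;
}

/*
 * Try to move a block to 'want'. If the store fails, look at the state
 * again: if it already matches, another agent won the race and the
 * failure is not reported as an error.
 */
static int set_block_state(struct fake_block *b, const char *want)
{
	int rc;

	assert(b && want);

	rc = fake_store(b, want);
	if (rc == 0)
		return 0;

	/* second check, done only after a failed store */
	if (strcmp(b->state, want) == 0)
		return 0;

	return rc;	/* a 'real' error */
}
```

With a block that is already "online" and a store that returns -EINVAL,
set_block_state() returns 0; with a block that is still "offline" it
passes the -EINVAL through.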

Changes in v5:
 - device.c: correctly set loglevel for daxctl_ctx for --verbose
 - drop the subsys caching, its complexity started to exceed its
   benefit. dax-class device models will simply error out during
   reconfigure. (Dan)
 - Add a note to the man page for the above.
 - Clarify the onlining policy (online_movable) in the man page
 - rename "numa_node" to "target_node" in device listings (Dan)
 - When printing a device 'mode', assume devdax if !system-ram,
   avoiding a "mode: unknown" situation which can be confusing. (Dan)
 - Add a "state: disabled" attribute to the device listing if a driver
   is not bound. This is more apt than the previous "mode: unknown"
   listing.
 - add an api to get 'dev->resource' parsing /proc/iomem as a
   fallback for when the kernel doesn't provide the attribute (Dan)
 - convert 'node_*' apis to 'memory_*' apis that act on a new daxctl_memory
   object (Dan)
 - online only memory sections belonging to the device in question by
   cross referencing block indices with the dax device resource (Dan)
 - Refuse to reconfigure a device that is already in the target mode.
   Until now, reconfiguring a system-ram device back to system-ram would
   result in a 'online memory may not be hot-removed' kernel warning.
 - If the device was already in the system-ram mode, skip
   disabling/enabling, but still try to online the memory unless the
   --no-online option is in effect.
 - In daxctl_unbind, also 'remove_id' to prevent devices automatically
   binding to the kmem driver on a disable + re-enable, which can be
   surprising (Dan).
 - Rewrite the top half of daxctl/device.c to borrow elements from
   ndctl/namespace.c so that it can support growing additional commands
   that operate on devices (online-memory and offline-memory)
 - Refactor the bottom half of daxctl/device.c so we only do the
   disabling/offlining steps if the device was enabled.
 - Add new commands to online and offline memory sections
   associated with a given dax device (Dan)
 - Add a new test - daxctl-device.sh - to test daxctl reconfigure-device,
   online-memory, and offline-memory commands.
 - Add an example in documentation demonstrating how to use numactl
   to bind a process to a node surfaced from a dax device (Andy Rudoff)
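
The v5 item about onlining only the memory sections that belong to the
device comes down to a range check. A minimal sketch follows, assuming
that a memory block's start address is its phys_index multiplied by the
system's memory block size; block_in_device() is a hypothetical helper
name, not an interface from this series.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if the memory block with index 'phys_index' starts inside
 * the device range [resource, resource + size).
 * Assumption: block start address = phys_index * block_size.
 */
static bool block_in_device(unsigned long long phys_index,
		unsigned long long block_size,
		unsigned long long resource, unsigned long long size)
{
	unsigned long long start;

	assert(block_size != 0);

	start = phys_index * block_size;
	/* written as a subtraction so resource + size cannot overflow */
	return start >= resource && start - resource < size;
}
```

For example, with 128 MiB blocks and a 2 GiB device whose resource starts
at 0x140000000, block indices 40 through 55 fall inside the device and
indices 39 and 56 do not.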

Changes in v4:
 - Don't fail add_dax_dev for kmod failures. Instead fail only when the kmod
   list is actually used, i.e. during daxctl-reconfigure-device

Changes in v3:
 - In daxctl_dev_get_mode(), remove the subsystem warning, detect dax-class
   and simply make it return devdax

Changes in v2:
 - Add examples to the documentation page (Dave Hansen)
 - Clarify documentation regarding the conversion from system-ram to devdax
 - Remove any references to a persistent config from the documentation -
   those can be added when the feature is added.
 - device.c: validate option compatibility
 - daxctl-list: display numa_node for device listings
 - daxctl-list: display mode for device listings
 - make the options more consistent by adding a '-O' short option
   for --attempt-offline

Add a new daxctl-reconfigure-device command that lets us reconfigure DAX
devices back and forth between 'system-ram' and 'device-dax' modes. It
also includes facilities to online any newly hot-plugged memory
(default), and attempt to offline memory before converting away from the
system-ram mode (not default, requires the --force option).

Currently missing from this series is a way to persistently store which
devices have been 'marked' for use as system-ram. This depends on a
config system overhaul in ndctl, and patches for those will follow
separately and are independent of this work.

Example invocations:

1. Reconfigure dax0.0 to system-ram mode, don’t online the memory
    # daxctl reconfigure-device --mode=system-ram --no-online dax0.0
    [
      {
        "chardev":"dax0.0",
        "size":16777216000,
        "target_node":2,
        "mode":"system-ram"
      }
    ]

2. Reconfigure dax0.0 to devdax mode, attempt to offline the memory
    # daxctl reconfigure-device --human --mode=devdax --force dax0.0
    {
      "chardev":"dax0.0",
      "size":"15.63 GiB (16.78 GB)",
      "target_node":2,
      "mode":"devdax"
    }

3. Reconfigure all dax devices on region0 to system-ram mode
    # daxctl reconfigure-device --mode=system-ram --region=0 all
    [
      {
        "chardev":"dax0.0",
        "size":16777216000,
        "target_node":2,
        "mode":"system-ram"
      },
      {
        "chardev":"dax0.1",
        "size":16777216000,
        "target_node":3,
        "mode":"system-ram"
      }
    ]

These patches can also be found in the 'kmem-pending' branch on github:
https://github.com/pmem/ndctl/tree/kmem-pending

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>

Vishal Verma (13):
  libdaxctl: add interfaces to get ctx and check device state
  libdaxctl: add interfaces to enable/disable devices
  libdaxctl: add an interface to retrieve the device resource
  libdaxctl: add a 'daxctl_memory' object for memory based operations
  daxctl/list: add target_node for device listings
  daxctl/list: display the mode for a dax device
  daxctl: add a new reconfigure-device command
  Documentation/daxctl: add a man page for daxctl-reconfigure-device
  daxctl: add commands to online and offline memory
  Documentation: Add man pages for daxctl-{on,off}line-memory
  contrib/ndctl: fix region-id completions for daxctl
  contrib/ndctl: add bash-completion for the new daxctl commands
  test: Add a unit test for daxctl-reconfigure-device and friends

 Documentation/daxctl/Makefile.am              |   5 +-
 .../daxctl/daxctl-offline-memory.txt          |  72 ++
 Documentation/daxctl/daxctl-online-memory.txt |  80 ++
 .../daxctl/daxctl-reconfigure-device.txt      | 157 ++++
 Makefile.am                                   |   3 +-
 contrib/ndctl                                 |  38 +-
 daxctl/Makefile.am                            |   2 +
 daxctl/builtin.h                              |   3 +
 daxctl/daxctl.c                               |   3 +
 daxctl/device.c                               | 543 +++++++++++++
 daxctl/lib/Makefile.am                        |   5 +-
 daxctl/lib/libdaxctl-private.h                |  40 +
 daxctl/lib/libdaxctl.c                        | 712 ++++++++++++++++++
 daxctl/lib/libdaxctl.sym                      |  19 +
 daxctl/libdaxctl.h                            |  17 +
 test/Makefile.am                              |   3 +-
 test/common                                   |  19 +-
 test/daxctl-devices.sh                        |  81 ++
 util/iomem.c                                  |  37 +
 util/iomem.h                                  |  12 +
 util/json.c                                   |  22 +
 21 files changed, 1859 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/daxctl/daxctl-offline-memory.txt
 create mode 100644 Documentation/daxctl/daxctl-online-memory.txt
 create mode 100644 Documentation/daxctl/daxctl-reconfigure-device.txt
 create mode 100644 daxctl/device.c
 create mode 100755 test/daxctl-devices.sh
 create mode 100644 util/iomem.c
 create mode 100644 util/iomem.h

-- 
2.20.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


* [ndctl PATCH v9 01/13] libdaxctl: add interfaces to get ctx and check device state
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 02/13] libdaxctl: add interfaces to enable/disable devices Vishal Verma
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

In preparation for libdaxctl and daxctl to grow operational modes for
DAX devices, add the following supporting APIs:

  daxctl_dev_get_ctx
  daxctl_dev_is_enabled

Also add a helper that verifies the device model, and use it in the
_is_enabled API, since enable/disable only makes sense for the dax-bus
model.
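
The device model check amounts to resolving a 'subsystem' symlink and
comparing the result with an expected path. A stand-alone sketch of that
comparison is below; subsystem_is() is a hypothetical name, and the
caller is assumed to have built the link path already.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Resolve 'subsys_link' (for example a .../subsystem symlink in sysfs)
 * and report whether it points at 'expected'. Any resolution failure is
 * treated as "not a match".
 */
static bool subsystem_is(const char *subsys_link, const char *expected)
{
	char *resolved;
	bool match;

	assert(subsys_link && expected);

	/* with a NULL buffer, realpath() returns a malloc()ed string */
	resolved = realpath(subsys_link, NULL);
	if (!resolved)
		return false;

	match = strcmp(resolved, expected) == 0;
	free(resolved);
	return match;
}
```

A caller would pass the /sys/dev/char/<major>:<minor>/subsystem path and
"/sys/bus/dax", treating a false return as the dax-class model or as a
lookup failure.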

Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl.c   | 70 ++++++++++++++++++++++++++++++++++++++++
 daxctl/lib/libdaxctl.sym |  6 ++++
 daxctl/libdaxctl.h       |  2 ++
 3 files changed, 78 insertions(+)

diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index c2e3a52..916a49e 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -306,6 +306,43 @@ DAXCTL_EXPORT struct daxctl_region *daxctl_new_region(struct daxctl_ctx *ctx,
 	return region;
 }
 
+static bool device_model_is_dax_bus(struct daxctl_dev *dev)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	char *path = dev->dev_buf, *resolved;
+	size_t len = dev->buf_len;
+	struct stat sb;
+
+	if (snprintf(path, len, "/dev/%s", devname) < 0)
+		return false;
+
+	if (lstat(path, &sb) < 0) {
+		err(ctx, "%s: stat for %s failed: %s\n",
+			devname, path, strerror(errno));
+		return false;
+	}
+
+	if (snprintf(path, len, "/sys/dev/char/%d:%d/subsystem",
+			major(sb.st_rdev), minor(sb.st_rdev)) < 0)
+		return false;
+
+	resolved = realpath(path, NULL);
+	if (!resolved) {
+		err(ctx, "%s: unable to determine subsys: %s\n",
+			devname, strerror(errno));
+		return false;
+	}
+
+	if (strcmp(resolved, "/sys/bus/dax") == 0) {
+		free(resolved);
+		return true;
+	}
+
+	free(resolved);
+	return false;
+}
+
 static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 {
 	const char *devname = devpath_to_devname(daxdev_base);
@@ -559,6 +596,39 @@ static void dax_regions_init(struct daxctl_ctx *ctx)
 	}
 }
 
+static int is_enabled(const char *drvpath)
+{
+	struct stat st;
+
+	if (lstat(drvpath, &st) < 0 || !S_ISLNK(st.st_mode))
+		return 0;
+	else
+		return 1;
+}
+
+DAXCTL_EXPORT int daxctl_dev_is_enabled(struct daxctl_dev *dev)
+{
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	char *path = dev->dev_buf;
+	int len = dev->buf_len;
+
+	if (!device_model_is_dax_bus(dev))
+		return 1;
+
+	if (snprintf(path, len, "%s/driver", dev->dev_path) >= len) {
+		err(ctx, "%s: buffer too small!\n",
+				daxctl_dev_get_devname(dev));
+		return 0;
+	}
+
+	return is_enabled(path);
+}
+
+DAXCTL_EXPORT struct daxctl_ctx *daxctl_dev_get_ctx(struct daxctl_dev *dev)
+{
+	return dev->region->ctx;
+}
+
 DAXCTL_EXPORT struct daxctl_dev *daxctl_dev_get_first(struct daxctl_region *region)
 {
 	dax_devices_init(region);
diff --git a/daxctl/lib/libdaxctl.sym b/daxctl/lib/libdaxctl.sym
index 84d3a69..c4af9a7 100644
--- a/daxctl/lib/libdaxctl.sym
+++ b/daxctl/lib/libdaxctl.sym
@@ -50,3 +50,9 @@ LIBDAXCTL_5 {
 global:
 	daxctl_region_get_path;
 } LIBDAXCTL_4;
+
+LIBDAXCTL_6 {
+global:
+	daxctl_dev_get_ctx;
+	daxctl_dev_is_enabled;
+} LIBDAXCTL_5;
diff --git a/daxctl/libdaxctl.h b/daxctl/libdaxctl.h
index 1d13ea2..e20ccb4 100644
--- a/daxctl/libdaxctl.h
+++ b/daxctl/libdaxctl.h
@@ -67,6 +67,8 @@ const char *daxctl_dev_get_devname(struct daxctl_dev *dev);
 int daxctl_dev_get_major(struct daxctl_dev *dev);
 int daxctl_dev_get_minor(struct daxctl_dev *dev);
 unsigned long long daxctl_dev_get_size(struct daxctl_dev *dev);
+struct daxctl_ctx *daxctl_dev_get_ctx(struct daxctl_dev *dev);
+int daxctl_dev_is_enabled(struct daxctl_dev *dev);
 
 #define daxctl_dev_foreach(region, dev) \
         for (dev = daxctl_dev_get_first(region); \
-- 
2.20.1


* [ndctl PATCH v9 02/13] libdaxctl: add interfaces to enable/disable devices
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 01/13] libdaxctl: add interfaces to get ctx and check device state Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 03/13] libdaxctl: add an interface to retrieve the device resource Vishal Verma
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Add new libdaxctl interfaces to disable a device_dax based device, and
to enable it into the given mode. The modes available are 'devdax',
and 'system-ram', where devdax is the normal device DAX mode used
via a character device, and 'system-ram' uses the kernel's 'kmem'
facility to hotplug the device making it usable as normal memory.

This adds the following new interfaces:

  daxctl_dev_disable;
  daxctl_dev_enable_devdax;
  daxctl_dev_enable_ram;
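
The mode to module mapping used by the enable path can be pictured as a
small lookup table. The sketch below is illustrative only: the enum and
function names are hypothetical, and unlike a bare array index it checks
the mode before it touches the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names mirroring the two modes described above. */
enum dev_mode { MODE_DEVDAX = 0, MODE_RAM, MODE_END };

static const char *const mode_modules[] = {
	[MODE_DEVDAX] = "device_dax",
	[MODE_RAM] = "kmem",
};

static_assert(sizeof(mode_modules) / sizeof(mode_modules[0]) == MODE_END,
		"one module name per mode");

/* Return the kernel module name for 'mode', or NULL if it is invalid. */
static const char *mode_to_module(int mode)
{
	if (mode < 0 || mode >= MODE_END)
		return NULL;
	return mode_modules[mode];
}

/* Reverse lookup: return the mode for a module name, or -1 if unknown. */
static int module_to_mode(const char *name)
{
	int mode;

	if (!name)
		return -1;
	for (mode = 0; mode < MODE_END; mode++)
		if (strcmp(mode_modules[mode], name) == 0)
			return mode;
	return -1;
}
```

Validating the mode first keeps an out-of-range value from reading past
the end of the table.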

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/Makefile.am         |   3 +-
 daxctl/lib/libdaxctl-private.h |  21 +++
 daxctl/lib/libdaxctl.c         | 244 +++++++++++++++++++++++++++++++++
 daxctl/lib/libdaxctl.sym       |   3 +
 daxctl/libdaxctl.h             |   3 +
 5 files changed, 273 insertions(+), 1 deletion(-)

diff --git a/daxctl/lib/Makefile.am b/daxctl/lib/Makefile.am
index d3d4852..9f0e444 100644
--- a/daxctl/lib/Makefile.am
+++ b/daxctl/lib/Makefile.am
@@ -16,7 +16,8 @@ libdaxctl_la_SOURCES =\
 	libdaxctl.c
 
 libdaxctl_la_LIBADD =\
-	$(UUID_LIBS)
+	$(UUID_LIBS) \
+	$(KMOD_LIBS)
 
 daxctl_modprobe_data_DATA = daxctl.conf
 
diff --git a/daxctl/lib/libdaxctl-private.h b/daxctl/lib/libdaxctl-private.h
index 4a462e7..120137f 100644
--- a/daxctl/lib/libdaxctl-private.h
+++ b/daxctl/lib/libdaxctl-private.h
@@ -13,6 +13,8 @@
 #ifndef _LIBDAXCTL_PRIVATE_H_
 #define _LIBDAXCTL_PRIVATE_H_
 
+#include <libkmod.h>
+
 #define DAXCTL_EXPORT __attribute__ ((visibility("default")))
 
 enum dax_subsystem {
@@ -26,6 +28,17 @@ static const char *dax_subsystems[] = {
 	[DAX_BUS] = "/sys/bus/dax/devices",
 };
 
+enum daxctl_dev_mode {
+	DAXCTL_DEV_MODE_DEVDAX = 0,
+	DAXCTL_DEV_MODE_RAM,
+	DAXCTL_DEV_MODE_END,
+};
+
+static const char *dax_modules[] = {
+	[DAXCTL_DEV_MODE_DEVDAX] = "device_dax",
+	[DAXCTL_DEV_MODE_RAM] = "kmem",
+};
+
 /**
  * struct daxctl_region - container for dax_devices
  */
@@ -53,6 +66,14 @@ struct daxctl_dev {
 	char *dev_path;
 	struct list_node list;
 	unsigned long long size;
+	struct kmod_module *module;
+	struct kmod_list *kmod_list;
 	struct daxctl_region *region;
 };
+
+static inline int check_kmod(struct kmod_ctx *kmod_ctx)
+{
+	return kmod_ctx ? 0 : -ENXIO;
+}
+
 #endif /* _LIBDAXCTL_PRIVATE_H_ */
diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 916a49e..caf661e 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -46,6 +46,7 @@ struct daxctl_ctx {
 	void *userdata;
 	int regions_init;
 	struct list_head regions;
+	struct kmod_ctx *kmod_ctx;
 };
 
 /**
@@ -84,20 +85,32 @@ DAXCTL_EXPORT void daxctl_set_userdata(struct daxctl_ctx *ctx, void *userdata)
  */
 DAXCTL_EXPORT int daxctl_new(struct daxctl_ctx **ctx)
 {
+	struct kmod_ctx *kmod_ctx;
 	struct daxctl_ctx *c;
+	int rc = 0;
 
 	c = calloc(1, sizeof(struct daxctl_ctx));
 	if (!c)
 		return -ENOMEM;
 
+	kmod_ctx = kmod_new(NULL, NULL);
+	if (check_kmod(kmod_ctx) != 0) {
+		rc = -ENXIO;
+		goto out;
+	}
+
 	c->refcount = 1;
 	log_init(&c->ctx, "libdaxctl", "DAXCTL_LOG");
 	info(c, "ctx %p created\n", c);
 	dbg(c, "log_priority=%d\n", c->ctx.log_priority);
 	*ctx = c;
 	list_head_init(&c->regions);
+	c->kmod_ctx = kmod_ctx;
 
 	return 0;
+out:
+	free(c);
+	return rc;
 }
 
 /**
@@ -132,6 +145,7 @@ DAXCTL_EXPORT void daxctl_unref(struct daxctl_ctx *ctx)
 	list_for_each_safe(&ctx->regions, region, _r, list)
 		free_region(region, &ctx->regions);
 
+	kmod_unref(ctx->kmod_ctx);
 	info(ctx, "context %p released\n", ctx);
 	free(ctx);
 }
@@ -189,6 +203,7 @@ static void free_dev(struct daxctl_dev *dev, struct list_head *head)
 {
 	if (head)
 		list_del_from(head, &dev->list);
+	kmod_module_unref_list(dev->kmod_list);
 	free(dev->dev_buf);
 	free(dev->dev_path);
 	free(dev);
@@ -343,6 +358,27 @@ static bool device_model_is_dax_bus(struct daxctl_dev *dev)
 	return false;
 }
 
+static struct kmod_list *to_module_list(struct daxctl_ctx *ctx,
+		const char *alias)
+{
+	struct kmod_list *list = NULL;
+	int rc;
+
+	if (!ctx->kmod_ctx || !alias)
+		return NULL;
+	if (alias[0] == 0)
+		return NULL;
+
+	rc = kmod_module_new_from_lookup(ctx->kmod_ctx, alias, &list);
+	if (rc < 0 || !list) {
+		dbg(ctx, "failed to find modules for alias: %s %d list: %s\n",
+				alias, rc, list ? "populated" : "empty");
+		return NULL;
+	}
+
+	return list;
+}
+
 static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 {
 	const char *devname = devpath_to_devname(daxdev_base);
@@ -352,6 +388,7 @@ static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 	struct daxctl_dev *dev, *dev_dup;
 	char buf[SYSFS_ATTR_SIZE];
 	struct stat st;
+	int rc;
 
 	if (!path)
 		return NULL;
@@ -383,6 +420,14 @@ static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 		goto err_read;
 	dev->buf_len = strlen(daxdev_base) + 50;
 
+	sprintf(path, "%s/modalias", daxdev_base);
+	rc = sysfs_read_attr(ctx, path, buf);
+	/* older kernels may lack the modalias attribute */
+	if (rc < 0 && rc != -ENOENT)
+		goto err_read;
+	if (rc == 0)
+		dev->kmod_list = to_module_list(ctx, buf);
+
 	daxctl_dev_foreach(region, dev_dup)
 		if (dev_dup->id == dev->id) {
 			free_dev(dev, NULL);
@@ -606,6 +651,92 @@ static int is_enabled(const char *drvpath)
 		return 1;
 }
 
+static int daxctl_bind(struct daxctl_ctx *ctx, const char *devname,
+		const char *mod_name)
+{
+	DIR *dir;
+	int rc = 0;
+	char path[200];
+	struct dirent *de;
+	const int len = sizeof(path);
+
+	if (!devname) {
+		err(ctx, "missing devname\n");
+		return -EINVAL;
+	}
+
+	if (snprintf(path, len, "/sys/bus/dax/drivers") >= len) {
+		err(ctx, "%s: buffer too small!\n", devname);
+		return -ENXIO;
+	}
+
+	dir = opendir(path);
+	if (!dir) {
+		err(ctx, "%s: opendir(\"%s\") failed\n", devname, path);
+		return -ENXIO;
+	}
+
+	while ((de = readdir(dir)) != NULL) {
+		char *drv_path;
+
+		if (de->d_ino == 0)
+			continue;
+		if (de->d_name[0] == '.')
+			continue;
+		if (strcmp(de->d_name, mod_name) != 0)
+			continue;
+
+		if (asprintf(&drv_path, "%s/%s/new_id", path, de->d_name) < 0) {
+			err(ctx, "%s: path allocation failure\n", devname);
+			rc = -ENOMEM;
+			break;
+		}
+		rc = sysfs_write_attr_quiet(ctx, drv_path, devname);
+		free(drv_path);
+
+		if (asprintf(&drv_path, "%s/%s/bind", path, de->d_name) < 0) {
+			err(ctx, "%s: path allocation failure\n", devname);
+			rc = -ENOMEM;
+			break;
+		}
+		rc = sysfs_write_attr_quiet(ctx, drv_path, devname);
+		free(drv_path);
+		break;
+	}
+	closedir(dir);
+
+	if (rc) {
+		dbg(ctx, "%s: bind failed\n", devname);
+		return rc;
+	}
+	return 0;
+}
+
+static int daxctl_unbind(struct daxctl_ctx *ctx, const char *devpath)
+{
+	const char *devname = devpath_to_devname(devpath);
+	char path[200];
+	const int len = sizeof(path);
+	int rc;
+
+	if (snprintf(path, len, "%s/driver/remove_id", devpath) >= len) {
+		err(ctx, "%s: buffer too small!\n", devname);
+		return -ENXIO;
+	}
+
+	rc = sysfs_write_attr(ctx, path, devname);
+	if (rc)
+		return rc;
+
+	if (snprintf(path, len, "%s/driver/unbind", devpath) >= len) {
+		err(ctx, "%s: buffer too small!\n", devname);
+		return -ENXIO;
+	}
+
+	return sysfs_write_attr(ctx, path, devname);
+
+}
+
 DAXCTL_EXPORT int daxctl_dev_is_enabled(struct daxctl_dev *dev)
 {
 	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
@@ -624,6 +755,119 @@ DAXCTL_EXPORT int daxctl_dev_is_enabled(struct daxctl_dev *dev)
 	return is_enabled(path);
 }
 
+static int daxctl_insert_kmod_for_mode(struct daxctl_dev *dev,
+		const char *mod_name)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	struct kmod_list *iter;
+	int rc = -ENXIO;
+
+	if (dev->kmod_list == NULL) {
+		err(ctx, "%s: a modalias lookup list was not created\n",
+				devname);
+		return rc;
+	}
+
+	kmod_list_foreach(iter, dev->kmod_list) {
+		struct kmod_module *mod = kmod_module_get_module(iter);
+		const char *name = kmod_module_get_name(mod);
+
+		if (strcmp(name, mod_name) != 0) {
+			kmod_module_unref(mod);
+			continue;
+		}
+		dbg(ctx, "%s inserting module: %s\n", devname, name);
+		rc = kmod_module_probe_insert_module(mod,
+				KMOD_PROBE_APPLY_BLACKLIST, NULL, NULL, NULL,
+				NULL);
+		if (rc < 0) {
+			err(ctx, "%s: insert failure: %d\n", devname, rc);
+			return rc;
+		}
+		dev->module = mod;
+	}
+
+	if (rc == -ENXIO)
+		err(ctx, "%s: Unable to find module: %s in alias list\n",
+				devname, mod_name);
+	return rc;
+}
+
+static int daxctl_dev_enable(struct daxctl_dev *dev, enum daxctl_dev_mode mode)
+{
+	struct daxctl_region *region = daxctl_dev_get_region(dev);
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *mod_name = dax_modules[mode];
+	int rc;
+
+	if (!device_model_is_dax_bus(dev)) {
+		err(ctx, "%s: error: device model is dax-class\n", devname);
+		return -EOPNOTSUPP;
+	}
+
+	if (daxctl_dev_is_enabled(dev))
+		return 0;
+
+	if (mode >= DAXCTL_DEV_MODE_END || mod_name == NULL) {
+		err(ctx, "%s: Invalid mode: %d\n", devname, mode);
+		return -EINVAL;
+	}
+
+	rc = daxctl_insert_kmod_for_mode(dev, mod_name);
+	if (rc)
+		return rc;
+
+	rc = daxctl_bind(ctx, devname, mod_name);
+	if (!daxctl_dev_is_enabled(dev)) {
+		err(ctx, "%s: failed to enable\n", devname);
+		return rc ? rc : -ENXIO;
+	}
+
+	region->devices_init = 0;
+	dax_devices_init(region);
+	rc = 0;
+	dbg(ctx, "%s: enabled\n", devname);
+	return rc;
+}
+
+DAXCTL_EXPORT int daxctl_dev_enable_devdax(struct daxctl_dev *dev)
+{
+	return daxctl_dev_enable(dev, DAXCTL_DEV_MODE_DEVDAX);
+}
+
+DAXCTL_EXPORT int daxctl_dev_enable_ram(struct daxctl_dev *dev)
+{
+	return daxctl_dev_enable(dev, DAXCTL_DEV_MODE_RAM);
+}
+
+DAXCTL_EXPORT int daxctl_dev_disable(struct daxctl_dev *dev)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+
+	if (!device_model_is_dax_bus(dev)) {
+		err(ctx, "%s: error: device model is dax-class\n", devname);
+		return -EOPNOTSUPP;
+	}
+
+	if (!daxctl_dev_is_enabled(dev))
+		return 0;
+
+	daxctl_unbind(ctx, dev->dev_path);
+
+	if (daxctl_dev_is_enabled(dev)) {
+		err(ctx, "%s: failed to disable\n", devname);
+		return -EBUSY;
+	}
+
+	kmod_module_unref(dev->module);
+	dbg(ctx, "%s: disabled\n", devname);
+
+	return 0;
+}
+
 DAXCTL_EXPORT struct daxctl_ctx *daxctl_dev_get_ctx(struct daxctl_dev *dev)
 {
 	return dev->region->ctx;
diff --git a/daxctl/lib/libdaxctl.sym b/daxctl/lib/libdaxctl.sym
index c4af9a7..19904a2 100644
--- a/daxctl/lib/libdaxctl.sym
+++ b/daxctl/lib/libdaxctl.sym
@@ -55,4 +55,7 @@ LIBDAXCTL_6 {
 global:
 	daxctl_dev_get_ctx;
 	daxctl_dev_is_enabled;
+	daxctl_dev_disable;
+	daxctl_dev_enable_devdax;
+	daxctl_dev_enable_ram;
 } LIBDAXCTL_5;
diff --git a/daxctl/libdaxctl.h b/daxctl/libdaxctl.h
index e20ccb4..407f459 100644
--- a/daxctl/libdaxctl.h
+++ b/daxctl/libdaxctl.h
@@ -69,6 +69,9 @@ int daxctl_dev_get_minor(struct daxctl_dev *dev);
 unsigned long long daxctl_dev_get_size(struct daxctl_dev *dev);
 struct daxctl_ctx *daxctl_dev_get_ctx(struct daxctl_dev *dev);
 int daxctl_dev_is_enabled(struct daxctl_dev *dev);
+int daxctl_dev_disable(struct daxctl_dev *dev);
+int daxctl_dev_enable_devdax(struct daxctl_dev *dev);
+int daxctl_dev_enable_ram(struct daxctl_dev *dev);
 
 #define daxctl_dev_foreach(region, dev) \
         for (dev = daxctl_dev_get_first(region); \
-- 
2.20.1


* [ndctl PATCH v9 03/13] libdaxctl: add an interface to retrieve the device resource
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 01/13] libdaxctl: add interfaces to get ctx and check device state Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 02/13] libdaxctl: add interfaces to enable/disable devices Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 04/13] libdaxctl: add a 'daxctl_memory' object for memory based operations Vishal Verma
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin, Tony Luck

Add an interface to retrieve the 'resource' attribute for a dax device.

Attempt to retrieve it as usual via sysfs, but since older kernels may
be missing this attribute, as a fallback, attempt to retrieve it from
/proc/iomem.
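
The fallback relies on /proc/iomem lines of the form
"<start>-<end> : <name>". A rough sketch of matching a single line
against a device name follows; iomem_line_match() is a hypothetical
helper that works on a string rather than on the open file, so it can be
tried without access to /proc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one iomem-style line, "<start>-<end> : <name>". If <name> equals
 * 'devname', store the start address in *start and return 0. Return -1
 * for a malformed line or a different name.
 */
static int iomem_line_match(const char *line, const char *devname,
		unsigned long long *start)
{
	unsigned long long res;
	char name[256];

	assert(line && devname && start);

	/* %*llx parses the end address without storing it */
	if (sscanf(line, "%llx-%*llx : %255[^\n]", &res, name) != 2)
		return -1;
	if (strcmp(name, devname) != 0)
		return -1;

	*start = res;
	return 0;
}
```

Leading indentation on nested entries is skipped by the %llx conversion,
so the same format string handles top-level and child resources.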

Cc: Dan Williams <dan.j.williams@intel.com>
[fscanf format string problem and diagnosis]
Reported-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 Makefile.am                    |  3 ++-
 daxctl/lib/Makefile.am         |  2 ++
 daxctl/lib/libdaxctl-private.h |  1 +
 daxctl/lib/libdaxctl.c         | 12 +++++++++++
 daxctl/lib/libdaxctl.sym       |  1 +
 daxctl/libdaxctl.h             |  1 +
 util/iomem.c                   | 37 ++++++++++++++++++++++++++++++++++
 util/iomem.h                   | 12 +++++++++++
 8 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 util/iomem.c
 create mode 100644 util/iomem.h

diff --git a/Makefile.am b/Makefile.am
index df8797e..8d10a10 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -74,6 +74,7 @@ libutil_a_SOURCES = \
 	util/wrapper.c \
 	util/filter.c \
 	util/bitmap.c \
-	util/abspath.c
+	util/abspath.c \
+	util/iomem.c
 
 nobase_include_HEADERS = daxctl/libdaxctl.h
diff --git a/daxctl/lib/Makefile.am b/daxctl/lib/Makefile.am
index 9f0e444..7704b1b 100644
--- a/daxctl/lib/Makefile.am
+++ b/daxctl/lib/Makefile.am
@@ -9,6 +9,8 @@ lib_LTLIBRARIES = libdaxctl.la
 libdaxctl_la_SOURCES =\
 	../libdaxctl.h \
 	libdaxctl-private.h \
+	../../util/iomem.c \
+	../../util/iomem.h \
 	../../util/sysfs.c \
 	../../util/sysfs.h \
 	../../util/log.c \
diff --git a/daxctl/lib/libdaxctl-private.h b/daxctl/lib/libdaxctl-private.h
index 120137f..fee67d1 100644
--- a/daxctl/lib/libdaxctl-private.h
+++ b/daxctl/lib/libdaxctl-private.h
@@ -65,6 +65,7 @@ struct daxctl_dev {
 	size_t buf_len;
 	char *dev_path;
 	struct list_node list;
+	unsigned long long resource;
 	unsigned long long size;
 	struct kmod_module *module;
 	struct kmod_list *kmod_list;
diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index caf661e..aa0d2f2 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -24,6 +24,7 @@
 
 #include <util/log.h>
 #include <util/sysfs.h>
+#include <util/iomem.h>
 #include <daxctl/libdaxctl.h>
 #include "libdaxctl-private.h"
 
@@ -406,6 +407,12 @@ static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 	dev->major = major(st.st_rdev);
 	dev->minor = minor(st.st_rdev);
 
+	sprintf(path, "%s/resource", daxdev_base);
+	if (sysfs_read_attr(ctx, path, buf) == 0)
+		dev->resource = strtoull(buf, NULL, 0);
+	else
+		dev->resource = iomem_get_dev_resource(ctx, daxdev_base);
+
 	sprintf(path, "%s/size", daxdev_base);
 	if (sysfs_read_attr(ctx, path, buf) < 0)
 		goto err_read;
@@ -928,6 +935,11 @@ DAXCTL_EXPORT int daxctl_dev_get_minor(struct daxctl_dev *dev)
 	return dev->minor;
 }
 
+DAXCTL_EXPORT unsigned long long daxctl_dev_get_resource(struct daxctl_dev *dev)
+{
+	return dev->resource;
+}
+
 DAXCTL_EXPORT unsigned long long daxctl_dev_get_size(struct daxctl_dev *dev)
 {
 	return dev->size;
diff --git a/daxctl/lib/libdaxctl.sym b/daxctl/lib/libdaxctl.sym
index 19904a2..1692624 100644
--- a/daxctl/lib/libdaxctl.sym
+++ b/daxctl/lib/libdaxctl.sym
@@ -58,4 +58,5 @@ global:
 	daxctl_dev_disable;
 	daxctl_dev_enable_devdax;
 	daxctl_dev_enable_ram;
+	daxctl_dev_get_resource;
 } LIBDAXCTL_5;
diff --git a/daxctl/libdaxctl.h b/daxctl/libdaxctl.h
index 407f459..adf55f3 100644
--- a/daxctl/libdaxctl.h
+++ b/daxctl/libdaxctl.h
@@ -66,6 +66,7 @@ int daxctl_dev_get_id(struct daxctl_dev *dev);
 const char *daxctl_dev_get_devname(struct daxctl_dev *dev);
 int daxctl_dev_get_major(struct daxctl_dev *dev);
 int daxctl_dev_get_minor(struct daxctl_dev *dev);
+unsigned long long daxctl_dev_get_resource(struct daxctl_dev *dev);
 unsigned long long daxctl_dev_get_size(struct daxctl_dev *dev);
 struct daxctl_ctx *daxctl_dev_get_ctx(struct daxctl_dev *dev);
 int daxctl_dev_is_enabled(struct daxctl_dev *dev);
diff --git a/util/iomem.c b/util/iomem.c
new file mode 100644
index 0000000..a3c23f5
--- /dev/null
+++ b/util/iomem.c
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2019 Intel Corporation. All rights reserved. */
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <util/log.h>
+#include <util/iomem.h>
+#include <util/sysfs.h>
+
+unsigned long long __iomem_get_dev_resource(struct log_ctx *ctx,
+		const char *devpath)
+{
+	const char *devname = devpath_to_devname(devpath);
+	FILE *fp = fopen("/proc/iomem", "r");
+	unsigned long long res;
+	char name[256];
+
+	if (fp == NULL) {
+		log_err(ctx, "%s: open /proc/iomem: %s\n", devname,
+				strerror(errno));
+		return 0;
+	}
+
+	while (fscanf(fp, "%llx-%*x : %254[^\n]\n", &res, name) == 2) {
+		if (strcmp(name, devname) == 0) {
+			log_dbg(ctx, "%s: got resource via iomem: %#llx\n",
+					devname, res);
+			fclose(fp);
+			return res;
+		}
+	}
+
+	log_dbg(ctx, "%s: not found in iomem\n", devname);
+	fclose(fp);
+	return 0;
+}
diff --git a/util/iomem.h b/util/iomem.h
new file mode 100644
index 0000000..aaaf6a7
--- /dev/null
+++ b/util/iomem.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2019 Intel Corporation. All rights reserved. */
+#ifndef _NDCTL_IOMEM_H_
+#define _NDCTL_IOMEM_H_
+
+struct log_ctx;
+unsigned long long __iomem_get_dev_resource(struct log_ctx *ctx,
+		const char *path);
+
+#define iomem_get_dev_resource(c, p) __iomem_get_dev_resource(&(c)->ctx, (p))
+
+#endif /* _NDCTL_IOMEM_H_ */
-- 
2.20.1
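
For illustration only, the /proc/iomem fallback in util/iomem.c above can be exercised on a single line of input. The sketch below reuses the patch's scan format (start address, discarded end address, resource name); the helper name iomem_line_match() is hypothetical and not part of the patch. Like __iomem_get_dev_resource(), it returns 0 both for "no match" and for a line that does not parse.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): parse one /proc/iomem
 * line and return the start address if the resource name matches
 * devname, or 0 otherwise.
 */
static unsigned long long iomem_line_match(const char *line,
		const char *devname)
{
	unsigned long long start;
	char name[256];

	assert(line && devname);

	/* "<start>-<end> : <name>"; the end address is scanned but discarded */
	if (sscanf(line, "%llx-%*x : %254[^\n]", &start, name) != 2)
		return 0;

	return strcmp(name, devname) == 0 ? start : 0;
}
```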

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [ndctl PATCH v9 04/13] libdaxctl: add a 'daxctl_memory' object for memory based operations
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (2 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 03/13] libdaxctl: add an interface to retrieve the device resource Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-05 23:57   ` Verma, Vishal L
  2019-08-01  0:29 ` [ndctl PATCH v9 05/13] daxctl/list: add target_node for device listings Vishal Verma
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Introduce a new 'daxctl_memory' object, which will be used for
operations related to managing dax devices in 'system-ram' mode.

Add libdaxctl APIs to get the target_node of a DAX device, and to
online, offline, and query the state of hotplugged memory sections
associated with a given device.

This adds the following new interfaces:

  daxctl_dev_get_target_node
  daxctl_dev_get_memory
  daxctl_memory_get_dev
  daxctl_memory_get_node_path
  daxctl_memory_get_block_size
  daxctl_memory_online
  daxctl_memory_offline
  daxctl_memory_is_online
  daxctl_memory_num_sections

Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
[for the memblock-already-online TOCTOU hole]
Reported-by: Fan Du <fan.du@intel.com>
Tested-by: Fan Du <fan.du@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl-private.h |  18 ++
 daxctl/lib/libdaxctl.c         | 384 +++++++++++++++++++++++++++++++++
 daxctl/lib/libdaxctl.sym       |   9 +
 daxctl/libdaxctl.h             |  11 +
 4 files changed, 422 insertions(+)
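
For illustration only, the range test that memblock_in_dev() applies to each memory block can be sketched standalone. The function name memblock_in_range() and its parameters are hypothetical; the arithmetic mirrors the patch: a block's physical address is phys_index * block_size, and it is compared against the device's resource start and start + size. Note that the comparison as posted uses an inclusive upper bound, so a block beginning exactly at start + size is also accepted.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical standalone version of the range test in memblock_in_dev().
 * phys_index and block_size correspond to the memoryN/phys_index and
 * block_size_bytes sysfs attributes, which the patch parses as hex.
 */
static bool memblock_in_range(unsigned long long phys_index,
		unsigned long long block_size,
		unsigned long long dev_start, unsigned long long dev_size)
{
	unsigned long long memblock_res, dev_end;

	assert(block_size != 0);

	memblock_res = phys_index * block_size;
	dev_end = dev_start + dev_size;

	/* same comparison as the patch: the upper bound is inclusive */
	return memblock_res >= dev_start && memblock_res <= dev_end;
}
```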

diff --git a/daxctl/lib/libdaxctl-private.h b/daxctl/lib/libdaxctl-private.h
index fee67d1..01091de 100644
--- a/daxctl/lib/libdaxctl-private.h
+++ b/daxctl/lib/libdaxctl-private.h
@@ -39,6 +39,13 @@ static const char *dax_modules[] = {
 	[DAXCTL_DEV_MODE_RAM] = "kmem",
 };
 
+enum memory_op {
+	MEM_SET_OFFLINE,
+	MEM_SET_ONLINE,
+	MEM_IS_ONLINE,
+	MEM_COUNT,
+};
+
 /**
  * struct daxctl_region - container for dax_devices
  */
@@ -70,8 +77,19 @@ struct daxctl_dev {
 	struct kmod_module *module;
 	struct kmod_list *kmod_list;
 	struct daxctl_region *region;
+	struct daxctl_memory *mem;
+	int target_node;
+};
+
+struct daxctl_memory {
+	struct daxctl_dev *dev;
+	void *mem_buf;
+	size_t buf_len;
+	char *node_path;
+	unsigned long block_size;
 };
 
+
 static inline int check_kmod(struct kmod_ctx *kmod_ctx)
 {
 	return kmod_ctx ? 0 : -ENXIO;
diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index aa0d2f2..949c56f 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -200,6 +200,15 @@ DAXCTL_EXPORT void daxctl_region_get_uuid(struct daxctl_region *region, uuid_t u
 	uuid_copy(uu, region->uuid);
 }
 
+static void free_mem(struct daxctl_dev *dev)
+{
+	if (dev && dev->mem) {
+		free(dev->mem->node_path);
+		free(dev->mem);
+		dev->mem = NULL;
+	}
+}
+
 static void free_dev(struct daxctl_dev *dev, struct list_head *head)
 {
 	if (head)
@@ -207,6 +216,7 @@ static void free_dev(struct daxctl_dev *dev, struct list_head *head)
 	kmod_module_unref_list(dev->kmod_list);
 	free(dev->dev_buf);
 	free(dev->dev_path);
+	free_mem(dev);
 	free(dev);
 }
 
@@ -380,6 +390,94 @@ static struct kmod_list *to_module_list(struct daxctl_ctx *ctx,
 	return list;
 }
 
+static int dev_is_system_ram_capable(struct daxctl_dev *dev)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	char *mod_path, *mod_base;
+	char path[200];
+	const int len = sizeof(path);
+
+	if (!device_model_is_dax_bus(dev))
+		return false;
+
+	if (!daxctl_dev_is_enabled(dev))
+		return false;
+
+	if (snprintf(path, len, "%s/driver/module", dev->dev_path) >= len) {
+		err(ctx, "%s: buffer too small!\n", devname);
+		return false;
+	}
+
+	mod_path = realpath(path, NULL);
+	if (!mod_path)
+		return false;
+
+	mod_base = basename(mod_path);
+	if (strcmp(mod_base, dax_modules[DAXCTL_DEV_MODE_RAM]) == 0) {
+		free(mod_path);
+		return true;
+	}
+
+	free(mod_path);
+	return false;
+}
+
+/*
+ * This checks for the device to be in system-ram mode, so calling
+ * daxctl_dev_get_memory() on a devdax mode device will always return NULL.
+ */
+static struct daxctl_memory *daxctl_dev_alloc_mem(struct daxctl_dev *dev)
+{
+	const char *size_path = "/sys/devices/system/memory/block_size_bytes";
+	const char *node_base = "/sys/devices/system/node/node";
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	struct daxctl_memory *mem;
+	char buf[SYSFS_ATTR_SIZE];
+	int node_num;
+
+	if (!dev_is_system_ram_capable(dev))
+		return NULL;
+
+	mem = calloc(1, sizeof(*mem));
+	if (!mem)
+		return NULL;
+
+	mem->dev = dev;
+
+	if (sysfs_read_attr(ctx, size_path, buf) == 0) {
+		mem->block_size = strtoul(buf, NULL, 16);
+		if (mem->block_size == 0 || mem->block_size == ULONG_MAX) {
+			err(ctx, "%s: Unable to determine memblock size: %s\n",
+				devname, strerror(errno));
+			mem->block_size = 0;
+		}
+	}
+
+	node_num = daxctl_dev_get_target_node(dev);
+	if (node_num >= 0) {
+		if (asprintf(&mem->node_path, "%s%d", node_base,
+				node_num) < 0) {
+			err(ctx, "%s: Unable to set node_path\n", devname);
+			goto err_mem;
+		}
+	}
+
+	mem->mem_buf = calloc(1, strlen(node_base) + 256);
+	if (!mem->mem_buf)
+		goto err_node;
+	mem->buf_len = strlen(node_base) + 256;
+
+	return mem;
+
+err_node:
+	free(mem->node_path);
+err_mem:
+	free(mem);
+	return NULL;
+}
+
 static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 {
 	const char *devname = devpath_to_devname(daxdev_base);
@@ -435,6 +533,12 @@ static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 	if (rc == 0)
 		dev->kmod_list = to_module_list(ctx, buf);
 
+	sprintf(path, "%s/target_node", daxdev_base);
+	if (sysfs_read_attr(ctx, path, buf) == 0)
+		dev->target_node = strtol(buf, NULL, 0);
+	else
+		dev->target_node = -1;
+
 	daxctl_dev_foreach(region, dev_dup)
 		if (dev_dup->id == dev->id) {
 			free_dev(dev, NULL);
@@ -862,6 +966,9 @@ DAXCTL_EXPORT int daxctl_dev_disable(struct daxctl_dev *dev)
 	if (!daxctl_dev_is_enabled(dev))
 		return 0;
 
+	/* If there is a memory object, first free that */
+	free_mem(dev);
+
 	daxctl_unbind(ctx, dev->dev_path);
 
 	if (daxctl_dev_is_enabled(dev)) {
@@ -944,3 +1051,280 @@ DAXCTL_EXPORT unsigned long long daxctl_dev_get_size(struct daxctl_dev *dev)
 {
 	return dev->size;
 }
+
+DAXCTL_EXPORT int daxctl_dev_get_target_node(struct daxctl_dev *dev)
+{
+	return dev->target_node;
+}
+
+DAXCTL_EXPORT struct daxctl_memory *daxctl_dev_get_memory(struct daxctl_dev *dev)
+{
+	if (dev->mem)
+		return dev->mem;
+	else
+		return daxctl_dev_alloc_mem(dev);
+}
+
+DAXCTL_EXPORT struct daxctl_dev *daxctl_memory_get_dev(struct daxctl_memory *mem)
+{
+	return mem->dev;
+}
+
+DAXCTL_EXPORT const char *daxctl_memory_get_node_path(struct daxctl_memory *mem)
+{
+	return mem->node_path;
+}
+
+DAXCTL_EXPORT unsigned long daxctl_memory_get_block_size(struct daxctl_memory *mem)
+{
+	return mem->block_size;
+}
+
+static int online_one_memblock(struct daxctl_dev *dev, char *path)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *mode = "online_movable";
+	char buf[SYSFS_ATTR_SIZE];
+	int rc;
+
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc) {
+		err(ctx, "%s: Failed to read %s: %s\n",
+			devname, path, strerror(-rc));
+		return rc;
+	}
+
+	/*
+	 * if already online, possibly due to kernel config or a udev rule,
+	 * there is nothing to do and we can skip over the memblock
+	 */
+	if (strncmp(buf, "online", 6) == 0)
+		return 1;
+
+	rc = sysfs_write_attr_quiet(ctx, path, mode);
+	if (rc) {
+		/*
+		 * While we performed an already-online check above, there
+		 * is still a TOCTOU hole where someone (such as a udev rule)
+		 * may have raced to online the memory. In such a case,
+		 * the sysfs store will fail, however we can check for this
+		 * by simply reading the state again. If it changed to the
+		 * desired state, then we don't have to error out.
+		 */
+		if (sysfs_read_attr(ctx, path, buf) == 0) {
+			if (strncmp(buf, "online", 6) == 0)
+				return 1;
+		}
+		err(ctx, "%s: Failed to online %s: %s\n",
+			devname, path, strerror(-rc));
+	}
+	return rc;
+}
+
+static int offline_one_memblock(struct daxctl_dev *dev, char *path)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *mode = "offline";
+	char buf[SYSFS_ATTR_SIZE];
+	int rc;
+
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc) {
+		err(ctx, "%s: Failed to read %s: %s\n",
+			devname, path, strerror(-rc));
+		return rc;
+	}
+
+	/* if already offline, there is nothing to do */
+	if (strncmp(buf, "offline", 7) == 0)
+		return 1;
+
+	rc = sysfs_write_attr_quiet(ctx, path, mode);
+	if (rc) {
+		/* Close the TOCTOU hole like in online_one_memblock() above */
+		if (sysfs_read_attr(ctx, path, buf) == 0) {
+			if (strncmp(buf, "offline", 7) == 0)
+				return 1;
+		}
+		err(ctx, "%s: Failed to offline %s: %s\n",
+			devname, path, strerror(-rc));
+	}
+	return rc;
+}
+
+static int memblock_is_online(struct daxctl_dev *dev, char *path)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	char buf[SYSFS_ATTR_SIZE];
+	int rc;
+
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc) {
+		err(ctx, "%s: Failed to read %s: %s\n",
+			devname, path, strerror(-rc));
+		return rc;
+	}
+
+	if (strncmp(buf, "online", 6) == 0)
+		return 1;
+
+	/* offline */
+	return 0;
+}
+
+static bool memblock_in_dev(struct daxctl_dev *dev, const char *memblock)
+{
+	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
+	const char *mem_base = "/sys/devices/system/memory/";
+	unsigned long long memblock_res, dev_start, dev_end;
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	unsigned long memblock_size;
+	int path_len = mem->buf_len;
+	char buf[SYSFS_ATTR_SIZE];
+	unsigned long phys_index;
+	char *path = mem->mem_buf;
+
+	if (snprintf(path, path_len, "%s/%s/phys_index",
+			mem_base, memblock) < 0)
+		return false;
+
+	if (sysfs_read_attr(ctx, path, buf) == 0) {
+		phys_index = strtoul(buf, NULL, 16);
+		if (phys_index == 0 || phys_index == ULONG_MAX) {
+			err(ctx, "%s: %s: Unable to determine phys_index: %s\n",
+				devname, memblock, strerror(errno));
+			return false;
+		}
+	} else {
+		err(ctx, "%s: %s: Unable to determine phys_index: %s\n",
+			devname, memblock, strerror(errno));
+		return false;
+	}
+
+	dev_start = daxctl_dev_get_resource(dev);
+	if (!dev_start) {
+		err(ctx, "%s: Unable to determine resource\n", devname);
+		return false;
+	}
+	dev_end = dev_start + daxctl_dev_get_size(dev);
+
+	memblock_size = daxctl_memory_get_block_size(mem);
+	if (!memblock_size) {
+		err(ctx, "%s: Unable to determine memory block size\n",
+			devname);
+		return false;
+	}
+	memblock_res = phys_index * memblock_size;
+
+	if (memblock_res >= dev_start && memblock_res <= dev_end)
+		return true;
+
+	return false;
+}
+
+static int op_for_one_memblock(struct daxctl_memory *mem, char *path,
+		enum memory_op op)
+{
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	int rc;
+
+	switch (op) {
+	case MEM_SET_ONLINE:
+		return online_one_memblock(dev, path);
+	case MEM_SET_OFFLINE:
+		return offline_one_memblock(dev, path);
+	case MEM_IS_ONLINE:
+		rc = memblock_is_online(dev, path);
+		if (rc < 0)
+			return rc;
+		/*
+		 * Retain the 'normal' semantics for if (memblock_is_online()),
+		 * but since count needs rc == 0, we'll just flip rc for this op
+		 */
+		return !rc;
+	case MEM_COUNT:
+		return 0;
+	}
+
+	err(ctx, "%s: BUG: unknown op: %d\n", devname, op);
+	return -EINVAL;
+}
+
+static int daxctl_memory_op(struct daxctl_memory *mem, enum memory_op op)
+{
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *node_path;
+	int rc, count = 0;
+	struct dirent *de;
+	DIR *node_dir;
+
+	node_path = daxctl_memory_get_node_path(mem);
+	if (!node_path) {
+		err(ctx, "%s: Failed to get node_path\n", devname);
+		return -ENXIO;
+	}
+
+	node_dir = opendir(node_path);
+	if (!node_dir)
+		return -errno;
+
+	errno = 0;
+	while ((de = readdir(node_dir)) != NULL) {
+		char *path = mem->mem_buf;
+		int len = mem->buf_len;
+
+		if (strncmp(de->d_name, "memory", 6) == 0) {
+			if (!memblock_in_dev(dev, de->d_name))
+				continue;
+			rc = snprintf(path, len, "%s/%s/state",
+				node_path, de->d_name);
+			if (rc < 0) {
+				rc = -ENOMEM;
+				goto out_dir;
+			}
+			rc = op_for_one_memblock(mem, path, op);
+			if (rc < 0)
+				goto out_dir;
+			if (rc == 0)
+				count++;
+		}
+		errno = 0;
+	}
+	if (errno) {
+		rc = -errno;
+		goto out_dir;
+	}
+	rc = count;
+
+out_dir:
+	closedir(node_dir);
+	return rc;
+}
+
+DAXCTL_EXPORT int daxctl_memory_online(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_SET_ONLINE);
+}
+
+DAXCTL_EXPORT int daxctl_memory_offline(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_SET_OFFLINE);
+}
+
+DAXCTL_EXPORT int daxctl_memory_is_online(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_IS_ONLINE);
+}
+
+DAXCTL_EXPORT int daxctl_memory_num_sections(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_COUNT);
+}
diff --git a/daxctl/lib/libdaxctl.sym b/daxctl/lib/libdaxctl.sym
index 1692624..bc18604 100644
--- a/daxctl/lib/libdaxctl.sym
+++ b/daxctl/lib/libdaxctl.sym
@@ -59,4 +59,13 @@ global:
 	daxctl_dev_enable_devdax;
 	daxctl_dev_enable_ram;
 	daxctl_dev_get_resource;
+	daxctl_dev_get_target_node;
+	daxctl_dev_get_memory;
+	daxctl_memory_get_dev;
+	daxctl_memory_get_node_path;
+	daxctl_memory_get_block_size;
+	daxctl_memory_online;
+	daxctl_memory_offline;
+	daxctl_memory_is_online;
+	daxctl_memory_num_sections;
 } LIBDAXCTL_5;
diff --git a/daxctl/libdaxctl.h b/daxctl/libdaxctl.h
index adf55f3..fb6c3b1 100644
--- a/daxctl/libdaxctl.h
+++ b/daxctl/libdaxctl.h
@@ -73,6 +73,17 @@ int daxctl_dev_is_enabled(struct daxctl_dev *dev);
 int daxctl_dev_disable(struct daxctl_dev *dev);
 int daxctl_dev_enable_devdax(struct daxctl_dev *dev);
 int daxctl_dev_enable_ram(struct daxctl_dev *dev);
+int daxctl_dev_get_target_node(struct daxctl_dev *dev);
+
+struct daxctl_memory;
+struct daxctl_memory *daxctl_dev_get_memory(struct daxctl_dev *dev);
+struct daxctl_dev *daxctl_memory_get_dev(struct daxctl_memory *mem);
+const char *daxctl_memory_get_node_path(struct daxctl_memory *mem);
+unsigned long daxctl_memory_get_block_size(struct daxctl_memory *mem);
+int daxctl_memory_online(struct daxctl_memory *mem);
+int daxctl_memory_offline(struct daxctl_memory *mem);
+int daxctl_memory_is_online(struct daxctl_memory *mem);
+int daxctl_memory_num_sections(struct daxctl_memory *mem);
 
 #define daxctl_dev_foreach(region, dev) \
         for (dev = daxctl_dev_get_first(region); \
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [ndctl PATCH v9 05/13] daxctl/list: add target_node for device listings
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (3 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 04/13] libdaxctl: add a 'daxctl_memory' object for memory based operations Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 06/13] daxctl/list: display the mode for a dax device Vishal Verma
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

The kernel provides a 'target_node' attribute for dax devices. When
converting a dax device to the system-ram mode, the memory is hotplugged
into this numa node. It would be helpful to print this in device
listings so that it is easy for applications to detect the node to
which the new memory belongs.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 util/json.c | 8 ++++++++
 1 file changed, 8 insertions(+)
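
For illustration only, the rule this series follows for target_node can be sketched without json-c: the attribute is parsed with strtol(), a missing attribute is treated as -1, and a negative node is left out of the listing. The helper format_target_node() and its plain-string output are assumptions made for this sketch; the patch itself builds a json_object instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: attr is the raw text of the target_node sysfs
 * attribute, or NULL if it could not be read. Writes a listing fragment
 * into out and returns its length, or writes an empty string and
 * returns 0 when the node is unknown (negative).
 */
static int format_target_node(char *out, size_t len, const char *attr)
{
	int node;

	assert(out && len > 0);

	/* a missing attribute maps to -1, as in add_dax_dev() */
	node = attr ? (int)strtol(attr, NULL, 0) : -1;
	if (node < 0) {
		out[0] = '\0';
		return 0;
	}

	return snprintf(out, len, "\"target_node\":%d", node);
}
```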

diff --git a/util/json.c b/util/json.c
index babdc8c..f521337 100644
--- a/util/json.c
+++ b/util/json.c
@@ -271,6 +271,7 @@ struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev,
 {
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct json_object *jdev, *jobj;
+	int node;
 
 	jdev = json_object_new_object();
 	if (!devname || !jdev)
@@ -284,6 +285,13 @@ struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev,
 	if (jobj)
 		json_object_object_add(jdev, "size", jobj);
 
+	node = daxctl_dev_get_target_node(dev);
+	if (node >= 0) {
+		jobj = json_object_new_int(node);
+		if (jobj)
+			json_object_object_add(jdev, "target_node", jobj);
+	}
+
 	return jdev;
 }
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [ndctl PATCH v9 06/13] daxctl/list: display the mode for a dax device
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (4 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 05/13] daxctl/list: add target_node for device listings Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 07/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

In preparation for a reconfigure-device command, allow JSON listings to
display the 'mode' of a dax device. Since the mode is emitted by
util_daxctl_dev_to_json(), it shows up both in 'daxctl-list' output and
in the listing that reconfigure-device prints immediately after a mode
change.

Add a 'state' attribute to the json listings for devices, since a device
could end up in a state where it is not bound to any driver, and hence,
'disabled'. The state attribute is only displayed for disabled devices.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/lib/libdaxctl.c |  2 ++
 util/json.c            | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 949c56f..edb8257 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -12,6 +12,8 @@
  */
 #include <stdio.h>
 #include <errno.h>
+#include <limits.h>
+#include <libgen.h>
 #include <stdlib.h>
 #include <dirent.h>
 #include <unistd.h>
diff --git a/util/json.c b/util/json.c
index f521337..9f80b5b 100644
--- a/util/json.c
+++ b/util/json.c
@@ -269,6 +269,7 @@ struct json_object *util_dimm_to_json(struct ndctl_dimm *dimm,
 struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev,
 		unsigned long flags)
 {
+	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
 	const char *devname = daxctl_dev_get_devname(dev);
 	struct json_object *jdev, *jobj;
 	int node;
@@ -292,6 +293,19 @@ struct json_object *util_daxctl_dev_to_json(struct daxctl_dev *dev,
 			json_object_object_add(jdev, "target_node", jobj);
 	}
 
+	if (mem)
+		jobj = json_object_new_string("system-ram");
+	else
+		jobj = json_object_new_string("devdax");
+	if (jobj)
+		json_object_object_add(jdev, "mode", jobj);
+
+	if (!daxctl_dev_is_enabled(dev)) {
+		jobj = json_object_new_string("disabled");
+		if (jobj)
+			json_object_object_add(jdev, "state", jobj);
+	}
+
 	return jdev;
 }
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [ndctl PATCH v9 07/13] daxctl: add a new reconfigure-device command
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (5 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 06/13] daxctl/list: display the mode for a dax device Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 08/13] Documentation/daxctl: add a man page for daxctl-reconfigure-device Vishal Verma
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Add a new command 'daxctl-reconfigure-device'. This is used to switch
the mode of a dax device between regular 'devdax' and 'system-ram'.
The command also uses the memory hotplug sysfs interfaces to online the
newly available memory when converting to 'system-ram', and, when
--force is given, to attempt to offline the memory when converting back
to 'devdax'.

Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/Makefile.am |   2 +
 daxctl/builtin.h   |   1 +
 daxctl/daxctl.c    |   1 +
 daxctl/device.c    | 455 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 459 insertions(+)
 create mode 100644 daxctl/device.c
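
For illustration only, the option validation that parse_device_options() performs for reconfigure-device can be sketched as a single function: a mode is required, "system-ram" and "devdax" are the two mode strings, and --no-online is rejected together with devdax. The helper parse_reconfig_mode() and the DEV_MODE_* names are hypothetical. One deliberate difference from the patch, and an assumption of this sketch: an unrecognized mode string is rejected here with -EINVAL, whereas the posted code leaves the mode as unknown at this point.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

enum dev_mode {
	DEV_MODE_UNKNOWN,
	DEV_MODE_DEVDAX,
	DEV_MODE_RAM,
};

/*
 * Hypothetical helper: validate --mode and --no-online together.
 * Returns 0 and sets *out on success, -EINVAL otherwise.
 */
static int parse_reconfig_mode(const char *mode, bool no_online,
		enum dev_mode *out)
{
	assert(out);
	*out = DEV_MODE_UNKNOWN;

	if (!mode)
		return -EINVAL;	/* a mode is required */

	if (strcmp(mode, "system-ram") == 0) {
		*out = DEV_MODE_RAM;
		return 0;
	}
	if (strcmp(mode, "devdax") == 0) {
		if (no_online)
			return -EINVAL;	/* nothing to online in devdax mode */
		*out = DEV_MODE_DEVDAX;
		return 0;
	}

	return -EINVAL;	/* assumption of this sketch: reject unknown modes early */
}
```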

diff --git a/daxctl/Makefile.am b/daxctl/Makefile.am
index 94f73f9..66dcc7f 100644
--- a/daxctl/Makefile.am
+++ b/daxctl/Makefile.am
@@ -15,10 +15,12 @@ daxctl_SOURCES =\
 		daxctl.c \
 		list.c \
 		migrate.c \
+		device.c \
 		../util/json.c
 
 daxctl_LDADD =\
 	lib/libdaxctl.la \
 	../libutil.a \
 	$(UUID_LIBS) \
+	$(KMOD_LIBS) \
 	$(JSON_LIBS)
diff --git a/daxctl/builtin.h b/daxctl/builtin.h
index 00ef5e9..756ba2a 100644
--- a/daxctl/builtin.h
+++ b/daxctl/builtin.h
@@ -6,4 +6,5 @@
 struct daxctl_ctx;
 int cmd_list(int argc, const char **argv, struct daxctl_ctx *ctx);
 int cmd_migrate(int argc, const char **argv, struct daxctl_ctx *ctx);
+int cmd_reconfig_device(int argc, const char **argv, struct daxctl_ctx *ctx);
 #endif /* _DAXCTL_BUILTIN_H_ */
diff --git a/daxctl/daxctl.c b/daxctl/daxctl.c
index 2e41747..e1ba7b8 100644
--- a/daxctl/daxctl.c
+++ b/daxctl/daxctl.c
@@ -71,6 +71,7 @@ static struct cmd_struct commands[] = {
 	{ "list", .d_fn = cmd_list },
 	{ "help", .d_fn = cmd_help },
 	{ "migrate-device-model", .d_fn = cmd_migrate },
+	{ "reconfigure-device", .d_fn = cmd_reconfig_device },
 };
 
 int main(int argc, const char **argv)
diff --git a/daxctl/device.c b/daxctl/device.c
new file mode 100644
index 0000000..ed2af76
--- /dev/null
+++ b/daxctl/device.c
@@ -0,0 +1,455 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2019 Intel Corporation. All rights reserved. */
+#include <stdio.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <syslog.h>
+#include <unistd.h>
+#include <limits.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/sysmacros.h>
+#include <util/json.h>
+#include <util/filter.h>
+#include <json-c/json.h>
+#include <daxctl/libdaxctl.h>
+#include <util/parse-options.h>
+#include <ccan/array_size/array_size.h>
+
+static struct {
+	const char *dev;
+	const char *mode;
+	int region_id;
+	bool no_online;
+	bool force;
+	bool human;
+	bool verbose;
+} param = {
+	.region_id = -1,
+};
+
+enum dev_mode {
+	DAXCTL_DEV_MODE_UNKNOWN,
+	DAXCTL_DEV_MODE_DEVDAX,
+	DAXCTL_DEV_MODE_RAM,
+};
+
+static enum dev_mode reconfig_mode = DAXCTL_DEV_MODE_UNKNOWN;
+static unsigned long flags;
+
+enum device_action {
+	ACTION_RECONFIG,
+};
+
+#define BASE_OPTIONS() \
+OPT_INTEGER('r', "region", &param.region_id, "restrict to the given region"), \
+OPT_BOOLEAN('u', "human", &param.human, "use human friendly number formats"), \
+OPT_BOOLEAN('v', "verbose", &param.verbose, "emit more debug messages")
+
+#define RECONFIG_OPTIONS() \
+OPT_STRING('m', "mode", &param.mode, "mode", "mode to switch the device to"), \
+OPT_BOOLEAN('N', "no-online", &param.no_online, \
+	"don't auto-online memory sections"), \
+OPT_BOOLEAN('f', "force", &param.force, \
+		"attempt to offline memory sections before reconfiguration")
+
+static const struct option reconfig_options[] = {
+	BASE_OPTIONS(),
+	RECONFIG_OPTIONS(),
+	OPT_END(),
+};
+
+static const char *parse_device_options(int argc, const char **argv,
+		enum device_action action, const struct option *options,
+		const char *usage, struct daxctl_ctx *ctx)
+{
+	const char * const u[] = {
+		usage,
+		NULL
+	};
+	int i, rc = 0;
+
+	argc = parse_options(argc, argv, options, u, 0);
+
+	/* Handle action-agnostic non-option arguments */
+	if (argc == 0) {
+		char *action_string;
+
+		switch (action) {
+		case ACTION_RECONFIG:
+			action_string = "reconfigure";
+			break;
+		default:
+			action_string = "<>";
+			break;
+		}
+		fprintf(stderr, "specify a device to %s, or \"all\"\n",
+			action_string);
+		rc = -EINVAL;
+	}
+	for (i = 1; i < argc; i++) {
+		fprintf(stderr, "unknown extra parameter \"%s\"\n", argv[i]);
+		rc = -EINVAL;
+	}
+
+	if (rc) {
+		usage_with_options(u, options);
+		return NULL;
+	}
+
+	/* Handle action-agnostic options */
+	if (param.verbose)
+		daxctl_set_log_priority(ctx, LOG_DEBUG);
+	if (param.human)
+		flags |= UTIL_JSON_HUMAN;
+
+	/* Handle action-specific options */
+	switch (action) {
+	case ACTION_RECONFIG:
+		if (!param.mode) {
+			fprintf(stderr, "error: a 'mode' option is required\n");
+			usage_with_options(u, reconfig_options);
+			rc = -EINVAL;
+		}
+		if (strcmp(param.mode, "system-ram") == 0) {
+			reconfig_mode = DAXCTL_DEV_MODE_RAM;
+		} else if (strcmp(param.mode, "devdax") == 0) {
+			reconfig_mode = DAXCTL_DEV_MODE_DEVDAX;
+			if (param.no_online) {
+				fprintf(stderr,
+					"--no-online is incompatible with --mode=devdax\n");
+				rc =  -EINVAL;
+			}
+		}
+		break;
+	}
+	if (rc) {
+		usage_with_options(u, options);
+		return NULL;
+	}
+
+	return argv[0];
+}
+
+static int dev_online_memory(struct daxctl_dev *dev)
+{
+	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
+	const char *devname = daxctl_dev_get_devname(dev);
+	int num_sections, num_on, rc;
+
+	if (!mem) {
+		fprintf(stderr, "%s: failed to get the memory object\n",
+			devname);
+		return -ENXIO;
+	}
+
+	/* get total number of sections and sections already online */
+	num_sections = daxctl_memory_num_sections(mem);
+	if (num_sections < 0) {
+		fprintf(stderr, "%s: failed to get number of memory sections\n",
+			devname);
+		return num_sections;
+	}
+
+	num_on = daxctl_memory_is_online(mem);
+	if (num_on < 0) {
+		fprintf(stderr, "%s: failed to determine online state: %s\n",
+			devname, strerror(-num_on));
+		return num_on;
+	}
+	if (num_on == num_sections) {
+		fprintf(stderr, "%s: all memory sections (%d) already online\n",
+			devname, num_on);
+		return 1;
+	}
+	if (num_on > 0)
+		fprintf(stderr, "%s: %d memory section%s already online\n",
+			devname, num_on,
+			num_on == 1 ? "" : "s");
+
+	/* online the remaining sections */
+	rc = daxctl_memory_online(mem);
+	if (rc < 0) {
+		fprintf(stderr, "%s: failed to online memory: %s\n",
+			devname, strerror(-rc));
+		return rc;
+	}
+	if (param.verbose)
+		fprintf(stderr, "%s: %d memory section%s onlined\n", devname, rc,
+			rc == 1 ? "" : "s");
+
+	/* all sections should now be online */
+	num_on = daxctl_memory_is_online(mem);
+	if (num_on < 0) {
+		fprintf(stderr, "%s: failed to determine online state: %s\n",
+			devname, strerror(-num_on));
+		return num_on;
+	}
+	if (num_on < num_sections) {
+		fprintf(stderr, "%s: failed to online %d memory sections\n",
+			devname, num_sections - num_on);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int dev_offline_memory(struct daxctl_dev *dev)
+{
+	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
+	const char *devname = daxctl_dev_get_devname(dev);
+	int num_sections, num_on, num_off, rc;
+
+	if (!mem) {
+		fprintf(stderr, "%s: failed to get the memory object\n",
+			devname);
+		return -ENXIO;
+	}
+
+	/* get total number of sections and sections already offline */
+	num_sections = daxctl_memory_num_sections(mem);
+	if (num_sections < 0) {
+		fprintf(stderr, "%s: failed to get number of memory sections\n",
+			devname);
+		return num_sections;
+	}
+
+	num_on = daxctl_memory_is_online(mem);
+	if (num_on < 0) {
+		fprintf(stderr, "%s: failed to determine online state: %s\n",
+			devname, strerror(-num_on));
+		return num_on;
+	}
+
+	num_off = num_sections - num_on;
+	if (num_off == num_sections) {
+		fprintf(stderr, "%s: all memory sections (%d) already offline\n",
+			devname, num_off);
+		return 1;
+	}
+	if (num_off)
+		fprintf(stderr, "%s: %d memory section%s already offline\n",
+			devname, num_off,
+			num_off == 1 ? "" : "s");
+
+	/* offline the remaining sections */
+	rc = daxctl_memory_offline(mem);
+	if (rc < 0) {
+		fprintf(stderr, "%s: failed to offline memory: %s\n",
+			devname, strerror(-rc));
+		return rc;
+	}
+	if (param.verbose)
+		fprintf(stderr, "%s: %d memory section%s offlined\n", devname, rc,
+			rc == 1 ? "" : "s");
+
+	/* all sections should now be offline */
+	num_on = daxctl_memory_is_online(mem);
+	if (num_on < 0) {
+		fprintf(stderr, "%s: failed to determine online state: %s\n",
+			devname, strerror(-num_on));
+		return num_on;
+	}
+	if (num_on) {
+		fprintf(stderr, "%s: failed to offline %d memory sections\n",
+			devname, num_on);
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int disable_devdax_device(struct daxctl_dev *dev)
+{
+	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
+	const char *devname = daxctl_dev_get_devname(dev);
+	int rc;
+
+	if (mem) {
+		fprintf(stderr, "%s was already in system-ram mode\n",
+			devname);
+		return 1;
+	}
+	rc = daxctl_dev_disable(dev);
+	if (rc) {
+		fprintf(stderr, "%s: disable failed: %s\n",
+			daxctl_dev_get_devname(dev), strerror(-rc));
+		return rc;
+	}
+	return 0;
+}
+
+static int reconfig_mode_system_ram(struct daxctl_dev *dev)
+{
+	int rc, skip_enable = 0;
+
+	if (daxctl_dev_is_enabled(dev)) {
+		rc = disable_devdax_device(dev);
+		if (rc < 0)
+			return rc;
+		if (rc > 0)
+			skip_enable = 1;
+	}
+
+	if (!skip_enable) {
+		rc = daxctl_dev_enable_ram(dev);
+		if (rc)
+			return rc;
+	}
+
+	if (param.no_online)
+		return 0;
+
+	return dev_online_memory(dev);
+}
+
+static int disable_system_ram_device(struct daxctl_dev *dev)
+{
+	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
+	const char *devname = daxctl_dev_get_devname(dev);
+	int rc;
+
+	if (!mem) {
+		fprintf(stderr, "%s was already in devdax mode\n", devname);
+		return 1;
+	}
+
+	if (param.force) {
+		rc = dev_offline_memory(dev);
+		if (rc < 0)
+			return rc;
+	}
+
+	rc = daxctl_memory_is_online(mem);
+	if (rc < 0) {
+		fprintf(stderr, "%s: failed to determine online state: %s\n",
+			devname, strerror(-rc));
+		return rc;
+	}
+	if (rc > 0) {
+		if (param.verbose) {
+			fprintf(stderr, "%s: found %d memory sections online\n",
+				devname, rc);
+			fprintf(stderr, "%s: refusing to change modes\n",
+				devname);
+		}
+		return -EBUSY;
+	}
+	rc = daxctl_dev_disable(dev);
+	if (rc) {
+		fprintf(stderr, "%s: disable failed: %s\n",
+			daxctl_dev_get_devname(dev), strerror(-rc));
+		return rc;
+	}
+	return 0;
+}
+
+static int reconfig_mode_devdax(struct daxctl_dev *dev)
+{
+	int rc;
+
+	if (daxctl_dev_is_enabled(dev)) {
+		rc = disable_system_ram_device(dev);
+		if (rc)
+			return rc;
+	}
+
+	rc = daxctl_dev_enable_devdax(dev);
+	if (rc)
+		return rc;
+
+	return 0;
+}
+
+static int do_reconfig(struct daxctl_dev *dev, enum dev_mode mode,
+		struct json_object **jdevs)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct json_object *jdev;
+	int rc = 0;
+
+	switch (mode) {
+	case DAXCTL_DEV_MODE_RAM:
+		rc = reconfig_mode_system_ram(dev);
+		break;
+	case DAXCTL_DEV_MODE_DEVDAX:
+		rc = reconfig_mode_devdax(dev);
+		break;
+	default:
+		fprintf(stderr, "%s: unknown mode requested: %d\n",
+			devname, mode);
+		rc = -EINVAL;
+	}
+
+	if (rc)
+		return rc;
+
+	*jdevs = json_object_new_array();
+	if (*jdevs) {
+		jdev = util_daxctl_dev_to_json(dev, flags);
+		if (jdev)
+			json_object_array_add(*jdevs, jdev);
+	}
+
+	return 0;
+}
+
+static int do_xaction_device(const char *device, enum device_action action,
+		struct daxctl_ctx *ctx, int *processed)
+{
+	struct json_object *jdevs = NULL;
+	struct daxctl_region *region;
+	struct daxctl_dev *dev;
+	int rc = -ENXIO;
+
+	*processed = 0;
+
+	daxctl_region_foreach(ctx, region) {
+		if (param.region_id >= 0 && param.region_id
+				!= daxctl_region_get_id(region))
+			continue;
+
+		daxctl_dev_foreach(region, dev) {
+			if (!util_daxctl_dev_filter(dev, device))
+				continue;
+
+			switch (action) {
+			case ACTION_RECONFIG:
+				rc = do_reconfig(dev, reconfig_mode, &jdevs);
+				if (rc == 0)
+					(*processed)++;
+				break;
+			default:
+				rc = -EINVAL;
+				break;
+			}
+		}
+	}
+
+	/*
+	 * jdevs is the containing json array for all devices we are reporting
+	 * on. It therefore needs to be outside the region/device iterators,
+	 * and passed in to the do_<action> functions to add their objects to
+	 */
+	if (jdevs)
+		util_display_json_array(stdout, jdevs, flags);
+
+	return rc;
+}
+
+int cmd_reconfig_device(int argc, const char **argv, struct daxctl_ctx *ctx)
+{
+	const char *usage = "daxctl reconfigure-device <device> [<options>]";
+	const char *device = parse_device_options(argc, argv, ACTION_RECONFIG,
+			reconfig_options, usage, ctx);
+	int processed, rc;
+
+	rc = do_xaction_device(device, ACTION_RECONFIG, ctx, &processed);
+	if (rc < 0)
+		fprintf(stderr, "error reconfiguring devices: %s\n",
+				strerror(-rc));
+
+	fprintf(stderr, "reconfigured %d device%s\n", processed,
+			processed == 1 ? "" : "s");
+	return rc;
+}
-- 
2.20.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

* [ndctl PATCH v9 08/13] Documentation/daxctl: add a man page for daxctl-reconfigure-device
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (6 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 07/13] daxctl: add a new reconfigure-device command Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 09/13] daxctl: add commands to online and offline memory Vishal Verma
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Add a man page describing the new daxctl-reconfigure-device command.

Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 Documentation/daxctl/Makefile.am              |   3 +-
 .../daxctl/daxctl-reconfigure-device.txt      | 157 ++++++++++++++++++
 2 files changed, 159 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/daxctl/daxctl-reconfigure-device.txt

diff --git a/Documentation/daxctl/Makefile.am b/Documentation/daxctl/Makefile.am
index 6aba035..715fbad 100644
--- a/Documentation/daxctl/Makefile.am
+++ b/Documentation/daxctl/Makefile.am
@@ -28,7 +28,8 @@ endif
 man1_MANS = \
 	daxctl.1 \
 	daxctl-list.1 \
-	daxctl-migrate-device-model.1
+	daxctl-migrate-device-model.1 \
+	daxctl-reconfigure-device.1
 
 CLEANFILES = $(man1_MANS)
 
diff --git a/Documentation/daxctl/daxctl-reconfigure-device.txt b/Documentation/daxctl/daxctl-reconfigure-device.txt
new file mode 100644
index 0000000..196d692
--- /dev/null
+++ b/Documentation/daxctl/daxctl-reconfigure-device.txt
@@ -0,0 +1,157 @@
+// SPDX-License-Identifier: GPL-2.0
+
+daxctl-reconfigure-device(1)
+============================
+
+NAME
+----
+daxctl-reconfigure-device - Reconfigure a dax device into a different mode
+
+SYNOPSIS
+--------
+[verse]
+'daxctl reconfigure-device' <dax0.0> [<dax1.0>...<daxY.Z>] [<options>]
+
+EXAMPLES
+--------
+
+* Reconfigure dax0.0 to system-ram mode, don't online the memory
+----
+# daxctl reconfigure-device --mode=system-ram --no-online dax0.0
+[
+  {
+    "chardev":"dax0.0",
+    "size":16777216000,
+    "target_node":2,
+    "mode":"system-ram"
+  }
+]
+----
+
+* Reconfigure dax0.0 to devdax mode, attempt to offline the memory
+----
+# daxctl reconfigure-device --human --mode=devdax --force dax0.0
+{
+  "chardev":"dax0.0",
+  "size":"15.63 GiB (16.78 GB)",
+  "target_node":2,
+  "mode":"devdax"
+}
+----
+
+* Reconfigure all dax devices on region0 to system-ram mode
+----
+# daxctl reconfigure-device --mode=system-ram --region=0 all
+[
+  {
+    "chardev":"dax0.0",
+    "size":16777216000,
+    "target_node":2,
+    "mode":"system-ram"
+  },
+  {
+    "chardev":"dax0.1",
+    "size":16777216000,
+    "target_node":3,
+    "mode":"system-ram"
+  }
+]
+----
+
+* Run a process called 'some-service' using numactl to restrict its CPU
+nodes to '0' and '1', and its memory allocations to node 2 (determined
+using daxctl_dev_get_target_node() or 'daxctl list')
+----
+# daxctl reconfigure-device --mode=system-ram dax0.0
+[
+  {
+    "chardev":"dax0.0",
+    "size":16777216000,
+    "target_node":2,
+    "mode":"system-ram"
+  }
+]
+
+# numactl --cpunodebind=0-1 --membind=2 -- some-service --opt1 --opt2
+----
+
+DESCRIPTION
+-----------
+
+Reconfigure the operational mode of a dax device. This can be used to convert
+a regular 'devdax' mode device to the 'system-ram' mode which arranges for the
+dax range to be hot-plugged into the system as regular memory.
+
+NOTE: This is a destructive operation. Any data on the dax device *will* be
+lost.
+
+NOTE: Device reconfiguration depends on the dax-bus device model. See
+linkdaxctl:daxctl-migrate-device-model[1] for more information. If dax-class is
+in use (via the dax_pmem_compat driver), the reconfiguration will fail with an
+error such as the following:
+----
+# daxctl reconfigure-device --mode=system-ram --region=0 all
+libdaxctl: daxctl_dev_disable: dax3.0: error: device model is dax-class
+dax3.0: disable failed: Operation not supported
+error reconfiguring devices: Operation not supported
+reconfigured 0 devices
+----
+
+OPTIONS
+-------
+-r::
+--region=::
+	Restrict the operation to devices belonging to the specified region(s).
+	A device-dax region is a contiguous range of memory that hosts one or
+	more /dev/daxX.Y devices, where X is the region id and Y is the device
+	instance id.
+
+-m::
+--mode=::
+	Specify the mode to which the dax device(s) should be reconfigured.
+	- "system-ram": hotplug the device into system memory.
+
+	- "devdax": switch to the normal "device dax" mode. This requires the
+	  kernel to support hot-unplugging 'kmem' based memory. If this is not
+	  available, a reboot is the only way to switch back to 'devdax' mode.
+
+-N::
+--no-online::
+	By default, memory sections provided by system-ram devices will be
+	brought online automatically and immediately with the 'online_movable'
+	policy. Use this option to disable the automatic onlining behavior.
+
+	NOTE: While this option prevents daxctl from automatically onlining
+	the memory sections, there may be other agents, notably system udev
+	rules, that online new memory sections as they appear. Coordinating
+	with such rules is out of scope of this utility, and the system
+	administrator is expected to remove them if they are undesirable.
+	If such an agent races to online memory sections, daxctl is prepared
+	to lose the race, and not fail the onlining operation as it only
+	cares that the memory section was onlined, not that it was the one
+	to do so.
+
+-f::
+--force::
+	When converting from "system-ram" mode to "devdax", it is expected
+	that all the memory sections are first made offline. By default,
+	daxctl won't touch online memory. With this option, however, daxctl
+	attempts to offline the memory on the NUMA node associated with the
+	dax device before converting it back to "devdax" mode.
+
+-u::
+--human::
+	By default the command will output machine-friendly raw-integer
+	data. Instead, with this flag, numbers representing storage size
+	will be formatted as human readable strings with units; other
+	fields are converted to hexadecimal strings.
+
+-v::
+--verbose::
+	Emit more debug messages
+
+include::../copyright.txt[]
+
+SEE ALSO
+--------
+linkdaxctl:daxctl-list[1],daxctl-migrate-device-model[1]
-- 
2.20.1

* [ndctl PATCH v9 09/13] daxctl: add commands to online and offline memory
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (7 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 08/13] Documentation/daxctl: add a man page for daxctl-reconfigure-device Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 10/13] Documentation: Add man pages for daxctl-{on, off}line-memory Vishal Verma
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Add two new commands:

  daxctl-online-memory
  daxctl-offline-memory

to manage the state of hot-plugged memory for dax devices in system-ram
mode. This lets the user online or offline the memory as a separate step
from the reconfiguration. Without these commands, a user who reconfigures
a device into system-ram mode with the --no-online option would have no
way to online the memory later through daxctl, and would have to resort
to shell scripting to online the memory sections manually via sysfs.
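
For reference, the manual sysfs approach mentioned above might look
roughly like the sketch below. It is an illustration only and not part
of this patch: the helper name online_blocks and its directory argument
are hypothetical, and it assumes the kernel's usual memory hotplug
layout (/sys/devices/system/memory/memoryN/state). It also makes no
attempt to work out which memory blocks belong to a given dax device,
which is the part daxctl handles.

```shell
#!/bin/bash
# Hypothetical helper (not part of daxctl): online every offline memory
# block found under a sysfs-style directory. $1 defaults to the kernel's
# memory hotplug directory. Prints the number of blocks it onlined.
online_blocks()
{
	local memdir="${1:-/sys/devices/system/memory}"
	local state count=0

	for state in "$memdir"/memory*/state; do
		# an unmatched glob stays literal, so check the file exists
		[ -f "$state" ] || continue
		# leave blocks that are already online alone
		[ "$(cat "$state")" = "offline" ] || continue
		# same policy the reconfigure-device man page describes
		echo online_movable > "$state" || return 1
		count=$((count + 1))
	done
	echo "$count"
}
```

Passing the directory as an argument is only there so the loop can be
tried against a scratch directory before pointing it at real sysfs.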

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 daxctl/builtin.h |  2 ++
 daxctl/daxctl.c  |  2 ++
 daxctl/device.c  | 88 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 92 insertions(+)

diff --git a/daxctl/builtin.h b/daxctl/builtin.h
index 756ba2a..f5a0147 100644
--- a/daxctl/builtin.h
+++ b/daxctl/builtin.h
@@ -7,4 +7,6 @@ struct daxctl_ctx;
 int cmd_list(int argc, const char **argv, struct daxctl_ctx *ctx);
 int cmd_migrate(int argc, const char **argv, struct daxctl_ctx *ctx);
 int cmd_reconfig_device(int argc, const char **argv, struct daxctl_ctx *ctx);
+int cmd_online_memory(int argc, const char **argv, struct daxctl_ctx *ctx);
+int cmd_offline_memory(int argc, const char **argv, struct daxctl_ctx *ctx);
 #endif /* _DAXCTL_BUILTIN_H_ */
diff --git a/daxctl/daxctl.c b/daxctl/daxctl.c
index e1ba7b8..1ab0732 100644
--- a/daxctl/daxctl.c
+++ b/daxctl/daxctl.c
@@ -72,6 +72,8 @@ static struct cmd_struct commands[] = {
 	{ "help", .d_fn = cmd_help },
 	{ "migrate-device-model", .d_fn = cmd_migrate },
 	{ "reconfigure-device", .d_fn = cmd_reconfig_device },
+	{ "online-memory", .d_fn = cmd_online_memory },
+	{ "offline-memory", .d_fn = cmd_offline_memory },
 };
 
 int main(int argc, const char **argv)
diff --git a/daxctl/device.c b/daxctl/device.c
index ed2af76..4887ccf 100644
--- a/daxctl/device.c
+++ b/daxctl/device.c
@@ -39,6 +39,8 @@ static unsigned long flags;
 
 enum device_action {
 	ACTION_RECONFIG,
+	ACTION_ONLINE,
+	ACTION_OFFLINE,
 };
 
 #define BASE_OPTIONS() \
@@ -59,6 +61,11 @@ static const struct option reconfig_options[] = {
 	OPT_END(),
 };
 
+static const struct option memory_options[] = {
+	BASE_OPTIONS(),
+	OPT_END(),
+};
+
 static const char *parse_device_options(int argc, const char **argv,
 		enum device_action action, const struct option *options,
 		const char *usage, struct daxctl_ctx *ctx)
@@ -79,6 +86,12 @@ static const char *parse_device_options(int argc, const char **argv,
 		case ACTION_RECONFIG:
 			action_string = "reconfigure";
 			break;
+		case ACTION_ONLINE:
+			action_string = "online memory for";
+			break;
+		case ACTION_OFFLINE:
+			action_string = "offline memory for";
+			break;
 		default:
 			action_string = "<>";
 			break;
@@ -122,6 +135,10 @@ static const char *parse_device_options(int argc, const char **argv,
 			}
 		}
 		break;
+	case ACTION_ONLINE:
+	case ACTION_OFFLINE:
+		/* nothing special */
+		break;
 	}
 	if (rc) {
 		usage_with_options(u, options);
@@ -394,6 +411,33 @@ static int do_reconfig(struct daxctl_dev *dev, enum dev_mode mode,
 	return 0;
 }
 
+static int do_xline(struct daxctl_dev *dev, enum device_action action)
+{
+	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
+	const char *devname = daxctl_dev_get_devname(dev);
+	int rc;
+
+	if (!mem) {
+		fprintf(stderr,
+			"%s: memory operations are not applicable in devdax mode\n",
+			devname);
+		return -ENXIO;
+	}
+
+	switch (action) {
+	case ACTION_ONLINE:
+		rc = dev_online_memory(dev);
+		break;
+	case ACTION_OFFLINE:
+		rc = dev_offline_memory(dev);
+		break;
+	default:
+		fprintf(stderr, "%s: invalid action: %d\n", devname, action);
+		rc = -EINVAL;
+	}
+	return rc;
+}
+
 static int do_xaction_device(const char *device, enum device_action action,
 		struct daxctl_ctx *ctx, int *processed)
 {
@@ -419,6 +463,16 @@ static int do_xaction_device(const char *device, enum device_action action,
 				if (rc == 0)
 					(*processed)++;
 				break;
+			case ACTION_ONLINE:
+				rc = do_xline(dev, action);
+				if (rc == 0)
+					(*processed)++;
+				break;
+			case ACTION_OFFLINE:
+				rc = do_xline(dev, action);
+				if (rc == 0)
+					(*processed)++;
+				break;
 			default:
 				rc = -EINVAL;
 				break;
@@ -453,3 +507,37 @@ int cmd_reconfig_device(int argc, const char **argv, struct daxctl_ctx *ctx)
 			processed == 1 ? "" : "s");
 	return rc;
 }
+
+int cmd_online_memory(int argc, const char **argv, struct daxctl_ctx *ctx)
+{
+	const char *usage = "daxctl online-memory <device> [<options>]";
+	const char *device = parse_device_options(argc, argv, ACTION_ONLINE,
+			memory_options, usage, ctx);
+	int processed, rc;
+
+	rc = do_xaction_device(device, ACTION_ONLINE, ctx, &processed);
+	if (rc < 0)
+		fprintf(stderr, "error onlining memory: %s\n",
+				strerror(-rc));
+
+	fprintf(stderr, "onlined memory for %d device%s\n", processed,
+			processed == 1 ? "" : "s");
+	return rc;
+}
+
+int cmd_offline_memory(int argc, const char **argv, struct daxctl_ctx *ctx)
+{
+	const char *usage = "daxctl offline-memory <device> [<options>]";
+	const char *device = parse_device_options(argc, argv, ACTION_OFFLINE,
+			memory_options, usage, ctx);
+	int processed, rc;
+
+	rc = do_xaction_device(device, ACTION_OFFLINE, ctx, &processed);
+	if (rc < 0)
+		fprintf(stderr, "error offlining memory: %s\n",
+				strerror(-rc));
+
+	fprintf(stderr, "offlined memory for %d device%s\n", processed,
+			processed == 1 ? "" : "s");
+	return rc;
+}
-- 
2.20.1

* [ndctl PATCH v9 10/13] Documentation: Add man pages for daxctl-{on, off}line-memory
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (8 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 09/13] daxctl: add commands to online and offline memory Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 11/13] contrib/ndctl: fix region-id completions for daxctl Vishal Verma
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Add man pages for the two new commands: daxctl-online-memory, and
daxctl-offline-memory.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 Documentation/daxctl/Makefile.am              |  4 +-
 .../daxctl/daxctl-offline-memory.txt          | 72 +++++++++++++++++
 Documentation/daxctl/daxctl-online-memory.txt | 80 +++++++++++++++++++
 3 files changed, 155 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/daxctl/daxctl-offline-memory.txt
 create mode 100644 Documentation/daxctl/daxctl-online-memory.txt

diff --git a/Documentation/daxctl/Makefile.am b/Documentation/daxctl/Makefile.am
index 715fbad..37c3bde 100644
--- a/Documentation/daxctl/Makefile.am
+++ b/Documentation/daxctl/Makefile.am
@@ -29,7 +29,9 @@ man1_MANS = \
 	daxctl.1 \
 	daxctl-list.1 \
 	daxctl-migrate-device-model.1 \
-	daxctl-reconfigure-device.1
+	daxctl-reconfigure-device.1 \
+	daxctl-online-memory.1 \
+	daxctl-offline-memory.1
 
 CLEANFILES = $(man1_MANS)
 
diff --git a/Documentation/daxctl/daxctl-offline-memory.txt b/Documentation/daxctl/daxctl-offline-memory.txt
new file mode 100644
index 0000000..ba06287
--- /dev/null
+++ b/Documentation/daxctl/daxctl-offline-memory.txt
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+
+daxctl-offline-memory(1)
+========================
+
+NAME
+----
+daxctl-offline-memory - Offline the memory for a device that is in system-ram mode
+
+SYNOPSIS
+--------
+[verse]
+'daxctl offline-memory' <dax0.0> [<dax1.0>...<daxY.Z>] [<options>]
+
+EXAMPLES
+--------
+
+* Reconfigure dax0.0 to system-ram mode
+----
+# daxctl reconfigure-device --mode=system-ram --human dax0.0
+{
+  "chardev":"dax0.0",
+  "size":"7.87 GiB (8.45 GB)",
+  "target_node":2,
+  "mode":"system-ram"
+}
+----
+
+* Offline the memory
+----
+# daxctl offline-memory dax0.0
+dax0.0: 62 sections offlined
+offlined memory for 1 device
+----
+
+DESCRIPTION
+-----------
+
+Offline the memory sections associated with a device that has been converted
+to the system-ram mode. If one or more sections are already offline, attempt
+to offline the remaining sections. If all sections were already offline, print
+a message and return success without actually doing anything.
+
+This is complementary to the 'daxctl-online-memory' command, and may be used
+to offline the memory sections without converting the device back to
+'devdax' mode.
+
+OPTIONS
+-------
+-r::
+--region=::
+	Restrict the operation to devices belonging to the specified region(s).
+	A device-dax region is a contiguous range of memory that hosts one or
+	more /dev/daxX.Y devices, where X is the region id and Y is the device
+	instance id.
+
+-u::
+--human::
+	By default the command will output machine-friendly raw-integer
+	data. Instead, with this flag, numbers representing storage size
+	will be formatted as human readable strings with units; other
+	fields are converted to hexadecimal strings.
+
+-v::
+--verbose::
+	Emit more debug messages
+
+include::../copyright.txt[]
+
+SEE ALSO
+--------
+linkdaxctl:daxctl-reconfigure-device[1],daxctl-online-memory[1]
diff --git a/Documentation/daxctl/daxctl-online-memory.txt b/Documentation/daxctl/daxctl-online-memory.txt
new file mode 100644
index 0000000..5ac1cbf
--- /dev/null
+++ b/Documentation/daxctl/daxctl-online-memory.txt
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: GPL-2.0
+
+daxctl-online-memory(1)
+=======================
+
+NAME
+----
+daxctl-online-memory - Online the memory for a device that is in system-ram mode
+
+SYNOPSIS
+--------
+[verse]
+'daxctl online-memory' <dax0.0> [<dax1.0>...<daxY.Z>] [<options>]
+
+EXAMPLES
+--------
+
+* Reconfigure dax0.0 to system-ram mode, don't online the memory
+----
+# daxctl reconfigure-device --mode=system-ram --no-online --human dax0.0
+{
+  "chardev":"dax0.0",
+  "size":"7.87 GiB (8.45 GB)",
+  "target_node":2,
+  "mode":"system-ram"
+}
+----
+
+* Online the memory separately
+----
+# daxctl online-memory dax0.0
+dax0.0: 62 new sections onlined
+onlined memory for 1 device
+----
+
+* Onlining memory when some sections were already online
+----
+# daxctl online-memory dax0.0
+dax0.0: 1 section already online
+dax0.0: 61 new sections onlined
+onlined memory for 1 device
+----
+
+DESCRIPTION
+-----------
+
+Online the memory sections associated with a device that has been converted
+to the system-ram mode. If one or more sections are already online, print a
+message about them, and attempt to online the remaining sections.
+
+This is complementary to the 'daxctl-reconfigure-device' command, when used with
+the '--no-online' option to skip onlining memory sections immediately after the
+reconfigure. In these scenarios, the memory can be onlined at a later time using
+'daxctl-online-memory'.
+
+OPTIONS
+-------
+-r::
+--region=::
+	Restrict the operation to devices belonging to the specified region(s).
+	A device-dax region is a contiguous range of memory that hosts one or
+	more /dev/daxX.Y devices, where X is the region id and Y is the device
+	instance id.
+
+-u::
+--human::
+	By default the command will output machine-friendly raw-integer
+	data. Instead, with this flag, numbers representing storage size
+	will be formatted as human readable strings with units; other
+	fields are converted to hexadecimal strings.
+
+-v::
+--verbose::
+	Emit more debug messages
+
+include::../copyright.txt[]
+
+SEE ALSO
+--------
+linkdaxctl:daxctl-reconfigure-device[1],daxctl-offline-memory[1]
-- 
2.20.1

* [ndctl PATCH v9 11/13] contrib/ndctl: fix region-id completions for daxctl
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (9 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 10/13] Documentation: Add man pages for daxctl-{on, off}line-memory Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 12/13] contrib/ndctl: add bash-completion for the new daxctl commands Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 13/13] test: Add a unit test for daxctl-reconfigure-device and friends Vishal Verma
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

The completion helpers for daxctl assumed the region arguments for
specifying daxctl regions were the same as ndctl regions, i.e.
"regionX". This is not true - daxctl region arguments are a simple
numeric 'id'.

Add a new helper __daxctl_get_regions() to complete daxctl region IDs
properly.
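
As a rough illustration of the difference, the sketch below pulls bare
numeric ids out of region JSON with the same grep pipeline the new
helper uses. The function name extract_region_ids and the sample JSON
are made up for this example; real 'daxctl list --regions' output may
carry more fields.

```shell
#!/bin/bash
# Hypothetical stand-in for __daxctl_get_regions(): reads region JSON on
# stdin and prints one numeric id per line.
extract_region_ids()
{
	# keep only the "id" lines, then strip everything but the number
	grep -E "^\s*\"id\":" | grep -Eo "[0-9]+"
}

# made-up sample, loosely shaped like 'daxctl list --regions' output
sample='[
  {
    "id":0,
    "path":"/platform/example"
  },
  {
    "id":1,
    "path":"/platform/example"
  }
]'

printf '%s\n' "$sample" | extract_region_ids
```

The completion code then offers these ids directly as --region values,
rather than the "regionX" names that ndctl uses.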

While at it, fix a useless use of 'echo' in __daxctl_get_devs() and the
quoting in __daxctl_comp_options().

Fixes: d6790a32f32c ("daxctl: Add bash-completion")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 contrib/ndctl | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/contrib/ndctl b/contrib/ndctl
index 396a344..cacee2d 100755
--- a/contrib/ndctl
+++ b/contrib/ndctl
@@ -531,8 +531,14 @@ _ndctl()
 
 __daxctl_get_devs()
 {
-	local opts="--devices $*"
-	echo "$(daxctl list $opts | grep -E "^\s*\"chardev\":" | cut -d\" -f4)"
+	local opts=("--devices" "$*")
+	daxctl list "${opts[@]}" | grep -E "^\s*\"chardev\":" | cut -d'"' -f4
+}
+
+__daxctl_get_regions()
+{
+	local opts=("--regions" "$*")
+	daxctl list "${opts[@]}" | grep -E "^\s*\"id\":" | grep -Eo "[0-9]+"
 }
 
 __daxctlcomp()
@@ -561,10 +567,10 @@ __daxctl_comp_options()
 		local cur_arg=${cur##*=}
 		case $cur_subopt in
 		--region)
-			opts=$(__ndctl_get_regions -i)
+			opts="$(__daxctl_get_regions -i)"
 			;;
 		--dev)
-			opts=$(__daxctl_get_devs -i)
+			opts="$(__daxctl_get_devs -i)"
 			;;
 		*)
 			return
-- 
2.20.1

* [ndctl PATCH v9 12/13] contrib/ndctl: add bash-completion for the new daxctl commands
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (10 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 11/13] contrib/ndctl: fix region-id completions for daxctl Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  2019-08-01  0:29 ` [ndctl PATCH v9 13/13] test: Add a unit test for daxctl-reconfigure-device and friends Vishal Verma
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Add bash completion helpers for the new daxctl-reconfigure-device,
daxctl-online-memory, and daxctl-offline-memory commands.

Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 contrib/ndctl | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/contrib/ndctl b/contrib/ndctl
index cacee2d..680fe6a 100755
--- a/contrib/ndctl
+++ b/contrib/ndctl
@@ -547,7 +547,7 @@ __daxctlcomp()
 
 	COMPREPLY=( $( compgen -W "$1" -- "$2" ) )
 	for cword in "${COMPREPLY[@]}"; do
-		if [[ "$cword" == @(--region|--dev) ]]; then
+		if [[ "$cword" == @(--region|--dev|--mode) ]]; then
 			COMPREPLY[$i]="${cword}="
 		else
 			COMPREPLY[$i]="${cword} "
@@ -572,6 +572,9 @@ __daxctl_comp_options()
 		--dev)
 			opts="$(__daxctl_get_devs -i)"
 			;;
+		--mode)
+			opts="system-ram devdax"
+			;;
 		*)
 			return
 			;;
@@ -582,8 +585,23 @@ __daxctl_comp_options()
 
 __daxctl_comp_non_option_args()
 {
-	# there aren't any commands that accept non option arguments yet
-	return
+	local subcmd=$1
+	local cur=$2
+	local opts
+
+	case $subcmd in
+	reconfigure-device)
+		;&
+	online-memory)
+		;&
+	offline-memory)
+		opts="$(__daxctl_get_devs -i) all"
+		;;
+	*)
+		return
+		;;
+	esac
+	__daxctlcomp "$opts" "$cur"
 }
 
 __daxctl_main()
-- 
2.20.1

* [ndctl PATCH v9 13/13] test: Add a unit test for daxctl-reconfigure-device and friends
  2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
                   ` (11 preceding siblings ...)
  2019-08-01  0:29 ` [ndctl PATCH v9 12/13] contrib/ndctl: add bash-completion for the new daxctl commands Vishal Verma
@ 2019-08-01  0:29 ` Vishal Verma
  12 siblings, 0 replies; 15+ messages in thread
From: Vishal Verma @ 2019-08-01  0:29 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Dave Hansen, Pavel Tatashin

Add a new unit test to test dax device reconfiguration and memory
operations. This teaches test/common about daxctl, and adds an ACPI.NFIT
bus variable. Since we have to operate on the ACPI.NFIT bus, the test is
marked as destructive.

Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 test/Makefile.am       |  3 +-
 test/common            | 19 ++++++++--
 test/daxctl-devices.sh | 81 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+), 4 deletions(-)
 create mode 100755 test/daxctl-devices.sh

diff --git a/test/Makefile.am b/test/Makefile.am
index 874c4bb..84474d0 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -49,7 +49,8 @@ TESTS +=\
 	dax.sh \
 	device-dax \
 	device-dax-fio.sh \
-	mmap.sh
+	mmap.sh \
+	daxctl-devices.sh
 
 if ENABLE_KEYUTILS
 TESTS += security.sh
diff --git a/test/common b/test/common
index 1b9d3da..1814a0c 100644
--- a/test/common
+++ b/test/common
@@ -15,12 +15,25 @@ else
 	exit 1
 fi
 
-# NFIT_TEST_BUS[01]
+# DAXCTL
 #
-NFIT_TEST_BUS0=nfit_test.0
-NFIT_TEST_BUS1=nfit_test.1
+if [ -f "../daxctl/daxctl" ] && [ -x "../daxctl/daxctl" ]; then
+	export DAXCTL=../daxctl/daxctl
+elif [ -f "./daxctl/daxctl" ] && [ -x "./daxctl/daxctl" ]; then
+	export DAXCTL=./daxctl/daxctl
+else
+	echo "Couldn't find a daxctl binary"
+	exit 1
+fi
 
 
+# NFIT_TEST_BUS[01]
+#
+NFIT_TEST_BUS0="nfit_test.0"
+NFIT_TEST_BUS1="nfit_test.1"
+ACPI_BUS="ACPI.NFIT"
+E820_BUS="e820"
+
 # Functions
 
 # err
diff --git a/test/daxctl-devices.sh b/test/daxctl-devices.sh
new file mode 100755
index 0000000..04f53f7
--- /dev/null
+++ b/test/daxctl-devices.sh
@@ -0,0 +1,81 @@
+#!/bin/bash -Ex
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2019 Intel Corporation. All rights reserved.
+
+rc=77
+. ./common
+
+trap 'cleanup $LINENO' ERR
+
+cleanup()
+{
+	printf "Error at line %d\n" "$1"
+	[[ $testdev ]] && reset_dev
+	exit $rc
+}
+
+find_testdev()
+{
+	local rc=77
+
+	# find a victim device
+	testbus="$ACPI_BUS"
+	testdev=$("$NDCTL" list -b "$testbus" -Ni | jq -er '.[0].dev | .//""')
+	if [[ ! $testdev  ]]; then
+		printf "Unable to find a victim device\n"
+		exit "$rc"
+	fi
+	printf "Found victim dev: %s on bus: %s\n" "$testdev" "$testbus"
+}
+
+setup_dev()
+{
+	test -n "$testbus"
+	test -n "$testdev"
+
+	"$NDCTL" destroy-namespace -f -b "$testbus" "$testdev"
+	testdev=$("$NDCTL" create-namespace -b "$testbus" -m devdax -fe "$testdev" -s 256M | \
+		jq -er '.dev')
+	test -n "$testdev"
+}
+
+reset_dev()
+{
+	"$NDCTL" destroy-namespace -f -b "$testbus" "$testdev"
+}
+
+daxctl_get_dev()
+{
+	"$NDCTL" list -n "$1" -X | jq -er '.[].daxregion.devices[0].chardev'
+}
+
+daxctl_get_mode()
+{
+	"$DAXCTL" list -d "$1" | jq -er '.[].mode'
+}
+
+daxctl_test()
+{
+	local daxdev
+
+	daxdev=$(daxctl_get_dev "$testdev")
+	test -n "$daxdev"
+
+	"$DAXCTL" reconfigure-device -N -m system-ram "$daxdev"
+	[[ $(daxctl_get_mode "$daxdev") == "system-ram" ]]
+	"$DAXCTL" online-memory "$daxdev"
+	"$DAXCTL" offline-memory "$daxdev"
+	"$DAXCTL" reconfigure-device -m devdax "$daxdev"
+	[[ $(daxctl_get_mode "$daxdev") == "devdax" ]]
+	"$DAXCTL" reconfigure-device -m system-ram "$daxdev"
+	[[ $(daxctl_get_mode "$daxdev") == "system-ram" ]]
+	"$DAXCTL" reconfigure-device -f -m devdax "$daxdev"
+	[[ $(daxctl_get_mode "$daxdev") == "devdax" ]]
+}
+
+find_testdev
+setup_dev
+rc=1
+daxctl_test
+reset_dev
+exit 0
-- 
2.20.1


* Re: [ndctl PATCH v9 04/13] libdaxctl: add a 'daxctl_memory' object for memory based operations
  2019-08-01  0:29 ` [ndctl PATCH v9 04/13] libdaxctl: add a 'daxctl_memory' object for memory based operations Vishal Verma
@ 2019-08-05 23:57   ` Verma, Vishal L
  0 siblings, 0 replies; 15+ messages in thread
From: Verma, Vishal L @ 2019-08-05 23:57 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: dave.hansen, pasha.tatashin

On Wed, 2019-07-31 at 18:29 -0600, Vishal Verma wrote:
> Introduce a new 'daxctl_memory' object, which will be used for
> operations related to managing dax devices in 'system-ram' mode.
> 
> Add libdaxctl APIs to get the target_node of a DAX device, and to
> online, offline, and query the state of hotplugged memory sections
> associated with a given device.
> 
> This adds the following new interfaces:
> 
>   daxctl_dev_get_target_node
>   daxctl_dev_get_memory
>   daxctl_memory_get_dev
>   daxctl_memory_get_node_path
>   daxctl_memory_get_block_size
>   daxctl_memory_online
>   daxctl_memory_offline
>   daxctl_memory_is_online
>   daxctl_memory_num_sections
> 
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> [for the memblock-already-online TOCTOU hole]
> Reported-by: Fan Du <fan.du@intel.com>
> Tested-by: Fan Du <fan.du@intel.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> ---
>  daxctl/lib/libdaxctl-private.h |  18 ++
>  daxctl/lib/libdaxctl.c         | 384 +++++++++++++++++++++++++++++++++
>  daxctl/lib/libdaxctl.sym       |   9 +
>  daxctl/libdaxctl.h             |  11 +
>  4 files changed, 422 insertions(+)
> 
[..]
> +
> +static bool memblock_in_dev(struct daxctl_dev *dev, const char *memblock)
> +{
> +	struct daxctl_memory *mem = daxctl_dev_get_memory(dev);
> 
Static analysis complains that this can potentially cause a NULL
dereference. Fix it by passing the mem object to memblock_in_dev(),
since it has already been validated by that time.
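
As a rough illustration of that pattern only (hypothetical stand-in
types and helper names, not the libdaxctl code): the caller does the
single NULL check, and the helper receives the validated object rather
than looking it up again from the device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the library objects; not the libdaxctl types. */
struct mem_obj {
	unsigned long block_size;
};

struct dev_obj {
	struct mem_obj *mem;	/* NULL when the device has no memory object */
};

/* Lookup that can legitimately return NULL. */
static struct mem_obj *dev_get_mem(struct dev_obj *dev)
{
	return dev ? dev->mem : NULL;
}

/*
 * The helper takes the already-validated object instead of a device,
 * so it has no lookup of its own that could yield NULL.
 */
static bool block_size_known(const struct mem_obj *mem)
{
	return mem->block_size != 0;
}

/* The caller performs the single NULL check before using the helper. */
static int count_if_known(struct dev_obj *dev)
{
	struct mem_obj *mem = dev_get_mem(dev);

	if (!mem)
		return -1;
	return block_size_known(mem) ? 1 : 0;
}
```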

3<----


From e8bf803e359b784259f645d1ff68e964b2c8618f Mon Sep 17 00:00:00 2001
From: Vishal Verma <vishal.l.verma@intel.com>
Date: Fri, 3 May 2019 13:27:35 -0600
Subject: [ndctl PATCH] libdaxctl: add a 'daxctl_memory' object for memory
 based operations

Introduce a new 'daxctl_memory' object, which will be used for
operations related to managing dax devices in 'system-ram' mode.

Add libdaxctl APIs to get the target_node of a DAX device, and to
online, offline, and query the state of hotplugged memory sections
associated with a given device.

This adds the following new interfaces:

  daxctl_dev_get_target_node
  daxctl_dev_get_memory
  daxctl_memory_get_dev
  daxctl_memory_get_node_path
  daxctl_memory_get_block_size
  daxctl_memory_online
  daxctl_memory_offline
  daxctl_memory_is_online
  daxctl_memory_num_sections

Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
[for the memblock-already-online TOCTOU hole]
Reported-by: Fan Du <fan.du@intel.com>
Tested-by: Fan Du <fan.du@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
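Reader's note, not part of the commit: a minimal standalone sketch of
the range test that memblock_in_dev() performs in the diff below. A
memory block's physical start is taken as phys_index * block_size and
compared against the device's [start, start + size] range. The function
name and parameter names here are illustrative assumptions; the
inclusive upper bound mirrors the comparison as written in the patch.

```c
#include <stdbool.h>

/*
 * Illustrative helper (hypothetical name): returns true when the block
 * that starts at phys_index * block_size falls within the device range.
 * The upper bound is inclusive, matching the patch's comparison, so a
 * block starting exactly at dev_start + dev_size is also reported.
 */
static bool block_in_dev_range(unsigned long phys_index,
		unsigned long block_size,
		unsigned long long dev_start,
		unsigned long long dev_size)
{
	/* widen before multiplying so 32-bit unsigned long cannot overflow */
	unsigned long long block_start =
		(unsigned long long)phys_index * block_size;
	unsigned long long dev_end = dev_start + dev_size;

	return block_start >= dev_start && block_start <= dev_end;
}
```

For example, with 128 MiB blocks and a 256 MiB device at 4 GiB, the
blocks with phys_index 32 and 33 start inside the device, and 31 and 35
do not.
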
 daxctl/lib/libdaxctl-private.h |  18 ++
 daxctl/lib/libdaxctl.c         | 384 +++++++++++++++++++++++++++++++++
 daxctl/lib/libdaxctl.sym       |   9 +
 daxctl/libdaxctl.h             |  11 +
 4 files changed, 422 insertions(+)
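
Reader's note, not part of the commit: a small sketch of the return
convention the readdir loop in daxctl_memory_op() relies on. Each
per-block operation yields a negative errno on failure, 0 when the
block should be counted, and 1 when it should be skipped (for example,
it was already in the requested state). The array-based helper below is
a hypothetical simplification for illustration; the patch iterates over
sysfs directory entries instead.

```c
#include <stddef.h>

/*
 * Hypothetical tally over per-block results:
 *   < 0  -> stop and propagate the error
 *   == 0 -> block is counted
 *   > 0  -> block is skipped
 * Returns the number of counted blocks, or the first error seen.
 */
static int tally_block_results(const int *results, size_t n)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (results[i] < 0)
			return results[i];
		if (results[i] == 0)
			count++;
	}
	return count;
}
```

Under this convention a MEM_COUNT pass, where every block yields 0,
returns the total number of blocks, which is what lets callers compare
it against the count returned by an online or offline pass.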

diff --git a/daxctl/lib/libdaxctl-private.h b/daxctl/lib/libdaxctl-private.h
index fee67d1..01091de 100644
--- a/daxctl/lib/libdaxctl-private.h
+++ b/daxctl/lib/libdaxctl-private.h
@@ -39,6 +39,13 @@ static const char *dax_modules[] = {
 	[DAXCTL_DEV_MODE_RAM] = "kmem",
 };
 
+enum memory_op {
+	MEM_SET_OFFLINE,
+	MEM_SET_ONLINE,
+	MEM_IS_ONLINE,
+	MEM_COUNT,
+};
+
 /**
  * struct daxctl_region - container for dax_devices
  */
@@ -70,8 +77,19 @@ struct daxctl_dev {
 	struct kmod_module *module;
 	struct kmod_list *kmod_list;
 	struct daxctl_region *region;
+	struct daxctl_memory *mem;
+	int target_node;
+};
+
+struct daxctl_memory {
+	struct daxctl_dev *dev;
+	void *mem_buf;
+	size_t buf_len;
+	char *node_path;
+	unsigned long block_size;
 };
 
+
 static inline int check_kmod(struct kmod_ctx *kmod_ctx)
 {
 	return kmod_ctx ? 0 : -ENXIO;
diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index aa0d2f2..bcc77b6 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -200,6 +200,15 @@ DAXCTL_EXPORT void daxctl_region_get_uuid(struct daxctl_region *region, uuid_t u
 	uuid_copy(uu, region->uuid);
 }
 
+static void free_mem(struct daxctl_dev *dev)
+{
+	if (dev && dev->mem) {
+		free(dev->mem->node_path);
+		free(dev->mem);
+		dev->mem = NULL;
+	}
+}
+
 static void free_dev(struct daxctl_dev *dev, struct list_head *head)
 {
 	if (head)
@@ -207,6 +216,7 @@ static void free_dev(struct daxctl_dev *dev, struct list_head *head)
 	kmod_module_unref_list(dev->kmod_list);
 	free(dev->dev_buf);
 	free(dev->dev_path);
+	free_mem(dev);
 	free(dev);
 }
 
@@ -380,6 +390,94 @@ static struct kmod_list *to_module_list(struct daxctl_ctx *ctx,
 	return list;
 }
 
+static int dev_is_system_ram_capable(struct daxctl_dev *dev)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	char *mod_path, *mod_base;
+	char path[200];
+	const int len = sizeof(path);
+
+	if (!device_model_is_dax_bus(dev))
+		return false;
+
+	if (!daxctl_dev_is_enabled(dev))
+		return false;
+
+	if (snprintf(path, len, "%s/driver/module", dev->dev_path) >= len) {
+		err(ctx, "%s: buffer too small!\n", devname);
+		return false;
+	}
+
+	mod_path = realpath(path, NULL);
+	if (!mod_path)
+		return false;
+
+	mod_base = basename(mod_path);
+	if (strcmp(mod_base, dax_modules[DAXCTL_DEV_MODE_RAM]) == 0) {
+		free(mod_path);
+		return true;
+	}
+
+	free(mod_path);
+	return false;
+}
+
+/*
+ * This checks for the device to be in system-ram mode, so calling
+ * daxctl_dev_get_memory() on a devdax mode device will always return NULL.
+ */
+static struct daxctl_memory *daxctl_dev_alloc_mem(struct daxctl_dev *dev)
+{
+	const char *size_path = "/sys/devices/system/memory/block_size_bytes";
+	const char *node_base = "/sys/devices/system/node/node";
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	struct daxctl_memory *mem;
+	char buf[SYSFS_ATTR_SIZE];
+	int node_num;
+
+	if (!dev_is_system_ram_capable(dev))
+		return NULL;
+
+	mem = calloc(1, sizeof(*mem));
+	if (!mem)
+		return NULL;
+
+	mem->dev = dev;
+
+	if (sysfs_read_attr(ctx, size_path, buf) == 0) {
+		mem->block_size = strtoul(buf, NULL, 16);
+		if (mem->block_size == 0 || mem->block_size == ULONG_MAX) {
+			err(ctx, "%s: Unable to determine memblock size: %s\n",
+				devname, strerror(errno));
+			mem->block_size = 0;
+		}
+	}
+
+	node_num = daxctl_dev_get_target_node(dev);
+	if (node_num >= 0) {
+		if (asprintf(&mem->node_path, "%s%d", node_base,
+				node_num) < 0) {
+			err(ctx, "%s: Unable to set node_path\n", devname);
+			goto err_mem;
+		}
+	}
+
+	mem->mem_buf = calloc(1, strlen(node_base) + 256);
+	if (!mem->mem_buf)
+		goto err_node;
+	mem->buf_len = strlen(node_base) + 256;
+
+	return mem;
+
+err_node:
+	free(mem->node_path);
+err_mem:
+	free(mem);
+	return NULL;
+}
+
 static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 {
 	const char *devname = devpath_to_devname(daxdev_base);
@@ -435,6 +533,12 @@ static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
 	if (rc == 0)
 		dev->kmod_list = to_module_list(ctx, buf);
 
+	sprintf(path, "%s/target_node", daxdev_base);
+	if (sysfs_read_attr(ctx, path, buf) == 0)
+		dev->target_node = strtol(buf, NULL, 0);
+	else
+		dev->target_node = -1;
+
 	daxctl_dev_foreach(region, dev_dup)
 		if (dev_dup->id == dev->id) {
 			free_dev(dev, NULL);
@@ -862,6 +966,9 @@ DAXCTL_EXPORT int daxctl_dev_disable(struct daxctl_dev *dev)
 	if (!daxctl_dev_is_enabled(dev))
 		return 0;
 
+	/* If there is a memory object, first free that */
+	free_mem(dev);
+
 	daxctl_unbind(ctx, dev->dev_path);
 
 	if (daxctl_dev_is_enabled(dev)) {
@@ -944,3 +1051,280 @@ DAXCTL_EXPORT unsigned long long daxctl_dev_get_size(struct daxctl_dev *dev)
 {
 	return dev->size;
 }
+
+DAXCTL_EXPORT int daxctl_dev_get_target_node(struct daxctl_dev *dev)
+{
+	return dev->target_node;
+}
+
+DAXCTL_EXPORT struct daxctl_memory *daxctl_dev_get_memory(struct daxctl_dev *dev)
+{
+	if (dev->mem)
+		return dev->mem;
+	else
+		return daxctl_dev_alloc_mem(dev);
+}
+
+DAXCTL_EXPORT struct daxctl_dev *daxctl_memory_get_dev(struct daxctl_memory *mem)
+{
+	return mem->dev;
+}
+
+DAXCTL_EXPORT const char *daxctl_memory_get_node_path(struct daxctl_memory *mem)
+{
+	return mem->node_path;
+}
+
+DAXCTL_EXPORT unsigned long daxctl_memory_get_block_size(struct daxctl_memory *mem)
+{
+	return mem->block_size;
+}
+
+static int online_one_memblock(struct daxctl_dev *dev, char *path)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *mode = "online_movable";
+	char buf[SYSFS_ATTR_SIZE];
+	int rc;
+
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc) {
+		err(ctx, "%s: Failed to read %s: %s\n",
+			devname, path, strerror(-rc));
+		return rc;
+	}
+
+	/*
+	 * if already online, possibly due to kernel config or a udev rule,
+	 * there is nothing to do and we can skip over the memblock
+	 */
+	if (strncmp(buf, "online", 6) == 0)
+		return 1;
+
+	rc = sysfs_write_attr_quiet(ctx, path, mode);
+	if (rc) {
+		/*
+		 * While we performed an already-online check above, there
+		 * is still a TOCTOU hole where someone (such as a udev rule)
+		 * may have raced to online the memory. In such a case,
+		 * the sysfs store will fail, however we can check for this
+		 * by simply reading the state again. If it changed to the
+		 * desired state, then we don't have to error out.
+		 */
+		if (sysfs_read_attr(ctx, path, buf) == 0) {
+			if (strncmp(buf, "online", 6) == 0)
+				return 1;
+		}
+		err(ctx, "%s: Failed to online %s: %s\n",
+			devname, path, strerror(-rc));
+	}
+	return rc;
+}
+
+static int offline_one_memblock(struct daxctl_dev *dev, char *path)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *mode = "offline";
+	char buf[SYSFS_ATTR_SIZE];
+	int rc;
+
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc) {
+		err(ctx, "%s: Failed to read %s: %s\n",
+			devname, path, strerror(-rc));
+		return rc;
+	}
+
+	/* if already offline, there is nothing to do */
+	if (strncmp(buf, "offline", 7) == 0)
+		return 1;
+
+	rc = sysfs_write_attr_quiet(ctx, path, mode);
+	if (rc) {
+		/* Close the TOCTOU hole like in online_one_memblock() above */
+		if (sysfs_read_attr(ctx, path, buf) == 0) {
+			if (strncmp(buf, "offline", 7) == 0)
+				return 1;
+		}
+		err(ctx, "%s: Failed to offline %s: %s\n",
+			devname, path, strerror(-rc));
+	}
+	return rc;
+}
+
+static int memblock_is_online(struct daxctl_dev *dev, char *path)
+{
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	char buf[SYSFS_ATTR_SIZE];
+	int rc;
+
+	rc = sysfs_read_attr(ctx, path, buf);
+	if (rc) {
+		err(ctx, "%s: Failed to read %s: %s\n",
+			devname, path, strerror(-rc));
+		return rc;
+	}
+
+	if (strncmp(buf, "online", 6) == 0)
+		return 1;
+
+	/* offline */
+	return 0;
+}
+
+static bool memblock_in_dev(struct daxctl_memory *mem, const char *memblock)
+{
+	const char *mem_base = "/sys/devices/system/memory/";
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
+	unsigned long long memblock_res, dev_start, dev_end;
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	unsigned long memblock_size;
+	int path_len = mem->buf_len;
+	char buf[SYSFS_ATTR_SIZE];
+	unsigned long phys_index;
+	char *path = mem->mem_buf;
+
+	if (snprintf(path, path_len, "%s/%s/phys_index",
+			mem_base, memblock) < 0)
+		return false;
+
+	if (sysfs_read_attr(ctx, path, buf) == 0) {
+		phys_index = strtoul(buf, NULL, 16);
+		if (phys_index == 0 || phys_index == ULONG_MAX) {
+			err(ctx, "%s: %s: Unable to determine phys_index: %s\n",
+				devname, memblock, strerror(errno));
+			return false;
+		}
+	} else {
+		err(ctx, "%s: %s: Unable to determine phys_index: %s\n",
+			devname, memblock, strerror(errno));
+		return false;
+	}
+
+	dev_start = daxctl_dev_get_resource(dev);
+	if (!dev_start) {
+		err(ctx, "%s: Unable to determine resource\n", devname);
+		return false;
+	}
+	dev_end = dev_start + daxctl_dev_get_size(dev);
+
+	memblock_size = daxctl_memory_get_block_size(mem);
+	if (!memblock_size) {
+		err(ctx, "%s: Unable to determine memory block size\n",
+			devname);
+		return false;
+	}
+	memblock_res = phys_index * memblock_size;
+
+	if (memblock_res >= dev_start && memblock_res <= dev_end)
+		return true;
+
+	return false;
+}
+
+static int op_for_one_memblock(struct daxctl_memory *mem, char *path,
+		enum memory_op op)
+{
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	int rc;
+
+	switch (op) {
+	case MEM_SET_ONLINE:
+		return online_one_memblock(dev, path);
+	case MEM_SET_OFFLINE:
+		return offline_one_memblock(dev, path);
+	case MEM_IS_ONLINE:
+		rc = memblock_is_online(dev, path);
+		if (rc < 0)
+			return rc;
+		/*
+		 * Retain the 'normal' semantics for if (memblock_is_online()),
+		 * but since count needs rc == 0, we'll just flip rc for this op
+		 */
+		return !rc;
+	case MEM_COUNT:
+		return 0;
+	}
+
+	err(ctx, "%s: BUG: unknown op: %d\n", devname, op);
+	return -EINVAL;
+}
+
+static int daxctl_memory_op(struct daxctl_memory *mem, enum memory_op op)
+{
+	struct daxctl_dev *dev = daxctl_memory_get_dev(mem);
+	const char *devname = daxctl_dev_get_devname(dev);
+	struct daxctl_ctx *ctx = daxctl_dev_get_ctx(dev);
+	const char *node_path;
+	int rc, count = 0;
+	struct dirent *de;
+	DIR *node_dir;
+
+	node_path = daxctl_memory_get_node_path(mem);
+	if (!node_path) {
+		err(ctx, "%s: Failed to get node_path\n", devname);
+		return -ENXIO;
+	}
+
+	node_dir = opendir(node_path);
+	if (!node_dir)
+		return -errno;
+
+	errno = 0;
+	while ((de = readdir(node_dir)) != NULL) {
+		char *path = mem->mem_buf;
+		int len = mem->buf_len;
+
+		if (strncmp(de->d_name, "memory", 6) == 0) {
+			if (!memblock_in_dev(mem, de->d_name))
+				continue;
+			rc = snprintf(path, len, "%s/%s/state",
+				node_path, de->d_name);
+			if (rc < 0) {
+				rc = -ENOMEM;
+				goto out_dir;
+			}
+			rc = op_for_one_memblock(mem, path, op);
+			if (rc < 0)
+				goto out_dir;
+			if (rc == 0)
+				count++;
+		}
+		errno = 0;
+	}
+	if (errno) {
+		rc = -errno;
+		goto out_dir;
+	}
+	rc = count;
+
+out_dir:
+	closedir(node_dir);
+	return rc;
+}
+
+DAXCTL_EXPORT int daxctl_memory_online(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_SET_ONLINE);
+}
+
+DAXCTL_EXPORT int daxctl_memory_offline(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_SET_OFFLINE);
+}
+
+DAXCTL_EXPORT int daxctl_memory_is_online(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_IS_ONLINE);
+}
+
+DAXCTL_EXPORT int daxctl_memory_num_sections(struct daxctl_memory *mem)
+{
+	return daxctl_memory_op(mem, MEM_COUNT);
+}
diff --git a/daxctl/lib/libdaxctl.sym b/daxctl/lib/libdaxctl.sym
index 1692624..bc18604 100644
--- a/daxctl/lib/libdaxctl.sym
+++ b/daxctl/lib/libdaxctl.sym
@@ -59,4 +59,13 @@ global:
 	daxctl_dev_enable_devdax;
 	daxctl_dev_enable_ram;
 	daxctl_dev_get_resource;
+	daxctl_dev_get_target_node;
+	daxctl_dev_get_memory;
+	daxctl_memory_get_dev;
+	daxctl_memory_get_node_path;
+	daxctl_memory_get_block_size;
+	daxctl_memory_online;
+	daxctl_memory_offline;
+	daxctl_memory_is_online;
+	daxctl_memory_num_sections;
 } LIBDAXCTL_5;
diff --git a/daxctl/libdaxctl.h b/daxctl/libdaxctl.h
index adf55f3..fb6c3b1 100644
--- a/daxctl/libdaxctl.h
+++ b/daxctl/libdaxctl.h
@@ -73,6 +73,17 @@ int daxctl_dev_is_enabled(struct daxctl_dev *dev);
 int daxctl_dev_disable(struct daxctl_dev *dev);
 int daxctl_dev_enable_devdax(struct daxctl_dev *dev);
 int daxctl_dev_enable_ram(struct daxctl_dev *dev);
+int daxctl_dev_get_target_node(struct daxctl_dev *dev);
+
+struct daxctl_memory;
+struct daxctl_memory *daxctl_dev_get_memory(struct daxctl_dev *dev);
+struct daxctl_dev *daxctl_memory_get_dev(struct daxctl_memory *mem);
+const char *daxctl_memory_get_node_path(struct daxctl_memory *mem);
+unsigned long daxctl_memory_get_block_size(struct daxctl_memory *mem);
+int daxctl_memory_online(struct daxctl_memory *mem);
+int daxctl_memory_offline(struct daxctl_memory *mem);
+int daxctl_memory_is_online(struct daxctl_memory *mem);
+int daxctl_memory_num_sections(struct daxctl_memory *mem);
 
 #define daxctl_dev_foreach(region, dev) \
         for (dev = daxctl_dev_get_first(region); \
-- 
2.20.1



Thread overview: 15+ messages
2019-08-01  0:29 [ndctl PATCH v9 00/13] daxctl: add a new reconfigure-device command Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 01/13] libdaxctl: add interfaces to get ctx and check device state Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 02/13] libdaxctl: add interfaces to enable/disable devices Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 03/13] libdaxctl: add an interface to retrieve the device resource Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 04/13] libdaxctl: add a 'daxctl_memory' object for memory based operations Vishal Verma
2019-08-05 23:57   ` Verma, Vishal L
2019-08-01  0:29 ` [ndctl PATCH v9 05/13] daxctl/list: add target_node for device listings Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 06/13] daxctl/list: display the mode for a dax device Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 07/13] daxctl: add a new reconfigure-device command Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 08/13] Documentation/daxctl: add a man page for daxctl-reconfigure-device Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 09/13] daxctl: add commands to online and offline memory Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 10/13] Documentation: Add man pages for daxctl-{on, off}line-memory Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 11/13] contrib/ndctl: fix region-id completions for daxctl Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 12/13] contrib/ndctl: add bash-completion for the new daxctl commands Vishal Verma
2019-08-01  0:29 ` [ndctl PATCH v9 13/13] test: Add a unit test for daxctl-reconfigure-device and friends Vishal Verma
