nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of
@ 2018-02-09  8:02 QI Fuli
  2018-02-09  8:02 ` [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to QI Fuli
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: QI Fuli @ 2018-02-09  8:02 UTC (permalink / raw)
  To: linux-nvdimm

This is a patch set of ndctl monitor, a tiny daemon to monitor the smart
events of nvdimm dimms. When a smart event fires, monitor will output
the notification which including dimm health status to syslog or a
special file according to users' configuration. The output notification
follows json format and can be consumed by log collectors like Fluentd.

Currently, I implemeted the following four commands to control monitor daemon.
$ndctl create-monitor
$ndctl list-monitor
$ndctl show-monitor
$ndclt destroy-monitor

I will appreciate if you could give some comments.

Change log since v2:
 - Changing the interface of daemon to the ndctl command line
 - Changing the name of daemon form "nvdimmd" to "monitor"
 - Removing the config file, unit_file, nvdimmd dir
 - Removing nvdimmd_test program
 - Adding ndctl/monitor.c

Change log since v1:
 - Adding a config file(/etc/nvdimmd/nvdimmd.conf)
 - Using struct log_ctx instead of syslog()
    - Using log_syslog() to save the notify messages to syslog
    - Using log_file() to save the notify messages to special file
 - Adding LOG_NOTICE level to log_priority
 - Using automake instead of Makefile
 - Adding a new util file(nvdimmd/util.c) including helper functions
   needed for nvdimm daemon
 - Adding nvdimmd_test program

QI Fuli (5):
  ndctl: monitor: add LOG_NOTICE level to log_priority
  ndctl: monitor: add ndclt create-monitor command
  ndctl: monitor: add ndclt list-monitor command
  ndctl: monitor: add ndclt show-monitor command
  ndctl: monitor: add ndclt destroy-monitor command

 builtin.h         |   4 +
 configure.ac      |   3 +
 ndctl/Makefile.am |   3 +-
 ndctl/monitor.c   | 463 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ndctl/ndctl.c     |   4 +
 util/log.c        |   2 +
 util/log.h        |   3 +
 7 files changed, 481 insertions(+), 1 deletion(-)
 create mode 100644 ndctl/monitor.c

-- 
2.9.5


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to
  2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
@ 2018-02-09  8:02 ` QI Fuli
  2018-02-11 20:18   ` Dan Williams
  2018-02-09  8:02 ` [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command QI Fuli
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: QI Fuli @ 2018-02-09  8:02 UTC (permalink / raw)
  To: linux-nvdimm

This patch adds LOG_NOTICE level to log_priority for the notifications
of nvdimm dimm over threshold event. Currently, the LOG_INFO level
is too low, and the notifications are not up to LOG_ERROR level.

Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com>
---
 util/log.c | 2 ++
 util/log.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/util/log.c b/util/log.c
index cc4d7d8..c60ca33 100644
--- a/util/log.c
+++ b/util/log.c
@@ -50,6 +50,8 @@ static int log_priority(const char *priority)
 		return LOG_INFO;
 	if (strncmp(priority, "debug", 5) == 0)
 		return LOG_DEBUG;
+	if (strncmp(priority, "notice", 6) == 0)
+		return LOG_NOTICE;
 	return 0;
 }
 
diff --git a/util/log.h b/util/log.h
index 5fc56e8..495e0d3 100644
--- a/util/log.h
+++ b/util/log.h
@@ -48,15 +48,18 @@ do { \
 #  endif
 #  define log_info(ctx, arg...) log_cond(ctx, LOG_INFO, ## arg)
 #  define log_err(ctx, arg...) log_cond(ctx, LOG_ERR, ## arg)
+#  define log_notice(ctx, arg...) log_cond(ctx, LOG_NOTICE, ## arg)
 #else
 #  define log_dbg(ctx, arg...) log_null(ctx, ## arg)
 #  define log_info(ctx, arg...) log_null(ctx, ## arg)
 #  define log_err(ctx, arg...) log_null(ctx, ## arg)
+#  define log_notice(ctx, arg...) log_null(ctx, ## arg)
 #endif
 
 #define dbg(x, arg...) log_dbg(&(x)->ctx, ## arg)
 #define info(x, arg...) log_info(&(x)->ctx, ## arg)
 #define err(x, arg...) log_err(&(x)->ctx, ## arg)
+#define notice(x, arg...) log_notice(&(x)->ctx, ## arg)
 
 #ifndef HAVE_SECURE_GETENV
 #  ifdef HAVE___SECURE_GETENV
-- 
2.9.5


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
  2018-02-09  8:02 ` [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to QI Fuli
@ 2018-02-09  8:02 ` QI Fuli
  2018-02-11 20:48   ` Dan Williams
  2018-02-09  8:02 ` [RFC PATCH v3 3/5] ndctl: monitor: add ndclt list-monitor command QI Fuli
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: QI Fuli @ 2018-02-09  8:02 UTC (permalink / raw)
  To: linux-nvdimm

This patch is used to add $ndctl create-monitor command, by which users can
create a new monitor. Users can select the DIMMS to be monitored by using
[--dimm] [--bus] [--region] [--namespace] options. The notifications can
be outputed to a special file or syslog by using [--output] option, the
special file will be placed under /var/log/ndctl. A name is also required for
a monitor,so users can destroy the monitor by the name. When a monitor is
created successfully, a file with same name will be created under
/var/ndctl/monitor.
Example: 
#ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log

Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com>
---
 builtin.h         |   1 +
 configure.ac      |   3 +
 ndctl/Makefile.am |   3 +-
 ndctl/monitor.c   | 342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ndctl/ndctl.c     |   1 +
 5 files changed, 349 insertions(+), 1 deletion(-)
 create mode 100644 ndctl/monitor.c

diff --git a/builtin.h b/builtin.h
index 5e1b7ef..850f6a8 100644
--- a/builtin.h
+++ b/builtin.h
@@ -36,6 +36,7 @@ int cmd_write_labels(int argc, const char **argv, void *ctx);
 int cmd_init_labels(int argc, const char **argv, void *ctx);
 int cmd_check_labels(int argc, const char **argv, void *ctx);
 int cmd_inject_error(int argc, const char **argv, void *ctx);
+int cmd_create_monitor(int argc, const char **argv, void *ctx);
 int cmd_list(int argc, const char **argv, void *ctx);
 #ifdef ENABLE_TEST
 int cmd_test(int argc, const char **argv, void *ctx);
diff --git a/configure.ac b/configure.ac
index 70ba360..e859e04 100644
--- a/configure.ac
+++ b/configure.ac
@@ -160,6 +160,9 @@ AC_CONFIG_FILES([
         Documentation/daxctl/Makefile
 ])
 
+AC_CONFIG_COMMANDS([monitorlogdir], [$MKDIR_P /var/log/ndctl])
+AC_CONFIG_COMMANDS([monitorprocdir], [$MKDIR_P /var/ndctl/monitor])
+
 AC_OUTPUT
 AC_MSG_RESULT([
         $PACKAGE $VERSION
diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am
index 6677607..d9a484d 100644
--- a/ndctl/Makefile.am
+++ b/ndctl/Makefile.am
@@ -13,7 +13,8 @@ ndctl_SOURCES = ndctl.c \
 		test.c \
 		../util/json.c \
 		util/json-smart.c \
-		inject-error.c
+		inject-error.c \
+		monitor.c
 
 if ENABLE_DESTRUCTIVE
 ndctl_SOURCES += ../test/blk_namespaces.c \
diff --git a/ndctl/monitor.c b/ndctl/monitor.c
new file mode 100644
index 0000000..cf1cd6e
--- /dev/null
+++ b/ndctl/monitor.c
@@ -0,0 +1,342 @@
+/*
+ * Copyright (c) 2018, FUJITSU LIMITED. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU Lesser General Public License,
+ * version 2.1, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT ANY
+ * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ * FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for
+ * more details.
+ */
+#include <stdio.h>
+#include <json-c/json.h>
+#include <signal.h>
+#include <dirent.h>
+#include <util/parse-options.h>
+#include <util/log.h>
+#include <util/filter.h>
+#include <util/json.h>
+#include <ndctl/lib/private.h>
+#include <ndctl/libndctl.h>
+#include <sys/stat.h>
+#define NUM_MAX_DIMM 1024
+#define BUF_SIZE 4096
+
+struct monitor_dimm {
+	struct ndctl_dimm *dimm;
+	const char *devname;
+	int health_eventfd;
+};
+
+static struct parameter {
+	const char *bus;
+	const char *region;
+	const char *dimm;
+	const char *namespace;
+	const char *event;
+	const char *output;
+	const char *monitor;
+	bool all;
+} param;
+
+static const char *proc_path = "/var/ndctl/monitor/";
+
+static char *get_full_path_filename(const char *path, const char *name)
+{
+	char *filename;
+	int len = strlen(path) + strlen (name) +1;
+	filename = (char *) malloc(len);
+	if (!filename)
+		return NULL;
+	filename[0] = '\0';
+	strcpy(filename, path);
+	strcat(filename, name);
+	return filename;
+}
+
+static void log_syslog(struct ndctl_ctx *ctx, int priority, const char *file,
+	int line, const char *fn, const char *format, va_list args)
+{
+	char *buf;
+	buf = (char *)malloc(BUF_SIZE);
+	if (!buf) {
+		syslog(LOG_ERR, "could not get memory for log_syslog\n");
+		exit(EXIT_FAILURE);
+	}
+	vsnprintf(buf, BUF_SIZE, format, args);
+	syslog(priority, "%s", buf);
+	free(buf);
+}
+
+static void log_output(struct ndctl_ctx *ctx, int priority, const char *file,
+	int line, const char *fn, const char *format, va_list args)
+{
+	FILE *f;
+	char *filename, *buf;
+	const char *log_path = "/var/log/ndctl/";
+	filename = get_full_path_filename(log_path, param.output);
+
+	f = fopen(filename, "a+");
+	if (!f) {
+		syslog(LOG_ERR, "open %s failed\n", filename);
+		exit(EXIT_FAILURE);
+	}
+
+	buf = (char *)malloc(BUF_SIZE);
+	if (!buf) {
+		syslog(LOG_ERR, "could not get memory for log_output\n");
+		exit(EXIT_FAILURE);
+	}
+	vsnprintf(buf, BUF_SIZE, format, args);
+	fprintf(f, "%s\n", buf);
+	free(buf);
+	fclose(f);
+}
+
+static int notify_json_msg(struct ndctl_ctx *ctx, struct monitor_dimm *m_dimm)
+{
+	time_t c_time;
+	char date[32];
+	struct json_object *jmsg, *jdatetime, *jpid, *jdimm, *jhealth;
+
+	jmsg = json_object_new_object();
+
+	c_time = time(NULL);
+	strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", localtime(&c_time));
+	jdatetime = json_object_new_string(date);
+	json_object_object_add(jmsg, "datetime", jdatetime);
+
+	jpid = json_object_new_int((int)getpid());
+	json_object_object_add(jmsg, "pid", jpid);
+
+	jdimm = util_dimm_to_json(m_dimm->dimm, 0);
+	json_object_object_add(jmsg, "dimm", jdimm);
+
+	jhealth = util_dimm_health_to_json(m_dimm->dimm);
+	if (jhealth)
+		json_object_object_add(jdimm, "health", jhealth);
+
+	notice(ctx, "%s",
+		json_object_to_json_string_ext(jmsg, JSON_C_TO_STRING_PLAIN));
+	return 0;
+}
+
+#define add_param_to_json(field) \
+if (param.field) { \
+	j##field = json_object_new_string(param.field); \
+	json_object_object_add(jmonitors, #field, j##field); \
+}
+
+static int create_monitor_proc(struct ndctl_ctx *ctx)
+{
+	time_t c_time;
+	char date[32];
+	char *filename;
+	struct json_object *jmonitors, *jmonitor, *jpid, *jcreate, *jbus,
+		*jdimm, *jregion, *jnamespace, *joutput, *jevent;
+	jmonitors = json_object_new_object();
+	add_param_to_json(monitor);
+	c_time = time(NULL);
+	strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", localtime(&c_time));
+	jcreate = json_object_new_string(date);
+	json_object_object_add(jmonitors, "create", jcreate);
+	jpid = json_object_new_int((int)getpid());
+	json_object_object_add(jmonitors, "pid", jpid);
+	add_param_to_json(dimm);
+	add_param_to_json(bus);
+	add_param_to_json(region);
+	add_param_to_json(namespace);
+	add_param_to_json(event);
+	add_param_to_json(output);
+
+	filename = get_full_path_filename(proc_path, param.monitor);
+	return json_object_to_file(filename, jmonitors);
+}
+
+static int destroy_monitor_proc(void)
+{
+	char *filename;
+	filename = get_full_path_filename(proc_path, param.monitor);
+	return remove(filename);
+}
+
+static int set_monitor_dimm(struct ndctl_ctx *ctx, fd_set *fds, int *maxfd,
+				struct monitor_dimm *m_dimm)
+{
+	struct ndctl_bus *bus;
+	int fd, num_dimm = 0;
+	char buf[BUF_SIZE];
+
+	ndctl_bus_foreach(ctx, bus) {
+		struct ndctl_dimm *dimm;
+		if(!util_bus_filter(bus, param.bus)
+			|| !util_bus_filter_by_dimm(bus, param.dimm)
+			|| !util_bus_filter_by_region(bus, param.region)
+			|| !util_bus_filter_by_namespace(bus, param.namespace))
+			continue;
+		ndctl_dimm_foreach(bus, dimm) {
+			if(!util_dimm_filter(dimm, param.dimm)
+				|| !util_dimm_filter_by_region(dimm,
+						param.region)
+				|| !util_dimm_filter_by_namespace(dimm,
+						param.namespace))
+				continue;
+			if (!ndctl_dimm_is_cmd_supported(dimm,
+					ND_CMD_SMART_THRESHOLD))
+				continue;
+			m_dimm[num_dimm].dimm = dimm;
+			m_dimm[num_dimm].devname = ndctl_dimm_get_devname(dimm);
+			fd = ndctl_dimm_get_health_eventfd(dimm);
+			pread(fd, buf, sizeof(buf), 0);
+			m_dimm[num_dimm].health_eventfd = fd;
+
+			if (fds)
+				FD_SET(fd, fds);
+			if (maxfd) {
+				if (*maxfd < fd)
+					*maxfd = fd;
+			}
+			num_dimm++;
+		}
+	}
+	return num_dimm;
+}
+
+static bool check_monitor_exist(void)
+{
+	FILE *f;
+	char *filename;
+	filename = get_full_path_filename(proc_path, param.monitor);
+	f = fopen(filename, "r");
+	if (!f)
+		return false;
+	fclose(f);
+	return true;
+}
+static int monitor_dimm_event(struct ndctl_ctx *ctx)
+{
+	int rc, maxfd, num_dimm;
+	struct monitor_dimm *m_dimm;
+	char buf[BUF_SIZE];
+	m_dimm = calloc(NUM_MAX_DIMM, sizeof(struct monitor_dimm));
+	if (!m_dimm) {
+		error("monitor_dimm memory space cannot be allocated\n");
+		goto out;
+	}
+
+	fd_set fds;
+	FD_ZERO(&fds);
+
+	num_dimm = set_monitor_dimm(ctx, &fds, &maxfd, m_dimm);
+	if (num_dimm == 0) {
+		error("no monitor dimms can be found\n");
+		goto out;
+	}
+
+
+	if (daemon(0, 0) != 0) {
+		err(ctx, "daemon start failed\n");
+		goto out;
+	}
+
+	printf("ndctl create-monitor %s started\n", param.monitor);
+	if (create_monitor_proc(ctx) != 0) {
+		err(ctx, "daemon start failed\n");
+		goto out;
+	}
+
+	while(1){
+		rc = select(maxfd + 1, NULL, NULL, &fds, NULL);
+		if (rc < 1) {
+			if (rc == 0)
+				err(ctx, "select unexpected timeout\n");
+			else
+				err(ctx, "select %s\n", strerror(errno));
+			goto out_clean;
+		}
+		for (int i = 0; i < num_dimm; i++) {
+			if (!FD_ISSET(m_dimm[i].health_eventfd, &fds)){
+				FD_SET(m_dimm[i].health_eventfd, &fds);
+				continue;
+			}
+			if (notify_json_msg(ctx, &m_dimm[i]) != 0)
+				goto out_clean;
+			pread(m_dimm[i].health_eventfd, buf, sizeof(buf), 0);
+		}
+	}
+	free(m_dimm);
+	return 0;
+out:
+	printf("ndctl create-monitor %s failed\n", param.monitor);
+	return 1;
+
+out_clean:
+	err(ctx, "ndctl monitor %s stoped\n", param.monitor);
+	if (destroy_monitor_proc() != 0)
+		err(ctx, "failed to clean temp file");
+	return 1;
+
+}
+
+int cmd_create_monitor(int argc, const char **argv, void *ctx)
+{
+	const struct option options[] = {
+		OPT_STRING('b', "bus", &param.bus, "bus-id", "filter by bus"),
+		OPT_STRING('r', "region", &param.region, "region-id",
+				"filter by region"),
+		OPT_STRING('d', "dimm", &param.dimm, "dimm-id",
+				"filter by dimm"),
+		OPT_STRING('n', "namespace", &param.namespace,
+				"namespace-id", "filter by namespace id"),
+		OPT_STRING('e', "event", &param.event, "event-id",
+				"filter by event"),
+		OPT_STRING('o', "output", &param.output, "output file name",
+				"monitor output file name"),
+		OPT_STRING('m', "monitor", &param.monitor, "monitor name",
+				"monitor name")
+	};
+	const char * const u[] = {
+		"ndctl create-monitor <monitor> <output> [<options>]",
+		NULL
+	};
+	argc = parse_options(argc, argv, options, u, 0);
+	for (int i = 0; i < argc; i++) {
+		error("unknown parameter \"%s\"\n", argv[i]);
+		goto out;
+	}
+	if (!param.monitor) {
+		error("monitor name --monitor is required\n");
+		goto out;
+	}
+	if (check_monitor_exist()) {
+		error("monitor %s is exist\n", param.monitor);
+		goto out;
+	}
+	if (!param.output) {
+		error("output file name --output is required\n");
+		goto out;
+	}
+
+	ndctl_set_log_priority((struct ndctl_ctx*)ctx, LOG_NOTICE);
+
+	if (strcmp(param.output, "syslog") == 0)
+		ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_syslog);
+	else
+		ndctl_set_log_fn((struct ndctl_ctx*)ctx, log_output);
+
+	if (monitor_dimm_event((struct ndctl_ctx*)ctx) != 0)
+		goto out;
+
+	char *filename;
+	filename = get_full_path_filename(proc_path, param.monitor);
+	if (remove(filename) != 0) {
+		err((struct ndctl_ctx*)ctx, "failed to delete %s\n", filename);
+		goto out;
+	}
+	return 0;
+
+out:
+	return 1;
+}
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index 0f748e1..6c63d79 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -84,6 +84,7 @@ static struct cmd_struct commands[] = {
 	{ "init-labels", cmd_init_labels },
 	{ "check-labels", cmd_check_labels },
 	{ "inject-error", cmd_inject_error },
+	{ "create-monitor", cmd_create_monitor },
 	{ "list", cmd_list },
 	{ "help", cmd_help },
 	#ifdef ENABLE_TEST
-- 
2.9.5


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v3 3/5] ndctl: monitor: add ndclt list-monitor command
  2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
  2018-02-09  8:02 ` [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to QI Fuli
  2018-02-09  8:02 ` [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command QI Fuli
@ 2018-02-09  8:02 ` QI Fuli
  2018-02-09  8:02 ` [RFC PATCH v3 4/5] ndctl: monitor: add ndclt show-monitor command QI Fuli
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: QI Fuli @ 2018-02-09  8:02 UTC (permalink / raw)
  To: linux-nvdimm

This patch adds $ndctl list-monitor command, by which users can list all
currently running monitors.
Example: $ndctl list-monitor --all

Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com>
---
 builtin.h       |  1 +
 ndctl/monitor.c | 32 ++++++++++++++++++++++++++++++++
 ndctl/ndctl.c   |  1 +
 3 files changed, 34 insertions(+)

diff --git a/builtin.h b/builtin.h
index 850f6a8..eda5c7a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -37,6 +37,7 @@ int cmd_init_labels(int argc, const char **argv, void *ctx);
 int cmd_check_labels(int argc, const char **argv, void *ctx);
 int cmd_inject_error(int argc, const char **argv, void *ctx);
 int cmd_create_monitor(int argc, const char **argv, void *ctx);
+int cmd_list_monitor(int argc, const char **argv, void *ctx);
 int cmd_list(int argc, const char **argv, void *ctx);
 #ifdef ENABLE_TEST
 int cmd_test(int argc, const char **argv, void *ctx);
diff --git a/ndctl/monitor.c b/ndctl/monitor.c
index cf1cd6e..53b7c67 100644
--- a/ndctl/monitor.c
+++ b/ndctl/monitor.c
@@ -340,3 +340,35 @@ int cmd_create_monitor(int argc, const char **argv, void *ctx)
 out:
 	return 1;
 }
+
+int cmd_list_monitor(int argc, const char **argv, void *ctx)
+{
+	const struct option options[] = {
+		OPT_BOOLEAN('a', "all", &param.all, "list all monitors")
+	};
+	const char * const u[] = {
+		"ndctl list-monitor [<options>]",
+		NULL
+	};
+	argc = parse_options(argc, argv, options, u, 0);
+	for (int i = 0; i < argc; i++) {
+		error("unknown parameter \"%s\"\n", argv[i]);
+		goto out;
+	}
+	DIR *dir;
+	struct dirent *ent;
+	dir = opendir("/var/ndctl/monitor/");
+	if (!dir)
+		return -1;
+	while ((ent = readdir(dir)) != NULL)
+	{
+		if (strcmp(ent->d_name, ".") == 0
+				|| strcmp(ent->d_name, "..") == 0)
+			continue;
+		printf(" %s\n", ent->d_name);
+	}
+	return 0;
+
+out:
+	return 1;
+}
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index 6c63d79..7eae794 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -85,6 +85,7 @@ static struct cmd_struct commands[] = {
 	{ "check-labels", cmd_check_labels },
 	{ "inject-error", cmd_inject_error },
 	{ "create-monitor", cmd_create_monitor },
+	{ "list-monitor", cmd_list_monitor },
 	{ "list", cmd_list },
 	{ "help", cmd_help },
 	#ifdef ENABLE_TEST
-- 
2.9.5


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v3 4/5] ndctl: monitor: add ndclt show-monitor command
  2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
                   ` (2 preceding siblings ...)
  2018-02-09  8:02 ` [RFC PATCH v3 3/5] ndctl: monitor: add ndclt list-monitor command QI Fuli
@ 2018-02-09  8:02 ` QI Fuli
  2018-02-09  8:02 ` [RfC PATCH v3 5/5] ndctl: monitor: add ndclt destroy-monitor command QI Fuli
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: QI Fuli @ 2018-02-09  8:02 UTC (permalink / raw)
  To: linux-nvdimm

This patch is used to add $ndctl show-monitor command, by which users
can check the specific of a monitor.
Example: $ndctl show-monitor --monitor m_nmem1

Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com>
---
 builtin.h       |  1 +
 ndctl/monitor.c | 35 +++++++++++++++++++++++++++++++++++
 ndctl/ndctl.c   |  1 +
 3 files changed, 37 insertions(+)

diff --git a/builtin.h b/builtin.h
index eda5c7a..b5a006d 100644
--- a/builtin.h
+++ b/builtin.h
@@ -38,6 +38,7 @@ int cmd_check_labels(int argc, const char **argv, void *ctx);
 int cmd_inject_error(int argc, const char **argv, void *ctx);
 int cmd_create_monitor(int argc, const char **argv, void *ctx);
 int cmd_list_monitor(int argc, const char **argv, void *ctx);
+int cmd_show_monitor(int argc, const char **argv, void *ctx);
 int cmd_list(int argc, const char **argv, void *ctx);
 #ifdef ENABLE_TEST
 int cmd_test(int argc, const char **argv, void *ctx);
diff --git a/ndctl/monitor.c b/ndctl/monitor.c
index 53b7c67..1cff2d8 100644
--- a/ndctl/monitor.c
+++ b/ndctl/monitor.c
@@ -372,3 +372,38 @@ int cmd_list_monitor(int argc, const char **argv, void *ctx)
 out:
 	return 1;
 }
+
+int cmd_show_monitor(int argc, const char **argv, void *ctx)
+{
+	const struct option options[] = {
+		OPT_STRING('m', "monitor", &param.monitor, "monitor name",
+				"monitor name")
+	};
+	const char * const u[] = {
+		"ndctl show-monitor <monitor>",
+		NULL
+	};
+	argc = parse_options(argc, argv, options, u, 0);
+	for (int i = 0; i < argc; i++) {
+		error("unknown parameter \"%s\"\n", argv[i]);
+		goto out;
+	}
+	if (!param.monitor) {
+		error("monitor name --monitor is required\n");
+		goto out;
+	}
+
+	char *filename;
+	filename = get_full_path_filename(proc_path, param.monitor);
+	struct json_object *jmonitors = json_object_from_file(filename);
+	if (jmonitors == NULL) {
+		error("monitor %s is not exist\n", param.monitor);
+		goto out;
+	}
+
+	printf("%s\n", json_object_to_json_string_ext(jmonitors,
+				JSON_C_TO_STRING_PRETTY));
+	return 0;
+out:
+	return 1;
+}
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index 7eae794..460cf76 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -86,6 +86,7 @@ static struct cmd_struct commands[] = {
 	{ "inject-error", cmd_inject_error },
 	{ "create-monitor", cmd_create_monitor },
 	{ "list-monitor", cmd_list_monitor },
+	{ "show-monitor", cmd_show_monitor },
 	{ "list", cmd_list },
 	{ "help", cmd_help },
 	#ifdef ENABLE_TEST
-- 
2.9.5


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RfC PATCH v3 5/5] ndctl: monitor: add ndclt destroy-monitor command
  2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
                   ` (3 preceding siblings ...)
  2018-02-09  8:02 ` [RFC PATCH v3 4/5] ndctl: monitor: add ndclt show-monitor command QI Fuli
@ 2018-02-09  8:02 ` QI Fuli
  2018-02-09 18:06 ` [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of Verma, Vishal L
  2018-02-10  4:06 ` Dan Williams
  6 siblings, 0 replies; 21+ messages in thread
From: QI Fuli @ 2018-02-09  8:02 UTC (permalink / raw)
  To: linux-nvdimm

This patch is used to add $ndctl destroy-monitor command, by which users
can destroy a monitor. After the monitor is destroyed, the file with
same name under /var/ndctl/monitor will also be deleted.
Example: #ndctl destroy-monitor --monitor m_nmem1

Signed-off-by: QI Fuli <qi.fuli@jp.fujitsu.com>
---
 builtin.h       |  1 +
 ndctl/monitor.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ndctl/ndctl.c   |  1 +
 3 files changed, 56 insertions(+)

diff --git a/builtin.h b/builtin.h
index b5a006d..2143666 100644
--- a/builtin.h
+++ b/builtin.h
@@ -39,6 +39,7 @@ int cmd_inject_error(int argc, const char **argv, void *ctx);
 int cmd_create_monitor(int argc, const char **argv, void *ctx);
 int cmd_list_monitor(int argc, const char **argv, void *ctx);
 int cmd_show_monitor(int argc, const char **argv, void *ctx);
+int cmd_destroy_monitor(int argc, const char **argv, void *ctx);
 int cmd_list(int argc, const char **argv, void *ctx);
 #ifdef ENABLE_TEST
 int cmd_test(int argc, const char **argv, void *ctx);
diff --git a/ndctl/monitor.c b/ndctl/monitor.c
index 1cff2d8..563a726 100644
--- a/ndctl/monitor.c
+++ b/ndctl/monitor.c
@@ -407,3 +407,57 @@ int cmd_show_monitor(int argc, const char **argv, void *ctx)
 out:
 	return 1;
 }
+
+int cmd_destroy_monitor(int argc, const char **argv, void *ctx)
+{
+	const struct option options[] = {
+		OPT_STRING('m', "monitor", &param.monitor, "monitor name",
+				"monitor name")
+	};
+	const char * const u[] = {
+		"ndctl destroy-monitor <monitor>",
+		NULL
+	};
+	argc = parse_options(argc, argv, options, u, 0);
+	for (int i = 0; i < argc; i++) {
+		error("unknown parameter \"%s\"\n", argv[i]);
+		goto out;
+	}
+	if (!param.monitor) {
+		error("monitor name --monitor is required\n");
+		goto out;
+	}
+
+	char *filename;
+	struct json_object *jmonitors, *jpid;
+	int pid;
+	filename = get_full_path_filename(proc_path, param.monitor);
+	jmonitors = json_object_from_file(filename);
+	if (jmonitors == NULL) {
+		error("monitor %s is not exist\n", param.monitor);
+		goto out;
+	}
+
+	if (!json_object_object_get_ex(jmonitors, "pid", &jpid)) {
+		error("cannot find pid in %s\n", filename);
+		goto out;
+	}
+	pid = json_object_get_int(jpid);
+	if (!pid) {
+		error("cannot find pid in %s\n", filename);
+		goto out;
+	}
+	if (kill(pid, SIGUSR1) != 0) {
+		error("failed to kill pid(%d)\n", pid);
+		goto out;
+	}
+	if (destroy_monitor_proc() != 0) {
+		error("failed to delete %s\n", filename);
+		goto out;
+	}
+	printf("destroy monitor %s successed\n", param.monitor);
+
+	return 0;
+out:
+	return 1;
+}
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index 460cf76..ad69222 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -87,6 +87,7 @@ static struct cmd_struct commands[] = {
 	{ "create-monitor", cmd_create_monitor },
 	{ "list-monitor", cmd_list_monitor },
 	{ "show-monitor", cmd_show_monitor },
+	{ "destroy-monitor", cmd_destroy_monitor },
 	{ "list", cmd_list },
 	{ "help", cmd_help },
 	#ifdef ENABLE_TEST
-- 
2.9.5


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of
  2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
                   ` (4 preceding siblings ...)
  2018-02-09  8:02 ` [RfC PATCH v3 5/5] ndctl: monitor: add ndclt destroy-monitor command QI Fuli
@ 2018-02-09 18:06 ` Verma, Vishal L
  2018-02-13  1:51   ` Qi, Fuli
  2018-02-10  4:06 ` Dan Williams
  6 siblings, 1 reply; 21+ messages in thread
From: Verma, Vishal L @ 2018-02-09 18:06 UTC (permalink / raw)
  To: linux-nvdimm, qi.fuli


On Fri, 2018-02-09 at 17:02 +0900, QI Fuli wrote:
> This is a patch set of ndctl monitor, a tiny daemon to monitor the
> smart
> events of nvdimm dimms. When a smart event fires, monitor will output
> the notification which including dimm health status to syslog or a
> special file according to users' configuration. The output
> notification
> follows json format and can be consumed by log collectors like
> Fluentd.
> 
> Currently, I implemeted the following four commands to control
> monitor daemon.
> $ndctl create-monitor
> $ndctl list-monitor
> $ndctl show-monitor
> $ndclt destroy-monitor
> 
> I will appreciate if you could give some comments.
> 
> Change log since v2:
>  - Changing the interface of daemon to the ndctl command line
>  - Changing the name of daemon form "nvdimmd" to "monitor"
>  - Removing the config file, unit_file, nvdimmd dir
>  - Removing nvdimmd_test program
>  - Adding ndctl/monitor.c
> 
> Change log since v1:
>  - Adding a config file(/etc/nvdimmd/nvdimmd.conf)
>  - Using struct log_ctx instead of syslog()
>     - Using log_syslog() to save the notify messages to syslog
>     - Using log_file() to save the notify messages to special file
>  - Adding LOG_NOTICE level to log_priority
>  - Using automake instead of Makefile
>  - Adding a new util file(nvdimmd/util.c) including helper functions
>    needed for nvdimm daemon
>  - Adding nvdimmd_test program
> 
> QI Fuli (5):
>   ndctl: monitor: add LOG_NOTICE level to log_priority
>   ndctl: monitor: add ndclt create-monitor command
>   ndctl: monitor: add ndclt list-monitor command
>   ndctl: monitor: add ndclt show-monitor command
>   ndctl: monitor: add ndclt destroy-monitor command
                        ^
I haven't had a chance to look at the rest of the series, but spotted a
quick typo - this is in the subject line of each commit as well:
s/ndclt/ndctl/

> 
>  builtin.h         |   4 +
>  configure.ac      |   3 +
>  ndctl/Makefile.am |   3 +-
>  ndctl/monitor.c   | 463
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  ndctl/ndctl.c     |   4 +
>  util/log.c        |   2 +
>  util/log.h        |   3 +
>  7 files changed, 481 insertions(+), 1 deletion(-)
>  create mode 100644 ndctl/monitor.c
> 
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of
  2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
                   ` (5 preceding siblings ...)
  2018-02-09 18:06 ` [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of Verma, Vishal L
@ 2018-02-10  4:06 ` Dan Williams
  2018-02-13  1:54   ` Qi, Fuli
  6 siblings, 1 reply; 21+ messages in thread
From: Dan Williams @ 2018-02-10  4:06 UTC (permalink / raw)
  To: QI Fuli; +Cc: linux-nvdimm

On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
> This is a patch set of ndctl monitor, a tiny daemon to monitor the smart
> events of nvdimm dimms. When a smart event fires, monitor will output
> the notification which including dimm health status to syslog or a
> special file according to users' configuration. The output notification
> follows json format and can be consumed by log collectors like Fluentd.
>
> Currently, I implemeted the following four commands to control monitor daemon.
> $ndctl create-monitor
> $ndctl list-monitor
> $ndctl show-monitor
> $ndclt destroy-monitor
>
> I will appreciate if you could give some comments.

Nice to see this new update.

I have a high level comment on the proposed command structure, as far
as I can see the only new command we need is:

   ndctl monitor

...that launches a monitor. We can have options to control whether
that monitor is a one-shot check of events sources, or a daemon that
forks into the background, but the 'create', 'list', 'show', and
'destroy' are the responsibility of systemd or System V services. For
example:

    create: systemctl start ndctl-monitor
    show: systemctl status ndctl-monitor
    destroy: systemctl stop ndctl-monitor
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to
  2018-02-09  8:02 ` [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to QI Fuli
@ 2018-02-11 20:18   ` Dan Williams
  2018-02-11 20:20     ` Dan Williams
  0 siblings, 1 reply; 21+ messages in thread
From: Dan Williams @ 2018-02-11 20:18 UTC (permalink / raw)
  To: QI Fuli; +Cc: linux-nvdimm

On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
> This patch adds LOG_NOTICE level to log_priority for the notifications
> of nvdimm dimm over threshold event. Currently, the LOG_INFO level
> is too low, and the notifications are not up to LOG_ERROR level.
>

Looks good, applied.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to
  2018-02-11 20:18   ` Dan Williams
@ 2018-02-11 20:20     ` Dan Williams
  0 siblings, 0 replies; 21+ messages in thread
From: Dan Williams @ 2018-02-11 20:20 UTC (permalink / raw)
  To: QI Fuli; +Cc: linux-nvdimm

On Sun, Feb 11, 2018 at 12:18 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
>> This patch adds LOG_NOTICE level to log_priority for the notifications
>> of nvdimm dimm over threshold event. Currently, the LOG_INFO level
>> is too low, and the notifications are not up to LOG_ERROR level.
>>
>
> Looks good, applied.

Although, this patch has little to do with nvdimmd, so I'm changing
the subject to:

"ndctl, log: add support for LOG_NOTICE messages"
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-09  8:02 ` [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command QI Fuli
@ 2018-02-11 20:48   ` Dan Williams
  2018-02-13  0:54     ` Verma, Vishal L
  2018-02-13  9:58     ` Yasunori Goto
  0 siblings, 2 replies; 21+ messages in thread
From: Dan Williams @ 2018-02-11 20:48 UTC (permalink / raw)
  To: QI Fuli; +Cc: linux-nvdimm

On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
> This patch is used to add $ndctl create-monitor command, by which users can
> create a new monitor. Users can select the DIMMS to be monitored by using
> [--dimm] [--bus] [--region] [--namespace] options. The notifications can
> be outputed to a special file or syslog by using [--output] option, the
> special file will be placed under /var/log/ndctl. A name is also required for
> a monitor,so users can destroy the monitor by the name. When a monitor is
> created successfully, a file with same name will be created under
> /var/ndctl/monitor.
> Example:
> #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log

Hi Qi,

This is getting closer to where I want to see this go, but still some
architecture details to incorporate. I mentioned on the cover letter
that systemd can handle starting, stopping, and show the status of the
monitor. The other detail to incorporate is that monitor events can
come DIMMs, but also namespaces, regions, and the bus.

The event list I have collected to date is:

dimm-spares-remaining
dimm-media-temperature
dimm-controller-temperature
dimm-health-state
dimm-unclean-shutdown
dimm-detected
namespace-media-error
namespace-detected
region-media-error
region-detected
bus-media-error
bus-address-range-scrub-complete
bus-detected

...and I think all of those should be separate options, probably
something like the following, but I'd Vishal to comment if this scheme
can be handled with the bash tab-completion implementation:

   ndctl monitor --dimm-events=spares-remaining,media-temperature
--namespace-events=all --regions-events --bus=ACPI.NFIT

...where an empty --<object>-events option is equivalent to
--<object>-events=all. Also, similar to "ndctl list" specifying
specific buses, namespaces, etc causes the monitor to filter the
objects based on those properties.

Since "ndctl list" already has this filtering implemented I'd like to
see it refactored and shared between the 2 implementations rather than
duplicated as is done in this patch. In other words rework cmd_list()
into a generic nvdimm object walking routine with callback functions
to 'list' or 'monitor' a given object that matches the filter.

Let me know if the above makes sense. I'm thinking the 'ndctl list'
refactoring might be something I need to handle.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-11 20:48   ` Dan Williams
@ 2018-02-13  0:54     ` Verma, Vishal L
  2018-02-15  5:51       ` Qi, Fuli
  2018-02-13  9:58     ` Yasunori Goto
  1 sibling, 1 reply; 21+ messages in thread
From: Verma, Vishal L @ 2018-02-13  0:54 UTC (permalink / raw)
  To: Williams, Dan J, qi.fuli; +Cc: linux-nvdimm

On Sun, 2018-02-11 at 12:48 -0800, Dan Williams wrote:
> On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com>
> wrote:
> > This patch is used to add $ndctl create-monitor command, by which
> > users can
> > create a new monitor. Users can select the DIMMS to be monitored by
> > using
> > [--dimm] [--bus] [--region] [--namespace] options. The
> > notifications can
> > be outputed to a special file or syslog by using [--output] option,
> > the
> > special file will be placed under /var/log/ndctl. A name is also
> > required for
> > a monitor,so users can destroy the monitor by the name. When a
> > monitor is
> > created successfully, a file with same name will be created under
> > /var/ndctl/monitor.
> > Example:
> > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output
> > m_nmem.log
> 
> Hi Qi,
> 
> This is getting closer to where I want to see this go, but still some
> architecture details to incorporate. I mentioned on the cover letter
> that systemd can handle starting, stopping, and show the status of
> the
> monitor. The other detail to incorporate is that monitor events can
> come DIMMs, but also namespaces, regions, and the bus.
> 
> The event list I have collected to date is:
> 
> dimm-spares-remaining
> dimm-media-temperature
> dimm-controller-temperature
> dimm-health-state
> dimm-unclean-shutdown
> dimm-detected
> namespace-media-error
> namespace-detected
> region-media-error
> region-detected
> bus-media-error
> bus-address-range-scrub-complete
> bus-detected
> 
> ...and I think all of those should be separate options, probably
> something like the following, but I'd Vishal to comment if this
> scheme
> can be handled with the bash tab-completion implementation:
> 
>    ndctl monitor --dimm-events=spares-remaining,media-temperature
> --namespace-events=all --regions-events --bus=ACPI.NFIT

Yes I think we should be able to extend bash completion for a comma
separated list of arguments.

> 
> ...where an empty --<object>-events option is equivalent to
> --<object>-events=all. Also, similar to "ndctl list" specifying
> specific buses, namespaces, etc causes the monitor to filter the
> objects based on those properties.
> 
> Since "ndctl list" already has this filtering implemented I'd like to
> see it refactored and shared between the 2 implementations rather
> than
> duplicated as is done in this patch. In other words rework cmd_list()
> into a generic nvdimm object walking routine with callback functions
> to 'list' or 'monitor' a given object that matches the filter.
> 
> Let me know if the above makes sense. I'm thinking the 'ndctl list'
> refactoring might be something I need to handle.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of
  2018-02-09 18:06 ` [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of Verma, Vishal L
@ 2018-02-13  1:51   ` Qi, Fuli
  0 siblings, 0 replies; 21+ messages in thread
From: Qi, Fuli @ 2018-02-13  1:51 UTC (permalink / raw)
  To: Verma, Vishal L, linux-nvdimm



On 2018/02/10 3:06, Verma, Vishal L wrote:
> On Fri, 2018-02-09 at 17:02 +0900, QI Fuli wrote:
>> This is a patch set of ndctl monitor, a tiny daemon to monitor the
>> smart
>> events of nvdimm dimms. When a smart event fires, monitor will output
>> the notification which including dimm health status to syslog or a
>> special file according to users' configuration. The output
>> notification
>> follows json format and can be consumed by log collectors like
>> Fluentd.
>>
>> Currently, I implemeted the following four commands to control
>> monitor daemon.
>> $ndctl create-monitor
>> $ndctl list-monitor
>> $ndctl show-monitor
>> $ndclt destroy-monitor
>>
>> I will appreciate if you could give some comments.
>>
>> Change log since v2:
>>   - Changing the interface of daemon to the ndctl command line
>>   - Changing the name of daemon form "nvdimmd" to "monitor"
>>   - Removing the config file, unit_file, nvdimmd dir
>>   - Removing nvdimmd_test program
>>   - Adding ndctl/monitor.c
>>
>> Change log since v1:
>>   - Adding a config file(/etc/nvdimmd/nvdimmd.conf)
>>   - Using struct log_ctx instead of syslog()
>>      - Using log_syslog() to save the notify messages to syslog
>>      - Using log_file() to save the notify messages to special file
>>   - Adding LOG_NOTICE level to log_priority
>>   - Using automake instead of Makefile
>>   - Adding a new util file(nvdimmd/util.c) including helper functions
>>     needed for nvdimm daemon
>>   - Adding nvdimmd_test program
>>
>> QI Fuli (5):
>>    ndctl: monitor: add LOG_NOTICE level to log_priority
>>    ndctl: monitor: add ndclt create-monitor command
>>    ndctl: monitor: add ndclt list-monitor command
>>    ndctl: monitor: add ndclt show-monitor command
>>    ndctl: monitor: add ndclt destroy-monitor command
>                          ^
> I haven't had a chance to look at the rest of the series, but spotted a
> quick typo - this is in the subject line of each commit as well:
> s/ndclt/ndctl/
Thank you for your comment, I will be more careful next time.
>
>>   builtin.h         |   4 +
>>   configure.ac      |   3 +
>>   ndctl/Makefile.am |   3 +-
>>   ndctl/monitor.c   | 463
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   ndctl/ndctl.c     |   4 +
>>   util/log.c        |   2 +
>>   util/log.h        |   3 +
>>   7 files changed, 481 insertions(+), 1 deletion(-)
>>   create mode 100644 ndctl/monitor.c
>>


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of
  2018-02-10  4:06 ` Dan Williams
@ 2018-02-13  1:54   ` Qi, Fuli
  0 siblings, 0 replies; 21+ messages in thread
From: Qi, Fuli @ 2018-02-13  1:54 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

On 2018/02/10 13:06, Dan Williams wrote:

> On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
>> This is a patch set of ndctl monitor, a tiny daemon to monitor the smart
>> events of nvdimm dimms. When a smart event fires, monitor will output
>> the notification which including dimm health status to syslog or a
>> special file according to users' configuration. The output notification
>> follows json format and can be consumed by log collectors like Fluentd.
>>
>> Currently, I implemeted the following four commands to control monitor daemon.
>> $ndctl create-monitor
>> $ndctl list-monitor
>> $ndctl show-monitor
>> $ndclt destroy-monitor
>>
>> I will appreciate if you could give some comments.
> Nice to see this new update.
>
> I have a high level comment on the proposed command structure, as far
> as I can see the only new command we need is:
>
>     ndctl monitor
>
> ...that launches a monitor. We can have options to control whether
> that monitor is a one-shot check of events sources, or a daemon that
> forks into the background, but the 'create', 'list', 'show', and
> 'destroy' are the responsibility of systemd or System V services. For
> example:
>
>      create: systemctl start ndctl-monitor
>      show: systemctl status ndctl-monitor
>      destroy: systemctl stop ndctl-monitor
>
>
Thank you for your comments, I will use systemd in next version.


_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-11 20:48   ` Dan Williams
  2018-02-13  0:54     ` Verma, Vishal L
@ 2018-02-13  9:58     ` Yasunori Goto
  2018-02-13 10:12       ` Yasunori Goto
  2018-02-17  0:54       ` Dan Williams
  1 sibling, 2 replies; 21+ messages in thread
From: Yasunori Goto @ 2018-02-13  9:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

Hi,

> On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
> > This patch is used to add $ndctl create-monitor command, by which users can
> > create a new monitor. Users can select the DIMMS to be monitored by using
> > [--dimm] [--bus] [--region] [--namespace] options. The notifications can
> > be outputed to a special file or syslog by using [--output] option, the
> > special file will be placed under /var/log/ndctl. A name is also required for
> > a monitor,so users can destroy the monitor by the name. When a monitor is
> > created successfully, a file with same name will be created under
> > /var/ndctl/monitor.
> > Example:
> > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log
> 
> Hi Qi,
> 
> This is getting closer to where I want to see this go, but still some
> architecture details to incorporate. I mentioned on the cover letter
> that systemd can handle starting, stopping, and show the status of the
> monitor. The other detail to incorporate is that monitor events can
> come DIMMs, but also namespaces, regions, and the bus.
> 
> The event list I have collected to date is:
> 
> dimm-spares-remaining
> dimm-media-temperature
> dimm-controller-temperature
> dimm-health-state
> dimm-unclean-shutdown
> dimm-detected
> namespace-media-error
> namespace-detected
> region-media-error
> region-detected
> bus-media-error
> bus-address-range-scrub-complete
> bus-detected
> 
> ...and I think all of those should be separate options, probably
> something like the following, but I'd Vishal to comment if this scheme
> can be handled with the bash tab-completion implementation:
> 
>    ndctl monitor --dimm-events=spares-remaining,media-temperature
> --namespace-events=all --regions-events --bus=ACPI.NFIT
> 
> ...where an empty --<object>-events option is equivalent to
> --<object>-events=all. Also, similar to "ndctl list" specifying
> specific buses, namespaces, etc causes the monitor to filter the
> objects based on those properties.

Hmmmm....

Currently, I'm confusing what features/options are required for nvdimm daemon.
For example, what is use-case of "--bus=ACPI.NFIT"? 

For normal administorator of a server, what he/she's interest is 
"need to replace nvdimm module or not", and "need to backup/restore
on the nvdimm module or not". 

For normal programs, they just use device name or directory/filename of
the filesystem on the nvdimm.

To backup thier data, he/she need to solve relationship between
nvdimm modules and device name (/dev/pmem* or /dev/dax*).

So, IMHO, I suppose "namespace(device name) specifying (or all namespace)"
is enough the following events which requires replace the nvdimm module.

 - spare-remaining
 - helth-state
 - media-error

And I'm not sure what is use-case of specifying region, bus, and dimm
on these events.


In addition, could you tell me what administrator/program can do
on the following events? What nvdimm daemon should do on each event?

  - media-temperature
  - controller-temperature
  - address-range-scrub-complete
  - unclean-shutdown


Thanks,

> 
> Since "ndctl list" already has this filtering implemented I'd like to
> see it refactored and shared between the 2 implementations rather than
> duplicated as is done in this patch. In other words rework cmd_list()
> into a generic nvdimm object walking routine with callback functions
> to 'list' or 'monitor' a given object that matches the filter.
> 
> Let me know if the above makes sense. I'm thinking the 'ndctl list'
> refactoring might be something I need to handle.
> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm



_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-13  9:58     ` Yasunori Goto
@ 2018-02-13 10:12       ` Yasunori Goto
  2018-02-17  1:00         ` Dan Williams
  2018-02-17  0:54       ` Dan Williams
  1 sibling, 1 reply; 21+ messages in thread
From: Yasunori Goto @ 2018-02-13 10:12 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

> Hi,
> 
> > On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
> > > This patch is used to add $ndctl create-monitor command, by which users can
> > > create a new monitor. Users can select the DIMMS to be monitored by using
> > > [--dimm] [--bus] [--region] [--namespace] options. The notifications can
> > > be outputed to a special file or syslog by using [--output] option, the
> > > special file will be placed under /var/log/ndctl. A name is also required for
> > > a monitor,so users can destroy the monitor by the name. When a monitor is
> > > created successfully, a file with same name will be created under
> > > /var/ndctl/monitor.
> > > Example:
> > > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log
> > 
> > Hi Qi,
> > 
> > This is getting closer to where I want to see this go, but still some
> > architecture details to incorporate. I mentioned on the cover letter
> > that systemd can handle starting, stopping, and show the status of the
> > monitor. The other detail to incorporate is that monitor events can
> > come DIMMs, but also namespaces, regions, and the bus.
> > 
> > The event list I have collected to date is:
> > 
> > dimm-spares-remaining
> > dimm-media-temperature
> > dimm-controller-temperature
> > dimm-health-state
> > dimm-unclean-shutdown
> > dimm-detected
> > namespace-media-error
> > namespace-detected
> > region-media-error
> > region-detected
> > bus-media-error
> > bus-address-range-scrub-complete
> > bus-detected
> > 
> > ...and I think all of those should be separate options, probably
> > something like the following, but I'd Vishal to comment if this scheme
> > can be handled with the bash tab-completion implementation:
> > 
> >    ndctl monitor --dimm-events=spares-remaining,media-temperature
> > --namespace-events=all --regions-events --bus=ACPI.NFIT
> > 
> > ...where an empty --<object>-events option is equivalent to
> > --<object>-events=all. Also, similar to "ndctl list" specifying
> > specific buses, namespaces, etc causes the monitor to filter the
> > objects based on those properties.
> 
> Hmmmm....
> 
> Currently, I'm confusing what features/options are required for nvdimm daemon.
> For example, what is use-case of "--bus=ACPI.NFIT"? 
> 
> For normal administorator of a server, what he/she's interest is 
> "need to replace nvdimm module or not", and "need to backup/restore
> on the nvdimm module or not". 
> 
> For normal programs, they just use device name or directory/filename of
> the filesystem on the nvdimm.
> 
> To backup thier data, he/she need to solve relationship between
> nvdimm modules and device name (/dev/pmem* or /dev/dax*).
> 
> So, IMHO, I suppose "namespace(device name) specifying (or all namespace)"
> is enough the following events which requires replace the nvdimm module.
> 
>  - spare-remaining
>  - helth-state
>  - media-error
> 
> And I'm not sure what is use-case of specifying region, bus, and dimm
> on these events.
> 
> 
> In addition, could you tell me what administrator/program can do
> on the following events? What nvdimm daemon should do on each event?
> 
>   - media-temperature
>   - controller-temperature
>   - address-range-scrub-complete
>   - unclean-shutdown
> 

I would like to ask one more thing.

What should nvdimm daemon do on detected events of bus/region/namespace/dimm ?
IIRC, udev handles such hotplug events. 
What is relationship/roles between nvdimm daemon and udev?

Bye,

> 
> Thanks,
> 
> > 
> > Since "ndctl list" already has this filtering implemented I'd like to
> > see it refactored and shared between the 2 implementations rather than
> > duplicated as is done in this patch. In other words rework cmd_list()
> > into a generic nvdimm object walking routine with callback functions
> > to 'list' or 'monitor' a given object that matches the filter.
> > 
> > Let me know if the above makes sense. I'm thinking the 'ndctl list'
> > refactoring might be something I need to handle.
> > _______________________________________________
> > Linux-nvdimm mailing list
> > Linux-nvdimm@lists.01.org
> > https://lists.01.org/mailman/listinfo/linux-nvdimm
> 



_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-13  0:54     ` Verma, Vishal L
@ 2018-02-15  5:51       ` Qi, Fuli
  2018-02-17  1:23         ` Dan Williams
  0 siblings, 1 reply; 21+ messages in thread
From: Qi, Fuli @ 2018-02-15  5:51 UTC (permalink / raw)
  To: Verma, Vishal L, Williams, Dan J; +Cc: linux-nvdimm


>> Hi Qi,
>>
>> This is getting closer to where I want to see this go, but still some
>> architecture details to incorporate. I mentioned on the cover letter
>> that systemd can handle starting, stopping, and show the status of
>> the
>> monitor. The other detail to incorporate is that monitor events can
>> come DIMMs, but also namespaces, regions, and the bus.
>>
>> The event list I have collected to date is:
>>
>> dimm-spares-remaining
>> dimm-media-temperature
>> dimm-controller-temperature
>> dimm-health-state
>> dimm-unclean-shutdown
>> dimm-detected
>> namespace-media-error
>> namespace-detected
>> region-media-error
>> region-detected
>> bus-media-error
>> bus-address-range-scrub-complete
>> bus-detected
By referring to dimm_listen() in smart-listen.c, I understand how to 
monitor events
come DIMMs. I have checked the ndctl, but I could not find how to 
monitor events
come namespaces, region and bus. Could you please mention more?
>> ...and I think all of those should be separate options, probably
>> something like the following, but I'd Vishal to comment if this
>> scheme
>> can be handled with the bash tab-completion implementation:
>>
>>     ndctl monitor --dimm-events=spares-remaining,media-temperature
>> --namespace-events=all --regions-events --bus=ACPI.NFIT
> Yes I think we should be able to extend bash completion for a comma
> separated list of arguments.
>
>> ...where an empty --<object>-events option is equivalent to
>> --<object>-events=all. Also, similar to "ndctl list" specifying
>> specific buses, namespaces, etc causes the monitor to filter the
>> objects based on those properties.
>>
>> Since "ndctl list" already has this filtering implemented I'd like to
>> see it refactored and shared between the 2 implementations rather
>> than
>> duplicated as is done in this patch. In other words rework cmd_list()
>> into a generic nvdimm object walking routine with callback functions
>> to 'list' or 'monitor' a given object that matches the filter.
>>
>> Let me know if the above makes sense. I'm thinking the 'ndctl list'
>> refactoring might be something I need to handle.
Yes, I agree with refactoring ndctl_list and add a structure shared between
ndctl list and ndctl monitor.
But I prefer to use tokens rather than callback functions.

I also think the option should support multiple parameters likes:
  ndclt monitor --dimm nmem1,nmem2 --region region1,region2

The parameters from the same option can be stored in a list_node.

#define filter_node(field) \
struct filter_##field##_node { \
         const char *name; \
         struct list_node list; \
};

filter_node(bus);
filter_node(dimm);
filter_node(region);
filter_node(namespace);

struct ndctl_filter {
         struct list_head filter_bus;
         struct list_head filter_dimm;
         struct list_head filter_region;
         struct list_head filter_namespace;
} nf;

Make the tokens like follows:

#define list_util_field_filter(field) \
list_for_each(&nf.filter_##field, filter_##field##_node, list) { \
         if (util_##field##_filter(field, filter_##field##_node->name)) { \
                 flag = true; \
                 break; \
         } \
}

#define list_util_bus_filter_by_field(field) \
list_for_each(&nf.filter_##field, filter_##field##_node, list) { \
         if (util_bus_filter_by_##field(bus, 
filter_##field##_node->name)) { \
                 flag = true; \
                 break; \
         } \
}

#define list_util_dimm_filter_by_field(field) \
list_for_each(&nf.filter_##field, filter_##field##_node, list) { \
         if (util_dimm_filter_by_##field(dimm, 
filter_##field##_node->name)) { \
                 flag = true; \
                 break; \
         } \
}

Here is how to use the tokens:

ndctl_bus_foreach(ctx, bus) {
         struct filter_bus_node *filter_bus_node;
         list_util_field_filter(bus);
         if (!flag) {
                 struct filter_region_node *filter_region_node;
                 list_util_bus_filter_by_field(region);
         }
         if (!flag) {
                 struct filter_namespace_node *filter_namespace_node;
                 list_util_bus_filter_by_field(namespace);
         }
         if (!flag)
                 continue;
         ndctl_dimm_foreach(bus, dimm) {
         flag = false;
         struct filter_dimm_node *filter_dimm_node;
                 list_util_field_filter(dimm);
                 if (!flag) {
                         struct filter_region_node *filter_region_node;
                         list_util_dimm_filter_by_field(region);
                 }
                 if (!flag) {
                         struct filter_namespace_node 
*filter_namespace_node;
                         list_util_dimm_filter_by_field(namespace);
                 }
                 if (!flag)
                         continue;
     }
}

To make sure the util_ndctl_filter() goes right, I would like you give 
some comments?

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-13  9:58     ` Yasunori Goto
  2018-02-13 10:12       ` Yasunori Goto
@ 2018-02-17  0:54       ` Dan Williams
  1 sibling, 0 replies; 21+ messages in thread
From: Dan Williams @ 2018-02-17  0:54 UTC (permalink / raw)
  To: Yasunori Goto; +Cc: linux-nvdimm

On Tue, Feb 13, 2018 at 1:58 AM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote:
> Hi,
>
>> On Fri, Feb 9, 2018 at 12:02 AM, QI Fuli <qi.fuli@jp.fujitsu.com> wrote:
>> > This patch is used to add $ndctl create-monitor command, by which users can
>> > create a new monitor. Users can select the DIMMS to be monitored by using
>> > [--dimm] [--bus] [--region] [--namespace] options. The notifications can
>> > be outputed to a special file or syslog by using [--output] option, the
>> > special file will be placed under /var/log/ndctl. A name is also required for
>> > a monitor,so users can destroy the monitor by the name. When a monitor is
>> > created successfully, a file with same name will be created under
>> > /var/ndctl/monitor.
>> > Example:
>> > #ndctl create-monitor --monitor m_nmem1 --dimm nmem1 --output m_nmem.log
>>
>> Hi Qi,
>>
>> This is getting closer to where I want to see this go, but still some
>> architecture details to incorporate. I mentioned on the cover letter
>> that systemd can handle starting, stopping, and show the status of the
>> monitor. The other detail to incorporate is that monitor events can
>> come DIMMs, but also namespaces, regions, and the bus.
>>
>> The event list I have collected to date is:
>>
>> dimm-spares-remaining
>> dimm-media-temperature
>> dimm-controller-temperature
>> dimm-health-state
>> dimm-unclean-shutdown
>> dimm-detected
>> namespace-media-error
>> namespace-detected
>> region-media-error
>> region-detected
>> bus-media-error
>> bus-address-range-scrub-complete
>> bus-detected
>>
>> ...and I think all of those should be separate options, probably
>> something like the following, but I'd Vishal to comment if this scheme
>> can be handled with the bash tab-completion implementation:
>>
>>    ndctl monitor --dimm-events=spares-remaining,media-temperature
>> --namespace-events=all --regions-events --bus=ACPI.NFIT
>>
>> ...where an empty --<object>-events option is equivalent to
>> --<object>-events=all. Also, similar to "ndctl list" specifying
>> specific buses, namespaces, etc causes the monitor to filter the
>> objects based on those properties.
>
> Hmmmm....
>
> Currently, I'm confusing what features/options are required for nvdimm daemon.
> For example, what is use-case of "--bus=ACPI.NFIT"?

Other platforms may support different bus types, there are also
proposals like this one for custom NVDIMM buses [1]. The other use
case is allowing the user to monitor for any media error on the bus,
or the completion of ARS.

> For normal administorator of a server, what he/she's interest is
> "need to replace nvdimm module or not", and "need to backup/restore
> on the nvdimm module or not".

I think that's only part of it. Data center operations want to know
more than just when it is time to replace a module, they want to
collect almost any data that the operating system can provide about
the platform.

> For normal programs, they just use device name or directory/filename of
> the filesystem on the nvdimm.
>
> To backup thier data, he/she need to solve relationship between
> nvdimm modules and device name (/dev/pmem* or /dev/dax*).
>
> So, IMHO, I suppose "namespace(device name) specifying (or all namespace)"
> is enough the following events which requires replace the nvdimm module.
>
>  - spare-remaining
>  - helth-state
>  - media-error
>
> And I'm not sure what is use-case of specifying region, bus, and dimm
> on these events.

The reason for supporting those other event sources in my mind is
having a unified interface for tracking topology, health/status, and
error events.

>
> In addition, could you tell me what administrator/program can do
> on the following events? What nvdimm daemon should do on each event?
>
>   - media-temperature
>   - controller-temperature
>   - address-range-scrub-complete
>   - unclean-shutdown

Media temperature and controller temperature alarms can signal to data
center operations that the server is getting too hot, and might need
remediation, perhaps a specific fan has failed and replacing that fan
becomes a high priority when these alarms start firing.

Address-range-scrub complete might be a signal that the server may get
a boost in throughput since the overhead of the background operation
is now complete. ARS may continue to run long after the server has
booted, so the end of that event may be important to server loading
decisions.

Unclean shutdown notification allows events that occurred at the last
shutdown to be recorded at the next boot. Otherwise an operator would
need to write a separate tool to go retrieve this information. Having
it all in one place reduces the number of tools / data-sources that
operations infrastructure needs to consider.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2018-January/013926.html
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-13 10:12       ` Yasunori Goto
@ 2018-02-17  1:00         ` Dan Williams
  2018-02-19  2:36           ` Yasunori Goto
  0 siblings, 1 reply; 21+ messages in thread
From: Dan Williams @ 2018-02-17  1:00 UTC (permalink / raw)
  To: Yasunori Goto; +Cc: linux-nvdimm

On Tue, Feb 13, 2018 at 2:12 AM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote:
>> Hi,
[..]
> I would like to ask one more thing.
>
> What should nvdimm daemon do on detected events of bus/region/namespace/dimm ?
> IIRC, udev handles such hotplug events.
> What is relationship/roles between nvdimm daemon and udev?

UDEV handles the hotplug events, but I imagine device topology changes
are of interest to whatever agent is going to be consuming monitor
events, or monitoring changes that an administrator makes to a
machine. My inspiration for including these events in the monitor
comes from the monitor functionality in mdadm. It has events like:

           DeviceDisappeared
           RebuildStarted
           RebuildNN
           RebuildFinished
           Fail
           FailSpare
           SpareActive
           NewArray
           DegradedArray
           MoveSpare
           SparesMissing

To be clear, I'm not suggesting that all of these events need to be
supported in the first implementation of the monitor, but I want the
architecture and the user interface to be ready to support events from
any device in the topology.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-15  5:51       ` Qi, Fuli
@ 2018-02-17  1:23         ` Dan Williams
  0 siblings, 0 replies; 21+ messages in thread
From: Dan Williams @ 2018-02-17  1:23 UTC (permalink / raw)
  To: Qi, Fuli; +Cc: linux-nvdimm

On Wed, Feb 14, 2018 at 9:51 PM, Qi, Fuli <qi.fuli@jp.fujitsu.com> wrote:
>
>>> Hi Qi,
>>>
>>> This is getting closer to where I want to see this go, but still some
>>> architecture details to incorporate. I mentioned on the cover letter
>>> that systemd can handle starting, stopping, and show the status of
>>> the
>>> monitor. The other detail to incorporate is that monitor events can
>>> come DIMMs, but also namespaces, regions, and the bus.
>>>
>>> The event list I have collected to date is:
>>>
>>> dimm-spares-remaining
>>> dimm-media-temperature
>>> dimm-controller-temperature
>>> dimm-health-state
>>> dimm-unclean-shutdown
>>> dimm-detected
>>> namespace-media-error
>>> namespace-detected
>>> region-media-error
>>> region-detected
>>> bus-media-error
>>> bus-address-range-scrub-complete
>>> bus-detected
>
> By referring to dimm_listen() in smart-listen.c, I understand how to monitor
> events
> come DIMMs. I have checked the ndctl, but I could not find how to monitor
> events
> come namespaces, region and bus. Could you please mention more?
>>>
>>> ...and I think all of those should be separate options, probably
>>> something like the following, but I'd Vishal to comment if this
>>> scheme
>>> can be handled with the bash tab-completion implementation:
>>>
>>>     ndctl monitor --dimm-events=spares-remaining,media-temperature
>>> --namespace-events=all --regions-events --bus=ACPI.NFIT
>>
>> Yes I think we should be able to extend bash completion for a comma
>> separated list of arguments.
>>
>>> ...where an empty --<object>-events option is equivalent to
>>> --<object>-events=all. Also, similar to "ndctl list" specifying
>>> specific buses, namespaces, etc causes the monitor to filter the
>>> objects based on those properties.
>>>
>>> Since "ndctl list" already has this filtering implemented I'd like to
>>> see it refactored and shared between the 2 implementations rather
>>> than
>>> duplicated as is done in this patch. In other words rework cmd_list()
>>> into a generic nvdimm object walking routine with callback functions
>>> to 'list' or 'monitor' a given object that matches the filter.
>>>
>>> Let me know if the above makes sense. I'm thinking the 'ndctl list'
>>> refactoring might be something I need to handle.
>
> Yes, I agree with refactoring ndctl_list and add a structure shared between
> ndctl list and ndctl monitor.
> But I prefer to use tokens rather than callback functions.
>
> I also think the option should support multiple parameters likes:
>  ndclt monitor --dimm nmem1,nmem2 --region region1,region2
>
> The parameters from the same option can be stored in a list_node.
>
> #define filter_node(field) \
> struct filter_##field##_node { \
>         const char *name; \
>         struct list_node list; \
> };
>
> filter_node(bus);
> filter_node(dimm);
> filter_node(region);
> filter_node(namespace);
>
> struct ndctl_filter {
>         struct list_head filter_bus;
>         struct list_head filter_dimm;
>         struct list_head filter_region;
>         struct list_head filter_namespace;
> } nf;
>
> Make the tokens like follows:
>
> #define list_util_field_filter(field) \
> list_for_each(&nf.filter_##field, filter_##field##_node, list) { \
>         if (util_##field##_filter(field, filter_##field##_node->name)) { \
>                 flag = true; \
>                 break; \
>         } \
> }
>
> #define list_util_bus_filter_by_field(field) \
> list_for_each(&nf.filter_##field, filter_##field##_node, list) { \
>         if (util_bus_filter_by_##field(bus, filter_##field##_node->name)) {
> \
>                 flag = true; \
>                 break; \
>         } \
> }
>
> #define list_util_dimm_filter_by_field(field) \
> list_for_each(&nf.filter_##field, filter_##field##_node, list) { \
>         if (util_dimm_filter_by_##field(dimm, filter_##field##_node->name))
> { \
>                 flag = true; \
>                 break; \
>         } \
> }
>
> Here is how to use the tokens:
>
> ndctl_bus_foreach(ctx, bus) {
>         struct filter_bus_node *filter_bus_node;
>         list_util_field_filter(bus);
>         if (!flag) {
>                 struct filter_region_node *filter_region_node;
>                 list_util_bus_filter_by_field(region);
>         }
>         if (!flag) {
>                 struct filter_namespace_node *filter_namespace_node;
>                 list_util_bus_filter_by_field(namespace);
>         }
>         if (!flag)
>                 continue;
>         ndctl_dimm_foreach(bus, dimm) {
>         flag = false;
>         struct filter_dimm_node *filter_dimm_node;
>                 list_util_field_filter(dimm);
>                 if (!flag) {
>                         struct filter_region_node *filter_region_node;
>                         list_util_dimm_filter_by_field(region);
>                 }
>                 if (!flag) {
>                         struct filter_namespace_node *filter_namespace_node;
>                         list_util_dimm_filter_by_field(namespace);
>                 }
>                 if (!flag)
>                         continue;
>     }
> }
>
> To make sure the util_ndctl_filter() goes right, I would like you give some
> comments?

I think the multi-layer macros makes the code difficult to read, and
I'm not seeing how it handles the case where we're building up a
combined json record representing the entire bus, i.e. "ndctl list
-BRND". I'd rather re-write "ndctl list" in terms of this filter
structure:

struct ndctl_filter_ops {
        int (*match_bus)(struct ndctl_bus *bus, void **ctx, unsigned
long flags);
        int (*match_dimm)(struct ndctl_bus *dimm, void *ctx, unsigned
long flags);
        int (*match_region)(struct ndctl_bus *region, void **ctx,
unsigned long flags);
        int (*match_namespace)(struct ndctl_bus *namespace, void *ctx,
unsigned long flags);
};

...where the match routine is only populated if that device is
selected and it passes filter specific context via the "ctx" variable.
In the case of "ndctl list" that might be the "jbus" container json
object being returned from ->match_bus() and then passed to
->match_dimm() to hold "jdimm" instances. Note that ->match_dimm() and
->match_namespace() only take context as input because those objects
don't have any children.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command
  2018-02-17  1:00         ` Dan Williams
@ 2018-02-19  2:36           ` Yasunori Goto
  0 siblings, 0 replies; 21+ messages in thread
From: Yasunori Goto @ 2018-02-19  2:36 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm

> On Tue, Feb 13, 2018 at 2:12 AM, Yasunori Goto <y-goto@jp.fujitsu.com> wrote:
> >> Hi,
> [..]
> > I would like to ask one more thing.
> >
> > What should nvdimm daemon do on detected events of bus/region/namespace/dimm ?
> > IIRC, udev handles such hotplug events.
> > What is relationship/roles between nvdimm daemon and udev?
> 
> UDEV handles the hotplug events, but I imagine device topology changes
> are of interest to whatever agent is going to be consuming monitor
> events, or monitoring changes that an administrator makes to a
> machine. My inspiration for including these events in the monitor
> comes from the monitor functionality in mdadm. It has events like:
> 
>            DeviceDisappeared
>            RebuildStarted
>            RebuildNN
>            RebuildFinished
>            Fail
>            FailSpare
>            SpareActive
>            NewArray
>            DegradedArray
>            MoveSpare
>            SparesMissing
> 
> To be clear, I'm not suggesting that all of these events need to be
> supported in the first implementation of the monitor, but I want the
> architecture and the user interface to be ready to support events from
> any device in the topology.
> 

Ok, I got it.
Thank you for your clarification.

Bye,
---
Yasunori Goto





_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-02-19  2:30 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-09  8:02 [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of QI Fuli
2018-02-09  8:02 ` [RFC PATCH v3 1/5] ndctl: nvdimmd: add LOG_NOTICE level to QI Fuli
2018-02-11 20:18   ` Dan Williams
2018-02-11 20:20     ` Dan Williams
2018-02-09  8:02 ` [RFC PATCH v3 2/5] ndctl: monitor: add ndctl create-monitor command QI Fuli
2018-02-11 20:48   ` Dan Williams
2018-02-13  0:54     ` Verma, Vishal L
2018-02-15  5:51       ` Qi, Fuli
2018-02-17  1:23         ` Dan Williams
2018-02-13  9:58     ` Yasunori Goto
2018-02-13 10:12       ` Yasunori Goto
2018-02-17  1:00         ` Dan Williams
2018-02-19  2:36           ` Yasunori Goto
2018-02-17  0:54       ` Dan Williams
2018-02-09  8:02 ` [RFC PATCH v3 3/5] ndctl: monitor: add ndclt list-monitor command QI Fuli
2018-02-09  8:02 ` [RFC PATCH v3 4/5] ndctl: monitor: add ndclt show-monitor command QI Fuli
2018-02-09  8:02 ` [RfC PATCH v3 5/5] ndctl: monitor: add ndclt destroy-monitor command QI Fuli
2018-02-09 18:06 ` [RFC PATCH v3 0/5] ndctl: monitor: monitor the smart events of Verma, Vishal L
2018-02-13  1:51   ` Qi, Fuli
2018-02-10  4:06 ` Dan Williams
2018-02-13  1:54   ` Qi, Fuli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).