Linux-NVDIMM Archive on lore.kernel.org
* [ndctl RESEND PATCH v2] monitor: Add epoll timeout for forcing a full dimm health check
@ 2020-07-22  5:24 Vaibhav Jain
  2020-07-22  5:29 ` Vishal Verma
From: Vaibhav Jain @ 2020-07-22  5:24 UTC
  To: linux-nvdimm; +Cc: Vaibhav Jain, Aneesh Kumar K . V

This patch adds a new command argument to the 'monitor' command namely
'--poll' that triggers a call to notify_dimm_event() at regular
intervals forcing a periodic check of status/events for the nvdimm
objects i.e. bus, dimms, regions or namespaces.

This behavior is useful for dimms that do not support event notifications
when the health status of an nvdimm changes. This is especially true
for PAPR-SCM nvdimms, as the PHYP hypervisor doesn't provide any
notifications to the guest kernel on a change in nvdimm health status.
In such cases periodic polling is the only way to track the health of
an nvdimm.

The patch updates monitor_event(), adding a timeout value to the
epoll_wait() call. Also, to prevent a single dimm from generating
enough events to starve the status checks of other nvdimm objects, a
'fullpoll_ts' time-stamp is added to keep track of when the last full
check of all nvdimm objects happened. If, after epoll_wait() returns,
the 'fullpoll_ts' time-stamp indicates that the last full status check
of the nvdimm objects happened more than 'poll-interval' seconds ago,
then a full status check is enforced.
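
For illustration only, the elapsed-time test and the timeout conversion
described above can be sketched as below. The helper names
full_poll_due() and poll_timeout_ms() are hypothetical and are not part
of this patch or of ndctl. The sketch assumes the poll interval is given
in whole seconds and that an interval of 0 means polling is disabled.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper (not part of ndctl): return true when more than
 * 'interval_sec' seconds have passed since the last full check.
 * An interval of 0 is assumed to mean that polling is disabled.
 */
static bool full_poll_due(const struct timespec *last,
			  const struct timespec *now,
			  unsigned int interval_sec)
{
	if (!interval_sec)
		return false;

	/* 'now' is the later sample, so 'last' is subtracted from it */
	return (now->tv_sec - last->tv_sec) > (time_t)interval_sec;
}

/*
 * Hypothetical helper: convert an interval in seconds to the
 * millisecond timeout expected by epoll_wait().
 */
static int poll_timeout_ms(unsigned int interval_sec)
{
	/* -1 makes epoll_wait() block until an event arrives */
	return interval_sec ? (int)(interval_sec * 1000) : -1;
}
```

A caller would sample the clock after each epoll_wait() return and
treat the wakeup as a timeout whenever full_poll_due() reports true.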

Cc: QI Fuli <qi.fuli@jp.fujitsu.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend
* None

v1..v2
* Changed the '--check-interval' arg to '--poll' [Dan Williams]
* Update the documentation and patch description of the '--poll' arg
  to accurately reflect that it can report status/events for
  all nvdimm objects. [Dan Williams]
---
 Documentation/ndctl/ndctl-monitor.txt |  4 ++++
 ndctl/monitor.c                       | 31 ++++++++++++++++++++++++---
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/Documentation/ndctl/ndctl-monitor.txt b/Documentation/ndctl/ndctl-monitor.txt
index 2239f047266d..0b6bb5c416c6 100644
--- a/Documentation/ndctl/ndctl-monitor.txt
+++ b/Documentation/ndctl/ndctl-monitor.txt
@@ -108,6 +108,10 @@ will not work if "--daemon" is specified.
 The monitor will attempt to enable the alarm control bits for all
 specified events.
 
+-p::
+--poll=::
+	Poll and report status/event every <n> seconds.
+
 -u::
 --human::
 	Output monitor notification as human friendly json format instead
diff --git a/ndctl/monitor.c b/ndctl/monitor.c
index 1755b87a5eeb..4e9b2236ff3c 100644
--- a/ndctl/monitor.c
+++ b/ndctl/monitor.c
@@ -4,6 +4,7 @@
 #include <stdio.h>
 #include <json-c/json.h>
 #include <libgen.h>
+#include <time.h>
 #include <dirent.h>
 #include <util/json.h>
 #include <util/filter.h>
@@ -33,6 +34,7 @@ static struct monitor {
 	bool daemon;
 	bool human;
 	bool verbose;
+	unsigned int poll_timeout;
 	unsigned int event_flags;
 	struct log_ctx ctx;
 } monitor;
@@ -322,9 +324,14 @@ static int monitor_event(struct ndctl_ctx *ctx,
 		struct monitor_filter_arg *mfa)
 {
 	struct epoll_event ev, *events;
-	int nfds, epollfd, i, rc = 0;
+	int nfds, epollfd, i, rc = 0, polltimeout = -1;
 	struct monitor_dimm *mdimm;
 	char buf;
+	/* last time a full poll happened */
+	struct timespec fullpoll_ts, ts;
+
+	if (monitor.poll_timeout)
+		polltimeout = monitor.poll_timeout * 1000;
 
 	events = calloc(mfa->num_dimm, sizeof(struct epoll_event));
 	if (!events) {
@@ -354,14 +361,30 @@ static int monitor_event(struct ndctl_ctx *ctx,
 		}
 	}
 
+	clock_gettime(CLOCK_BOOTTIME, &fullpoll_ts);
 	while (1) {
 		did_fail = 0;
-		nfds = epoll_wait(epollfd, events, mfa->num_dimm, -1);
-		if (nfds <= 0 && errno != EINTR) {
+		nfds = epoll_wait(epollfd, events, mfa->num_dimm, polltimeout);
+		if (nfds < 0 && errno != EINTR) {
 			err(&monitor, "epoll_wait error: (%s)\n", strerror(errno));
 			rc = -errno;
 			goto out;
 		}
+
+		/* If needed force a full poll of dimm health */
+		clock_gettime(CLOCK_BOOTTIME, &ts);
+		if (polltimeout > 0 && ts.tv_sec - fullpoll_ts.tv_sec > monitor.poll_timeout) {
+			nfds = 0;
+			dbg(&monitor, "forcing a full poll\n");
+		}
+
+		/* If we timed out then fill events array with all dimms */
+		if (nfds == 0) {
+			list_for_each(&mfa->dimms, mdimm, list)
+				events[nfds++].data.ptr = mdimm;
+			fullpoll_ts = ts;
+		}
+
 		for (i = 0; i < nfds; i++) {
 			mdimm = events[i].data.ptr;
 			if (util_dimm_event_filter(mdimm, monitor.event_flags)) {
@@ -570,6 +593,8 @@ int cmd_monitor(int argc, const char **argv, struct ndctl_ctx *ctx)
 				"use human friendly output formats"),
 		OPT_BOOLEAN('v', "verbose", &monitor.verbose,
 				"emit extra debug messages to log"),
+		OPT_UINTEGER('p', "poll", &monitor.poll_timeout,
+			     "poll and report events/status every <n> seconds"),
 		OPT_END(),
 	};
 	const char * const u[] = {
-- 
2.26.2
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [ndctl RESEND PATCH v2] monitor: Add epoll timeout for forcing a full dimm health check
  2020-07-22  5:24 [ndctl RESEND PATCH v2] monitor: Add epoll timeout for forcing a full dimm health check Vaibhav Jain
@ 2020-07-22  5:29 ` Vishal Verma
From: Vishal Verma @ 2020-07-22  5:29 UTC
  To: Vaibhav Jain, linux-nvdimm; +Cc: Aneesh Kumar K . V

On Wed, 2020-07-22 at 10:54 +0530, Vaibhav Jain wrote:
> This patch adds a new command argument to the 'monitor' command namely
> '--poll' that triggers a call to notify_dimm_event() at regular
> intervals forcing a periodic check of status/events for the nvdimm
> objects i.e. bus, dimms, regions or namespaces.
> 
> This behavior is useful for dimms that do not support event notifications
> when the health status of an nvdimm changes. This is especially true
> for PAPR-SCM nvdimms, as the PHYP hypervisor doesn't provide any
> notifications to the guest kernel on a change in nvdimm health status.
> In such cases periodic polling is the only way to track the health of
> an nvdimm.
> 
> The patch updates monitor_event(), adding a timeout value to the
> epoll_wait() call. Also, to prevent a single dimm from generating
> enough events to starve the status checks of other nvdimm objects, a
> 'fullpoll_ts' time-stamp is added to keep track of when the last full
> check of all nvdimm objects happened. If, after epoll_wait() returns,
> the 'fullpoll_ts' time-stamp indicates that the last full status check
> of the nvdimm objects happened more than 'poll-interval' seconds ago,
> then a full status check is enforced.
> 
> Cc: QI Fuli <qi.fuli@jp.fujitsu.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Changelog:
> 
> Resend
> * None
> 
Hi Vaibhav,

I do have this queued up in my staging branch, I've just yet to push it out :)

Thanks for the ping!
-Vishal

