netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [iproute2-next v3] devlink: display elapsed time during flash update
@ 2020-10-14 22:31 Jacob Keller
  2020-10-15 17:11 ` Jiri Pirko
  2020-10-17 15:34 ` David Ahern
  0 siblings, 2 replies; 8+ messages in thread
From: Jacob Keller @ 2020-10-14 22:31 UTC (permalink / raw)
  To: netdev; +Cc: Shannon Nelson, Jiri Pirko, David Ahern, Jacob Keller

For some devices, updating the flash can take significant time during
operations where no status can meaningfully be reported. This can be
somewhat confusing to a user who sees devlink appear to hang on the
terminal waiting for the device to update.

Recent changes to the kernel interface allow such long running commands
to provide a timeout value indicating some upper bound on how long the
relevant action could take.

Provide a ticking counter of the time elapsed since the previous status
message in order to make it clear that the program is not simply stuck.

Display this message whenever the status message from the kernel
indicates a timeout value. Additionally also display the message if
we've received no status for more than couple of seconds. If we elapse
more than the timeout provided by the status message, replace the
timeout display with "timeout reached".

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Changes since v2
* use clock_gettime on CLOCK_MONOTONIC instead of gettimeofday
* remove use of timersub since we're now using struct timespec

 devlink/devlink.c | 105 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 104 insertions(+), 1 deletion(-)

diff --git a/devlink/devlink.c b/devlink/devlink.c
index ab9522d260fd..1ff865bc5c22 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -33,6 +33,7 @@
 #include <sys/select.h>
 #include <sys/socket.h>
 #include <sys/types.h>
+#include <sys/time.h>
 #include <rt_names.h>
 
 #include "version.h"
@@ -3110,6 +3111,9 @@ static int cmd_dev_info(struct dl *dl)
 
 struct cmd_dev_flash_status_ctx {
 	struct dl *dl;
+	struct timespec time_of_last_status;
+	uint64_t status_msg_timeout;
+	size_t elapsed_time_msg_len;
 	char *last_msg;
 	char *last_component;
 	uint8_t not_first:1,
@@ -3127,6 +3131,16 @@ static int nullstrcmp(const char *str1, const char *str2)
 	return str1 ? 1 : -1;
 }
 
+static void cmd_dev_flash_clear_elapsed_time(struct cmd_dev_flash_status_ctx *ctx)
+{
+	int i;
+
+	for (i = 0; i < ctx->elapsed_time_msg_len; i++)
+		pr_out_tty("\b \b");
+
+	ctx->elapsed_time_msg_len = 0;
+}
+
 static int cmd_dev_flash_status_cb(const struct nlmsghdr *nlh, void *data)
 {
 	struct cmd_dev_flash_status_ctx *ctx = data;
@@ -3139,6 +3153,8 @@ static int cmd_dev_flash_status_cb(const struct nlmsghdr *nlh, void *data)
 	const char *bus_name;
 	const char *dev_name;
 
+	cmd_dev_flash_clear_elapsed_time(ctx);
+
 	if (genl->cmd != DEVLINK_CMD_FLASH_UPDATE_STATUS &&
 	    genl->cmd != DEVLINK_CMD_FLASH_UPDATE_END)
 		return MNL_CB_STOP;
@@ -3168,12 +3184,19 @@ static int cmd_dev_flash_status_cb(const struct nlmsghdr *nlh, void *data)
 		done = mnl_attr_get_u64(tb[DEVLINK_ATTR_FLASH_UPDATE_STATUS_DONE]);
 	if (tb[DEVLINK_ATTR_FLASH_UPDATE_STATUS_TOTAL])
 		total = mnl_attr_get_u64(tb[DEVLINK_ATTR_FLASH_UPDATE_STATUS_TOTAL]);
+	if (tb[DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT])
+		ctx->status_msg_timeout = mnl_attr_get_u64(tb[DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT]);
+	else
+		ctx->status_msg_timeout = 0;
 
 	if (!nullstrcmp(msg, ctx->last_msg) &&
 	    !nullstrcmp(component, ctx->last_component) &&
 	    ctx->last_pc && ctx->not_first) {
 		pr_out_tty("\b\b\b\b\b"); /* clean percentage */
 	} else {
+		/* only update the last status timestamp if the message changed */
+		clock_gettime(CLOCK_MONOTONIC, &ctx->time_of_last_status);
+
 		if (ctx->not_first)
 			pr_out("\n");
 		if (component) {
@@ -3199,11 +3222,78 @@ static int cmd_dev_flash_status_cb(const struct nlmsghdr *nlh, void *data)
 	return MNL_CB_STOP;
 }
 
+static void cmd_dev_flash_time_elapsed(struct cmd_dev_flash_status_ctx *ctx)
+{
+	struct timespec now, res;
+
+	clock_gettime(CLOCK_MONOTONIC, &now);
+
+	res.tv_sec = now.tv_sec - ctx->time_of_last_status.tv_sec;
+	res.tv_nsec = now.tv_nsec - ctx->time_of_last_status.tv_nsec;
+	if (res.tv_nsec < 0) {
+		res.tv_sec--;
+		res.tv_nsec += 1000000000L;
+	}
+
+	/* Only begin displaying an elapsed time message if we've waited a few
+	 * seconds with no response, or the status message included a timeout
+	 * value.
+	 */
+	if (res.tv_sec > 2 || ctx->status_msg_timeout) {
+		uint64_t elapsed_m, elapsed_s;
+		char msg[128];
+		size_t len;
+
+		/* clear the last elapsed time message, if we have one */
+		cmd_dev_flash_clear_elapsed_time(ctx);
+
+		elapsed_m = res.tv_sec / 60;
+		elapsed_s = res.tv_sec % 60;
+
+		/**
+		 * If we've elapsed a few seconds without receiving any status
+		 * notification from the device, we display a time elapsed
+		 * message. This has a few possible formats:
+		 *
+		 * 1) just time elapsed, when no timeout was provided
+		 *    " ( Xm Ys )"
+		 * 2) time elapsed out of a timeout that came from the device
+		 *    driver via DEVLINK_CMD_FLASH_UPDATE_STATUS_TIMEOUT
+		 *    " ( Xm Ys : Am Ys)"
+		 * 3) time elapsed if we still receive no status after
+		 *    reaching the provided timeout.
+		 *    " ( Xm Ys : timeout reached )"
+		 */
+		if (!ctx->status_msg_timeout) {
+			len = snprintf(msg, sizeof(msg),
+				       " ( %lum %lus )", elapsed_m, elapsed_s);
+		} else if (res.tv_sec <= ctx->status_msg_timeout) {
+			uint64_t timeout_m, timeout_s;
+
+			timeout_m = ctx->status_msg_timeout / 60;
+			timeout_s = ctx->status_msg_timeout % 60;
+
+			len = snprintf(msg, sizeof(msg),
+				       " ( %lum %lus : %lum %lus )",
+				       elapsed_m, elapsed_s, timeout_m, timeout_s);
+		} else {
+			len = snprintf(msg, sizeof(msg),
+				       " ( %lum %lus : timeout reached )", elapsed_m, elapsed_s);
+		}
+
+		ctx->elapsed_time_msg_len = len;
+
+		pr_out_tty("%s", msg);
+		fflush(stdout);
+	}
+}
+
 static int cmd_dev_flash_fds_process(struct cmd_dev_flash_status_ctx *ctx,
 				     struct mnlg_socket *nlg_ntf,
 				     int pipe_r)
 {
 	int nlfd = mnlg_socket_get_fd(nlg_ntf);
+	struct timeval timeout;
 	fd_set fds[3];
 	int fdmax;
 	int i;
@@ -3218,7 +3308,14 @@ static int cmd_dev_flash_fds_process(struct cmd_dev_flash_status_ctx *ctx,
 	if (nlfd >= fdmax)
 		fdmax = nlfd + 1;
 
-	while (select(fdmax, &fds[0], &fds[1], &fds[2], NULL) < 0) {
+	/* select only for a short while (1/10th of a second) in order to
+	 * allow periodically updating the screen with an elapsed time
+	 * indicator.
+	 */
+	timeout.tv_sec = 0;
+	timeout.tv_usec = 100000;
+
+	while (select(fdmax, &fds[0], &fds[1], &fds[2], &timeout) < 0) {
 		if (errno == EINTR)
 			continue;
 		pr_err("select() failed\n");
@@ -3240,6 +3337,7 @@ static int cmd_dev_flash_fds_process(struct cmd_dev_flash_status_ctx *ctx,
 			return err2;
 		ctx->flash_done = 1;
 	}
+	cmd_dev_flash_time_elapsed(ctx);
 	return 0;
 }
 
@@ -3300,6 +3398,11 @@ static int cmd_dev_flash(struct dl *dl)
 	}
 	close(pipe_w);
 
+	/* initialize starting time to allow comparison for when to begin
+	 * displaying a time elapsed message.
+	 */
+	clock_gettime(CLOCK_MONOTONIC, &ctx.time_of_last_status);
+
 	do {
 		err = cmd_dev_flash_fds_process(&ctx, nlg_ntf, pipe_r);
 		if (err)

base-commit: b5a583fb32950aaad62ddefa64f791ab432e30e3
-- 
2.28.0.497.g54e85e7af1ac


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [iproute2-next v3] devlink: display elapsed time during flash update
  2020-10-14 22:31 [iproute2-next v3] devlink: display elapsed time during flash update Jacob Keller
@ 2020-10-15 17:11 ` Jiri Pirko
  2020-10-17 15:34 ` David Ahern
  1 sibling, 0 replies; 8+ messages in thread
From: Jiri Pirko @ 2020-10-15 17:11 UTC (permalink / raw)
  To: Jacob Keller; +Cc: netdev, Shannon Nelson, David Ahern

Thu, Oct 15, 2020 at 12:31:04AM CEST, jacob.e.keller@intel.com wrote:
>For some devices, updating the flash can take significant time during
>operations where no status can meaningfully be reported. This can be
>somewhat confusing to a user who sees devlink appear to hang on the
>terminal waiting for the device to update.
>
>Recent changes to the kernel interface allow such long running commands
>to provide a timeout value indicating some upper bound on how long the
>relevant action could take.
>
>Provide a ticking counter of the time elapsed since the previous status
>message in order to make it clear that the program is not simply stuck.
>
>Display this message whenever the status message from the kernel
>indicates a timeout value. Additionally also display the message if
>we've received no status for more than couple of seconds. If we elapse
>more than the timeout provided by the status message, replace the
>timeout display with "timeout reached".
>
>Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [iproute2-next v3] devlink: display elapsed time during flash update
  2020-10-14 22:31 [iproute2-next v3] devlink: display elapsed time during flash update Jacob Keller
  2020-10-15 17:11 ` Jiri Pirko
@ 2020-10-17 15:34 ` David Ahern
  2020-10-19 19:05   ` Keller, Jacob E
  1 sibling, 1 reply; 8+ messages in thread
From: David Ahern @ 2020-10-17 15:34 UTC (permalink / raw)
  To: Jacob Keller, netdev, Jiri Pirko; +Cc: Shannon Nelson

On 10/14/20 4:31 PM, Jacob Keller wrote:
> For some devices, updating the flash can take significant time during
> operations where no status can meaningfully be reported. This can be
> somewhat confusing to a user who sees devlink appear to hang on the
> terminal waiting for the device to update.
> 
> Recent changes to the kernel interface allow such long running commands
> to provide a timeout value indicating some upper bound on how long the
> relevant action could take.
> 
> Provide a ticking counter of the time elapsed since the previous status
> message in order to make it clear that the program is not simply stuck.
> 
> Display this message whenever the status message from the kernel
> indicates a timeout value. Additionally also display the message if
> we've received no status for more than couple of seconds. If we elapse
> more than the timeout provided by the status message, replace the
> timeout display with "timeout reached".
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> Changes since v2
> * use clock_gettime on CLOCK_MONOTONIC instead of gettimeofday
> * remove use of timersub since we're now using struct timespec
> 
>  devlink/devlink.c | 105 +++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 104 insertions(+), 1 deletion(-)
> 

applied to iproute2-next.

The DEVLINK attributes are ridiculously long --
DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT is 40 characters -- which
forces really long code lines or oddly wrapped lines. Going forward
please consider abbreviations on name components to reduce their lengths.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [iproute2-next v3] devlink: display elapsed time during flash update
  2020-10-17 15:34 ` David Ahern
@ 2020-10-19 19:05   ` Keller, Jacob E
  2020-10-19 19:20     ` Jakub Kicinski
  2020-10-20  6:45     ` Jiri Pirko
  0 siblings, 2 replies; 8+ messages in thread
From: Keller, Jacob E @ 2020-10-19 19:05 UTC (permalink / raw)
  To: David Ahern, netdev, Jiri Pirko, Jakub Kicinski; +Cc: Shannon Nelson

> -----Original Message-----
> From: David Ahern <dsahern@gmail.com>
> Sent: Saturday, October 17, 2020 8:35 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>; netdev@vger.kernel.org; Jiri Pirko
> <jiri@resnulli.us>
> Cc: Shannon Nelson <snelson@pensando.io>
> Subject: Re: [iproute2-next v3] devlink: display elapsed time during flash update
> 
> On 10/14/20 4:31 PM, Jacob Keller wrote:
> > For some devices, updating the flash can take significant time during
> > operations where no status can meaningfully be reported. This can be
> > somewhat confusing to a user who sees devlink appear to hang on the
> > terminal waiting for the device to update.
> >
> > Recent changes to the kernel interface allow such long running commands
> > to provide a timeout value indicating some upper bound on how long the
> > relevant action could take.
> >
> > Provide a ticking counter of the time elapsed since the previous status
> > message in order to make it clear that the program is not simply stuck.
> >
> > Display this message whenever the status message from the kernel
> > indicates a timeout value. Additionally also display the message if
> > we've received no status for more than couple of seconds. If we elapse
> > more than the timeout provided by the status message, replace the
> > timeout display with "timeout reached".
> >
> > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> > ---
> > Changes since v2
> > * use clock_gettime on CLOCK_MONOTONIC instead of gettimeofday
> > * remove use of timersub since we're now using struct timespec
> >
> >  devlink/devlink.c | 105 +++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 104 insertions(+), 1 deletion(-)
> >
> 
> applied to iproute2-next.
> 
> The DEVLINK attributes are ridiculously long --
> DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT is 40 characters -- which
> forces really long code lines or oddly wrapped lines. Going forward
> please consider abbreviations on name components to reduce their lengths.

This is probably a larger discussion, since basically every devlink attribute name is long.

Jiri, Jakub, any thoughts on this? I'd like to see whatever abbreviation scheme we use be consistent.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [iproute2-next v3] devlink: display elapsed time during flash update
  2020-10-19 19:05   ` Keller, Jacob E
@ 2020-10-19 19:20     ` Jakub Kicinski
  2020-10-20  3:21       ` David Ahern
  2020-10-20  6:45     ` Jiri Pirko
  1 sibling, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2020-10-19 19:20 UTC (permalink / raw)
  To: Keller, Jacob E; +Cc: David Ahern, netdev, Jiri Pirko, Shannon Nelson

On Mon, 19 Oct 2020 19:05:34 +0000 Keller, Jacob E wrote:
> > The DEVLINK attributes are ridiculously long --
> > DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT is 40 characters -- which
> > forces really long code lines or oddly wrapped lines. Going forward
> > please consider abbreviations on name components to reduce their lengths.  
> 
> This is probably a larger discussion, since basically every devlink attribute name is long.
> 
> Jiri, Jakub, any thoughts on this? I'd like to see whatever abbreviation scheme we use be consistent.

As David said - let's keep an eye on this going forward. We already 
pushed back with Moshe's live reload work.

If you really want to make things better for this particular name
you could probably drop the word _STATUS from it.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [iproute2-next v3] devlink: display elapsed time during flash update
  2020-10-19 19:20     ` Jakub Kicinski
@ 2020-10-20  3:21       ` David Ahern
  0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2020-10-20  3:21 UTC (permalink / raw)
  To: Jakub Kicinski, Keller, Jacob E; +Cc: netdev, Jiri Pirko, Shannon Nelson

On 10/19/20 1:20 PM, Jakub Kicinski wrote:
> On Mon, 19 Oct 2020 19:05:34 +0000 Keller, Jacob E wrote:
>>> The DEVLINK attributes are ridiculously long --
>>> DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT is 40 characters -- which
>>> forces really long code lines or oddly wrapped lines. Going forward
>>> please consider abbreviations on name components to reduce their lengths.  
>>
>> This is probably a larger discussion, since basically every devlink attribute name is long.
>>
>> Jiri, Jakub, any thoughts on this? I'd like to see whatever abbreviation scheme we use be consistent.
> 
> As David said - let's keep an eye on this going forward. We already 
> pushed back with Moshe's live reload work.
> 
> If you really want to make things better for this particular name
> you could probably drop the word _STATUS from it.
> 

and typical strategies - like dropping vowels.

DEVLINK_ATTR_ is the established prefix so every attribute starts at 13
characters. In this case dropping STATUS and shortening TIMEOUT so it
becomes DEVLINK_ATTR_FLASH_UPDATE_TMOUT does not overly harm the
readability yet drops the char count to 31. Still long but is an example
of what can be done.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [iproute2-next v3] devlink: display elapsed time during flash update
  2020-10-19 19:05   ` Keller, Jacob E
  2020-10-19 19:20     ` Jakub Kicinski
@ 2020-10-20  6:45     ` Jiri Pirko
  2020-10-20 14:32       ` David Ahern
  1 sibling, 1 reply; 8+ messages in thread
From: Jiri Pirko @ 2020-10-20  6:45 UTC (permalink / raw)
  To: Keller, Jacob E; +Cc: David Ahern, netdev, Jakub Kicinski, Shannon Nelson

Mon, Oct 19, 2020 at 09:05:34PM CEST, jacob.e.keller@intel.com wrote:
>> -----Original Message-----
>> From: David Ahern <dsahern@gmail.com>
>> Sent: Saturday, October 17, 2020 8:35 AM
>> To: Keller, Jacob E <jacob.e.keller@intel.com>; netdev@vger.kernel.org; Jiri Pirko
>> <jiri@resnulli.us>
>> Cc: Shannon Nelson <snelson@pensando.io>
>> Subject: Re: [iproute2-next v3] devlink: display elapsed time during flash update
>> 
>> On 10/14/20 4:31 PM, Jacob Keller wrote:
>> > For some devices, updating the flash can take significant time during
>> > operations where no status can meaningfully be reported. This can be
>> > somewhat confusing to a user who sees devlink appear to hang on the
>> > terminal waiting for the device to update.
>> >
>> > Recent changes to the kernel interface allow such long running commands
>> > to provide a timeout value indicating some upper bound on how long the
>> > relevant action could take.
>> >
>> > Provide a ticking counter of the time elapsed since the previous status
>> > message in order to make it clear that the program is not simply stuck.
>> >
>> > Display this message whenever the status message from the kernel
>> > indicates a timeout value. Additionally also display the message if
>> > we've received no status for more than couple of seconds. If we elapse
>> > more than the timeout provided by the status message, replace the
>> > timeout display with "timeout reached".
>> >
>> > Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
>> > ---
>> > Changes since v2
>> > * use clock_gettime on CLOCK_MONOTONIC instead of gettimeofday
>> > * remove use of timersub since we're now using struct timespec
>> >
>> >  devlink/devlink.c | 105 +++++++++++++++++++++++++++++++++++++++++++++-
>> >  1 file changed, 104 insertions(+), 1 deletion(-)
>> >
>> 
>> applied to iproute2-next.
>> 
>> The DEVLINK attributes are ridiculously long --
>> DEVLINK_ATTR_FLASH_UPDATE_STATUS_TIMEOUT is 40 characters -- which
>> forces really long code lines or oddly wrapped lines. Going forward
>> please consider abbreviations on name components to reduce their lengths.
>
>This is probably a larger discussion, since basically every devlink attribute name is long.
>
>Jiri, Jakub, any thoughts on this? I'd like to see whatever abbreviation scheme we use be consistent.

I prefer long and easy to understand.


>
>Thanks,
>Jake

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [iproute2-next v3] devlink: display elapsed time during flash update
  2020-10-20  6:45     ` Jiri Pirko
@ 2020-10-20 14:32       ` David Ahern
  0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2020-10-20 14:32 UTC (permalink / raw)
  To: Jiri Pirko, Keller, Jacob E; +Cc: netdev, Jakub Kicinski, Shannon Nelson

On 10/20/20 12:45 AM, Jiri Pirko wrote:
> I prefer long and easy to understand.

and the code needs to be readable as well. There is middle ground here
and reasonable naming schemes.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-10-20 14:32 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-14 22:31 [iproute2-next v3] devlink: display elapsed time during flash update Jacob Keller
2020-10-15 17:11 ` Jiri Pirko
2020-10-17 15:34 ` David Ahern
2020-10-19 19:05   ` Keller, Jacob E
2020-10-19 19:20     ` Jakub Kicinski
2020-10-20  3:21       ` David Ahern
2020-10-20  6:45     ` Jiri Pirko
2020-10-20 14:32       ` David Ahern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).