netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor
@ 2019-10-10 13:18 Jiri Pirko
  2019-10-10 13:18 ` [patch net-next v2 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-10 13:18 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

This patchset adds support for devlink health reporter interface
testing. First 2 patches are small dependencies of the last 2.

Jiri Pirko (4):
  devlink: don't do reporter recovery if the state is healthy
  devlink: propagate extack down to health reporter ops
  netdevsim: implement couple of testing devlink health reporters
  selftests: add netdevsim devlink health tests

 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |   9 +-
 .../mellanox/mlx5/core/en/reporter_rx.c       |   6 +-
 .../mellanox/mlx5/core/en/reporter_tx.c       |   6 +-
 .../net/ethernet/mellanox/mlx5/core/health.c  |  12 +-
 drivers/net/netdevsim/Makefile                |   2 +-
 drivers/net/netdevsim/dev.c                   |  17 +-
 drivers/net/netdevsim/health.c                | 325 ++++++++++++++++++
 drivers/net/netdevsim/netdevsim.h             |  13 +
 include/net/devlink.h                         |   9 +-
 net/core/devlink.c                            |  23 +-
 .../drivers/net/netdevsim/devlink.sh          | 127 ++++++-
 11 files changed, 522 insertions(+), 27 deletions(-)
 create mode 100644 drivers/net/netdevsim/health.c

-- 
2.21.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch net-next v2 1/4] devlink: don't do reporter recovery if the state is healthy
  2019-10-10 13:18 [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor Jiri Pirko
@ 2019-10-10 13:18 ` Jiri Pirko
  2019-10-10 13:18 ` [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-10 13:18 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

If reporter state is healthy, don't call into a driver for recover and
don't increase recovery count.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/core/devlink.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index eb0a22f05887..95887462eecf 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4851,6 +4851,9 @@ devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
 {
 	int err;
 
+	if (reporter->health_state == DEVLINK_HEALTH_REPORTER_STATE_HEALTHY)
+		return 0;
+
 	if (!reporter->ops->recover)
 		return -EOPNOTSUPP;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops
  2019-10-10 13:18 [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor Jiri Pirko
  2019-10-10 13:18 ` [patch net-next v2 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
@ 2019-10-10 13:18 ` Jiri Pirko
  2019-10-11  2:04   ` Jakub Kicinski
  2019-10-10 13:18 ` [patch net-next v2 3/4] netdevsim: implement couple of testing devlink health reporters Jiri Pirko
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Jiri Pirko @ 2019-10-10 13:18 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

During health reporter operations, driver might want to fill-up
the extack message, so propagate extack down to the health reporter ops.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |  9 ++++++---
 .../mellanox/mlx5/core/en/reporter_rx.c       |  6 ++++--
 .../mellanox/mlx5/core/en/reporter_tx.c       |  6 ++++--
 .../net/ethernet/mellanox/mlx5/core/health.c  | 12 +++++++----
 include/net/devlink.h                         |  9 ++++++---
 net/core/devlink.c                            | 20 ++++++++++---------
 6 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index e664392dccc0..ff1bc0ec2e7c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -16,7 +16,8 @@
 #include "bnxt_devlink.h"
 
 static int bnxt_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
-				     struct devlink_fmsg *fmsg)
+				     struct devlink_fmsg *fmsg,
+				     struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = devlink_health_reporter_priv(reporter);
 	struct bnxt_fw_health *health = bp->fw_health;
@@ -66,7 +67,8 @@ static const struct devlink_health_reporter_ops bnxt_dl_fw_reporter_ops = {
 };
 
 static int bnxt_fw_reset_recover(struct devlink_health_reporter *reporter,
-				 void *priv_ctx)
+				 void *priv_ctx,
+				 struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = devlink_health_reporter_priv(reporter);
 
@@ -84,7 +86,8 @@ struct devlink_health_reporter_ops bnxt_dl_fw_reset_reporter_ops = {
 };
 
 static int bnxt_fw_fatal_recover(struct devlink_health_reporter *reporter,
-				 void *priv_ctx)
+				 void *priv_ctx,
+				 struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = devlink_health_reporter_priv(reporter);
 	struct bnxt_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
index b860569d4247..6c72b592315b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
@@ -222,7 +222,8 @@ static int mlx5e_rx_reporter_recover_from_ctx(struct mlx5e_err_ctx *err_ctx)
 }
 
 static int mlx5e_rx_reporter_recover(struct devlink_health_reporter *reporter,
-				     void *context)
+				     void *context,
+				     struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_err_ctx *err_ctx = context;
@@ -301,7 +302,8 @@ static int mlx5e_rx_reporter_build_diagnose_output(struct mlx5e_rq *rq,
 }
 
 static int mlx5e_rx_reporter_diagnose(struct devlink_health_reporter *reporter,
-				      struct devlink_fmsg *fmsg)
+				      struct devlink_fmsg *fmsg,
+				      struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_params *params = &priv->channels.params;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index bfed558637c2..b468549e96ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -135,7 +135,8 @@ static int mlx5e_tx_reporter_recover_from_ctx(struct mlx5e_err_ctx *err_ctx)
 }
 
 static int mlx5e_tx_reporter_recover(struct devlink_health_reporter *reporter,
-				     void *context)
+				     void *context,
+				     struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_err_ctx *err_ctx = context;
@@ -205,7 +206,8 @@ mlx5e_tx_reporter_build_diagnose_output(struct devlink_fmsg *fmsg,
 }
 
 static int mlx5e_tx_reporter_diagnose(struct devlink_health_reporter *reporter,
-				      struct devlink_fmsg *fmsg)
+				      struct devlink_fmsg *fmsg,
+				      struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_txqsq *generic_sq = priv->txq2sq[0];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index d685122d9ff7..be3c3c704bfc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -390,7 +390,8 @@ static void print_health_info(struct mlx5_core_dev *dev)
 
 static int
 mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
-			  struct devlink_fmsg *fmsg)
+			  struct devlink_fmsg *fmsg,
+			  struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 	struct mlx5_core_health *health = &dev->priv.health;
@@ -491,7 +492,8 @@ mlx5_fw_reporter_heath_buffer_data_put(struct mlx5_core_dev *dev,
 
 static int
 mlx5_fw_reporter_dump(struct devlink_health_reporter *reporter,
-		      struct devlink_fmsg *fmsg, void *priv_ctx)
+		      struct devlink_fmsg *fmsg, void *priv_ctx,
+		      struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 	int err;
@@ -545,7 +547,8 @@ static const struct devlink_health_reporter_ops mlx5_fw_reporter_ops = {
 
 static int
 mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
-			       void *priv_ctx)
+			       void *priv_ctx,
+			       struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 
@@ -555,7 +558,8 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
 #define MLX5_CR_DUMP_CHUNK_SIZE 256
 static int
 mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
-			    struct devlink_fmsg *fmsg, void *priv_ctx)
+			    struct devlink_fmsg *fmsg, void *priv_ctx,
+			    struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 	u32 crdump_size = dev->priv.health.crdump_size;
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 4095657fc23f..d35a1be107b5 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -507,11 +507,14 @@ enum devlink_health_reporter_state {
 struct devlink_health_reporter_ops {
 	char *name;
 	int (*recover)(struct devlink_health_reporter *reporter,
-		       void *priv_ctx);
+		       void *priv_ctx, struct netlink_ext_ack *extack);
 	int (*dump)(struct devlink_health_reporter *reporter,
-		    struct devlink_fmsg *fmsg, void *priv_ctx);
+		    struct devlink_fmsg *fmsg, void *priv_ctx,
+		    struct netlink_ext_ack *extack);
 	int (*diagnose)(struct devlink_health_reporter *reporter,
-			struct devlink_fmsg *fmsg);
+			struct devlink_fmsg *fmsg,
+			struct netlink_ext_ack *extack);
+
 };
 
 /**
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 95887462eecf..97e9a2246929 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4847,7 +4847,7 @@ EXPORT_SYMBOL_GPL(devlink_health_reporter_state_update);
 
 static int
 devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
-				void *priv_ctx)
+				void *priv_ctx, struct netlink_ext_ack *extack)
 {
 	int err;
 
@@ -4857,7 +4857,7 @@ devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
 	if (!reporter->ops->recover)
 		return -EOPNOTSUPP;
 
-	err = reporter->ops->recover(reporter, priv_ctx);
+	err = reporter->ops->recover(reporter, priv_ctx, extack);
 	if (err)
 		return err;
 
@@ -4878,7 +4878,8 @@ devlink_health_dump_clear(struct devlink_health_reporter *reporter)
 }
 
 static int devlink_health_do_dump(struct devlink_health_reporter *reporter,
-				  void *priv_ctx)
+				  void *priv_ctx,
+				  struct netlink_ext_ack *extack)
 {
 	int err;
 
@@ -4899,7 +4900,7 @@ static int devlink_health_do_dump(struct devlink_health_reporter *reporter,
 		goto dump_err;
 
 	err = reporter->ops->dump(reporter, reporter->dump_fmsg,
-				  priv_ctx);
+				  priv_ctx, extack);
 	if (err)
 		goto dump_err;
 
@@ -4946,11 +4947,12 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
 
 	mutex_lock(&reporter->dump_lock);
 	/* store current dump of current error, for later analysis */
-	devlink_health_do_dump(reporter, priv_ctx);
+	devlink_health_do_dump(reporter, priv_ctx, NULL);
 	mutex_unlock(&reporter->dump_lock);
 
 	if (reporter->auto_recover)
-		return devlink_health_reporter_recover(reporter, priv_ctx);
+		return devlink_health_reporter_recover(reporter,
+						       priv_ctx, NULL);
 
 	return 0;
 }
@@ -5188,7 +5190,7 @@ static int devlink_nl_cmd_health_reporter_recover_doit(struct sk_buff *skb,
 	if (!reporter)
 		return -EINVAL;
 
-	err = devlink_health_reporter_recover(reporter, NULL);
+	err = devlink_health_reporter_recover(reporter, NULL, info->extack);
 
 	devlink_health_reporter_put(reporter);
 	return err;
@@ -5221,7 +5223,7 @@ static int devlink_nl_cmd_health_reporter_diagnose_doit(struct sk_buff *skb,
 	if (err)
 		goto out;
 
-	err = reporter->ops->diagnose(reporter, fmsg);
+	err = reporter->ops->diagnose(reporter, fmsg, info->extack);
 	if (err)
 		goto out;
 
@@ -5256,7 +5258,7 @@ devlink_nl_cmd_health_reporter_dump_get_dumpit(struct sk_buff *skb,
 	}
 	mutex_lock(&reporter->dump_lock);
 	if (!start) {
-		err = devlink_health_do_dump(reporter, NULL);
+		err = devlink_health_do_dump(reporter, NULL, cb->extack);
 		if (err)
 			goto unlock;
 		cb->args[1] = reporter->dump_ts;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [patch net-next v2 3/4] netdevsim: implement couple of testing devlink health reporters
  2019-10-10 13:18 [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor Jiri Pirko
  2019-10-10 13:18 ` [patch net-next v2 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
  2019-10-10 13:18 ` [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
@ 2019-10-10 13:18 ` Jiri Pirko
  2019-10-10 13:18 ` [patch net-next v2 4/4] selftests: add netdevsim devlink health tests Jiri Pirko
  2019-10-12  4:05 ` [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor David Miller
  4 siblings, 0 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-10 13:18 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Implement "empty" and "dummy" reporters. The first one is really simple
and does nothing. The other one has debugfs files to trigger breakage
and it is able to do recovery. The ops also implement dummy fmsg
content.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v1->v2:
- fixed break message memory leak in nsim_dev_dummy_reporter_recover
- removed debugfs.h include from netdevsim.h
- use ?: in return of nsim_dev_health_break_write()
- fix error path after failure of debugfs_create_dir() call
- fixed fmsg binary serialization
---
 drivers/net/netdevsim/Makefile    |   2 +-
 drivers/net/netdevsim/dev.c       |  17 +-
 drivers/net/netdevsim/health.c    | 325 ++++++++++++++++++++++++++++++
 drivers/net/netdevsim/netdevsim.h |  13 ++
 4 files changed, 354 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/netdevsim/health.c

diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
index 09f1315d2f2a..f4d8f62f28c2 100644
--- a/drivers/net/netdevsim/Makefile
+++ b/drivers/net/netdevsim/Makefile
@@ -3,7 +3,7 @@
 obj-$(CONFIG_NETDEVSIM) += netdevsim.o
 
 netdevsim-objs := \
-	netdev.o dev.o fib.o bus.o
+	netdev.o dev.o fib.o bus.o health.o
 
 ifeq ($(CONFIG_BPF_SYSCALL),y)
 netdevsim-objs += \
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index e47fa7b6ca7c..468e157a7cb1 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -730,12 +730,18 @@ static int nsim_dev_reload_create(struct nsim_dev *nsim_dev,
 	if (err)
 		goto err_dummy_region_exit;
 
-	err = nsim_dev_port_add_all(nsim_dev, nsim_bus_dev->port_count);
+	err = nsim_dev_health_init(nsim_dev, devlink);
 	if (err)
 		goto err_traps_exit;
 
+	err = nsim_dev_port_add_all(nsim_dev, nsim_bus_dev->port_count);
+	if (err)
+		goto err_health_exit;
+
 	return 0;
 
+err_health_exit:
+	nsim_dev_health_exit(nsim_dev);
 err_traps_exit:
 	nsim_dev_traps_exit(devlink);
 err_dummy_region_exit:
@@ -797,10 +803,14 @@ static struct nsim_dev *nsim_dev_create(struct nsim_bus_dev *nsim_bus_dev)
 	if (err)
 		goto err_traps_exit;
 
-	err = nsim_bpf_dev_init(nsim_dev);
+	err = nsim_dev_health_init(nsim_dev, devlink);
 	if (err)
 		goto err_debugfs_exit;
 
+	err = nsim_bpf_dev_init(nsim_dev);
+	if (err)
+		goto err_health_exit;
+
 	err = nsim_dev_port_add_all(nsim_dev, nsim_bus_dev->port_count);
 	if (err)
 		goto err_bpf_dev_exit;
@@ -810,6 +820,8 @@ static struct nsim_dev *nsim_dev_create(struct nsim_bus_dev *nsim_bus_dev)
 
 err_bpf_dev_exit:
 	nsim_bpf_dev_exit(nsim_dev);
+err_health_exit:
+	nsim_dev_health_exit(nsim_dev);
 err_debugfs_exit:
 	nsim_dev_debugfs_exit(nsim_dev);
 err_traps_exit:
@@ -837,6 +849,7 @@ static void nsim_dev_reload_destroy(struct nsim_dev *nsim_dev)
 	if (devlink_is_reload_failed(devlink))
 		return;
 	nsim_dev_port_del_all(nsim_dev);
+	nsim_dev_health_exit(nsim_dev);
 	nsim_dev_traps_exit(devlink);
 	nsim_dev_dummy_region_exit(nsim_dev);
 	mutex_destroy(&nsim_dev->port_list_lock);
diff --git a/drivers/net/netdevsim/health.c b/drivers/net/netdevsim/health.c
new file mode 100644
index 000000000000..2716235a0336
--- /dev/null
+++ b/drivers/net/netdevsim/health.c
@@ -0,0 +1,325 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019 Mellanox Technologies. All rights reserved */
+
+#include <linux/debugfs.h>
+#include <linux/err.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+
+#include "netdevsim.h"
+
+static int
+nsim_dev_empty_reporter_dump(struct devlink_health_reporter *reporter,
+			     struct devlink_fmsg *fmsg, void *priv_ctx,
+			     struct netlink_ext_ack *extack)
+{
+	return 0;
+}
+
+static int
+nsim_dev_empty_reporter_diagnose(struct devlink_health_reporter *reporter,
+				 struct devlink_fmsg *fmsg,
+				 struct netlink_ext_ack *extack)
+{
+	return 0;
+}
+
+static const
+struct devlink_health_reporter_ops nsim_dev_empty_reporter_ops = {
+	.name = "empty",
+	.dump = nsim_dev_empty_reporter_dump,
+	.diagnose = nsim_dev_empty_reporter_diagnose,
+};
+
+struct nsim_dev_dummy_reporter_ctx {
+	char *break_msg;
+};
+
+static int
+nsim_dev_dummy_reporter_recover(struct devlink_health_reporter *reporter,
+				void *priv_ctx,
+				struct netlink_ext_ack *extack)
+{
+	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
+	struct nsim_dev_dummy_reporter_ctx *ctx = priv_ctx;
+
+	if (health->fail_recover) {
+		/* For testing purposes, user set debugfs fail_recover
+		 * value to true. Fail right away.
+		 */
+		NL_SET_ERR_MSG_MOD(extack, "User setup the recover to fail for testing purposes");
+		return -EINVAL;
+	}
+	if (ctx) {
+		kfree(health->recovered_break_msg);
+		health->recovered_break_msg = kstrdup(ctx->break_msg,
+						      GFP_KERNEL);
+		if (!health->recovered_break_msg)
+			return -ENOMEM;
+	}
+	return 0;
+}
+
+static int nsim_dev_dummy_fmsg_put(struct devlink_fmsg *fmsg, u32 binary_len)
+{
+	char *binary;
+	int err;
+	int i;
+
+	err = devlink_fmsg_bool_pair_put(fmsg, "test_bool", true);
+	if (err)
+		return err;
+	err = devlink_fmsg_u8_pair_put(fmsg, "test_u8", 1);
+	if (err)
+		return err;
+	err = devlink_fmsg_u32_pair_put(fmsg, "test_u32", 3);
+	if (err)
+		return err;
+	err = devlink_fmsg_u64_pair_put(fmsg, "test_u64", 4);
+	if (err)
+		return err;
+	err = devlink_fmsg_string_pair_put(fmsg, "test_string", "somestring");
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_binary");
+	if (err)
+		return err;
+	binary = kmalloc(binary_len, GFP_KERNEL);
+	if (!binary)
+		return -ENOMEM;
+	get_random_bytes(binary, binary_len);
+	err = devlink_fmsg_binary_put(fmsg, binary, binary_len);
+	kfree(binary);
+	if (err)
+		return err;
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_pair_nest_start(fmsg, "test_nest");
+	if (err)
+		return err;
+	err = devlink_fmsg_obj_nest_start(fmsg);
+	if (err)
+		return err;
+	err = devlink_fmsg_bool_pair_put(fmsg, "nested_test_bool", false);
+	if (err)
+		return err;
+	err = devlink_fmsg_u8_pair_put(fmsg, "nested_test_u8", false);
+	if (err)
+		return err;
+	err = devlink_fmsg_obj_nest_end(fmsg);
+	if (err)
+		return err;
+	err = devlink_fmsg_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_bool_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_bool_put(fmsg, true);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_u8_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_u8_put(fmsg, i);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_u32_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_u32_put(fmsg, i);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_u64_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_u64_put(fmsg, i);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_array_of_objects");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_obj_nest_start(fmsg);
+		if (err)
+			return err;
+		err = devlink_fmsg_bool_pair_put(fmsg,
+						 "in_array_nested_test_bool",
+						 false);
+		if (err)
+			return err;
+		err = devlink_fmsg_u8_pair_put(fmsg,
+					       "in_array_nested_test_u8",
+					       i);
+		if (err)
+			return err;
+		err = devlink_fmsg_obj_nest_end(fmsg);
+		if (err)
+			return err;
+	}
+	return devlink_fmsg_arr_pair_nest_end(fmsg);
+}
+
+static int
+nsim_dev_dummy_reporter_dump(struct devlink_health_reporter *reporter,
+			     struct devlink_fmsg *fmsg, void *priv_ctx,
+			     struct netlink_ext_ack *extack)
+{
+	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
+	struct nsim_dev_dummy_reporter_ctx *ctx = priv_ctx;
+	int err;
+
+	if (ctx) {
+		err = devlink_fmsg_string_pair_put(fmsg, "break_message",
+						   ctx->break_msg);
+		if (err)
+			return err;
+	}
+	return nsim_dev_dummy_fmsg_put(fmsg, health->binary_len);
+}
+
+static int
+nsim_dev_dummy_reporter_diagnose(struct devlink_health_reporter *reporter,
+				 struct devlink_fmsg *fmsg,
+				 struct netlink_ext_ack *extack)
+{
+	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
+	int err;
+
+	if (health->recovered_break_msg) {
+		err = devlink_fmsg_string_pair_put(fmsg,
+						   "recovered_break_message",
+						   health->recovered_break_msg);
+		if (err)
+			return err;
+	}
+	return nsim_dev_dummy_fmsg_put(fmsg, health->binary_len);
+}
+
+static const
+struct devlink_health_reporter_ops nsim_dev_dummy_reporter_ops = {
+	.name = "dummy",
+	.recover = nsim_dev_dummy_reporter_recover,
+	.dump = nsim_dev_dummy_reporter_dump,
+	.diagnose = nsim_dev_dummy_reporter_diagnose,
+};
+
+static ssize_t nsim_dev_health_break_write(struct file *file,
+					   const char __user *data,
+					   size_t count, loff_t *ppos)
+{
+	struct nsim_dev_health *health = file->private_data;
+	struct nsim_dev_dummy_reporter_ctx ctx;
+	char *break_msg;
+	int err;
+
+	break_msg = kmalloc(count + 1, GFP_KERNEL);
+	if (!break_msg)
+		return -ENOMEM;
+
+	if (copy_from_user(break_msg, data, count)) {
+		err = -EFAULT;
+		goto out;
+	}
+	break_msg[count] = '\0';
+	if (break_msg[count - 1] == '\n')
+		break_msg[count - 1] = '\0';
+
+	ctx.break_msg = break_msg;
+	err = devlink_health_report(health->dummy_reporter, break_msg, &ctx);
+	if (err)
+		goto out;
+
+out:
+	kfree(break_msg);
+	return err ?: count;
+}
+
+static const struct file_operations nsim_dev_health_break_fops = {
+	.open = simple_open,
+	.write = nsim_dev_health_break_write,
+	.llseek = generic_file_llseek,
+};
+
+int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink)
+{
+	struct nsim_dev_health *health = &nsim_dev->health;
+	int err;
+
+	health->empty_reporter =
+		devlink_health_reporter_create(devlink,
+					       &nsim_dev_empty_reporter_ops,
+					       0, false, health);
+	if (IS_ERR(health->empty_reporter))
+		return PTR_ERR(health->empty_reporter);
+
+	health->dummy_reporter =
+		devlink_health_reporter_create(devlink,
+					       &nsim_dev_dummy_reporter_ops,
+					       0, false, health);
+	if (IS_ERR(health->dummy_reporter)) {
+		err = PTR_ERR(health->dummy_reporter);
+		goto err_empty_reporter_destroy;
+	}
+
+	health->ddir = debugfs_create_dir("health", nsim_dev->ddir);
+	if (IS_ERR_OR_NULL(health->ddir)) {
+		err = PTR_ERR_OR_ZERO(health->ddir) ?: -EINVAL;
+		goto err_dummy_reporter_destroy;
+	}
+
+	health->recovered_break_msg = NULL;
+	debugfs_create_file("break_health", 0200, health->ddir, health,
+			    &nsim_dev_health_break_fops);
+	health->binary_len = 16;
+	debugfs_create_u32("binary_len", 0600, health->ddir,
+			   &health->binary_len);
+	health->fail_recover = false;
+	debugfs_create_bool("fail_recover", 0600, health->ddir,
+			    &health->fail_recover);
+	return 0;
+
+err_dummy_reporter_destroy:
+	devlink_health_reporter_destroy(health->dummy_reporter);
+err_empty_reporter_destroy:
+	devlink_health_reporter_destroy(health->empty_reporter);
+	return err;
+}
+
+void nsim_dev_health_exit(struct nsim_dev *nsim_dev)
+{
+	struct nsim_dev_health *health = &nsim_dev->health;
+
+	debugfs_remove_recursive(health->ddir);
+	kfree(health->recovered_break_msg);
+	devlink_health_reporter_destroy(health->dummy_reporter);
+	devlink_health_reporter_destroy(health->empty_reporter);
+}
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 24358385d869..94df795ef4d3 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -134,6 +134,18 @@ enum nsim_resource_id {
 	NSIM_RESOURCE_IPV6_FIB_RULES,
 };
 
+struct nsim_dev_health {
+	struct devlink_health_reporter *empty_reporter;
+	struct devlink_health_reporter *dummy_reporter;
+	struct dentry *ddir;
+	char *recovered_break_msg;
+	u32 binary_len;
+	bool fail_recover;
+};
+
+int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink);
+void nsim_dev_health_exit(struct nsim_dev *nsim_dev);
+
 struct nsim_dev_port {
 	struct list_head list;
 	struct devlink_port devlink_port;
@@ -164,6 +176,7 @@ struct nsim_dev {
 	bool dont_allow_reload;
 	bool fail_reload;
 	struct devlink_region *dummy_region;
+	struct nsim_dev_health health;
 };
 
 static inline struct net *nsim_dev_net(struct nsim_dev *nsim_dev)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [patch net-next v2 4/4] selftests: add netdevsim devlink health tests
  2019-10-10 13:18 [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor Jiri Pirko
                   ` (2 preceding siblings ...)
  2019-10-10 13:18 ` [patch net-next v2 3/4] netdevsim: implement couple of testing devlink health reporters Jiri Pirko
@ 2019-10-10 13:18 ` Jiri Pirko
  2019-10-12  4:05 ` [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor David Miller
  4 siblings, 0 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-10 13:18 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Add basic tests to verify functionality of netdevsim reporters.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v1->v2:
- added test of manual recovery fail

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 .../drivers/net/netdevsim/devlink.sh          | 127 +++++++++++++++++-
 1 file changed, 126 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
index cb0f17e17abc..5e4ab193559a 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
@@ -4,7 +4,8 @@
 lib_dir=$(dirname $0)/../../../net/forwarding
 
 ALL_TESTS="fw_flash_test params_test regions_test reload_test \
-	   netns_reload_test resource_test dev_info_test"
+	   netns_reload_test resource_test dev_info_test \
+	   empty_reporter_test dummy_reporter_test"
 NUM_NETIFS=0
 source $lib_dir/lib.sh
 
@@ -303,6 +304,130 @@ dev_info_test()
 	log_test "dev_info test"
 }
 
+empty_reporter_test()
+{
+	RET=0
+
+	devlink health show $DL_HANDLE reporter empty >/dev/null
+	check_err $? "Failed show empty reporter"
+
+	devlink health dump show $DL_HANDLE reporter empty >/dev/null
+	check_err $? "Failed show dump of empty reporter"
+
+	devlink health diagnose $DL_HANDLE reporter empty >/dev/null
+	check_err $? "Failed diagnose empty reporter"
+
+	devlink health recover $DL_HANDLE reporter empty
+	check_err $? "Failed recover empty reporter"
+
+	log_test "empty reporter test"
+}
+
+check_reporter_info()
+{
+	local name=$1
+	local expected_state=$2
+	local expected_error=$3
+	local expected_recover=$4
+	local expected_grace_period=$5
+	local expected_auto_recover=$6
+
+	local show=$(devlink health show $DL_HANDLE reporter $name -j | jq -e -r ".[][][]")
+	check_err $? "Failed show $name reporter"
+
+	local state=$(echo $show | jq -r ".state")
+	[ "$state" == "$expected_state" ]
+	check_err $? "Unexpected \"state\" value (got $state, expected $expected_state)"
+
+	local error=$(echo $show | jq -r ".error")
+	[ "$error" == "$expected_error" ]
+	check_err $? "Unexpected \"error\" value (got $error, expected $expected_error)"
+
+	local recover=`echo $show | jq -r ".recover"`
+	[ "$recover" == "$expected_recover" ]
+	check_err $? "Unexpected \"recover\" value (got $recover, expected $expected_recover)"
+
+	local grace_period=$(echo $show | jq -r ".grace_period")
+	check_err $? "Failed get $name reporter grace_period"
+	[ "$grace_period" == "$expected_grace_period" ]
+	check_err $? "Unexpected \"grace_period\" value (got $grace_period, expected $expected_grace_period)"
+
+	local auto_recover=$(echo $show | jq -r ".auto_recover")
+	[ "$auto_recover" == "$expected_auto_recover" ]
+	check_err $? "Unexpected \"auto_recover\" value (got $auto_recover, expected $expected_auto_recover)"
+}
+
+dummy_reporter_test()
+{
+	RET=0
+
+	check_reporter_info dummy healthy 0 0 0 false
+
+	local BREAK_MSG="foo bar"
+	echo "$BREAK_MSG"> $DEBUGFS_DIR/health/break_health
+	check_err $? "Failed to break dummy reporter"
+
+	check_reporter_info dummy error 1 0 0 false
+
+	local dump=$(devlink health dump show $DL_HANDLE reporter dummy -j)
+	check_err $? "Failed show dump of dummy reporter"
+
+	local dump_break_msg=$(echo $dump | jq -r ".break_message")
+	[ "$dump_break_msg" == "$BREAK_MSG" ]
+	check_err $? "Unexpected dump break message value (got $dump_break_msg, expected $BREAK_MSG)"
+
+	devlink health dump clear $DL_HANDLE reporter dummy
+	check_err $? "Failed clear dump of dummy reporter"
+
+	devlink health recover $DL_HANDLE reporter dummy
+	check_err $? "Failed recover dummy reporter"
+
+	check_reporter_info dummy healthy 1 1 0 false
+
+	devlink health set $DL_HANDLE reporter dummy auto_recover true
+	check_err $? "Failed to dummy reporter auto_recover option"
+
+	check_reporter_info dummy healthy 1 1 0 true
+
+	echo "$BREAK_MSG"> $DEBUGFS_DIR/health/break_health
+	check_err $? "Failed to break dummy reporter"
+
+	check_reporter_info dummy healthy 2 2 0 true
+
+	local diagnose=$(devlink health diagnose $DL_HANDLE reporter dummy -j -p)
+	check_err $? "Failed show diagnose of dummy reporter"
+
+	local rcvrd_break_msg=$(echo $diagnose | jq -r ".recovered_break_message")
+	[ "$rcvrd_break_msg" == "$BREAK_MSG" ]
+	check_err $? "Unexpected recovered break message value (got $rcvrd_break_msg, expected $BREAK_MSG)"
+
+	devlink health set $DL_HANDLE reporter dummy grace_period 10
+	check_err $? "Failed to dummy reporter grace_period option"
+
+	check_reporter_info dummy healthy 2 2 10 true
+	
+	echo "Y"> $DEBUGFS_DIR/health/fail_recover
+	check_err $? "Failed set dummy reporter recovery to fail"
+
+	echo "$BREAK_MSG"> $DEBUGFS_DIR/health/break_health
+	check_fail $? "Unexpected success of dummy reporter break"
+	
+	check_reporter_info dummy error 3 2 10 true
+
+	devlink health recover $DL_HANDLE reporter dummy
+	check_fail $? "Unexpected success of dummy reporter recover"
+
+	echo "N"> $DEBUGFS_DIR/health/fail_recover
+	check_err $? "Failed set dummy reporter recovery to be successful"
+	
+	devlink health recover $DL_HANDLE reporter dummy
+	check_err $? "Failed recover dummy reporter"
+	
+	check_reporter_info dummy healthy 3 3 10 true
+
+	log_test "dummy reporter test"
+}
+
 setup_prepare()
 {
 	modprobe netdevsim
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops
  2019-10-10 13:18 ` [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
@ 2019-10-11  2:04   ` Jakub Kicinski
  2019-10-11  6:13     ` Jiri Pirko
  2019-10-11  6:45     ` Johannes Berg
  0 siblings, 2 replies; 10+ messages in thread
From: Jakub Kicinski @ 2019-10-11  2:04 UTC (permalink / raw)
  To: Jiri Pirko, davem, Johannes Berg, dsahern
  Cc: netdev, ayal, moshe, eranbe, mlxsw

On Thu, 10 Oct 2019 15:18:49 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> During health reporter operations, driver might want to fill-up
> the extack message, so propagate extack down to the health reporter ops.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

> @@ -507,11 +507,14 @@ enum devlink_health_reporter_state {
>  struct devlink_health_reporter_ops {
>  	char *name;
>  	int (*recover)(struct devlink_health_reporter *reporter,
> -		       void *priv_ctx);
> +		       void *priv_ctx, struct netlink_ext_ack *extack);
>  	int (*dump)(struct devlink_health_reporter *reporter,
> -		    struct devlink_fmsg *fmsg, void *priv_ctx);
> +		    struct devlink_fmsg *fmsg, void *priv_ctx,
> +		    struct netlink_ext_ack *extack);
>  	int (*diagnose)(struct devlink_health_reporter *reporter,
> -			struct devlink_fmsg *fmsg);
> +			struct devlink_fmsg *fmsg,
> +			struct netlink_ext_ack *extack);
> +

nit: Looks like an extra new line snuck in here?

>  };
>  
>  /**

> @@ -4946,11 +4947,12 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
>  
>  	mutex_lock(&reporter->dump_lock);
>  	/* store current dump of current error, for later analysis */
> -	devlink_health_do_dump(reporter, priv_ctx);
> +	devlink_health_do_dump(reporter, priv_ctx, NULL);
>  	mutex_unlock(&reporter->dump_lock);
>  
>  	if (reporter->auto_recover)
> -		return devlink_health_reporter_recover(reporter, priv_ctx);
> +		return devlink_health_reporter_recover(reporter,
> +						       priv_ctx, NULL);
>  
>  	return 0;
>  }

Thinking about this again - would it be entirely insane to allocate the
extack on the stack here? And if anything gets set output into the logs?

For context the situation here is that the health API can be poked from
user space, but also the recovery actions are triggered automatically
when failure is detected, if so configured (usually we expect them to
be).

When we were adding the extack helper for the drivers to use Johannes
was concerned about printing to logs because that gave us a
disincentive to convert all locations, and people could get surprised
by the logs disappearing when more places are converted to extack [1].

I wonder if this is a special case where outputting to the logs is a
good idea? Really for all auto-recoverable health reporters the extack
argument will just confuse driver authors. If driver uses extack here
instead of printing to the logs information why auto-recovery failed is
likely to get lost.

Am I over-thinking this?

[1] https://www.spinics.net/lists/netdev/msg431998.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops
  2019-10-11  2:04   ` Jakub Kicinski
@ 2019-10-11  6:13     ` Jiri Pirko
  2019-10-11  6:45     ` Johannes Berg
  1 sibling, 0 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-11  6:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, Johannes Berg, dsahern, netdev, ayal, moshe, eranbe, mlxsw

Fri, Oct 11, 2019 at 04:04:29AM CEST, jakub.kicinski@netronome.com wrote:
>On Thu, 10 Oct 2019 15:18:49 +0200, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> During health reporter operations, driver might want to fill-up
>> the extack message, so propagate extack down to the health reporter ops.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>
>> @@ -507,11 +507,14 @@ enum devlink_health_reporter_state {
>>  struct devlink_health_reporter_ops {
>>  	char *name;
>>  	int (*recover)(struct devlink_health_reporter *reporter,
>> -		       void *priv_ctx);
>> +		       void *priv_ctx, struct netlink_ext_ack *extack);
>>  	int (*dump)(struct devlink_health_reporter *reporter,
>> -		    struct devlink_fmsg *fmsg, void *priv_ctx);
>> +		    struct devlink_fmsg *fmsg, void *priv_ctx,
>> +		    struct netlink_ext_ack *extack);
>>  	int (*diagnose)(struct devlink_health_reporter *reporter,
>> -			struct devlink_fmsg *fmsg);
>> +			struct devlink_fmsg *fmsg,
>> +			struct netlink_ext_ack *extack);
>> +
>
>nit: Looks like an extra new line snuck in here?
>
>>  };
>>  
>>  /**
>
>> @@ -4946,11 +4947,12 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
>>  
>>  	mutex_lock(&reporter->dump_lock);
>>  	/* store current dump of current error, for later analysis */
>> -	devlink_health_do_dump(reporter, priv_ctx);
>> +	devlink_health_do_dump(reporter, priv_ctx, NULL);
>>  	mutex_unlock(&reporter->dump_lock);
>>  
>>  	if (reporter->auto_recover)
>> -		return devlink_health_reporter_recover(reporter, priv_ctx);
>> +		return devlink_health_reporter_recover(reporter,
>> +						       priv_ctx, NULL);
>>  
>>  	return 0;
>>  }
>
>Thinking about this again - would it be entirely insane to allocate the
>extack on the stack here? And if anything gets set output into the logs?
>
>For context the situation here is that the health API can be poked from
>user space, but also the recovery actions are triggered automatically
>when failure is detected, if so configured (usually we expect them to
>be).
>
>When we were adding the extack helper for the drivers to use Johannes
>was concerned about printing to logs because that gave us a
>disincentive to convert all locations, and people could get surprised
>by the logs disappearing when more places are converted to extack [1].
>
>I wonder if this is a special case where outputting to the logs is a
>good idea? Really for all auto-recoverable health reporters the extack
>argument will just confuse driver authors. If driver uses extack here
>instead of printing to the logs information why auto-recovery failed is
>likely to get lost.
>
>Am I over-thinking this?

My gut feeling is kidn of odd about allocating some netlink specific
struct and use it separatelly from netlink :/

In any case, I think that this should probably be a separate patch/set.

>
>[1] https://www.spinics.net/lists/netdev/msg431998.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops
  2019-10-11  2:04   ` Jakub Kicinski
  2019-10-11  6:13     ` Jiri Pirko
@ 2019-10-11  6:45     ` Johannes Berg
  2019-10-11 23:10       ` David Ahern
  1 sibling, 1 reply; 10+ messages in thread
From: Johannes Berg @ 2019-10-11  6:45 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Pirko, davem, dsahern
  Cc: netdev, ayal, moshe, eranbe, mlxsw

On Thu, 2019-10-10 at 19:04 -0700, Jakub Kicinski wrote:

> >  	if (reporter->auto_recover)
> > -		return devlink_health_reporter_recover(reporter, priv_ctx);
> > +		return devlink_health_reporter_recover(reporter,
> > +						       priv_ctx, NULL);
> >  
> >  	return 0;
> >  }
> 
> Thinking about this again - would it be entirely insane to allocate the
> extack on the stack here? And if anything gets set output into the logs?

I think that's fine.

> For context the situation here is that the health API can be poked from
> user space, but also the recovery actions are triggered automatically
> when failure is detected, if so configured (usually we expect them to
> be).

Right. We have similar situations in the wireless stack, and we usually
just get very noisy & WARN. But then it's a level lower, so you don't
have any extack around there :)

> When we were adding the extack helper for the drivers to use Johannes
> was concerned about printing to logs because that gave us a
> disincentive to convert all locations, and people could get surprised
> by the logs disappearing when more places are converted to extack [1].

Yes, but that argument doesn't apply here. The argument was around code
like

> +#define NL_SET_ERR_MSG(extack, msg) do {		\
> +	struct netlink_ext_ack *_extack = (extack);	\
> +	static const char _msg[] = (msg);		\
> +							\
> +	if (_extack)					\
> +		_extack->_msg = _msg;			\
> +	else						\
> +		pr_info("%s\n", _msg);			\
>  } while (0)

(which I quoted in the message you linked).

I still stand by my comment there, I think it'd be a bad idea to do
this, because it'd mean that some random code using NL_SET_ERR_MSG()
might print something to the log if it's called in one way, and perhaps
the fact that it's getting NULL there is due to some higher level call
chain a few steps up "losing" the extack (or not having one), but then
if that's fixed suddenly all the messages disappear.

In the case you're proposing it's entirely different - you're proposing
that a non-netlink caller still supplies an extack that can be filled,
and then prints it out by itself. There's no reason to believe that
would be changed to suddenly have a real netlink extack that *doesn't*
get printed - that wouldn't make sense since you're doing this precisely
because you're *not* inside a netlink call.

Now, if you were saying "let's have a netlink handler that prints the
extack because a few levels up in the callstack we aren't passing the
parameter down yet" I'd still oppose it on similar grounds - that's
something we'd want to fix later to actually report to userspace - but
here there's no chance of that.

So to me this looks fine.

I don't really share the concern about extack being netlink specific and
then using it here - it ultimately doesn't matter whether this thing is
called "netlink_extack" or "extended_error_reporting", IMHO.

I guess we could wrap this so we have

struct devlink_error_report {
  /* pretty much identical content to extack? */
};

and then in this path just print it, while in the actual netlink paths
convert it to extack a level up. That might be the right thing from an
abstraction POV (if we were designing this in Enterprise Architect ;-) )
but pragmatically I'd say it doesn't really matter what the struct is
called, and extack already has helper macros etc.

johannes


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops
  2019-10-11  6:45     ` Johannes Berg
@ 2019-10-11 23:10       ` David Ahern
  0 siblings, 0 replies; 10+ messages in thread
From: David Ahern @ 2019-10-11 23:10 UTC (permalink / raw)
  To: Johannes Berg, Jakub Kicinski, Jiri Pirko, davem, dsahern
  Cc: netdev, ayal, moshe, eranbe, mlxsw

On 10/11/19 12:45 AM, Johannes Berg wrote:
> So to me this looks fine.
> 
> I don't really share the concern about extack being netlink specific and
> then using it here - it ultimately doesn't matter whether this thing is
> called "netlink_extack" or "extended_error_reporting", IMHO.

+1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor
  2019-10-10 13:18 [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor Jiri Pirko
                   ` (3 preceding siblings ...)
  2019-10-10 13:18 ` [patch net-next v2 4/4] selftests: add netdevsim devlink health tests Jiri Pirko
@ 2019-10-12  4:05 ` David Miller
  4 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2019-10-12  4:05 UTC (permalink / raw)
  To: jiri; +Cc: netdev, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 10 Oct 2019 15:18:47 +0200

> From: Jiri Pirko <jiri@mellanox.com>
> 
> This patchset adds support for devlink health reporter interface
> testing. First 2 patches are small dependencies of the last 2.

Series applied, thanks Jiri.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-10-12  4:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-10 13:18 [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor Jiri Pirko
2019-10-10 13:18 ` [patch net-next v2 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
2019-10-10 13:18 ` [patch net-next v2 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
2019-10-11  2:04   ` Jakub Kicinski
2019-10-11  6:13     ` Jiri Pirko
2019-10-11  6:45     ` Johannes Berg
2019-10-11 23:10       ` David Ahern
2019-10-10 13:18 ` [patch net-next v2 3/4] netdevsim: implement couple of testing devlink health reporters Jiri Pirko
2019-10-10 13:18 ` [patch net-next v2 4/4] selftests: add netdevsim devlink health tests Jiri Pirko
2019-10-12  4:05 ` [patch net-next v2 0/4] netdevsim: add devlink health reporters suppor David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).