netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch net-next 0/4] netdevsim: add devlink health reporters support
@ 2019-10-09 11:04 Jiri Pirko
  2019-10-09 11:04 ` [patch net-next 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-09 11:04 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

This patchset adds support for devlink health reporter interface
testing. First 2 patches are small dependencies of the last 2.

Jiri Pirko (4):
  devlink: don't do reporter recovery if the state is healthy
  devlink: propagate extack down to health reporter ops
  netdevsim: implement couple of testing devlink health reporters
  selftests: add netdevsim devlink health tests

 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |   9 +-
 .../mellanox/mlx5/core/en/reporter_rx.c       |   6 +-
 .../mellanox/mlx5/core/en/reporter_tx.c       |   6 +-
 .../net/ethernet/mellanox/mlx5/core/health.c  |  12 +-
 drivers/net/netdevsim/Makefile                |   2 +-
 drivers/net/netdevsim/dev.c                   |  17 +-
 drivers/net/netdevsim/health.c                | 315 ++++++++++++++++++
 drivers/net/netdevsim/netdevsim.h             |  14 +
 include/net/devlink.h                         |   9 +-
 net/core/devlink.c                            |  23 +-
 .../drivers/net/netdevsim/devlink.sh          | 124 ++++++-
 11 files changed, 510 insertions(+), 27 deletions(-)
 create mode 100644 drivers/net/netdevsim/health.c

-- 
2.21.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [patch net-next 1/4] devlink: don't do reporter recovery if the state is healthy
  2019-10-09 11:04 [patch net-next 0/4] netdevsim: add devlink health reporters support Jiri Pirko
@ 2019-10-09 11:04 ` Jiri Pirko
  2019-10-09 11:04 ` [patch net-next 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-09 11:04 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

If reporter state is healthy, don't call into a driver for recover and
don't increase recovery count.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/core/devlink.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index eb0a22f05887..95887462eecf 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4851,6 +4851,9 @@ devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
 {
 	int err;
 
+	if (reporter->health_state == DEVLINK_HEALTH_REPORTER_STATE_HEALTHY)
+		return 0;
+
 	if (!reporter->ops->recover)
 		return -EOPNOTSUPP;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [patch net-next 2/4] devlink: propagate extack down to health reporter ops
  2019-10-09 11:04 [patch net-next 0/4] netdevsim: add devlink health reporters support Jiri Pirko
  2019-10-09 11:04 ` [patch net-next 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
@ 2019-10-09 11:04 ` Jiri Pirko
  2019-10-10  3:38   ` Jakub Kicinski
  2019-10-09 11:04 ` [patch net-next 3/4] netdevsim: implement couple of testing devlink health reporters Jiri Pirko
  2019-10-09 11:04 ` [patch net-next 4/4] selftests: add netdevsim devlink health tests Jiri Pirko
  3 siblings, 1 reply; 10+ messages in thread
From: Jiri Pirko @ 2019-10-09 11:04 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

During health reporter operations, driver might want to fill-up
the extack message, so propagate extack down to the health reporter ops.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |  9 ++++++---
 .../mellanox/mlx5/core/en/reporter_rx.c       |  6 ++++--
 .../mellanox/mlx5/core/en/reporter_tx.c       |  6 ++++--
 .../net/ethernet/mellanox/mlx5/core/health.c  | 12 +++++++----
 include/net/devlink.h                         |  9 ++++++---
 net/core/devlink.c                            | 20 ++++++++++---------
 6 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index e664392dccc0..ff1bc0ec2e7c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -16,7 +16,8 @@
 #include "bnxt_devlink.h"
 
 static int bnxt_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
-				     struct devlink_fmsg *fmsg)
+				     struct devlink_fmsg *fmsg,
+				     struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = devlink_health_reporter_priv(reporter);
 	struct bnxt_fw_health *health = bp->fw_health;
@@ -66,7 +67,8 @@ static const struct devlink_health_reporter_ops bnxt_dl_fw_reporter_ops = {
 };
 
 static int bnxt_fw_reset_recover(struct devlink_health_reporter *reporter,
-				 void *priv_ctx)
+				 void *priv_ctx,
+				 struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = devlink_health_reporter_priv(reporter);
 
@@ -84,7 +86,8 @@ struct devlink_health_reporter_ops bnxt_dl_fw_reset_reporter_ops = {
 };
 
 static int bnxt_fw_fatal_recover(struct devlink_health_reporter *reporter,
-				 void *priv_ctx)
+				 void *priv_ctx,
+				 struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = devlink_health_reporter_priv(reporter);
 	struct bnxt_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
index b860569d4247..6c72b592315b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
@@ -222,7 +222,8 @@ static int mlx5e_rx_reporter_recover_from_ctx(struct mlx5e_err_ctx *err_ctx)
 }
 
 static int mlx5e_rx_reporter_recover(struct devlink_health_reporter *reporter,
-				     void *context)
+				     void *context,
+				     struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_err_ctx *err_ctx = context;
@@ -301,7 +302,8 @@ static int mlx5e_rx_reporter_build_diagnose_output(struct mlx5e_rq *rq,
 }
 
 static int mlx5e_rx_reporter_diagnose(struct devlink_health_reporter *reporter,
-				      struct devlink_fmsg *fmsg)
+				      struct devlink_fmsg *fmsg,
+				      struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_params *params = &priv->channels.params;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index bfed558637c2..b468549e96ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -135,7 +135,8 @@ static int mlx5e_tx_reporter_recover_from_ctx(struct mlx5e_err_ctx *err_ctx)
 }
 
 static int mlx5e_tx_reporter_recover(struct devlink_health_reporter *reporter,
-				     void *context)
+				     void *context,
+				     struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_err_ctx *err_ctx = context;
@@ -205,7 +206,8 @@ mlx5e_tx_reporter_build_diagnose_output(struct devlink_fmsg *fmsg,
 }
 
 static int mlx5e_tx_reporter_diagnose(struct devlink_health_reporter *reporter,
-				      struct devlink_fmsg *fmsg)
+				      struct devlink_fmsg *fmsg,
+				      struct netlink_ext_ack *extack)
 {
 	struct mlx5e_priv *priv = devlink_health_reporter_priv(reporter);
 	struct mlx5e_txqsq *generic_sq = priv->txq2sq[0];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index d685122d9ff7..be3c3c704bfc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -390,7 +390,8 @@ static void print_health_info(struct mlx5_core_dev *dev)
 
 static int
 mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
-			  struct devlink_fmsg *fmsg)
+			  struct devlink_fmsg *fmsg,
+			  struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 	struct mlx5_core_health *health = &dev->priv.health;
@@ -491,7 +492,8 @@ mlx5_fw_reporter_heath_buffer_data_put(struct mlx5_core_dev *dev,
 
 static int
 mlx5_fw_reporter_dump(struct devlink_health_reporter *reporter,
-		      struct devlink_fmsg *fmsg, void *priv_ctx)
+		      struct devlink_fmsg *fmsg, void *priv_ctx,
+		      struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 	int err;
@@ -545,7 +547,8 @@ static const struct devlink_health_reporter_ops mlx5_fw_reporter_ops = {
 
 static int
 mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
-			       void *priv_ctx)
+			       void *priv_ctx,
+			       struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 
@@ -555,7 +558,8 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
 #define MLX5_CR_DUMP_CHUNK_SIZE 256
 static int
 mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
-			    struct devlink_fmsg *fmsg, void *priv_ctx)
+			    struct devlink_fmsg *fmsg, void *priv_ctx,
+			    struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
 	u32 crdump_size = dev->priv.health.crdump_size;
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 4095657fc23f..d35a1be107b5 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -507,11 +507,14 @@ enum devlink_health_reporter_state {
 struct devlink_health_reporter_ops {
 	char *name;
 	int (*recover)(struct devlink_health_reporter *reporter,
-		       void *priv_ctx);
+		       void *priv_ctx, struct netlink_ext_ack *extack);
 	int (*dump)(struct devlink_health_reporter *reporter,
-		    struct devlink_fmsg *fmsg, void *priv_ctx);
+		    struct devlink_fmsg *fmsg, void *priv_ctx,
+		    struct netlink_ext_ack *extack);
 	int (*diagnose)(struct devlink_health_reporter *reporter,
-			struct devlink_fmsg *fmsg);
+			struct devlink_fmsg *fmsg,
+			struct netlink_ext_ack *extack);
+
 };
 
 /**
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 95887462eecf..97e9a2246929 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4847,7 +4847,7 @@ EXPORT_SYMBOL_GPL(devlink_health_reporter_state_update);
 
 static int
 devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
-				void *priv_ctx)
+				void *priv_ctx, struct netlink_ext_ack *extack)
 {
 	int err;
 
@@ -4857,7 +4857,7 @@ devlink_health_reporter_recover(struct devlink_health_reporter *reporter,
 	if (!reporter->ops->recover)
 		return -EOPNOTSUPP;
 
-	err = reporter->ops->recover(reporter, priv_ctx);
+	err = reporter->ops->recover(reporter, priv_ctx, extack);
 	if (err)
 		return err;
 
@@ -4878,7 +4878,8 @@ devlink_health_dump_clear(struct devlink_health_reporter *reporter)
 }
 
 static int devlink_health_do_dump(struct devlink_health_reporter *reporter,
-				  void *priv_ctx)
+				  void *priv_ctx,
+				  struct netlink_ext_ack *extack)
 {
 	int err;
 
@@ -4899,7 +4900,7 @@ static int devlink_health_do_dump(struct devlink_health_reporter *reporter,
 		goto dump_err;
 
 	err = reporter->ops->dump(reporter, reporter->dump_fmsg,
-				  priv_ctx);
+				  priv_ctx, extack);
 	if (err)
 		goto dump_err;
 
@@ -4946,11 +4947,12 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
 
 	mutex_lock(&reporter->dump_lock);
 	/* store current dump of current error, for later analysis */
-	devlink_health_do_dump(reporter, priv_ctx);
+	devlink_health_do_dump(reporter, priv_ctx, NULL);
 	mutex_unlock(&reporter->dump_lock);
 
 	if (reporter->auto_recover)
-		return devlink_health_reporter_recover(reporter, priv_ctx);
+		return devlink_health_reporter_recover(reporter,
+						       priv_ctx, NULL);
 
 	return 0;
 }
@@ -5188,7 +5190,7 @@ static int devlink_nl_cmd_health_reporter_recover_doit(struct sk_buff *skb,
 	if (!reporter)
 		return -EINVAL;
 
-	err = devlink_health_reporter_recover(reporter, NULL);
+	err = devlink_health_reporter_recover(reporter, NULL, info->extack);
 
 	devlink_health_reporter_put(reporter);
 	return err;
@@ -5221,7 +5223,7 @@ static int devlink_nl_cmd_health_reporter_diagnose_doit(struct sk_buff *skb,
 	if (err)
 		goto out;
 
-	err = reporter->ops->diagnose(reporter, fmsg);
+	err = reporter->ops->diagnose(reporter, fmsg, info->extack);
 	if (err)
 		goto out;
 
@@ -5256,7 +5258,7 @@ devlink_nl_cmd_health_reporter_dump_get_dumpit(struct sk_buff *skb,
 	}
 	mutex_lock(&reporter->dump_lock);
 	if (!start) {
-		err = devlink_health_do_dump(reporter, NULL);
+		err = devlink_health_do_dump(reporter, NULL, cb->extack);
 		if (err)
 			goto unlock;
 		cb->args[1] = reporter->dump_ts;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [patch net-next 3/4] netdevsim: implement couple of testing devlink health reporters
  2019-10-09 11:04 [patch net-next 0/4] netdevsim: add devlink health reporters support Jiri Pirko
  2019-10-09 11:04 ` [patch net-next 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
  2019-10-09 11:04 ` [patch net-next 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
@ 2019-10-09 11:04 ` Jiri Pirko
  2019-10-10  3:53   ` Jakub Kicinski
  2019-10-09 11:04 ` [patch net-next 4/4] selftests: add netdevsim devlink health tests Jiri Pirko
  3 siblings, 1 reply; 10+ messages in thread
From: Jiri Pirko @ 2019-10-09 11:04 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Implement "empty" and "dummy" reporters. The first one is really simple
and does nothing. The other one has debugfs files to trigger breakage
and it is able to do recovery. The ops also implement dummy fmsg
content.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/netdevsim/Makefile    |   2 +-
 drivers/net/netdevsim/dev.c       |  17 +-
 drivers/net/netdevsim/health.c    | 315 ++++++++++++++++++++++++++++++
 drivers/net/netdevsim/netdevsim.h |  14 ++
 4 files changed, 345 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/netdevsim/health.c

diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
index 09f1315d2f2a..f4d8f62f28c2 100644
--- a/drivers/net/netdevsim/Makefile
+++ b/drivers/net/netdevsim/Makefile
@@ -3,7 +3,7 @@
 obj-$(CONFIG_NETDEVSIM) += netdevsim.o
 
 netdevsim-objs := \
-	netdev.o dev.o fib.o bus.o
+	netdev.o dev.o fib.o bus.o health.o
 
 ifeq ($(CONFIG_BPF_SYSCALL),y)
 netdevsim-objs += \
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index e47fa7b6ca7c..468e157a7cb1 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -730,12 +730,18 @@ static int nsim_dev_reload_create(struct nsim_dev *nsim_dev,
 	if (err)
 		goto err_dummy_region_exit;
 
-	err = nsim_dev_port_add_all(nsim_dev, nsim_bus_dev->port_count);
+	err = nsim_dev_health_init(nsim_dev, devlink);
 	if (err)
 		goto err_traps_exit;
 
+	err = nsim_dev_port_add_all(nsim_dev, nsim_bus_dev->port_count);
+	if (err)
+		goto err_health_exit;
+
 	return 0;
 
+err_health_exit:
+	nsim_dev_health_exit(nsim_dev);
 err_traps_exit:
 	nsim_dev_traps_exit(devlink);
 err_dummy_region_exit:
@@ -797,10 +803,14 @@ static struct nsim_dev *nsim_dev_create(struct nsim_bus_dev *nsim_bus_dev)
 	if (err)
 		goto err_traps_exit;
 
-	err = nsim_bpf_dev_init(nsim_dev);
+	err = nsim_dev_health_init(nsim_dev, devlink);
 	if (err)
 		goto err_debugfs_exit;
 
+	err = nsim_bpf_dev_init(nsim_dev);
+	if (err)
+		goto err_health_exit;
+
 	err = nsim_dev_port_add_all(nsim_dev, nsim_bus_dev->port_count);
 	if (err)
 		goto err_bpf_dev_exit;
@@ -810,6 +820,8 @@ static struct nsim_dev *nsim_dev_create(struct nsim_bus_dev *nsim_bus_dev)
 
 err_bpf_dev_exit:
 	nsim_bpf_dev_exit(nsim_dev);
+err_health_exit:
+	nsim_dev_health_exit(nsim_dev);
 err_debugfs_exit:
 	nsim_dev_debugfs_exit(nsim_dev);
 err_traps_exit:
@@ -837,6 +849,7 @@ static void nsim_dev_reload_destroy(struct nsim_dev *nsim_dev)
 	if (devlink_is_reload_failed(devlink))
 		return;
 	nsim_dev_port_del_all(nsim_dev);
+	nsim_dev_health_exit(nsim_dev);
 	nsim_dev_traps_exit(devlink);
 	nsim_dev_dummy_region_exit(nsim_dev);
 	mutex_destroy(&nsim_dev->port_list_lock);
diff --git a/drivers/net/netdevsim/health.c b/drivers/net/netdevsim/health.c
new file mode 100644
index 000000000000..088ae8fd89fc
--- /dev/null
+++ b/drivers/net/netdevsim/health.c
@@ -0,0 +1,315 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019 Mellanox Technologies. All rights reserved */
+
+#include <linux/debugfs.h>
+#include <linux/err.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+
+#include "netdevsim.h"
+
+static int
+nsim_dev_empty_reporter_dump(struct devlink_health_reporter *reporter,
+			     struct devlink_fmsg *fmsg, void *priv_ctx,
+			     struct netlink_ext_ack *extack)
+{
+	return 0;
+}
+
+static int
+nsim_dev_empty_reporter_diagnose(struct devlink_health_reporter *reporter,
+				 struct devlink_fmsg *fmsg,
+				 struct netlink_ext_ack *extack)
+{
+	return 0;
+}
+
+static const
+struct devlink_health_reporter_ops nsim_dev_empty_reporter_ops = {
+	.name = "empty",
+	.dump = nsim_dev_empty_reporter_dump,
+	.diagnose = nsim_dev_empty_reporter_diagnose,
+};
+
+struct nsim_dev_dummy_reporter_ctx {
+	char *break_msg;
+};
+
+static int
+nsim_dev_dummy_reporter_recover(struct devlink_health_reporter *reporter,
+				void *priv_ctx,
+				struct netlink_ext_ack *extack)
+{
+	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
+	struct nsim_dev_dummy_reporter_ctx *ctx = priv_ctx;
+
+	if (health->fail_recover) {
+		/* For testing purposes, user set debugfs fail_recover
+		 * value to true. Fail right away.
+		 */
+		NL_SET_ERR_MSG_MOD(extack, "User setup the recover to fail for testing purposes");
+		return -EINVAL;
+	}
+	if (ctx) {
+		health->recovered_break_msg = kstrdup(ctx->break_msg,
+						      GFP_KERNEL);
+		if (!health->recovered_break_msg)
+			return -ENOMEM;
+	}
+	return 0;
+}
+
+static int nsim_dev_dummy_fmsg_put(struct devlink_fmsg *fmsg, u32 binary_len)
+{
+	char *binary;
+	int err;
+	int i;
+
+	err = devlink_fmsg_bool_pair_put(fmsg, "test_bool", true);
+	if (err)
+		return err;
+	err = devlink_fmsg_u8_pair_put(fmsg, "test_u8", 1);
+	if (err)
+		return err;
+	err = devlink_fmsg_u32_pair_put(fmsg, "test_u32", 3);
+	if (err)
+		return err;
+	err = devlink_fmsg_u64_pair_put(fmsg, "test_u64", 4);
+	if (err)
+		return err;
+	err = devlink_fmsg_string_pair_put(fmsg, "test_string", "somestring");
+	if (err)
+		return err;
+
+	binary = kmalloc(binary_len, GFP_KERNEL);
+	if (!binary)
+		return -ENOMEM;
+	get_random_bytes(binary, binary_len);
+	err = devlink_fmsg_binary_pair_put(fmsg, "test_binary",
+					   binary, binary_len);
+	kfree(binary);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_pair_nest_start(fmsg, "test_nest");
+	if (err)
+		return err;
+	err = devlink_fmsg_obj_nest_start(fmsg);
+	if (err)
+		return err;
+	err = devlink_fmsg_bool_pair_put(fmsg, "nested_test_bool", false);
+	if (err)
+		return err;
+	err = devlink_fmsg_u8_pair_put(fmsg, "nested_test_u8", false);
+	if (err)
+		return err;
+	err = devlink_fmsg_obj_nest_end(fmsg);
+	if (err)
+		return err;
+	err = devlink_fmsg_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_bool_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_bool_put(fmsg, true);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_u8_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_u8_put(fmsg, i);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_u32_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_u32_put(fmsg, i);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_u64_array");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_u64_put(fmsg, i);
+		if (err)
+			return err;
+	}
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "test_array_of_objects");
+	if (err)
+		return err;
+	for (i = 0; i < 10; i++) {
+		err = devlink_fmsg_obj_nest_start(fmsg);
+		if (err)
+			return err;
+		err = devlink_fmsg_bool_pair_put(fmsg,
+						 "in_array_nested_test_bool",
+						 false);
+		if (err)
+			return err;
+		err = devlink_fmsg_u8_pair_put(fmsg,
+					       "in_array_nested_test_u8",
+					       i);
+		if (err)
+			return err;
+		err = devlink_fmsg_obj_nest_end(fmsg);
+		if (err)
+			return err;
+	}
+	return devlink_fmsg_arr_pair_nest_end(fmsg);
+}
+
+static int
+nsim_dev_dummy_reporter_dump(struct devlink_health_reporter *reporter,
+			     struct devlink_fmsg *fmsg, void *priv_ctx,
+			     struct netlink_ext_ack *extack)
+{
+	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
+	struct nsim_dev_dummy_reporter_ctx *ctx = priv_ctx;
+	int err;
+
+	if (ctx) {
+		err = devlink_fmsg_string_pair_put(fmsg, "break_message",
+						   ctx->break_msg);
+		if (err)
+			return err;
+	}
+	return nsim_dev_dummy_fmsg_put(fmsg, health->binary_len);
+}
+
+static int
+nsim_dev_dummy_reporter_diagnose(struct devlink_health_reporter *reporter,
+				 struct devlink_fmsg *fmsg,
+				 struct netlink_ext_ack *extack)
+{
+	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
+	int err;
+
+	if (health->recovered_break_msg) {
+		err = devlink_fmsg_string_pair_put(fmsg,
+						   "recovered_break_message",
+						   health->recovered_break_msg);
+		if (err)
+			return err;
+	}
+	return nsim_dev_dummy_fmsg_put(fmsg, health->binary_len);
+}
+
+static const
+struct devlink_health_reporter_ops nsim_dev_dummy_reporter_ops = {
+	.name = "dummy",
+	.recover = nsim_dev_dummy_reporter_recover,
+	.dump = nsim_dev_dummy_reporter_dump,
+	.diagnose = nsim_dev_dummy_reporter_diagnose,
+};
+
+static ssize_t nsim_dev_health_break_write(struct file *file,
+					   const char __user *data,
+					   size_t count, loff_t *ppos)
+{
+	struct nsim_dev_health *health = file->private_data;
+	struct nsim_dev_dummy_reporter_ctx ctx;
+	char *break_msg;
+	int err;
+
+	break_msg = kmalloc(count + 1, GFP_KERNEL);
+	if (!break_msg)
+		return -ENOMEM;
+
+	if (copy_from_user(break_msg, data, count)) {
+		err = -EFAULT;
+		goto out;
+	}
+	break_msg[count] = '\0';
+	if (break_msg[count - 1] == '\n')
+		break_msg[count - 1] = '\0';
+
+	ctx.break_msg = break_msg;
+	err = devlink_health_report(health->dummy_reporter, break_msg, &ctx);
+	if (err)
+		goto out;
+
+out:
+	kfree(break_msg);
+	return err ? err : count;
+}
+
+static const struct file_operations nsim_dev_health_break_fops = {
+	.open = simple_open,
+	.write = nsim_dev_health_break_write,
+	.llseek = generic_file_llseek,
+};
+
+int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink)
+{
+	struct nsim_dev_health *health = &nsim_dev->health;
+	int err;
+
+	health->empty_reporter =
+		devlink_health_reporter_create(devlink,
+					       &nsim_dev_empty_reporter_ops,
+					       0, false, health);
+	if (IS_ERR(health->empty_reporter))
+		return PTR_ERR(health->empty_reporter);
+
+	health->dummy_reporter =
+		devlink_health_reporter_create(devlink,
+					       &nsim_dev_dummy_reporter_ops,
+					       0, false, health);
+	if (IS_ERR(health->dummy_reporter)) {
+		err = PTR_ERR(health->dummy_reporter);
+		goto err_empty_reporter_destroy;
+	}
+
+	health->ddir = debugfs_create_dir("health", nsim_dev->ddir);
+	if (IS_ERR_OR_NULL(health->ddir))
+		return PTR_ERR_OR_ZERO(health->ddir) ?: -EINVAL;
+
+	health->recovered_break_msg = NULL;
+	debugfs_create_file("break_health", 0200, health->ddir, health,
+			    &nsim_dev_health_break_fops);
+	health->binary_len = 16;
+	debugfs_create_u32("binary_len", 0600, health->ddir,
+			   &health->binary_len);
+	health->fail_recover = false;
+	debugfs_create_bool("fail_recover", 0600, health->ddir,
+			    &health->fail_recover);
+	return 0;
+
+err_empty_reporter_destroy:
+	devlink_health_reporter_destroy(health->empty_reporter);
+	return err;
+}
+
+void nsim_dev_health_exit(struct nsim_dev *nsim_dev)
+{
+	struct nsim_dev_health *health = &nsim_dev->health;
+
+	debugfs_remove_recursive(health->ddir);
+	kfree(health->recovered_break_msg);
+	devlink_health_reporter_destroy(health->dummy_reporter);
+	devlink_health_reporter_destroy(health->empty_reporter);
+}
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index 24358385d869..657cbae50293 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -18,6 +18,7 @@
 #include <linux/list.h>
 #include <linux/netdevice.h>
 #include <linux/u64_stats_sync.h>
+#include <linux/debugfs.h>
 #include <net/devlink.h>
 #include <net/xdp.h>
 
@@ -134,6 +135,18 @@ enum nsim_resource_id {
 	NSIM_RESOURCE_IPV6_FIB_RULES,
 };
 
+struct nsim_dev_health {
+	struct devlink_health_reporter *empty_reporter;
+	struct devlink_health_reporter *dummy_reporter;
+	struct dentry *ddir;
+	char *recovered_break_msg;
+	u32 binary_len;
+	bool fail_recover;
+};
+
+int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink);
+void nsim_dev_health_exit(struct nsim_dev *nsim_dev);
+
 struct nsim_dev_port {
 	struct list_head list;
 	struct devlink_port devlink_port;
@@ -164,6 +177,7 @@ struct nsim_dev {
 	bool dont_allow_reload;
 	bool fail_reload;
 	struct devlink_region *dummy_region;
+	struct nsim_dev_health health;
 };
 
 static inline struct net *nsim_dev_net(struct nsim_dev *nsim_dev)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [patch net-next 4/4] selftests: add netdevsim devlink health tests
  2019-10-09 11:04 [patch net-next 0/4] netdevsim: add devlink health reporters support Jiri Pirko
                   ` (2 preceding siblings ...)
  2019-10-09 11:04 ` [patch net-next 3/4] netdevsim: implement couple of testing devlink health reporters Jiri Pirko
@ 2019-10-09 11:04 ` Jiri Pirko
  3 siblings, 0 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-09 11:04 UTC (permalink / raw)
  To: netdev; +Cc: davem, jakub.kicinski, ayal, moshe, eranbe, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Add basic tests to verify functionality of netdevsim reporters.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 .../drivers/net/netdevsim/devlink.sh          | 124 +++++++++++++++++-
 1 file changed, 123 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
index cb0f17e17abc..ff84aede8c73 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
@@ -4,7 +4,8 @@
 lib_dir=$(dirname $0)/../../../net/forwarding
 
 ALL_TESTS="fw_flash_test params_test regions_test reload_test \
-	   netns_reload_test resource_test dev_info_test"
+	   netns_reload_test resource_test dev_info_test \
+	   empty_reporter_test dummy_reporter_test"
 NUM_NETIFS=0
 source $lib_dir/lib.sh
 
@@ -303,6 +304,127 @@ dev_info_test()
 	log_test "dev_info test"
 }
 
+empty_reporter_test()
+{
+	RET=0
+
+	devlink health show $DL_HANDLE reporter empty >/dev/null
+	check_err $? "Failed show empty reporter"
+
+	devlink health dump show $DL_HANDLE reporter empty >/dev/null
+	check_err $? "Failed show dump of empty reporter"
+
+	devlink health diagnose $DL_HANDLE reporter empty >/dev/null
+	check_err $? "Failed diagnose empty reporter"
+
+	devlink health recover $DL_HANDLE reporter empty
+	check_err $? "Failed recover empty reporter"
+
+	log_test "empty reporter test"
+}
+
+check_reporter_info()
+{
+	local name=$1
+	local expected_state=$2
+	local expected_error=$3
+	local expected_recover=$4
+	local expected_grace_period=$5
+	local expected_auto_recover=$6
+
+	local show=$(devlink health show $DL_HANDLE reporter $name -j | jq -e -r ".[][][]")
+	check_err $? "Failed show $name reporter"
+
+	local state=$(echo $show | jq -r ".state")
+	[ "$state" == "$expected_state" ]
+	check_err $? "Unexpected \"state\" value (got $state, expected $expected_state)"
+
+	local error=$(echo $show | jq -r ".error")
+	[ "$error" == "$expected_error" ]
+	check_err $? "Unexpected \"error\" value (got $error, expected $expected_error)"
+
+	local recover=`echo $show | jq -r ".recover"`
+	[ "$recover" == "$expected_recover" ]
+	check_err $? "Unexpected \"recover\" value (got $recover, expected $expected_recover)"
+
+	local grace_period=$(echo $show | jq -r ".grace_period")
+	check_err $? "Failed get $name reporter grace_period"
+	[ "$grace_period" == "$expected_grace_period" ]
+	check_err $? "Unexpected \"grace_period\" value (got $grace_period, expected $expected_grace_period)"
+
+	local auto_recover=$(echo $show | jq -r ".auto_recover")
+	[ "$auto_recover" == "$expected_auto_recover" ]
+	check_err $? "Unexpected \"auto_recover\" value (got $auto_recover, expected $expected_auto_recover)"
+}
+
+dummy_reporter_test()
+{
+	RET=0
+
+	check_reporter_info dummy healthy 0 0 0 false
+
+	local BREAK_MSG="foo bar"
+	echo "$BREAK_MSG"> $DEBUGFS_DIR/health/break_health
+	check_err $? "Failed to break dummy reporter"
+
+	check_reporter_info dummy error 1 0 0 false
+
+	local dump=$(devlink health dump show $DL_HANDLE reporter dummy -j)
+	check_err $? "Failed show dump of dummy reporter"
+
+	local dump_break_msg=$(echo $dump | jq -r ".break_message")
+	[ "$dump_break_msg" == "$BREAK_MSG" ]
+	check_err $? "Unexpected dump break message value (got $dump_break_msg, expected $BREAK_MSG)"
+
+	devlink health dump clear $DL_HANDLE reporter dummy
+	check_err $? "Failed clear dump of dummy reporter"
+
+	devlink health recover $DL_HANDLE reporter dummy
+	check_err $? "Failed recover dummy reporter"
+
+	check_reporter_info dummy healthy 1 1 0 false
+
+	devlink health set $DL_HANDLE reporter dummy auto_recover true
+	check_err $? "Failed to dummy reporter auto_recover option"
+
+	check_reporter_info dummy healthy 1 1 0 true
+
+	echo "$BREAK_MSG"> $DEBUGFS_DIR/health/break_health
+	check_err $? "Failed to break dummy reporter"
+
+	check_reporter_info dummy healthy 2 2 0 true
+
+	local diagnose=$(devlink health diagnose $DL_HANDLE reporter dummy -j -p)
+	check_err $? "Failed show diagnose of dummy reporter"
+
+	local rcvrd_break_msg=$(echo $diagnose | jq -r ".recovered_break_message")
+	[ "$rcvrd_break_msg" == "$BREAK_MSG" ]
+	check_err $? "Unexpected recovered break message value (got $rcvrd_break_msg, expected $BREAK_MSG)"
+
+	devlink health set $DL_HANDLE reporter dummy grace_period 10
+	check_err $? "Failed to dummy reporter grace_period option"
+
+	check_reporter_info dummy healthy 2 2 10 true
+	
+	echo "Y"> $DEBUGFS_DIR/health/fail_recover
+	check_err $? "Failed set dummy reporter recovery to fail"
+
+	echo "$BREAK_MSG"> $DEBUGFS_DIR/health/break_health
+	check_fail $? "Unexpected success of dummy reporter break"
+	
+	check_reporter_info dummy error 3 2 10 true
+
+	echo "N"> $DEBUGFS_DIR/health/fail_recover
+	check_err $? "Failed set dummy reporter recovery to be successful"
+	
+	devlink health recover $DL_HANDLE reporter dummy
+	check_err $? "Failed recover dummy reporter"
+	
+	check_reporter_info dummy healthy 3 3 10 true
+
+	log_test "dummy reporter test"
+}
+
 setup_prepare()
 {
 	modprobe netdevsim
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [patch net-next 2/4] devlink: propagate extack down to health reporter ops
  2019-10-09 11:04 ` [patch net-next 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
@ 2019-10-10  3:38   ` Jakub Kicinski
  2019-10-10  8:07     ` Jiri Pirko
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2019-10-10  3:38 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, ayal, moshe, eranbe, mlxsw

On Wed,  9 Oct 2019 13:04:43 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> During health reporter operations, driver might want to fill-up
> the extack message, so propagate extack down to the health reporter ops.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

I wonder how useful this is for non-testing :( We'd probably expect most
health reporters to have auto-recovery on and therefore they won't be
able to depend on that extack..

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next 3/4] netdevsim: implement couple of testing devlink health reporters
  2019-10-09 11:04 ` [patch net-next 3/4] netdevsim: implement couple of testing devlink health reporters Jiri Pirko
@ 2019-10-10  3:53   ` Jakub Kicinski
  2019-10-10  8:09     ` Jiri Pirko
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2019-10-10  3:53 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, ayal, moshe, eranbe, mlxsw

On Wed,  9 Oct 2019 13:04:44 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Implement "empty" and "dummy" reporters. The first one is really simple
> and does nothing. The other one has debugfs files to trigger breakage
> and it is able to do recovery. The ops also implement dummy fmsg
> content.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

> diff --git a/drivers/net/netdevsim/health.c b/drivers/net/netdevsim/health.c
> new file mode 100644
> index 000000000000..088ae8fd89fc
> --- /dev/null
> +++ b/drivers/net/netdevsim/health.c

> +static int
> +nsim_dev_dummy_reporter_recover(struct devlink_health_reporter *reporter,
> +				void *priv_ctx,
> +				struct netlink_ext_ack *extack)
> +{
> +	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
> +	struct nsim_dev_dummy_reporter_ctx *ctx = priv_ctx;
> +
> +	if (health->fail_recover) {
> +		/* For testing purposes, user set debugfs fail_recover
> +		 * value to true. Fail right away.
> +		 */
> +		NL_SET_ERR_MSG_MOD(extack, "User setup the recover to fail for testing purposes");
> +		return -EINVAL;
> +	}
> +	if (ctx) {
> +		health->recovered_break_msg = kstrdup(ctx->break_msg,

Can't there already be a message there? wouldn't this leak it?

> +						      GFP_KERNEL);
> +		if (!health->recovered_break_msg)
> +			return -ENOMEM;
> +	}
> +	return 0;
> +}

> +static ssize_t nsim_dev_health_break_write(struct file *file,
> +					   const char __user *data,
> +					   size_t count, loff_t *ppos)
> +{
> +	struct nsim_dev_health *health = file->private_data;
> +	struct nsim_dev_dummy_reporter_ctx ctx;
> +	char *break_msg;
> +	int err;
> +
> +	break_msg = kmalloc(count + 1, GFP_KERNEL);
> +	if (!break_msg)
> +		return -ENOMEM;
> +
> +	if (copy_from_user(break_msg, data, count)) {
> +		err = -EFAULT;
> +		goto out;
> +	}
> +	break_msg[count] = '\0';
> +	if (break_msg[count - 1] == '\n')
> +		break_msg[count - 1] = '\0';
> +
> +	ctx.break_msg = break_msg;
> +	err = devlink_health_report(health->dummy_reporter, break_msg, &ctx);
> +	if (err)
> +		goto out;
> +
> +out:
> +	kfree(break_msg);
> +	return err ? err : count;

nit: ?: works here, like you used below

> +}
> +
> +static const struct file_operations nsim_dev_health_break_fops = {
> +	.open = simple_open,
> +	.write = nsim_dev_health_break_write,
> +	.llseek = generic_file_llseek,
> +};
> +
> +int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink)
> +{
> +	struct nsim_dev_health *health = &nsim_dev->health;
> +	int err;
> +
> +	health->empty_reporter =
> +		devlink_health_reporter_create(devlink,
> +					       &nsim_dev_empty_reporter_ops,
> +					       0, false, health);
> +	if (IS_ERR(health->empty_reporter))
> +		return PTR_ERR(health->empty_reporter);
> +
> +	health->dummy_reporter =
> +		devlink_health_reporter_create(devlink,
> +					       &nsim_dev_dummy_reporter_ops,
> +					       0, false, health);
> +	if (IS_ERR(health->dummy_reporter)) {
> +		err = PTR_ERR(health->dummy_reporter);
> +		goto err_empty_reporter_destroy;
> +	}
> +
> +	health->ddir = debugfs_create_dir("health", nsim_dev->ddir);
> +	if (IS_ERR_OR_NULL(health->ddir))
> +		return PTR_ERR_OR_ZERO(health->ddir) ?: -EINVAL;

goto err_dummy_reporter_destroy?

> +	health->recovered_break_msg = NULL;
> +	debugfs_create_file("break_health", 0200, health->ddir, health,
> +			    &nsim_dev_health_break_fops);
> +	health->binary_len = 16;
> +	debugfs_create_u32("binary_len", 0600, health->ddir,
> +			   &health->binary_len);
> +	health->fail_recover = false;
> +	debugfs_create_bool("fail_recover", 0600, health->ddir,
> +			    &health->fail_recover);
> +	return 0;
> +
> +err_empty_reporter_destroy:
> +	devlink_health_reporter_destroy(health->empty_reporter);
> +	return err;
> +}
> +
> +void nsim_dev_health_exit(struct nsim_dev *nsim_dev)
> +{
> +	struct nsim_dev_health *health = &nsim_dev->health;
> +
> +	debugfs_remove_recursive(health->ddir);
> +	kfree(health->recovered_break_msg);
> +	devlink_health_reporter_destroy(health->dummy_reporter);
> +	devlink_health_reporter_destroy(health->empty_reporter);
> +}
> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
> index 24358385d869..657cbae50293 100644
> --- a/drivers/net/netdevsim/netdevsim.h
> +++ b/drivers/net/netdevsim/netdevsim.h
> @@ -18,6 +18,7 @@
>  #include <linux/list.h>
>  #include <linux/netdevice.h>
>  #include <linux/u64_stats_sync.h>
> +#include <linux/debugfs.h>

I don't think this include is needed? Forward declaration of 
struct dentry, should be sufficient, if needed, no?

>  #include <net/devlink.h>
>  #include <net/xdp.h>
>  
> @@ -134,6 +135,18 @@ enum nsim_resource_id {
>  	NSIM_RESOURCE_IPV6_FIB_RULES,
>  };
>  
> +struct nsim_dev_health {
> +	struct devlink_health_reporter *empty_reporter;
> +	struct devlink_health_reporter *dummy_reporter;
> +	struct dentry *ddir;
> +	char *recovered_break_msg;
> +	u32 binary_len;
> +	bool fail_recover;
> +};
> +
> +int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink);
> +void nsim_dev_health_exit(struct nsim_dev *nsim_dev);
> +
>  struct nsim_dev_port {
>  	struct list_head list;
>  	struct devlink_port devlink_port;
> @@ -164,6 +177,7 @@ struct nsim_dev {
>  	bool dont_allow_reload;
>  	bool fail_reload;
>  	struct devlink_region *dummy_region;
> +	struct nsim_dev_health health;
>  };
>  
>  static inline struct net *nsim_dev_net(struct nsim_dev *nsim_dev)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next 2/4] devlink: propagate extack down to health reporter ops
  2019-10-10  3:38   ` Jakub Kicinski
@ 2019-10-10  8:07     ` Jiri Pirko
  2019-10-10 16:09       ` Jakub Kicinski
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Pirko @ 2019-10-10  8:07 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, ayal, moshe, eranbe, mlxsw

Thu, Oct 10, 2019 at 05:38:18AM CEST, jakub.kicinski@netronome.com wrote:
>On Wed,  9 Oct 2019 13:04:43 +0200, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> During health reporter operations, driver might want to fill-up
>> the extack message, so propagate extack down to the health reporter ops.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>
>I wonder how useful this is for non-testing :( We'd probably expect most
>health reporters to have auto-recovery on and therefore they won't be
>able to depend on that extack..

That is probably true. But still, what is harm of carrying potential
error message to the user?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next 3/4] netdevsim: implement couple of testing devlink health reporters
  2019-10-10  3:53   ` Jakub Kicinski
@ 2019-10-10  8:09     ` Jiri Pirko
  0 siblings, 0 replies; 10+ messages in thread
From: Jiri Pirko @ 2019-10-10  8:09 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, ayal, moshe, eranbe, mlxsw

Thu, Oct 10, 2019 at 05:53:51AM CEST, jakub.kicinski@netronome.com wrote:
>On Wed,  9 Oct 2019 13:04:44 +0200, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>> 
>> Implement "empty" and "dummy" reporters. The first one is really simple
>> and does nothing. The other one has debugfs files to trigger breakage
>> and it is able to do recovery. The ops also implement dummy fmsg
>> content.
>> 
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>
>> diff --git a/drivers/net/netdevsim/health.c b/drivers/net/netdevsim/health.c
>> new file mode 100644
>> index 000000000000..088ae8fd89fc
>> --- /dev/null
>> +++ b/drivers/net/netdevsim/health.c
>
>> +static int
>> +nsim_dev_dummy_reporter_recover(struct devlink_health_reporter *reporter,
>> +				void *priv_ctx,
>> +				struct netlink_ext_ack *extack)
>> +{
>> +	struct nsim_dev_health *health = devlink_health_reporter_priv(reporter);
>> +	struct nsim_dev_dummy_reporter_ctx *ctx = priv_ctx;
>> +
>> +	if (health->fail_recover) {
>> +		/* For testing purposes, user set debugfs fail_recover
>> +		 * value to true. Fail right away.
>> +		 */
>> +		NL_SET_ERR_MSG_MOD(extack, "User setup the recover to fail for testing purposes");
>> +		return -EINVAL;
>> +	}
>> +	if (ctx) {
>> +		health->recovered_break_msg = kstrdup(ctx->break_msg,
>
>Can't there already be a message there? wouldn't this leak it?

It would. Will fix.

>
>> +						      GFP_KERNEL);
>> +		if (!health->recovered_break_msg)
>> +			return -ENOMEM;
>> +	}
>> +	return 0;
>> +}
>
>> +static ssize_t nsim_dev_health_break_write(struct file *file,
>> +					   const char __user *data,
>> +					   size_t count, loff_t *ppos)
>> +{
>> +	struct nsim_dev_health *health = file->private_data;
>> +	struct nsim_dev_dummy_reporter_ctx ctx;
>> +	char *break_msg;
>> +	int err;
>> +
>> +	break_msg = kmalloc(count + 1, GFP_KERNEL);
>> +	if (!break_msg)
>> +		return -ENOMEM;
>> +
>> +	if (copy_from_user(break_msg, data, count)) {
>> +		err = -EFAULT;
>> +		goto out;
>> +	}
>> +	break_msg[count] = '\0';
>> +	if (break_msg[count - 1] == '\n')
>> +		break_msg[count - 1] = '\0';
>> +
>> +	ctx.break_msg = break_msg;
>> +	err = devlink_health_report(health->dummy_reporter, break_msg, &ctx);
>> +	if (err)
>> +		goto out;
>> +
>> +out:
>> +	kfree(break_msg);
>> +	return err ? err : count;
>
>nit: ?: works here, like you used below

Ok.


>
>> +}
>> +
>> +static const struct file_operations nsim_dev_health_break_fops = {
>> +	.open = simple_open,
>> +	.write = nsim_dev_health_break_write,
>> +	.llseek = generic_file_llseek,
>> +};
>> +
>> +int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink)
>> +{
>> +	struct nsim_dev_health *health = &nsim_dev->health;
>> +	int err;
>> +
>> +	health->empty_reporter =
>> +		devlink_health_reporter_create(devlink,
>> +					       &nsim_dev_empty_reporter_ops,
>> +					       0, false, health);
>> +	if (IS_ERR(health->empty_reporter))
>> +		return PTR_ERR(health->empty_reporter);
>> +
>> +	health->dummy_reporter =
>> +		devlink_health_reporter_create(devlink,
>> +					       &nsim_dev_dummy_reporter_ops,
>> +					       0, false, health);
>> +	if (IS_ERR(health->dummy_reporter)) {
>> +		err = PTR_ERR(health->dummy_reporter);
>> +		goto err_empty_reporter_destroy;
>> +	}
>> +
>> +	health->ddir = debugfs_create_dir("health", nsim_dev->ddir);
>> +	if (IS_ERR_OR_NULL(health->ddir))
>> +		return PTR_ERR_OR_ZERO(health->ddir) ?: -EINVAL;
>
>goto err_dummy_reporter_destroy?

Right.


>
>> +	health->recovered_break_msg = NULL;
>> +	debugfs_create_file("break_health", 0200, health->ddir, health,
>> +			    &nsim_dev_health_break_fops);
>> +	health->binary_len = 16;
>> +	debugfs_create_u32("binary_len", 0600, health->ddir,
>> +			   &health->binary_len);
>> +	health->fail_recover = false;
>> +	debugfs_create_bool("fail_recover", 0600, health->ddir,
>> +			    &health->fail_recover);
>> +	return 0;
>> +
>> +err_empty_reporter_destroy:
>> +	devlink_health_reporter_destroy(health->empty_reporter);
>> +	return err;
>> +}
>> +
>> +void nsim_dev_health_exit(struct nsim_dev *nsim_dev)
>> +{
>> +	struct nsim_dev_health *health = &nsim_dev->health;
>> +
>> +	debugfs_remove_recursive(health->ddir);
>> +	kfree(health->recovered_break_msg);
>> +	devlink_health_reporter_destroy(health->dummy_reporter);
>> +	devlink_health_reporter_destroy(health->empty_reporter);
>> +}
>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>> index 24358385d869..657cbae50293 100644
>> --- a/drivers/net/netdevsim/netdevsim.h
>> +++ b/drivers/net/netdevsim/netdevsim.h
>> @@ -18,6 +18,7 @@
>>  #include <linux/list.h>
>>  #include <linux/netdevice.h>
>>  #include <linux/u64_stats_sync.h>
>> +#include <linux/debugfs.h>
>
>I don't think this include is needed? Forward declaration of 
>struct dentry, should be sufficient, if needed, no?

Right.


>
>>  #include <net/devlink.h>
>>  #include <net/xdp.h>
>>  
>> @@ -134,6 +135,18 @@ enum nsim_resource_id {
>>  	NSIM_RESOURCE_IPV6_FIB_RULES,
>>  };
>>  
>> +struct nsim_dev_health {
>> +	struct devlink_health_reporter *empty_reporter;
>> +	struct devlink_health_reporter *dummy_reporter;
>> +	struct dentry *ddir;
>> +	char *recovered_break_msg;
>> +	u32 binary_len;
>> +	bool fail_recover;
>> +};
>> +
>> +int nsim_dev_health_init(struct nsim_dev *nsim_dev, struct devlink *devlink);
>> +void nsim_dev_health_exit(struct nsim_dev *nsim_dev);
>> +
>>  struct nsim_dev_port {
>>  	struct list_head list;
>>  	struct devlink_port devlink_port;
>> @@ -164,6 +177,7 @@ struct nsim_dev {
>>  	bool dont_allow_reload;
>>  	bool fail_reload;
>>  	struct devlink_region *dummy_region;
>> +	struct nsim_dev_health health;
>>  };
>>  
>>  static inline struct net *nsim_dev_net(struct nsim_dev *nsim_dev)
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [patch net-next 2/4] devlink: propagate extack down to health reporter ops
  2019-10-10  8:07     ` Jiri Pirko
@ 2019-10-10 16:09       ` Jakub Kicinski
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Kicinski @ 2019-10-10 16:09 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, ayal, moshe, eranbe, mlxsw

On Thu, 10 Oct 2019 10:07:04 +0200, Jiri Pirko wrote:
> Thu, Oct 10, 2019 at 05:38:18AM CEST, jakub.kicinski@netronome.com wrote:
> >On Wed,  9 Oct 2019 13:04:43 +0200, Jiri Pirko wrote:  
> >> From: Jiri Pirko <jiri@mellanox.com>
> >> 
> >> During health reporter operations, driver might want to fill-up
> >> the extack message, so propagate extack down to the health reporter ops.
> >> 
> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>  
> >
> >I wonder how useful this is for non-testing :( We'd probably expect most
> >health reporters to have auto-recovery on and therefore they won't be
> >able to depend on that extack..  
> 
> That is probably true. But still, what is harm of carrying potential
> error message to the user?

Not sure, it just makes the right thing to do non-obvious to someone
implementing the API (below level 7 on Rusty's API Design Manifesto).
But I don't feel too strongly about it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-10-10 16:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-09 11:04 [patch net-next 0/4] netdevsim: add devlink health reporters support Jiri Pirko
2019-10-09 11:04 ` [patch net-next 1/4] devlink: don't do reporter recovery if the state is healthy Jiri Pirko
2019-10-09 11:04 ` [patch net-next 2/4] devlink: propagate extack down to health reporter ops Jiri Pirko
2019-10-10  3:38   ` Jakub Kicinski
2019-10-10  8:07     ` Jiri Pirko
2019-10-10 16:09       ` Jakub Kicinski
2019-10-09 11:04 ` [patch net-next 3/4] netdevsim: implement couple of testing devlink health reporters Jiri Pirko
2019-10-10  3:53   ` Jakub Kicinski
2019-10-10  8:09     ` Jiri Pirko
2019-10-09 11:04 ` [patch net-next 4/4] selftests: add netdevsim devlink health tests Jiri Pirko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).